Python Version Control Basics Practice Problems & Exercises
Practice: Version Control Basics
Easy
Predict all five outputs. Each call checks whether a simplified .gitignore pattern matcher treats a file or directory as ignored. Reason through each pattern rule.
# This code simulates gitignore-style pattern matching.
# We verify whether each path SHOULD be ignored.
import fnmatch
def should_ignore(pattern, path):
    """Simplified gitignore matching: does path match pattern?"""
    # Pattern ending with / matches directory names only
    if pattern.endswith("/"):
        return fnmatch.fnmatch(path, pattern.rstrip("/")) or path.startswith(pattern)
    return fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(path.split("/")[-1], pattern)
# Pattern: __pycache__/ matches directory named __pycache__
print(should_ignore("__pycache__/", "__pycache__"))
# Pattern: *.pyc matches any .pyc file
print(should_ignore("*.pyc", "src/myapp/services.pyc"))
# Pattern: .env matches the .env file exactly
print(should_ignore(".env", ".env"))
# Pattern: .venv/ matches virtual environment directory
print(should_ignore(".venv/", ".venv"))
# Pattern: dist/ matches the dist build output directory
print(should_ignore("dist/", "dist"))
Solution
import fnmatch
def should_ignore(pattern, path):
    if pattern.endswith("/"):
        return fnmatch.fnmatch(path, pattern.rstrip("/")) or path.startswith(pattern)
    return fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(path.split("/")[-1], pattern)
print(should_ignore("__pycache__/", "__pycache__"))
print(should_ignore("*.pyc", "src/myapp/services.pyc"))
print(should_ignore(".env", ".env"))
print(should_ignore(".venv/", ".venv"))
print(should_ignore("dist/", "dist"))
Output:
True
True
True
True
True
How it works:
- __pycache__/: the trailing / means "directory only". __pycache__ matches the stripped pattern, so True.
- *.pyc: the glob wildcard matches any filename ending in .pyc, including the basename of a nested path (services.pyc).
- .env: an exact filename match. No wildcards needed.
- .venv/: the trailing / matches the directory name .venv.
- dist/: matches the dist build output directory.
Key insight: The three most important Python .gitignore entries are .env (secrets), .venv/ (virtual environment — contains absolute paths and platform binaries), and dist/ (built artifacts — always regenerate, never commit). Committing .env is a security incident; committing .venv/ silently breaks the project on every other machine.
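The should_ignore function in this exercise only recognises a directory pattern at the very start of a path, so it returns False for a __pycache__ directory nested under src/. As a rough sketch of one way to handle that case, the helper below checks every directory component of the path. The name matches_any_component is hypothetical, and the logic is still a simplification rather than Git's real matching rules.

```python
import fnmatch

def matches_any_component(pattern, path):
    """Hypothetical helper: True if any directory component of path matches a directory pattern."""
    name = pattern.rstrip("/")
    components = path.split("/")
    # Assumption: when the path has several components, the last one is a
    # file name, so only the leading components are treated as directories.
    candidates = components[:-1] if len(components) > 1 else components
    return any(fnmatch.fnmatch(part, name) for part in candidates)

print(matches_any_component("__pycache__/", "src/myapp/__pycache__/services.cpython-312.pyc"))
print(matches_any_component("__pycache__/", "src/myapp/services.py"))
```

The first call prints True and the second prints False. A single-component path such as .venv is treated as the directory itself.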
Expected Output
True
True
True
True
True
Hints
Hint 1: A .gitignore pattern ending with / only matches directories.
Hint 2: The * wildcard matches any run of characters, so *.pyc matches any file name ending in .pyc.
Hint 3: The .env file must never be committed — it contains real secrets.
Hint 4: Virtual environment directories contain absolute paths and platform binaries that break on other machines.
Predict all five outputs. A validator checks whether each commit message follows the Conventional Commits format: type(optional-scope): subject.
import re
VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
def is_conventional(msg):
    """Return True if msg follows Conventional Commits format."""
    pattern = r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: .+'
    return bool(re.match(pattern, msg))
print(is_conventional("WIP"))
print(is_conventional("fix(auth): handle expired JWT with 401 not 500"))
print(is_conventional("changes to auth module"))
print(is_conventional("feat(api): add pagination to courses endpoint"))
print(is_conventional("chore(deps): upgrade pydantic from 1.x to 2.x"))
Solution
import re
VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
def is_conventional(msg):
    pattern = r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: .+'
    return bool(re.match(pattern, msg))
print(is_conventional("WIP"))
print(is_conventional("fix(auth): handle expired JWT with 401 not 500"))
print(is_conventional("changes to auth module"))
print(is_conventional("feat(api): add pagination to courses endpoint"))
print(is_conventional("chore(deps): upgrade pydantic from 1.x to 2.x"))
Output:
False
True
False
True
True
How it works:
- "WIP": no type prefix, no colon separator. Invalid.
- "fix(auth): handle expired JWT with 401 not 500": type is fix, scope is auth, subject follows after the colon and space. Valid.
- "changes to auth module": plain prose with no type prefix. Invalid.
- "feat(api): add pagination to courses endpoint": type feat, scope api, subject present. Valid.
- "chore(deps): upgrade pydantic from 1.x to 2.x": type chore, scope deps, subject present. Valid.
Key insight: The Conventional Commits format is machine-parseable. Tools like semantic-release read commit types to automatically bump version numbers: feat: triggers a MINOR bump, fix: triggers a PATCH bump, and a BREAKING CHANGE: footer triggers a MAJOR bump. The scope (in parentheses) is optional but recommended — it names the subsystem affected and makes the log far more navigable when scanning hundreds of commits.
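The bump rules described above can be sketched as a small function. This is a simplified illustration under stated assumptions, not how semantic-release is actually implemented: the name required_bump and the HEADER regex are hypothetical, only feat, fix, and a BREAKING CHANGE: footer are considered, and real release tools are configurable.

```python
import re

# Assumed, simplified header shape: type, optional (scope), then ": subject".
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]+\))?: .+")

def required_bump(messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    level = None
    for msg in messages:
        header = msg.splitlines()[0] if msg else ""
        m = HEADER.match(header)
        if not m:
            continue  # non-conventional commits contribute nothing
        if "BREAKING CHANGE:" in msg:
            return "major"  # a breaking change outranks everything else
        if m.group("type") == "feat":
            level = "minor"
        elif m.group("type") == "fix" and level is None:
            level = "patch"
    return level

print(required_bump(["fix(auth): handle expired JWT", "feat(api): add pagination"]))
print(required_bump(["chore(deps): upgrade pydantic"]))
print(required_bump(["feat(api): drop v1 routes\n\nBREAKING CHANGE: v1 routes removed"]))
```

The three calls print minor, None, and major. The highest-ranking change in the batch decides the bump, which is why a single feat outranks any number of fix commits.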
Expected Output
False
True
False
True
True
Hints
Hint 1: A conventional commit must start with a type: feat, fix, refactor, test, docs, style, chore, perf, ci.
Hint 2: The format is type(scope): subject — the colon and space after the type are required.
Hint 3: By convention the subject uses imperative mood and stays within about 50 characters, but this validator checks neither; it only tests the type(scope): subject shape.
Hint 4: "WIP" and "fix" alone are not conventional commit messages.
Predict all five outputs. A classifier maps a description of a change to the correct Conventional Commits type.
def classify_commit(description):
    """Return the Conventional Commits type for a described change."""
    desc = description.lower()
    if any(w in desc for w in ["bug", "crash", "broken", "wrong", "incorrect", "error", "500"]):
        return "fix"
    if any(w in desc for w in ["new feature", "add endpoint", "add function", "new capability"]):
        return "feat"
    if any(w in desc for w in ["rename", "extract", "restructure", "inline", "no behavior", "no behaviour"]):
        return "refactor"
    if any(w in desc for w in ["dependency", "upgrade", "update library", "build", "tooling"]):
        return "chore"
    if any(w in desc for w in ["test", "tests", "pytest", "parametrized"]):
        return "test"
    return "unknown"
print(classify_commit("Fix bug where search returns 500 on empty query"))
print(classify_commit("Add new feature: export_csv in reports module"))
print(classify_commit("Rename get_usr to get_user, no behaviour change"))
print(classify_commit("Upgrade dependency pydantic from 1.x to 2.x"))
print(classify_commit("Add parametrized tests for discount calculation"))
Solution
def classify_commit(description):
    desc = description.lower()
    if any(w in desc for w in ["bug", "crash", "broken", "wrong", "incorrect", "error", "500"]):
        return "fix"
    if any(w in desc for w in ["new feature", "add endpoint", "add function", "new capability"]):
        return "feat"
    if any(w in desc for w in ["rename", "extract", "restructure", "inline", "no behavior", "no behaviour"]):
        return "refactor"
    if any(w in desc for w in ["dependency", "upgrade", "update library", "build", "tooling"]):
        return "chore"
    if any(w in desc for w in ["test", "tests", "pytest", "parametrized"]):
        return "test"
    return "unknown"
print(classify_commit("Fix bug where search returns 500 on empty query"))
print(classify_commit("Add new feature: export_csv in reports module"))
print(classify_commit("Rename get_usr to get_user, no behaviour change"))
print(classify_commit("Upgrade dependency pydantic from 1.x to 2.x"))
print(classify_commit("Add parametrized tests for discount calculation"))
Output:
fix
feat
refactor
chore
test
How it works: Each description matches keywords specific to one commit type:
- "500" and "bug" signal a broken behaviour being corrected: fix.
- "new feature" signals new user-facing capability: feat.
- "rename" and "no behaviour change" signal structural improvement with no logic change: refactor.
- "upgrade dependency" signals housekeeping, not a feature or fix: chore.
- "parametrized tests" signals test-only work: test.
Key insight: The choice between fix and refactor depends on whether user-observable behaviour changed. If the bug caused a crash that now does not crash, that is fix. If you renamed a variable purely for readability and nothing a caller would notice changed, that is refactor. The distinction matters for automated changelog generation — users only see feat and fix entries in release notes; refactor and chore are typically suppressed.
Expected Output
fix
feat
refactor
chore
test
Hints
Hint 1: Bug fixes use type fix.
Hint 2: New capabilities or features use type feat.
Hint 3: Code restructuring with no behaviour change uses type refactor.
Hint 4: Dependency updates, build tooling, and housekeeping use type chore.
Hint 5: Adding or updating tests uses type test.
Predict all four outputs. A function applies Semantic Versioning rules given a current version and a bump type.
def bump_version(current, bump_type):
    """Apply a SemVer bump to current version string."""
    major, minor, patch = map(int, current.split("."))
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    elif bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    elif bump_type == "major":
        return f"{major + 1}.0.0"
    raise ValueError(f"Unknown bump type: {bump_type}")
print(bump_version("1.0.0", "patch")) # backward-compatible bug fix
print(bump_version("1.0.1", "minor")) # backward-compatible new feature
print(bump_version("1.9.3", "major")) # breaking change
print(bump_version("1.2.3", "patch")) # another bug fix
Solution
def bump_version(current, bump_type):
    major, minor, patch = map(int, current.split("."))
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    elif bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    elif bump_type == "major":
        return f"{major + 1}.0.0"
    raise ValueError(f"Unknown bump type: {bump_type}")
print(bump_version("1.0.0", "patch"))
print(bump_version("1.0.1", "minor"))
print(bump_version("1.9.3", "major"))
print(bump_version("1.2.3", "patch"))
Output:
1.0.1
1.1.0
2.0.0
1.2.4
How it works:
- patch on 1.0.0: increment patch, giving 1.0.1. MINOR and MAJOR unchanged.
- minor on 1.0.1: increment minor and reset patch to zero, giving 1.1.0.
- major on 1.9.3: increment major and reset both minor and patch to zero, giving 2.0.0.
- patch on 1.2.3: increment patch, giving 1.2.4.
Key insight: The reset rules are mandatory in SemVer, not optional. A minor bump from 1.0.1 that did not reset patch would yield 1.1.1, implying a patch release on top of a 1.1.0 that never existed. When using setuptools-scm, you never manually call bump_version; you tag a commit with git tag v1.1.0 and the build tooling reads the version directly from the tag. A Conventional Commits log can drive this automatically via semantic-release.
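One practical consequence of the three-number format is that versions should be compared as integers, not as text. The sketch below uses a hypothetical parse_version helper that assumes a plain MAJOR.MINOR.PATCH string with an optional leading v; it does not handle pre-release or build metadata.

```python
def parse_version(tag):
    """Hypothetical helper: turn 'v1.10.0' or '1.10.0' into the tuple (1, 10, 0)."""
    text = tag[1:] if tag.startswith("v") else tag
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)

# String comparison goes character by character, so "1.10.0" sorts before "1.9.3".
print("1.10.0" > "1.9.3")
# Tuple comparison goes number by number, which matches SemVer precedence here.
print(parse_version("v1.10.0") > parse_version("v1.9.3"))
```

The first line prints False and the second prints True, which is why sorting tags as plain strings can pick the wrong "latest" release.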
Expected Output
1.0.1
1.1.0
2.0.0
1.2.4
Hints
Hint 1: PATCH bump: backward-compatible bug fix — increment the third number, reset nothing.
Hint 2: MINOR bump: backward-compatible new feature — increment the second number, reset PATCH to 0.
Hint 3: MAJOR bump: breaking change — increment the first number, reset MINOR and PATCH to 0.
Hint 4: A BREAKING CHANGE footer always triggers MAJOR regardless of the commit type.
Medium
Implement a .gitignore validator that checks whether a Python project has all critical patterns and warns about common omissions.
def validate_gitignore(content):
    """
    Validate a .gitignore file for a Python project.
    Returns (is_valid, messages).
    """
    # Keep non-empty, non-comment lines; strip first so indented comments are skipped too
    lines = {
        stripped
        for stripped in (line.strip() for line in content.splitlines())
        if stripped and not stripped.startswith("#")
    }
    critical = [".env", "__pycache__/", ".venv/"]
    missing = [p for p in critical if p not in lines]
    if missing:
        return False, [f"Missing critical patterns: {missing}"]
    messages = ["All critical patterns present"]
    risky_missing = [
        (".env.local", "Warning: .env.local not ignored — may contain local secrets"),
        ("*.egg-info/", "Warning: *.egg-info/ not ignored — generated build metadata"),
        ("dist/", "Warning: dist/ not ignored — built packages should not be committed"),
    ]
    for pattern, msg in risky_missing:
        if pattern not in lines:
            messages.append(msg)
    return True, messages
# Test 1: incomplete .gitignore
incomplete = """
# Python
dist/
*.pyc
"""
valid, msgs = validate_gitignore(incomplete)
for m in msgs:
    print(m)
# Test 2: complete critical entries
complete = """
.env
__pycache__/
.venv/
dist/
*.egg-info/
.env.local
"""
valid, msgs = validate_gitignore(complete)
for m in msgs:
    print(m)
# Test 3: missing .env.local
partial = """
.env
__pycache__/
.venv/
dist/
*.egg-info/
"""
valid, msgs = validate_gitignore(partial)
for m in msgs:
    print(m)
Solution
def validate_gitignore(content):
    lines = set(line.strip() for line in content.splitlines() if line.strip() and not line.startswith("#"))
    critical = [".env", "__pycache__/", ".venv/"]
    missing = [p for p in critical if p not in lines]
    if missing:
        return False, [f"Missing critical patterns: {missing}"]
    messages = ["All critical patterns present"]
    risky_missing = [
        (".env.local", "Warning: .env.local not ignored — may contain local secrets"),
        ("*.egg-info/", "Warning: *.egg-info/ not ignored — generated build metadata"),
        ("dist/", "Warning: dist/ not ignored — built packages should not be committed"),
    ]
    for pattern, msg in risky_missing:
        if pattern not in lines:
            messages.append(msg)
    return True, messages
incomplete = """
# Python
dist/
*.pyc
"""
valid, msgs = validate_gitignore(incomplete)
for m in msgs:
    print(m)
complete = """
.env
__pycache__/
.venv/
dist/
*.egg-info/
.env.local
"""
valid, msgs = validate_gitignore(complete)
for m in msgs:
    print(m)
partial = """
.env
__pycache__/
.venv/
dist/
*.egg-info/
"""
valid, msgs = validate_gitignore(partial)
for m in msgs:
    print(m)
Output:
Missing critical patterns: ['.env', '__pycache__/', '.venv/']
All critical patterns present
Warning: .env.local not ignored — may contain local secrets
How it works:
- The incomplete .gitignore has dist/ and *.pyc but is missing all three critical entries: .env, __pycache__/, and .venv/. The validator returns the missing list immediately.
- The complete .gitignore has all three critical entries plus dist/, *.egg-info/, and .env.local. No warnings issued.
- The partial .gitignore has all three critical entries but omits .env.local. The validator passes the critical check but emits a warning about the potential local secrets file.
Key insight: The validator separates blocking failures (missing .env — this is a security incident waiting to happen) from advisory warnings (missing .env.local — less severe but common source of leaked credentials). In a CI pipeline, you would fail the build on blocking failures and emit warnings for advisory ones. The .env.example pattern (commit the template, never the real values) is the correct approach: new team members copy .env.example to .env and fill in their own values.
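The .env.example workflow can be checked in the same spirit. The sketch below is an illustration under assumptions, not part of the exercise: env_keys and missing_from_env are hypothetical helpers, the sample values are made up, and only simple KEY=value lines are parsed.

```python
def env_keys(text):
    """Collect variable names from KEY=value lines, skipping blanks and comments."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys

def missing_from_env(example_text, env_text):
    """Hypothetical check: names listed in .env.example but absent from .env."""
    return sorted(env_keys(example_text) - env_keys(env_text))

# Illustrative file contents; a real script would read both files from disk.
example = "DATABASE_URL=\nSECRET_KEY=\n# optional\nDEBUG=false\n"
local = "DATABASE_URL=postgres://localhost/dev\nDEBUG=true\n"
print(missing_from_env(example, local))
```

This prints ['SECRET_KEY'], telling a new team member which value they still need to fill in without ever exposing anyone's real secrets.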
Expected Output
Missing critical patterns: ['.env', '__pycache__/', '.venv/']
All critical patterns present
Warning: .env.local not ignored — may contain local secrets
Hints
Hint 1: The three non-negotiable entries for any Python project are .env, __pycache__/, and .venv/.
Hint 2: Read the .gitignore content as a set of lines and check for each required pattern.
Hint 3: An .env.local file often contains local overrides with real credentials.
Implement a changelog generator that parses a list of Conventional Commit messages and groups them into a human-readable release notes format.
def generate_release_notes(commits):
    """
    Group conventional commit messages into release note sections.
    Returns formatted string.
    """
    features = []
    bug_fixes = []
    other = []
    for commit in commits:
        commit = commit.strip()
        if commit.startswith("feat"):
            features.append(commit)
        elif commit.startswith("fix"):
            bug_fixes.append(commit)
        else:
            other.append(commit)
    lines = []
    if features:
        lines.append(f"Features ({len(features)}):")
        for f in features:
            lines.append(f" {f}")
    if bug_fixes:
        lines.append(f"Bug Fixes ({len(bug_fixes)}):")
        for b in bug_fixes:
            lines.append(f" {b}")
    if other:
        lines.append(f"Other ({len(other)})")
    return "\n".join(lines)
commits = [
"feat(api): add pagination to courses endpoint",
"fix(search): return empty list on empty query not 500",
"feat(auth): add JWT refresh token endpoint",
"refactor(models): extract UserValidator from User",
"chore(deps): upgrade pytest to 8.x",
]
print(generate_release_notes(commits))
Solution
def generate_release_notes(commits):
    features = []
    bug_fixes = []
    other = []
    for commit in commits:
        commit = commit.strip()
        if commit.startswith("feat"):
            features.append(commit)
        elif commit.startswith("fix"):
            bug_fixes.append(commit)
        else:
            other.append(commit)
    lines = []
    if features:
        lines.append(f"Features ({len(features)}):")
        for f in features:
            lines.append(f" {f}")
    if bug_fixes:
        lines.append(f"Bug Fixes ({len(bug_fixes)}):")
        for b in bug_fixes:
            lines.append(f" {b}")
    if other:
        lines.append(f"Other ({len(other)})")
    return "\n".join(lines)
commits = [
"feat(api): add pagination to courses endpoint",
"fix(search): return empty list on empty query not 500",
"feat(auth): add JWT refresh token endpoint",
"refactor(models): extract UserValidator from User",
"chore(deps): upgrade pytest to 8.x",
]
print(generate_release_notes(commits))
Output:
Features (2):
feat(api): add pagination to courses endpoint
feat(auth): add JWT refresh token endpoint
Bug Fixes (1):
fix(search): return empty list on empty query not 500
Other (2)
How it works: The function iterates over commits and routes each to one of three buckets based on its type prefix. Features and bug fixes are user-facing changes that appear in full in the release notes. The refactor and chore entries go to "Other" with only a count — internal changes are not meaningful to users.
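The ordering (Features first, Bug Fixes second, Other last) is a presentation choice for this exercise. Real tools such as semantic-release and GitHub's release notes generator group changes in a similar way, but their section names and order depend on configuration.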
Key insight: This is exactly what automated release tooling does with your commit history. When you enforce Conventional Commits across a team, you get a changelog generator "for free" — the structure of the commit messages is already the release notes. The value of the standard compounds over time: a year of consistent Conventional Commits means a year of machine-readable project history.
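The solution routes commits with startswith, which also accepts messages that merely begin with the same letters. As a hedged sketch of a stricter alternative, the hypothetical commit_type helper below takes the text before the first colon and drops the optional scope. It does not check that the result is a known type.

```python
def commit_type(message):
    """Hypothetical helper: return the type before the optional scope and colon, or None."""
    head, sep, _ = message.strip().partition(":")
    if not sep:
        return None  # no colon, so there is no conventional header to parse
    return head.split("(", 1)[0]

for msg in ["feat(api): add pagination", "feature flags cleanup", "fixup! typo"]:
    # Compare the parsed type with the looser prefix test used in the exercise.
    print(repr(msg), "->", commit_type(msg), "| prefix test:", msg.startswith(("feat", "fix")))
```

All three messages pass the prefix test, but only the first yields a type (feat); the other two return None and would land in "Other" instead of being listed as a feature or a bug fix.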
Expected Output
Features (2):
 feat(api): add pagination to courses endpoint
 feat(auth): add JWT refresh token endpoint
Bug Fixes (1):
 fix(search): return empty list on empty query not 500
Other (2)
Hints
Hint 1: Group commits by type: feat goes to Features, fix goes to Bug Fixes, everything else to Other.
Hint 2: Read the type from the start of the message: it is the text before the optional scope and the colon.
Hint 3: The release notes should list Features first, then Bug Fixes, then a count for everything else.
Simulate git bisect by implementing a binary search that finds the first "bad" commit in a history where a test transitions from passing to failing.
def git_bisect(commits, is_good):
    """
    Simulate git bisect.
    commits: list of commit messages (oldest first)
    is_good: function(index) -> bool, True if the commit is good
    Returns (steps_taken, bad_commit_index).
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if is_good(mid):
            lo = mid + 1
        else:
            hi = mid
    return steps, lo
# 12 commits: indices 0-6 are good (test passes); index 7 is the first bad one
commits = [
"feat(models): add Order model",
"feat(services): add create_order function",
"test(services): add order creation tests",
"fix(models): handle null user_id",
"feat(api): add POST /orders endpoint",
"docs(api): document Order schema",
"refactor(services): extract price calculation",
"feat(services): optimize discount calculation", # index 7: BAD
"refactor(models): inline trivial helper",
"chore(deps): upgrade sqlalchemy",
"test(api): add integration tests for orders",
"docs(readme): update local setup instructions",
]
def is_good(index):
    # Commits 0-6 are good, 7+ are bad
    return index < 7
steps, bad_idx = git_bisect(commits, is_good)
print(f"Steps: {steps}")
print(f"Bad commit index: {bad_idx}")
print(f"Commit message: {commits[bad_idx]}")
Solution
def git_bisect(commits, is_good):
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if is_good(mid):
            lo = mid + 1
        else:
            hi = mid
    return steps, lo
commits = [
"feat(models): add Order model",
"feat(services): add create_order function",
"test(services): add order creation tests",
"fix(models): handle null user_id",
"feat(api): add POST /orders endpoint",
"docs(api): document Order schema",
"refactor(services): extract price calculation",
"feat(services): optimize discount calculation",
"refactor(models): inline trivial helper",
"chore(deps): upgrade sqlalchemy",
"test(api): add integration tests for orders",
"docs(readme): update local setup instructions",
]
def is_good(index):
    return index < 7
steps, bad_idx = git_bisect(commits, is_good)
print(f"Steps: {steps}")
print(f"Bad commit index: {bad_idx}")
print(f"Commit message: {commits[bad_idx]}")
Output:
Steps: 4
Bad commit index: 7
Commit message: feat(services): optimize discount calculation
How it works: The binary search maintains [lo, hi] as the range of commits still under investigation. At each step, it checks the midpoint:
- If is_good(mid) is True, the bad commit must be in [mid+1, hi], so lo = mid + 1.
- If is_good(mid) is False, the bad commit could be at mid or earlier, so hi = mid.
The loop terminates when lo == hi, pointing to the first bad commit.
For 12 commits, we check at most ceil(log2(12)) = 4 steps. The sequence: mid=5 (good, lo=6), mid=8 (bad, hi=8), mid=7 (bad, hi=7), mid=6 (good, lo=7). The loop ends with lo=hi=7.
Key insight: With 100 commits, manual checking takes up to 100 steps; git bisect takes at most 7 steps. The real power is automation: git bisect run pytest tests/test_services.py -x uses the test exit code to mark commits good or bad automatically. This turns a potentially day-long regression hunt into a ten-minute automated process — especially valuable when the bug was introduced days or weeks ago across dozens of commits.
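The step counts quoted above can be checked with a short sketch. The function names are hypothetical; simulated_steps repeats the same lo/hi loop as the exercise, assuming that every commit from first_bad onward fails the test.

```python
def worst_case_steps(n):
    """Upper bound on bisect steps for n commits: ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()

def simulated_steps(n, first_bad):
    """Count steps of the lo/hi loop when commits[first_bad:] are all bad."""
    lo, hi, steps = 0, n - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if mid < first_bad:  # the commit at mid is good
            lo = mid + 1
        else:                # the commit at mid is bad
            hi = mid
    return steps

for n in (12, 100, 1000):
    # Try every possible position of the first bad commit and keep the worst case.
    observed = max(simulated_steps(n, bad) for bad in range(n))
    print(n, worst_case_steps(n), observed)
```

This prints 12 4 4, 100 7 7, and 1000 10 10: the worst case over all positions of the first bad commit matches the logarithmic bound, so ten checks are enough for a thousand commits.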
Expected Output
Steps: 4
Bad commit index: 7
Commit message: feat(services): optimize discount calculation
Hints
Hint 1: git bisect uses binary search: each step halves the remaining search space.
Hint 2: The bad commit is the first one where the test function returns False.
Hint 3: Start with lo=0, hi=len(commits)-1 and repeatedly check the midpoint.
Implement a branching strategy recommender that selects the right strategy based on team characteristics.
def recommend_strategy(team_size, release_cycle, ci_speed_minutes, multiple_versions):
    """
    Recommend a Git branching strategy.
    team_size: int — number of engineers
    release_cycle: str — "continuous", "weekly", "monthly", "quarterly"
    ci_speed_minutes: float — time for full test suite to run
    multiple_versions: bool — True if multiple versions are maintained simultaneously
    """
    # Trunk-based: large team, very fast CI, no multiple maintained versions
    if team_size >= 50 and ci_speed_minutes <= 10 and not multiple_versions:
        return "trunk-based"
    # GitFlow: formal release cycle or multiple maintained versions
    if release_cycle in ("quarterly", "monthly") or multiple_versions:
        return "gitflow"
    # Feature branch: the default for most teams
    return "feature-branch"
# Small startup, continuous releases
print(recommend_strategy(4, "continuous", 8.0, False))
# Mid-size team, quarterly releases, 2 supported versions
print(recommend_strategy(15, "quarterly", 20.0, True))
# Large org, continuous, fast CI, single version
print(recommend_strategy(200, "continuous", 5.0, False))
# Small agency, weekly releases
print(recommend_strategy(8, "weekly", 12.0, False))
# Enterprise, monthly versioned releases
print(recommend_strategy(30, "monthly", 25.0, False))
Solution
def recommend_strategy(team_size, release_cycle, ci_speed_minutes, multiple_versions):
    if team_size >= 50 and ci_speed_minutes <= 10 and not multiple_versions:
        return "trunk-based"
    if release_cycle in ("quarterly", "monthly") or multiple_versions:
        return "gitflow"
    return "feature-branch"
print(recommend_strategy(4, "continuous", 8.0, False))
print(recommend_strategy(15, "quarterly", 20.0, True))
print(recommend_strategy(200, "continuous", 5.0, False))
print(recommend_strategy(8, "weekly", 12.0, False))
print(recommend_strategy(30, "monthly", 25.0, False))
Output:
feature-branch
gitflow
trunk-based
feature-branch
gitflow
How it works:
- 4-person startup, continuous: small team, simple workflow. Feature branch is appropriate; GitFlow's ceremony is unnecessary overhead.
- 15-person team, quarterly + two versions: both signals independently point to GitFlow, since a quarterly release cadence and two simultaneously supported versions call for dedicated release/ and hotfix/ branch types.
- 200-person org, continuous, 5-minute CI: meets all three trunk-based requirements (large team, fast CI, single version). Google-style development.
- 8-person agency, weekly: weekly releases with a small team do not need GitFlow. Feature branch with short-lived branches and a disciplined merge cadence works fine.
- 30-person team, monthly releases: monthly versioned releases without continuous deployment benefit from GitFlow's release/ branch for stabilization.
Key insight: The worst mistake is choosing a strategy based on what sounds impressive rather than what fits the team. GitFlow adds real ceremony that costs engineer time: creating release branches, merging hotfixes into both main and develop, and managing the develop integration branch. That cost is worth paying when you maintain multiple versions or have a formal release cycle. For a five-person team shipping code daily, it is pure overhead that slows everyone down.
Expected Output
feature-branch
gitflow
trunk-based
feature-branch
gitflow
Hints
Hint 1: Feature branch workflow suits teams of 2-10 with no formal release cycle.
Hint 2: GitFlow suits teams with versioned releases, multiple maintained versions, or quarterly release cycles.
Hint 3: Trunk-based development requires comprehensive CI/CD, fast test suites, and feature flags — typically large org scale.
Hint 4: A startup with 3 engineers releasing continuously does not need GitFlow ceremony.
Hard
Implement a pre-commit config auditor that checks whether a .pre-commit-config.yaml has the five recommended quality gates for a Python team.
def audit_precommit_config(config):
    """
    Audit a pre-commit config dict for recommended Python project hooks.
    config: parsed dict representing .pre-commit-config.yaml
    Returns a list of result strings: one "PASS: ..." or "FAIL: ..." line
    per check, followed by an overall summary line.
    """
    # Collect all hook IDs across all repos
    all_hook_ids = set()
    for repo in config.get("repos", []):
        for hook in repo.get("hooks", []):
            all_hook_ids.add(hook.get("id", ""))
    # Each check is (condition, success message, failure message)
    checks = [
        (
            any(h in all_hook_ids for h in ["ruff", "flake8"]),
            "ruff linting configured",
            "ruff linting not configured — code quality issues will reach review",
        ),
        (
            any(h in all_hook_ids for h in ["black", "ruff-format"]),
            "code formatting configured",
            "code formatting not configured — style issues will clutter reviews",
        ),
        (
            "detect-private-key" in all_hook_ids,
            "secret detection configured",
            "secret detection not configured — credentials may be committed accidentally",
        ),
        (
            "no-commit-to-branch" in all_hook_ids,
            "no-commit-to-branch configured",
            "no-commit-to-branch not configured — engineers can commit directly to main",
        ),
        (
            "mypy" in all_hook_ids,
            "type checking (mypy) configured",
            "type checking (mypy) not configured",
        ),
    ]
    results = []
    passed = 0
    for ok, success_msg, fail_msg in checks:
        if ok:
            results.append(f"PASS: {success_msg}")
            passed += 1
        else:
            results.append(f"FAIL: {fail_msg}")
    results.append(f"Overall: {passed}/{len(checks)} checks passed")
    return results
# Config missing no-commit-to-branch and mypy
config = {
"repos": [
{
"repo": "https://github.com/astral-sh/ruff-pre-commit",
"rev": "v0.3.0",
"hooks": [
{"id": "ruff", "args": ["--fix"]},
{"id": "ruff-format"},
],
},
{
"repo": "https://github.com/pre-commit/pre-commit-hooks",
"rev": "v4.5.0",
"hooks": [
{"id": "detect-private-key"},
{"id": "trailing-whitespace"},
],
},
]
}
for line in audit_precommit_config(config):
    print(line)
Solution
def audit_precommit_config(config):
    all_hook_ids = set()
    for repo in config.get("repos", []):
        for hook in repo.get("hooks", []):
            all_hook_ids.add(hook.get("id", ""))
    checks = [
        (
            any(h in all_hook_ids for h in ["ruff", "flake8"]),
            "ruff linting configured",
            "ruff linting not configured — code quality issues will reach review",
        ),
        (
            any(h in all_hook_ids for h in ["black", "ruff-format"]),
            "code formatting configured",
            "code formatting not configured — style issues will clutter reviews",
        ),
        (
            "detect-private-key" in all_hook_ids,
            "secret detection configured",
            "secret detection not configured — credentials may be committed accidentally",
        ),
        (
            "no-commit-to-branch" in all_hook_ids,
            "no-commit-to-branch configured",
            "no-commit-to-branch not configured — engineers can commit directly to main",
        ),
        (
            "mypy" in all_hook_ids,
            "type checking (mypy) configured",
            "type checking (mypy) not configured",
        ),
    ]
    results = []
    passed = 0
    for ok, success_msg, fail_msg in checks:
        if ok:
            results.append(f"PASS: {success_msg}")
            passed += 1
        else:
            results.append(f"FAIL: {fail_msg}")
    results.append(f"Overall: {passed}/{len(checks)} checks passed")
    return results
config = {
"repos": [
{
"repo": "https://github.com/astral-sh/ruff-pre-commit",
"rev": "v0.3.0",
"hooks": [
{"id": "ruff", "args": ["--fix"]},
{"id": "ruff-format"},
],
},
{
"repo": "https://github.com/pre-commit/pre-commit-hooks",
"rev": "v4.5.0",
"hooks": [
{"id": "detect-private-key"},
{"id": "trailing-whitespace"},
],
},
]
}
for line in audit_precommit_config(config):
    print(line)
Output:
PASS: ruff linting configured
PASS: code formatting configured
PASS: secret detection configured
FAIL: no-commit-to-branch not configured — engineers can commit directly to main
FAIL: type checking (mypy) not configured
Overall: 3/5 checks passed
How it works: The auditor flattens all hook IDs from the entire config into a single set, then checks for each recommended hook. The config has ruff and ruff-format (satisfying linting and formatting), and detect-private-key (satisfying secret detection), but it is missing no-commit-to-branch and mypy.
ruff-format satisfies the formatting check because the check accepts either black or ruff-format.
Key insight: The pre-commit tool's key advantage over manually reminding team members to run linters is that the configuration is version-controlled in .pre-commit-config.yaml. Every engineer who runs pre-commit install gets identical quality gates. The detect-private-key hook prevents the most catastrophic class of commit mistake. The no-commit-to-branch hook enforces the rule that main is only modified via pull requests — without it, an engineer in a hurry can bypass your entire review process with a single git push origin main.
Expected Output
PASS: ruff linting configured
PASS: code formatting configured
PASS: secret detection configured
FAIL: no-commit-to-branch not configured — engineers can commit directly to main
FAIL: type checking (mypy) not configured
Overall: 3/5 checks passed
Hints
Hint 1: Parse the YAML-like structure as a plain dict and check for hook IDs.
Hint 2: The five recommended hooks for a Python team are: ruff, a formatter (black or ruff-format), detect-private-key, no-commit-to-branch, and mypy.
Hint 3: Check hook IDs nested under repos -> hooks -> id.
Implement a commit history analyser that computes adoption metrics for Conventional Commits across a sprint's commit log.
from collections import Counter
import re
VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
CONVENTIONAL_PATTERN = re.compile(
r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: (.+)$'
)
def analyse_commit_history(commits):
    """
    Analyse a list of commit messages.
    Returns a dict with metrics.
    """
    type_counts = Counter()
    non_conventional = []
    subject_lengths = []
    for msg in commits:
        msg = msg.strip()
        m = CONVENTIONAL_PATTERN.match(msg)
        if m:
            commit_type = m.group(1)
            subject = m.group(3)
            type_counts[commit_type] += 1
            # Note: this records the length of the whole message (type and
            # scope included), although the report labels it "Longest subject".
            subject_lengths.append((len(msg), msg))
        else:
            non_conventional.append(msg)
    total = len(commits)
    conventional_count = total - len(non_conventional)
    pct = round(conventional_count / total * 100, 1) if total else 0.0
    longest_len, longest_msg = max(subject_lengths) if subject_lengths else (0, "")
    return {
        "total": total,
        "conventional": conventional_count,
        "pct": pct,
        "types": dict(type_counts),
        "non_conventional": non_conventional,
        "longest_chars": longest_len,
        "longest_msg": longest_msg,
    }
commits = [
"feat(auth): add JWT refresh token endpoint",
"fix(api): handle empty query string in search",
"WIP",
"feat(models): add discount_percent field to Order",
"test(services): add parametrized tests for discounts",
"refactor(services): extract OrderValidator from services",
"fix(auth): return 401 not 500 for expired tokens",
"chore(deps): upgrade sqlalchemy and alembic to latest",
"fixed the thing",
"feat(reports): add CSV export function",
]
r = analyse_commit_history(commits)
print(f"Total commits: {r['total']}")
print(f"Conventional: {r['conventional']} ({r['pct']}%)")
print(f"Types: {r['types']}")
print(f"Non-conventional: {r['non_conventional']}")
print(f"Longest subject: {r['longest_chars']} chars — {r['longest_msg']}")Solution
from collections import Counter
import re

VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
CONVENTIONAL_PATTERN = re.compile(
    r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: (.+)$'
)

def analyse_commit_history(commits):
    type_counts = Counter()
    non_conventional = []
    subject_lengths = []
    for msg in commits:
        msg = msg.strip()
        m = CONVENTIONAL_PATTERN.match(msg)
        if m:
            commit_type = m.group(1)
            type_counts[commit_type] += 1
            # Length of the whole subject line, type and scope included
            subject_lengths.append((len(msg), msg))
        else:
            non_conventional.append(msg)
    total = len(commits)
    conventional_count = total - len(non_conventional)
    pct = round(conventional_count / total * 100, 1) if total else 0.0
    longest_len, longest_msg = max(subject_lengths) if subject_lengths else (0, "")
    return {
        "total": total,
        "conventional": conventional_count,
        "pct": pct,
        "types": dict(type_counts),
        "non_conventional": non_conventional,
        "longest_chars": longest_len,
        "longest_msg": longest_msg,
    }

commits = [
    "feat(auth): add JWT refresh token endpoint",
    "fix(api): handle empty query string in search",
    "WIP",
    "feat(models): add discount_percent field to Order",
    "test(services): add parametrized tests for discounts",
    "refactor(services): extract OrderValidator from services",
    "fix(auth): return 401 not 500 for expired tokens",
    "chore(deps): upgrade sqlalchemy and alembic to latest",
    "fixed the thing",
    "feat(reports): add CSV export function",
]

r = analyse_commit_history(commits)
print(f"Total commits: {r['total']}")
print(f"Conventional: {r['conventional']} ({r['pct']}%)")
print(f"Types: {r['types']}")
print(f"Non-conventional: {r['non_conventional']}")
print(f"Longest subject: {r['longest_chars']} chars — {r['longest_msg']}")
Output:
Total commits: 10
Conventional: 8 (80.0%)
Types: {'feat': 3, 'fix': 2, 'test': 1, 'refactor': 1, 'chore': 1}
Non-conventional: ['WIP', 'fixed the thing']
Longest subject: 56 chars — refactor(services): extract OrderValidator from services
How it works: The regex captures the type, optional scope, and description from each commit. Non-matching commits go into the non_conventional list. The Counter accumulates how many commits of each type appeared, and the resulting dict keeps the types in the order they were first seen. The longest subject is found by comparing full message lengths, so the type and scope prefix count toward the total.
The longest message is "refactor(services): extract OrderValidator from services" at 56 characters, slightly over the 50-character guideline. "chore(deps): upgrade sqlalchemy and alembic to latest" (53) and "test(services): add parametrized tests for discounts" (52) are over it as well. This is a common real-world tradeoff: long scope, class, or library names sometimes push the subject over 50 characters.
Key insight: Running this analyser on a team's commit history at the end of a sprint reveals adoption patterns. 80% Conventional Commits with WIP and "fixed the thing" as outliers is a typical early-adoption profile. The non-conventional commits tend to cluster around quick late-night fixes or the same few engineers who have not internalized the standard yet. Sharing these metrics in retros, rather than policing individual commits, is usually a more effective way to improve team-wide adoption.
Expected Output
Total commits: 10
Conventional: 8 (80.0%)
Types: {'feat': 3, 'fix': 2, 'test': 1, 'refactor': 1, 'chore': 1}
Non-conventional: ['WIP', 'fixed the thing']
Longest subject: 56 chars — refactor(services): extract OrderValidator from services
Hints
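Hint 1: Iterate through commits, take the prefix before the first colon with split(":")[0], strip any (scope) suffix, and check the result against the known types.
Hint 2: Track both the matching and non-matching commits separately.
Hint 3: A conventional subject has the form type(scope): description. The reference solution measures the length of the whole line, prefix included.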
Hint 4: Count each type using a Counter or a defaultdict.
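To see what the three capture groups hold, here is a small sketch that uses the same pattern shape as the exercise. The parse helper and the sample messages are illustrative assumptions, and the types are sorted only so that the compiled pattern is deterministic.

```python
import re

VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
PATTERN = re.compile(r'^(' + '|'.join(sorted(VALID_TYPES)) + r')(\([^)]+\))?: (.+)$')

def parse(msg):
    """Hypothetical helper: return (type, scope, description), or None if no match."""
    m = PATTERN.match(msg.strip())
    if m is None:
        return None
    # group(2) includes the parentheses, or is None when no scope was given
    scope = m.group(2)[1:-1] if m.group(2) else None
    return m.group(1), scope, m.group(3)

msg = "feat(auth): add JWT refresh token endpoint"
commit_type, scope, description = parse(msg)
print(commit_type, scope)            # feat auth
print(len(msg), len(description))    # 42 30
print(parse("docs: fix typo"))       # ('docs', None, 'fix typo')
print(parse("WIP"))                  # None
```

The two lengths differ by the 12-character "feat(auth): " prefix, which is why it matters whether a length check measures the whole line or only the description.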
Build a Git workflow simulator that models the complete feature branch lifecycle: create branch, commit, validate PR readiness, merge to main, and clean up.
import re

VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
CONVENTIONAL = re.compile(r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: .+$')

class GitRepo:
    def __init__(self):
        self.branches = {"main": ["chore(repo): initial commit"]}
        self.current = "main"

    def checkout(self, branch, create=False):
        if create:
            if branch in self.branches:
                raise ValueError(f"Branch {branch} already exists")
            # The new branch starts with a copy of the current branch's history
            self.branches[branch] = list(self.branches[self.current])
            print(f"Branch created: {branch}")
        elif branch not in self.branches:
            raise ValueError(f"Branch {branch} does not exist")
        self.current = branch

    def commit(self, message):
        if not CONVENTIONAL.match(message):
            raise ValueError(f"Non-conventional commit rejected: {message}")
        self.branches[self.current].append(message)

    def pr_ready(self, feature_branch, base="main"):
        """A branch is PR-ready if it has new commits and they are all conventional."""
        base_commits = set(self.branches[base])
        feature_commits = self.branches[feature_branch]
        new_commits = [c for c in feature_commits if c not in base_commits]
        return bool(new_commits) and all(CONVENTIONAL.match(c) for c in new_commits)

    def merge(self, feature_branch, base="main"):
        """Merge feature branch commits into base (fast-forward-style for simulation)."""
        base_commits = set(self.branches[base])
        new_commits = [c for c in self.branches[feature_branch] if c not in base_commits]
        if not new_commits:
            raise ValueError("Nothing to merge")
        self.branches[base].extend(new_commits)
        return new_commits

    def delete_branch(self, branch):
        if branch == "main":
            raise ValueError("Cannot delete main")
        del self.branches[branch]
        if self.current == branch:
            self.current = "main"

repo = GitRepo()
# Start a feature
repo.checkout("main")
repo.checkout("feature/add-search", create=True)
# Make three focused commits
repo.commit("feat(api): add full-text search endpoint for courses")
repo.commit("test(api): add parametrized tests for search edge cases")
repo.commit("docs(api): document search query parameters")
# Count commits on the feature branch that are new vs main
main_set = set(repo.branches["main"])
new = [c for c in repo.branches["feature/add-search"] if c not in main_set]
print(f"Commits on feature branch: {len(new)}")
# Check PR readiness
ready = repo.pr_ready("feature/add-search")
print(f"PR ready: {ready}")
# Merge and validate
merged = repo.merge("feature/add-search")
print(f"Merge validated: {len(merged) == 3}")
# Clean up the feature branch
repo.delete_branch("feature/add-search")
print(f"Post-merge cleanup: {'feature/add-search' not in repo.branches}")
# Check main now has all commits
print(f"Final main commits: {len(repo.branches['main'])}")
Solution
import re

VALID_TYPES = {"feat", "fix", "refactor", "test", "docs", "style", "chore", "perf", "ci"}
CONVENTIONAL = re.compile(r'^(' + '|'.join(VALID_TYPES) + r')(\([^)]+\))?: .+$')

class GitRepo:
    def __init__(self):
        self.branches = {"main": ["chore(repo): initial commit"]}
        self.current = "main"

    def checkout(self, branch, create=False):
        if create:
            if branch in self.branches:
                raise ValueError(f"Branch {branch} already exists")
            # The new branch starts with a copy of the current branch's history
            self.branches[branch] = list(self.branches[self.current])
            print(f"Branch created: {branch}")
        elif branch not in self.branches:
            raise ValueError(f"Branch {branch} does not exist")
        self.current = branch

    def commit(self, message):
        if not CONVENTIONAL.match(message):
            raise ValueError(f"Non-conventional commit rejected: {message}")
        self.branches[self.current].append(message)

    def pr_ready(self, feature_branch, base="main"):
        base_commits = set(self.branches[base])
        feature_commits = self.branches[feature_branch]
        new_commits = [c for c in feature_commits if c not in base_commits]
        # Ready only if there is new work and every new commit is conventional
        return bool(new_commits) and all(CONVENTIONAL.match(c) for c in new_commits)

    def merge(self, feature_branch, base="main"):
        base_commits = set(self.branches[base])
        new_commits = [c for c in self.branches[feature_branch] if c not in base_commits]
        if not new_commits:
            raise ValueError("Nothing to merge")
        self.branches[base].extend(new_commits)
        return new_commits

    def delete_branch(self, branch):
        if branch == "main":
            raise ValueError("Cannot delete main")
        del self.branches[branch]
        if self.current == branch:
            self.current = "main"

repo = GitRepo()
repo.checkout("main")
repo.checkout("feature/add-search", create=True)
repo.commit("feat(api): add full-text search endpoint for courses")
repo.commit("test(api): add parametrized tests for search edge cases")
repo.commit("docs(api): document search query parameters")
main_set = set(repo.branches["main"])
new = [c for c in repo.branches["feature/add-search"] if c not in main_set]
print(f"Commits on feature branch: {len(new)}")
ready = repo.pr_ready("feature/add-search")
print(f"PR ready: {ready}")
merged = repo.merge("feature/add-search")
print(f"Merge validated: {len(merged) == 3}")
repo.delete_branch("feature/add-search")
print(f"Post-merge cleanup: {'feature/add-search' not in repo.branches}")
print(f"Final main commits: {len(repo.branches['main'])}")
Output:
Branch created: feature/add-search
Commits on feature branch: 3
PR ready: True
Merge validated: True
Post-merge cleanup: True
Final main commits: 4
How it works:
- checkout("feature/add-search", create=True) copies main's commit list to the new branch, so the branch starts from main's history much as it would after git checkout -b, and prints the confirmation.
- Three commits are added to the feature branch. Each is validated against the Conventional Commits regex before being appended; a non-conventional message would raise ValueError and block the commit.
- pr_ready computes the set difference: commits in the feature branch that are not in main. Three new commits means the branch has work to merge.
- merge appends the three new commits to main one by one, simulating a fast-forward merge.
- delete_branch removes the feature branch from the repo model. This is the post-merge cleanup step that keeps the branch list clean.
- Final count: main started with 1 commit and gained 3 from the merge, for 4 in total.
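One simplification worth noting: the set difference identifies commits by their message text, whereas Git identifies them by hash. The sketch below illustrates the gap under stated assumptions. The commit_id and append_commit helpers are hypothetical, and the ID is just a hash of the parent ID plus the message; real Git hashes considerably more (tree, author, timestamps).

```python
import hashlib

def commit_id(parent_id, message):
    """Hypothetical commit ID: a hash of the parent ID and the message."""
    return hashlib.sha1(f"{parent_id}\n{message}".encode()).hexdigest()

def append_commit(history, message):
    """Return a new history list with an (id, message) pair appended."""
    parent_id = history[-1][0] if history else ""
    return history + [(commit_id(parent_id, message), message)]

main = append_commit([], "chore(repo): initial commit")
main = append_commit(main, "fix(api): handle timeout")

# The feature branch adds a commit whose message already exists on main.
feature = append_commit(main, "fix(api): handle timeout")

main_messages = {message for _, message in main}
new_by_message = [m for _, m in feature if m not in main_messages]

main_ids = {cid for cid, _ in main}
new_by_id = [m for cid, m in feature if cid not in main_ids]

print(len(new_by_message))  # 0: the repeated message looks already merged
print(len(new_by_id))       # 1: the commit is new even though its text is not
```

For the exercise's data every message is unique, so comparing by message gives the same answer as comparing by ID.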
Key insight: This simulation captures the core habits of the feature branch workflow: work reaches main through a merge, branches are short-lived and deleted after merging, and every commit must be conventional. In a real Git workflow you would typically run git fetch followed by git rebase origin/main before merging, so that the feature branch includes anything merged to main since the branch was created. This keeps the commit history linear, which makes git bisect easier to use.
Expected Output
Branch created: feature/add-search
Commits on feature branch: 3
PR ready: True
Merge validated: True
Post-merge cleanup: True
Final main commits: 4
Hints
Hint 1: Model git state as a dict with branches mapping to lists of commit messages.
Hint 2: A PR is ready when the branch has at least one commit and all commits are conventional.
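Hint 3: Merging appends the feature branch's new commits to main; deleting the feature branch afterwards is a separate cleanup step.
Hint 4: As an extension, validate that the feature branch is up-to-date with main before merging. The reference solution does not perform this check.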
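The up-to-date check from Hint 4 is not part of the solution, but it could look roughly like this in the same list-of-messages model. Both is_up_to_date and rebase are hypothetical helpers written for this sketch, and the rebase here only replays messages; it does not model conflicts.

```python
def is_up_to_date(feature, base):
    """True if the feature history begins with every commit on base, in order."""
    return feature[:len(base)] == base

def rebase(feature, base):
    """Replay the feature-only commits on top of base (list-model sketch)."""
    # The length of the shared prefix marks where the branches diverged.
    fork = 0
    for a, b in zip(feature, base):
        if a != b:
            break
        fork += 1
    return base + feature[fork:]

main = ["chore(repo): initial commit"]
feature = main + ["feat(api): add search endpoint"]
# main moves on after the feature branch was created
main = main + ["fix(api): handle empty query"]

print(is_up_to_date(feature, main))   # False
feature = rebase(feature, main)
print(is_up_to_date(feature, main))   # True
print(feature)
```

After the rebase, the feature history is main's history followed by the feature commit, so a merge would again be a plain append.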
