Python Control Flow Anti-Patterns: Practice Problems & Exercises
Practice: Control Flow Anti-Patterns
← Back to lessonEasy
Analyze function source code to detect lines that appear after a return statement in the same block. Print how many dead lines each function contains.
import inspect
import textwrap
# --- Functions to analyze ---
def function_1(x):
if x > 0:
return "positive"
print("This never runs") # dead
x = x + 1 # dead
return "non-positive"
def function_2(x):
return x * 2
result = x * 3 # dead
return result
def function_3(x):
if x > 0:
return "positive"
else:
return "non-positive"
# --- Dead code detector ---
functions = [
("Function 1", function_1),
("Function 2", function_2),
("Function 3", function_3),
]
total_dead = 0
for name, func in functions:
source = textwrap.dedent(inspect.getsource(func))
lines = source.split("\n")
dead_count = 0
in_dead_zone = False
current_indent = 0
for line in lines:
stripped = line.strip()
if not stripped or stripped.startswith("#") or stripped.startswith("def "):
if in_dead_zone and stripped and not stripped.startswith("#"):
# Check if indentation changed (left a block)
indent = len(line) - len(line.lstrip())
if indent <= current_indent:
in_dead_zone = False
continue
indent = len(line) - len(line.lstrip())
if in_dead_zone and indent > current_indent:
dead_count += 1
continue
elif in_dead_zone and indent <= current_indent:
in_dead_zone = False
if stripped.startswith("return ") or stripped == "return":
in_dead_zone = True
current_indent = indent - 1
if dead_count > 0:
print(f"{name}: Found {dead_count} dead line{'s' if dead_count != 1 else ''} after return")
else:
print(f"{name}: No dead code — all paths are reachable")
total_dead += dead_count
print(f"Total dead lines found: {total_dead}")Solution
Function 1: Found 2 dead lines after return
Function 2: Found 1 dead line after return
Function 3: No dead code — all paths are reachable
Total dead lines found: 3
Key insights:
function_1: Theprintandx = x + 1afterreturn "positive"inside theifblock are dead code. They sit at the same indentation level after areturnand can never execute. The finalreturn "non-positive"is NOT dead because it is outside theifblock.function_2: Everything afterreturn x * 2is dead. Once the function returns, nothing else in that block runs.function_3: Both branches return, but there is no code after theif/elsestructure, so there is no dead code. Every line is reachable through some path.
Why dead code matters:
- Dead code confuses readers — they waste time reasoning about code that can never run.
- It can hide bugs — you might think a side effect is happening when it is not.
- Linters like
pylintandflake8can detect unreachable code automatically.
Expected Output
Function 1: Found 2 dead lines after return
Function 2: Found 1 dead line after return
Function 3: No dead code — all paths are reachable
Total dead lines found: 3Hints
Hint 1: Any code that appears after a `return` statement in the same block can never execute — it is dead code.
Hint 2: Be careful with returns inside if/else blocks — code after the if/else is NOT dead if only one branch returns.
Build a checker that scans lines of code for redundant boolean comparisons (== True, == False, is True, is not False, != True) and suggests the clean alternative.
import re
code_lines = [
"if is_valid == True:",
"if is_valid == False:",
"if is_valid != True:",
"if is_valid:",
"if not is_valid:",
"if flag is True:",
"if flag is not False:",
"if result == True:",
]
# Patterns that indicate redundant boolean comparisons
redundant_patterns = [
(r"if\s+(\w+)\s*==\s*True:", "if {var}:"),
(r"if\s+(\w+)\s*==\s*False:", "if not {var}:"),
(r"if\s+(\w+)\s*!=\s*True:", "if not {var}:"),
(r"if\s+(\w+)\s*!=\s*False:", "if {var}:"),
(r"if\s+(\w+)\s+is\s+True:", "if {var}:"),
(r"if\s+(\w+)\s+is\s+False:", "if not {var}:"),
(r"if\s+(\w+)\s+is\s+not\s+True:", "if not {var}:"),
(r"if\s+(\w+)\s+is\s+not\s+False:", "if {var}:"),
]
redundant_count = 0
for line in code_lines:
found_redundancy = False
for pattern, suggestion_template in redundant_patterns:
match = re.match(pattern, line.strip())
if match:
var_name = match.group(1)
suggestion = suggestion_template.format(var=var_name)
print(f"Line: '{line}'".ljust(43) + f"-> REDUNDANT: use '{suggestion}'")
redundant_count += 1
found_redundancy = True
break
if not found_redundancy:
print(f"Line: '{line}'".ljust(43) + "-> CLEAN: no redundancy")
print(f"\nRedundant comparisons found: {redundant_count} out of {len(code_lines)}")Solution
Line: 'if is_valid == True:' -> REDUNDANT: use 'if is_valid:'
Line: 'if is_valid == False:' -> REDUNDANT: use 'if not is_valid:'
Line: 'if is_valid != True:' -> REDUNDANT: use 'if not is_valid:'
Line: 'if is_valid:' -> CLEAN: no redundancy
Line: 'if not is_valid:' -> CLEAN: no redundancy
Line: 'if flag is True:' -> REDUNDANT: use 'if flag:'
Line: 'if flag is not False:' -> REDUNDANT: use 'if flag:'
Line: 'if result == True:' -> REDUNDANT: use 'if result:'
Redundant comparisons found: 5 out of 8
Why this matters (PEP 8):
- PEP 8 explicitly says: "Don't compare boolean values to True or False using
==." if x == True:is not just verbose — it can actually behave differently thanif x:for non-boolean truthy values. For example,1 == TrueisTrue, but2 == TrueisFalseeven thoughbool(2)isTrue.- The clean forms (
if x:andif not x:) are idiomatic Python and work correctly with all truthy/falsy values. - One exception:
if x is True:can be intentional when you need to distinguishTruefrom other truthy values (rare in practice).
Expected Output
Line: 'if is_valid == True:' -> REDUNDANT: use 'if is_valid:'
Line: 'if is_valid == False:' -> REDUNDANT: use 'if not is_valid:'
Line: 'if is_valid != True:' -> REDUNDANT: use 'if not is_valid:'
Line: 'if is_valid:' -> CLEAN: no redundancy
Line: 'if not is_valid:' -> CLEAN: no redundancy
Line: 'if flag is True:' -> REDUNDANT: use 'if flag:'
Line: 'if flag is not False:' -> REDUNDANT: use 'if flag:'
Line: 'if result == True:' -> REDUNDANT: use 'if result:'
Redundant comparisons found: 5 out of 8Hints
Hint 1: Comparing a boolean to True or False is always redundant. `if x == True` is the same as `if x`. `if x == False` is the same as `if not x`.
Hint 2: Both `==` and `is` comparisons with True/False are considered redundant in Python style guides (PEP 8).
Refactor three functions that use the if x: return True else: return False anti-pattern into clean one-line returns.
# --- Anti-pattern versions (DO NOT write code like this) ---
def is_even_bad(n):
if n % 2 == 0:
return True
else:
return False
def is_positive_bad(n):
if n > 0:
return True
else:
return False
def is_adult_bad(age):
if age >= 18:
return True
else:
return False
# --- Clean versions (refactored) ---
def is_even_clean(n):
return n % 2 == 0
def is_positive_clean(n):
return n > 0
def is_adult_clean(age):
return age >= 18
# --- Verify both versions produce identical results ---
test_cases_even = [0, 1, 2, 3, 4, 100, -2]
test_cases_pos = [-5, -1, 0, 1, 5, 100]
test_cases_age = [0, 10, 17, 18, 19, 65]
print("--- Anti-pattern versions ---")
print(f"is_even_bad(4) = {is_even_bad(4)}")
print(f"is_positive_bad(-3) = {is_positive_bad(-3)}")
print(f"is_adult_bad(21) = {is_adult_bad(21)}")
print("--- Clean versions ---")
print(f"is_even_clean(4) = {is_even_clean(4)}")
print(f"is_positive_clean(-3) = {is_positive_clean(-3)}")
print(f"is_adult_clean(21) = {is_adult_clean(21)}")
# Exhaustive check
all_match = (
all(is_even_bad(x) == is_even_clean(x) for x in test_cases_even)
and all(is_positive_bad(x) == is_positive_clean(x) for x in test_cases_pos)
and all(is_adult_bad(x) == is_adult_clean(x) for x in test_cases_age)
)
print(f"--- All outputs match: {all_match}")Solution
--- Anti-pattern versions ---
is_even_bad(4) = True
is_positive_bad(-3) = False
is_adult_bad(21) = True
--- Clean versions ---
is_even_clean(4) = True
is_positive_clean(-3) = False
is_adult_clean(21) = True
--- All outputs match: True
The transformation rule is simple:
# Anti-pattern (4 lines)
if condition:
return True
else:
return False
# Clean (1 line)
return condition
Why the anti-pattern is bad:
- It adds 3 unnecessary lines of code with zero information gain.
- It signals that the author does not understand that comparison operators already return booleans.
- It makes code reviews slower — reviewers must mentally verify that the True/False are not swapped.
- Every linter (pylint R1703, flake8-simplify SIM103) flags this pattern.
When you might need return bool(expression):
- If the expression returns a truthy/falsy value that is not already a boolean:
return bool(my_list)instead ofreturn my_listwhen you wantTrue/Falseand not the list itself.
Expected Output
--- Anti-pattern versions ---
is_even_bad(4) = True
is_positive_bad(-3) = False
is_adult_bad(21) = True
--- Clean versions ---
is_even_clean(4) = True
is_positive_clean(-3) = False
is_adult_clean(21) = True
--- All outputs match: TrueHints
Hint 1: The pattern `if condition: return True else: return False` is always equivalent to `return condition` (or `return bool(condition)` if the expression is not already boolean).
Hint 2: The comparison operators (==, >, <, >=, <=, !=) already return boolean values, so wrapping them in if/else is completely unnecessary.
Medium
Refactor a function that has 7 scattered return statements into a cleaner version using guard clauses and structured logic. Verify both versions produce identical output.
import inspect
# --- Anti-pattern: returns scattered everywhere ---
def categorize_bad(name, score, is_active):
if not name:
return "Invalid"
if score < 0:
return "Invalid"
if not is_active:
return "Inactive"
if score == 0:
return "Free"
if score < 50:
return "Basic"
if score < 100:
return "Premium"
return "VIP"
# --- Clean version: guard clauses + structured logic ---
def categorize_clean(name, score, is_active):
# Guard clauses: validate inputs first
if not name or score < 0:
return "Invalid"
if not is_active:
return "Inactive"
# Main logic: one clear decision structure
if score == 0:
return "Free"
elif score < 50:
return "Basic"
elif score < 100:
return "Premium"
else:
return "VIP"
# --- Test both versions ---
test_cases = [
("", 0, False),
("A", -5, True),
("A", 0, True),
("A", 10, True),
("A", 50, True),
("A", 150, True),
("A", 50, False),
]
print("--- Testing both versions ---")
all_match = True
for name, score, active in test_cases:
bad_result = categorize_bad(name, score, active)
clean_result = categorize_clean(name, score, active)
match = bad_result == clean_result
all_match = all_match and match
print(f"categorize('{name}', {score}, {active})".ljust(35)
+ f"bad={bad_result:8s} clean={clean_result}")
print(f"All outputs match: {all_match}")
# Count return statements
bad_returns = inspect.getsource(categorize_bad).count("return ")
clean_returns = inspect.getsource(categorize_clean).count("return ")
print(f"Return paths in bad version: {bad_returns}")
print(f"Return paths in clean version: {clean_returns}")Solution
--- Testing both versions ---
categorize('', 0, False) bad=Invalid clean=Invalid
categorize('A', -5, True) bad=Invalid clean=Invalid
categorize('A', 0, True) bad=Free clean=Free
categorize('A', 10, True) bad=Basic clean=Basic
categorize('A', 50, True) bad=Premium clean=Premium
categorize('A', 150, True) bad=VIP clean=VIP
categorize('A', 50, False) bad=Inactive clean=Inactive
All outputs match: True
Return paths in bad version: 7
Return paths in clean version: 5
The refactoring strategy:
- Merge related guard clauses:
if not nameandif score < 0both return "Invalid" so they belong in a single condition. - Group guards at the top: All input validation happens in the first 2-4 lines. After that, you know the input is valid.
- Use
elifchains for the main logic: The score categorization is a single decision — express it as oneif/elif/elseblock, not as separate disconnectedifstatements.
Why this matters:
- The bad version forces readers to mentally track 7 separate exit points. At each
if, you wonder "will there be another condition below that contradicts this?" - The clean version separates concerns: validation (guard clauses) vs. business logic (elif chain).
- When a new tier is added later, the clean version has one obvious place to insert it.
Expected Output
--- Testing both versions ---
categorize('', 0, False) bad=Invalid clean=Invalid
categorize('A', -5, True) bad=Invalid clean=Invalid
categorize('A', 0, True) bad=Free clean=Free
categorize('A', 10, True) bad=Basic clean=Basic
categorize('A', 50, True) bad=Premium clean=Premium
categorize('A', 150, True) bad=VIP clean=VIP
categorize('A', 50, False) bad=Inactive clean=Inactive
All outputs match: True
Return paths in bad version: 7
Return paths in clean version: 5Hints
Hint 1: Group related conditions together. Validate inputs first with guard clauses (early returns), then handle the main logic with a single clear structure.
Hint 2: A function with 7+ return statements is hard to reason about. Aim for guard clauses at the top (2-3 returns) and a single decision structure for the main logic.
Demonstrate boolean blindness by showing an unreadable function call, then fix it twice: once with named constants and once with enums.
from enum import Enum
# --- Version 1: Boolean blindness (anti-pattern) ---
def process_shipment_blind(is_express, has_insurance, needs_signature):
parts = []
parts.append("express" if is_express else "standard")
parts.append("insured" if has_insurance else "no insurance")
parts.append("with signature" if needs_signature else "no signature")
return f"Sent via {parts[0]}, {parts[1]}, {parts[2]}"
# What does True, False, True mean? You have no idea without reading the signature.
result_blind = process_shipment_blind(True, False, True)
print("--- Boolean blindness version (hard to read) ---")
print(f"process(True, False, True) = {result_blind}")
# --- Version 2: Named constants (better) ---
EXPRESS = True
STANDARD = False
INSURED = True
NO_INSURANCE = False
REQUIRE_SIGNATURE = True
NO_SIGNATURE = False
def process_shipment_named(is_express, has_insurance, needs_signature):
parts = []
parts.append("express" if is_express else "standard")
parts.append("insured" if has_insurance else "no insurance")
parts.append("with signature" if needs_signature else "no signature")
return f"Sent via {parts[0]}, {parts[1]}, {parts[2]}"
result_named = process_shipment_named(EXPRESS, NO_INSURANCE, REQUIRE_SIGNATURE)
print("--- Named constants version (clear intent) ---")
print(f"process(EXPRESS, no insurance, SIGNATURE) = {result_named}")
# --- Version 3: Enums (best practice) ---
class ShipSpeed(Enum):
EXPRESS = "express"
STANDARD = "standard"
class Insurance(Enum):
INSURED = "insured"
NO_INSURANCE = "no insurance"
class Signature(Enum):
REQUIRE_SIGNATURE = "with signature"
NO_SIGNATURE = "no signature"
def process_shipment_enum(speed, insurance, signature):
return f"Sent via {speed.value}, {insurance.value}, {signature.value}"
result_enum = process_shipment_enum(
ShipSpeed.EXPRESS,
Insurance.NO_INSURANCE,
Signature.REQUIRE_SIGNATURE,
)
print("--- Enum version (best practice) ---")
print(f"process(EXPRESS, NO_INSURANCE, REQUIRE_SIGNATURE) = {result_enum}")
print(f"All three produce same result: {result_blind == result_named == result_enum}")Solution
--- Boolean blindness version (hard to read) ---
process(True, False, True) = Sent via express, no insurance, with signature
--- Named constants version (clear intent) ---
process(EXPRESS, no insurance, SIGNATURE) = Sent via express, no insurance, with signature
--- Enum version (best practice) ---
process(EXPRESS, NO_INSURANCE, REQUIRE_SIGNATURE) = Sent via express, no insurance, with signature
All three produce same result: True
Boolean blindness occurs when a function accepts multiple boolean parameters. At the call site, process(True, False, True) is meaningless without reading the function definition.
Three levels of fix:
| Approach | Readability | Type Safety | Refactor Cost |
|---|---|---|---|
| Raw booleans | Poor | None | N/A |
| Named constants | Good | None (still bools) | Low |
| Enums | Excellent | Full (type errors caught) | Medium |
Why enums are best:
- They are self-documenting at the call site.
- You cannot accidentally pass
ShipSpeed.EXPRESSwhereInsuranceis expected (type checkers catch this). - Adding a new option (e.g.,
ShipSpeed.OVERNIGHT) does not require changing the type from bool to something else. - They prevent the "boolean trap" where someone swaps argument order and the code silently does the wrong thing.
Expected Output
--- Boolean blindness version (hard to read) ---
process(True, False, True) = Sent via express, no insurance, with signature
--- Named constants version (clear intent) ---
process(EXPRESS, no insurance, SIGNATURE) = Sent via express, no insurance, with signature
--- Enum version (best practice) ---
process(EXPRESS, NO_INSURANCE, REQUIRE_SIGNATURE) = Sent via express, no insurance, with signature
All three produce same result: TrueHints
Hint 1: When a function takes multiple boolean parameters, callers like `process(True, False, True)` are unreadable — you cannot tell what each boolean means without looking at the function signature.
Hint 2: Replace boolean parameters with named constants or enums so the call site reads like prose: `process(EXPRESS, NO_INSURANCE, REQUIRE_SIGNATURE)`.
Refactor a validation function that uses flag variables into two clean versions: one with early returns (fail-fast) and one that collects all errors without boolean flags.
# --- Anti-pattern: flag variables ---
def validate_bad(name):
errors = []
is_valid = True
if not name:
errors.append("Name is required")
is_valid = False
if is_valid and len(name) < 3:
errors.append("Name must be at least 3 chars")
is_valid = False
if is_valid and not name.isalnum():
errors.append("Name must be alphanumeric")
is_valid = False
return tuple(errors)
# --- Clean version 1: Early return (fail-fast) ---
def validate_clean(name):
if not name:
return "Name is required"
if len(name) < 3:
return "Name must be at least 3 chars"
if not name.isalnum():
return "Name must be alphanumeric"
return None # Valid
# --- Clean version 2: Collect all errors (no flag) ---
def validate_all(name):
errors = []
if not name:
errors.append("Name is required")
return errors # No point checking further
if len(name) < 3:
errors.append("Name must be at least 3 chars")
if not name.isalnum():
errors.append("Name must be alphanumeric")
return errors
# --- Test all versions ---
test_inputs = ["", "ab", "ab!", "abc"]
print("--- Flag variable version ---")
for inp in test_inputs:
print(f'validate_bad("{inp}") = {validate_bad(inp)}')
print("--- Early return version ---")
for inp in test_inputs:
result = validate_clean(inp)
suffix = " (valid!)" if result is None else ""
print(f'validate_clean("{inp}") = {result}{suffix}')
print("--- Collect-all-errors version ---")
for inp in test_inputs:
result = validate_all(inp)
suffix = " (valid!)" if not result else ""
print(f'validate_all("{inp}") = {result}{suffix}')Solution
--- Flag variable version ---
validate_bad("") = ('Name is required',)
validate_bad("ab") = ('Name must be at least 3 chars',)
validate_bad("ab!") = ('Name must be at least 3 chars', 'Name must be alphanumeric')
validate_bad("abc") = ()
--- Early return version ---
validate_clean("") = Name is required
validate_clean("ab") = Name must be at least 3 chars
validate_clean("ab!") = Name must be at least 3 chars
validate_clean("abc") = None (valid!)
--- Collect-all-errors version ---
validate_all("") = ['Name is required']
validate_all("ab") = ['Name must be at least 3 chars']
validate_all("ab!") = ['Name must be at least 3 chars', 'Name must be alphanumeric']
validate_all("abc") = [] (valid!)
Why flag variables are an anti-pattern:
- The
is_validflag is redundant — it duplicates information already encoded in theerrorslist (len(errors) == 0means valid). - Each condition must check
if is_valid and ...creating coupling between unrelated validations. - It is easy to forget to set the flag, leading to silent bugs.
When to use each clean pattern:
- Early return: Best for fail-fast validation where you want the first error only. Simpler, faster, and the most common pattern.
- Collect all errors: Best for form validation where you want to show all problems at once. Use the list itself as the "flag" — empty list means valid.
The key insight: If you find yourself writing is_valid = True followed by conditionally setting it to False, you almost certainly do not need the flag.
Expected Output
--- Flag variable version ---
validate_bad("") = ('Name is required',)
validate_bad("ab") = ('Name must be at least 3 chars',)
validate_bad("ab!") = ('Name must be at least 3 chars', 'Name must be alphanumeric')
validate_bad("abc") = ()
--- Early return version ---
validate_clean("") = Name is required
validate_clean("ab") = Name must be at least 3 chars
validate_clean("ab!") = Name must be at least 3 chars
validate_clean("abc") = None (valid!)
--- Collect-all-errors version ---
validate_all("") = ['Name is required']
validate_all("ab") = ['Name must be at least 3 chars']
validate_all("ab!") = ['Name must be at least 3 chars', 'Name must be alphanumeric']
validate_all("abc") = [] (valid!)Hints
Hint 1: Flag variables like `is_valid = True` that get set to False in various places make control flow hard to follow. Each branch mutates shared state instead of returning immediately.
Hint 2: If you only need the FIRST error, use early return. If you need ALL errors, collect them in a list — but avoid using a separate boolean flag alongside the list.
Flatten deeply nested "arrow code" by inverting conditions into guard clauses. Measure the maximum nesting depth of both versions.
import textwrap
import inspect
# --- Arrow code anti-pattern (5 levels deep) ---
def process_order_bad(request):
if request:
if request.get("type") == "order":
if request.get("items"):
if request.get("user", {}).get("active"):
if request["user"].get("verified"):
total = sum(item.get("price", 0) for item in request["items"])
return total
else:
return "Error: User not verified"
else:
return "Error: User not active"
else:
return "Error: No items"
else:
return "Error: Not an order"
else:
return "Error: Invalid request"
# --- Clean version: guard clauses (max 1 level deep) ---
def process_order_clean(request):
if not request:
return "Error: Invalid request"
if request.get("type") != "order":
return "Error: Not an order"
if not request.get("items"):
return "Error: No items"
if not request.get("user", {}).get("active"):
return "Error: User not active"
if not request["user"].get("verified"):
return "Error: User not verified"
total = sum(item.get("price", 0) for item in request["items"])
return total
# --- Measure nesting depth ---
def max_indent(func):
source = textwrap.dedent(inspect.getsource(func))
max_depth = 0
for line in source.split("\n"):
if line.strip() and not line.strip().startswith("def "):
spaces = len(line) - len(line.lstrip())
depth = (spaces // 4) - 1 # Subtract 1 for function body
max_depth = max(max_depth, depth)
return max_depth
# --- Test both versions ---
test_cases = [
{"type": "order", "items": [{"price": 10}], "user": {"active": True, "verified": True}},
{},
{"type": "order", "items": [], "user": {"active": True, "verified": True}},
]
print("--- Arrow code version (deeply nested) ---")
for tc in test_cases:
print(f"process_bad({tc}) = {process_order_bad(tc)}")
print("--- Flat version (guard clauses) ---")
for tc in test_cases:
print(f"process_clean({tc}) = {process_order_clean(tc)}")
all_match = all(
process_order_bad(tc) == process_order_clean(tc)
for tc in test_cases
)
print(f"All outputs match: {all_match}")
print(f"Max nesting in bad version: {max_indent(process_order_bad)}")
print(f"Max nesting in clean version: {max_indent(process_order_clean)}")Solution
--- Arrow code version (deeply nested) ---
process_bad({'type': 'order', 'items': [{'price': 10}], 'user': {'active': True, 'verified': True}}) = 10
process_bad({}) = Error: Invalid request
process_bad({'type': 'order', 'items': [], 'user': {'active': True, 'verified': True}}) = Error: No items
--- Flat version (guard clauses) ---
process_clean({'type': 'order', 'items': [{'price': 10}], 'user': {'active': True, 'verified': True}}) = 10
process_clean({}) = Error: Invalid request
process_clean({'type': 'order', 'items': [], 'user': {'active': True, 'verified': True}}) = Error: No items
All outputs match: True
Max nesting in bad version: 5
Max nesting in clean version: 1
The arrow code transformation:
Arrow code: Guard clauses:
if A: if not A: return error
if B: if not B: return error
if C: if not C: return error
if D: if not D: return error
do_work() do_work()
else: error
else: error
else: error
else: error
Why arrow code is dangerous:
- Cognitive load: Each nesting level requires holding another condition in your head. At 5 levels deep, you cannot tell what state holds at the innermost point.
- Error handling is reversed: The error cases (which are the majority of the code) are pushed to the end, far from the condition they relate to.
- Modifications are risky: Adding a new check means wrapping everything in yet another level of indentation.
The guard clause pattern:
- Check the error condition (inverted from the arrow code).
- Return immediately with the error.
- Continue to the next check at the same indentation level.
- The "happy path" (successful execution) is the code that survives all guards.
Expected Output
--- Arrow code version (deeply nested) ---
process_bad({'type': 'order', 'items': [{'price': 10}], 'user': {'active': True, 'verified': True}}) = 10
process_bad({}) = Error: Invalid request
process_bad({'type': 'order', 'items': [], 'user': {'active': True, 'verified': True}}) = Error: No items
--- Flat version (guard clauses) ---
process_clean({'type': 'order', 'items': [{'price': 10}], 'user': {'active': True, 'verified': True}}) = 10
process_clean({}) = Error: Invalid request
process_clean({'type': 'order', 'items': [], 'user': {'active': True, 'verified': True}}) = Error: No items
All outputs match: True
Max nesting in bad version: 5
Max nesting in clean version: 1Hints
Hint 1: Arrow code is when nested if statements form an arrow shape (indent deeper and deeper). The fix is to invert conditions and return early.
Hint 2: Instead of `if valid: (deep logic) else: return error`, write `if not valid: return error` then continue with the logic at the same indent level.
Hard
Refactor a tangled order-processing function into a clean state-machine design. The spaghetti version mixes state checks, validation, and transitions into one unreadable mess. Separate them.
# --- Spaghetti version (anti-pattern) ---
def process_spaghetti(status, amount, in_stock, has_payment):
result = "error"
if status == "new":
if in_stock:
if has_payment:
if amount <= 500:
result = "approved"
else:
result = "pending_review"
else:
result = "pending_review"
else:
result = "rejected"
elif status == "pending_review":
if amount <= 100 and in_stock and has_payment:
result = "approved"
else:
if not in_stock:
result = "rejected"
else:
result = "pending_review"
elif status == "approved":
if in_stock:
result = "shipped"
else:
result = "backordered"
elif status == "rejected":
result = "rejected"
return result
# --- Clean version: state-machine with handler functions ---
def process_clean(status, amount, in_stock, has_payment):
def handle_new(amount, in_stock, has_payment):
if not in_stock:
return "rejected"
if not has_payment:
return "pending_review"
return "approved" if amount <= 500 else "pending_review"
def handle_pending(amount, in_stock, has_payment):
if not in_stock:
return "rejected"
if amount <= 100 and has_payment:
return "approved"
return "pending_review"
def handle_approved(amount, in_stock, has_payment):
return "shipped" if in_stock else "backordered"
def handle_rejected(amount, in_stock, has_payment):
return "rejected"
handlers = {
"new": handle_new,
"pending_review": handle_pending,
"approved": handle_approved,
"rejected": handle_rejected,
}
handler = handlers.get(status)
if handler is None:
return "error"
return handler(amount, in_stock, has_payment)
# --- Test both versions ---
test_cases = [
("new", 100, True, True),
("new", 100, True, False),
("new", 100, False, True),
("pending_review", 50, True, True),
("pending_review", 200, True, True),
("approved", 100, True, True),
("rejected", 100, True, True),
("unknown", 100, True, True),
]
print("--- Spaghetti version ---")
for status, amount, stock, payment in test_cases:
r = process_spaghetti(status, amount, stock, payment)
print(f"process('{status}', {amount}, {stock}, {payment}) = {r}")
print("--- Clean version ---")
all_match = True
for status, amount, stock, payment in test_cases:
r_clean = process_clean(status, amount, stock, payment)
r_bad = process_spaghetti(status, amount, stock, payment)
all_match = all_match and (r_clean == r_bad)
print(f"process('{status}', {amount}, {stock}, {payment}) = {r_clean}")
print(f"All outputs match: {all_match}")Solution
--- Spaghetti version ---
process('new', 100, True, True) = approved
process('new', 100, True, False) = pending_review
process('new', 100, False, True) = rejected
process('pending_review', 50, True, True) = approved
process('pending_review', 200, True, True) = pending_review
process('approved', 100, True, True) = shipped
process('rejected', 100, True, True) = rejected
process('unknown', 100, True, True) = error
--- Clean version ---
process('new', 100, True, True) = approved
process('new', 100, True, False) = pending_review
process('new', 100, False, True) = rejected
process('pending_review', 50, True, True) = approved
process('pending_review', 200, True, True) = pending_review
process('approved', 100, True, True) = shipped
process('rejected', 100, True, True) = rejected
process('unknown', 100, True, True) = error
All outputs match: True
The refactoring strategy — State Machine pattern:
- Identify the states: The
statusparameter is the state variable. Each top-levelelifbranch is a state handler. - Extract handlers: Each state gets its own function with clear guard-clause logic.
- Use a dispatch dictionary:
handlers = {"new": handle_new, ...}replaces the outerif/elifchain. - Handle unknown states explicitly:
handlers.get(status)returnsNonefor unknown states, which we handle cleanly.
Benefits of the state machine approach:
- Single responsibility: Each handler function manages exactly one state transition.
- Open for extension: Adding a new state means adding one function and one dictionary entry — zero changes to existing code.
- Testable in isolation: You can unit test
handle_newwithout going through the full dispatch. - Self-documenting: The
handlersdictionary is a readable table of all valid states.
Expected Output
--- Spaghetti version ---
process('new', 100, True, True) = approved
process('new', 100, True, False) = pending_review
process('new', 100, False, True) = rejected
process('pending_review', 50, True, True) = approved
process('pending_review', 200, True, True) = pending_review
process('approved', 100, True, True) = shipped
process('rejected', 100, True, True) = rejected
process('unknown', 100, True, True) = error
--- Clean version ---
process('new', 100, True, True) = approved
process('new', 100, True, False) = pending_review
process('new', 100, False, True) = rejected
process('pending_review', 50, True, True) = approved
process('pending_review', 200, True, True) = pending_review
process('approved', 100, True, True) = shipped
process('rejected', 100, True, True) = rejected
process('unknown', 100, True, True) = error
All outputs match: TrueHints
Hint 1: Spaghetti logic often encodes a state machine without realizing it. Extract the states and transitions explicitly.
Hint 2: Use a dictionary mapping (state -> handler_function) to replace nested if/elif chains. Each handler has a single responsibility.
Build a complexity analyzer using Python's ast module that measures two metrics for any function: maximum nesting depth and total branch count. Classify functions as LOW, MEDIUM, or HIGH complexity.
import ast
import inspect
import textwrap
# --- Complexity analyzer ---
class ComplexityAnalyzer(ast.NodeVisitor):
"""Walks an AST to measure nesting depth and branch count."""
BRANCHING_NODES = (ast.If, ast.For, ast.While, ast.Try,
ast.ExceptHandler, ast.With)
def __init__(self):
self.max_depth = 0
self.current_depth = 0
self.branch_count = 0
def _visit_branch(self, node):
self.branch_count += 1
self.current_depth += 1
self.max_depth = max(self.max_depth, self.current_depth)
self.generic_visit(node)
self.current_depth -= 1
def visit_If(self, node):
self._visit_branch(node)
def visit_For(self, node):
self._visit_branch(node)
def visit_While(self, node):
self._visit_branch(node)
def visit_Try(self, node):
self._visit_branch(node)
def visit_With(self, node):
self._visit_branch(node)
def analyze_function(func):
source = textwrap.dedent(inspect.getsource(func))
tree = ast.parse(source)
analyzer = ComplexityAnalyzer()
analyzer.visit(tree)
return analyzer.max_depth, analyzer.branch_count
def classify(depth, branches):
if depth <= 1 and branches <= 3:
return "LOW"
elif depth <= 2 and branches <= 6:
return "MEDIUM"
else:
return "HIGH"
# --- Test functions with varying complexity ---
def simple_function(x):
if x > 0:
return "positive"
else:
return "non-positive"
def medium_function(items, threshold):
results = []
for item in items:
if item > threshold:
results.append("high")
elif item > 0:
results.append("low")
else:
results.append("skip")
if not results:
return None
return results
def complex_function(data, config):
output = []
for section in data:
if section.get("enabled"):
for item in section.get("items", []):
if item.get("type") == "A":
if item.get("value", 0) > config.get("threshold", 0):
output.append(item)
else:
if config.get("include_low"):
output.append(item)
elif item.get("type") == "B":
output.append(item)
if not output:
return None
return output
# --- Run analysis ---
functions = [
("simple_function", simple_function),
("medium_function", medium_function),
("complex_function", complex_function),
]
results = []
for name, func in functions:
depth, branches = analyze_function(func)
verdict = classify(depth, branches)
results.append((name, depth, branches, verdict))
print(f"--- Analyzing: {name} ---")
print(f" Max nesting depth: {depth}")
print(f" Branch count: {branches}")
print(f" Complexity verdict: {verdict} complexity")
print(f"\nComplexity summary:")
for name, depth, branches, verdict in results:
print(f" {name:20s} depth={depth} branches={branches:<3d} -> {verdict}")Solution
--- Analyzing: simple_function ---
Max nesting depth: 1
Branch count: 2
Complexity verdict: LOW complexity
--- Analyzing: medium_function ---
Max nesting depth: 2
Branch count: 5
Complexity verdict: MEDIUM complexity
--- Analyzing: complex_function ---
Max nesting depth: 4
Branch count: 8
Complexity verdict: HIGH complexity
Complexity summary:
simple_function depth=1 branches=2 -> LOW
medium_function depth=2 branches=5 -> MEDIUM
complex_function depth=4 branches=8 -> HIGH
How the analyzer works:
ast.parse()converts source code into an abstract syntax tree — a tree of nodes representing every language construct.ast.NodeVisitorwalks every node in the tree. We overridevisit_If,visit_For, etc., to count branches.- Nesting depth is tracked by incrementing a counter when entering a branching node and decrementing when leaving.
- Branch count increments for every
if,for,while,try, andwithstatement.
Real-world complexity tools use similar approaches:
- Cyclomatic complexity (McCabe metric) counts linearly independent paths through the code. Our branch count is a simplified version.
- Cognitive complexity (SonarQube) additionally penalizes nesting — a branch at depth 3 costs more than at depth 1.
- Tools like
radon,flake8-cognitive-complexity, andpylintimplement these metrics.
Thresholds used in industry:
- Cyclomatic complexity > 10: function should be refactored.
- Nesting depth > 3: almost always a sign of arrow code.
- Functions with 15+ branches are nearly impossible to test exhaustively.
Expected Output
--- Analyzing: simple_function ---
Max nesting depth: 1
Branch count: 2
Complexity verdict: LOW complexity
--- Analyzing: medium_function ---
Max nesting depth: 2
Branch count: 5
Complexity verdict: MEDIUM complexity
--- Analyzing: complex_function ---
Max nesting depth: 4
Branch count: 8
Complexity verdict: HIGH complexity
Complexity summary:
simple_function depth=1 branches=2 -> LOW
medium_function depth=2 branches=5 -> MEDIUM
complex_function depth=4 branches=8 -> HIGHHints
Hint 1: Use Python's `ast` module to parse source code into an abstract syntax tree. Walk the tree to count `If`, `For`, `While`, and `Try` nodes (branches) and track nesting depth.
Hint 2: Nesting depth can be tracked by keeping a counter that increments when you enter a compound statement and decrements when you leave it.
Build a mini-linter that detects four control flow anti-patterns in Python source code using the ast module: redundant boolean comparisons, return-boolean anti-pattern, dead code after return, and excessive nesting.
import ast
# --- The linter ---
class ControlFlowLinter(ast.NodeVisitor):
def __init__(self, source_lines):
self.issues = []
self.source_lines = source_lines
self.current_depth = 0
self.max_allowed_depth = 3
def _add_issue(self, line, rule, message):
self.issues.append((line, rule, message))
def visit_If(self, node):
# Rule 1: Redundant boolean comparison (if x == True / if x == False)
if isinstance(node.test, ast.Compare):
if len(node.test.ops) == 1 and len(node.test.comparators) == 1:
op = node.test.ops[0]
comp = node.test.comparators[0]
if isinstance(op, (ast.Eq, ast.NotEq, ast.Is, ast.IsNot)):
if isinstance(comp, ast.Constant) and isinstance(comp.value, bool):
line_text = self.source_lines[node.lineno - 1].strip()
clean = "if x" if comp.value else "if not x"
self._add_issue(
node.lineno,
"REDUNDANT_BOOL_COMPARE",
f"'{line_text}' should be '{clean}'"
)
# Rule 2: if cond: return True else: return False
if (len(node.body) == 1 and len(node.orelse) == 1
and isinstance(node.body[0], ast.Return)
and isinstance(node.orelse[0], ast.Return)):
body_ret = node.body[0].value
else_ret = node.orelse[0].value
if (isinstance(body_ret, ast.Constant) and body_ret.value is True
and isinstance(else_ret, ast.Constant) and else_ret.value is False):
self._add_issue(
node.lineno,
"RETURN_BOOL_ANTIPATTERN",
"'if cond: return True/False' should be 'return cond'"
)
# Rule 4: Deep nesting check
self.current_depth += 1
if self.current_depth > self.max_allowed_depth:
self._add_issue(
node.lineno,
"DEEP_NESTING",
f"nesting depth {self.current_depth} exceeds maximum of {self.max_allowed_depth}"
)
self.generic_visit(node)
self.current_depth -= 1
def visit_For(self, node):
self.current_depth += 1
if self.current_depth > self.max_allowed_depth:
self._add_issue(
node.lineno,
"DEEP_NESTING",
f"nesting depth {self.current_depth} exceeds maximum of {self.max_allowed_depth}"
)
self.generic_visit(node)
self.current_depth -= 1
def visit_While(self, node):
self.current_depth += 1
if self.current_depth > self.max_allowed_depth:
self._add_issue(
node.lineno,
"DEEP_NESTING",
f"nesting depth {self.current_depth} exceeds maximum of {self.max_allowed_depth}"
)
self.generic_visit(node)
self.current_depth -= 1
def visit_FunctionDef(self, node):
# Rule 3: Dead code after return
for i, stmt in enumerate(node.body):
if isinstance(stmt, ast.Return) and i < len(node.body) - 1:
next_stmt = node.body[i + 1]
self._add_issue(
next_stmt.lineno,
"DEAD_CODE",
"unreachable code after return statement"
)
self.generic_visit(node)
def lint_code(name, source):
lines = source.split("\n")
tree = ast.parse(source)
linter = ControlFlowLinter(lines)
linter.visit(tree)
print(f"--- Linting: {name} ---")
for line, rule, msg in sorted(linter.issues, key=lambda x: x[0]):
print(f"Line {line}: {rule} \u2014 {msg}")
# Summary
func_count = sum(1 for node in ast.walk(tree) if isinstance(node, ast.FunctionDef))
rule_counts = {}
for _, rule, _ in linter.issues:
rule_counts[rule] = rule_counts.get(rule, 0) + 1
print(f"\nSummary: {len(linter.issues)} issues found in {func_count} function(s)")
for rule, count in rule_counts.items():
print(f" {rule}: {count}")
# --- Code to lint ---
sample = '''\
def process_data(x, items, config):
# Anti-pattern 1: redundant bool compare
if x == True:
pass
# Anti-pattern 2: return bool anti-pattern
if x > 0:
return True
else:
return False
# Anti-pattern 3: dead code after return
print("this is dead")
# Anti-pattern 4: deep nesting
for item in items:
if item:
for sub in item:
if sub > 0:
print(sub)
'''
lint_code("sample_code", sample)Solution
--- Linting: sample_code ---
Line 3: REDUNDANT_BOOL_COMPARE — 'if x == True' should be 'if x'
Line 6: RETURN_BOOL_ANTIPATTERN — 'if cond: return True/False' should be 'return cond'
Line 11: DEAD_CODE — unreachable code after return statement
Line 17: DEEP_NESTING — nesting depth 4 exceeds maximum of 3
Summary: 4 issues found in 1 function(s)
REDUNDANT_BOOL_COMPARE: 1
RETURN_BOOL_ANTIPATTERN: 1
DEAD_CODE: 1
DEEP_NESTING: 1
How the linter works — four detection rules:
Rule 1 — Redundant boolean comparison:
- Looks for
ast.Comparenodes where the comparator is a boolean constant (True/False). - Matches
==,!=,is,is notoperators paired withTrueorFalse.
Rule 2 — Return-boolean anti-pattern:
- Looks for
ast.Ifnodes where bothbodyandorelsecontain exactly oneast.Returnstatement. - Checks if the returned values are the constants
TrueandFalse.
Rule 3 — Dead code after return:
- Iterates through statements in a
FunctionDefbody. - If a
Returnnode is not the last statement, everything after it is dead code. - Note: This simplified version only checks the top-level function body. A production linter would also check inside
if/for/whileblocks.
Rule 4 — Deep nesting:
- Maintains a
current_depthcounter that increments onIf,For, andWhilenodes. - Reports when depth exceeds the configured maximum (3).
How real linters extend this:
- pylint implements ~200 rules using AST analysis, including all four patterns above.
- flake8 with plugins (
flake8-simplify,flake8-cognitive-complexity) adds anti-pattern detection. - Production linters also handle edge cases:
if/elifchains, nested function definitions, context managers, and more.
Expected Output
--- Linting: sample_code ---
Line 3: REDUNDANT_BOOL_COMPARE — 'if x == True' should be 'if x'
Line 6: RETURN_BOOL_ANTIPATTERN — 'if cond: return True/False' should be 'return cond'
Line 11: DEAD_CODE — unreachable code after return statement
Line 17: DEEP_NESTING — nesting depth 4 exceeds maximum of 3
Summary: 4 issues found in 1 function(s)
REDUNDANT_BOOL_COMPARE: 1
RETURN_BOOL_ANTIPATTERN: 1
DEAD_CODE: 1
DEEP_NESTING: 1Hints
Hint 1: Use `ast.parse()` to get the syntax tree, then write separate checker functions for each anti-pattern. Walk the tree and collect (line_number, rule_name, message) tuples.
Hint 2: For detecting `if x: return True else: return False`, look for `ast.If` nodes where both the body and orelse contain a single `ast.Return` node returning a boolean constant.
