Python Designing Clean APIs Practice Problems & Exercises

Practice: Designing Clean APIs

11 problems · 4 Easy · 4 Medium · 3 Hard · 40-55 min

Easy

#1 Fix the Function Name (Easy)
Tags: naming, verb-phrases, intent

The function below works correctly but has a terrible name. Rename it (and its internal variable) so that a caller can understand what it does without reading the body.

Python
def filter_active_users(users):
    """Return only users whose 'active' field is True."""
    active = [u for u in users if u.get("active")]
    return active

# Test
users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
    {"name": "Carol", "active": True},
]

result = filter_active_users(users)
print(f"Active users: {result}")
print(f"Total active: {len(result)}")
Solution
# BEFORE: bad name
def do(d):
    r = [u for u in d if u.get("active")]
    return r

# AFTER: intent is clear from the name alone
def filter_active_users(users):
    """Return only users whose 'active' field is True."""
    active = [u for u in users if u.get("active")]
    return active

users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
    {"name": "Carol", "active": True},
]

result = filter_active_users(users)
print(f"Active users: {result}")
print(f"Total active: {len(result)}")

Naming principles applied:

  • do(d) violates every naming rule: the verb is too vague to say what happens, d gives no indication of what it holds, and nothing hints at the return type.
  • filter_active_users(users) tells you: (1) it filters, (2) it returns active items, (3) it works on users.
  • The variable r becomes active — descriptive of what it holds.

The newspaper test: Can you describe the function in one verb phrase? "Filter active users" — passes. "Do d" — fails.

Expected Output
Active users: [{'name': 'Alice', 'active': True}, {'name': 'Carol', 'active': True}]
Total active: 2
Hints

Hint 1: Function names should be verb phrases that describe WHAT the function does, not vague words like "do", "run", or "process".

Hint 2: Use the prefix table: `get_` for read-only retrieval, `filter_` or descriptive verb for transformations. The function filters users by active status, so a name like `get_active_users` or `filter_active_users` communicates intent.

#2 Enforce Keyword-Only Arguments (Easy)
Tags: keyword-only, star-separator, call-site-clarity

The function below takes too many positional arguments. Refactor it so that data remains positional but all other parameters are keyword-only (using the * separator). This prevents callers from passing unlabeled values.

Python
def create_report(
    data,
    *,
    title,
    author,
    format="pdf",
    include_charts=True,
):
    """Create a report from data with the given configuration."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

# These calls MUST use keyword names — positional won't work
report1 = create_report(
    [1, 2, 3],
    title="Q4 Summary",
    author="Alice",
)
print(report1)

report2 = create_report(
    [4, 5, 6],
    title="Monthly",
    author="Bob",
    format="html",
    include_charts=False,
)
print(report2)
Solution
# BEFORE: all positional, caller can mix up order
def create_report(data, title, author, format="pdf", include_charts=True):
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

# Dangerous call: which string is title, which is author?
# create_report([1,2,3], "Q4 Summary", "Alice", "pdf", True)

# AFTER: keyword-only after *
def create_report(
    data,
    *,
    title,
    author,
    format="pdf",
    include_charts=True,
):
    """Create a report from data with the given configuration."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

report1 = create_report(
    [1, 2, 3],
    title="Q4 Summary",
    author="Alice",
)
print(report1)

report2 = create_report(
    [4, 5, 6],
    title="Monthly",
    author="Bob",
    format="html",
    include_charts=False,
)
print(report2)

Why keyword-only matters:

BEFORE (positional):
create_report(data, "Q4", "Alice", "pdf", True)
# What does True mean? What if you swap "Q4" and "Alice"?

AFTER (keyword-only):
create_report(data, title="Q4", author="Alice", include_charts=True)
# Every argument is labeled — impossible to confuse

Rule of thumb: If a function has more than 2-3 parameters, make everything after the first one or two keyword-only. The * separator costs nothing at runtime but prevents entire categories of bugs.
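
To make the rule of thumb concrete, here is a minimal sketch of what the * separator does when a caller ignores it. It reuses the create_report signature from this exercise; try_positional_call is a hypothetical helper added only for the demonstration.

```python
def create_report(data, *, title, author, format="pdf", include_charts=True):
    """Same signature as the exercise: only `data` may be positional."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"


def try_positional_call():
    """Attempt an unlabeled call and report how Python reacts (hypothetical helper)."""
    try:
        # title and author are passed without names, which the * forbids.
        create_report([1, 2, 3], "Q4 Summary", "Alice")
    except TypeError as exc:
        return f"Rejected: {type(exc).__name__}"
    return "Accepted"


print(try_positional_call())
print(create_report([1, 2, 3], title="Q4 Summary", author="Alice"))
```

The unlabeled call fails at the call site with a TypeError, so a swapped title and author can never reach the function body unnoticed.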

Expected Output
Report: Q4 Summary by Alice (pdf, charts=True)
Report: Monthly by Bob (html, charts=False)
Hints

Hint 1: Place a bare `*` after the positional parameter `data` in the function signature. Everything after `*` becomes keyword-only.

Hint 2: Keyword-only arguments force callers to write `title="Q4 Summary"` instead of just `"Q4 Summary"` — making every call site self-documenting.

#3 Consistent Return Types (Easy)
Tags: return-type, consistency, empty-collection

The two functions below have inconsistent return types — they return None when there are no results, forcing callers to check for None before iterating. Fix both functions to always return a list (empty list for no results).

Python
# Simulated database
tag_db = {1: ["python", "tutorial"], 2: ["javascript", "react"]}
score_db = {"alice": [95, 87], "bob": [72, 68, 91]}

def get_tags(post_id):
    """Return tags for a post. Always returns a list."""
    return tag_db.get(post_id, [])

def get_scores(username):
    """Return scores for a user. Always returns a list."""
    return score_db.get(username, [])

# Callers can iterate safely without None checks
tags1 = get_tags(1)
print(f"Tags for post 1: {tags1}")

tags_missing = get_tags(999)
print(f"Tags for post 999: {tags_missing}")
print(f"Can iterate empty: {[t.upper() for t in tags_missing] == []}")

scores1 = get_scores("alice")
print(f"Scores for alice: {scores1}")

scores_missing = get_scores("unknown")
print(f"Scores for unknown: {scores_missing}")
Solution
# BEFORE: inconsistent, returns None or list
def get_tags_bad(post_id):
    tags = tag_db.get(post_id)
    if not tags:
        return None  # Caller must check for None!
    return tags

def get_scores_bad(username):
    if username not in score_db:
        return None  # Caller must check for None!
    return score_db[username]

# AFTER: always returns a list
tag_db = {1: ["python", "tutorial"], 2: ["javascript", "react"]}
score_db = {"alice": [95, 87], "bob": [72, 68, 91]}

def get_tags(post_id):
    """Return tags for a post. Always returns a list."""
    return tag_db.get(post_id, [])

def get_scores(username):
    """Return scores for a user. Always returns a list."""
    return score_db.get(username, [])

tags1 = get_tags(1)
print(f"Tags for post 1: {tags1}")

tags_missing = get_tags(999)
print(f"Tags for post 999: {tags_missing}")
print(f"Can iterate empty: {[t.upper() for t in tags_missing] == []}")

scores1 = get_scores("alice")
print(f"Scores for alice: {scores1}")

scores_missing = get_scores("unknown")
print(f"Scores for unknown: {scores_missing}")

Why consistent return types matter:

BEFORE (returns None or list):
tags = get_tags(999)
# tags is None, so the next line crashes:
for t in tags:  # TypeError: 'NoneType' object is not iterable
    print(t.upper())

# Caller must write defensive code EVERY time:
tags = get_tags(999)
if tags is not None:
    for t in tags:
        print(t.upper())

AFTER (always returns list):
tags = get_tags(999)  # returns []
for t in tags:  # safe: iterating an empty list does nothing
    print(t.upper())

Rule: Prefer empty collections over None for "no results." Reserve None for cases where absence has a distinct semantic meaning (e.g., find_user() returns None meaning "user does not exist" vs returning a user dict).
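
The two halves of this rule can sit side by side in one module. The sketch below is illustrative only: POSTS and find_post are hypothetical names, and the in-memory dict stands in for a real data store.

```python
from typing import Optional

# Hypothetical in-memory data, standing in for a real database.
POSTS = {1: {"title": "Intro", "tags": ["python", "tutorial"]}}


def get_tags(post_id: int) -> list[str]:
    """Collection query: 'no results' is an empty list, never None."""
    post = POSTS.get(post_id)
    return list(post["tags"]) if post else []


def find_post(post_id: int) -> Optional[dict]:
    """Single-item lookup: None carries the meaning 'no such post'."""
    return POSTS.get(post_id)


# Collection result: iterate without a guard.
upper_tags = [t.upper() for t in get_tags(999)]

# Single-item result: absence is meaningful, so check for it explicitly.
post = find_post(999)
message = "no such post" if post is None else post["title"]

print(upper_tags, message)
```

Here get_tags also returns a copy of the stored list, so callers cannot mutate the underlying data by accident. That is a design choice of this sketch, not something the exercise requires.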

Expected Output
Tags for post 1: ['python', 'tutorial']
Tags for post 999: []
Can iterate empty: True
Scores for alice: [95, 87]
Scores for unknown: []
Hints

Hint 1: A function that returns a list should ALWAYS return a list — including an empty list `[]` for "no results". Never return `None` when the caller expects a list.

Hint 2: Returning `None` for "not found" forces every caller to write `if result is not None:` before iterating. An empty list lets callers iterate safely without any check.

#4 Replace Boolean Flag with Enum (Easy)
Tags: enum, boolean-flag, readability

The function below uses a boolean flag reverse to control sort order. At the call site, sort_values(data, True) is ambiguous. Refactor to use a SortOrder enum so call sites are self-documenting.

Python
from enum import Enum

class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

def sort_values(values, *, order=SortOrder.ASCENDING):
    """Sort a list of values in the specified order."""
    descending = order == SortOrder.DESCENDING
    return sorted(values, reverse=descending)

# Test — call sites are now unambiguous
data = [3, 1, 4, 1, 5, 2]

asc = sort_values(data, order=SortOrder.ASCENDING)
print(f"Ascending: {asc}")

desc = sort_values(data, order=SortOrder.DESCENDING)
print(f"Descending: {desc}")

# Default is ascending
default = sort_values(data)
print(f"Ascending (explicit): {default}")
Solution
# BEFORE: boolean flag
def sort_values_bad(values, reverse=False):
    return sorted(values, reverse=reverse)

# Call site: what does True mean?
# sort_values_bad([3,1,4], True) # Reverse what? Sort? Order?

# AFTER: enum makes intent explicit
from enum import Enum

class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

def sort_values(values, *, order=SortOrder.ASCENDING):
    """Sort a list of values in the specified order."""
    descending = order == SortOrder.DESCENDING
    return sorted(values, reverse=descending)

data = [3, 1, 4, 1, 5, 2]

asc = sort_values(data, order=SortOrder.ASCENDING)
print(f"Ascending: {asc}")

desc = sort_values(data, order=SortOrder.DESCENDING)
print(f"Descending: {desc}")

default = sort_values(data)
print(f"Ascending (explicit): {default}")

Why enums beat booleans:

BEFORE:
sort_values(data, True) # What does True mean??
sort_values(data, False) # And False??

AFTER:
sort_values(data, order=SortOrder.DESCENDING) # Crystal clear
sort_values(data, order=SortOrder.ASCENDING) # No ambiguity

Bonus: Enums are extensible. If you later need SortOrder.RANDOM or SortOrder.STABLE_DESCENDING, you add a member — no boolean gymnastics needed. Booleans only offer two states; enums offer as many as you need.
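
Enums also give you a single place to validate raw input. The following is a rough sketch, assuming the sort order arrives as a string (for example from a query parameter); parse_sort_order is a hypothetical helper and not part of the exercise.

```python
from enum import Enum


class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def parse_sort_order(raw: str) -> SortOrder:
    """Convert a raw string into a SortOrder member (hypothetical helper).

    Looking a member up by value raises ValueError for unknown values,
    so typos are caught here instead of deep inside the sorting code.
    """
    try:
        return SortOrder(raw.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in SortOrder)
        raise ValueError(
            f"Unknown sort order {raw!r}; expected one of: {valid}"
        ) from None


order = parse_sort_order("Descending")
print(sorted([3, 1, 4, 1, 5, 2], reverse=(order is SortOrder.DESCENDING)))
```

A misspelled value such as "descnding" fails immediately with a message listing the valid options, which a bare boolean or free-form string cannot offer.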

Expected Output
Ascending: [1, 1, 2, 3, 4, 5]
Descending: [5, 4, 3, 2, 1, 1]
Ascending (explicit): [1, 1, 2, 3, 4, 5]
Hints

Hint 1: A boolean flag `reverse=True` at a call site leaves the reader guessing: "reverse what?" An enum like `SortOrder.DESCENDING` is unambiguous.

Hint 2: Create a `SortOrder` enum with `ASCENDING` and `DESCENDING` members. Replace the boolean parameter with a keyword-only `order: SortOrder` parameter.


Medium

#5 Apply Single Responsibility Principle (Medium)
Tags: single-responsibility, separation-of-concerns, refactoring

The function below violates single responsibility — it validates, creates, saves, and notifies all in one place. Split it into focused functions that each do one thing. Then compose them in a coordinator function.

Python
def create_user(name, email):
    """Create a user dict from name and email."""
    return {"name": name, "email": email}

def validate_user(user):
    """Validate that a user dict has required fields with valid data."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True

def save_user(user):
    """Persist a user to the database."""
    print(f"Saved: {user['email']}")
    return True

def notify_user(user):
    """Send a welcome notification to the user."""
    print(f"Notified: {user['email']}")
    return True

def register_user(name, email):
    """Orchestrate user registration: create, validate, save, notify."""
    user = create_user(name, email)
    print(f"User: {user}")
    valid = validate_user(user)
    print(f"Valid: {valid}")
    save_user(user)
    notify_user(user)

# Test
register_user("Alice", "[email protected]")
Solution
# BEFORE: one function does everything
def register_user_bad(name, email):
    if not name:
        raise ValueError("Name required")
    if "@" not in email:
        raise ValueError("Invalid email")
    user = {"name": name, "email": email}
    # save to db...
    print(f"Saved: {email}")
    # send notification...
    print(f"Notified: {email}")
    return user

# AFTER: each function does one thing
def create_user(name, email):
    """Create a user dict from name and email."""
    return {"name": name, "email": email}

def validate_user(user):
    """Validate that a user dict has required fields with valid data."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True

def save_user(user):
    """Persist a user to the database."""
    print(f"Saved: {user['email']}")
    return True

def notify_user(user):
    """Send a welcome notification to the user."""
    print(f"Notified: {user['email']}")
    return True

def register_user(name, email):
    """Orchestrate user registration: create, validate, save, notify."""
    user = create_user(name, email)
    print(f"User: {user}")
    valid = validate_user(user)
    print(f"Valid: {valid}")
    save_user(user)
    notify_user(user)

register_user("Alice", "[email protected]")

Single responsibility benefits:

Testability:
- test_validate_user() — no DB, no network, just logic
- test_save_user() — mock only the DB
- test_notify_user() — mock only the mailer

Reusability:
- validate_user() can be used in update_user() too
- save_user() works for any user dict, not just new registrations
- notify_user() can be reused for password resets

Debuggability:
- If notifications fail, you know it is in notify_user()
- The monolithic version could fail anywhere in 20 lines

The newspaper test for each function:

  • create_user — "Create a user dict." Pass.
  • validate_user — "Validate user data." Pass.
  • save_user — "Save user to database." Pass.
  • notify_user — "Send welcome notification." Pass.
  • register_user — "Orchestrate registration." Pass (it delegates, does not implement).
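
The testability claim is easy to check. Below is a minimal sketch of mock-free tests for validate_user, copied from the solution; the rejects helper and the alice@example.com address are placeholders introduced only for this illustration.

```python
def validate_user(user):
    """Same validation logic as the solution above."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True


def rejects(user):
    """Return True if validate_user raises ValueError (hypothetical helper)."""
    try:
        validate_user(user)
    except ValueError:
        return True
    return False


# Plain-assert tests: no database, no mailer, no mocks.
assert validate_user({"name": "Alice", "email": "alice@example.com"}) is True
assert rejects({"name": "", "email": "alice@example.com"})   # missing name
assert rejects({"name": "Alice", "email": "not-an-email"})   # no "@"
assert rejects({"name": "Alice"})                            # missing email
print("validate_user: all checks passed")
```

Because validation no longer shares a function with saving and notifying, none of these checks needs a fake database or mail server.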
Expected Output
User: {'name': 'Alice', 'email': '[email protected]'}
Valid: True
Saved: [email protected]
Notified: [email protected]
Hints

Hint 1: The original function does four things: validate, create, save, and notify. Each should be its own function with a single purpose.

Hint 2: Apply the newspaper test: describe each function in one sentence without "and" or "or." If you cannot, split further. `create_user`, `validate_user`, `save_user`, `notify_user` each pass this test.

#6 Design Error Handling with Consistent Contracts (Medium)
Tags: error-handling, return-type, ValueError, contract

Design three functions with clear, consistent error handling contracts. Each function demonstrates a different but internally consistent pattern: raise on invalid input, raise on domain violations, or return None for lookups.

Python
def divide(a, b):
    """Divide a by b. Raises ZeroDivisionError if b is zero."""
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

def parse_age(value):
    """Parse a string into a valid age. Raises ValueError on invalid input."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        raise ValueError(f"Cannot parse age from: {value!r}")
    if age < 0 or age > 150:
        raise ValueError(f"Age out of valid range: {age}")
    return age

def find_user(username):
    """Look up a user by username. Returns dict or None if not found."""
    users = {
        "alice": {"name": "Alice", "email": "[email protected]"},
        "bob": {"name": "Bob", "email": "[email protected]"},
    }
    return users.get(username)

# Test divide
result = divide(10, 3)
print(f"divide(10, 3): {result:.4f}")
try:
    divide(10, 0)
except ZeroDivisionError:
    print("divide(10, 0): raised ZeroDivisionError")

# Test parse_age
print(f"parse_age('25'): {parse_age('25')}")
try:
    parse_age("abc")
except ValueError:
    print("parse_age('abc'): raised ValueError")
try:
    parse_age("-5")
except ValueError:
    print("parse_age('-5'): raised ValueError")

# Test find_user
user = find_user("alice")
print(f"find_user('alice'): {user['email']}")
missing = find_user("unknown")
print(f"find_user('unknown'): {missing}")
Solution
def divide(a, b):
    """Divide a by b. Raises ZeroDivisionError if b is zero."""
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

def parse_age(value):
    """Parse a string into a valid age. Raises ValueError on invalid input."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        raise ValueError(f"Cannot parse age from: {value!r}")
    if age < 0 or age > 150:
        raise ValueError(f"Age out of valid range: {age}")
    return age

def find_user(username):
    """Look up a user by username. Returns dict or None if not found."""
    users = {
        "alice": {"name": "Alice", "email": "[email protected]"},
        "bob": {"name": "Bob", "email": "[email protected]"},
    }
    return users.get(username)

result = divide(10, 3)
print(f"divide(10, 3): {result:.4f}")
try:
    divide(10, 0)
except ZeroDivisionError:
    print("divide(10, 0): raised ZeroDivisionError")

print(f"parse_age('25'): {parse_age('25')}")
try:
    parse_age("abc")
except ValueError:
    print("parse_age('abc'): raised ValueError")
try:
    parse_age("-5")
except ValueError:
    print("parse_age('-5'): raised ValueError")

user = find_user("alice")
print(f"find_user('alice'): {user['email']}")
missing = find_user("unknown")
print(f"find_user('unknown'): {missing}")

Three error handling contracts:

Contract 1 — Always returns or raises (computation):
divide(a, b) -> float # always float
divide(a, 0) -> raises # never returns None for errors

Contract 2 — Always returns or raises (validation):
parse_age("25") -> int # always int
parse_age("abc") -> raises # invalid input = exception
parse_age("-5") -> raises # domain violation = exception

Contract 3 — Returns value or None (lookup):
find_user("alice") -> dict # found
find_user("xxx") -> None # not found (absence, not error)

When to raise vs return None:

  • Raise when the caller gave you bad input (wrong type, out of range, impossible operation). The caller made a mistake.
  • Return None when the input is valid but the item does not exist. "Not found" is a normal outcome, not an error.
  • Never mix both in the same function — pick one contract and stick to it.
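
When some callers treat a missing item as normal and others treat it as an error, one option is two functions with one contract each, rather than one function that mixes both. The sketch below assumes an in-memory USERS dict; require_user and greeting are hypothetical names added for illustration.

```python
USERS = {"alice": {"name": "Alice"}}  # hypothetical in-memory store


def find_user(username):
    """Lookup contract: returns the user dict, or None if not found."""
    return USERS.get(username)


def require_user(username):
    """Strict contract: returns the user dict, or raises LookupError.

    Hypothetical companion for callers that consider a missing user a bug.
    """
    user = find_user(username)
    if user is None:
        raise LookupError(f"No such user: {username!r}")
    return user


def greeting(username):
    """Caller of the lookup contract: absence is handled as a normal case."""
    user = find_user(username)
    return f"Hello, {user['name']}" if user is not None else "Hello, guest"


print(greeting("alice"))
print(greeting("unknown"))
```

Each function still has exactly one contract, and the name tells the caller which one to expect.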
Expected Output
divide(10, 3): 3.3333
divide(10, 0): raised ZeroDivisionError
parse_age('25'): 25
parse_age('abc'): raised ValueError
parse_age('-5'): raised ValueError
find_user('alice'): [email protected]
find_user('unknown'): None
Hints

Hint 1: Functions that compute or get values should raise exceptions on invalid input rather than returning None. Reserve None for "not found" semantics in lookup functions.

Hint 2: Design three contracts: (1) `divide` always returns float or raises, (2) `parse_age` always returns int or raises, (3) `find_user` returns dict or None (lookup semantics). Each is consistent within its own contract.

#7 Separate Pure Logic from I/O (Medium)
Tags: pure-functions, testability, I/O-boundary

The function below mixes I/O (database lookup, logging) with business logic (discount calculation). Refactor so the discount logic is a pure function that can be tested without any mocks or database.

Python
def compute_price(
    base_price,
    *,
    is_premium=False,
    years_active=0,
):
    """Compute final price after applying loyalty discount.

    Pure function — no I/O, no side effects, fully testable.
    """
    if is_premium:
        discount = min(0.20 + years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# Test pure logic — no mocks needed!
print(f"Basic (0 years): ${compute_price(100, is_premium=False):.2f}")
print(f"Premium (0 years): ${compute_price(100, is_premium=True):.2f}")
print(f"Premium (5 years): ${compute_price(100, is_premium=True, years_active=5):.2f}")
print(f"Premium (20 years): ${compute_price(100, is_premium=True, years_active=20):.2f}")
print("Loyalty discount capped at 35%")
Solution
# BEFORE: impure, mixes I/O with logic
def get_price_bad(user_id, base_price):
    user = db.get_user(user_id)  # I/O: database read
    logger.info(f"Computing price for {user_id}")  # side effect: logging
    if user.is_premium:
        discount = min(0.20 + user.years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# Testing requires: mock db, mock logger, create fake user (painful)

# AFTER: pure logic separated from I/O
def compute_price(
    base_price,
    *,
    is_premium=False,
    years_active=0,
):
    """Compute final price after applying loyalty discount.

    Pure function: no I/O, no side effects, fully testable.
    """
    if is_premium:
        discount = min(0.20 + years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# I/O wrapper (lives at the boundary of the system)
def get_user_price(user_id, base_price):
    """Fetch user and compute their price. I/O at the boundary."""
    # user = db.get_user(user_id)  # I/O happens here
    # return compute_price(base_price, is_premium=user.is_premium, years_active=user.years_active)
    pass

# Test the PURE logic — zero mocks
print(f"Basic (0 years): ${compute_price(100, is_premium=False):.2f}")
print(f"Premium (0 years): ${compute_price(100, is_premium=True):.2f}")
print(f"Premium (5 years): ${compute_price(100, is_premium=True, years_active=5):.2f}")
print(f"Premium (20 years): ${compute_price(100, is_premium=True, years_active=20):.2f}")
print("Loyalty discount capped at 35%")

Pure vs impure comparison:

IMPURE — get_price_bad(user_id, 100):
Requires: database mock, logger mock, fake user object
Test setup: ~15 lines of mocking boilerplate
Risk: test might fail due to mock setup, not logic bugs

PURE — compute_price(100, is_premium=True, years_active=5):
Requires: nothing
Test: assert compute_price(100, is_premium=True, years_active=5) == 75.0
Risk: only fails if the logic is wrong

The pattern: Push I/O to the edges of your system. Keep business logic pure in the middle. The thin I/O wrapper fetches data, calls the pure function, and saves the result.
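
One way to keep the wrapper thin and still runnable is to pass the data source in as a parameter. This is only a sketch under stated assumptions: the User dataclass, the load_user parameter, and the fake_db dict are hypothetical stand-ins for whatever the real system uses.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical user record with the two fields the pricing logic needs."""
    is_premium: bool
    years_active: int


def compute_price(base_price, *, is_premium=False, years_active=0):
    """Pure pricing logic, equivalent to the solution above."""
    discount = min(0.20 + years_active * 0.01, 0.35) if is_premium else 0.0
    return round(base_price * (1 - discount), 2)


def get_user_price(user_id, base_price, *, load_user):
    """I/O boundary: fetch the user, then delegate to the pure function.

    `load_user` is an assumed callable (user_id -> User) that stands in
    for the real database access.
    """
    user = load_user(user_id)
    return compute_price(
        base_price,
        is_premium=user.is_premium,
        years_active=user.years_active,
    )


# In a test or demo, a dict lookup plays the role of the database.
fake_db = {42: User(is_premium=True, years_active=5)}
print(get_user_price(42, 100, load_user=fake_db.__getitem__))
```

The wrapper contains no arithmetic, so the only thing left to test at the boundary is that the right fields are forwarded.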

Expected Output
Basic (0 years): $100.00
Premium (0 years): $80.00
Premium (5 years): $75.00
Premium (20 years): $65.00
Loyalty discount capped at 35%
Hints

Hint 1: A pure function takes all data as arguments and returns a result with no side effects. It never calls a database, API, or logger.

Hint 2: Extract the pricing logic into a pure function that takes `is_premium` and `years_active` as inputs. The impure wrapper fetches the user data and calls the pure function.

#8 Config Object Instead of Many Parameters (Medium)
Tags: dataclass, config-object, parameter-grouping

The function below has 7 parameters — too many to keep straight. Refactor by grouping the delivery options into an EmailConfig dataclass, reducing the function signature to essentials plus a config object.

Python
from dataclasses import dataclass

@dataclass
class EmailConfig:
    """Configuration for email delivery."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30

def send_email(
    to,
    subject,
    body,
    *,
    config=None,
):
    """Send an email with the given configuration."""
    if config is None:
        config = EmailConfig()
    print(f"Sending email...")
    print(f"  to: {to}")
    print(f"  subject: {subject}")
    print(f"  html: {config.html}, retries: {config.retries}, timeout: {config.timeout_seconds}s")

# Default config
send_email("[email protected]", "Hello", "Hi Alice!")

# Custom config
urgent_config = EmailConfig(html=False, retries=1, timeout_seconds=10)
send_email("[email protected]", "Alert", "System down!", config=urgent_config)
Solution
# BEFORE: too many parameters
def send_email_bad(to, subject, body, html=True, retries=3,
                   timeout=30, track_opens=False):
    pass

# Call site is messy:
# send_email_bad("[email protected]", "Hi", "body", True, 3, 30, False)

# AFTER: grouped into config dataclass
from dataclasses import dataclass

@dataclass
class EmailConfig:
    """Configuration for email delivery."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30

def send_email(
    to,
    subject,
    body,
    *,
    config=None,
):
    """Send an email with the given configuration."""
    if config is None:
        config = EmailConfig()
    print(f"Sending email...")
    print(f"  to: {to}")
    print(f"  subject: {subject}")
    print(f"  html: {config.html}, retries: {config.retries}, timeout: {config.timeout_seconds}s")

send_email("[email protected]", "Hello", "Hi Alice!")

urgent_config = EmailConfig(html=False, retries=1, timeout_seconds=10)
send_email("[email protected]", "Alert", "System down!", config=urgent_config)

Why config objects beat long parameter lists:

1. Reusable — define once, pass to many calls:
prod_config = EmailConfig(retries=5, timeout_seconds=60)
send_email("[email protected]", "Test1", "...", config=prod_config)
send_email("[email protected]", "Test2", "...", config=prod_config)

2. Testable — config is a plain data object:
assert EmailConfig().retries == 3
assert EmailConfig(retries=1).retries == 1

3. Extensible — add a field without changing callers:
@dataclass
class EmailConfig:
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30
    track_opens: bool = False  # New! No existing calls break.

4. Self-documenting — IDE shows all fields and defaults

Rule of thumb: When 3+ parameters "travel together" and configure the same concern, extract them into a dataclass.
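
A config object can also be varied without repeating every field. The sketch below uses dataclasses.replace from the standard library to derive one config from another; marking the class frozen=True is an assumption of this example, not something the exercise's EmailConfig does.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EmailConfig:
    """Delivery options; frozen here so a shared config cannot be mutated."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30


prod_config = EmailConfig(retries=5, timeout_seconds=60)

# Derive a variant: only the changed field is named, the rest is copied.
plain_text_config = replace(prod_config, html=False)

print(prod_config)
print(plain_text_config)
```

The original prod_config is left untouched, so it stays safe to share across many send_email calls.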

Expected Output
Sending email...
  to: [email protected]
  subject: Hello
  html: True, retries: 3, timeout: 30s
Sending email...
  to: [email protected]
  subject: Alert
  html: False, retries: 1, timeout: 10s
Hints

Hint 1: When a function has more than 3-4 related parameters, group them into a dataclass. This gives you named fields, defaults, type hints, and a clean repr.

Hint 2: The `EmailConfig` dataclass holds delivery options (html, retries, timeout). The `send_email` function takes the essential args (to, subject, body) plus the config object.


Hard

#9 Redesign a Cryptic API (Hard)
Tags: full-redesign, enum, keyword-only, docstring, type-annotations

The function below is a worst-case API — cryptic name, mystery parameters, magic integers, positional booleans, and no documentation. Redesign it from scratch applying every clean API principle from the lesson.

Python
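from enum import Enum
from typing import Optional, Sequence

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: Sequence[float],
    aggregation: Aggregation,
    *,
    top_n: Optional[int] = None,
    use_absolute: bool = False,
) -> float: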
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("Cannot aggregate empty list")

    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

# Clean call sites — self-documenting
result1 = aggregate_top_values(
    [-3, 1, -5, 2],
    Aggregation.SUM,
    use_absolute=True,
    top_n=2,
)
print(f"Top 2 (sum, absolute): {result1}")

result2 = aggregate_top_values([1, 2, 3, 4], Aggregation.AVERAGE)
print(f"All (average): {result2}")

result3 = aggregate_top_values([3, 1, 5, 2], Aggregation.MAX, top_n=3)
print(f"Top 3 (max): {int(result3)}")
Solution
# BEFORE: the worst API
def do(x, y, z=1, flag=False, out=None):
    if flag:
        x = [abs(v) for v in x]
    x = sorted(x, reverse=True)[:z]
    if y == "sum":
        result = sum(x)
    elif y == "avg":
        result = sum(x) / len(x) if x else 0
    elif y == "max":
        result = max(x) if x else 0
    if out == "print":
        print(result)
    return result

# Problems:
# 1. Name "do" says nothing
# 2. x, y, z, flag, out: meaningless parameter names
# 3. z=1 is an unexplained magic default (it silently keeps only the top value)
# 4. y is a string mode flag, easy to typo
# 5. flag is a positional boolean
# 6. out="print" mixes I/O with computation
# 7. No type annotations, no docstring

# AFTER: clean API
from enum import Enum
from typing import Optional, Sequence

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: Sequence[float],
    aggregation: Aggregation,
    *,
    top_n: Optional[int] = None,
    use_absolute: bool = False,
) -> float:
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("Cannot aggregate empty list")

    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

result1 = aggregate_top_values(
    [-3, 1, -5, 2],
    Aggregation.SUM,
    use_absolute=True,
    top_n=2,
)
print(f"Top 2 (sum, absolute): {result1}")

result2 = aggregate_top_values([1, 2, 3, 4], Aggregation.AVERAGE)
print(f"All (average): {result2}")

result3 = aggregate_top_values([3, 1, 5, 2], Aggregation.MAX, top_n=3)
print(f"Top 3 (max): {int(result3)}")

Every fix mapped to a principle:

| Before | After | Principle |
| --- | --- | --- |
| do | aggregate_top_values | Naming communicates intent |
| x | values | Descriptive parameter names |
| y = "sum" | Aggregation.SUM | Enum replaces magic strings |
| z = 1 (unexplained default) | top_n = None | None means "no limit", so no magic values |
| flag (positional bool) | use_absolute (keyword-only) | Boolean flag anti-pattern fixed |
| out = "print" | Removed | Single responsibility: I/O is not this function's job |
| No annotations | Full annotations | Type annotations as documentation |
| No docstring | Google-style docstring | Documents args, returns, raises |
| Returns int or float | Always returns float | Consistent return type |
Expected Output
Top 2 (sum, absolute): 8.0
All (average): 2.5
Top 3 (max): 5
Hints

Hint 1: Identify every problem: cryptic name, positional booleans, string mode flag, inconsistent return behavior, no type annotations, no docstring.

Hint 2: Apply all principles: (1) descriptive function name, (2) enum for aggregation mode, (3) keyword-only args, (4) type annotations, (5) consistent float return, (6) Google-style docstring.

#10 Builder Pattern for Complex Configuration (Hard)
Tags: builder-pattern, chaining, fluent-interface, immutable-config

Implement a QueryBuilder that constructs a database query configuration step by step. The builder should support method chaining (fluent interface) and produce an immutable QueryConfig result. Validate that a table is set before building.

Python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueryConfig:
    """Immutable query configuration — cannot be modified after creation."""
    table: str
    filters: tuple
    order_by: str
    limit: int
    offset: int

class QueryBuilder:
    """Fluent builder for constructing QueryConfig objects."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def from_table(self, table):
        """Set the target table."""
        self._table = table
        return self

    def where(self, column, operator, value):
        """Add a filter condition."""
        self._filters.append((column, operator, value))
        return self

    def order_by(self, column):
        """Set the sort column."""
        self._order_by = column
        return self

    def limit(self, n):
        """Set the maximum number of results."""
        self._limit = n
        return self

    def offset(self, n):
        """Set the starting offset."""
        self._offset = n
        return self

    def build(self):
        """Validate and produce an immutable QueryConfig."""
        if not self._table:
            raise ValueError("Table is required — call .from_table() first")
        return QueryConfig(
            table=self._table,
            filters=tuple(self._filters),
            order_by=self._order_by or self._table,
            limit=self._limit,
            offset=self._offset,
        )

# Fluent chained usage
config1 = (
    QueryBuilder()
    .from_table("users")
    .where("age", ">", 21)
    .where("status", "=", "active")
    .order_by("name")
    .limit(10)
    .build()
)
print(config1)

config2 = (
    QueryBuilder()
    .from_table("orders")
    .where("total", ">", 100)
    .order_by("created_at")
    .limit(50)
    .build()
)
print(config2)
Solution
from dataclasses import dataclass, field

@dataclass(frozen=True)
class QueryConfig:
    """Immutable query configuration — cannot be modified after creation."""
    table: str
    filters: tuple
    order_by: str
    limit: int
    offset: int

class QueryBuilder:
    """Fluent builder for constructing QueryConfig objects."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def from_table(self, table):
        self._table = table
        return self  # enables chaining

    def where(self, column, operator, value):
        self._filters.append((column, operator, value))
        return self

    def order_by(self, column):
        self._order_by = column
        return self

    def limit(self, n):
        self._limit = n
        return self

    def offset(self, n):
        self._offset = n
        return self

    def build(self):
        if not self._table:
            raise ValueError("Table is required — call .from_table() first")
        return QueryConfig(
            table=self._table,
            filters=tuple(self._filters),
            order_by=self._order_by or self._table,
            limit=self._limit,
            offset=self._offset,
        )

config1 = (
    QueryBuilder()
    .from_table("users")
    .where("age", ">", 21)
    .where("status", "=", "active")
    .order_by("name")
    .limit(10)
    .build()
)
print(config1)

config2 = (
    QueryBuilder()
    .from_table("orders")
    .where("total", ">", 100)
    .order_by("created_at")
    .limit(50)
    .build()
)
print(config2)

Builder pattern anatomy:

QueryBuilder()                       # 1. Create mutable builder
    .from_table("users")             # 2. Set required fields
    .where("age", ">", 21)           # 3. Add optional config (chainable)
    .where("status", "=", "ok")      # 4. Chain more config
    .order_by("name")                # 5. Chain more config
    .limit(10)                       # 6. Chain more config
    .build()                         # 7. Validate + produce immutable result

Why builders beat constructors for complex objects:

  • Readable: Each step is named — .where("age", ">", 21) vs positional args.
  • Flexible: Optional steps can be skipped — only .from_table() is required.
  • Safe: build() validates before producing the immutable result. The frozen=True dataclass prevents accidental mutation after construction.
  • Composable: You can pass a partially-built builder around and let different parts of the system add their constraints.
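The "Safe" and "Composable" points can be illustrated with a small sketch. It assumes a trimmed-down copy of the builder (table, filters, and limit only), and the helper names add_tenant_scope and add_paging are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class QueryConfig:
    table: str
    filters: tuple
    limit: int


class QueryBuilder:
    """Trimmed-down builder, for illustration only."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._limit = 100

    def from_table(self, table):
        self._table = table
        return self

    def where(self, column, operator, value):
        self._filters.append((column, operator, value))
        return self

    def limit(self, n):
        self._limit = n
        return self

    def build(self):
        if not self._table:
            raise ValueError("Table is required")
        return QueryConfig(self._table, tuple(self._filters), self._limit)


# Hypothetical helpers: each part of the system adds its own constraint.
def add_tenant_scope(builder, tenant_id):
    return builder.where("tenant_id", "=", tenant_id)


def add_paging(builder, page_size):
    return builder.limit(page_size)


builder = QueryBuilder().from_table("orders")
builder = add_tenant_scope(builder, 42)
builder = add_paging(builder, 25)
config = builder.build()
print(config)

# The frozen dataclass rejects attribute assignment after construction.
try:
    config.limit = 1000
except FrozenInstanceError as exc:
    print(f"Rejected: {type(exc).__name__}")
```

Note that frozen=True only blocks rebinding attributes. Converting the filter list to a tuple in build() is what keeps callers from appending to it afterwards.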

When to use the builder pattern:

  • Object has many optional fields with sensible defaults
  • Construction requires validation that spans multiple fields
  • You want the final object to be immutable
  • You want call sites to read like a declarative specification
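"Validation that spans multiple fields" can be made concrete with a sketch. The rule used here (a non-zero offset requires an explicit sort column, since paging over an unordered result is ill-defined) is an assumption for illustration and not part of the exercise; PageConfig and PageBuilder are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageConfig:
    order_by: Optional[str]
    limit: int
    offset: int


class PageBuilder:
    def __init__(self):
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def order_by(self, column):
        self._order_by = column
        return self

    def limit(self, n):
        self._limit = n
        return self

    def offset(self, n):
        self._offset = n
        return self

    def build(self):
        # Single-field checks.
        if self._limit <= 0:
            raise ValueError("limit must be positive")
        if self._offset < 0:
            raise ValueError("offset must be non-negative")
        # Cross-field check (assumed rule): paging needs a sort column.
        if self._offset > 0 and self._order_by is None:
            raise ValueError("offset requires order_by")
        return PageConfig(self._order_by, self._limit, self._offset)


ok = PageBuilder().order_by("created_at").limit(20).offset(40).build()
print(ok)

error_message = None
try:
    PageBuilder().offset(40).build()
except ValueError as exc:
    error_message = str(exc)
    print(f"Invalid: {error_message}")
```

Because the check lives in build(), the chained setters can be called in any order; the combination is only judged once the configuration is complete.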
Expected Output
QueryConfig(table='users', filters=(('age', '>', 21), ('status', '=', 'active')), order_by='name', limit=10, offset=0)
QueryConfig(table='orders', filters=(('total', '>', 100),), order_by='created_at', limit=50, offset=0)
Hints

Hint 1: A builder collects configuration step by step and produces a final immutable object. Each method returns `self` to enable chaining.

Hint 2: The `build()` method validates the configuration and returns a frozen dataclass or namedtuple. Once built, the config cannot be modified — this prevents accidental mutation.
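Hint 2 mentions a namedtuple as an alternative to a frozen dataclass. A minimal sketch of that variant, assuming the same five fields and using typing.NamedTuple (the defaults for limit and offset are an illustrative choice):

```python
from typing import NamedTuple


class QueryConfig(NamedTuple):
    table: str
    filters: tuple
    order_by: str
    limit: int = 100
    offset: int = 0


config = QueryConfig(table="users", filters=(("age", ">", 21),), order_by="name")
print(config)

# Fields of a namedtuple are read-only.
try:
    config.limit = 10
except AttributeError:
    print("namedtuple fields are read-only")

# _replace returns a modified copy and leaves the original untouched.
smaller = config._replace(limit=10)
print(smaller.limit, config.limit)
```

One trade-off to keep in mind: a namedtuple is still a tuple, so it supports indexing and unpacking, which a frozen dataclass does not.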

#11 Fluent Data Pipeline (Hard)
Tags: fluent-interface, lazy-evaluation, pipeline, chaining

Implement a lazy, chainable Pipeline class that supports filter, map, sort_by, limit, collect, and reduce. All transformations should be deferred until collect() or reduce() is called. Each chainable method returns a new Pipeline (does not mutate the original).

Python
from functools import reduce as functools_reduce

class Pipeline:
    """Lazy, chainable data transformation pipeline."""

    def __init__(self, source):
        self._source = source
        self._ops = []

    def _clone_with(self, op):
        """Create a new Pipeline with an additional operation."""
        new = Pipeline(self._source)
        new._ops = self._ops + [op]
        return new

    def filter(self, predicate):
        """Keep elements where predicate returns True."""
        return self._clone_with(lambda data, p=predicate: [x for x in data if p(x)])

    def map(self, transform):
        """Apply transform to every element."""
        return self._clone_with(lambda data, f=transform: [f(x) for x in data])

    def sort_by(self, key, *, descending=False):
        """Sort elements by a key name or function."""
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        return self._clone_with(
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        )

    def limit(self, n):
        """Keep only the first n elements."""
        return self._clone_with(lambda data, n=n: data[:n])

    def collect(self):
        """Execute all operations and return a list."""
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result

    def reduce(self, func, initial):
        """Execute operations and fold into a single value."""
        return functools_reduce(func, self.collect(), initial)

# Test data
records = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]

# Chain: filter active -> boost scores -> sort -> top 2
top = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .map(lambda r: {"name": r["name"], "score": round(r["score"] * 1.1, 1)})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)
print(f"Top scorers: {top}")

# Reduce: sum of active scores
total = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .reduce(lambda acc, r: acc + r["score"], 0)
)
print(f"Total active score: {total}")

# Another chain over the same records: sort active users, then project names
names = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .sort_by("score", descending=True)
    .map(lambda r: r["name"])
    .collect()
)
print(f"Names: {names}")
Solution
from functools import reduce as functools_reduce

class Pipeline:
    """Lazy, chainable data transformation pipeline."""

    def __init__(self, source):
        self._source = source
        self._ops = []

    def _clone_with(self, op):
        """Create a new Pipeline with an additional operation."""
        new = Pipeline(self._source)
        new._ops = self._ops + [op]
        return new

    def filter(self, predicate):
        return self._clone_with(lambda data, p=predicate: [x for x in data if p(x)])

    def map(self, transform):
        return self._clone_with(lambda data, f=transform: [f(x) for x in data])

    def sort_by(self, key, *, descending=False):
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        return self._clone_with(
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        )

    def limit(self, n):
        return self._clone_with(lambda data, n=n: data[:n])

    def collect(self):
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result

    def reduce(self, func, initial):
        return functools_reduce(func, self.collect(), initial)

records = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]

top = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .map(lambda r: {"name": r["name"], "score": round(r["score"] * 1.1, 1)})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)
print(f"Top scorers: {top}")

total = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .reduce(lambda acc, r: acc + r["score"], 0)
)
print(f"Total active score: {total}")

names = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .sort_by("score", descending=True)
    .map(lambda r: r["name"])
    .collect()
)
print(f"Names: {names}")

Key design decisions:

1. LAZY: operations are stored, not executed:
    Pipeline(data).filter(fn).map(fn)    # no work done yet
        .collect()                       # NOW all ops run in sequence

2. IMMUTABLE CHAINS: each method returns a NEW Pipeline:
    base = Pipeline(data).filter(active)
    branch_a = base.limit(5)             # does not modify base
    branch_b = base.sort_by("score")     # does not modify base

3. LAMBDA CAPTURE: default arguments bind values at creation time:
    lambda data, p=predicate: ...        # p is bound when the lambda is created
    # Each method call here already has its own scope, so a plain closure
    # over predicate would also work. The idiom matters when lambdas are
    # created in a loop that rebinds one variable: without it, every
    # lambda would see the last value.

4. FLUENT INTERFACE: returning self or a new instance enables chaining:
    Pipeline(data).filter(fn).map(fn).limit(5).collect()
    # Reads like a sentence: "filter, then map, then limit, then collect"
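The late-binding behavior behind point 3 is easiest to see in a loop. The following sketch is independent of the Pipeline class; the names late, bound, scoped, and make_check are illustrative only.

```python
# Late binding: each lambda looks up `threshold` when it is CALLED,
# by which time the loop has finished and threshold is 30.
late = []
for threshold in (10, 20, 30):
    late.append(lambda x: x > threshold)

# Default-argument idiom: `t` is evaluated when the lambda is CREATED.
bound = []
for threshold in (10, 20, 30):
    bound.append(lambda x, t=threshold: x > t)


# A function call creates a fresh scope per lambda, so a plain closure
# is also safe. This is the situation inside Pipeline.filter and friends.
def make_check(limit_value):
    return lambda x: x > limit_value


scoped = [make_check(t) for t in (10, 20, 30)]

print([check(15) for check in late])    # [False, False, False]
print([check(15) for check in bound])   # [True, False, False]
print([check(15) for check in scoped])  # [True, False, False]
```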

Clean API principles applied:

  • Naming: filter, map, sort_by, limit, collect, reduce — all standard verbs from functional programming.
  • Keyword-only: sort_by(key, *, descending=False) — prevents sort_by("name", True).
  • Consistent returns: chainable methods return Pipeline, terminal methods return the result type.
  • Single responsibility: each chainable method does exactly one thing: it adds one operation to the chain.
  • Least surprise: collect() returns a list, reduce() returns a single value — no surprises.
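The keyword-only point can be checked directly. This sketch uses a standalone sort_by function (a simplification, not the Pipeline method) to show that the bare * makes Python reject a positional boolean.

```python
def sort_by(records, key, *, descending=False):
    """Sort dicts by a key name; `descending` must be passed by keyword."""
    return sorted(records, key=lambda r: r[key], reverse=descending)


records = [{"name": "Bob"}, {"name": "Alice"}]

ascending = sort_by(records, "name")
descending = sort_by(records, "name", descending=True)
print([r["name"] for r in ascending])   # ['Alice', 'Bob']
print([r["name"] for r in descending])  # ['Bob', 'Alice']

# Passing the flag positionally is rejected by Python itself.
try:
    sort_by(records, "name", True)
    positional_accepted = True
except TypeError as exc:
    positional_accepted = False
    print(f"Rejected: {exc}")
```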
Expected Output
Top scorers: [{'name': 'Dave', 'score': 96.8}, {'name': 'Alice', 'score': 88.0}]
Total active score: 238
Names: ['Dave', 'Alice', 'Carol']
Hints

Hint 1: Each method (filter, map, sort_by, limit) should store the operation but NOT execute it. Return a new Pipeline instance with the operation appended.

Hint 2: The `collect()` method applies all stored operations in order. The `reduce()` method applies them and then folds. Lazy evaluation means no work happens until collect/reduce is called.
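One way to confirm the deferred execution described in the hints is to record when the transform actually runs. This is a minimal sketch with a cut-down Pipeline (map and collect only); passing ops through the constructor is an assumption made to keep it short and differs from the exercise's _clone_with helper.

```python
class Pipeline:
    """Cut-down pipeline: only map and collect, for illustration."""

    def __init__(self, source, ops=()):
        self._source = source
        self._ops = tuple(ops)

    def map(self, transform):
        def op(data):
            return [transform(x) for x in data]
        # Return a new Pipeline; the original keeps its own ops.
        return Pipeline(self._source, self._ops + (op,))

    def collect(self):
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result


calls = []


def double(x):
    calls.append(x)  # record every invocation
    return x * 2


base = Pipeline([1, 2, 3])
doubled = base.map(double)

calls_before_collect = list(calls)
print(calls_before_collect)  # [] : nothing has run yet

result = doubled.collect()
print(result)                # [2, 4, 6]
print(calls)                 # [1, 2, 3]

# The base pipeline is unaffected by the branch built from it.
print(base.collect())        # [1, 2, 3]
```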

© 2026 EngineersOfAI. All rights reserved.