Python Designing Clean APIs Practice Problems & Exercises

Practice: Designing Clean APIs

11 problems · 4 Easy · 4 Medium · 3 Hard · ⏱ 40-55 min

Easy

#1 Fix the Function Name (Easy)

naming, verb-phrases, intent

The function below works correctly but has a terrible name. Rename it (and its internal variable) so that a caller can understand what it does without reading the body.

Python

def filter_active_users(users):
    """Return only users whose 'active' field is True."""
    active = [u for u in users if u.get("active")]
    return active

# Test
users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
    {"name": "Carol", "active": True},
]

result = filter_active_users(users)
print(f"Active users: {result}")
print(f"Total active: {len(result)}")

Solution

# BEFORE — bad name
def do(d):
    r = [u for u in d if u.get("active")]
    return r

# AFTER — intent is clear from the name alone
def filter_active_users(users):
    """Return only users whose 'active' field is True."""
    active = [u for u in users if u.get("active")]
    return active

users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
    {"name": "Carol", "active": True},
]

result = filter_active_users(users)
print(f"Active users: {result}")
print(f"Total active: {len(result)}")

Naming principles applied:

do(d) violates every naming rule: no verb phrase, no indication of what d is, no hint at the return type.
filter_active_users(users) tells you: (1) it filters, (2) it returns active items, (3) it works on users.
The variable r becomes active — descriptive of what it holds.

The newspaper test: Can you describe the function in one verb phrase? "Filter active users" — passes. "Do d" — fails.
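
As a rough illustration of the newspaper test, the sketch below names three small helpers with verb phrases. The helper names (`count_active_users`, `get_user_names`, `has_active_users`) are hypothetical and not part of the exercise; they only show how a verb prefix hints at the return value.

```python
# Hypothetical helpers, named so that each call site reads like a short sentence.
def count_active_users(users):
    """Return how many users have a truthy 'active' field."""
    return sum(1 for u in users if u.get("active"))

def get_user_names(users):
    """Return the 'name' of every user, in input order."""
    return [u["name"] for u in users]

def has_active_users(users):
    """Return True if at least one user is active."""
    return any(u.get("active") for u in users)

users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
]
print(count_active_users(users))
print(get_user_names(users))
print(has_active_users(users))
```

The prefixes carry most of the meaning: `count_` suggests an integer, `get_` a retrieved value, and `has_` a boolean.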

Expected Output

Active users: [{'name': 'Alice', 'active': True}, {'name': 'Carol', 'active': True}]
Total active: 2

Hints

Hint 1: Function names should be verb phrases that describe WHAT the function does, not vague words like "do", "run", or "process".

Hint 2: Use the prefix table: `get_` for read-only retrieval, `filter_` or descriptive verb for transformations. The function filters users by active status, so a name like `get_active_users` or `filter_active_users` communicates intent.

#2 Enforce Keyword-Only Arguments (Easy)

keyword-only, star-separator, call-site-clarity

The function below takes too many positional arguments. Refactor it so that data remains positional but all other parameters are keyword-only (using the * separator). This prevents callers from passing unlabeled values.

Python

def create_report(
    data,
    *,
    title,
    author,
    format="pdf",
    include_charts=True,
):
    """Create a report from data with the given configuration."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

# These calls MUST use keyword names — positional won't work
report1 = create_report(
    [1, 2, 3],
    title="Q4 Summary",
    author="Alice",
)
print(report1)

report2 = create_report(
    [4, 5, 6],
    title="Monthly",
    author="Bob",
    format="html",
    include_charts=False,
)
print(report2)

Solution

# BEFORE — all positional, caller can mix up order
def create_report(data, title, author, format="pdf", include_charts=True):
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

# Dangerous call — which string is title, which is author?
# create_report([1,2,3], "Q4 Summary", "Alice", "pdf", True)

# AFTER — keyword-only after *
def create_report(
    data,
    *,
    title,
    author,
    format="pdf",
    include_charts=True,
):
    """Create a report from data with the given configuration."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

report1 = create_report(
    [1, 2, 3],
    title="Q4 Summary",
    author="Alice",
)
print(report1)

report2 = create_report(
    [4, 5, 6],
    title="Monthly",
    author="Bob",
    format="html",
    include_charts=False,
)
print(report2)

Why keyword-only matters:

BEFORE (positional):
  create_report(data, "Q4", "Alice", "pdf", True)
  # What does True mean? What if you swap "Q4" and "Alice"?

AFTER (keyword-only):
  create_report(data, title="Q4", author="Alice", include_charts=True)
  # Every argument is labeled — impossible to confuse

Rule of thumb: If a function has more than 2-3 parameters, make everything after the first one or two keyword-only. The * separator costs nothing at runtime but prevents entire categories of bugs.
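
To see the separator at work, the sketch below calls a condensed copy of `create_report` with unlabeled positional values. Python rejects the call with a `TypeError` before the function body runs.

```python
def create_report(data, *, title, author, format="pdf", include_charts=True):
    """Condensed copy of the exercise function."""
    return f"Report: {title} by {author} ({format}, charts={include_charts})"

error_message = None
try:
    # Positional title and author are no longer accepted after the bare *.
    create_report([1, 2, 3], "Q4 Summary", "Alice")
except TypeError as exc:
    error_message = str(exc)
    print(f"Rejected: {error_message}")

# The labeled call works as before.
print(create_report([1, 2, 3], title="Q4 Summary", author="Alice"))
```

The mistake surfaces at the call site as an exception, not as a silently mislabeled report.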

Expected Output

Report: Q4 Summary by Alice (pdf, charts=True)
Report: Monthly by Bob (html, charts=False)

Hints

Hint 1: Place a bare `*` after the positional parameter `data` in the function signature. Everything after `*` becomes keyword-only.

Hint 2: Keyword-only arguments force callers to write `title="Q4 Summary"` instead of just `"Q4 Summary"` — making every call site self-documenting.

#3 Consistent Return Types (Easy)

return-type, consistency, empty-collection

The two functions below have inconsistent return types — they return None when there are no results, forcing callers to check for None before iterating. Fix both functions to always return a list (empty list for no results).

Python

# Simulated database
tag_db = {1: ["python", "tutorial"], 2: ["javascript", "react"]}
score_db = {"alice": [95, 87], "bob": [72, 68, 91]}

def get_tags(post_id):
    """Return tags for a post. Always returns a list."""
    return tag_db.get(post_id, [])

def get_scores(username):
    """Return scores for a user. Always returns a list."""
    return score_db.get(username, [])

# Callers can iterate safely without None checks
tags1 = get_tags(1)
print(f"Tags for post 1: {tags1}")

tags_missing = get_tags(999)
print(f"Tags for post 999: {tags_missing}")
print(f"Can iterate empty: {[t.upper() for t in tags_missing] == []}")

scores1 = get_scores("alice")
print(f"Scores for alice: {scores1}")

scores_missing = get_scores("unknown")
print(f"Scores for unknown: {scores_missing}")

Solution

# BEFORE — inconsistent, returns None or list
def get_tags_bad(post_id):
    tags = tag_db.get(post_id)
    if not tags:
        return None        # Caller must check for None!
    return tags

def get_scores_bad(username):
    if username not in score_db:
        return None        # Caller must check for None!
    return score_db[username]

# AFTER — always returns a list
tag_db = {1: ["python", "tutorial"], 2: ["javascript", "react"]}
score_db = {"alice": [95, 87], "bob": [72, 68, 91]}

def get_tags(post_id):
    """Return tags for a post. Always returns a list."""
    return tag_db.get(post_id, [])

def get_scores(username):
    """Return scores for a user. Always returns a list."""
    return score_db.get(username, [])

tags1 = get_tags(1)
print(f"Tags for post 1: {tags1}")

tags_missing = get_tags(999)
print(f"Tags for post 999: {tags_missing}")
print(f"Can iterate empty: {[t.upper() for t in tags_missing] == []}")

scores1 = get_scores("alice")
print(f"Scores for alice: {scores1}")

scores_missing = get_scores("unknown")
print(f"Scores for unknown: {scores_missing}")

Why consistent return types matter:

BEFORE (returns None or list):
  tags = get_tags(999)
  # tags is None — next line crashes:
  for t in tags:         # TypeError: 'NoneType' object is not iterable
      print(t.upper())

  # Caller must write defensive code EVERY time:
  tags = get_tags(999)
  if tags is not None:
      for t in tags:
          print(t.upper())

AFTER (always returns list):
  tags = get_tags(999)   # returns []
  for t in tags:         # safe — iterating empty list does nothing
      print(t.upper())

Rule: Prefer empty collections over None for "no results." Reserve None for cases where absence has a distinct semantic meaning (e.g., find_user() returns None meaning "user does not exist" vs returning a user dict).
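
One way to make the two contracts visible is to put them in the type annotations. The sketch below is a minimal illustration; the `posts` dictionary and the `find_post` function are hypothetical stand-ins, not part of the exercise.

```python
from typing import List, Optional

# Hypothetical in-memory data, for illustration only.
posts = {1: {"title": "Intro", "tags": ["python", "tutorial"]}}

def get_tags(post_id: int) -> List[str]:
    """Collection query: always a list, empty when nothing matches."""
    post = posts.get(post_id)
    return list(post["tags"]) if post else []

def find_post(post_id: int) -> Optional[dict]:
    """Single-item lookup: None means 'no such post'."""
    return posts.get(post_id)

for tag in get_tags(999):   # loop body never runs; no None check needed
    print(tag)

post = find_post(999)
if post is None:            # absence is meaningful here, so the check is explicit
    print("Post 999 does not exist")
```

`List[str]` tells the caller that iteration is always safe, while `Optional[dict]` signals that a `None` check is part of the contract.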

Expected Output

Tags for post 1: ['python', 'tutorial']
Tags for post 999: []
Can iterate empty: True
Scores for alice: [95, 87]
Scores for unknown: []

Hints

Hint 1: A function that returns a list should ALWAYS return a list — including an empty list `[]` for "no results". Never return `None` when the caller expects a list.

Hint 2: Returning `None` for "not found" forces every caller to write `if result is not None:` before iterating. An empty list lets callers iterate safely without any check.

#4 Replace Boolean Flag with Enum (Easy)

enum, boolean-flag, readability

The function below uses a boolean flag reverse to control sort order. At the call site, sort_values(data, True) is ambiguous. Refactor to use a SortOrder enum so call sites are self-documenting.

Python

from enum import Enum

class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

def sort_values(values, *, order=SortOrder.ASCENDING):
    """Sort a list of values in the specified order."""
    descending = order == SortOrder.DESCENDING
    return sorted(values, reverse=descending)

# Test — call sites are now unambiguous
data = [3, 1, 4, 1, 5, 2]

asc = sort_values(data, order=SortOrder.ASCENDING)
print(f"Ascending: {asc}")

desc = sort_values(data, order=SortOrder.DESCENDING)
print(f"Descending: {desc}")

# Default is ascending
default = sort_values(data)
print(f"Ascending (explicit): {default}")

Solution

# BEFORE — boolean flag
def sort_values_bad(values, reverse=False):
    return sorted(values, reverse=reverse)

# Call site: what does True mean?
# sort_values_bad([3,1,4], True)   # Reverse what? Sort? Order?

# AFTER — enum makes intent explicit
from enum import Enum

class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

def sort_values(values, *, order=SortOrder.ASCENDING):
    """Sort a list of values in the specified order."""
    descending = order == SortOrder.DESCENDING
    return sorted(values, reverse=descending)

data = [3, 1, 4, 1, 5, 2]

asc = sort_values(data, order=SortOrder.ASCENDING)
print(f"Ascending: {asc}")

desc = sort_values(data, order=SortOrder.DESCENDING)
print(f"Descending: {desc}")

default = sort_values(data)
print(f"Ascending (explicit): {default}")

Why enums beat booleans:

BEFORE:
  sort_values(data, True)      # What does True mean??
  sort_values(data, False)     # And False??

AFTER:
  sort_values(data, order=SortOrder.DESCENDING)   # Crystal clear
  sort_values(data, order=SortOrder.ASCENDING)     # No ambiguity

Bonus: Enums are extensible. If you later need SortOrder.RANDOM or SortOrder.STABLE_DESCENDING, you add a member — no boolean gymnastics needed. Booleans only offer two states; enums offer as many as you need.
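
Enums also help when the sort order arrives as a string, for example from a query parameter. The sketch below assumes a hypothetical `parse_sort_order` helper (not part of the exercise) that converts the string once, at the boundary, so the rest of the code only ever sees `SortOrder` members.

```python
from enum import Enum

class SortOrder(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

def parse_sort_order(raw):
    """Convert a user-supplied string into a SortOrder, or raise ValueError."""
    try:
        # Looking an enum up by value raises ValueError for unknown values.
        return SortOrder(raw.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in SortOrder)
        raise ValueError(
            f"Unknown sort order {raw!r}; expected one of: {valid}"
        ) from None

print(parse_sort_order("Descending"))
try:
    parse_sort_order("sideways")
except ValueError as exc:
    print(exc)
```

A typo such as "decending" fails loudly here, whereas a mistyped boolean or string flag could silently select the wrong branch.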

Expected Output

Ascending: [1, 1, 2, 3, 4, 5]
Descending: [5, 4, 3, 2, 1, 1]
Ascending (explicit): [1, 1, 2, 3, 4, 5]

Hints

Hint 1: A boolean flag `reverse=True` at a call site leaves the reader guessing: "reverse what?" An enum like `SortOrder.DESCENDING` is unambiguous.

Hint 2: Create a `SortOrder` enum with `ASCENDING` and `DESCENDING` members. Replace the boolean parameter with a keyword-only `order: SortOrder` parameter.

Medium

#5 Apply Single Responsibility Principle (Medium)

single-responsibility, separation-of-concerns, refactoring

The function below violates single responsibility — it validates, creates, saves, and notifies all in one place. Split it into focused functions that each do one thing. Then compose them in a coordinator function.

Python

def create_user(name, email):
    """Create a user dict from name and email."""
    return {"name": name, "email": email}

def validate_user(user):
    """Validate that a user dict has required fields with valid data."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True

def save_user(user):
    """Persist a user to the database."""
    print(f"Saved: {user['email']}")
    return True

def notify_user(user):
    """Send a welcome notification to the user."""
    print(f"Notified: {user['email']}")
    return True

def register_user(name, email):
    """Orchestrate user registration: create, validate, save, notify."""
    user = create_user(name, email)
    print(f"User: {user}")
    valid = validate_user(user)
    print(f"Valid: {valid}")
    save_user(user)
    notify_user(user)

# Test
register_user("Alice", "alice@example.com")

Solution

# BEFORE — one function does everything
def register_user_bad(name, email):
    if not name:
        raise ValueError("Name required")
    if "@" not in email:
        raise ValueError("Invalid email")
    user = {"name": name, "email": email}
    # save to db...
    print(f"Saved: {email}")
    # send notification...
    print(f"Notified: {email}")
    return user

# AFTER — each function does one thing
def create_user(name, email):
    """Create a user dict from name and email."""
    return {"name": name, "email": email}

def validate_user(user):
    """Validate that a user dict has required fields with valid data."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True

def save_user(user):
    """Persist a user to the database."""
    print(f"Saved: {user['email']}")
    return True

def notify_user(user):
    """Send a welcome notification to the user."""
    print(f"Notified: {user['email']}")
    return True

def register_user(name, email):
    """Orchestrate user registration: create, validate, save, notify."""
    user = create_user(name, email)
    print(f"User: {user}")
    valid = validate_user(user)
    print(f"Valid: {valid}")
    save_user(user)
    notify_user(user)

register_user("Alice", "alice@example.com")

Single responsibility benefits:

Testability:
  - test_validate_user() — no DB, no network, just logic
  - test_save_user() — mock only the DB
  - test_notify_user() — mock only the mailer

Reusability:
  - validate_user() can be used in update_user() too
  - save_user() works for any user dict, not just new registrations
  - notify_user() can be reused for password resets

Debuggability:
  - If notifications fail, you know it is in notify_user()
  - The monolithic version could fail anywhere in 20 lines

The newspaper test for each function:

create_user — "Create a user dict." Pass.
validate_user — "Validate user data." Pass.
save_user — "Save user to database." Pass.
notify_user — "Send welcome notification." Pass.
register_user — "Orchestrate registration." Pass (it delegates, does not implement).
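
The testability claim can be made concrete. The sketch below exercises a copy of `validate_user` with plain assertions and no mocks; the test function names and the `alice@example.com` address are illustrative placeholders.

```python
def validate_user(user):
    """Copy of the exercise function: pure validation, no I/O."""
    if not user.get("name"):
        raise ValueError("Name is required")
    if not user.get("email") or "@" not in user["email"]:
        raise ValueError("Valid email is required")
    return True

def test_accepts_valid_user():
    assert validate_user({"name": "Alice", "email": "alice@example.com"}) is True

def test_rejects_missing_name():
    try:
        validate_user({"name": "", "email": "alice@example.com"})
    except ValueError as exc:
        assert "Name" in str(exc)
    else:
        raise AssertionError("expected ValueError for missing name")

def test_rejects_email_without_at_sign():
    try:
        validate_user({"name": "Alice", "email": "alice.example.com"})
    except ValueError as exc:
        assert "email" in str(exc)
    else:
        raise AssertionError("expected ValueError for malformed email")

# Run the checks directly; a test runner such as pytest would also collect them.
for test in (test_accepts_valid_user, test_rejects_missing_name,
             test_rejects_email_without_at_sign):
    test()
print("all validate_user checks passed")
```

None of these tests touches a database or a mailer, which the monolithic version would have required.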

Expected Output

User: {'name': 'Alice', 'email': 'alice@example.com'}
Valid: True
Saved: alice@example.com
Notified: alice@example.com

Hints

Hint 1: The original function does four things: validate, create, save, and notify. Each should be its own function with a single purpose.

Hint 2: Apply the newspaper test: describe each function in one sentence without "and" or "or." If you cannot, split further. `create_user`, `validate_user`, `save_user`, and `notify_user` each pass this test.

#6 Design Error Handling with Consistent Contracts (Medium)

error-handling, return-type, ValueError, contract

Design three functions with clear, consistent error handling contracts. Each function demonstrates a different but internally consistent pattern: raise on invalid input, raise on domain violations, or return None for lookups.

Python

def divide(a, b):
    """Divide a by b. Raises ZeroDivisionError if b is zero."""
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

def parse_age(value):
    """Parse a string into a valid age. Raises ValueError on invalid input."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        raise ValueError(f"Cannot parse age from: {value!r}")
    if age < 0 or age > 150:
        raise ValueError(f"Age out of valid range: {age}")
    return age

def find_user(username):
    """Look up a user by username. Returns dict or None if not found."""
    users = {
        "alice": {"name": "Alice", "email": "[email protected]"},
        "bob": {"name": "Bob", "email": "[email protected]"},
    }
    return users.get(username)

# Test divide
result = divide(10, 3)
print(f"divide(10, 3): {result:.4f}")
try:
    divide(10, 0)
except ZeroDivisionError:
    print("divide(10, 0): raised ZeroDivisionError")

# Test parse_age
print(f"parse_age('25'): {parse_age('25')}")
try:
    parse_age("abc")
except ValueError:
    print("parse_age('abc'): raised ValueError")
try:
    parse_age("-5")
except ValueError:
    print("parse_age('-5'): raised ValueError")

# Test find_user
user = find_user("alice")
print(f"find_user('alice'): {user['email']}")
missing = find_user("unknown")
print(f"find_user('unknown'): {missing}")

Solution

def divide(a, b):
    """Divide a by b. Raises ZeroDivisionError if b is zero."""
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

def parse_age(value):
    """Parse a string into a valid age. Raises ValueError on invalid input."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        raise ValueError(f"Cannot parse age from: {value!r}")
    if age < 0 or age > 150:
        raise ValueError(f"Age out of valid range: {age}")
    return age

def find_user(username):
    """Look up a user by username. Returns dict or None if not found."""
    users = {
        "alice": {"name": "Alice", "email": "[email protected]"},
        "bob": {"name": "Bob", "email": "[email protected]"},
    }
    return users.get(username)

result = divide(10, 3)
print(f"divide(10, 3): {result:.4f}")
try:
    divide(10, 0)
except ZeroDivisionError:
    print("divide(10, 0): raised ZeroDivisionError")

print(f"parse_age('25'): {parse_age('25')}")
try:
    parse_age("abc")
except ValueError:
    print("parse_age('abc'): raised ValueError")
try:
    parse_age("-5")
except ValueError:
    print("parse_age('-5'): raised ValueError")

user = find_user("alice")
print(f"find_user('alice'): {user['email']}")
missing = find_user("unknown")
print(f"find_user('unknown'): {missing}")

Three error handling contracts:

Contract 1 — Always returns or raises (computation):
  divide(a, b) -> float       # always float
  divide(a, 0) -> raises      # never returns None for errors

Contract 2 — Always returns or raises (validation):
  parse_age("25") -> int      # always int
  parse_age("abc") -> raises  # invalid input = exception
  parse_age("-5") -> raises   # domain violation = exception

Contract 3 — Returns value or None (lookup):
  find_user("alice") -> dict  # found
  find_user("xxx") -> None    # not found (absence, not error)

When to raise vs return None:

Raise when the caller gave you bad input (wrong type, out of range, impossible operation). The caller made a mistake.
Return None when the input is valid but the item does not exist. "Not found" is a normal outcome, not an error.
Never mix both in the same function — pick one contract and stick to it.
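
From the caller's side, the two contracts call for two different idioms. The sketch below uses condensed copies of `parse_age` and `find_user` together with a hypothetical `describe_user` caller (not part of the exercise).

```python
def parse_age(value):
    """Condensed copy: returns an int or raises ValueError."""
    try:
        age = int(value)
    except (ValueError, TypeError):
        raise ValueError(f"Cannot parse age from: {value!r}") from None
    if age < 0 or age > 150:
        raise ValueError(f"Age out of valid range: {age}")
    return age

def find_user(username):
    """Condensed copy: returns a dict, or None if not found."""
    users = {"alice": {"name": "Alice"}}
    return users.get(username)

def describe_user(username, raw_age):
    """Hypothetical caller that handles each contract with its own idiom."""
    user = find_user(username)
    if user is None:                      # lookup contract: check for None
        return f"No user named {username!r}"
    try:
        age = parse_age(raw_age)          # validation contract: catch the exception
    except ValueError as exc:
        return f"Bad age for {username}: {exc}"
    return f"{user['name']} is {age}"

print(describe_user("alice", "30"))
print(describe_user("zed", "30"))
print(describe_user("alice", "abc"))
```

Because each function sticks to one contract, the caller never needs both a `None` check and a `try` block around the same call.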

Expected Output

divide(10, 3): 3.3333
divide(10, 0): raised ZeroDivisionError
parse_age('25'): 25
parse_age('abc'): raised ValueError
parse_age('-5'): raised ValueError
find_user('alice'): [email protected]
find_user('unknown'): None

Hints

Hint 1: Functions that compute or get values should raise exceptions on invalid input rather than returning None. Reserve None for "not found" semantics in lookup functions.

Hint 2: Design three contracts: (1) `divide` always returns float or raises, (2) `parse_age` always returns int or raises, (3) `find_user` returns dict or None (lookup semantics). Each is consistent within its own contract.

#7 Separate Pure Logic from I/O (Medium)

pure-functions, testability, I/O-boundary

The function below mixes I/O (database lookup, logging) with business logic (discount calculation). Refactor so the discount logic is a pure function that can be tested without any mocks or database.

Python

def compute_price(
    base_price,
    *,
    is_premium=False,
    years_active=0,
):
    """Compute final price after applying loyalty discount.

    Pure function — no I/O, no side effects, fully testable.
    """
    if is_premium:
        discount = min(0.20 + years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# Test pure logic — no mocks needed!
print(f"Basic (0 years): ${compute_price(100, is_premium=False):.2f}")
print(f"Premium (0 years): ${compute_price(100, is_premium=True):.2f}")
print(f"Premium (5 years): ${compute_price(100, is_premium=True, years_active=5):.2f}")
print(f"Premium (20 years): ${compute_price(100, is_premium=True, years_active=20):.2f}")
print("Loyalty discount capped at 35%")

Solution

# BEFORE — impure, mixes I/O with logic
def get_price_bad(user_id, base_price):
    user = db.get_user(user_id)              # I/O: database read
    logger.info(f"Computing price for {user_id}")  # side effect: logging
    if user.is_premium:
        discount = min(0.20 + user.years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# Testing requires: mock db, mock logger, create fake user — painful

# AFTER — pure logic separated from I/O
def compute_price(
    base_price,
    *,
    is_premium=False,
    years_active=0,
):
    """Compute final price after applying loyalty discount.

    Pure function — no I/O, no side effects, fully testable.
    """
    if is_premium:
        discount = min(0.20 + years_active * 0.01, 0.35)
    else:
        discount = 0.0
    return round(base_price * (1 - discount), 2)

# I/O wrapper (lives at the boundary of the system)
def get_user_price(user_id, base_price):
    """Fetch user and compute their price. I/O at the boundary."""
    # user = db.get_user(user_id)  # I/O happens here
    # return compute_price(base_price, is_premium=user.is_premium, years_active=user.years_active)
    pass

# Test the PURE logic — zero mocks
print(f"Basic (0 years): ${compute_price(100, is_premium=False):.2f}")
print(f"Premium (0 years): ${compute_price(100, is_premium=True):.2f}")
print(f"Premium (5 years): ${compute_price(100, is_premium=True, years_active=5):.2f}")
print(f"Premium (20 years): ${compute_price(100, is_premium=True, years_active=20):.2f}")
print("Loyalty discount capped at 35%")

Pure vs impure comparison:

IMPURE — get_price_bad(user_id, 100):
  Requires: database mock, logger mock, fake user object
  Test setup: ~15 lines of mocking boilerplate
  Risk: test might fail due to mock setup, not logic bugs

PURE — compute_price(100, is_premium=True, years_active=5):
  Requires: nothing
  Test: assert compute_price(100, is_premium=True, years_active=5) == 75.0
  Risk: only fails if the logic is wrong

The pattern: Push I/O to the edges of your system. Keep business logic pure in the middle. The thin I/O wrapper fetches data, calls the pure function, and saves the result.
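
One possible shape for the thin wrapper is sketched below. It assumes the data source is passed in as a `fetch_user` callable and that users are represented by a small `User` dataclass; both are hypothetical choices made for this illustration, not the lesson's prescribed design.

```python
from dataclasses import dataclass

@dataclass
class User:
    """Hypothetical user record holding only what pricing needs."""
    is_premium: bool
    years_active: int

def compute_price(base_price, *, is_premium=False, years_active=0):
    """Condensed copy of the pure pricing function."""
    discount = min(0.20 + years_active * 0.01, 0.35) if is_premium else 0.0
    return round(base_price * (1 - discount), 2)

def get_user_price(user_id, base_price, *, fetch_user):
    """Thin I/O wrapper: fetch_user is any callable that returns a User."""
    user = fetch_user(user_id)  # the only I/O happens here
    return compute_price(
        base_price,
        is_premium=user.is_premium,
        years_active=user.years_active,
    )

# In a test, a dict lookup stands in for the real database call.
fake_users = {42: User(is_premium=True, years_active=5)}
print(get_user_price(42, 100, fetch_user=fake_users.__getitem__))
```

The wrapper contains no branching of its own, so nearly all test effort can go into `compute_price`.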

Expected Output

Basic (0 years): $100.00
Premium (0 years): $80.00
Premium (5 years): $75.00
Premium (20 years): $65.00
Loyalty discount capped at 35%

Hints

Hint 1: A pure function takes all data as arguments and returns a result with no side effects. It never calls a database, API, or logger.

Hint 2: Extract the pricing logic into a pure function that takes `is_premium` and `years_active` as inputs. The impure wrapper fetches the user data and calls the pure function.

#8 Config Object Instead of Many Parameters (Medium)

dataclass, config-object, parameter-grouping

The function below has 7 parameters — too many to keep straight. Refactor by grouping the delivery options into an EmailConfig dataclass, reducing the function signature to essentials plus a config object.

Python

from dataclasses import dataclass

@dataclass
class EmailConfig:
    """Configuration for email delivery."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30

def send_email(
    to,
    subject,
    body,
    *,
    config=None,
):
    """Send an email with the given configuration."""
    if config is None:
        config = EmailConfig()
    print("Sending email...")
    print(f"  to: {to}")
    print(f"  subject: {subject}")
    print(f"  html: {config.html}, retries: {config.retries}, timeout: {config.timeout_seconds}s")

# Default config
send_email("[email protected]", "Hello", "Hi Alice!")

# Custom config
urgent_config = EmailConfig(html=False, retries=1, timeout_seconds=10)
send_email("[email protected]", "Alert", "System down!", config=urgent_config)

Solution

# BEFORE — too many parameters
def send_email_bad(to, subject, body, html=True, retries=3,
                   timeout=30, track_opens=False):
    pass

# Call site is messy:
# send_email_bad("[email protected]", "Hi", "body", True, 3, 30, False)

# AFTER — grouped into config dataclass
from dataclasses import dataclass

@dataclass
class EmailConfig:
    """Configuration for email delivery."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30

def send_email(
    to,
    subject,
    body,
    *,
    config=None,
):
    """Send an email with the given configuration."""
    if config is None:
        config = EmailConfig()
    print("Sending email...")
    print(f"  to: {to}")
    print(f"  subject: {subject}")
    print(f"  html: {config.html}, retries: {config.retries}, timeout: {config.timeout_seconds}s")

send_email("[email protected]", "Hello", "Hi Alice!")

urgent_config = EmailConfig(html=False, retries=1, timeout_seconds=10)
send_email("[email protected]", "Alert", "System down!", config=urgent_config)

Why config objects beat long parameter lists:

1. Reusable — define once, pass to many calls:
   prod_config = EmailConfig(retries=5, timeout_seconds=60)
   send_email("[email protected]", "Test1", "...", config=prod_config)
   send_email("[email protected]", "Test2", "...", config=prod_config)

2. Testable — config is a plain data object:
   assert EmailConfig().retries == 3
   assert EmailConfig(retries=1).retries == 1

3. Extensible — add a field without changing callers:
   @dataclass
   class EmailConfig:
       html: bool = True
       retries: int = 3
       timeout_seconds: int = 30
       track_opens: bool = False   # New! No existing calls break.

4. Self-documenting — IDE shows all fields and defaults

Rule of thumb: When 3+ parameters "travel together" and configure the same concern, extract them into a dataclass.
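
A config object can also be derived from another one. The sketch below uses the standard-library `dataclasses.replace` to build a variant. Marking the dataclass `frozen=True` is an assumption added here so that a shared config cannot be mutated by accident; the exercise's `EmailConfig` is not frozen.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)  # frozen is an assumption for this sketch
class EmailConfig:
    """Configuration for email delivery."""
    html: bool = True
    retries: int = 3
    timeout_seconds: int = 30

default_config = EmailConfig()

# replace() returns a new object; fields that are not named keep their values.
urgent_config = replace(default_config, retries=1, timeout_seconds=10)

print(default_config)
print(urgent_config)
```

Because `replace` copies, `default_config` stays untouched and can safely be shared across many `send_email` calls.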

Expected Output

Sending email...
  to: [email protected]
  subject: Hello
  html: True, retries: 3, timeout: 30s
Sending email...
  to: [email protected]
  subject: Alert
  html: False, retries: 1, timeout: 10s

Hints

Hint 1: When a function has more than 3-4 related parameters, group them into a dataclass. This gives you named fields, defaults, type hints, and a clean repr.

Hint 2: The `EmailConfig` dataclass holds delivery options (html, retries, timeout). The `send_email` function takes the essential args (to, subject, body) plus the config object.

Hard

#9 Redesign a Cryptic API (Hard)

full-redesign, enum, keyword-only, docstring, type-annotations

The function below is a worst-case API — cryptic name, mystery parameters, magic integers, positional booleans, and no documentation. Redesign it from scratch applying every clean API principle from the lesson.

Python

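from enum import Enum
from typing import Optional, Sequence

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: Sequence[float],
    aggregation: Aggregation,
    *,
    top_n: Optional[int] = None,
    use_absolute: bool = False,
) -> float: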
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("Cannot aggregate empty list")

    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

# Clean call sites — self-documenting
result1 = aggregate_top_values(
    [-3, 1, -5, 2],
    Aggregation.SUM,
    use_absolute=True,
    top_n=2,
)
print(f"Top 2 (sum, absolute): {result1}")

result2 = aggregate_top_values([1, 2, 3, 4], Aggregation.AVERAGE)
print(f"All (average): {result2}")

result3 = aggregate_top_values([3, 1, 5, 2], Aggregation.MAX, top_n=3)
print(f"Top 3 (max): {int(result3)}")

Solution

# BEFORE — the worst API
def do(x, y, z=1, flag=False, out=None):
    if flag:
        x = [abs(v) for v in x]
    x = sorted(x, reverse=True)[:z]
    if y == "sum":
        result = sum(x)
    elif y == "avg":
        result = sum(x) / len(x) if x else 0
    elif y == "max":
        result = max(x) if x else 0
    if out == "print":
        print(result)
    return result

# Problems:
# 1. Name "do" says nothing
# 2. x, y, z, flag, out — meaningless parameter names
# 3. z=1 means "all" — magic value
# 4. y is a string mode flag — easy to typo
# 5. flag is a positional boolean
# 6. out="print" mixes I/O with computation
# 7. No type annotations, no docstring

# AFTER — clean API
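from enum import Enum
from typing import Optional, Sequence

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: Sequence[float],
    aggregation: Aggregation,
    *,
    top_n: Optional[int] = None,
    use_absolute: bool = False,
) -> float: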
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("Cannot aggregate empty list")

    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

result1 = aggregate_top_values(
    [-3, 1, -5, 2],
    Aggregation.SUM,
    use_absolute=True,
    top_n=2,
)
print(f"Top 2 (sum, absolute): {result1}")

result2 = aggregate_top_values([1, 2, 3, 4], Aggregation.AVERAGE)
print(f"All (average): {result2}")

result3 = aggregate_top_values([3, 1, 5, 2], Aggregation.MAX, top_n=3)
print(f"Top 3 (max): {int(result3)}")

Every fix mapped to a principle:

Before	After	Principle
`do`	`aggregate_top_values`	Naming communicates intent
`x`	`values`	Descriptive parameter names
`y = "sum"`	`Aggregation.SUM`	Enum replaces magic strings
`z = 1` (means all)	`top_n = None`	None means "no limit" — no magic values
`flag` (positional bool)	`use_absolute` (keyword-only)	Boolean flag anti-pattern fixed
`out = "print"`	Removed	Single responsibility — I/O is not this function's job
No annotations	Full annotations	Type annotations as documentation
No docstring	Google-style docstring	Documents args, returns, raises
Returns int or float	Always returns float	Consistent return type
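
Two rows of the table, the consistent float return and the error on empty input, are easy to check directly. The sketch below uses a condensed copy of `aggregate_top_values` with annotations and the full docstring omitted for brevity.

```python
from enum import Enum

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(values, aggregation, *, top_n=None, use_absolute=False):
    """Condensed copy of the redesigned function, for checking its contract."""
    if not values:
        raise ValueError("Cannot aggregate empty list")
    source = (abs(v) for v in values) if use_absolute else values
    processed = sorted(source, reverse=True)
    selected = processed[:top_n] if top_n is not None else processed
    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

# Every aggregation mode returns a float, even for integer input.
for mode in Aggregation:
    result = aggregate_top_values([3, 1, 5, 2], mode, top_n=2)
    print(mode.name, result, type(result).__name__)

# Empty input raises instead of silently returning 0 as the old API did.
try:
    aggregate_top_values([], Aggregation.SUM)
except ValueError as exc:
    print(f"Rejected: {exc}")
```

Iterating over `Aggregation` visits every member, so a newly added mode is covered by the same check.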

Expected Output

Top 2 (sum, absolute): 8.0
All (average): 2.5
Top 3 (max): 5

Hints

Hint 1: Identify every problem: cryptic name, positional boolean, string mode flag, magic default value, inconsistent return type, no type annotations, no docstring.

Hint 2: Apply all principles: (1) descriptive function name, (2) enum for aggregation mode, (3) keyword-only args, (4) type annotations, (5) consistent float return, (6) Google-style docstring.

#10Builder Pattern for Complex ConfigurationHard

builder-patternchainingfluent-interfaceimmutable-config

Implement a QueryBuilder that constructs a database query configuration step by step. The builder should support method chaining (fluent interface) and produce an immutable QueryConfig result. Validate that a table is set before building.

Python

from dataclasses import dataclass

@dataclass(frozen=True)
class QueryConfig:
    """Immutable query configuration — cannot be modified after creation."""
    table: str
    filters: tuple
    order_by: str
    limit: int
    offset: int

class QueryBuilder:
    """Fluent builder for constructing QueryConfig objects."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def from_table(self, table):
        """Set the target table."""
        self._table = table
        return self

    def where(self, column, operator, value):
        """Add a filter condition."""
        self._filters.append((column, operator, value))
        return self

    def order_by(self, column):
        """Set the sort column."""
        self._order_by = column
        return self

    def limit(self, n):
        """Set the maximum number of results."""
        self._limit = n
        return self

    def offset(self, n):
        """Set the starting offset."""
        self._offset = n
        return self

    def build(self):
        """Validate and produce an immutable QueryConfig."""
        if not self._table:
            raise ValueError("Table is required — call .from_table() first")
        return QueryConfig(
            table=self._table,
            filters=tuple(self._filters),
            order_by=self._order_by or self._table,
            limit=self._limit,
            offset=self._offset,
        )

# Fluent chained usage
config1 = (
    QueryBuilder()
    .from_table("users")
    .where("age", ">", 21)
    .where("status", "=", "active")
    .order_by("name")
    .limit(10)
    .build()
)
print(config1)

config2 = (
    QueryBuilder()
    .from_table("orders")
    .where("total", ">", 100)
    .order_by("created_at")
    .limit(50)
    .build()
)
print(config2)

Solution

from dataclasses import dataclass

@dataclass(frozen=True)
class QueryConfig:
    """Immutable query configuration — cannot be modified after creation."""
    table: str
    filters: tuple
    order_by: str
    limit: int
    offset: int

class QueryBuilder:
    """Fluent builder for constructing QueryConfig objects."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def from_table(self, table):
        self._table = table
        return self                 # enables chaining

    def where(self, column, operator, value):
        self._filters.append((column, operator, value))
        return self

    def order_by(self, column):
        self._order_by = column
        return self

    def limit(self, n):
        self._limit = n
        return self

    def offset(self, n):
        self._offset = n
        return self

    def build(self):
        if not self._table:
            raise ValueError("Table is required — call .from_table() first")
        return QueryConfig(
            table=self._table,
            filters=tuple(self._filters),
            order_by=self._order_by or self._table,
            limit=self._limit,
            offset=self._offset,
        )

config1 = (
    QueryBuilder()
    .from_table("users")
    .where("age", ">", 21)
    .where("status", "=", "active")
    .order_by("name")
    .limit(10)
    .build()
)
print(config1)

config2 = (
    QueryBuilder()
    .from_table("orders")
    .where("total", ">", 100)
    .order_by("created_at")
    .limit(50)
    .build()
)
print(config2)

Builder pattern anatomy:

QueryBuilder()                  # 1. Create mutable builder
  .from_table("users")         # 2. Set required fields
  .where("age", ">", 21)      # 3. Add optional config (chainable)
  .where("status", "=", "active")  # 4. Chain more config
  .order_by("name")           # 5. Chain more config
  .limit(10)                   # 6. Chain more config
  .build()                     # 7. Validate + produce immutable result

Why builders beat constructors for complex objects:

Readable: Each step is named — .where("age", ">", 21) vs positional args.
Flexible: Optional steps can be skipped — only .from_table() is required.
Safe: build() validates before producing the immutable result. The frozen=True dataclass prevents accidental mutation after construction.
Composable: You can pass a partially built builder around and let different parts of the system add their constraints before build() is called.
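The "Safe" and "Composable" points can be seen in a short run. This is a minimal sketch with a trimmed-down builder (fewer fields than the exercise); the helper functions only_active and first_page are hypothetical names introduced for the demo.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class QueryConfig:
    table: str
    filters: tuple
    limit: int


class QueryBuilder:
    """Trimmed-down builder with just enough surface for the demo."""

    def __init__(self):
        self._table = None
        self._filters = []
        self._limit = 100

    def from_table(self, table):
        self._table = table
        return self

    def where(self, column, operator, value):
        self._filters.append((column, operator, value))
        return self

    def limit(self, n):
        self._limit = n
        return self

    def build(self):
        if not self._table:
            raise ValueError("Table is required")
        return QueryConfig(self._table, tuple(self._filters), self._limit)


def only_active(builder):
    """Hypothetical helper: one part of the system adds its own constraint."""
    return builder.where("status", "=", "active")


def first_page(builder):
    """Hypothetical helper: another part caps the result size."""
    return builder.limit(10)


# Composable: the same builder passes through several helpers before build().
config = first_page(only_active(QueryBuilder().from_table("users"))).build()
print(config)

# Safe: a builder without a table is rejected at build() time.
try:
    QueryBuilder().where("age", ">", 21).build()
except ValueError as exc:
    print(f"build() refused: {exc}")

# Safe: the frozen result rejects attribute assignment.
try:
    config.limit = 500
except FrozenInstanceError as exc:
    print(f"mutation refused: {exc}")
```

One caveat worth keeping in mind: the builder itself is mutable and every method returns the same instance, so helpers that receive it modify the caller's builder rather than a copy.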

When to use the builder pattern:

Object has many optional fields with sensible defaults
Construction requires validation that spans multiple fields
You want the final object to be immutable
You want call sites to read like a declarative specification
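As an illustration of validation that spans multiple fields, here is a sketch of a build() that checks fields against each other. The PageBuilder class and the rule "a non-zero offset requires an explicit sort column" are assumptions chosen for the example, not part of the exercise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageConfig:
    order_by: Optional[str]
    limit: int
    offset: int


class PageBuilder:
    """Hypothetical builder used only to illustrate cross-field checks."""

    def __init__(self):
        self._order_by = None
        self._limit = 100
        self._offset = 0

    def order_by(self, column):
        self._order_by = column
        return self

    def limit(self, n):
        self._limit = n
        return self

    def offset(self, n):
        self._offset = n
        return self

    def build(self):
        # Single-field checks.
        if self._limit <= 0:
            raise ValueError("limit must be positive")
        if self._offset < 0:
            raise ValueError("offset must not be negative")
        # Cross-field check (assumed rule): paging past the first page
        # without a sort column would give unstable results.
        if self._offset > 0 and self._order_by is None:
            raise ValueError("offset requires order_by")
        return PageConfig(self._order_by, self._limit, self._offset)


page = PageBuilder().order_by("id").offset(20).limit(10).build()
print(page)  # PageConfig(order_by='id', limit=10, offset=20)

try:
    PageBuilder().offset(20).build()
except ValueError as exc:
    print(f"Rejected: {exc}")  # Rejected: offset requires order_by
```

Because the setters can be called in any order, the cross-field rule can only be checked reliably in build(), once all values are known.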

Expected Output

QueryConfig(table='users', filters=(('age', '>', 21), ('status', '=', 'active')), order_by='name', limit=10, offset=0)
QueryConfig(table='orders', filters=(('total', '>', 100),), order_by='created_at', limit=50, offset=0)

Hints

Hint 1: A builder collects configuration step by step and produces a final immutable object. Each method returns `self` to enable chaining.

Hint 2: The `build()` method validates the configuration and returns a frozen dataclass or namedtuple. Once built, the config cannot be modified — this prevents accidental mutation.

#11Fluent Data PipelineHard

fluent-interfacelazy-evaluationpipelinechaining

Implement a lazy, chainable Pipeline class that supports filter, map, sort_by, limit, collect, and reduce. All transformations should be deferred until collect() or reduce() is called. Each chainable method returns a new Pipeline (does not mutate the original).

Python

from functools import reduce as functools_reduce

class Pipeline:
    """Lazy, chainable data transformation pipeline."""

    def __init__(self, source):
        self._source = source
        self._ops = []

    def _clone_with(self, op):
        """Create a new Pipeline with an additional operation."""
        new = Pipeline(self._source)
        new._ops = self._ops + [op]
        return new

    def filter(self, predicate):
        """Keep elements where predicate returns True."""
        return self._clone_with(lambda data, p=predicate: [x for x in data if p(x)])

    def map(self, transform):
        """Apply transform to every element."""
        return self._clone_with(lambda data, f=transform: [f(x) for x in data])

    def sort_by(self, key, *, descending=False):
        """Sort elements by a key name or function."""
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        return self._clone_with(
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        )

    def limit(self, n):
        """Keep only the first n elements."""
        return self._clone_with(lambda data, n=n: data[:n])

    def collect(self):
        """Execute all operations and return a list."""
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result

    def reduce(self, func, initial):
        """Execute operations and fold into a single value."""
        return functools_reduce(func, self.collect(), initial)

# Test data
records = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]

# Chain: filter active -> boost scores -> sort -> top 2
top = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .map(lambda r: {"name": r["name"], "score": round(r["score"] * 1.1, 1)})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)
print(f"Top scorers: {top}")

# Reduce: sum of active scores
total = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .reduce(lambda acc, r: acc + r["score"], 0)
)
print(f"Total active score: {total}")

# Different chain: names of active users, highest score first
names = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .sort_by("score", descending=True)
    .map(lambda r: r["name"])
    .collect()
)
print(f"Names: {names}")

Solution

from functools import reduce as functools_reduce

class Pipeline:
    """Lazy, chainable data transformation pipeline."""

    def __init__(self, source):
        self._source = source
        self._ops = []

    def _clone_with(self, op):
        """Create a new Pipeline with an additional operation."""
        new = Pipeline(self._source)
        new._ops = self._ops + [op]
        return new

    def filter(self, predicate):
        return self._clone_with(lambda data, p=predicate: [x for x in data if p(x)])

    def map(self, transform):
        return self._clone_with(lambda data, f=transform: [f(x) for x in data])

    def sort_by(self, key, *, descending=False):
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        return self._clone_with(
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        )

    def limit(self, n):
        return self._clone_with(lambda data, n=n: data[:n])

    def collect(self):
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result

    def reduce(self, func, initial):
        return functools_reduce(func, self.collect(), initial)

records = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]

top = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .map(lambda r: {"name": r["name"], "score": round(r["score"] * 1.1, 1)})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)
print(f"Top scorers: {top}")

total = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .reduce(lambda acc, r: acc + r["score"], 0)
)
print(f"Total active score: {total}")

names = (
    Pipeline(records)
    .filter(lambda r: r["active"])
    .sort_by("score", descending=True)
    .map(lambda r: r["name"])
    .collect()
)
print(f"Names: {names}")

Key design decisions:

1. LAZY — operations stored, not executed:
   Pipeline(data).filter(fn).map(fn)  # no work done yet
   .collect()                          # NOW all ops run in sequence

2. IMMUTABLE CHAINS — each method returns a NEW Pipeline:
   base = Pipeline(data).filter(active)
   branch_a = base.limit(5)           # does not modify base
   branch_b = base.sort_by("score")   # does not modify base

3. LAMBDA CAPTURE — default arg trick prevents late binding:
   lambda data, p=predicate: ...      # p is bound at creation time
   # Each filter() call already has its own scope, so a plain closure would
   # also work here; the default arg matters when lambdas are built in a loop

4. FLUENT INTERFACE — return self/new enables chaining:
   Pipeline(data).filter(fn).map(fn).limit(5).collect()
   # Reads like a sentence: "filter, then map, then limit, then collect"
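Point 3 relies on default arguments being evaluated when the lambda is created. The following sketch, using hypothetical threshold predicates, shows where that matters: closures created in a loop all read the loop variable at call time, while a default argument or a factory function's own scope pins each value.

```python
# Late binding: every lambda looks up `threshold` when it is CALLED,
# and by then the loop has finished with threshold == 30.
late = []
for threshold in (10, 20, 30):
    late.append(lambda x: x > threshold)

# Default-argument capture: `t` is evaluated once, when the lambda is created.
bound = []
for threshold in (10, 20, 30):
    bound.append(lambda x, t=threshold: x > t)


def above(threshold):
    """Factory: each call has its own scope, so a plain closure is enough."""
    return lambda x: x > threshold


scoped = [above(t) for t in (10, 20, 30)]

print([f(15) for f in late])    # [False, False, False]
print([f(15) for f in bound])   # [True, False, False]
print([f(15) for f in scoped])  # [True, False, False]
```

In Pipeline, each call to filter() or map() gets a fresh scope in the same way as the above() factory, so the default argument there acts as a defensive habit rather than a strict requirement.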

Clean API principles applied:

Naming: filter, map, sort_by, limit, collect, reduce — all standard verbs from functional programming.
Keyword-only: sort_by(key, *, descending=False) — prevents sort_by("name", True).
Consistent returns: chainable methods return Pipeline, terminal methods return the result type.
Single responsibility: each chainable method does exactly one thing: it adds one operation to the chain.
Least surprise: collect() returns a list, reduce() returns a single value — no surprises.
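The keyword-only point is easy to verify. The sketch below uses a standalone function with the same signature shape as Pipeline.sort_by (the function and sample rows are illustrative stand-ins) and shows that a positional boolean is rejected outright.

```python
def sort_by(items, key, *, descending=False):
    """Standalone stand-in with the same signature shape as Pipeline.sort_by."""
    key_fn = (lambda x, k=key: x[k]) if isinstance(key, str) else key
    return sorted(items, key=key_fn, reverse=descending)


rows = [{"name": "Bob"}, {"name": "Alice"}]

print(sort_by(rows, "name"))                   # Alice first
print(sort_by(rows, "name", descending=True))  # Bob first

# The bare * in the signature makes this call an error, not a silent flag.
try:
    sort_by(rows, "name", True)
except TypeError as exc:
    print(f"Rejected: {exc}")
```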

Expected Output

Top scorers: [{'name': 'Dave', 'score': 96.8}, {'name': 'Alice', 'score': 88.0}]
Total active score: 238
Names: ['Dave', 'Alice', 'Carol']

Hints

Hint 1: Each method (filter, map, sort_by, limit) should store the operation but NOT execute it. Return a new Pipeline instance with the operation appended.

Hint 2: The `collect()` method applies all stored operations in order. The `reduce()` method applies them and then folds. Lazy evaluation means no work happens until collect/reduce is called.
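To see the deferred execution and the non-mutating chains from the hints in action, here is a compact version of the Pipeline (filter, map, and limit only) with a predicate that records each call. The calls list and is_even function are hypothetical instrumentation added for the demo.

```python
class Pipeline:
    """Compact version of the exercise's Pipeline (filter/map/limit only)."""

    def __init__(self, source):
        self._source = source
        self._ops = []

    def _clone_with(self, op):
        new = Pipeline(self._source)
        new._ops = self._ops + [op]
        return new

    def filter(self, predicate):
        return self._clone_with(lambda data, p=predicate: [x for x in data if p(x)])

    def map(self, transform):
        return self._clone_with(lambda data, f=transform: [f(x) for x in data])

    def limit(self, n):
        return self._clone_with(lambda data, n=n: data[:n])

    def collect(self):
        result = list(self._source)
        for op in self._ops:
            result = op(result)
        return result


calls = []


def is_even(x):
    calls.append(x)  # record every time the predicate actually runs
    return x % 2 == 0


base = Pipeline(range(6)).filter(is_even)
print(calls)              # [] because nothing has run yet

doubled = base.map(lambda x: x * 2)  # branch 1, base is unchanged
first = base.limit(1)                # branch 2, base is unchanged

print(doubled.collect())  # [0, 4, 8]
print(first.collect())    # [0]
print(len(calls))         # 12: each collect() filtered all 6 items
```

The last line shows a limitation of this design: work is deferred, but each stage still processes a full list, so limit(1) does not stop the filter early. Branches also share the same source object, so a one-shot iterator such as a generator would be exhausted by the first collect().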
