Designing Clean Function APIs - The Art of Good Interfaces
Reading time: ~18 minutes | Level: Foundation → Engineering
Here is an API that works but will cause pain:
def process(d, flag1=True, flag2=False, n=10, t=None, mode=0):
    ...
And here is the same API, designed well:
def extract_top_records(
    data: list[dict],
    *,
    limit: int = 10,
    include_archived: bool = False,
    timeout_seconds: float | None = None,
) -> list[dict]:
    ...
Both functions might do the same thing. One communicates intent, enforces correctness, and documents itself. The other leaves callers guessing.
Function API design is the skill that separates code you maintain from code you rewrite.
What You Will Learn
- The Single Responsibility Principle applied to functions
- Naming conventions that communicate intent without documentation
- Why more than 3-4 positional arguments is a design smell
- How to use keyword-only arguments (*) to enforce clarity at call sites
- Why boolean flag arguments are an anti-pattern - and the fix
- How to design consistent return types that callers can depend on
- How type annotations serve as executable documentation
- How to design functions for testability
- The Principle of Least Surprise and how to violate it less often
- A complete before/after case study of a real-world API redesign
Prerequisites
- Python function parameters: positional, keyword, *args, **kwargs (Lessons 01–05)
- Type annotations: -> int, Optional[T], T | None
- Keyword-only arguments: def f(*, keyword_only) (Lesson 04)
Principle 1: Single Responsibility
A function should do one thing. It should do it well. It should do it only.
# BAD: fetch + parse + save + log in one function
# (requests, json, logger, db, and to_csv are assumed to be imported or defined elsewhere)
def handle_user_data(user_id, save=True, log=True, format="json"):
    data = requests.get(f"/api/users/{user_id}").json()
    if format == "json":
        result = json.dumps(data)
    elif format == "csv":
        result = to_csv(data)
    # any other format leaves result unbound and fails further down
    if log:
        logger.info(f"Processed user {user_id}")
    if save:
        db.save(result)
    return result

# GOOD: separate concerns
def fetch_user(user_id: int) -> dict:
    return requests.get(f"/api/users/{user_id}").json()

def format_user(user: dict, format: str = "json") -> str:
    if format == "json":
        return json.dumps(user)
    if format == "csv":
        return to_csv(user)
    raise ValueError(f"Unsupported format: {format!r}")

def save_record(data: str, table: str) -> None:
    db.save(data, table=table)
The test for single responsibility: describe the function in one sentence without using "and" or "or." If you cannot, split it.
:::tip The Newspaper Test
Can you name the function with a verb phrase that tells the whole story? get_active_users() passes. process_and_save() fails.
:::
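To see how the split functions fit back together, here is a minimal sketch that wires them into a thin orchestration function. The in-memory FAKE_API and SAVED_ROWS stand-ins and the export_user name are assumptions made for illustration; they replace the requests and db calls used above so the example runs on its own.

```python
import json

# Hypothetical in-memory stand-ins for the HTTP API and the database.
FAKE_API: dict[int, dict] = {7: {"id": 7, "name": "Alice"}}
SAVED_ROWS: list[tuple[str, str]] = []


def fetch_user(user_id: int) -> dict:
    """Return the raw user record (network I/O in the real version)."""
    return FAKE_API[user_id]


def format_user(user: dict, format: str = "json") -> str:
    """Serialize a user record; only JSON is sketched here."""
    if format == "json":
        return json.dumps(user, sort_keys=True)
    raise ValueError(f"Unsupported format: {format!r}")


def save_record(data: str, table: str) -> None:
    """Persist a serialized record (appends to a list in this sketch)."""
    SAVED_ROWS.append((table, data))


def export_user(user_id: int) -> str:
    """Thin orchestrator: it only sequences the single-purpose steps."""
    payload = format_user(fetch_user(user_id))
    save_record(payload, table="user_exports")
    return payload


print(export_user(7))  # {"id": 7, "name": "Alice"}
```

The orchestrator contains no logic of its own, so each step can still be tested, replaced, or reused separately.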
Principle 2: Naming That Communicates Intent
Function names should be verbs (or verb phrases) that describe what the function does - not how it does it.
# Names that hide intent
def run(data): ...
def do_stuff(x, y): ...
def process(items): ...
def handle(event): ...
# Names that communicate intent
def parse_csv_records(path: str) -> list[dict]: ...
def validate_email_format(email: str) -> bool: ...
def compute_moving_average(values, window: int): ...
def send_password_reset_email(user_id: int) -> None: ...
Common Prefixes and Their Contracts
| Prefix | Returns | Side effects |
|---|---|---|
| get_ | a value | none |
| fetch_ | a value (from I/O) | network/disk |
| compute_ | a calculated value | none |
| create_, make_ | a new object | possibly I/O |
| build_ | a new object (builder pattern) | none |
| validate_ | bool or raises | none |
| parse_ | structured data from raw | none |
| save_, store_ | None or id | writes to storage |
| send_ | None or response | network |
| update_ | None or updated value | modifies in place or DB |
| delete_, remove_ | None or bool | destructive |
| is_, has_, can_ | bool | none |
Choosing the right prefix sets expectations before callers read the code.
Principle 3: Argument Count
More than 3-4 positional arguments is a design smell.
# BAD: 7 positional arguments
def create_report(title, author, data, start_date, end_date, format, include_charts):
    ...

# GOOD option 1: group related args into a config object
from dataclasses import dataclass

@dataclass
class ReportConfig:
    title: str
    author: str
    start_date: str
    end_date: str
    format: str = "pdf"
    include_charts: bool = True

def create_report(data: list[dict], config: ReportConfig) -> bytes:
    ...

# GOOD option 2: keyword-only with defaults
def create_report(
    data: list[dict],
    *,
    title: str,
    author: str,
    start_date: str,
    end_date: str,
    format: str = "pdf",
    include_charts: bool = True,
) -> bytes:
    ...
The keyword-only approach forces callers to name every argument after data:
# Arguments cannot be swapped by accident - keyword order doesn't matter
report = create_report(
    data,
    title="Q4 Summary",
    author="Alice",
    start_date="2024-10-01",
    end_date="2024-12-31",
)
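The enforcement comes from the interpreter, not from convention. The sketch below uses a trimmed-down, hypothetical version of create_report (two keyword-only parameters and a str return value, both simplifications) to show that a positional call fails with a TypeError before the function body runs.

```python
def create_report(data: list[dict], *, title: str, author: str) -> str:
    """Trimmed-down, hypothetical version of the keyword-only signature."""
    return f"{title} by {author} ({len(data)} rows)"


rows = [{"region": "EU", "sales": 10}]

# Keyword call: argument order is irrelevant.
print(create_report(rows, author="Alice", title="Q4 Summary"))
# Q4 Summary by Alice (1 rows)

# Positional call: rejected before the function body runs.
error_message = ""
try:
    create_report(rows, "Q4 Summary", "Alice")
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```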
Principle 4: The Boolean Flag Anti-Pattern
Boolean flags are a common sign that a function is doing two things:
# BAD: boolean flag
def get_users(include_inactive=False):
    if include_inactive:
        return db.query("SELECT * FROM users")
    return db.query("SELECT * FROM users WHERE active = 1")

get_users()      # get active
get_users(True)  # caller has to know what True means!
Solutions:
# GOOD option 1: two named functions
def get_active_users() -> list[dict]:
    return db.query("SELECT * FROM users WHERE active = 1")

def get_all_users() -> list[dict]:
    return db.query("SELECT * FROM users")

# GOOD option 2: keyword-only with descriptive name
def get_users(*, include_inactive: bool = False) -> list[dict]:
    ...

get_users(include_inactive=True)  # now readable at call site

# GOOD option 3: enum for multiple states
from enum import Enum

class UserFilter(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"
    INACTIVE_ONLY = "inactive"

def get_users(filter: UserFilter = UserFilter.ACTIVE_ONLY) -> list[dict]:
    ...

get_users(UserFilter.ALL)  # unambiguous
:::warning Boolean Blind Calls
create_user(data, True, False, True) - what do the booleans mean? Never make callers memorize positional boolean meanings. Use keyword-only arguments.
:::
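The enum option leaves the function body elided, so here is one possible way to fill it in. This is a sketch under assumptions: it filters an in-memory list instead of querying a database, the function is named filter_users to keep it distinct from the versions above, and the parameter is called user_filter so it does not shadow the built-in filter.

```python
from enum import Enum


class UserFilter(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"
    INACTIVE_ONLY = "inactive"


def filter_users(
    users: list[dict],
    user_filter: UserFilter = UserFilter.ACTIVE_ONLY,
) -> list[dict]:
    """Hypothetical in-memory version of get_users: one branch per member."""
    if user_filter is UserFilter.ALL:
        return list(users)
    if user_filter is UserFilter.ACTIVE_ONLY:
        return [u for u in users if u["active"]]
    if user_filter is UserFilter.INACTIVE_ONLY:
        return [u for u in users if not u["active"]]
    # Reached only if a new member is added without a matching branch.
    raise ValueError(f"Unhandled filter: {user_filter!r}")


users = [{"name": "Alice", "active": True}, {"name": "Bob", "active": False}]
print([u["name"] for u in filter_users(users)])                            # ['Alice']
print([u["name"] for u in filter_users(users, UserFilter.INACTIVE_ONLY)])  # ['Bob']
print(len(filter_users(users, UserFilter.ALL)))                            # 2
```

Unlike a boolean, the enum can grow a third state without changing the meaning of existing call sites.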
Principle 5: Consistent Return Types
A function should always return the same type. Mixing return types creates defensive code at every call site.
# BAD: returns int OR None OR raises - callers can't trust the signature
def get_score(user_id):
    if user_id < 0:
        return None  # invalid
    if user_id > 1000:
        raise ValueError  # too high
    return 42  # valid

# GOOD: consistent contract - returns int, raises on invalid input
def get_score(user_id: int) -> int:
    if user_id < 0:
        raise ValueError(f"user_id must be non-negative, got {user_id}")
    if user_id > 1000:
        raise ValueError(f"user_id too large: {user_id}")
    return db.get_score(user_id)

# GOOD: returns Optional[int] consistently when None is valid
def find_score(user_id: int) -> int | None:
    return db.find_score(user_id)  # None if not found, always
Consistent Empty Collections
# BAD: returns None or list
def get_tags(post_id: int):
    tags = db.get_tags(post_id)
    if not tags:
        return None  # caller must check for None
    return tags

# GOOD: always return list (empty list is fine)
def get_tags(post_id: int) -> list[str]:
    return db.get_tags(post_id) or []
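The payoff shows up at the call site. The following sketch contrasts a caller of each contract; TAGS_BY_POST is a hypothetical in-memory stand-in for db.get_tags, and the render_* helpers are invented for illustration.

```python
# Hypothetical tag store standing in for db.get_tags.
TAGS_BY_POST: dict[int, list[str]] = {1: ["python", "api"]}


def get_tags_or_none(post_id: int):
    """BAD contract: None when the post has no tags."""
    return TAGS_BY_POST.get(post_id)


def get_tags(post_id: int) -> list[str]:
    """GOOD contract: always a list, possibly empty."""
    return list(TAGS_BY_POST.get(post_id, []))


def render_bad(post_id: int) -> str:
    tags = get_tags_or_none(post_id)
    if tags is None:  # every caller must remember this guard
        return ""
    return ", ".join(tags)


def render_good(post_id: int) -> str:
    return ", ".join(get_tags(post_id))  # no guard needed


print(repr(render_good(1)))   # 'python, api'
print(repr(render_good(99)))  # ''
```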
Principle 6: Pure Functions for Logic
A pure function has no side effects and always returns the same result for the same inputs. Pure functions are:
- Easy to test (no setup/teardown needed)
- Easy to reason about (no hidden state)
- Safe to call multiple times
- Composable
# Impure: depends on external state, has side effects
def get_user_discount(user_id):
    user = db.get_user(user_id)  # I/O side effect
    if user.is_premium:
        logger.info(f"Applying premium discount to {user_id}")  # side effect
        return 0.20
    return 0.05

# Pure: given user data, compute discount - no side effects
def compute_discount(is_premium: bool, years_active: int) -> float:
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05

# I/O happens at the boundary:
def get_user_discount(user_id: int) -> float:
    user = db.get_user(user_id)  # I/O at the boundary
    return compute_discount(     # pure computation
        is_premium=user.is_premium,
        years_active=user.years_active,
    )
Test the pure logic without any database:
def test_compute_discount():
    assert compute_discount(is_premium=True, years_active=0) == 0.20
    assert compute_discount(is_premium=True, years_active=15) == 0.35
    assert compute_discount(is_premium=False, years_active=10) == 0.05
Principle 7: Type Annotations as Documentation
Type annotations tell callers exactly what a function expects and returns - without reading the body.
# Unannotated: what is items? what does it return?
def top_n(items, n, key=None):
    ...

# Annotated: self-documenting
from typing import Callable, TypeVar

T = TypeVar("T")

def top_n(
    items: list[T],
    n: int,
    *,
    key: Callable[[T], float] | None = None,
) -> list[T]:
    ...
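The annotated signature fully specifies what a body has to deliver. As an illustration, here is one possible body built on the standard library's heapq.nlargest; the largest-first ordering and the negative-n check are assumptions, since the original leaves the behaviour elided. Optional[...] is used in place of the | None spelling only so the sketch also runs on Python 3.9.

```python
import heapq
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def top_n(
    items: list[T],
    n: int,
    *,
    key: Optional[Callable[[T], float]] = None,  # same meaning as "... | None"
) -> list[T]:
    """Return the n largest items, largest first (one possible body)."""
    if n < 0:
        raise ValueError(f"n must be non-negative, got {n}")
    return heapq.nlargest(n, items, key=key)


print(top_n([3, 9, 1, 7], 2))                       # [9, 7]
print(top_n(["pear", "fig", "banana"], 2, key=len))  # ['banana', 'pear']
```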
Python 3.9+ vs Older Syntax
# Python 3.9+: built-in generics, no imports needed
def process(data: list[dict], ids: set[int]) -> dict[str, list[int]]:
    ...

# Python 3.7-3.8: use from __future__ import annotations or the typing module
from __future__ import annotations  # allows 3.9+ syntax inside annotations (not in runtime expressions)
# or
from typing import Dict, List, Set

def process(data: List[Dict], ids: Set[int]) -> Dict[str, List[int]]:
    ...

# Note: the T | None union syntax used elsewhere in this lesson needs Python 3.10+,
# or the __future__ import above on older versions.
Return Type Conventions
# Procedure: mutates, no useful return
def sort_in_place(items: list) -> None: ...

# Optional result
def find_user(email: str) -> dict | None: ...

# Always returns result or raises
def get_user(user_id: int) -> dict: ...  # raises if not found

# Multiple return values (use NamedTuple for named fields)
from typing import NamedTuple

class ParseResult(NamedTuple):
    value: float
    unit: str
    confidence: float

def parse_measurement(text: str) -> ParseResult: ...

result = parse_measurement("3.5 kg")
print(result.value, result.unit)  # named access, not result[0]
Principle 8: Docstrings
Write docstrings for public APIs. Pick a style and stick to it.
# Google style (recommended for most projects)
def compute_discount(is_premium: bool, years_active: int) -> float:
    """Compute the discount rate for a user.

    Args:
        is_premium: Whether the user has a premium subscription.
        years_active: Number of complete years the user has been active.

    Returns:
        Discount as a fraction between 0.05 and 0.35 (0.25 means 25%).

    Raises:
        ValueError: If years_active is negative.

    Examples:
        >>> compute_discount(is_premium=True, years_active=5)
        0.25
        >>> compute_discount(is_premium=False, years_active=10)
        0.05
    """
    if years_active < 0:
        raise ValueError(f"years_active must be non-negative, got {years_active}")
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05
:::tip One-Line vs Full Docstrings
One-line docstrings for simple functions: """Return the square of x.""". Multi-line for anything with non-obvious args, side effects, or exceptions. Always document raises.
:::
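Examples written with the >>> prompt, as in the docstring above, are in the format the standard library's doctest module understands, so they can be executed rather than merely read. The sketch below shows one way to do that for a single function; square and check_docstring_examples are hypothetical names used only for this illustration.

```python
import doctest


def square(x: int) -> int:
    """Return the square of x.

    Examples:
        >>> square(3)
        9
        >>> square(-2)
        4
    """
    return x * x


def check_docstring_examples(func) -> tuple[int, int]:
    """Run the >>> examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    # Pass the function in globs so the examples can call it by name.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return failed, attempted


print(check_docstring_examples(square))  # (0, 2)
```

In a real project the same check is usually run across a whole module, for example with doctest.testmod() or python -m doctest.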
Principle 9: The Principle of Least Surprise
A function should do exactly what its name implies - nothing more, nothing less.
# SURPRISING: get_ modifies state
def get_or_create_user(email: str) -> dict:
    user = db.find_user(email)
    if not user:
        user = db.create_user(email)  # side effect! "get" implies read-only
    return user

# BETTER: explicit about the intent
def find_or_create_user(email: str) -> tuple[dict, bool]:
    """Find a user by email, creating one if not found.

    Returns:
        Tuple of (user dict, was_created bool).
    """
    user = db.find_user(email)
    if user:
        return user, False
    return db.create_user(email), True
Other common violations:
- save() that also validates and transforms input
- validate() that also modifies the object
- __len__() that triggers a database query
Before/After: A Real-World API Redesign
Before
def proc(data, x=1, y=False, z=None, mode=0, fmt=None):
    # what does x mean? y? mode? fmt?
    if mode == 0:
        result = [item for item in data if item.get("active")]
    elif mode == 1:
        result = data
    if y:
        result = sorted(result, key=lambda d: d.get("score", 0))
    if x != 1:
        result = result[:x]
    if fmt == "csv":
        return ",".join(str(r) for r in result)
    return result
Problems: cryptic names, integer mode flag, boolean sort flag, inconsistent return type (list or str).
After
from enum import Enum

class UserStatus(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"

def get_users(
    data: list[dict],
    *,
    status: UserStatus = UserStatus.ACTIVE_ONLY,
    sort_by_score: bool = False,
    limit: int | None = None,
) -> list[dict]:
    """Filter, sort, and limit a list of user records.

    Args:
        data: Raw list of user dicts.
        status: Whether to include only active users or all users.
        sort_by_score: If True, sort by 'score' field descending.
        limit: Maximum number of records to return. None means no limit.

    Returns:
        Filtered (and optionally sorted and limited) list of user dicts.
    """
    if status == UserStatus.ACTIVE_ONLY:
        result = [u for u in data if u.get("active")]
    else:
        result = list(data)  # copy, do not mutate input
    if sort_by_score:
        result.sort(key=lambda u: u.get("score", 0), reverse=True)
    if limit is not None:
        result = result[:limit]
    return result  # always a list
Improvements:
- Descriptive name: get_users
- Keyword-only args: callers must name every option
- UserStatus enum: replaces mode=0/1 magic integer
- Descriptive booleans: sort_by_score vs y
- limit: int | None: replaces x=1 magic default
- Consistent return type: always list[dict]
- Docstring: complete documentation
Interview Questions
Q1: What is the Single Responsibility Principle for functions, and how do you apply it?
Answer: A function should do one thing well. The test: can you describe the function in one sentence without using "and" or "or"? If not, split it. Applied to functions: separate I/O from computation (fetch data, compute result, save result are separate functions), separate validation from transformation, and separate policy (what to do) from mechanism (how to do it). Functions with single responsibility are easier to name, test in isolation, reuse, and modify without unintended side effects on other behaviors.
Q2: Why are boolean flag arguments an anti-pattern? What are the alternatives?
Answer: Boolean flags like process(data, True, False) are opaque at call sites - callers must memorize what each positional bool means. They also often signal that a function does two things (one when True, another when False). Alternatives: (1) two named functions with distinct names; (2) keyword-only boolean: process(data, include_archived=True) - readable at call site; (3) an enum: process(data, mode=Mode.ARCHIVED) - explicit and extensible. Boolean keyword arguments are acceptable when the meaning is obvious from the name and there is only one such flag.
Q3: Why should functions return consistent types?
Answer: Inconsistent return types force every caller to write defensive code: result = func(); if result is None: ...; elif isinstance(result, list): ...; else: .... This is error-prone and verbose. Consistent types let callers trust the signature and eliminate branching. Rules: a function should always return the same type; prefer empty collections over None for "no results"; use T | None consistently when absence is meaningful; raise exceptions for invalid inputs rather than returning None as an error signal. Type checkers enforce this when you annotate return types correctly.
Q4: What is a pure function and why does it matter for testability?
Answer: A pure function: (1) returns the same output for the same inputs every time, and (2) has no side effects (does not modify external state, does not do I/O). Pure functions are trivially testable - no mocks, no database setup, no test isolation needed. You pass inputs and assert outputs. The pattern for testability: push I/O to the edges of your system, keep logic pure in the middle. compute_discount(is_premium, years_active) is pure and needs one line to test. get_user_discount(user_id) is impure and needs a database mock. Write as much logic as possible in pure functions.
Q5: When should you use keyword-only arguments (with *)?
Answer: Use keyword-only arguments when: (1) a function has more than 2-3 parameters and positional order would be ambiguous; (2) you have boolean or enumerated option arguments; (3) you want to allow future additions without breaking callers (new keyword args are backward-compatible); (4) the function is part of a public API that others will call. Keyword-only args improve call site readability: create_user(name="Alice", role="admin", send_welcome=True) is unambiguous. Exception: performance-critical internal functions may use positional args to avoid the keyword lookup overhead.
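Point (3) in the answer above can be made concrete. In this sketch, a hypothetical create_user gains a new option between two versions; because everything after * is keyword-only, the addition cannot shift or collide with any caller's positional arguments.

```python
def create_user(name: str, *, role: str = "member") -> dict:
    """Version 1 of a hypothetical public function."""
    return {"name": name, "role": role}


existing_call = create_user("Alice", role="admin")


# Version 2 adds an option with a default. Old call sites never mention it,
# so they keep working and keep returning the same result.
def create_user(name: str, *, role: str = "member", send_welcome: bool = False) -> dict:
    user = {"name": name, "role": role}
    if send_welcome:
        user["welcome_sent"] = True  # stand-in for actually sending an email
    return user


assert create_user("Alice", role="admin") == existing_call
print(create_user("Bob", send_welcome=True))
# {'name': 'Bob', 'role': 'member', 'welcome_sent': True}
```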
Q6: How do type annotations improve API design beyond just static checking?
Answer: Type annotations serve as executable documentation - they communicate the contract of a function (what it accepts, what it guarantees to return) without requiring callers to read the body. -> list[str] immediately tells callers they can iterate, index, and use list methods. -> str | None tells callers to check for None. -> None tells callers not to use the return value. Beyond documentation, annotations enable: static type checkers (mypy, pyright) to catch type errors before runtime, IDE autocompletion and inline documentation, automatic validation with tools like Pydantic, and generation of API docs. They also catch design issues - if you cannot write a clear type annotation, your API is probably unclear.
Practice Challenges
Beginner: Redesign the Bad API
Redesign the following function to follow clean API principles:
def do(x, y, z=1, flag=False, out=None):
    # x: list of numbers
    # y: operation ("sum", "avg", "max")
    # z: top-n (1 = all)
    # flag: True = absolute values
    # out: "print" or None
    if flag:
        x = [abs(v) for v in x]
    x = sorted(x, reverse=True)[:z]
    if y == "sum":
        result = sum(x)
    elif y == "avg":
        result = sum(x) / len(x) if x else 0
    elif y == "max":
        result = max(x) if x else 0
    if out == "print":
        print(result)
    return result
Solution
from enum import Enum

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: list[float],
    aggregation: Aggregation,
    *,
    top_n: int | None = None,
    use_absolute: bool = False,
) -> float:
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N largest values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If no values are selected and aggregation is MAX.
    """
    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    # Checking the selection (not just the input) also covers top_n=0,
    # which would otherwise divide by zero in AVERAGE.
    if not selected:
        if aggregation == Aggregation.MAX:
            raise ValueError("Cannot compute max of empty list")
        return 0.0  # SUM and AVERAGE of nothing are defined as 0.0 here

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

# Callers:
result = aggregate_top_values([-3, 1, -5, 2], Aggregation.SUM, use_absolute=True, top_n=2)
print(result)  # 8.0 (top 2 absolute values: 5+3)
Intermediate: Rate Limiter API
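Design a clean RateLimiter class with a method that checks whether a request for a given key is within its limit (for example check(key); the solution below names it is_allowed(key)). The interface should be intuitive, well-typed, and testable. The class should support per-key limits (e.g., 100 requests per minute per user_id).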
Solution
from dataclasses import dataclass
from time import monotonic
from collections import deque

@dataclass
class RateLimitConfig:
    max_requests: int
    window_seconds: float

class RateLimiter:
    """Sliding window rate limiter.

    Args:
        config: Rate limit configuration.

    Example:
        limiter = RateLimiter(RateLimitConfig(max_requests=100, window_seconds=60))
        if limiter.is_allowed("user:42"):
            process_request()
        else:
            return 429
    """

    def __init__(self, config: RateLimitConfig) -> None:
        self._config = config
        self._windows: dict[str, deque[float]] = {}

    def is_allowed(self, key: str) -> bool:
        """Check if a request from key is within the rate limit.

        Records the request only when it is allowed; denied attempts
        do not count against the window.

        Args:
            key: Unique identifier (user_id, IP, API key, etc.).

        Returns:
            True if the request is within limits, False if rate limited.
        """
        now = monotonic()
        window_start = now - self._config.window_seconds
        if key not in self._windows:
            self._windows[key] = deque()
        timestamps = self._windows[key]
        # Remove timestamps outside the window
        while timestamps and timestamps[0] < window_start:
            timestamps.popleft()
        if len(timestamps) >= self._config.max_requests:
            return False  # rate limited
        timestamps.append(now)
        return True

    def remaining(self, key: str) -> int:
        """Return how many requests remain in the current window."""
        now = monotonic()
        window_start = now - self._config.window_seconds
        timestamps = self._windows.get(key, deque())
        recent = sum(1 for t in timestamps if t >= window_start)
        return max(0, self._config.max_requests - recent)

# Usage
config = RateLimitConfig(max_requests=3, window_seconds=60)
limiter = RateLimiter(config)
for i in range(5):
    allowed = limiter.is_allowed("user:1")
    remaining = limiter.remaining("user:1")
    print(f"Request {i+1}: {'allowed' if allowed else 'DENIED'}, remaining={remaining}")
# Request 1: allowed, remaining=2
# Request 2: allowed, remaining=1
# Request 3: allowed, remaining=0
# Request 4: DENIED, remaining=0
# Request 5: DENIED, remaining=0
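The challenge asks for a testable design, but a limiter that calls monotonic() directly can only be tested for window expiry by actually waiting. One common remedy, sketched here as a hypothetical variant rather than as part of the solution above, is to inject the clock as a keyword-only argument; ClockedRateLimiter and FakeClock are names invented for this example.

```python
from collections import deque
from time import monotonic
from typing import Callable


class ClockedRateLimiter:
    """Sliding-window limiter whose time source is injected (hypothetical variant)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        *,
        clock: Callable[[], float] = monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock
        self._windows: dict[str, deque[float]] = {}

    def is_allowed(self, key: str) -> bool:
        now = self._clock()
        timestamps = self._windows.setdefault(key, deque())
        # Drop timestamps that have left the window.
        while timestamps and timestamps[0] < now - self._window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self._max_requests:
            return False
        timestamps.append(now)
        return True


class FakeClock:
    """Manually advanced clock for deterministic tests."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
limiter = ClockedRateLimiter(max_requests=2, window_seconds=60, clock=clock)
results = [limiter.is_allowed("user:1") for _ in range(3)]
print(results)  # [True, True, False]

clock.now = 61.0  # jump past the window without sleeping
after_window = limiter.is_allowed("user:1")
print(after_window)  # True
```

Production code keeps the default monotonic clock; only tests pass a fake one.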
Advanced: Pipeline Builder
Design a clean Pipeline API that chains data transformations:
result = (
    Pipeline(raw_records)
    .filter(lambda r: r["active"])
    .map(lambda r: {**r, "score": r["score"] * 1.1})
    .sort_by("score", descending=True)
    .limit(10)
    .collect()
)
The API should be: chainable, lazy (only evaluates on .collect()), type-annotated, and support custom aggregations via .reduce(func, initial).
Solution
from __future__ import annotations
from typing import TypeVar, Generic, Callable, Iterable, Any
from functools import reduce as functools_reduce

T = TypeVar("T")
U = TypeVar("U")

class Pipeline(Generic[T]):
    """Lazy, chainable data transformation pipeline.

    All transformations are deferred until .collect() is called.
    """

    def __init__(self, source: Iterable[T]) -> None:
        self._source = source
        self._ops: list[Callable] = []

    def _apply(self, data: Iterable[T]) -> list[T]:
        result = list(data)
        for op in self._ops:
            result = op(result)
        return result

    def filter(self, predicate: Callable[[T], bool]) -> Pipeline[T]:
        """Keep elements for which predicate returns True."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, p=predicate: [x for x in data if p(x)]]
        return new

    def map(self, transform: Callable[[T], U]) -> Pipeline[U]:
        """Apply transform to every element."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, f=transform: [f(x) for x in data]]
        return new  # type: ignore[return-value]

    def sort_by(self, key: str | Callable[[T], Any], *, descending: bool = False) -> Pipeline[T]:
        """Sort elements by a key attribute name or key function."""
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        new = Pipeline(self._source)
        new._ops = self._ops + [
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        ]
        return new

    def limit(self, n: int) -> Pipeline[T]:
        """Keep only the first n elements."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, n=n: data[:n]]
        return new

    def collect(self) -> list[T]:
        """Evaluate the pipeline and return a list."""
        return self._apply(self._source)

    def reduce(self, func: Callable[[U, T], U], initial: U) -> U:
        """Fold the pipeline into a single value."""
        return functools_reduce(func, self._apply(self._source), initial)

# Example
raw = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]
result = (
    Pipeline(raw)
    .filter(lambda r: r["active"])
    .map(lambda r: {**r, "score": r["score"] * 1.1})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)
for r in result:
    print(f"{r['name']}: {r['score']:.1f}")
# Dave: 96.8
# Alice: 88.0

total = Pipeline(raw).filter(lambda r: r["active"]).reduce(lambda acc, r: acc + r["score"], 0)
print(f"Total active score: {total}")  # 238
Quick Reference
| Principle | Anti-Pattern | Fix |
|---|---|---|
| Single Responsibility | process_and_save() | process() + save() |
| Naming | do(x, y, flag) | compute_discount(price, rate) |
| Argument count | 7 positional args | Config dataclass or keyword-only |
| Boolean flags | create(data, True, False) | create(data, active=True, notify=False) |
| Consistent returns | int, None, or raises | int always (raises on invalid) |
| Pure functions | I/O mixed with logic | I/O at boundaries, pure logic inside |
| Type annotations | def f(x, y) | def f(x: int, y: str) -> bool |
| Docstring | No docstring | Google/NumPy/Sphinx style |
| Least surprise | get() that writes | find_or_create() with explicit name |
Key Takeaways
- Single Responsibility: one function, one purpose - if you need "and" to describe it, split it
- Names communicate contracts: get_ implies read-only, compute_ implies pure, send_ implies I/O; choose the right verb
- Keyword-only arguments (* in signature) are the best solution to readable multi-argument APIs and the boolean flag anti-pattern
- Consistent return types eliminate defensive branching at call sites; never return different types from different code paths
- Pure functions are testable without mocks - push I/O to the edges of your system, keep logic pure in the middle
- Type annotations are documentation that tools can verify; -> None is a contract, -> T | None is a contract, -> T is a contract
- The Principle of Least Surprise: do exactly what the name implies - every deviation is a bug waiting to confuse someone
