Designing Clean Function APIs - The Art of Good Interfaces

Reading time: ~18 minutes | Level: Foundation → Engineering

Here is an API that works but will cause pain:

def process(d, flag1=True, flag2=False, n=10, t=None, mode=0):
    ...

And here is the same API, designed well:

def extract_top_records(
    data: list[dict],
    *,
    limit: int = 10,
    include_archived: bool = False,
    timeout_seconds: float | None = None,
) -> list[dict]:
    ...

Both functions might do the same thing. One communicates intent, enforces correctness, and documents itself. The other leaves callers guessing.

Function API design is the skill that separates code you maintain from code you rewrite.

What You Will Learn

  • The Single Responsibility Principle applied to functions
  • Naming conventions that communicate intent without documentation
  • Why more than 3-4 positional arguments is a design smell
  • How to use keyword-only arguments (*) to enforce clarity at call sites
  • Why boolean flag arguments are an anti-pattern - and the fix
  • How to design consistent return types that callers can depend on
  • How type annotations serve as executable documentation
  • How to design functions for testability
  • The Principle of Least Surprise and how to violate it less often
  • A complete before/after case study of a real-world API redesign

Prerequisites

  • Python function parameters: positional, keyword, *args, **kwargs (Lessons 01–05)
  • Type annotations: -> int, Optional[T], T | None
  • Keyword-only arguments: def f(*, keyword_only) (Lesson 04)

Principle 1: Single Responsibility

A function should do one thing. It should do it well. It should do it only.

# BAD: fetch + parse + save + log in one function
def handle_user_data(user_id, save=True, log=True, format="json"):
    data = requests.get(f"/api/users/{user_id}").json()
    if format == "json":
        result = json.dumps(data)
    elif format == "csv":
        result = to_csv(data)
    if log:
        logger.info(f"Processed user {user_id}")
    if save:
        db.save(result)
    return result

# GOOD: separate concerns
def fetch_user(user_id: int) -> dict:
    return requests.get(f"/api/users/{user_id}").json()

def format_user(user: dict, format: str = "json") -> str:
    if format == "json":
        return json.dumps(user)
    if format == "csv":
        return to_csv(user)
    raise ValueError(f"Unsupported format: {format!r}")

def save_record(data: str, table: str) -> None:
    db.save(data, table=table)

The test for single responsibility: describe the function in one sentence without using "and" or "or." If you cannot, split it.

:::tip The Newspaper Test
Can you name the function with a verb phrase that tells the whole story? get_active_users() passes. process_and_save() fails.
:::
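
One way to recombine the split functions is a thin orchestrator whose only job is to sequence the steps. The sketch below is hypothetical: `export_user`, the injected `fetch` and `save` callables, and the stub data are assumptions made for illustration, and the format parameter is renamed `fmt` here to avoid shadowing the built-in. Injecting the I/O lets the example run without a network or a database.

```python
import json
from typing import Callable


def format_user(user: dict, fmt: str = "json") -> str:
    """Serialize a user record; only JSON is supported in this sketch."""
    if fmt == "json":
        return json.dumps(user, sort_keys=True)
    raise ValueError(f"Unsupported format: {fmt!r}")


def export_user(
    user_id: int,
    *,
    fetch: Callable[[int], dict],
    save: Callable[[str], None],
) -> str:
    """Sequence fetch -> format -> save; each step stays testable on its own."""
    user = fetch(user_id)
    payload = format_user(user)
    save(payload)
    return payload


# Stand-ins for the network call and the database write (assumptions).
saved: list[str] = []
payload = export_user(
    7,
    fetch=lambda uid: {"id": uid, "name": "Alice"},
    save=saved.append,
)
print(payload)  # {"id": 7, "name": "Alice"}
```

The orchestrator contains no branching or business logic of its own, so a change to formatting or storage touches exactly one function.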

Principle 2: Naming That Communicates Intent

Function names should be verbs (or verb phrases) that describe what the function does - not how it does it.

# Names that hide intent
def run(data): ...
def do_stuff(x, y): ...
def process(items): ...
def handle(event): ...

# Names that communicate intent
def parse_csv_records(path: str) -> list[dict]: ...
def validate_email_format(email: str) -> bool: ...
def compute_moving_average(values, window: int): ...
def send_password_reset_email(user_id: int) -> None: ...

Common Prefixes and Their Contracts

| Prefix | Returns | Side effects |
| --- | --- | --- |
| get_ | a value | none |
| fetch_ | a value (from I/O) | network/disk |
| compute_ | a calculated value | none |
| create_, make_ | a new object | possibly I/O |
| build_ | a new object (builder pattern) | none |
| validate_ | bool or raises | none |
| parse_ | structured data from raw input | none |
| save_, store_ | None or id | writes to storage |
| send_ | None or response | network |
| update_ | None or updated value | modifies in place or DB |
| delete_, remove_ | None or bool | destructive |
| is_, has_, can_ | bool | none |

Choosing the right prefix sets expectations before callers read the code.
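
As a rough illustration of how three of these prefixes can coexist for the same concept, the sketch below defines an `is_`, a `validate_`, and a `parse_` function for email addresses. The function names and the deliberately simplistic regular expression are assumptions for this example, not a complete email validator.

```python
import re

# Simplistic pattern for illustration only: something@something.something
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email: str) -> bool:
    """is_ prefix: answers a yes/no question, no side effects."""
    return _EMAIL_RE.match(email) is not None


def validate_email(email: str) -> None:
    """validate_ prefix: returns quietly or raises."""
    if not is_valid_email(email):
        raise ValueError(f"Invalid email address: {email!r}")


def parse_email(email: str) -> tuple[str, str]:
    """parse_ prefix: turns raw text into structured data."""
    validate_email(email)
    local, _, domain = email.rpartition("@")
    return local, domain


print(is_valid_email("alice@example.com"))  # True
print(parse_email("alice@example.com"))     # ('alice', 'example.com')
```

A caller who sees only the names can already guess which call is safe inside an `if`, which one needs a `try`, and which one returns data.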

Principle 3: Argument Count

More than 3-4 positional arguments is a design smell: callers must remember the order, and swapping two arguments of the same type fails silently.

# BAD: 7 positional arguments
def create_report(title, author, data, start_date, end_date, format, include_charts):
    ...

# GOOD option 1: group related args into a config object
from dataclasses import dataclass

@dataclass
class ReportConfig:
    title: str
    author: str
    start_date: str
    end_date: str
    format: str = "pdf"
    include_charts: bool = True

def create_report(data: list[dict], config: ReportConfig) -> bytes:
    ...

# GOOD option 2: keyword-only with defaults
def create_report(
    data: list[dict],
    *,
    title: str,
    author: str,
    start_date: str,
    end_date: str,
    format: str = "pdf",
    include_charts: bool = True,
) -> bytes:
    ...

The keyword-only approach forces callers to name every argument after data:

# Hard to call incorrectly: every option is named, so order doesn't matter
report = create_report(
    data,
    title="Q4 Summary",
    author="Alice",
    start_date="2024-10-01",
    end_date="2024-12-31",
)
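
To see the enforcement in action, the sketch below uses a trimmed-down stand-in for `create_report` (fewer parameters and a string return value, both assumptions made to keep the example short). It shows that keyword order is free and that a positional call is rejected with a `TypeError`.

```python
def create_report(
    data: list[dict],
    *,
    title: str,
    author: str,
    include_charts: bool = True,
) -> str:
    """Trimmed-down stand-in for the signature above; returns a summary line."""
    charts = "with charts" if include_charts else "no charts"
    return f"{title} by {author}: {len(data)} rows, {charts}"


# Keyword order is free, so these two calls are equivalent.
a = create_report([{"x": 1}], title="Q4 Summary", author="Alice")
b = create_report([{"x": 1}], author="Alice", title="Q4 Summary")

# Passing title and author positionally is rejected when the call is made.
error = ""
try:
    create_report([{"x": 1}], "Q4 Summary", "Alice")
except TypeError as exc:
    error = str(exc)

print(a)
print(error)
```

The failure happens at the call itself, before any report logic runs, which is where a mixed-up argument is cheapest to find.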

Principle 4: The Boolean Flag Anti-Pattern

Boolean flags are a common sign that a function is doing two things:

# BAD: boolean flag
def get_users(include_inactive=False):
    if include_inactive:
        return db.query("SELECT * FROM users")
    return db.query("SELECT * FROM users WHERE active = 1")

get_users()      # get active
get_users(True)  # caller has to know what True means!

Solutions:

# GOOD option 1: two named functions
def get_active_users() -> list[dict]:
    return db.query("SELECT * FROM users WHERE active = 1")

def get_all_users() -> list[dict]:
    return db.query("SELECT * FROM users")

# GOOD option 2: keyword-only with descriptive name
def get_users(*, include_inactive: bool = False) -> list[dict]:
    ...

get_users(include_inactive=True)  # now readable at call site

# GOOD option 3: enum for multiple states
from enum import Enum

class UserFilter(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"
    INACTIVE_ONLY = "inactive"

def get_users(filter: UserFilter = UserFilter.ACTIVE_ONLY) -> list[dict]:
    ...

get_users(UserFilter.ALL)  # unambiguous

:::warning Boolean Blind Calls
create_user(data, True, False, True) - what do the booleans mean? Never make callers memorize positional boolean meanings. Use keyword-only arguments.
:::
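
The enum option leaves the function body elided, so here is one possible way to fill it in. This is a sketch under assumptions: the `USERS` list stands in for the database table, and the parameter is named `user_filter` rather than `filter` to avoid shadowing the built-in.

```python
from enum import Enum


class UserFilter(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"
    INACTIVE_ONLY = "inactive"


# In-memory stand-in for the users table (an assumption for this sketch).
USERS = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
    {"name": "Carol", "active": True},
]


def get_users(user_filter: UserFilter = UserFilter.ACTIVE_ONLY) -> list[dict]:
    """Return users matching the filter; every branch returns a list."""
    if user_filter is UserFilter.ALL:
        return list(USERS)
    wanted = user_filter is UserFilter.ACTIVE_ONLY
    return [u for u in USERS if u["active"] == wanted]


print([u["name"] for u in get_users()])                          # ['Alice', 'Carol']
print([u["name"] for u in get_users(UserFilter.INACTIVE_ONLY)])  # ['Bob']
```

A third state such as INACTIVE_ONLY has no natural spelling with a single boolean, which is the main reason to reach for an enum once there are more than two cases.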

Principle 5: Consistent Return Types

A function should always return the same type. Mixing return types creates defensive code at every call site.

# BAD: returns int OR None OR raises - callers can't trust the signature
def get_score(user_id):
    if user_id < 0:
        return None       # invalid
    if user_id > 1000:
        raise ValueError  # too high
    return 42             # valid

# GOOD: consistent contract - returns int, raises on invalid input
def get_score(user_id: int) -> int:
    if user_id < 0:
        raise ValueError(f"user_id must be non-negative, got {user_id}")
    if user_id > 1000:
        raise ValueError(f"user_id too large: {user_id}")
    return db.get_score(user_id)

# GOOD: returns Optional[int] consistently when None is valid
def find_score(user_id: int) -> int | None:
    return db.find_score(user_id)  # None if not found, always

Consistent Empty Collections

# BAD: returns None or list
def get_tags(post_id: int):
    tags = db.get_tags(post_id)
    if not tags:
        return None  # caller must check for None
    return tags

# GOOD: always return list (empty list is fine)
def get_tags(post_id: int) -> list[str]:
    return db.get_tags(post_id) or []

Principle 6: Pure Functions for Logic

A pure function has no side effects and always returns the same result for the same inputs. Pure functions are:

  • Easy to test (no setup/teardown needed)
  • Easy to reason about (no hidden state)
  • Safe to call multiple times
  • Composable

# Impure: depends on external state, has side effects
def get_user_discount(user_id):
    user = db.get_user(user_id)  # I/O side effect
    if user.is_premium:
        logger.info(f"Applying premium discount to {user_id}")  # side effect
        return 0.20
    return 0.05

# Pure: given user data, compute discount - no side effects
def compute_discount(is_premium: bool, years_active: int) -> float:
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05

# I/O happens at the boundary:
def get_user_discount(user_id: int) -> float:
    user = db.get_user(user_id)  # I/O at the boundary
    return compute_discount(     # pure computation
        is_premium=user.is_premium,
        years_active=user.years_active,
    )

Test the pure logic without any database:

def test_compute_discount():
    assert compute_discount(is_premium=True, years_active=0) == 0.20
    assert compute_discount(is_premium=True, years_active=15) == 0.35
    assert compute_discount(is_premium=False, years_active=10) == 0.05
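
The boundary function can be made testable as well by passing the loader in as an argument. The following is a minimal sketch, assuming a hypothetical `User` dataclass and a `load_user` parameter that are not part of the lesson's code; in production the loader would wrap the real database call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class User:
    is_premium: bool
    years_active: int


def compute_discount(is_premium: bool, years_active: int) -> float:
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05


def get_user_discount(user_id: int, load_user: Callable[[int], User]) -> float:
    """Boundary function: the only I/O is the injected loader."""
    user = load_user(user_id)
    return compute_discount(user.is_premium, user.years_active)


def test_get_user_discount_uses_loaded_user() -> None:
    # A dict lookup plays the role of the database in this test.
    fake_users = {1: User(is_premium=True, years_active=0)}
    assert get_user_discount(1, fake_users.__getitem__) == 0.20


test_get_user_discount_uses_loaded_user()
```

With the loader injected, the test needs no mocking library: any callable that maps an id to a `User` will do.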

Principle 7: Type Annotations as Documentation

Type annotations tell callers exactly what a function expects and returns - without reading the body.

# Unannotated: what is items? what does it return?
def top_n(items, n, key=None):
    ...

# Annotated: self-documenting
from typing import Callable, TypeVar

T = TypeVar("T")

def top_n(
    items: list[T],
    n: int,
    *,
    key: Callable[[T], float] | None = None,
) -> list[T]:
    ...
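
The annotated signature leaves the body open. One possible implementation, offered as a sketch rather than the lesson's own, delegates to `heapq.nlargest` from the standard library. The negative-`n` check is an added assumption, and `Optional` is used instead of `| None` so the snippet also runs on Python 3.9.

```python
import heapq
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def top_n(
    items: list[T],
    n: int,
    *,
    key: Optional[Callable[[T], float]] = None,
) -> list[T]:
    """Return the n largest items, largest first."""
    if n < 0:
        raise ValueError(f"n must be non-negative, got {n}")
    return heapq.nlargest(n, items, key=key)


print(top_n([3, 1, 4, 1, 5], 2))                     # [5, 4]
print(top_n(["pear", "fig", "banana"], 1, key=len))  # ['banana']
```

Because `T` appears in both the parameter and the return annotation, a type checker can infer that the second call returns `list[str]`.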

Python 3.9+ vs Older Syntax

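# Python 3.9+: built-in generics, no imports needed
# (the X | None union syntax used elsewhere in this lesson needs Python 3.10+)
def process(data: list[dict], ids: set[int]) -> dict[str, list[int]]:
    ...

# Python 3.7-3.8, option 1: defer annotation evaluation so the newer syntax is
# accepted in annotations. The import must sit at the top of the module, and it
# does not affect expressions evaluated at runtime.
from __future__ import annotations

# Python 3.7-3.8, option 2: use the generic aliases from the typing module
from typing import Dict, List, Set
def process(data: List[Dict], ids: Set[int]) -> Dict[str, List[int]]:
    ...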

Return Type Conventions

# Procedure: mutates, no useful return
def sort_in_place(items: list) -> None: ...

# Optional result
def find_user(email: str) -> dict | None: ...

# Always returns result or raises
def get_user(user_id: int) -> dict: ...  # raises if not found

# Multiple return values (use NamedTuple for named fields)
from typing import NamedTuple

class ParseResult(NamedTuple):
    value: float
    unit: str
    confidence: float

def parse_measurement(text: str) -> ParseResult: ...

result = parse_measurement("3.5 kg")
print(result.value, result.unit)  # named access, not result[0]
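
For a version that runs end to end, `parse_measurement` could be filled in along the following lines. The regular expression, the fixed confidence of 1.0, and the choice to raise `ValueError` on unparseable input are all assumptions made for this sketch.

```python
import re
from typing import NamedTuple


class ParseResult(NamedTuple):
    value: float
    unit: str
    confidence: float


# A number (optionally negative or decimal) followed by an alphabetic unit.
_MEASUREMENT_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_measurement(text: str) -> ParseResult:
    """Parse text such as '3.5 kg'; raise ValueError if it does not match."""
    match = _MEASUREMENT_RE.match(text)
    if match is None:
        raise ValueError(f"Cannot parse measurement: {text!r}")
    # Confidence is hard-coded to 1.0 in this sketch.
    return ParseResult(float(match.group(1)), match.group(2), 1.0)


value, unit, confidence = parse_measurement("3.5 kg")  # still unpacks like a tuple
print(value, unit)  # 3.5 kg
```

A NamedTuple keeps tuple unpacking working for existing callers while giving new callers field names to rely on.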

Principle 8: Docstrings

Write docstrings for public APIs. Pick a style and stick to it.

# Google style (recommended for most projects)
def compute_discount(is_premium: bool, years_active: int) -> float:
    """Compute the discount rate for a user.

    Args:
        is_premium: Whether the user has a premium subscription.
        years_active: Number of complete years the user has been active.

    Returns:
        Discount as a fraction between 0.05 and 0.35.

    Raises:
        ValueError: If years_active is negative.

    Examples:
        >>> compute_discount(is_premium=True, years_active=5)
        0.25
        >>> compute_discount(is_premium=False, years_active=10)
        0.05
    """
    if years_active < 0:
        raise ValueError(f"years_active must be non-negative, got {years_active}")
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05

:::tip One-Line vs Full Docstrings
Use one-line docstrings for simple functions: """Return the square of x.""". Use multi-line docstrings for anything with non-obvious arguments, side effects, or exceptions. Always document the exceptions a function raises.
:::
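
The `>>>` lines in an Examples section are more than decoration: the standard library's `doctest` module can execute them and compare the output. The sketch below runs the examples of a shortened `compute_discount` docstring programmatically; in a real project the usual entry points would be `python -m doctest yourmodule.py` or `doctest.testmod()`.

```python
import doctest


def compute_discount(is_premium: bool, years_active: int) -> float:
    """Compute the discount rate for a user.

    Examples:
        >>> compute_discount(is_premium=True, years_active=5)
        0.25
        >>> compute_discount(is_premium=False, years_active=10)
        0.05
    """
    if years_active < 0:
        raise ValueError(f"years_active must be non-negative, got {years_active}")
    if is_premium:
        return min(0.20 + years_active * 0.01, 0.35)
    return 0.05


# Parse the docstring into a DocTest and run it against the real function.
parser = doctest.DocTestParser()
docstring_test = parser.get_doctest(
    compute_discount.__doc__,
    {"compute_discount": compute_discount},  # names visible to the examples
    "compute_discount",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(docstring_test)
print(results.attempted, results.failed)  # 2 0
```

If the implementation drifts from the documented examples, the run reports a failure, so the docstring cannot silently go stale.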

Principle 9: The Principle of Least Surprise

A function should do exactly what its name implies - nothing more, nothing less.

# SURPRISING: a get_ function that modifies state
def get_or_create_user(email: str) -> dict:
    user = db.find_user(email)
    if not user:
        user = db.create_user(email)  # side effect! "get" implies read-only
    return user

# BETTER: explicit about the intent
def find_or_create_user(email: str) -> tuple[dict, bool]:
    """Find a user by email, creating one if not found.

    Returns:
        Tuple of (user dict, was_created bool).
    """
    user = db.find_user(email)
    if user:
        return user, False
    return db.create_user(email), True

Other common violations:

  • save() that also validates and transforms input
  • validate() that also modifies the object
  • __len__() that triggers a database query
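
The second violation in the list, a validate() that also modifies the object, can be avoided by keeping the check and the transformation in separate functions. The sketch below is illustrative only: the `Signup` dataclass and both function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Signup:
    email: str
    age: int


def validate_signup(signup: Signup) -> None:
    """Check only: raise on bad data, never change the object."""
    if "@" not in signup.email:
        raise ValueError(f"Invalid email: {signup.email!r}")
    if signup.age < 0:
        raise ValueError(f"age must be non-negative, got {signup.age}")


def normalize_signup(signup: Signup) -> Signup:
    """Transform only: return a cleaned copy, leave the input untouched."""
    return replace(signup, email=signup.email.strip().lower())


raw = Signup(email="  Alice@Example.com ", age=30)
clean = normalize_signup(raw)
validate_signup(clean)
print(clean.email)  # alice@example.com
```

Making the dataclass frozen turns an accidental in-place edit inside `validate_signup` into an immediate error instead of a silent surprise.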

Before/After: A Real-World API Redesign

Before

def proc(data, x=1, y=False, z=None, mode=0, fmt=None):
    # what does x mean? y? mode? fmt?
    if mode == 0:
        result = [item for item in data if item.get("active")]
    elif mode == 1:
        result = data
    if y:
        result = sorted(result, key=lambda d: d.get("score", 0))
    if x != 1:
        result = result[:x]
    if fmt == "csv":
        return ",".join(str(r) for r in result)
    return result

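Problems: cryptic names, an integer mode flag (any value other than 0 or 1 leaves result unassigned), an unnamed boolean sort flag, a magic default (x=1 means "no limit"), an unused parameter (z), and an inconsistent return type (list or str).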

After

from enum import Enum

class UserStatus(Enum):
    ACTIVE_ONLY = "active"
    ALL = "all"

def get_users(
    data: list[dict],
    *,
    status: UserStatus = UserStatus.ACTIVE_ONLY,
    sort_by_score: bool = False,
    limit: int | None = None,
) -> list[dict]:
    """Filter, sort, and limit a list of user records.

    Args:
        data: Raw list of user dicts.
        status: Whether to include only active users or all users.
        sort_by_score: If True, sort by 'score' field descending.
        limit: Maximum number of records to return. None means no limit.

    Returns:
        Filtered (and optionally sorted and limited) list of user dicts.
    """
    if status == UserStatus.ACTIVE_ONLY:
        result = [u for u in data if u.get("active")]
    else:
        result = list(data)  # copy, do not mutate input

    if sort_by_score:
        result.sort(key=lambda u: u.get("score", 0), reverse=True)

    if limit is not None:
        result = result[:limit]

    return result  # always a list

Improvements:

  • Descriptive name: get_users
  • Keyword-only args: callers must name everything
  • UserStatus enum: replaces mode=0/1 magic integer
  • Descriptive booleans: sort_by_score vs y
  • limit: int | None: replaces x=1 magic default
  • Consistent return type: always list[dict]
  • Docstring: complete documentation

Interview Questions

Q1: What is the Single Responsibility Principle for functions, and how do you apply it?

Answer: A function should do one thing well. The test: can you describe the function in one sentence without using "and" or "or"? If not, split it. Applied to functions: separate I/O from computation (fetch data, compute result, save result are separate functions), separate validation from transformation, and separate policy (what to do) from mechanism (how to do it). Functions with single responsibility are easier to name, test in isolation, reuse, and modify without unintended side effects on other behaviors.

Q2: Why are boolean flag arguments an anti-pattern? What are the alternatives?

Answer: Boolean flags like process(data, True, False) are opaque at call sites - callers must memorize what each positional bool means. They also often signal that a function does two things (one when True, another when False). Alternatives: (1) two named functions with distinct names; (2) keyword-only boolean: process(data, include_archived=True) - readable at call site; (3) an enum: process(data, mode=Mode.ARCHIVED) - explicit and extensible. Boolean keyword arguments are acceptable when the meaning is obvious from the name and there is only one such flag.

Q3: Why should functions return consistent types?

Answer: Inconsistent return types force every caller to write defensive code: result = func(); if result is None: ...; elif isinstance(result, list): ...; else: .... This is error-prone and verbose. Consistent types let callers trust the signature and eliminate branching. Rules: a function should always return the same type; prefer empty collections over None for "no results"; use T | None consistently when absence is meaningful; raise exceptions for invalid inputs rather than returning None as an error signal. Type checkers enforce this when you annotate return types correctly.

Q4: What is a pure function and why does it matter for testability?

Answer: A pure function: (1) returns the same output for the same inputs every time, and (2) has no side effects (does not modify external state, does not do I/O). Pure functions are trivially testable - no mocks, no database setup, no test isolation needed. You pass inputs and assert outputs. The pattern for testability: push I/O to the edges of your system, keep logic pure in the middle. compute_discount(is_premium, years_active) is pure and needs one line to test. get_user_discount(user_id) is impure and needs a database mock. Write as much logic as possible in pure functions.

Q5: When should you use keyword-only arguments (with *)?

Answer: Use keyword-only arguments when: (1) a function has more than 2-3 parameters and positional order would be ambiguous; (2) you have boolean or enumerated option arguments; (3) you want to allow future additions without breaking callers (new keyword args are backward-compatible); (4) the function is part of a public API that others will call. Keyword-only args improve call site readability: create_user(name="Alice", role="admin", send_welcome=True) is unambiguous. Exception: performance-critical internal functions may use positional args to avoid the keyword lookup overhead.

Q6: How do type annotations improve API design beyond just static checking?

Answer: Type annotations serve as executable documentation - they communicate the contract of a function (what it accepts, what it guarantees to return) without requiring callers to read the body. -> list[str] immediately tells callers they can iterate, index, and use list methods. -> str | None tells callers to check for None. -> None tells callers not to use the return value. Beyond documentation, annotations enable: static type checkers (mypy, pyright) to catch type errors before runtime, IDE autocompletion and inline documentation, automatic validation with tools like Pydantic, and generation of API docs. They also catch design issues - if you cannot write a clear type annotation, your API is probably unclear.

Practice Challenges

Beginner: Redesign the Bad API

Redesign the following function to follow clean API principles:

def do(x, y, z=1, flag=False, out=None):
    # x: list of numbers
    # y: operation ("sum", "avg", "max")
    # z: top-n (1 = all)
    # flag: True = absolute values
    # out: "print" or None
    if flag:
        x = [abs(v) for v in x]
    x = sorted(x, reverse=True)
    if z != 1:  # z=1 is the magic "keep everything" value described above
        x = x[:z]
    if y == "sum":
        result = sum(x)
    elif y == "avg":
        result = sum(x) / len(x) if x else 0
    elif y == "max":
        result = max(x) if x else 0
    if out == "print":
        print(result)
    return result

Solution

from enum import Enum

class Aggregation(Enum):
    SUM = "sum"
    AVERAGE = "average"
    MAX = "max"

def aggregate_top_values(
    values: list[float],
    aggregation: Aggregation,
    *,
    top_n: int | None = None,
    use_absolute: bool = False,
) -> float:
    """Aggregate the top-N values from a list.

    Args:
        values: Input list of numeric values.
        aggregation: Aggregation method (SUM, AVERAGE, MAX).
        top_n: If given, consider only the top N largest values. None means all.
        use_absolute: If True, take absolute value of each element first.

    Returns:
        The aggregated result as a float.

    Raises:
        ValueError: If values is empty and aggregation requires non-empty input.
    """
    if not values:
        if aggregation == Aggregation.AVERAGE:
            return 0.0
        if aggregation == Aggregation.MAX:
            raise ValueError("Cannot compute max of empty list")
        return 0.0

    processed = [abs(v) for v in values] if use_absolute else list(values)
    processed.sort(reverse=True)
    selected = processed[:top_n] if top_n is not None else processed

    if aggregation == Aggregation.SUM:
        return float(sum(selected))
    if aggregation == Aggregation.AVERAGE:
        return sum(selected) / len(selected)
    if aggregation == Aggregation.MAX:
        return float(max(selected))
    raise ValueError(f"Unknown aggregation: {aggregation}")

# Callers:
result = aggregate_top_values([-3, 1, -5, 2], Aggregation.SUM, use_absolute=True, top_n=2)
print(result)  # 8.0 (top 2 absolute values: 5+3)

Intermediate: Rate Limiter API

Design a clean RateLimiter class with an is_allowed(key) method. The interface should be intuitive, well-typed, and testable. The class should support per-key limits (e.g., 100 requests per minute per user_id).

Solution

from dataclasses import dataclass
from time import monotonic
from collections import deque

@dataclass
class RateLimitConfig:
    max_requests: int
    window_seconds: float

class RateLimiter:
    """Sliding window rate limiter.

    Args:
        config: Rate limit configuration.

    Example:
        limiter = RateLimiter(RateLimitConfig(max_requests=100, window_seconds=60))
        if limiter.is_allowed("user:42"):
            process_request()
        else:
            return 429
    """

    def __init__(self, config: RateLimitConfig) -> None:
        self._config = config
        self._windows: dict[str, deque[float]] = {}

    def is_allowed(self, key: str) -> bool:
        """Check if a request from key is within the rate limit.

        Records the request when it is allowed. Denied requests are not
        recorded, so they do not extend the caller's wait.

        Args:
            key: Unique identifier (user_id, IP, API key, etc.).

        Returns:
            True if the request is within limits, False if rate limited.
        """
        now = monotonic()
        window_start = now - self._config.window_seconds

        if key not in self._windows:
            self._windows[key] = deque()

        timestamps = self._windows[key]

        # Remove timestamps outside the window
        while timestamps and timestamps[0] < window_start:
            timestamps.popleft()

        if len(timestamps) >= self._config.max_requests:
            return False  # rate limited

        timestamps.append(now)
        return True

    def remaining(self, key: str) -> int:
        """Return how many requests remain in the current window."""
        now = monotonic()
        window_start = now - self._config.window_seconds
        timestamps = self._windows.get(key, deque())
        recent = sum(1 for t in timestamps if t >= window_start)
        return max(0, self._config.max_requests - recent)

# Usage
config = RateLimitConfig(max_requests=3, window_seconds=60)
limiter = RateLimiter(config)

for i in range(5):
    allowed = limiter.is_allowed("user:1")
    remaining = limiter.remaining("user:1")
    print(f"Request {i+1}: {'allowed' if allowed else 'DENIED'}, remaining={remaining}")
# Request 1: allowed, remaining=2
# Request 2: allowed, remaining=1
# Request 3: allowed, remaining=0
# Request 4: DENIED, remaining=0
# Request 5: DENIED, remaining=0
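
The challenge asks for a testable interface, and the time dependency is the hard part to test: a real test should not sleep for 60 seconds. One common approach, sketched here as a variation rather than as the solution above, is to inject the clock. The `clock` parameter, the flattened constructor arguments, and the `FakeClock` helper are assumptions introduced for this example.

```python
from collections import deque
from time import monotonic
from typing import Callable


class RateLimiter:
    """Sliding-window limiter with an injectable clock (sketch)."""

    def __init__(
        self,
        *,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock
        self._windows: dict[str, deque] = {}

    def is_allowed(self, key: str) -> bool:
        now = self._clock()
        timestamps = self._windows.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while timestamps and timestamps[0] < now - self._window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self._max_requests:
            return False
        timestamps.append(now)
        return True


class FakeClock:
    """Manually advanced clock for deterministic tests."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
limiter = RateLimiter(max_requests=2, window_seconds=60, clock=clock)
first = [limiter.is_allowed("user:1") for _ in range(3)]  # third call exceeds the limit
clock.now = 61.0                                          # move past the window
after_window = limiter.is_allowed("user:1")
print(first, after_window)  # [True, True, False] True
```

Production code keeps the default `monotonic` clock, so the extra parameter costs callers nothing while making window expiry testable in microseconds.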

Advanced: Pipeline Builder

Design a clean Pipeline API that chains data transformations:

result = (
Pipeline(raw_records)
.filter(lambda r: r["active"])
.map(lambda r: {**r, "score": r["score"] * 1.1})
.sort_by("score", descending=True)
.limit(10)
.collect()
)

The API should be: chainable, lazy (only evaluates on .collect()), type-annotated, and support custom aggregations via .reduce(func, initial).

Solution

from __future__ import annotations
from typing import TypeVar, Generic, Callable, Iterable, Any
from functools import reduce as functools_reduce

T = TypeVar("T")
U = TypeVar("U")

class Pipeline(Generic[T]):
    """Lazy, chainable data transformation pipeline.

    All transformations are deferred until .collect() is called.
    """

    def __init__(self, source: Iterable[T]) -> None:
        self._source = source
        self._ops: list[Callable] = []

    def _apply(self, data: Iterable[T]) -> list[T]:
        result = list(data)
        for op in self._ops:
            result = op(result)
        return result

    def filter(self, predicate: Callable[[T], bool]) -> Pipeline[T]:
        """Keep elements for which predicate returns True."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, p=predicate: [x for x in data if p(x)]]
        return new

    def map(self, transform: Callable[[T], U]) -> Pipeline[U]:
        """Apply transform to every element."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, f=transform: [f(x) for x in data]]
        return new  # type: ignore[return-value]

    def sort_by(self, key: str | Callable[[T], Any], *, descending: bool = False) -> Pipeline[T]:
        """Sort elements by a key attribute name or key function."""
        if isinstance(key, str):
            key_fn = lambda x, k=key: x[k]
        else:
            key_fn = key
        new = Pipeline(self._source)
        new._ops = self._ops + [
            lambda data, kf=key_fn, desc=descending: sorted(data, key=kf, reverse=desc)
        ]
        return new

    def limit(self, n: int) -> Pipeline[T]:
        """Keep only the first n elements."""
        new = Pipeline(self._source)
        new._ops = self._ops + [lambda data, n=n: data[:n]]
        return new

    def collect(self) -> list[T]:
        """Evaluate the pipeline and return a list."""
        return self._apply(self._source)

    def reduce(self, func: Callable[[U, T], U], initial: U) -> U:
        """Fold the pipeline into a single value."""
        return functools_reduce(func, self._apply(self._source), initial)


# Example
raw = [
    {"name": "Alice", "score": 80, "active": True},
    {"name": "Bob", "score": 95, "active": False},
    {"name": "Carol", "score": 70, "active": True},
    {"name": "Dave", "score": 88, "active": True},
]

result = (
    Pipeline(raw)
    .filter(lambda r: r["active"])
    .map(lambda r: {**r, "score": r["score"] * 1.1})
    .sort_by("score", descending=True)
    .limit(2)
    .collect()
)

for r in result:
    print(f"{r['name']}: {r['score']:.1f}")
# Dave: 96.8
# Alice: 88.0

total = Pipeline(raw).filter(lambda r: r["active"]).reduce(lambda acc, r: acc + r["score"], 0)
print(f"Total active score: {total}")  # 238

Quick Reference

| Principle | Anti-Pattern | Fix |
| --- | --- | --- |
| Single Responsibility | process_and_save() | process() + save() |
| Naming | do(x, y, flag) | compute_discount(price, rate) |
| Argument count | 7 positional args | Config dataclass or keyword-only |
| Boolean flags | create(data, True, False) | create(data, active=True, notify=False) |
| Consistent returns | returns int or None, or raises | int always (raises on invalid) |
| Pure functions | I/O mixed with logic | I/O at boundaries, pure logic inside |
| Type annotations | def f(x, y) | def f(x: int, y: str) -> bool |
| Docstring | No docstring | Google/NumPy/Sphinx style |
| Least surprise | get() that writes | find_or_create() with explicit name |

Key Takeaways

  • Single Responsibility: one function, one purpose - if you need "and" to describe it, split it
  • Names communicate contracts: get_ implies read-only, compute_ implies pure, send_ implies I/O; choose the right verb
  • Keyword-only arguments (* in signature) are the best solution to readable multi-argument APIs and the boolean flag anti-pattern
  • Consistent return types eliminate defensive branching at call sites; never return different types from different code paths
  • Pure functions are testable without mocks - push I/O to the edges of your system, keep logic pure in the middle
  • Type annotations are documentation that tools can verify; -> None is a contract, -> T | None is a contract, -> T is a contract
  • The Principle of Least Surprise: do exactly what the name implies - every deviation is a bug waiting to confuse someone
© 2026 EngineersOfAI. All rights reserved.