The functools Module - lru_cache, partial, reduce, singledispatch and More
Reading time: ~35 minutes | Level: Intermediate → Engineering
Before reading further, predict the output of this program:
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
if n < 2:
return n
return fib(n - 1) + fib(n - 2)
print(fib(100))
print(fib.cache_info())
Show Answer
354224848179261915075
CacheInfo(hits=98, misses=101, maxsize=None, currsize=101)
Without lru_cache, computing fib(100) via naive recursion would take longer than the age of the universe - the call tree has 2^100 nodes. With lru_cache, each unique value of n from 0 to 100 is computed exactly once. misses=101 means 101 unique (n,) argument tuples were computed (fib(0) through fib(100)). hits=98 means 98 calls returned a cached value without recomputation (each of fib(2) through fib(99) is called twice by the two branches above it, but fib(0) and fib(1) are each called once from above; the exact hit/miss split depends on call order). maxsize=None means unbounded cache - no eviction.
functools is the standard library module for higher-order functions - functions that operate on or return other functions. It is used in every serious Python codebase. This lesson covers every significant tool in the module at engineering depth.
What You Will Learn
lru_cache/cache: LRU eviction mechanics,cache_info(),cache_clear(), thread safety,maxsizewraps: why it exists, what__wrapped__is, and why you must use it in decoratorspartialandpartialmethod: pre-filling arguments for callbacks and class methodsreduce: left fold with accumulator, initial value, and theoperatormoduletotal_ordering: implementing all 6 comparison operators from 2cached_property: compute-once, store-on-instance (vslru_cachewhich stores on class)singledispatchandsingledispatchmethod: function overloading based on argument type
Prerequisites
- Lesson 05 (Decorators) -
lru_cacheandwrapsare deeply connected to the decorator protocol - Lesson 07 (Pure Functions) -
lru_cacheonly works correctly on pure functions - Lesson 08 (Immutability) - understanding why unhashable arguments cannot be cached
Part 1 - lru_cache and cache
How LRU Eviction Works
lru_cache implements a Least Recently Used cache. When the cache is full (maxsize entries) and a new, uncached value must be computed, the entry that was accessed least recently is evicted to make room.
The internal data structure is a doubly-linked list (for O(1) LRU eviction) combined with a dict (for O(1) lookup). Every cache hit moves the accessed entry to the "most recently used" end of the list. Every eviction removes from the "least recently used" end.
The Cache Key
The cache key is constructed from the positional and keyword arguments, converted to a hashable tuple. Arguments must be hashable - if you pass a list or dict, you get a TypeError.
from functools import lru_cache
@lru_cache(maxsize=128)
def compute(x: int, y: int) -> int:
print(f" Computing {x}, {y}")
return x * y
print(compute(3, 4)) # Computes and caches
print(compute(3, 4)) # Cache hit - no "Computing" print
print(compute(4, 3)) # Different key - computes again (3*4 != 4*3 in cache key terms)
info = compute.cache_info()
print(info)
# CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)
cache_info() Fields
| Field | Meaning |
|---|---|
hits | Number of calls that returned a cached result |
misses | Number of calls that computed a new result |
maxsize | Maximum cache size (None = unbounded) |
currsize | Current number of entries in the cache |
cache_clear()
compute.cache_clear()
info = compute.cache_info()
print(info)
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)
Use cache_clear() in tests to reset state between test runs, or when the cached data is known to be stale (for example, after a configuration reload).
maxsize Values
# maxsize=None - unbounded, no eviction, highest hit rate
# Use when: fixed, known set of possible inputs (e.g. fib numbers)
@lru_cache(maxsize=None)
def factorial(n: int) -> int:
return 1 if n <= 1 else n * factorial(n - 1)
# maxsize=128 (default) - LRU eviction when full
# Use when: many possible inputs but working set is small
@lru_cache(maxsize=128)
def fetch_user_permissions(user_id: int) -> frozenset[str]:
# simulate DB lookup
return frozenset({"read", "write"})
# maxsize=1 - only caches the most recent call
# Use when: memoizing the last result to avoid recomputation on repeated calls with the same arg
@lru_cache(maxsize=1)
def parse_config(config_path: str) -> dict:
import json
with open(config_path) as f:
return json.load(f)
functools.cache (Python 3.9+)
functools.cache is lru_cache(maxsize=None) with a simpler API and slightly lower overhead (no need to track LRU order when the cache is unbounded).
from functools import cache
@cache
def fib(n: int) -> int:
if n < 2:
return n
return fib(n - 1) + fib(n - 2)
Prefer functools.cache over lru_cache(maxsize=None) for new code on Python 3.9+.
Thread Safety
lru_cache uses a re-entrant lock internally. Concurrent calls from multiple threads are safe - they will not corrupt the cache. However, there is a subtle issue with concurrent cache misses: if two threads call f(x) simultaneously and neither finds it cached, both may compute f(x) before either stores the result. The final stored value will be correct, but the computation runs twice. This is called a cache stampede and is generally acceptable for read-heavy workloads.
lru_cache caches based on arguments. If any argument is unhashable (a list, dict, set, or any other unhashable type), the decorated function raises TypeError at call time - not at decoration time. Convert unhashable arguments to hashable equivalents before caching: list → tuple, dict → tuple(sorted(d.items())), set → frozenset.
Applying lru_cache to a method causes self to be part of the cache key. Because the cache is stored on the class (via the descriptor), the instance is kept alive as long as any entry in the class-level cache references it. This creates a memory leak: instances that should be garbage-collected are retained by the cache indefinitely. Use functools.cached_property instead for per-instance caching, or call method.cache_clear() explicitly before releasing instances.
Part 2 - functools.wraps
When you write a decorator, you wrap one function inside another. Without functools.wraps, the wrapped function loses its name, docstring, and signature.
# WITHOUT wraps - metadata lost
def log_calls(func):
def wrapper(*args, **kwargs):
print(f"Calling {func.__name__}")
return func(*args, **kwargs)
return wrapper
@log_calls
def add(x: int, y: int) -> int:
"""Add two numbers."""
return x + y
print(add.__name__) # 'wrapper' - wrong!
print(add.__doc__) # None - wrong!
# WITH wraps - metadata preserved
from functools import wraps
def log_calls(func):
@wraps(func)
def wrapper(*args, **kwargs):
print(f"Calling {func.__name__}")
return func(*args, **kwargs)
return wrapper
@log_calls
def add(x: int, y: int) -> int:
"""Add two numbers."""
return x + y
print(add.__name__) # 'add' - correct
print(add.__doc__) # 'Add two numbers.' - correct
print(add.__wrapped__) # <function add at 0x...> - original function
What wraps Sets
functools.wraps copies: __module__, __name__, __qualname__, __annotations__, __doc__, and __dict__. It also sets __wrapped__ to the original function.
__wrapped__ for Testing and Introspection
__wrapped__ lets you access the original, undecorated function - useful in tests when you want to call the pure logic without decorator side effects:
from functools import wraps, lru_cache
def validate_positive(func):
@wraps(func)
def wrapper(n: int) -> int:
if n < 0:
raise ValueError(f"Expected positive, got {n}")
return func(n)
return wrapper
@validate_positive
@lru_cache(maxsize=None)
def compute_sqrt(n: int) -> float:
return n ** 0.5
# In tests: unwrap one layer to test compute_sqrt without validation:
original_sqrt = compute_sqrt.__wrapped__
print(original_sqrt(9)) # 3.0 - bypasses validation decorator
functools.wraps sets the __wrapped__ attribute on the wrapper to point to the original function. You can follow the __wrapped__ chain to reach the innermost original function: while hasattr(f, '__wrapped__'): f = f.__wrapped__. This is used by inspect.unwrap() and by test frameworks to introspect decorated functions.
Part 3 - functools.partial and partialmethod
partial: Pre-Filling Arguments
functools.partial creates a new callable with some arguments of the original function pre-filled. The resulting object, when called, calls the original function with the pre-filled arguments plus any new arguments.
from functools import partial
def power(base: float, exponent: float) -> float:
return base ** exponent
# Pre-fill 'exponent' keyword argument
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(4)) # 16.0 - power(4, exponent=2)
print(cube(3)) # 27.0 - power(3, exponent=3)
print(square(10)) # 100.0
# Inspect the partial object
print(square.func) # <function power>
print(square.args) # () - no pre-filled positional args
print(square.keywords) # {'exponent': 2}
Positional vs Keyword Pre-filling
def log(level: str, message: str, timestamp: str) -> str:
return f"[{timestamp}] [{level}] {message}"
# Pre-fill positional argument (first argument fixed)
log_error = partial(log, "ERROR")
print(log_error("Disk full", "2024-01-01"))
# [2024-01-01] [ERROR] Disk full
# Pre-fill keyword argument
log_at_noon = partial(log, timestamp="12:00:00")
print(log_at_noon("INFO", "Service started"))
# [12:00:00] [INFO] Service started
When to Prefer partial Over lambda
partial is more readable and more introspectable than lambda for pre-filling arguments:
data = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
# With lambda - works but hides what is happening
data.sort(key=lambda x: x["age"])
# With operator.itemgetter - cleaner, introspectable
import operator
data.sort(key=operator.itemgetter("age"))
# With partial - useful when the key function itself needs configuration
import functools
def multi_key(item: dict, *fields: str) -> tuple:
return tuple(item[f] for f in fields)
sort_by_age_name = partial(multi_key, "age", "name")
data.sort(key=sort_by_age_name)
Prefer partial over lambda for pre-filling arguments: partial is more readable, introspectable (you can inspect .func, .args, .keywords), and debuggable. lambda is a better fit for simple inline transformations that do not need introspection.
partialmethod for Class Methods
functools.partialmethod creates a partial method descriptor - like partial, but designed for use inside class bodies.
from functools import partialmethod
class TextProcessor:
def _apply_style(self, text: str, style: str, prefix: str = "") -> str:
return f"{prefix}[{style}]{text}[/{style}]"
# Create specialised methods by pre-filling 'style'
bold = partialmethod(_apply_style, style="b")
italic = partialmethod(_apply_style, style="i")
code = partialmethod(_apply_style, style="code", prefix=">>> ")
processor = TextProcessor()
print(processor.bold("Hello")) # [b]Hello[/b]
print(processor.italic("World")) # [i]World[/i]
print(processor.code("print('hi')")) # >>> [code]print('hi')[/code]
Part 4 - functools.reduce
functools.reduce applies a binary function cumulatively to the elements of an iterable, reducing it to a single value. It is the left fold from functional programming.
from functools import reduce
from operator import add, mul
numbers = [1, 2, 3, 4, 5]
# Sum using reduce (equivalent to sum(numbers))
total = reduce(add, numbers)
print(total) # 15
# Trace: ((((1 + 2) + 3) + 4) + 5) = 15
# Product using reduce (no built-in prod before Python 3.8)
product = reduce(mul, numbers)
print(product) # 120
# Trace: ((((1 * 2) * 3) * 4) * 5) = 120
# With initial value - used when the iterable may be empty
total_with_init = reduce(add, [], 0) # empty list + initial value
print(total_with_init) # 0 (without initial value, reduce([]) raises TypeError)
The operator Module
operator provides function equivalents for Python's operators, usable with map, filter, reduce, and sorted.
import operator
# Arithmetic
print(operator.add(3, 4)) # 7
print(operator.mul(3, 4)) # 12
print(operator.floordiv(7, 2)) # 3
# Comparison
print(operator.lt(3, 4)) # True
print(operator.eq(3, 3)) # True
# Item and attribute access
get_age = operator.itemgetter("age")
data = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
print(sorted(data, key=get_age))
# [{'name': 'Bob', 'age': 25}, {'name': 'Alice', 'age': 30}]
# For nested access:
get_first = operator.itemgetter(0)
pairs = [(3, "c"), (1, "a"), (2, "b")]
print(sorted(pairs, key=get_first)) # [(1, 'a'), (2, 'b'), (3, 'c')]
# attrgetter - for object attributes
from operator import attrgetter
from dataclasses import dataclass
@dataclass
class Person:
name: str
age: int
people = [Person("Alice", 30), Person("Bob", 25), Person("Carol", 35)]
print(sorted(people, key=attrgetter("age")))
# [Person(name='Bob', age=25), Person(name='Alice', age=30), Person(name='Carol', age=35)]
operator.methodcaller: Cleaner than Lambda for Method Calls
from operator import methodcaller
words = ["hello", "WORLD", "Python"]
# With lambda
print(sorted(words, key=lambda w: w.lower()))
# With methodcaller - more readable, introspectable
lower = methodcaller("lower")
print(sorted(words, key=lower))
# ['hello', 'Python', 'WORLD']
# methodcaller with arguments
strip_dots = methodcaller("strip", ".")
strings = ["...hello...", "..world.", "python..."]
print(list(map(strip_dots, strings)))
# ['hello', 'world', 'python']
reduce for Function Pipelines
from functools import reduce
from typing import Callable, TypeVar
T = TypeVar("T")
def compose(*funcs: Callable) -> Callable:
"""Compose functions right-to-left: compose(f, g, h)(x) == f(g(h(x)))"""
return reduce(lambda f, g: lambda x: f(g(x)), funcs)
def pipe(*funcs: Callable) -> Callable:
"""Compose functions left-to-right: pipe(f, g, h)(x) == h(g(f(x)))"""
return reduce(lambda f, g: lambda x: g(f(x)), funcs)
# Build a text transformation pipeline
clean = pipe(
str.strip,
str.lower,
lambda s: s.replace(" ", "_"),
)
print(clean(" Hello World ")) # hello_world
print(clean(" Python Rocks ")) # python_rocks
Part 5 - functools.total_ordering
total_ordering fills in the missing comparison methods when you define __eq__ and one of __lt__, __le__, __gt__, or __ge__. Python derives the remaining five methods automatically.
from functools import total_ordering
from dataclasses import dataclass
@total_ordering
@dataclass
class Version:
major: int
minor: int
patch: int
def __eq__(self, other: object) -> bool:
if not isinstance(other, Version):
return NotImplemented
return (self.major, self.minor, self.patch) == (other.major, other.minor, other.patch)
def __lt__(self, other: "Version") -> bool:
if not isinstance(other, Version):
return NotImplemented
return (self.major, self.minor, self.patch) < (other.major, other.minor, other.patch)
def __repr__(self) -> str:
return f"v{self.major}.{self.minor}.{self.patch}"
v1 = Version(1, 0, 0)
v2 = Version(1, 2, 0)
v3 = Version(2, 0, 0)
# All six comparison operators work - derived from __eq__ and __lt__:
print(v1 < v2) # True
print(v2 > v1) # True - derived: not (v2 < v1) and v2 != v1
print(v1 <= v2) # True - derived: v1 < v2 or v1 == v2
print(v3 >= v2) # True - derived
print(sorted([v3, v1, v2])) # [v1.0.0, v1.2.0, v2.0.0]
total_ordering is slower than defining all six methods explicitly (each derived comparison involves at least one extra method call). For performance-critical comparison-heavy code (sorting millions of objects), define all six manually. For most application code, total_ordering is the right trade-off.
Part 6 - functools.cached_property
cached_property is a descriptor that computes a property once on first access, stores the result directly on the instance's __dict__, and returns the cached value on all subsequent accesses - without calling the method again.
from functools import cached_property
import math
class Circle:
def __init__(self, radius: float):
self.radius = radius
@cached_property
def area(self) -> float:
print(" Computing area...")
return math.pi * self.radius ** 2
@cached_property
def circumference(self) -> float:
print(" Computing circumference...")
return 2 * math.pi * self.radius
c = Circle(5.0)
print(c.area) # Computing area... then 78.539...
print(c.area) # 78.539... - no "Computing" - cached in c.__dict__
print(c.circumference) # Computing circumference... then 31.415...
print(c.__dict__) # {'radius': 5.0, 'area': 78.539..., 'circumference': 31.415...}
cached_property vs lru_cache on Methods
| Feature | cached_property | lru_cache on method |
|---|---|---|
| Cache location | Instance __dict__ | Class-level cache dict |
| Memory | Freed when instance is GC'd | Holds instance alive (memory leak) |
| Cache key | Per-instance (no args needed) | (self, *args) |
| Supports arguments | No (property: no args) | Yes |
| Thread safe | Requires __slots__ caveat | Yes (internal lock) |
from functools import cached_property, lru_cache
class DataSet:
def __init__(self, data: list[float]):
self.data = data
# CORRECT: cached_property for argument-free expensive computation
@cached_property
def statistics(self) -> dict:
return {
"mean": sum(self.data) / len(self.data),
"min": min(self.data),
"max": max(self.data),
}
# WRONG: lru_cache on method - holds self in cache, causes memory leak
@lru_cache(maxsize=None)
def mean_of_first_n(self, n: int) -> float:
return sum(self.data[:n]) / n
# self is part of the cache key → instance never garbage collected
cached_property does not work on classes with __slots__ because __slots__ prevents instance __dict__, and cached_property stores results in __dict__. If you need caching on slotted classes, use lru_cache and manage the memory leak explicitly.
Part 7 - functools.singledispatch
singledispatch implements function overloading based on the type of the first argument. You define a base implementation, then register type-specific implementations.
from functools import singledispatch
from typing import Sequence
@singledispatch
def process(value) -> str:
"""Default implementation for unregistered types."""
return f"Unknown type: {type(value).__name__}"
@process.register(int)
def _(value: int) -> str:
return f"Integer: {value * 2}"
@process.register(str)
def _(value: str) -> str:
return f"String: {value.upper()}"
@process.register(list)
def _(value: list) -> str:
return f"List with {len(value)} items: {value}"
@process.register(float)
@process.register(complex)
def _(value) -> str:
return f"Numeric: {abs(value):.4f}"
print(process(42)) # Integer: 84
print(process("hello")) # String: HELLO
print(process([1, 2, 3])) # List with 3 items: [1, 2, 3]
print(process(3.14)) # Numeric: 3.1400
print(process(3+4j)) # Numeric: 5.0000
print(process({"key": 1})) # Unknown type: dict
Registering with Type Annotations (Python 3.7+)
from functools import singledispatch
@singledispatch
def serialise(value) -> bytes:
raise TypeError(f"Cannot serialise type {type(value).__name__}")
@serialise.register
def _(value: int) -> bytes:
return value.to_bytes(8, "big")
@serialise.register
def _(value: str) -> bytes:
return value.encode("utf-8")
@serialise.register
def _(value: bytes) -> bytes:
return value # already bytes
print(serialise(42)) # b'\x00\x00\x00\x00\x00\x00\x00*'
print(serialise("hello")) # b'hello'
print(serialise(b"already")) # b'already'
singledispatchmethod for Class Methods (Python 3.8+)
from functools import singledispatchmethod
class Formatter:
@singledispatchmethod
def format(self, value) -> str:
return str(value)
@format.register(int)
def _(self, value: int) -> str:
return f"{value:,}" # thousands separator
@format.register(float)
def _(self, value: float) -> str:
return f"{value:.2f}"
@format.register(list)
def _(self, value: list) -> str:
return ", ".join(str(v) for v in value)
fmt = Formatter()
print(fmt.format(1000000)) # 1,000,000
print(fmt.format(3.14159)) # 3.14
print(fmt.format([1, 2, 3])) # 1, 2, 3
print(fmt.format("raw")) # raw - default
MRO-Based Dispatch: Subclasses Inherit Registrations
singledispatch walks the MRO of the argument's type when no exact registration exists. This means a handler registered for a base class also handles subclasses.
from functools import singledispatch
from collections.abc import Sequence, Mapping
@singledispatch
def describe(value) -> str:
return f"object: {value!r}"
@describe.register(Sequence)
def _(value) -> str:
return f"sequence of {len(value)} items"
@describe.register(Mapping)
def _(value) -> str:
return f"mapping with keys: {list(value.keys())}"
print(describe([1, 2, 3])) # sequence of 3 items - list is a Sequence
print(describe((1, 2, 3))) # sequence of 3 items - tuple is a Sequence
print(describe({"a": 1})) # mapping with keys: ['a'] - dict is a Mapping
print(describe(42)) # object: 42 - int matches default
functools.wraps sets __wrapped__ on the wrapper function, pointing to the original wrapped function. You can access the original via decorated_func.__wrapped__. The inspect.unwrap() function follows the __wrapped__ chain all the way to the innermost original. This is used by test frameworks, documentation generators, and help() to show the true signature of a decorated function.
Part 8 - Putting It Together: A Functools-Powered Serialiser
A concrete example combining singledispatch, lru_cache, wraps, and partial:
import json
from functools import singledispatch, lru_cache, wraps, partial
from dataclasses import dataclass, asdict
from decimal import Decimal
from datetime import date, datetime
from typing import Any
# singledispatch: type-aware serialisation
@singledispatch
def to_json_serialisable(value: Any) -> Any:
"""Convert a value to a JSON-serialisable form."""
raise TypeError(f"Not JSON-serialisable: {type(value).__name__}")
@to_json_serialisable.register(int)
@to_json_serialisable.register(float)
@to_json_serialisable.register(str)
@to_json_serialisable.register(bool)
@to_json_serialisable.register(type(None))
def _(value: Any) -> Any:
return value # already serialisable
@to_json_serialisable.register(Decimal)
def _(value: Decimal) -> str:
return str(value)
@to_json_serialisable.register(date)
def _(value: date) -> str:
return value.isoformat()
@to_json_serialisable.register(datetime)
def _(value: datetime) -> str:
return value.isoformat()
@to_json_serialisable.register(list)
@to_json_serialisable.register(tuple)
def _(value) -> list:
return [to_json_serialisable(v) for v in value]
@to_json_serialisable.register(dict)
def _(value: dict) -> dict:
return {str(k): to_json_serialisable(v) for k, v in value.items()}
# lru_cache + wraps: cache the expensive schema computation
def cache_schema(func):
@wraps(func)
@lru_cache(maxsize=64)
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
@cache_schema
def get_field_names(cls: type) -> tuple[str, ...]:
import dataclasses
return tuple(f.name for f in dataclasses.fields(cls))
# partial: pre-configure JSON serialisation
compact_json = partial(json.dumps, separators=(",", ":"))
pretty_json = partial(json.dumps, indent=2, sort_keys=True)
# Usage
@dataclass
class Order:
order_id: str
amount: Decimal
created_at: date
tags: list[str]
order = Order(
order_id="ORD-001",
amount=Decimal("99.99"),
created_at=date(2024, 1, 15),
tags=["priority", "verified"],
)
serialisable = to_json_serialisable(asdict(order))
print(compact_json(serialisable))
# {"order_id":"ORD-001","amount":"99.99","created_at":"2024-01-15","tags":["priority","verified"]}
print(pretty_json(serialisable))
# {
# "amount": "99.99",
# "created_at": "2024-01-15",
# ...
# }
Engineering Checklist
Before moving on, verify you can answer these without looking:
- What data structures does
lru_cacheuse internally and why? - What are
hits,misses,maxsize, andcurrsizeincache_info()? - Why do unhashable arguments cause
lru_cacheto raiseTypeError? - Why does applying
lru_cacheto an instance method cause a memory leak? - What is
__wrapped__and how do you use it in tests? - What is the difference between
functools.partial(positional) andfunctools.partial(keyword) pre-filling? - How does
singledispatchresolve a call when there is no exact type match? - What is the difference between
cached_propertyandlru_cacheon a method, in terms of where the cached value is stored? - How do you implement all six comparison operators using only two method definitions?
Key Takeaways
lru_cacheuses a doubly-linked list (O(1) LRU eviction) combined with a dict (O(1) lookup). On a cache miss, it computes and stores the result; on a hit, it moves the entry to the most-recently-used position.maxsize=Nonedisables eviction and is slightly faster.functools.cache(Python 3.9+) islru_cache(maxsize=None)with a simpler API. Prefer it for unbounded caches.lru_cacherequires hashable arguments - it constructs a tuple key from all arguments. Passing alistordictraisesTypeErrorat call time.- Applying
lru_cacheto an instance method storesselfin the class-level cache key, preventing instances from being garbage-collected. Usecached_propertyfor per-instance caching with no argument. functools.wrapscopies__name__,__doc__,__annotations__, and other metadata from the wrapped function to the wrapper. It also sets__wrapped__to the original function. Always use@wraps(func)inside custom decorators.functools.partialpre-fills positional or keyword arguments and returns a new callable. Inspect it via.func,.args,.keywords. Preferpartialoverlambdafor pre-filling - it is more readable and introspectable.functools.reduce(f, iterable, initial)is the left fold: it appliesfcumulatively. Always provide theinitialvalue when the iterable may be empty to avoidTypeError.total_orderingderives all six comparison operators from__eq__and one of__lt__,__le__,__gt__,__ge__. There is a small runtime cost per comparison; for high-performance sorting, define all six methods manually.cached_propertycomputes a value once on first access and stores it in the instance's__dict__. The cache is freed when the instance is garbage-collected - no memory leak.singledispatchdispatches on the type of the first argument. When there is no exact type registration, it walks the argument type's MRO to find the closest registered base type. Subclasses of registered types are handled automatically.
Graded Practice
Level 1 - Predict the Output
Question 1
from functools import lru_cache
@lru_cache(maxsize=3)
def slow(n: int) -> int:
return n * n
slow(1)
slow(2)
slow(3)
slow(4) # evicts LRU
slow(1) # miss - was evicted
info = slow.cache_info()
print(info.hits)
print(info.misses)
print(info.currsize)
Show Answer
0
5
3
slow(1), slow(2), slow(3) - three misses, cache is full (size 3). slow(4) - miss, evicts the LRU entry which is slow(1) (the one not accessed most recently). Cache now holds {2, 3, 4}. slow(1) - miss again (it was evicted). Cache now holds {3, 4, 1} (2 is now LRU). Total: 5 misses, 0 hits, currsize=3. No call resulted in a cache hit because every call to a previously computed value was after an eviction.
Question 2
from functools import wraps
def double_result(func):
@wraps(func)
def wrapper(*args, **kwargs):
return func(*args, **kwargs) * 2
return wrapper
@double_result
def add(x: int, y: int) -> int:
"""Add two integers."""
return x + y
print(add(3, 4))
print(add.__name__)
print(add.__doc__)
print(hasattr(add, "__wrapped__"))
Show Answer
14
add
Add two integers.
True
add(3, 4) calls wrapper(3, 4), which calls add(3, 4) returning 7, then doubles it to 14. @wraps(func) copies __name__ ("add"), __doc__ ("Add two integers."), and sets __wrapped__ to the original add function. All three metadata attributes are preserved on the wrapper.
Question 3
from functools import partial
def format_price(amount: float, currency: str, precision: int) -> str:
return f"{currency}{amount:.{precision}f}"
eur_price = partial(format_price, currency="€", precision=2)
jpy_price = partial(format_price, currency="¥", precision=0)
print(eur_price(9.99))
print(jpy_price(9.99))
print(eur_price.func.__name__)
print(eur_price.keywords)
Show Answer
€9.99
¥10
format_price
{'currency': '€', 'precision': 2}
eur_price(9.99) calls format_price(9.99, currency="€", precision=2) → "€9.99". jpy_price(9.99) calls format_price(9.99, currency="¥", precision=0) → "¥10" (9.99 rounded to 0 decimal places is 10). eur_price.func is the original format_price function, so eur_price.func.__name__ is "format_price". eur_price.keywords is the dict of pre-filled keyword arguments.
Question 4
from functools import singledispatch
@singledispatch
def stringify(value) -> str:
return f"default:{value}"
@stringify.register(int)
def _(value: int) -> str:
return f"int:{value}"
@stringify.register(bool)
def _(value: bool) -> str:
return f"bool:{value}"
print(stringify(42))
print(stringify(True))
print(stringify(3.14))
print(stringify("hi"))
Show Answer
int:42
bool:True
default:3.14
default:hi
stringify(42): 42 is an int, registered handler returns "int:42". stringify(True): True is a bool. Even though bool is a subclass of int, there is an explicit registration for bool that takes precedence over the int registration. singledispatch checks for the most specific registered type in the MRO - bool is more specific than int. stringify(3.14): float is not registered, falls through to the default. stringify("hi"): str is not registered, falls through to the default.
Question 5
from functools import cached_property
class Fibonacci:
def __init__(self, n: int):
self.n = n
@cached_property
def value(self) -> int:
print(" computing...")
a, b = 0, 1
for _ in range(self.n):
a, b = b, a + b
return a
f = Fibonacci(10)
print(f.value)
print(f.value)
print("value" in f.__dict__)
Show Answer
computing...
55
55
True
First access to f.value triggers the computation: prints " computing..." and returns 55 (the 10th Fibonacci number: 0,1,1,2,3,5,8,13,21,34,55). The result 55 is stored in f.__dict__["value"]. Second access to f.value reads directly from f.__dict__ - the cached_property descriptor is bypassed because instance __dict__ takes precedence over the class-level descriptor. No " computing..." printed. "value" in f.__dict__ is True because cached_property stores the result on the instance.
Level 2 - Debug Challenge
The following code is supposed to memoize an API response, but it has two bugs: one causes a TypeError at runtime, and the other causes a memory leak. Identify both bugs and fix them.
from functools import lru_cache
import requests
class APIClient:
def __init__(self, base_url: str):
self.base_url = base_url
@lru_cache(maxsize=100) # BUG 1: memory leak
def fetch_user(self, user_id: int) -> dict:
response = requests.get(f"{self.base_url}/users/{user_id}")
return response.json()
@lru_cache(maxsize=50)
def search_users(self, tags: list[str]) -> list[dict]: # BUG 2: TypeError
params = {"tags": ",".join(tags)}
response = requests.get(f"{self.base_url}/users", params=params)
return response.json()
Show Answer
Bug 1 - Memory leak from lru_cache on instance method: lru_cache on fetch_user makes self part of the cache key (the cache key is (self, user_id)). The cache is stored at the class level (attached to the function object). Every APIClient instance that calls fetch_user is retained in the cache indefinitely - instances can never be garbage-collected as long as any cached entry references them. With many short-lived clients (common in testing or in per-request contexts), this leaks memory silently.
Bug 2 - Unhashable argument: tags: list[str] is a list, which is not hashable. lru_cache constructs a tuple key from all arguments including tags. Passing a list raises TypeError: unhashable type: 'list' at call time.
Fixed version:
from functools import lru_cache, cached_property
import requests
class APIClient:
def __init__(self, base_url: str):
self.base_url = base_url
# FIX 1: Use a module-level lru_cache function instead of a method-level one
# Or: accept that fetch_user is not cached at the instance level and use
# a separate cache mechanism. Here we use a module-level cached function:
def fetch_user(self, user_id: int) -> dict:
return _fetch_user_cached(self.base_url, user_id)
def search_users(self, tags: tuple[str, ...]) -> list[dict]:
# FIX 2: caller must pass tuple, not list
return _search_users_cached(self.base_url, tags)
# Module-level cached functions - no self in cache key, no memory leak
@lru_cache(maxsize=100)
def _fetch_user_cached(base_url: str, user_id: int) -> dict:
response = requests.get(f"{base_url}/users/{user_id}")
return response.json()
@lru_cache(maxsize=50)
def _search_users_cached(base_url: str, tags: tuple[str, ...]) -> list[dict]:
# FIX 2: tags is now a tuple (hashable), not a list
params = {"tags": ",".join(tags)}
response = requests.get(f"{base_url}/users", params=params)
return response.json()
# Usage:
client = APIClient("https://api.example.com")
user = client.fetch_user(42)
results = client.search_users(("python", "functional")) # pass tuple, not list
Alternative for Bug 2 (if callers must pass lists): convert to frozenset or tuple inside the method before calling the cached function:
def search_users(self, tags: list[str]) -> list[dict]:
return _search_users_cached(self.base_url, tuple(sorted(tags)))
Sorting ensures ["a", "b"] and ["b", "a"] produce the same cache key.
Level 3 - Design Challenge
Design a type-aware data validation framework using singledispatch. The framework should:
- Validate
intvalues: must be within a specified range - Validate
strvalues: must match a specified regex pattern and length constraints - Validate
listvalues: must contain only items that themselves pass validation - Validate
dictvalues: must have all required keys, and each value must pass validation for its type - Return a
ValidationResult(an immutable dataclass) containingis_valid: boolanderrors: tuple[str, ...] - Be extensible - new types can be registered without modifying existing code
Show Answer
from functools import singledispatch
from dataclasses import dataclass
import re
from typing import Any
@dataclass(frozen=True)
class ValidationResult:
is_valid: bool
errors: tuple[str, ...]
@classmethod
def ok(cls) -> "ValidationResult":
return cls(is_valid=True, errors=())
@classmethod
def fail(cls, *errors: str) -> "ValidationResult":
return cls(is_valid=False, errors=errors)
def merge(self, other: "ValidationResult") -> "ValidationResult":
all_errors = self.errors + other.errors
return ValidationResult(
is_valid=not all_errors,
errors=all_errors,
)
# ============================================================
# VALIDATION CONTEXT OBJECTS - immutable configuration
# ============================================================
@dataclass(frozen=True)
class IntConstraint:
min_value: int
max_value: int
@dataclass(frozen=True)
class StrConstraint:
pattern: str
min_length: int = 0
max_length: int = 10_000
@dataclass(frozen=True)
class ListConstraint:
item_constraint: Any # constraint for each item
min_items: int = 0
max_items: int = 10_000
@dataclass(frozen=True)
class DictConstraint:
required_keys: frozenset[str]
value_constraints: tuple # tuple of (key, constraint) pairs
# ============================================================
# SINGLEDISPATCH VALIDATOR - dispatches on constraint type
# ============================================================
@singledispatch
def validate(value: Any, constraint: Any) -> ValidationResult:
raise TypeError(f"No validator registered for constraint type: {type(constraint).__name__}")
@validate.register(IntConstraint)
def _(value: Any, constraint: IntConstraint) -> ValidationResult:
if not isinstance(value, int):
return ValidationResult.fail(f"Expected int, got {type(value).__name__}")
errors = []
if value < constraint.min_value:
errors.append(f"Value {value} is below minimum {constraint.min_value}")
if value > constraint.max_value:
errors.append(f"Value {value} is above maximum {constraint.max_value}")
return ValidationResult.fail(*errors) if errors else ValidationResult.ok()
@validate.register(StrConstraint)
def _(value: Any, constraint: StrConstraint) -> ValidationResult:
if not isinstance(value, str):
return ValidationResult.fail(f"Expected str, got {type(value).__name__}")
errors = []
if len(value) < constraint.min_length:
errors.append(f"String length {len(value)} is below minimum {constraint.min_length}")
if len(value) > constraint.max_length:
errors.append(f"String length {len(value)} exceeds maximum {constraint.max_length}")
if not re.fullmatch(constraint.pattern, value):
errors.append(f"Value '{value}' does not match pattern '{constraint.pattern}'")
return ValidationResult.fail(*errors) if errors else ValidationResult.ok()
@validate.register(ListConstraint)
def _(value: Any, constraint: ListConstraint) -> ValidationResult:
if not isinstance(value, list):
return ValidationResult.fail(f"Expected list, got {type(value).__name__}")
errors = []
if len(value) < constraint.min_items:
errors.append(f"List has {len(value)} items, minimum is {constraint.min_items}")
if len(value) > constraint.max_items:
errors.append(f"List has {len(value)} items, maximum is {constraint.max_items}")
for i, item in enumerate(value):
result = validate(item, constraint.item_constraint)
if not result.is_valid:
for err in result.errors:
errors.append(f"[{i}]: {err}")
return ValidationResult.fail(*errors) if errors else ValidationResult.ok()
@validate.register(DictConstraint)
def _(value: Any, constraint: DictConstraint) -> ValidationResult:
if not isinstance(value, dict):
return ValidationResult.fail(f"Expected dict, got {type(value).__name__}")
errors = []
missing = constraint.required_keys - frozenset(value.keys())
for key in sorted(missing):
errors.append(f"Missing required key: '{key}'")
value_constraint_map = dict(constraint.value_constraints)
for key, val_constraint in value_constraint_map.items():
if key in value:
result = validate(value[key], val_constraint)
if not result.is_valid:
for err in result.errors:
errors.append(f"['{key}']: {err}")
return ValidationResult.fail(*errors) if errors else ValidationResult.ok()
# ============================================================
# DEMO - extensible, immutable, pure
# ============================================================
user_constraint = DictConstraint(
required_keys=frozenset({"name", "age", "email", "tags"}),
value_constraints=(
("name", StrConstraint(pattern=r"[A-Za-z ]+", min_length=2, max_length=100)),
("age", IntConstraint(min_value=18, max_value=120)),
("email", StrConstraint(pattern=r"[^@]+@[^@]+\.[^@]+", max_length=254)),
("tags", ListConstraint(
item_constraint=StrConstraint(pattern=r"[a-z_]+", max_length=50),
min_items=1,
max_items=10,
)),
),
)
valid_user = {
"name": "Alice Smith",
"age": 30,
"tags": ["admin", "verified"],
}
invalid_user = {
"name": "A", # too short
"age": 15, # below minimum
"email": "not-an-email", # pattern mismatch
"tags": ["UPPERCASE"], # pattern mismatch (must be lowercase)
# "missing_key" is absent - but "tags" is present so no missing key error
}
result_valid = validate(valid_user, user_constraint)
print(f"Valid user: {result_valid.is_valid}") # True
print(f"Errors: {result_valid.errors}") # ()
result_invalid = validate(invalid_user, user_constraint)
print(f"\nInvalid user: {result_invalid.is_valid}") # False
for err in result_invalid.errors:
print(f" - {err}")
Output:
Valid user: True
Errors: ()
Invalid user: False
- ['name']: String length 1 is below minimum 2
- ['age']: Value 15 is below minimum 18
- ['email']: Value 'not-an-email' does not match pattern '[^@]+@[^@]+\.[^@]+'
- ['tags']: [0]: Value 'UPPERCASE' does not match pattern '[a-z_]+'
Design highlights:
singledispatchdispatches on the constraint type (not the value type), making the framework open to extension: register a new constraint class and its validator without modifying existing code- All data objects (
ValidationResult, constraint classes) are frozen dataclasses - immutable, hashable - The
validatefunction is pure: same inputs always return the sameValidationResult - Recursive validation (list items, dict values) works naturally because
validateis a regular function that can call itself - Errors from nested structures are prefixed with their path (
[0]:,['key']:) for clear error messages
What's Next
Lesson 10 covers functools.partial at full depth alongside currying - the theoretical and practical distinction between partial application (what Python does natively) and currying (transforming f(a, b, c) into f(a)(b)(c)). You will see how to implement currying in Python, when to use it, and how the operator module provides curried-style operations for use in pipelines.
