Closures - Functions That Remember Their Environment
Reading time: ~18 minutes | Level: Foundation → Engineering
Here is a result that surprises most developers when they first see it:
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

c1 = make_counter()
c2 = make_counter()
print(c1()) # 1
print(c1()) # 2
print(c1()) # 3
print(c2()) # 1 ← independent counter, starts at zero
print(c1()) # 4 ← c1 remembers its own count separately
make_counter has already returned. Its local variable count should be gone - the function's stack frame is destroyed on return. Yet c1() keeps incrementing its own count, and c2() has a completely separate one.
How does Python keep count alive? The answer is closures: inner functions that carry their enclosing scope's variables with them, stored on the heap in small containers called cell objects.
What You Will Learn
- What a closure is: an inner function bound to free variables from its enclosing scope
- How CPython stores free variables: __closure__, cell objects, and co_freevars
- The classic closure bug: loop variable capture and why all closures see the final value
- Three correct fixes: default-argument capture, functools.partial, and a factory function
- The nonlocal keyword: mutating an enclosing scope's variable
- Function factories: generating specialized functions at runtime
- Closures vs classes: choosing the right abstraction for stateful behavior
- Why closures keep enclosing scopes alive (and when that causes memory leaks)
Prerequisites
- Comfortable with Python functions, def, and return values (topic 01)
- Understanding that functions are first-class objects (topic 01)
- Basic knowledge of Python scope and the LEGB rule (topic 08)
Mental Model: A Closure Is a Function Plus Its Environment
When an inner function is defined inside an outer function, and the inner function references variables from the outer function, Python creates a closure: the inner function carries those variables with it even after the outer function finishes executing.
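A captured variable, such as n in the example below, does not live directly in the outer function's stack frame. The compiler sees that an inner function references it and stores it in a cell object on the heap from the start; the outer function and the inner function both reach the value through that shared cell.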
Part 1 - Your First Closure
def make_adder(n):
    """Return a function that adds n to its argument."""
    def add(x):
        return x + n # n is a free variable - not local, not global
    return add

add5 = make_adder(5)
add10 = make_adder(10)
print(add5(3)) # 8
print(add10(3)) # 13
print(add5(20)) # 25
add5 and add10 are two different closure objects. Each carries its own captured value of n.
A free variable is a variable that is referenced in a function body but defined in an enclosing scope - not a local variable of that function, not a global module-level name.
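To make the three categories concrete, here is a minimal sketch that asks a code object how it classified each name. The names GREETING, outer, inner, and punctuation are made up for this illustration; co_varnames, co_freevars, and co_names are standard code-object attributes.

```python
GREETING = "hello"  # module-level (global) name


def outer():
    punctuation = "!"  # local to outer, free in inner

    def inner(name):
        # name is local, punctuation is free, GREETING is global
        return f"{GREETING}, {name}{punctuation}"

    return inner


inner = outer()
code = inner.__code__
print(code.co_varnames)  # ('name',)
print(code.co_freevars)  # ('punctuation',)
print(code.co_names)     # ('GREETING',)
print(inner("Ada"))      # hello, Ada!
```

The compiler makes this classification once, when the function body is compiled, which is why it can be read back from the code object without calling anything.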
Part 2 - CPython Internals: Cell Objects and __closure__
Let us look under the hood. CPython stores free variables in cell objects - thin wrappers on the heap that hold a single reference.
def make_adder(n):
    def add(x):
        return x + n
    return add

add5 = make_adder(5)
# Inspect the closure
print(add5.__closure__) # (<cell at 0x7f...>,)
print(add5.__closure__[0].cell_contents) # 5
print(add5.__code__.co_freevars) # ('n',)
Each entry in __closure__ is a cell object. cell_contents holds the actual value. co_freevars is a tuple of the variable names corresponding to the cells, in the same order.
Why Cell Objects? Sharing Between Scopes
The cell object solves a critical problem: what if the outer function needs to modify the variable after the inner function is created? Both functions must see the same storage location.
def make_counter():
    count = 0 # stored in a cell, not on the stack
    def increment():
        nonlocal count # access the shared cell
        count += 1
        return count
    def reset():
        nonlocal count
        count = 0
    return increment, reset

inc, rst = make_counter()
print(inc()) # 1
print(inc()) # 2
print(inc()) # 3
rst()
print(inc()) # 1 - count was reset via the shared cell
Both increment and reset share the same cell object. When reset sets count = 0, the cell's contents change, and increment sees the new value on the next call. This is why it is called a cell: both functions share one box.
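The sharing can be checked directly with an identity test. The sketch below repeats the counter from above so that it runs on its own; the variable names inc_cell, rst_cell, and other_inc are only for illustration.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    def reset():
        nonlocal count
        count = 0

    return increment, reset


inc, rst = make_counter()
inc_cell = inc.__closure__[0]
rst_cell = rst.__closure__[0]
print(inc_cell is rst_cell)      # True: one cell, two functions

inc()
inc()
print(rst_cell.cell_contents)    # 2: reset's cell shows increment's writes
rst()
print(inc_cell.cell_contents)    # 0

# A second call to the factory builds a separate cell
other_inc, _ = make_counter()
print(other_inc.__closure__[0] is inc_cell)  # False
```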
Bytecode Evidence
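import dis

def make_adder(n):
    def add(x):
        return x + n
    return add

dis.dis(make_adder(0)) # disassemble the inner `add` returned by the factory
# Abridged output on CPython 3.10 and earlier:
#   LOAD_FAST 0 (x)
#   LOAD_DEREF 0 (n) ← loads from closure cell, not local frame
#   BINARY_ADD
#   RETURN_VALUE
# CPython 3.11+ prints BINARY_OP in place of BINARY_ADD and adds
# bookkeeping instructions such as COPY_FREE_VARS and RESUME.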
LOAD_DEREF is the bytecode instruction for reading a free variable from a closure cell. Compare this to LOAD_FAST (local variable on the current frame) and LOAD_GLOBAL (module-level name).
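A small sketch can show all three load instructions in one function. THRESHOLD, make_checker, and check are hypothetical names chosen for this example. Exact opcode names vary between CPython versions (newer releases use variants of LOAD_FAST), so no output is listed and the code only prints what it finds.

```python
import dis

THRESHOLD = 10  # module-level name


def make_checker(limit):
    def check(value):
        # value: local, limit: free variable, THRESHOLD: global
        return value < limit and value < THRESHOLD

    return check


check = make_checker(5)

# Map each loaded name to the instruction used to load it
loads = {
    ins.argval: ins.opname
    for ins in dis.get_instructions(check)
    if ins.opname.startswith("LOAD_")
}
for name, opname in loads.items():
    print(name, opname)
```

On the CPython versions this was written against, limit is loaded with LOAD_DEREF and THRESHOLD with LOAD_GLOBAL, while value uses LOAD_FAST or one of its variants.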
Part 3 - The Classic Closure Bug: Loop Variable Capture
This is one of the most common Python interview questions and one of the most surprising bugs in real code.
# Intended: create three functions that print 0, 1, 2 respectively
funcs = []
for i in range(3):
    def f():
        print(i)
    funcs.append(f)

funcs[0]() # 2 ← expected 0
funcs[1]() # 2 ← expected 1
funcs[2]() # 2 ← expected 2
All three functions print 2. Why?
The Explanation
Each function looks up i when it is called, not when it is defined. No snapshot of the value is taken at definition time. The loop creates one binding for i and each iteration reassigns it, so by the time any of the functions runs, i is 2.
Where that single binding is stored depends on where the loop lives. In the module-level snippet above, i is a global, so each f reads it with a global lookup and f.__closure__ is None. If the same loop runs inside a function, i is stored in one cell and all three closures hold a reference to that same cell.
The three function objects are different objects, but they all read the same variable. Either way the result is identical: every call sees the final value.
:::warning This is not a Python bug This behavior is correct and consistent. Closures capture variables by reference (via a shared cell), not by value. The surprise arises from the expectation that a snapshot is taken. Understanding this distinction is essential for working with decorators, callbacks, and async code. :::
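The following sketch, run as a script, shows the same late-binding behavior in both settings. The helper build and the names g, j, and module_funcs are made up for the illustration. Inside a function the loop variable lives in one shared cell; at module level it is a plain global and no cell exists.

```python
def build():
    funcs = []
    for i in range(3):
        def f():
            return i
        funcs.append(f)
    return funcs


funcs = build()
print([f() for f in funcs])              # [2, 2, 2]
cells = [f.__closure__[0] for f in funcs]
print(cells[0] is cells[1] is cells[2])  # True: one shared cell

# Module level: j is a global, so there is no closure cell at all
module_funcs = []
for j in range(3):
    def g():
        return j
    module_funcs.append(g)

print(module_funcs[0].__closure__)       # None when run at module level
print([g() for g in module_funcs])       # [2, 2, 2]
```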
Fix 1: Default Argument Capture (Snapshot at Definition Time)
funcs = []
for i in range(3):
    def f(i=i): # default arg evaluates i at def-time
        print(i)
    funcs.append(f)

funcs[0]() # 0 correct
funcs[1]() # 1 correct
funcs[2]() # 2 correct
i=i in the default argument evaluates i at the moment def executes, creating a local parameter named i with that snapshot value. This is not a closure at all - i becomes a local with a default, so no cell is involved for that variable.
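The snapshot is visible on the function object itself. This sketch uses return instead of print so the results can be compared; it also shows one trade-off of the technique, which is that a caller can override the captured value by passing an argument.

```python
funcs = []
for i in range(3):
    def f(i=i):  # the default is evaluated once, when def runs
        return i
    funcs.append(f)

print([f.__defaults__ for f in funcs])  # [(0,), (1,), (2,)]
print(funcs[0].__code__.co_freevars)    # (): no free variables
print(funcs[0]())                       # 0
print(funcs[0](99))                     # 99: the caller can override the snapshot
```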
Fix 2: functools.partial
from functools import partial

def f(i):
    print(i)

funcs = [partial(f, i) for i in range(3)]
funcs[0]() # 0 correct
funcs[1]() # 1 correct
funcs[2]() # 2 correct
partial binds arguments at creation time. The value is baked into the partial object immediately, no closure cell involved.
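A partial object exposes what it has bound through its func, args, and keywords attributes, which makes the stored value easy to verify. The function describe below is a hypothetical stand-in for f.

```python
from functools import partial


def describe(i):
    return f"item {i}"


funcs = [partial(describe, i) for i in range(3)]
first = funcs[0]

print(first.func is describe)  # True
print(first.args)              # (0,)
print(first.keywords)          # {}
print([f() for f in funcs])    # ['item 0', 'item 1', 'item 2']
```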
Fix 3: Factory Function (Most Explicit)
def make_f(captured_i):
    def f():
        print(captured_i)
    return f

funcs = [make_f(i) for i in range(3)]
funcs[0]() # 0 correct
funcs[1]() # 1 correct
funcs[2]() # 2 correct
Each call to make_f(i) creates a new scope with its own captured_i cell, taking a snapshot of the current value of i as an argument.
Part 4 - The nonlocal Keyword
Without nonlocal, you can read an enclosing variable but you cannot assign to it. Assignment creates a new local variable instead, shadowing the outer one.
def outer():
    x = 10
    def inner():
        x = 20 # creates a NEW local x, does not touch outer's x
        print(x)
    inner()
    print(x)

outer()
# 20 ← inner's local x
# 10 ← outer's x is unchanged
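A related pitfall appears when the inner function reads the name before assigning to it, as an augmented assignment does. The sketch below, with the hypothetical name make_broken_counter, shows the resulting error.

```python
def make_broken_counter():
    count = 0

    def increment():
        # The assignment makes count local to increment, so the read
        # half of += happens before any local value exists
        count += 1
        return count

    return increment


broken = make_broken_counter()
error_name = None
try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)  # UnboundLocalError
```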
nonlocal declares that an assignment should target the enclosing scope's variable, writing through the shared cell:
def outer():
    x = 10
    def inner():
        nonlocal x
        x = 20 # modifies outer's x via the shared cell
        print(x)
    inner()
    print(x)

outer()
# 20
# 20 ← outer's x was modified
nonlocal Walks All Enclosing Scopes
nonlocal searches upward through all enclosing function scopes until it finds the name. It does not jump to the global namespace:
def level1():
    x = "level1"
    def level2():
        # x is not defined in level2
        def level3():
            nonlocal x # finds x in level1, skips level2
            x = "modified by level3"
        level3()
        print(x) # modified by level3
    level2()

level1()
# modified by level3
:::note nonlocal vs global
nonlocal targets the nearest enclosing function scope that defines the variable. global targets the module-level namespace. You cannot use nonlocal to reach a global - use global for that. Using global inside deeply nested functions is generally a design smell; prefer returning values or using a mutable container.
:::
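Both points in the note can be tried out. The first half of this sketch compiles a snippet in which nonlocal names a module-level variable and records the SyntaxError (the message text shown is CPython's). The second half shows the mutable-container alternative, where the inner function mutates a dict and never rebinds a name. The names source, state, and make_counter are chosen for illustration.

```python
source = """
x = 1
def f():
    nonlocal x
    x = 2
"""

message = None
try:
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    message = exc.msg

print(message)  # on CPython: no binding for nonlocal 'x' found


def make_counter():
    state = {"count": 0}  # mutated in place, never rebound

    def increment():
        state["count"] += 1  # no nonlocal needed: this is item assignment
        return state["count"]

    return increment


counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```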
Part 5 - Function Factories
A function factory is a function that returns other functions, each customized by the factory's arguments. Closures make function factories practical.
Range Validator Factory
def make_range_validator(min_val, max_val):
    """Return a validator for values in [min_val, max_val]."""
    def validate(value):
        if not (min_val <= value <= max_val):
            raise ValueError(
                f"Value {value} is outside [{min_val}, {max_val}]"
            )
        return value
    return validate

validate_age = make_range_validator(0, 150)
validate_percentage = make_range_validator(0.0, 1.0)
validate_port = make_range_validator(1, 65535)
print(validate_age(25)) # 25
print(validate_percentage(0.75)) # 0.75
print(validate_port(8080)) # 8080
try:
    validate_age(200)
except ValueError as e:
    print(e) # Value 200 is outside [0, 150]
Multiplier Factory
def make_multiplier(factor):
    def multiply(x):
        return x * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)
percent = make_multiplier(0.01)
print(double(7)) # 14
print(triple(7)) # 21
print(percent(75)) # 0.75
Configuration-Bound Logger Factory
import datetime

def make_logger(prefix, level="INFO"):
    """Return a logging function bound to a prefix and level."""
    def log(message):
        ts = datetime.datetime.now().strftime("%H:%M:%S")
        print(f"[{ts}] [{level}] [{prefix}] {message}")
    return log

db_log = make_logger("DATABASE")
api_log = make_logger("API", level="DEBUG")
db_log("Connection established")
# [14:23:01] [INFO] [DATABASE] Connection established
api_log("Request received: GET /users")
# [14:23:01] [DEBUG] [API] Request received: GET /users
Each logger captures prefix and level in its own cells, independent of all other loggers created by the same factory.
Part 6 - Closures vs Classes
Both closures and classes can encapsulate state. How do you choose between them?
Closure Approach - Simple Counter
def make_accumulator():
    total = 0
    def add(x):
        nonlocal total
        total += x
        return total
    return add

acc = make_accumulator()
print(acc(10)) # 10
print(acc(5)) # 15
print(acc(20)) # 35
Class Approach - Same Behavior
class Accumulator:
    def __init__(self):
        self.total = 0
    def __call__(self, x):
        self.total += x
        return self.total

acc = Accumulator()
print(acc(10)) # 10
print(acc(5)) # 15
print(acc(20)) # 35
Decision Guide
| Choose a Closure when | Choose a Class when |
|---|---|
| You have 1–2 pieces of state | You have multiple methods (multiple behaviors) |
| You need a single callable (one behavior) | You need to inspect or modify state from outside |
| You want to avoid class boilerplate | You need inheritance or mixins |
| The object is short-lived or intermediate | You need pickling or a readable repr (__getstate__, __repr__) |
| Writing a decorator or event handler | The concept has a meaningful domain identity |
:::tip The Pragmatic Rule If you only need one callable behavior and fewer than three pieces of state, a closure is cleaner. If you need multiple methods or your state grows complex, write a class. Do not be dogmatic - choose what reads clearly six months from now. :::
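The row about inspecting state from outside is easy to see in practice. This sketch repeats both accumulators so it runs on its own, then compares how an outside caller reads and resets the running total. The variable names closure_acc and class_acc are only for illustration.

```python
def make_accumulator():
    total = 0

    def add(x):
        nonlocal total
        total += x
        return total

    return add


class Accumulator:
    def __init__(self):
        self.total = 0

    def __call__(self, x):
        self.total += x
        return self.total


closure_acc = make_accumulator()
class_acc = Accumulator()
closure_acc(10)
class_acc(10)

# Class: state is an ordinary, documented attribute
print(class_acc.total)                           # 10
# Closure: state is reachable only through introspection
print(closure_acc.__closure__[0].cell_contents)  # 10

# Resetting from outside is natural for the class
class_acc.total = 0
print(class_acc(5))                              # 5
```

If callers need to reach into the state like this on a regular basis, that is a sign the class is the better fit.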
Part 7 - Closures Inside Decorators
Every decorator that wraps a function is a closure. The wrapper function captures the original function as a free variable.
import time

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs) # func is a free variable
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timer
def slow_sum(n):
    return sum(range(n))

slow_sum(10_000_000)
# slow_sum took 0.3142s (exact timing varies by machine)
The decorator timer returns wrapper. Inside wrapper, func is a free variable captured from timer's scope. This is a closure: wrapper.__closure__ contains a cell holding the original slow_sum function.
import time

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

@timer
def add(a, b):
    return a + b

print(add.__closure__) # (<cell at 0x...>,)
print(add.__closure__[0].cell_contents) # <function add at 0x...>
print(add.__code__.co_freevars) # ('func',)
After decoration, the name add in the module namespace points to wrapper. The original add function lives inside wrapper's closure cell.
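One consequence of the name now pointing to wrapper is that the function's metadata changes too. The sketch below compares a bare wrapper with one decorated by functools.wraps from the standard library. The decorator names plain and preserving are invented for the comparison.

```python
import functools


def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    @functools.wraps(func)  # copies __name__, __doc__, sets __wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def add(a, b):
    """Add two numbers."""
    return a + b


@preserving
def mul(a, b):
    """Multiply two numbers."""
    return a * b


print(add.__name__, add.__doc__)  # wrapper None
print(mul.__name__, mul.__doc__)  # mul Multiply two numbers.
print(mul.__wrapped__(2, 3))      # 6: the original function
print(mul.__code__.co_freevars)   # ('func',): still a closure
```

functools.wraps changes only the metadata. The wrapper remains a closure over func in both cases.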
Part 8 - Memoization with a Closure Cache
One of the most practical closure patterns is hand-rolled memoization:
def memoize(func):
    cache = {} # captured as a free variable in wrapper's closure
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10)) # 55
print(fibonacci(50)) # 12586269025 - computed in microseconds with memoization
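Each decorated function gets its own cache dict - created fresh each time memoize is called. The cache lives in the closure and is reachable from outside only through wrapper.__closure__. functools.lru_cache follows the same idea and adds keyword-argument support, a bounded LRU eviction policy, and thread safety; CPython ships a C implementation of it, with a pure-Python fallback built on closures. Note that this hand-rolled version only accepts hashable positional arguments.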
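For comparison, here is the same Fibonacci function using the standard library decorator. The maxsize of 128 is an arbitrary choice for this sketch; cache_info() is part of the documented lru_cache interface.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(50))  # 12586269025

info = fibonacci.cache_info()
print(info.misses)    # 51: one computation for each n from 0 to 50
print(info.currsize)  # 51: entries currently cached
```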
Part 9 - Memory: Closures Keep Enclosing Scopes Alive
A closure holds a reference to its cell objects. As long as the closure object is reachable, the cells (and the values inside them) remain alive - even if the outer function's frame is long gone.
def load_large_dataset():
    # The list's pointer array alone is ~80 MB on 64-bit CPython;
    # the int objects it references add considerably more
    data = list(range(10_000_000))
    def get_item(i):
        return data[i] # data captured in closure cell
    return get_item

getter = load_large_dataset()
# The whole dataset stays in memory as long as `getter` exists!
print(getter(0)) # 0
del getter
# Now the closure is deleted, the cell is released, and data can be freed
:::warning Closure Memory Leaks If you store closures in long-lived data structures - global caches, class attributes, event registries - the objects captured in their cells remain alive for as long as those closures are reachable. This is a common source of memory leaks in long-running servers and GUI applications.
Always ask: how long does this closure live? Does it hold a reference to something large that you expect to be freed? :::
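One way to observe this lifetime is to watch the captured object through a weak reference, which does not keep its target alive. In this sketch, Dataset is a hypothetical stand-in for a large object and probe is the weak reference. The gc.collect() calls are there for interpreters without reference counting.

```python
import gc
import weakref


class Dataset:
    """Hypothetical stand-in for a large object."""

    def __init__(self, size):
        self.items = list(range(size))


def make_getter():
    data = Dataset(1000)
    probe = weakref.ref(data)  # observes data without keeping it alive

    def get_item(i):
        return data.items[i]   # data is captured in a closure cell

    return get_item, probe


getter, probe = make_getter()
gc.collect()
alive_while_closure_exists = probe() is not None
print(alive_while_closure_exists)  # True: the cell keeps the Dataset alive
print(getter(3))                   # 3

del getter
gc.collect()
freed_after_delete = probe() is None
print(freed_after_delete)          # True: no closure, no cell, no Dataset
```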
Event Handler Factories - A Real-World Example
# Pattern used in GUI frameworks (tkinter, PyQt) and async event systems
def make_handler(button_label, action_func):
    """Create a click handler that remembers its button and action."""
    def on_click(event):
        print(f"Button '{button_label}' clicked")
        action_func(event)
    return on_click

def save_file(event):
    print("Saving...")

def open_file(event):
    print("Opening...")

save_handler = make_handler("Save", save_file)
open_handler = make_handler("Open", open_file)
# Simulate button clicks
save_handler(None)
# Button 'Save' clicked
# Saving...
open_handler(None)
# Button 'Open' clicked
# Opening...
Each handler captures button_label and action_func in its own closure cells. No global state, no class needed.
Interview Questions
Q1: What is a closure in Python?
Answer: A closure is an inner function that retains access to variables from its enclosing function's scope even after the enclosing function has returned. CPython implements this by storing each captured variable in a cell object on the heap rather than directly in the frame; the compiler decides which variables need cells when it compiles the outer function. The inner function object's __closure__ attribute holds a tuple of these cell objects, and __code__.co_freevars holds the corresponding variable names. Variables captured in closure cells are called free variables. As long as the closure object is alive, its cells are alive, keeping the captured values from being garbage collected.
Q2: What is the classic loop variable capture bug and how do you fix it?
Answer: When you define a closure inside a loop, all closures share a reference to the same loop variable cell. After the loop ends, that cell holds the final value, so all closures return the same result regardless of which iteration created them.
Three fixes:
- Default argument capture - def f(i=i): evaluates i at definition time, creating a local parameter with a snapshot value.
- functools.partial - partial(f, i) binds the argument immediately at creation time.
- Factory function - wrap the closure in an outer function call that receives the loop variable as an argument, creating a fresh scope per iteration.
Q3: What does nonlocal do and why is it needed?
Answer: nonlocal declares that an assignment statement inside an inner function should target a variable in the nearest enclosing function scope, rather than creating a new local variable. Without nonlocal, any assignment x = ... inside a nested function creates a fresh local variable, shadowing the outer one (and causing UnboundLocalError if you read x before assigning). nonlocal tells the compiler to emit STORE_DEREF instead of STORE_FAST, which writes through the shared cell object. It is only valid inside nested functions; to modify a module global, use global.
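The difference in emitted instructions can be checked with the dis module. The function names shadowing and rebinding and the helper store_ops are invented for this sketch. Because newer CPython versions may fuse STORE_FAST with a neighboring instruction, the store lists are printed without listing an expected output.

```python
import dis


def outer():
    x = 10

    def shadowing():
        x = 20      # plain assignment: x is a new local
        return x

    def rebinding():
        nonlocal x
        x = 20      # writes through the shared cell
        return x

    return shadowing, rebinding


def store_ops(func):
    """Return the names of all STORE_* instructions in func."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("STORE_")
    ]


shadowing, rebinding = outer()
print(store_ops(shadowing))            # a STORE_FAST variant
print(store_ops(rebinding))            # includes STORE_DEREF
print(shadowing.__closure__)           # None: it captures nothing
print(rebinding.__code__.co_freevars)  # ('x',)
```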
Q4: How does Python's __closure__ attribute work?
Answer: __closure__ is a tuple of cell objects, one per free variable, in the same order as __code__.co_freevars. A cell object is a thin heap-allocated wrapper with a cell_contents attribute holding the captured value. Both the outer function (while executing) and the inner function access the same cell object, so mutations made by one are immediately visible to the other. If the function has no free variables, __closure__ is None.
Q5: When should you use a closure instead of a class?
Answer: Use a closure when you need a single callable with one or two pieces of persistent state, and the concept is lightweight - decorators, function factories, event handlers, and simple caches are natural fits. Use a class when you need multiple methods (distinct behaviors), when external code needs to inspect or modify state directly, when inheritance is required, or when the concept has enough identity and complexity to warrant a named type. In production codebases, simple decorators and small caches are typically closures (the pure-Python fallback of functools.lru_cache is built this way); database connection pools, ML models, and API clients use classes.
Q6: How can closures cause memory leaks and how do you prevent them?
Answer: A closure holds a reference to all its cell objects. If a large object (a dataset, a loaded model, a socket) is captured in a cell, and the closure is stored in a long-lived structure (a global registry, a class attribute, an unbounded cache), the large object remains alive indefinitely - even if you assume you are done with it.
Prevention strategies:
- Capture only the fields you need: store config.timeout instead of the entire config object.
- Use weakref.ref for objects that should not be kept alive solely by the closure.
- Explicitly remove closures from registries when their work is done.
- Use functools.lru_cache with a bounded maxsize to cap memory usage.
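The first strategy can be sketched as follows. Config, make_checker_heavy, and make_checker_light are hypothetical names. A weak reference is used only to observe whether the config object is still alive once the caller drops its own reference.

```python
import gc
import weakref


class Config:
    """Hypothetical config carrying one small field and one large one."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.blob = bytearray(1_000_000)


def make_checker_heavy(config):
    def is_slow(elapsed):
        return elapsed > config.timeout  # captures the whole config
    return is_slow


def make_checker_light(config):
    timeout = config.timeout             # copy out just the field
    def is_slow(elapsed):
        return elapsed > timeout         # captures only a float
    return is_slow


heavy_cfg = Config(timeout=2.0)
heavy_probe = weakref.ref(heavy_cfg)
heavy = make_checker_heavy(heavy_cfg)
del heavy_cfg
gc.collect()
print(heavy_probe() is not None)  # True: the closure pins the Config

light_cfg = Config(timeout=2.0)
light_probe = weakref.ref(light_cfg)
light = make_checker_light(light_cfg)
del light_cfg
gc.collect()
print(light_probe() is None)      # True: the Config was freed
print(light(3.0))                 # True: the checker still works
```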
Practice Challenges
Beginner: Build a Simple Function Factory
Write a function make_power(exp) that returns a new function. The returned function should accept a number x and return x raised to exp.
Create square = make_power(2), cube = make_power(3), and sqrt = make_power(0.5). Verify the closures are independent and inspect their __closure__ contents.
Solution
def make_power(exp):
    """Return a function that raises its argument to exp."""
    def power(x):
        return x ** exp
    return power

square = make_power(2)
cube = make_power(3)
sqrt = make_power(0.5)
print(square(4)) # 16
print(cube(3)) # 27
print(sqrt(16)) # 4.0
# Verify these are independent closures with different cells
print(square.__code__.co_freevars) # ('exp',)
print(square.__closure__[0].cell_contents) # 2
print(cube.__closure__[0].cell_contents) # 3
print(sqrt.__closure__[0].cell_contents) # 0.5
# Works with any numeric type
print(square(2.5)) # 6.25
print(cube(-2)) # -8
print(sqrt(2)) # 1.4142135623730951
Intermediate: Fix the Loop Bug and Build a Dispatcher
Below is broken code that tries to build a dispatcher mapping HTTP method names to handler functions. All handlers currently print the wrong method name. Fix it using two different approaches, then build a route(method, path) function that dispatches calls.
# Broken version - all handlers print "DELETE"
methods = ["GET", "POST", "PUT", "DELETE"]
handlers = []
for method in methods:
    def handler(path):
        print(f"Handling {method} {path}")
    handlers.append(handler)

for h in handlers:
    h("/api/users")
Solution
from functools import partial

methods = ["GET", "POST", "PUT", "DELETE"]

# --- Fix 1: default argument capture ---
handlers_v1 = []
for method in methods:
    def handler(path, method=method): # method=method snapshots value
        print(f"[v1] Handling {method} {path}")
    handlers_v1.append(handler)

for h in handlers_v1:
    h("/api/users")
# [v1] Handling GET /api/users
# [v1] Handling POST /api/users
# [v1] Handling PUT /api/users
# [v1] Handling DELETE /api/users
print()

# --- Fix 2: factory function (cleanest approach) ---
def make_handler(method):
    def handler(path):
        print(f"[v2] Handling {method} {path}")
    return handler

# Build a dispatch table
dispatch = {method: make_handler(method) for method in methods}

def route(method, path):
    if method in dispatch:
        dispatch[method](path)
    else:
        print(f"405 Method Not Allowed: {method}")

route("GET", "/api/users") # [v2] Handling GET /api/users
route("POST", "/api/orders") # [v2] Handling POST /api/orders
route("DELETE", "/api/items/5") # [v2] Handling DELETE /api/items/5
route("PATCH", "/api/users/1") # 405 Method Not Allowed: PATCH
Advanced: Closure-Based Memoization with Time-to-Live (TTL)
Build a memoize_ttl(seconds) decorator factory. When applied to a function, it caches results for the given number of seconds. After the TTL expires for a given set of arguments, the next call recomputes and re-caches the result. The cache must be per-decorated-function (not shared globally), and must support both positional and keyword arguments.
Solution
import time
import functools

def memoize_ttl(seconds):
    """
    Decorator factory: cache call results for `seconds` seconds.
    Each decorated function gets its own independent cache (closure).
    """
    def decorator(func):
        # cache maps key -> (result, expiry_timestamp)
        # `cache` is a free variable in `wrapper`'s closure
        cache = {}
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a hashable cache key from all arguments
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            if key in cache:
                result, expiry = cache[key]
                if now < expiry:
                    return result
                # TTL expired - fall through to recompute
            result = func(*args, **kwargs)
            cache[key] = (result, now + seconds)
            return result
        # Expose cache internals for testing / monitoring
        wrapper.cache = cache
        wrapper.cache_clear = cache.clear
        wrapper.cache_info = lambda: {
            "size": len(cache),
            "ttl_seconds": seconds,
        }
        return wrapper
    return decorator

# --- Demo ---
call_count = 0

@memoize_ttl(seconds=2)
def fetch_user(user_id):
    """Simulate a slow database lookup (~50 ms)."""
    global call_count
    call_count += 1
    time.sleep(0.05) # simulate I/O latency
    return {"id": user_id, "name": f"User_{user_id}"}

print(fetch_user(1)) # computed (cache miss)
print(fetch_user(1)) # instant (cache hit)
print(fetch_user(2)) # computed (different args)
print(fetch_user(1)) # instant (still within TTL)
print(f"DB calls so far: {call_count}") # 2
print(f"Cache info: {fetch_user.cache_info()}") # size: 2
print("\nWaiting for TTL to expire...")
time.sleep(2.1)
print(fetch_user(1)) # TTL expired - recomputes
print(fetch_user(1)) # instant (fresh cache)
print(f"Total DB calls: {call_count}") # 3
# Output:
# {'id': 1, 'name': 'User_1'}
# {'id': 1, 'name': 'User_1'}
# {'id': 2, 'name': 'User_2'}
# {'id': 1, 'name': 'User_1'}
# DB calls so far: 2
# Cache info: {'size': 2, 'ttl_seconds': 2}
#
# Waiting for TTL to expire...
# {'id': 1, 'name': 'User_1'}
# {'id': 1, 'name': 'User_1'}
# Total DB calls: 3
Quick Reference
| Concept | Syntax / Example | Notes |
|---|---|---|
| Create a closure | def outer(): x=1; def inner(): return x; return inner | x is a free variable |
| Inspect free variable names | f.__code__.co_freevars | Tuple of captured names |
| Inspect cell value | f.__closure__[0].cell_contents | One cell per free variable |
| Check if closure exists | f.__closure__ is not None | None means no free vars |
| Mutate enclosing variable | nonlocal x inside inner function | Required for assignment |
| Loop capture fix - snapshot | def f(i=i): | Default arg captures value now |
| Loop capture fix - partial | partial(f, i) | from functools import partial |
| Loop capture fix - factory | def make_f(v): def f(): return v; return f | New scope per iteration |
| Function factory | def make_adder(n): def add(x): return x+n; return add | Returns configured function |
| Closure in decorator | def deco(func): def wrapper(*a,**k): ...; return wrapper | func is the free variable |
| Memoize with closure | cache = {}; def wrapper(*args): ... | Cache lives in closure cell |
Key Takeaways
- A closure is an inner function plus the cell objects holding its free variables - these cells live on the heap, not on the stack, so they outlive the enclosing function's execution
- __closure__ is a tuple of cell objects; __code__.co_freevars names them; cell.cell_contents holds the value - you can inspect this in any Python REPL
- Closures capture by reference via a shared cell, not by value - the classic loop bug arises from all closures sharing one loop-variable cell
- Fix loop capture with default arguments (i=i), functools.partial, or a factory function - each approach creates a snapshot or a fresh scope at definition time
- nonlocal lets an inner function write to an enclosing cell; without it, any assignment creates a new local variable, shadowing the outer one
- Closures are the mechanism behind decorators - the wrapper function captures the original function as a free variable in its __closure__
- Every closure that captures a large object keeps that object alive; be intentional about closure lifetimes in long-running servers and event-driven systems
