Skip to main content

Tuples and Immutability - Engineering with Stable Data

Reading time: ~20 minutes | Level: Foundation → Engineering

Here is a question that reveals whether you actually understand Python's type system.

t = ([1, 2], [3, 4])

t[0].append(99)

print(t)
print(hash(t))

Most engineers expect this to either succeed completely or fail completely. The actual output:

([1, 2, 99], [3, 4])
TypeError: unhashable type: 'list'

The tuple "changed" - and yet it is still immutable. The hash() call failed - and yet the tuple exists and its structure was never reassigned.

These two results expose a profound distinction: immutability is not the same as constancy, and hashability requires deep immutability. Understanding why is the foundation of reliable Python engineering.

What You Will Learn

  • How CPython implements tuples as PyTupleObject - the actual C struct and why it differs structurally from PyListObject
  • Why creating a tuple literal is faster than creating a list - compile-time constant folding in the bytecode compiler
  • The critical distinction between shallow immutability (the tuple's pointers cannot change) and deep constancy (nothing reachable can change)
  • How __hash__ works, why -1 is a forbidden hash value, and the exact conditions under which a tuple is unhashable
  • Named tuples with collections.namedtuple and typing.NamedTuple - what each generates and when to use each
  • Structural unpacking and * splat unpacking patterns engineers actually use in production
  • The tuple-as-return-value pattern and why it is Python's idiomatic multiple-return mechanism
  • When to choose tuple over list: semantic intent, API contracts, dict keys, and measurable memory savings

Prerequisites

  • Python variables and name binding (Module 3, Lesson 4)
  • Python lists internals (Module 4, Lesson 1)
  • Basic understanding of Big-O notation and pointer arithmetic

Part 1 - The Mental Model: What a Tuple Actually Is

What Most Engineers Think

Most engineers describe a tuple as "a list you cannot modify." This is approximately correct but misses everything important about how tuples work and why they exist.

A more precise definition: a tuple is a fixed-length, ordered sequence of object references whose reference array is allocated once at creation time and can never be reallocated, extended, or modified.

The CPython Internal Structure

In CPython, lists are defined in listobject.h and tuples in tupleobject.h. The difference is architecturally significant:

/* PyListObject - dynamic array, designed to grow */
typedef struct {
PyObject_VAR_HEAD
PyObject **ob_item; /* pointer to the pointer array on heap - can be realloc'd */
Py_ssize_t allocated; /* total capacity, usually > ob_size */
} PyListObject;

/* PyTupleObject - fixed-length inline array */
typedef struct {
PyObject_VAR_HEAD
PyObject *ob_item[1]; /* flexible array member - pointers stored INSIDE the struct */
} PyTupleObject;

The structural difference is decisive. Here is what memory looks like for each:

Key observations:

  1. The list stores its pointer array in a separately heap-allocated block - this block can be realloc'd to grow. The PyListObject struct holds a PyObject ** pointer to this block.
  2. The tuple stores its pointers inline, inside the struct itself using a C99 flexible array member (ob_item[1]). One allocation, zero possibility of realloc.
  3. The list has an allocated field (capacity, always >= ob_size). The tuple has no such field - ob_size is both the length and the total capacity simultaneously.
  4. Because there is no realloc mechanism, tuples are structurally immutable - not by convention, but because there is no C code path to extend them.

Part 2 - Memory: Tuples Are Smaller Than Lists

import sys

# Same integer data, different containers
lst = [1, 2, 3, 4, 5]
tup = (1, 2, 3, 4, 5)

print(sys.getsizeof(lst)) # 104 bytes (64-bit CPython 3.11+)
print(sys.getsizeof(tup)) # 80 bytes

The breakdown explains the difference:

FieldList (64-bit CPython)Tuple (64-bit CPython)
ob_refcnt8 bytes (PyObject_VAR_HEAD)8 bytes
*ob_type8 bytes8 bytes
ob_size8 bytes8 bytes
allocated8 bytes (tuples have NO allocated field)-
*ob_item ptr8 bytes (pointer to separate block)-
ob_item[5]-40 bytes (5 pointers x 8 bytes, inline in struct)
struct total40 bytes + separate pointer block (with over-alloc padding)72 bytes inline
sys.getsizeof~104 bytes for 5 elements~80 bytes

:::tip Memory Savings at Scale If you have 10 million coordinate pairs in a data processing pipeline stored as tuples instead of lists, you save roughly 240 MB in struct overhead alone - before accounting for over-allocation slack in lists. That determines whether your application fits in RAM. :::

Measuring the Difference

import sys

for n in range(0, 11, 2):
lst = list(range(n))
tup = tuple(range(n))
diff = sys.getsizeof(lst) - sys.getsizeof(tup)
print(f"n={n:2d}: list={sys.getsizeof(lst):4d}B tuple={sys.getsizeof(tup):4d}B saved={diff:+d}B")

# Output (64-bit CPython 3.11):
# n= 0: list= 56B tuple= 40B saved=+16B
# n= 2: list= 72B tuple= 56B saved=+16B
# n= 4: list= 88B tuple= 72B saved=+16B
# n= 6: list= 104B tuple= 88B saved=+16B
# n= 8: list= 120B tuple= 104B saved=+16B
# n=10: list= 152B tuple= 120B saved=+32B

The tuple grows by exactly 8 bytes per element - one pointer, perfectly linear. The list grows in jumps because of over-allocation. The gap widens as lists trigger reallocations into larger capacity blocks.

Part 3 - Why Tuple Creation Is Faster: Compile-Time Constants

import timeit

list_time = timeit.timeit("[1, 2, 3, 4, 5]", number=10_000_000)
tuple_time = timeit.timeit("(1, 2, 3, 4, 5)", number=10_000_000)

print(f"List: {list_time:.3f}s")
print(f"Tuple: {tuple_time:.3f}s")

# Typical output:
# List: 0.520s
# Tuple: 0.072s ← roughly 7x faster

The reason is visible in the bytecode:

import dis

print("=== [1, 2, 3] ===")
dis.dis(compile("[1, 2, 3]", "<string>", "eval"))

print("\n=== (1, 2, 3) ===")
dis.dis(compile("(1, 2, 3)", "<string>", "eval"))
=== [1, 2, 3] ===
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_LIST 3 ← allocates a new list object at RUNTIME
8 RETURN_VALUE

=== (1, 2, 3) ===
1 0 LOAD_CONST 0 ((1, 2, 3)) ← entire tuple is ONE pre-built constant
2 RETURN_VALUE

The Python bytecode compiler recognizes that (1, 2, 3) - a tuple whose elements are all compile-time constants - is itself a constant. It evaluates the tuple once at compile time and stores the result as a bytecode constant. Every execution is a single LOAD_CONST - no allocation, no pointer writes, just loading a pre-existing object.

[1, 2, 3] cannot be a compile-time constant because lists are mutable. If multiple parts of code shared the same list literal, mutating it in one place would corrupt all other users. So Python must create a fresh list on every execution.

:::note CPython Empty Tuple Singleton CPython caches the empty tuple as a global singleton. Every expression () or call to tuple() that produces an empty tuple returns the exact same object - no allocation:

a = ()
b = ()
print(a is b) # True - identical object, not just equal

Functions that return () on their fast path pay zero allocation cost. :::

Part 4 - Immutability Is Not Constancy: The Most Important Distinction

This is the lesson's central insight. Read it carefully.

t = ([1, 2], [3, 4])

# The tuple's pointer slots cannot be reassigned
t[0] = [99, 100] # TypeError: 'tuple' object does not support item assignment

# But the LIST that t[0] points to can still be mutated
t[0].append(99)
print(t) # ([1, 2, 99], [3, 4]) ← tuple "changed"!

The tuple guarantees that its internal reference array is immutable - you cannot make t[0] point to a different object. But the tuple makes no guarantee about the objects those references point to. If those objects are mutable, they can still be mutated through their own interface.

The pointers in the tuple are unchanged. The list object at the other end changed.

This is called shallow immutability. The tuple guarantees its own pointer slots are frozen; it makes no guarantees about what those pointers point to.

Deep immutability - where nothing reachable through the structure can change - requires that every element also be immutable: integers, strings, other tuples with only immutable contents, frozensets.

:::danger Hashability Consequence This distinction has a critical practical consequence. A tuple containing mutable objects cannot be hashed and therefore cannot be used as a dict key or set member:

t = ([1, 2], [3, 4])
hash(t) # TypeError: unhashable type: 'list'

# But this works - all elements are immutable
t2 = ((1, 2), (3, 4))
hash(t2) # Works: 3713082716806266542 (session-specific)

:::

Part 5 - Hashability: Why Dicts and Sets Need It

How __hash__ Works

Every Python object usable as a dict key or set member must implement __hash__. The contract:

  1. Objects that compare equal must have equal hashes: a == b implies hash(a) == hash(b)
  2. A hash must be stable - it must not change during an object's lifetime
  3. Hash collisions are allowed - different objects can share the same hash value

CPython computes a tuple's hash by combining the hashes of all its elements using a mixing algorithm derived from xxHash:

# Simplified conceptual approximation of CPython's tuplehash()
def tuple_hash_conceptual(t):
acc = 0x27D4EB2F165667C5 # seeded with Py_HashSecret
for item in t:
item_hash = hash(item) # raises TypeError if item is unhashable
acc = acc ^ (item_hash * 0x517CC1B727220A95 + 1) # simplified mixing
result = acc if acc != -1 else -2 # -1 is reserved
return result

Why -1 Is a Forbidden Hash Return Value

In CPython, C functions return -1 to signal an error - it is the standard "exception was raised" sentinel for C-level hash computations. If a legitimate hash happened to equal -1, the interpreter would misinterpret it as an error. CPython solves this by mapping -1 to -2 before returning to Python:

class NegativeOneHash:
def __hash__(self):
return -1 # Python will store this as -2

obj = NegativeOneHash()
print(hash(obj)) # -2 (CPython transparently remapped -1 → -2)

Verifying Hashability

# All of these work - every element is hashable
hash((1, 2, 3)) # integers are hashable
hash(("alice", 42, 3.14)) # str, int, float all hashable
hash(((1, 2), (3, 4))) # nested tuples of ints - fine
hash((frozenset({1, 2}), "hello")) # frozenset is hashable

# All of these fail - unhashable element somewhere in the structure
hash((1, [2, 3])) # TypeError: unhashable type: 'list'
hash(({"key": "val"},)) # TypeError: unhashable type: 'dict'
hash(((1, [2]),)) # TypeError: list nested inside inner tuple

# Practical consequence: dict keys
cache = {}
cache[(42, "us-east-1")] = "data" # Works - both elements hashable
cache[(42, ["us-east-1"])] = "data" # TypeError at this line

Part 6 - Named Tuples: Tuples with Semantic Labels

Anonymous tuples like (42.0, -73.5) carry no information about meaning. Named tuples add field names while remaining fully tuple-compatible.

collections.namedtuple - The Classic Factory Function

from collections import namedtuple

# Define the type (generates a new class at runtime)
Point = namedtuple("Point", ["x", "y"])

p = Point(x=3.0, y=4.0)

# Access by name (readable) or by index (tuple-compatible)
print(p.x) # 3.0
print(p[0]) # 3.0 ← still a real tuple
print(p) # Point(x=3.0, y=4.0)

# Fully compatible with all tuple protocols
x, y = p # unpacking works
print(len(p)) # 2
print(hash(p)) # hashable - it is a tuple

# Useful class methods
p2 = p._replace(y=7.0) # returns NEW Point with y changed (immutable!)
print(p2) # Point(x=3.0, y=7.0)
print(p) # Point(x=3.0, y=4.0) ← original unchanged

# _asdict: convert to a regular dict
print(p._asdict()) # {'x': 3.0, 'y': 4.0}

# _fields: introspect field names
print(Point._fields) # ('x', 'y')

# _make: create from any iterable
p3 = Point._make([5.0, 6.0])

What does namedtuple generate? It dynamically creates a class that inherits from tuple. The named fields are property descriptors on the class that call self[index] - zero extra memory per instance:

typing.NamedTuple - The Modern Type-Safe Approach

from typing import NamedTuple

class HTTPResponse(NamedTuple):
status_code: int
body: str
headers: dict = {} # default values supported
is_cached: bool = False

# Construction - positional or keyword
r = HTTPResponse(status_code=200, body='{"ok": true}')
print(r)
# HTTPResponse(status_code=200, body='{"ok": true}', headers={}, is_cached=False)

# Named access
print(r.status_code) # 200
print(r.body) # '{"ok": true}'

# Still a real tuple
print(isinstance(r, tuple)) # True
print(r[0]) # 200

# Methods can be defined in the class body
class Vector(NamedTuple):
x: float
y: float

def magnitude(self) -> float:
return (self.x**2 + self.y**2) ** 0.5

def dot(self, other: "Vector") -> float:
return self.x * other.x + self.y * other.y

v = Vector(3.0, 4.0)
print(v.magnitude()) # 5.0
print(v.dot(Vector(1.0, 0.0))) # 3.0

When to Choose Each

ScenarioRecommended
Simple records in new codetyping.NamedTuple
Need type checking (mypy/pyright)typing.NamedTuple
Need default field valuestyping.NamedTuple
Need methods on the recordtyping.NamedTuple
Legacy code using string-based field namescollections.namedtuple
Dynamically generating field names at runtimecollections.namedtuple
Performance-critical hot loop, avoid property lookupplain tuple with known indices

Part 7 - Structural Unpacking and Splat Patterns

Basic Unpacking

point = (3.0, 4.0)
x, y = point # two-element unpack - must match exactly

rgb = (255, 128, 0)
r, g, b = rgb

# Nested unpacking
matrix_row = ((1, 2), (3, 4))
(a, b), (c, d) = matrix_row
print(a, b, c, d) # 1 2 3 4

# Unpacking in a for loop - idiomatic Python
points = [(0, 0), (1, 2), (3, 4)]
for x, y in points:
print(f"Point: ({x}, {y})")

Splat (*) Unpacking - Capturing Variable Lengths

first, *rest = (1, 2, 3, 4, 5)
print(first) # 1
print(rest) # [2, 3, 4, 5] ← rest is always a LIST, not a tuple

*init, last = (1, 2, 3, 4, 5)
print(init) # [1, 2, 3, 4]
print(last) # 5

first, *middle, last = (1, 2, 3, 4, 5)
print(first) # 1
print(middle) # [2, 3, 4]
print(last) # 5

# Practical: parse semantic version tuples
version = (3, 11, 4, "final")
major, minor, *rest = version
print(f"Python {major}.{minor}") # Python 3.11

:::note Starred Variable Is Always a List *rest in an unpack expression always captures into a regular Python list, even when unpacking a tuple. This is by design: the starred variable captures a variable number of elements, so a mutable list is the appropriate container. :::

Unpacking in Function Calls

def distance(x1, y1, x2, y2):
return ((x2 - x1)**2 + (y2 - y1)**2) ** 0.5

p1 = (0.0, 0.0)
p2 = (3.0, 4.0)

# With splat - clean and expressive
d = distance(*p1, *p2)
print(d) # 5.0

Swap Without a Temporary Variable

a, b = 10, 20
a, b = b, a # RHS packs into tuple (20, 10), then unpacks into a, b
print(a, b) # 20 10

# CPython actually optimizes the simple two-variable swap:
# It uses ROT_TWO bytecode - no heap tuple is constructed at all.

Part 8 - Tuple as Return Value: Python's Multiple Return Mechanism

Python functions return exactly one object. The idiomatic way to return multiple values is to return a tuple - Python's syntax makes this feel like native multiple returns:

import math

def polar_to_cartesian(r: float, theta: float) -> tuple[float, float]:
x = r * math.cos(theta)
y = r * math.sin(theta)
return x, y # implicitly packs into a tuple

# Unpack immediately
x, y = polar_to_cartesian(5.0, math.pi / 4)
print(f"x={x:.3f}, y={y:.3f}") # x=3.536, y=3.536

# Or keep as tuple
coords = polar_to_cartesian(5.0, math.pi / 4)
print(type(coords)) # <class 'tuple'>
print(coords) # (3.5355339059327378, 3.5355339059327373)

Using NamedTuple for Rich Return Types

When functions return more than two or three values, a named tuple beats a plain tuple for readability:

from typing import NamedTuple

class ModelMetrics(NamedTuple):
accuracy: float
precision: float
recall: float
f1_score: float

def evaluate_model(y_true, y_pred) -> ModelMetrics:
# ... compute metrics ...
return ModelMetrics(accuracy=0.92, precision=0.88, recall=0.90, f1_score=0.89)

metrics = evaluate_model(y_true, y_pred)

# Named access - self-documenting code
print(f"Accuracy: {metrics.accuracy:.1%}") # Accuracy: 92.0%
print(f"F1 Score: {metrics.f1_score:.3f}") # F1 Score: 0.890

# Legacy unpack still works
acc, prec, rec, f1 = metrics

:::tip API Design with NamedTuple When designing library functions that return multiple values, prefer typing.NamedTuple over plain tuples. Named tuples allow callers to use field names (self-documenting), survive refactoring when new optional fields with defaults are added, and appear with correct types in IDE auto-complete and type checkers. :::

Part 9 - When to Choose Tuple Over List

The choice is primarily a semantic decision about communicating intent, not a performance decision.

Choose TUPLE whenChoose LIST when
Data represents a fixed-arity record - e.g., (user_id, timestamp, ip)Collection of homogeneous items - e.g., [user_1, user_2, user_3]
Structure should not change - e.g., DB config: ("localhost", 5432)Items may be added or removed - e.g., an accumulator, a queue
Needed as dict key or set member - e.g., cache[(user_id, region)]Order matters but may change - e.g., sorted query results
API contract signals immutability - return (min_val, max_val)API contract signals mutability - return [result1, result2]
Row from a DB result set (fixed schema per row)Batch of items to process (count varies by batch)

The Dict Key Use Case - Definitive Example

# Tuple as composite cache key - lists cannot do this
from functools import lru_cache

@lru_cache(maxsize=1024)
def compute_route(origin: tuple[float, float],
destination: tuple[float, float]) -> float:
"""lru_cache requires hashable arguments - tuples work, lists do not."""
lat1, lon1 = origin
lat2, lon2 = destination
# expensive computation (route planning API, etc.)
return 42.7 # km

result = compute_route((37.7749, -122.4194), (34.0522, -118.2437))
print(result) # 42.7

# 2D grid indexing
grid: dict[tuple[int, int], str] = {}
for row in range(5):
for col in range(5):
grid[(row, col)] = f"cell_{row}_{col}"

print(grid[(2, 3)]) # cell_2_3

Semantic Signal in Type Annotations

Python's type system formalizes the semantic distinction:

# tuple[int, str, float] - exactly 3 elements, each with a specific type
# List[str] - variable number of strings

def parse_row(raw: str) -> tuple[int, str, float]:
# Returns a fixed-arity record
parts = raw.split(",")
return int(parts[0]), parts[1].strip(), float(parts[2])

def load_records(path: str) -> list[tuple[int, str, float]]:
# Variable number of fixed-arity records
with open(path) as f:
return [parse_row(line) for line in f]

Part 10 - CPython Small Tuple Optimization

CPython maintains a free list of recently deallocated small tuples for reuse. When a small tuple is freed, its memory is not immediately returned to the OS - it is kept on the free list. The next time a small tuple of the same size is created, CPython may reuse that memory instead of allocating fresh from the heap.

# Demonstrating free-list reuse (CPython implementation detail)
a = (1, 2, 3)
id_a = id(a)
del a # tuple is freed - may go to free list

b = (4, 5, 6) # new tuple of same size - may reuse freed memory
id_b = id(b)

print(id_a == id_b) # Often True on CPython - same memory was reused

:::note Implementation Detail Only Free-list reuse is specific to CPython. PyPy, Jython, GraalPy, and other Python implementations do not guarantee this behavior. Never write code that relies on tuple identity (is) - only equality (==). Identity is an observable side effect of memory allocation, not a language guarantee. :::

Interview Questions

Q1: What is the structural difference between PyListObject and PyTupleObject in CPython memory?

Answer: Both store arrays of 8-byte pointers to Python objects. The key difference is where that array lives. A PyListObject stores its pointer array in a separately heap-allocated block pointed to by its ob_item field - this block can be realloc'd when the list grows. The struct also has an allocated field tracking total capacity (always >= ob_size). A PyTupleObject uses a C99 flexible array member - the pointer array is stored inline, at the end of the struct itself in one contiguous allocation. There is no allocated field because there is no mechanism to grow the array. The result: tuples use one fewer allocation, have better cache locality for small sizes, consume less memory (no capacity field, no slack), and are physically incapable of being resized.

Q2: Why can a tuple contain a mutable object? Is that a contradiction?

Answer: It is not a contradiction - it is a precise feature. Tuple immutability applies exactly to the tuple's own reference slots: you cannot rebind t[0] to point to a different object, and you cannot add or remove slots. But the tuple makes no claim about the objects those references point to. If t[0] points to a list, t[0].append(x) mutates the list - you are sending a message to the list object through a pointer that happens to be stored in the tuple. The tuple's pointer to that list did not change; only the list's internal state changed. This is called shallow immutability. Deep immutability requires that all reachable objects also be immutable, which is not something Python enforces - it is the programmer's responsibility to ensure if needed.

Q3: Why is tuple creation from constant literals faster than list creation?

Answer: The Python bytecode compiler performs constant folding on tuple literals whose elements are all compile-time constants (integers, strings, floats, None, True, False, and other constant tuples). It evaluates (1, 2, 3) once at compile time and stores the resulting tuple object in the code object's co_consts array. At runtime, execution is a single LOAD_CONST bytecode instruction - no allocation, no pointer writes, just loading a pre-existing object. List literals cannot be constant-folded because lists are mutable: if the same list object were shared between all executions of the literal, mutating it anywhere would corrupt all users. A fresh list must be heap-allocated on every execution via BUILD_LIST. The speedup is typically 4-8x for small constant tuples.

Q4: Under what exact conditions is a tuple not hashable?

Answer: A tuple is not hashable when any element within it - at any depth - is not hashable. CPython computes a tuple's hash by recursively calling hash() on each element and mixing the results. If any hash() call raises TypeError (which all mutable built-in types do - lists, dicts, sets), the TypeError propagates out of the tuple's __hash__. So hash((1, (2, [3]))) fails because the inner list [3] is unhashable, even though it is nested two levels deep. The only way to guarantee a tuple is hashable is to ensure every element, and every element of every nested container, is immutable: integers, floats, strings, bytes, booleans, None, other fully-hashable tuples, and frozensets.

Q5: What does typing.NamedTuple give you that collections.namedtuple does not?

Answer: Both produce classes that inherit from tuple with named field access. typing.NamedTuple offers four additional capabilities: (1) Type annotations - field types are recorded and understood by static type checkers like mypy and Pyright, so point.x is known to be float rather than Any; (2) Default values - field: type = default works naturally in the class body; (3) Methods - you can define methods in the class body that operate on the record; (4) Readable syntax - the class-body form reads like a normal Python class, is easier to understand and extend, and appears correctly in IDE auto-complete. The runtime behavior and memory usage are identical - both produce tuple subclasses where named fields are property descriptors that call self[index].

Q6: In a high-throughput ML inference service, a function receives a feature vector as input and is called 50,000 times per second. The function uses @lru_cache. Should the vector be passed as a list or a tuple? Justify your answer with concrete technical reasons.

Answer: Pass it as a tuple. Three concrete reasons: (1) @lru_cache requires hashable arguments - lists are unhashable and will raise TypeError immediately when the decorator tries to use the argument as a cache key. Tuples of numeric values (floats, ints) are hashable. (2) Creation speed - if the feature vector is constructed from constants or fixed-size computations, tuple literals benefit from constant folding and over-allocation is avoided. For a vector assembled programmatically, tuple(vector_list) is a fast O(n) conversion that pays off if the cache hit rate is high. (3) Semantic correctness - a feature vector is a fixed-arity record of numeric features; it should not change after construction. Using a tuple communicates this contract to all readers and to type checkers, prevents accidental mutation inside the function, and enables the caching infrastructure. For truly hot paths, also consider numpy arrays with a hashable wrapper or pre-hashing the vector.

Practice Challenges

Beginner - Predict the Output

What does this code print, and why?

t = (1, 2, [3, 4])
t[2] += [5, 6]
print(t)
Solution

This code both raises a TypeError and modifies t. Here is the sequence:

  1. t[2] evaluates to the list [3, 4]
  2. [3, 4] += [5, 6] calls list.__iadd__, which calls t[2].extend([5, 6]) and mutates the list in-place to [3, 4, 5, 6], then returns the same list object
  3. Python then attempts to assign the returned list back to t[2] - this triggers tuple.__setitem__, which raises TypeError: 'tuple' object does not support item assignment

The mutation in step 2 happened before the assignment attempt in step 3. When the TypeError is raised, the list is already [3, 4, 5, 6].

t = (1, 2, [3, 4])
try:
t[2] += [5, 6]
except TypeError as e:
print(e) # 'tuple' object does not support item assignment
print(t) # (1, 2, [3, 4, 5, 6]) - list was mutated before error!

Lesson: Never use augmented assignment (+=, -=, *=, etc.) on mutable objects stored inside a tuple. The mutation side effect occurs before the assignment exception, leaving the data in a mutated state with an exception raised - a genuinely confusing outcome.

Intermediate - Build a Typed Record System

You are writing a function that parses CSV rows into typed records for a financial data pipeline. Each row contains: trade_id (int), symbol (str), price (float), quantity (int), side (str: "buy" or "sell").

Requirements:

  • Records must be usable as dict keys (for deduplication by trade_id)
  • Callers must be able to unpack: trade_id, symbol, price, qty, side = record
  • Type checkers must know that record.price is float
  • Records must be immutable after creation

Implement the record type and a parse_csv_row(line: str) -> TradeRecord function.

Solution
from typing import NamedTuple, Literal

class TradeRecord(NamedTuple):
trade_id: int
symbol: str
price: float
quantity: int
side: Literal["buy", "sell"]

def notional_value(self) -> float:
"""Total value of this trade in dollars."""
return self.price * self.quantity

def is_valid(self) -> bool:
return (
self.trade_id > 0
and len(self.symbol) > 0
and self.price > 0
and self.quantity > 0
and self.side in ("buy", "sell")
)


def parse_csv_row(line: str) -> TradeRecord:
"""Parse a CSV line into a TradeRecord.

Expected format: "trade_id,symbol,price,quantity,side"
Example: "1001,AAPL,172.50,100,buy"
"""
parts = line.strip().split(",")
if len(parts) != 5:
raise ValueError(f"Expected 5 fields, got {len(parts)}: {line!r}")
return TradeRecord(
trade_id=int(parts[0]),
symbol=parts[1].strip().upper(),
price=float(parts[2]),
quantity=int(parts[3]),
side=parts[4].strip().lower(),
)


# Usage
row = "1001,AAPL,172.50,100,buy"
record = parse_csv_row(row)

# Named access - readable and type-checked
print(f"{record.symbol}: {record.quantity} shares @ ${record.price:.2f}")
# AAPL: 100 shares @ $172.50

print(f"Notional: ${record.notional_value():,.2f}")
# Notional: $17,250.00

# Tuple unpack - legacy compatible
trade_id, symbol, price, qty, side = record
print(trade_id, symbol) # 1001 AAPL

# Use as dict key (deduplication by trade_id - simplified)
seen_trades: dict[int, TradeRecord] = {}
seen_trades[record.trade_id] = record

# Or use the whole record as a key (hashable - all fields are immutable scalars)
record_set: set[TradeRecord] = set()
record_set.add(record)
print(record in record_set) # True

Why typing.NamedTuple wins here:

  • trade_id, symbol, price, quantity, side are all immutable scalars → hashable → usable as dict keys or in sets
  • record.price is known to be float by mypy/Pyright - type errors caught at analysis time
  • notional_value() method lives on the class - no need for standalone utility functions
  • Tuple unpack trade_id, symbol, price, qty, side = record still works for legacy callers

Advanced - Memoization Cache with Tuple Keys

Implement a generic memoization decorator that caches function results using the arguments as the cache key. It must handle lists and dicts in arguments by converting them to hashable equivalents, include a cache statistics counter, and raise a meaningful error when arguments cannot be made hashable.

# Target usage:
@memo
def compute_features(data: list[float], threshold: float) -> list[float]:
return [x for x in data if x > threshold]

r1 = compute_features([1.0, 2.5, 3.0], threshold=2.0) # computed
r2 = compute_features([1.0, 2.5, 3.0], threshold=2.0) # from cache

print(compute_features.cache_stats())
# {'hits': 1, 'misses': 1, 'size': 1}
Solution
import functools
from typing import Any, Callable

def make_hashable(obj: Any) -> Any:
"""Recursively convert common unhashable containers to hashable equivalents.

list → tuple (of recursively converted elements)
dict → tuple of sorted (key, value) pairs
set → frozenset (of recursively converted elements)
Other objects: returned as-is if hashable, raise TypeError if not.
"""
if isinstance(obj, list):
return tuple(make_hashable(item) for item in obj)
elif isinstance(obj, dict):
return tuple(sorted(
(make_hashable(k), make_hashable(v)) for k, v in obj.items()
))
elif isinstance(obj, set):
return frozenset(make_hashable(item) for item in obj)
else:
try:
hash(obj) # raises TypeError if not hashable
return obj
except TypeError:
raise TypeError(
f"Cannot make argument of type {type(obj).__name__!r} hashable. "
f"Value: {obj!r}"
) from None


def memo(func: Callable) -> Callable:
"""Memoization decorator supporting list/dict arguments via tuple conversion."""
cache: dict[tuple, Any] = {}
hits = 0
misses = 0

@functools.wraps(func)
def wrapper(*args, **kwargs):
nonlocal hits, misses
# Build a single hashable cache key from all arguments
try:
hashable_args = tuple(make_hashable(a) for a in args)
hashable_kwargs = tuple(
sorted((k, make_hashable(v)) for k, v in kwargs.items())
)
key = (hashable_args, hashable_kwargs)
except TypeError as e:
raise TypeError(
f"Cannot memoize call to {func.__name__!r}: {e}"
) from None

if key in cache:
hits += 1
return cache[key]

misses += 1
result = func(*args, **kwargs)
cache[key] = result
return result

def cache_stats() -> dict[str, int]:
return {"hits": hits, "misses": misses, "size": len(cache)}

def cache_clear() -> None:
nonlocal hits, misses
cache.clear()
hits = 0
misses = 0

wrapper.cache_stats = cache_stats # type: ignore[attr-defined]
wrapper.cache_clear = cache_clear # type: ignore[attr-defined]
return wrapper


# Demonstration
@memo
def compute_features(data: list[float], threshold: float) -> list[float]:
print(f" Computing for data={data}, threshold={threshold}")
return [x for x in data if x > threshold]

r1 = compute_features([1.0, 2.5, 3.0], threshold=2.0) # Computing...
r2 = compute_features([1.0, 2.5, 3.0], threshold=2.0) # From cache
r3 = compute_features([1.0, 2.5, 3.0], threshold=1.0) # Computing... (different threshold)

print(f"r1 = {r1}") # [2.5, 3.0]
print(f"r2 = {r2}") # [2.5, 3.0]
print(compute_features.cache_stats())
# {'hits': 1, 'misses': 2, 'size': 2}

# Keyword argument order independence
r4 = compute_features(threshold=2.0, data=[1.0, 2.5, 3.0]) # From cache - same key!
print(compute_features.cache_stats())
# {'hits': 2, 'misses': 2, 'size': 2}

Key design decisions explained:

  • The cache key is ((hashable_args...), ((k1, v1), (k2, v2)...)) - a tuple of two tuples. This ensures both positional and keyword arguments contribute to the key.
  • Keyword arguments are sorted by key name before hashing - so f(a=1, b=2) and f(b=2, a=1) produce the same cache key.
  • make_hashable recurses so nested structures like list[list[int]] are handled correctly.
  • nonlocal allows the closure to mutate the counter integers from the outer scope.
  • functools.wraps preserves __name__, __doc__, __annotations__, and __module__ on the wrapper.
  • This pattern appears in real production systems: FastAPI's dependency injection, pytest's fixture caching, and Django's cache_page decorator all use similar argument-to-key conversion strategies.

Quick Reference

OperationCodeComplexityNotes
Create literalt = (1, 2, 3)O(1) if constantsCompiler folds constant tuples
Create from iterabletuple(iterable)O(n)Converts any iterable
Create emptyt = ()O(1)Returns CPython singleton
Single elementt = (42,)O(1)Trailing comma required
Index accesst[i]O(1)Direct pointer arithmetic
Slicet[i:j]O(j-i)Returns new tuple
Unpacka, b = tO(n)Count must match
Splat unpackfirst, *rest = tO(n)rest is a list
Lengthlen(t)O(1)Stored in ob_size
Membershipx in tO(n)Linear scan - use set for repeated
Concatenatet1 + t2O(n+m)Returns new tuple
Repeatt * kO(nk)Returns new tuple
Hashhash(t)O(n)Hashes all elements recursively
As dict keyd[t] = vO(n) hash + O(1) lookupAll elements must be hashable
Named fieldnt.field_nameO(1)Property calls self[index]
_replacent._replace(x=1)O(n)Returns new namedtuple
_asdictnt._asdict()O(n)Returns dict
_fieldsNT._fieldsO(1)Tuple of field name strings
Size in bytessys.getsizeof(t)O(1)~40 + 8 per element (64-bit)
Compare equalt1 == t2O(n)Element-wise left-to-right

Key Takeaways

  • A CPython tuple is a single heap allocation - its pointer array is stored inline inside the struct using a flexible array member. There is no allocated capacity field and no separate pointer block. This is why tuples are smaller and have better cache locality than equivalent lists.
  • Tuple literals of compile-time constants are evaluated once at compile time and stored as bytecode constants. Execution is a single LOAD_CONST instruction - typically 4-8x faster than equivalent list creation.
  • Immutability is shallow: the tuple's reference slots cannot be reassigned, but the mutable objects those slots point to can still be mutated. t = ([1,2],[3,4]); t[0].append(99) silently works and changes t's visible content.
  • A tuple is hashable only when every element - recursively - is hashable. Any mutable object anywhere in the element graph breaks hashability and prevents use as a dict key or set member.
  • A single-element tuple requires a trailing comma: (42,) - without it, (42) is just the integer 42 in parentheses. The comma, not the parentheses, creates the tuple.
  • typing.NamedTuple is the modern preferred form for named records - it provides type annotations, default values, method support, and IDE auto-complete, with zero extra memory per instance compared to a plain tuple.
  • Choose tuple for heterogeneous fixed-arity records (database rows, coordinates, API responses); choose list for homogeneous variable-length sequences (collections of items that grow or shrink). This semantic contract is the primary reason to pick one over the other.
  • Multiple return values from functions are tuples: return a, b packs into (a, b) automatically. CPython optimizes simple two-variable swaps a, b = b, a to use ROT_TWO bytecode with no heap allocation.
© 2026 EngineersOfAI. All rights reserved.