Skip to main content

Mutable vs Immutable - Designing Reliable Data Structures

Reading time: ~22 minutes | Level: Foundation → Engineering

Here is one of the most common bugs in Python, appearing in production systems every day.

def add_tag(tags=[]):
tags.append("processed")
return tags

result1 = add_tag()
result2 = add_tag()
result3 = add_tag()

print(result1)
print(result2)
print(result3)

If you expected each call to return ["processed"], you have the wrong mental model about how Python handles default arguments.

The actual output:

['processed']
['processed', 'processed']
['processed', 'processed', 'processed']

Default mutable arguments are created once when the function is defined, not on each call. The same list object is reused - and grows - across every invocation.

This bug, and dozens like it, stem from a misunderstanding of Python's object model and the difference between mutability and rebinding.

What You Will Learn

  • The three pillars of the Python object model: id(), type(), value
  • What immutability actually means: you cannot change the object, only rebind the name
  • Complete classification of mutable and immutable built-in types
  • The id() trick: small integer caching (-5 to 256) and string interning
  • Pass-by-object-reference: how Python function parameters actually work
  • The mutable default argument anti-pattern - the cause and the canonical fix
  • How immutability enables safe concurrent access without locks
  • @dataclass(frozen=True): enforcing immutability in custom classes
  • Why += behaves fundamentally differently for mutable vs immutable objects
  • Why string concatenation in a loop is O(n²) and the correct O(n) alternatives

Prerequisites

  • Python variables and assignment
  • Understanding of Python functions and parameters
  • Basic familiarity with lists, dicts, and tuples

Part 1 - The Python Object Model: Three Pillars

Every Object Has Three Properties

In CPython, every object has exactly three properties:

x = 42

print(id(x)) # Identity - memory address in CPython (e.g., 140234567)
print(type(x)) # Type - <class 'int'>
print(x) # Value - 42

These three properties are fixed at creation time. For immutable types, the value cannot change. For mutable types, the value can change in place.

What "Immutable" Really Means

Immutability is a property of the object, not the variable name.

x = 42 # x binds to the integer object with value 42
x = 43 # x is rebound to a DIFFERENT integer object with value 43
# The original integer 42 was NOT changed - it cannot be

y = [1, 2, 3]
y.append(4) # The list object IS changed in place - [1, 2, 3, 4]
# y still binds to the same object (same id)

Part 2 - The Complete Type Classification

Immutable Types

These types cannot be changed after creation. Any operation that appears to "modify" them creates a new object:

# int - immutable
n = 100
id_before = id(n)
n += 1
print(id(n) == id_before) # False - n now points to a new int object (101)

# float - immutable
f = 3.14

# bool - immutable (bool is a subclass of int)
b = True
print(isinstance(b, int)) # True
print(b + 1) # 2

# str - immutable
s = "hello"
id_before = id(s)
s += " world"
print(id(s) == id_before) # False - new string object created

# tuple - immutable (but elements may be mutable!)
t = (1, 2, 3)
# t[0] = 99 - TypeError: 'tuple' object does not support item assignment

# frozenset - immutable
fs = frozenset({1, 2, 3})
# fs.add(4) - AttributeError: no attribute 'add'

# bytes - immutable
data = b"hello"
# data[0] = 65 - TypeError

Mutable Types

These types can be changed in place. The object's value changes, but the object's identity (id()) stays the same:

# list - mutable
lst = [1, 2, 3]
id_before = id(lst)
lst.append(4)
print(id(lst) == id_before) # True - same object, modified in place

# dict - mutable
d = {"a": 1}
id_before = id(d)
d["b"] = 2
print(id(d) == id_before) # True - same object

# set - mutable
s = {1, 2, 3}
s.add(4)

# bytearray - mutable (contrast with bytes)
ba = bytearray(b"hello")
ba[0] = 72 # Change first byte - no error, modified in place
print(ba) # bytearray(b'Hello')

# Custom class instances - mutable by default
class Point:
def __init__(self, x, y):
self.x = x
self.y = y

p = Point(1, 2)
p.x = 99 # Mutable - attribute reassignment works

Part 3 - The id() Trick: Integer Caching and String Interning

Small Integer Caching (-5 to 256)

CPython pre-allocates integer objects for the range -5 to 256. When you create a small integer, Python returns the cached object rather than creating a new one.

a = 100
b = 100
print(a is b) # True - same cached object!
print(id(a) == id(b)) # True

a = 257
b = 257
print(a is b) # False (usually) - different objects outside the cache range
print(id(a) == id(b)) # False

This is why is comparisons on integers are unreliable - they give different results depending on the value.

CPython pre-allocates integer objects for the range [-5, 256]:

IndexObjectReturned for
0IntObject(-5)a = -5
1IntObject(-4)a = -4
.........
261IntObject(256)a = 256

Any int in [-5, 256]: Python returns the cached object. Any int outside: Python creates a new object.

:::danger Never Use is to Compare Integers a is b tests identity (same object in memory). a == b tests equality (same value). For integers, 42 is 42 is True due to caching, but 1000 is 1000 may be False. Always use == for value comparison. Reserve is for None, True, False, and explicit identity checks. :::

String Interning

Python automatically interns short strings that look like identifiers (letters, digits, underscores). String literals in source code are often interned within the same module.

# Interned strings (looks like an identifier)
a = "hello"
b = "hello"
print(a is b) # True in CPython - same interned object

# Not interned (contains spaces)
a = "hello world"
b = "hello world"
print(a is b) # False (usually) - separate objects

# Explicit interning
import sys
a = sys.intern("my_custom_string_for_dict_key")
b = sys.intern("my_custom_string_for_dict_key")
print(a is b) # True - explicitly interned
# Use sys.intern() for strings used as dictionary keys in hot paths
# Identity comparison (is) is faster than equality comparison (==)

Part 4 - Pass-by-Object-Reference

Python is neither "pass by value" nor "pass by reference" in the traditional C sense. It is pass by object reference (also called "pass by assignment" or "call by sharing").

When you call a function, the function parameter becomes a new name that binds to the same object as the argument.

def modify(items, count):
items.append(99) # Modifies the list object (mutable) - visible outside
count += 1 # Creates a new int object - local rebinding only
print(f"Inside: items={items}, count={count}")

my_list = [1, 2, 3]
my_count = 0
modify(my_list, my_count)
print(f"Outside: my_list={my_list}, my_count={my_count}")

Output:

Inside: items=[1, 2, 3, 99], count=1
Outside: my_list=[1, 2, 3, 99], my_count=0

Key Rule

  • Mutable argument: mutations to the object are visible to the caller (because caller and callee share the same object)
  • Immutable argument: the function can only rebind its local name - the caller's binding is unaffected
  • Reassigning the parameter: items = [99] in the function body would rebind the local name items, not the caller's my_list

Part 5 - The Mutable Default Argument Anti-Pattern

This is the single most common Python gotcha. Python evaluates default argument values once, at function definition time, not on each call.

# THE BUG
def add_tag(tag, tags=[]): # This list is created ONCE
tags.append(tag)
return tags

print(add_tag("alpha")) # ['alpha']
print(add_tag("beta")) # ['alpha', 'beta'] - the default list grew!
print(add_tag("gamma")) # ['alpha', 'beta', 'gamma']

The default list [] is stored as an attribute of the function object and shared across all calls that use it.

# Proof: inspect the function's default arguments
print(add_tag.__defaults__) # (['alpha', 'beta', 'gamma'],) - the accumulated list!

The Canonical Fix: Use None as Sentinel

def add_tag(tag, tags=None):
if tags is None:
tags = [] # Create a fresh list on each call
tags.append(tag)
return tags

print(add_tag("alpha")) # ['alpha']
print(add_tag("beta")) # ['beta'] - fresh list each time
print(add_tag("gamma")) # ['gamma']

# If you WANT to share across calls, be explicit about it
shared_list = ["initial"]
print(add_tag("alpha", shared_list)) # ['initial', 'alpha']
print(add_tag("beta", shared_list)) # ['initial', 'alpha', 'beta']

:::warning The Sentinel Pattern Is Mandatory Using mutable default arguments (lists, dicts, sets) is a well-known anti-pattern. Every serious Python linter (pylint, flake8, mypy) flags it. The sentinel None pattern is the universal fix. Never use def f(items=[]), def f(config={}), or def f(visited=set()) in production code. :::

When Mutable Defaults Are Intentional

There is one legitimate use case: function-level caches that should persist across calls:

def memoized_fib(n, _cache={0: 0, 1: 1}):
"""Fibonacci with a mutable default dict as a persistent cache."""
if n not in _cache:
_cache[n] = memoized_fib(n - 1, _cache) + memoized_fib(n - 2, _cache)
return _cache[n]

print(memoized_fib(10)) # 55
print(memoized_fib.__defaults__[0]) # {0: 0, 1: 1, 2: 1, 3: 2, ..., 10: 55}

This is intentional and documented - the mutable default is used as a persistent cache, not as a per-call fresh container. Prefer functools.lru_cache in production code.

Part 6 - The += Operator: Mutable vs Immutable Behavior

The += operator behaves fundamentally differently depending on the type it is applied to.

For Immutable Types: += Creates a New Object

x = 42
print(id(x)) # e.g., 9788544
x += 1
print(id(x)) # Different - x now points to a new int object (43)
# The original int 42 is unchanged; x was rebound

For Mutable Types: += Modifies In Place (via __iadd__)

lst = [1, 2, 3]
id_before = id(lst)
lst += [4, 5] # Calls list.__iadd__([4, 5]) - extends in place!
print(id(lst) == id_before) # True - same object
print(lst) # [1, 2, 3, 4, 5]

# Compare with +: creates a new list
lst2 = [1, 2, 3]
id_before = id(lst2)
lst2 = lst2 + [4, 5] # Creates a new list!
print(id(lst2) == id_before) # False - new object

The Critical Bug: += on Immutable Tuple Elements

t = ([1, 2], [3, 4])
# Can we += on a tuple element?
# t[0] += [99] - What happens?

try:
t[0] += [99]
except TypeError as e:
print(f"Error: {e}")

print(t)

Output:

Error: 'tuple' object does not support item assignment
...
([1, 2, 99], [3, 4])

This is the strangest behavior in Python. The += on t[0]:

  1. Calls list.__iadd__([99]) - succeeds, extends t[0] in place
  2. Tries to assign the result back to t[0] - fails (tuple is immutable)

So the list was modified, but the TypeError was still raised. Both things happened.

:::danger Mutable Objects in Tuples Never use += on a mutable object stored inside an immutable container (tuple). The mutation succeeds but the assignment raises an exception, leaving the program in an inconsistent state. If you need a mutable sequence in a tuple-like structure, reconsider your data model. :::

Part 7 - Immutability and Safe Concurrent Access

Immutable objects are inherently thread-safe. Because they cannot be changed, multiple threads can read them simultaneously with no risk of data races.

import threading

# SAFE: Sharing immutable data between threads
CONFIG = {
"host": "localhost", # str - immutable
"port": 5432, # int - immutable
"name": ("db", "v2"), # tuple of str - immutable
}

def read_config():
# No lock needed - CONFIG values are immutable, reading is safe
return CONFIG["host"], CONFIG["port"]

threads = [threading.Thread(target=read_config) for _ in range(100)]
for t in threads:
t.start()
for t in threads:
t.join()
# No race conditions - immutable data is concurrency-safe
# UNSAFE: Sharing mutable data between threads without locks
shared_list = []

def append_items():
for i in range(1000):
shared_list.append(i) # NOT thread-safe!

threads = [threading.Thread(target=append_items) for _ in range(4)]
# This can corrupt shared_list due to race conditions on the list's internals

# SAFE: Use a lock for mutable shared state
import threading

shared_list = []
lock = threading.Lock()

def append_items_safe():
for i in range(1000):
with lock:
shared_list.append(i)

Part 8 - Immutable Custom Classes with @dataclass(frozen=True)

Python's dataclasses module provides frozen=True, which makes a dataclass immutable after construction. The fields become read-only, and the class becomes hashable (usable as a dict key or set member).

from dataclasses import dataclass, field
import dataclasses

@dataclass(frozen=True)
class Point:
x: float
y: float

p = Point(1.0, 2.0)
print(p.x) # 1.0

# p.x = 99.0 # FrozenInstanceError: cannot assign to field 'x'

# Frozen dataclasses are hashable - can be used as dict keys or in sets
points = {p, Point(3.0, 4.0)}
cache = {p: "result_at_this_point"}


@dataclass(frozen=True)
class Config:
host: str
port: int
tags: tuple = field(default_factory=tuple) # Must use tuple, not list!
# tags: list = field(default_factory=list) # Would make it unhashable

def with_host(self, new_host: str) -> "Config":
"""Return a modified copy - the only way to 'change' a frozen instance."""
return dataclasses.replace(self, host=new_host)


base = Config(host="localhost", port=5432, tags=("primary",))
prod = base.with_host("prod-db.internal")

print(base.host) # "localhost" - unchanged
print(prod.host) # "prod-db.internal"
print(base == prod) # False - different values
print(hash(base)) # Works - frozen dataclasses are hashable

:::tip frozen=True Forces Good Design When you use @dataclass(frozen=True), Python forces you to think about mutations as "return a new object with this change" rather than "modify this object". This is the functional programming style applied to Python, and it eliminates entire categories of aliasing bugs. :::

Part 9 - String Concatenation: Why O(n²) and the Fix

The O(n²) Anti-Pattern

Strings are immutable. Every concatenation creates a new string object, copying all characters from both operands.

# BAD: O(n²) string building
parts = ["apple", "banana", "cherry", "date"] # imagine 10,000 parts

result = ""
for part in parts:
result += part # Each += creates a new string, copying everything so far
# Iteration 1: copies "apple" → cost: 5 chars
# Iteration 2: copies "applebanana" → cost: 11 chars
# Iteration 3: copies "applebananacherry" → cost: 17 chars
# ...
# Total copies: 5 + 11 + 17 + ... = O(n²)

For 10,000 strings of average length 10:

  • Total cost ≈ 10,000 × 10,000 / 2 = 50 million character copies
# GOOD: O(n) string building with join
result = "".join(parts) # All parts joined in one pass - O(total length)

# Also good: str.join with a generator
result = ", ".join(part.upper() for part in parts)

# For building up a complex string incrementally, use a list then join
parts = []
for chunk in data_source:
parts.append(process(chunk))
result = "".join(parts)

Benchmark

import timeit

words = ["word"] * 10_000

def concat_plus():
result = ""
for w in words:
result += w
return result

def concat_join():
return "".join(words)

t_plus = timeit.timeit(concat_plus, number=100)
t_join = timeit.timeit(concat_join, number=100)

print(f"'+=' concatenation: {t_plus:.3f}s")
print(f"'join' method: {t_join:.3f}s")
print(f"Speedup: {t_plus/t_join:.1f}x")

Typical output:

'+=' concatenation: 0.812s
'join' method: 0.042s
Speedup: 19.3x

str.join is roughly 20x faster for 10,000 strings. The gap grows quadratically with input size.

:::note CPython String Optimization CPython has an optimization for str += str in some cases (when the string's reference count is 1), but this optimization is fragile and disappears in many common patterns. Do not rely on it. Always use str.join for building strings from multiple parts. :::

Part 10 - Immutability Patterns in Production Systems

Pattern 1: Named Tuples for Lightweight Immutable Records

from typing import NamedTuple

class DatabaseRecord(NamedTuple):
id: int
name: str
score: float

def with_score(self, new_score: float) -> "DatabaseRecord":
return self._replace(score=new_score)

# Immutable, hashable, memory-efficient (no __dict__)
record = DatabaseRecord(id=1, name="Alice", score=85.0)
updated = record.with_score(90.0)

print(record.score) # 85.0 - unchanged
print(updated.score) # 90.0

# Can be used as dict keys
cache = {record: "processed"}

Pattern 2: Tuple-Keyed Caches

# Using immutable tuples as cache keys - works because tuples are hashable
_cache: dict[tuple, float] = {}

def expensive_computation(x: float, y: float, z: float) -> float:
key = (x, y, z) # Tuple is immutable - safe and hashable as dict key
if key not in _cache:
_cache[key] = x ** 2 + y ** 2 + z ** 2 # simulate expensive work
return _cache[key]

# Lists cannot be used as keys - TypeError: unhashable type: 'list'
# key = [x, y, z] # Would fail on _cache[key]

Pattern 3: Functional-Style Record Updates

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class UserState:
user_id: str
score: int
level: int
badges: tuple[str, ...]

def award_badge(state: UserState, badge: str) -> UserState:
"""Return a new state with the badge added - never mutate in place."""
return replace(state, badges=state.badges + (badge,))

def level_up(state: UserState) -> UserState:
"""Return a new state with incremented level."""
return replace(state, level=state.level + 1, score=state.score + 100)

# Each operation returns a new state - old states are preserved
state0 = UserState(user_id="U001", score=0, level=1, badges=())
state1 = award_badge(state0, "first_login")
state2 = level_up(state1)
state3 = award_badge(state2, "level_2_achieved")

# Full audit trail - all states preserved
print(state0.badges) # ()
print(state1.badges) # ('first_login',)
print(state3.badges) # ('first_login', 'level_2_achieved')

Interview Questions

Q1: What does "immutable" mean in Python, and does it mean the variable can't change?

Answer: Immutability is a property of the object, not the variable. An immutable object cannot change its value or internal state after creation. When you write x = 42; x = 43, you are not mutating the integer 42 - you are creating a new integer object (43) and rebinding the name x to it. The integer 42 still exists in memory until its reference count drops to zero. The variable x is just a name - it can be rebound to any object at any time. What cannot happen is some_int.value = 99 - integer objects have no mutable fields.

Q2: Explain the mutable default argument bug and the canonical fix.

Answer: Python evaluates default argument values at function definition time, not at each call. If you write def f(lst=[]), the empty list [] is created once and stored as f.__defaults__[0]. Every call that uses the default gets the same list object. Mutations accumulate across calls. The fix: use None as a sentinel value for mutable defaults, and create a fresh mutable object inside the function body: def f(lst=None): if lst is None: lst = []. This creates a new empty list on each call that uses the default.

Q3: How does += behave differently for int vs list?

Answer: For immutable types like int, x += 1 is equivalent to x = x + 1 - it creates a new integer object and rebinds x to it. The original integer is unchanged. id(x) will differ before and after. For mutable types like list, lst += [4] calls list.__iadd__([4]), which extends the list in place and returns the same list object. id(lst) is the same before and after. The difference matters when the mutable object is referenced through multiple names: b = a; a += [4] will cause b to also reflect the change for lists, but not for ints.

Q4: Why does Python cache small integers in the range -5 to 256?

Answer: CPython pre-allocates integer objects for values -5 through 256 at interpreter startup. This optimization exists because small integers are used extremely frequently in Python programs (loop counters, array indices, boolean values). By caching them, Python avoids repeatedly allocating and deallocating identical integer objects, reducing memory pressure and improving performance. The consequence: a = 100; b = 100; a is b is True because both names point to the same cached object. For values outside this range, a = 1000; b = 1000; a is b is typically False. This is why you must use == for value comparison, never is.

Q5: Why is string concatenation in a loop O(n²)?

Answer: Strings are immutable. Every result += piece creates a new string by allocating memory for len(result) + len(piece) characters and copying all characters from both strings into the new allocation. On iteration k, where result has accumulated k pieces of average length L, the copy costs k × L operations. Summing across all n iterations: L(1 + 2 + 3 + ... + n) = L × n(n+1)/2 = O(n²). The fix is "".join(pieces_list) - it calculates the total length in one pass, allocates once, then fills in each piece. Total cost: O(total characters) = O(n × L).

Q6: How does @dataclass(frozen=True) enforce immutability and what does it enable?

Answer: @dataclass(frozen=True) generates __setattr__ and __delattr__ methods that raise FrozenInstanceError on any attempt to set or delete an attribute after construction. It also generates __hash__ based on the fields (since frozen instances are reliably immutable, their hash cannot change). This enables: (1) using instances as dictionary keys, (2) storing in sets, (3) safe concurrent access without locks, (4) architectural clarity that an object represents a value, not a mutable entity. To "modify" a frozen dataclass, use dataclasses.replace(instance, field=new_value), which returns a new instance.

Practice Challenges

Beginner - Predict the Output

What does this print? Explain each line.

a = [1, 2, 3]
b = a
b += [4]
print(a)
print(b)
print(a is b)

x = 100
y = x
y += 1
print(x)
print(y)
print(x is y)
Solution

Output:

[1, 2, 3, 4]
[1, 2, 3, 4]
True
100
101
False

Explanation:

b = a makes b an alias for the same list object. b += [4] calls list.__iadd__([4]) which extends the list in place. Since a and b point to the same list, both see [1, 2, 3, 4]. a is b is True - the += did not create a new list.

y = x makes y another name for the integer object 100. y += 1 creates a new integer object 101 and rebinds y to it. x still points to 100. x is y is False because they now point to different integer objects (100 and 101).

Intermediate - Fix the Anti-Pattern

The following class has a mutable default argument bug that causes data to accumulate across instances. Identify the bug, explain why it happens, and rewrite the class correctly.

class RequestBuilder:
def __init__(self, method, path, headers={}):
self.method = method
self.path = path
self.headers = headers

def add_header(self, key, value):
self.headers[key] = value
return self

def build(self):
return {
"method": self.method,
"path": self.path,
"headers": self.headers,
}

# Usage
req1 = RequestBuilder("GET", "/api/users")
req1.add_header("Authorization", "Bearer token_1")

req2 = RequestBuilder("POST", "/api/items") # New request, no headers?
print(req2.build()) # What does this print?
Solution

Current output (buggy):

{'method': 'POST', 'path': '/api/items', 'headers': {'Authorization': 'Bearer token_1'}}

req2 sees req1's headers! Both req1.headers and req2.headers reference the same default dict {} that was created when the class was defined.

Fixed version:

from typing import Optional

class RequestBuilder:
def __init__(self, method: str, path: str, headers: Optional[dict] = None):
self.method = method
self.path = path
# Create a fresh dict for each instance - never share the default
self.headers = dict(headers) if headers is not None else {}

def add_header(self, key: str, value: str) -> "RequestBuilder":
self.headers[key] = value
return self

def build(self) -> dict:
return {
"method": self.method,
"path": self.path,
"headers": dict(self.headers), # Defensive copy on output
}

# Test correctness
req1 = RequestBuilder("GET", "/api/users")
req1.add_header("Authorization", "Bearer token_1")

req2 = RequestBuilder("POST", "/api/items")
print(req2.build()) # {'method': 'POST', 'path': '/api/items', 'headers': {}}
print(req1.build()) # {'method': 'GET', 'path': '/api/users', 'headers': {'Authorization': 'Bearer token_1'}}
# Each instance has its own independent headers dict

Bonus - immutable version using frozen dataclass:

from dataclasses import dataclass, replace, field

@dataclass(frozen=True)
class Request:
method: str
path: str
headers: tuple[tuple[str, str], ...] = field(default_factory=tuple)

def with_header(self, key: str, value: str) -> "Request":
"""Return a new Request with the header added."""
return replace(self, headers=self.headers + ((key, value),))

def headers_dict(self) -> dict:
return dict(self.headers)

req1 = Request("GET", "/api/users").with_header("Authorization", "Bearer token_1")
req2 = Request("POST", "/api/items")

print(req1.headers_dict()) # {'Authorization': 'Bearer token_1'}
print(req2.headers_dict()) # {}

Advanced - Design an Immutable Event Log

Design an EventLog class for a game state machine. Requirements:

  1. The log must be truly immutable once an event is appended - no mutation of existing entries
  2. "Appending" returns a new EventLog instance
  3. The log must be hashable (to use as a dict key for caching game states)
  4. Support slicing: log.last(n) returns the last n events as a new EventLog
  5. Benchmark: compare your immutable log against a mutable list for 10,000 append operations
@dataclass(frozen=True)
class GameEvent:
tick: int
event_type: str
payload: tuple # Must be immutable for hashability
Solution
from dataclasses import dataclass, field
import timeit
import hashlib

@dataclass(frozen=True)
class GameEvent:
tick: int
event_type: str
payload: tuple # tuple of immutable values - hashable


@dataclass(frozen=True)
class EventLog:
"""
Immutable append-only event log backed by a tuple.
Each "append" returns a new EventLog - original is never modified.
Hashable: can be used as a dict key or stored in a set.
"""
_events: tuple[GameEvent, ...] = field(default_factory=tuple)

def append(self, event: GameEvent) -> "EventLog":
"""Return a new EventLog with the event added."""
return EventLog(_events=self._events + (event,))

def last(self, n: int) -> "EventLog":
"""Return a new EventLog with only the last n events."""
return EventLog(_events=self._events[-n:] if n > 0 else ())

def __len__(self) -> int:
return len(self._events)

def __getitem__(self, index):
return self._events[index]

def __iter__(self):
return iter(self._events)

def replay(self, until_tick: int) -> "EventLog":
"""Return log filtered to events up to a given tick."""
return EventLog(_events=tuple(e for e in self._events if e.tick <= until_tick))


# Test correctness
log0 = EventLog()
log1 = log0.append(GameEvent(tick=1, event_type="MOVE", payload=(0, 0, 1, 0)))
log2 = log1.append(GameEvent(tick=2, event_type="ATTACK", payload=(1, "sword")))
log3 = log2.append(GameEvent(tick=3, event_type="PICKUP", payload=(2, "shield")))

print(f"log0 length: {len(log0)}") # 0 - original unchanged
print(f"log3 length: {len(log3)}") # 3

# Immutable - each is a distinct object
print(log0 is log1) # False
print(log1 is log2) # False

# Hashable - can be used as dict key
state_cache = {}
state_cache[log3] = "game_state_snapshot"
print(state_cache[log3]) # "game_state_snapshot"

# Slicing
last_two = log3.last(2)
print(len(last_two)) # 2
print(last_two[0].event_type) # "ATTACK"

# Replay up to tick 2
early_log = log3.replay(until_tick=2)
print(len(early_log)) # 2 (ticks 1 and 2 only)

# --- Benchmark: immutable log vs mutable list ---
def immutable_append_test():
log = EventLog()
for i in range(1000):
log = log.append(GameEvent(tick=i, event_type="TICK", payload=(i,)))
return log

def mutable_append_test():
log = []
for i in range(1000):
log.append(GameEvent(tick=i, event_type="TICK", payload=(i,)))
return log

t_immutable = timeit.timeit(immutable_append_test, number=100)
t_mutable = timeit.timeit(mutable_append_test, number=100)

print(f"\nImmutable EventLog (1000 appends × 100): {t_immutable:.3f}s")
print(f"Mutable list (1000 appends × 100): {t_mutable:.3f}s")
print(f"Overhead ratio: {t_immutable/t_mutable:.1f}x")

Expected output:

Immutable EventLog (1000 appends × 100): 0.847s
Mutable list (1000 appends × 100): 0.009s
Overhead ratio: 94.1x

Design analysis:

  • Each "append" allocates a new tuple (copies all N existing events + 1 new one) → O(n) per append → O(n²) total for n appends
  • This is the fundamental cost of immutability: no in-place mutation means copying
  • For game state logging, this is acceptable: logs are typically small (< 10,000 events), and the benefits (hashability, no aliasing bugs, safe concurrent reads) outweigh the cost
  • For high-frequency logging (>10,000 events), use a persistent data structure library or batch immutable snapshots

Quick Reference

TypeMutable?Hashable?Examples
intNoYes42, -5, 0
floatNoYes3.14, 1.0
boolNoYesTrue, False
strNoYes"hello", ""
tupleNo*Yes*(1, 2), ()
frozensetNoYesfrozenset({1,2})
bytesNoYesb"data"
listYesNo[1, 2, 3]
dictYesNo{"a": 1}
setYesNo{1, 2, 3}
bytearrayYesNobytearray(b"x")
dataclass(frozen=True)NoYesCustom records

*Tuple is immutable and hashable only if all its elements are also immutable.

ScenarioUseWhy
Dict key or set memberImmutable typeMust be hashable
Shared across threadsImmutable typeNo race condition possible
Function default containerNone sentinelAvoid shared mutable default
Large evolving datasetlist or dictIn-place mutation is efficient
Configuration object@dataclass(frozen=True)Read-only, hashable, explicit
String building from partsstr.join(parts)O(n) vs O(n²) for +=
Cache key with multiple fieldstupleImmutable, hashable, ordered

Key Takeaways

  • Python's object model has three pillars: id() (identity), type(), and value - immutability means the value cannot change after the object is created
  • Rebinding a name (x = 43) never modifies an object - it redirects the name to a different object; immutable objects appear to "change" only because the name gets rebound to a new object
  • Python is pass-by-object-reference: functions receive a binding to the same object - mutations to mutable arguments are visible to callers; rebinding the parameter is not
  • The mutable default argument bug is caused by default values being evaluated once at definition time - always use None as sentinel and create fresh containers inside the function body
  • For immutable types, += creates a new object and rebinds; for mutable types, += calls __iadd__ and modifies in place - understanding this difference prevents subtle aliasing bugs
  • String concatenation in a loop is O(n²) - always use "".join(list_of_strings) which is O(n) by allocating once and filling in
  • @dataclass(frozen=True) enforces immutability for custom classes, enables hashability, and communicates design intent - use it for value objects, configuration records, and anything shared across threads
© 2026 EngineersOfAI. All rights reserved.