Shallow vs Deep Copy - Understanding Python's Memory Model

Reading time: ~22 minutes | Level: Foundation → Engineering

Here is a bug that has broken production configuration systems more times than it should have.

import copy

default_config = {
    "database": {"host": "localhost", "port": 5432},
    "cache": {"ttl": 300, "max_size": 1000},
}

# Create a "copy" for the production environment
prod_config = default_config.copy()
prod_config["database"]["host"] = "prod-db.internal"

print(default_config["database"]["host"])

If you think this prints "localhost", you have the wrong mental model.

The actual output:

prod-db.internal

The default config was silently mutated. dict.copy() creates a shallow copy - the outer dict is new, but its values are references to the same nested dicts that the original holds.

This one misunderstanding causes silent data corruption in configuration systems, test fixtures, ML experiment pipelines, and API response processing.

What You Will Learn

  • Why a = b creates an alias, not a copy - and how to verify it with id()
  • Shallow copy: what copy.copy(), list[:], list.copy(), and dict.copy() actually do
  • Deep copy: how copy.deepcopy() recursively copies the entire object graph
  • ASCII memory diagrams: exact layout difference between assignment, shallow copy, and deep copy
  • When shallow copy is sufficient - the immutable leaf value rule
  • When deep copy is required - mutable nested objects and the hidden bugs from not using it
  • Performance cost of deepcopy - alternatives including JSON and pickle round-trips
  • The custom __copy__ and __deepcopy__ protocol for controlling copy behavior
  • The memo dict inside deepcopy: how CPython handles circular references without infinite recursion
  • Real-world patterns: defensive copying in APIs, test fixtures, simulation environments

Prerequisites

  • Python variables and name binding
  • Understanding of Python's object model (id(), reference semantics)
  • Basic knowledge of mutable vs immutable types (lists, dicts, tuples, ints)

Part 1 - Assignment Creates an Alias, Not a Copy

The Fundamental Rule

In Python, assignment (=) never copies an object. It creates a new name that points to the same object.

# This is the single most important fact about Python assignment
a = [1, 2, 3]
b = a # b is NOT a copy - b is an alias for the same object

print(id(a)) # e.g., 140234567891456
print(id(b)) # exactly the same address

print(a is b) # True - same object in memory

Now the consequence:

b.append(4)
print(a) # [1, 2, 3, 4] - a was also "modified"!
print(b) # [1, 2, 3, 4]

There is only one list. Both a and b are names that point to it. Appending via b is the same as appending via a.

Both names point to the SAME object. Mutating through either name affects "both" (really: the one object).
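The same rule applies when an object is passed to a function: the parameter becomes one more name for the caller's object. A minimal sketch follows; the function names add_default_tag and add_default_tag_safely are hypothetical, chosen only for illustration.

```python
def add_default_tag(tags):
    # `tags` is an alias for the caller's list, so this mutates the caller's data
    tags.append("default")
    return tags


def add_default_tag_safely(tags):
    # Build a new list instead of mutating the argument
    return tags + ["default"]


mine = ["a"]
result = add_default_tag(mine)
print(result is mine)  # True - the function returned the very same list
print(mine)            # ['a', 'default']

untouched = ["a"]
safe_result = add_default_tag_safely(untouched)
print(safe_result is untouched)  # False - a new list was built
print(untouched)                 # ['a']
```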

Part 2 - Shallow Copy: One Level Deep

What Shallow Copy Does

A shallow copy creates a new container object but fills it with references to the same elements as the original.

import copy

a = [[1, 2], [3, 4], [5, 6]]

# All four of these produce identical shallow copies:
b1 = a.copy() # list method
b2 = a[:] # slice syntax
b3 = list(a) # list constructor
b4 = copy.copy(a) # explicit copy module

print(a is b1) # False - different outer list
print(a[0] is b1[0]) # True - same inner list!

The outer list is new. The inner lists are still the originals.

# Proving the shallow copy behavior
b = a.copy()
b.append([7, 8]) # Safe: adds to b only
print(len(a)) # 3 - a is unchanged

b[0].append(99) # DANGER: modifies the shared inner list
print(a[0]) # [1, 2, 99] - a[0] is also affected!

ASCII Memory Diagram: Shallow Copy

Outer containers: DIFFERENT objects (a is not b → False)

Inner elements: SAME objects (a[0] is b[0] → True)
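One way to check this sharing pattern yourself is to compare elements by identity. The helper shared_positions below is a hypothetical name, not part of the copy module; it simply reports the indexes at which two lists hold the very same object.

```python
import copy


def shared_positions(original, duplicate):
    """Return the indexes where both lists reference the same object."""
    return [
        index
        for index, (x, y) in enumerate(zip(original, duplicate))
        if x is y
    ]


a = [[1, 2], [3, 4], [5, 6]]
b = copy.copy(a)

print(a is b)                  # False - the outer lists differ
print(shared_positions(a, b))  # [0, 1, 2] - every inner list is shared

b[1] = [30, 40]                # Rebinding one slot in b ends sharing for that slot only
print(shared_positions(a, b))  # [0, 2]
print(a[1])                    # [3, 4] - a still holds its original inner list
```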

Part 3 - Deep Copy: All the Way Down

What Deep Copy Does

copy.deepcopy() recursively copies every mutable object in the object graph. The result is independent of the original: no mutable state is shared at any depth (immutable atoms such as ints and strings may still be reused, which is harmless).

import copy

a = [[1, 2], [3, 4], [5, 6]]
b = copy.deepcopy(a)

print(a is b) # False - different outer list
print(a[0] is b[0]) # False - different inner list too!

b[0].append(99)
print(a[0]) # [1, 2] - completely unchanged
print(b[0]) # [1, 2, 99]

ASCII Memory Diagram: Deep Copy

Every mutable object at every depth is a new, independent copy. No mutable state is shared anywhere in the graph.
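One nuance worth checking for yourself: deepcopy only needs to duplicate objects that can change. In CPython, immutable atoms such as str and int are returned as-is, which is harmless because they cannot be mutated. A small sketch, using a made-up sample dict:

```python
import copy

original = {"name": "primary", "ports": [5432, 5433], "origin": (0, 0)}
clone = copy.deepcopy(original)

print(clone is original)                      # False - new outer dict
print(clone["ports"] is original["ports"])    # False - the mutable list was copied
print(clone["name"] is original["name"])      # True - str is immutable, so it is reused
print(clone["origin"] is original["origin"])  # True - a tuple of immutables is reused too

clone["ports"].append(5434)
print(original["ports"])                      # [5432, 5433] - unaffected
```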

Part 4 - When Shallow Copy Is Sufficient

The key insight: shallow copy is safe when all nested objects are immutable.

Immutable objects cannot be modified, so sharing references to them is harmless - you cannot accidentally "mutate" an integer or a string through a shared reference.

# Case 1: Flat list of immutable values - shallow copy is safe
a = [1, 2, 3, 4, 5]
b = a.copy()
b[0] = 99
print(a) # [1, 2, 3, 4, 5] - unchanged
# Why safe? Integers are immutable. b[0] = 99 rebinds b[0] to a new int,
# it doesn't mutate the integer 1.

# Case 2: List of strings - shallow copy is safe
names_a = ["Alice", "Bob", "Carol"]
names_b = names_a.copy()
names_b[0] = "Zara" # Rebinds names_b[0], doesn't mutate the string
print(names_a) # ["Alice", "Bob", "Carol"] - unchanged

# Case 3: List of tuples - shallow copy is safe
points_a = [(1, 2), (3, 4)]
points_b = points_a.copy()
# Tuples are immutable - nothing in points_b can mutate points_a's elements

# Case 4: Dict with primitive values - shallow copy is safe
config_a = {"timeout": 30, "retry": 3}
config_b = config_a.copy()
config_b["timeout"] = 60
print(config_a) # {"timeout": 30, "retry": 3} - unchanged

:::tip The Immutable Leaf Rule
Shallow copy is always safe when every object below the top-level container is immutable. Lists, dicts, and sets are mutable - if your structure contains any of them below the top level and you intend to modify them independently, you need deepcopy or careful manual copying.
:::
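Note that the rule is about everything reachable, not just the direct elements: a tuple is immutable, but a tuple holding a list still exposes shared mutable state. The checker below is a rough sketch under that reading. is_deeply_immutable and shallow_copy_is_safe are hypothetical helpers, and the set of types treated as immutable atoms is an assumption you may need to extend.

```python
# Assumption: these built-in types are treated as immutable leaves
IMMUTABLE_ATOMS = (int, float, complex, bool, str, bytes, type(None))


def is_deeply_immutable(obj):
    """Return True if obj and everything it contains is immutable (sketch)."""
    if isinstance(obj, IMMUTABLE_ATOMS):
        return True
    if isinstance(obj, (tuple, frozenset)):
        return all(is_deeply_immutable(item) for item in obj)
    return False  # lists, dicts, sets, and unknown types are treated as mutable


def shallow_copy_is_safe(container):
    """A shallow copy is safe when every direct element is deeply immutable."""
    items = container.values() if isinstance(container, dict) else container
    return all(is_deeply_immutable(item) for item in items)


print(shallow_copy_is_safe([1, "two", (3, 4)]))     # True
print(shallow_copy_is_safe({"timeout": 30}))        # True
print(shallow_copy_is_safe([(1, [2, 3])]))          # False - the tuple holds a list
print(shallow_copy_is_safe({"db": {"host": "x"}}))  # False - nested dict
```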

Part 5 - When Deep Copy Is Required

Deep copy is required when you have mutable nested objects that you intend to modify independently.

The Configuration System Bug (Revisited)

import copy

default_config = {
    "database": {"host": "localhost", "port": 5432},  # Mutable nested dict!
    "cache": {"ttl": 300, "max_size": 1000},
}

# WRONG: Shallow copy
prod_config = default_config.copy()
prod_config["database"]["host"] = "prod-db.internal" # Mutates shared inner dict!
print(default_config["database"]["host"]) # "prod-db.internal" - BUG!

# Reset for the demo
default_config["database"]["host"] = "localhost"

# CORRECT: Deep copy
prod_config = copy.deepcopy(default_config)
prod_config["database"]["host"] = "prod-db.internal" # Modifies independent copy
print(default_config["database"]["host"]) # "localhost" - correct

ML Experiment Pipeline Bug

import copy

# Template hyperparameters - nested mutable structure
base_params = {
    "model": {"hidden_layers": [128, 64, 32], "dropout": 0.3},
    "training": {"lr": 0.001, "epochs": 50},
}

# BAD: Shallow copy - shared hidden_layers list!
experiment_1 = base_params.copy()
experiment_2 = base_params.copy()

experiment_1["model"]["hidden_layers"].append(16) # Mutates the shared list!
print(experiment_2["model"]["hidden_layers"]) # [128, 64, 32, 16] - wrong!

# CORRECT: Deep copy
experiment_1 = copy.deepcopy(base_params)
experiment_2 = copy.deepcopy(base_params)

experiment_1["model"]["hidden_layers"].append(16)
print(experiment_2["model"]["hidden_layers"]) # [128, 64, 32] - isolated

Test Fixtures

import copy
import unittest

class TestShoppingCart(unittest.TestCase):  # Must subclass TestCase so setUp() runs before each test
    # Class-level fixture - shared across tests
    BASE_CART = {
        "items": [
            {"id": "A001", "qty": 2, "price": 9.99},
        ],
        "discounts": [],
    }

    def setUp(self):
        # WRONG: self.cart = self.BASE_CART - all tests share same object!
        # WRONG: self.cart = self.BASE_CART.copy() - nested items list is shared!
        self.cart = copy.deepcopy(self.BASE_CART)  # Each test gets its own independent copy

    def test_add_item(self):
        self.cart["items"].append({"id": "B002", "qty": 1, "price": 14.99})
        self.assertEqual(len(self.cart["items"]), 2)
        # Without deepcopy, this would permanently modify BASE_CART,
        # causing all subsequent tests to see 2 items!

Part 6 - Performance: deepcopy is Expensive

deepcopy must traverse the entire object graph recursively, so its cost grows with the total number of objects reachable from the root. For large, deeply nested structures, that cost is significant.

import copy
import timeit

# Build a realistically complex structure
data = {
    "records": [{"id": i, "values": list(range(100))} for i in range(1000)]
}

# Benchmark: shallow copy vs deep copy
t_shallow = timeit.timeit(lambda: data.copy(), number=10000)
t_deep = timeit.timeit(lambda: copy.deepcopy(data), number=100)

print(f"Shallow copy (×10,000): {t_shallow:.4f}s")
print(f"Deep copy (×100): {t_deep:.4f}s")
print(f"Deep copy per call: {t_deep/100*1000:.2f}ms")

Sample output (absolute numbers depend on hardware and Python version):

Shallow copy (×10,000): 0.0031s
Deep copy (×100): 0.8240s
Deep copy per call: 8.24ms

In this run, a deep copy of a structure with 1000 records × 100 values takes roughly 8 ms. At 1000 requests per second, each needing one deep copy, that adds up to about 8 seconds of CPU time per wall-clock second - roughly eight cores kept busy by copying alone.
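The back-of-the-envelope calculation behind that figure, using the per-call time from the sample output (your measured number will differ, and the request rate is an assumed load):

```python
per_copy_ms = 8.24          # deep copy time per call, taken from the sample run
requests_per_second = 1000  # assumed load: one deep copy per request

# Convert ms to seconds, then multiply by the number of copies per second
cpu_seconds_per_second = per_copy_ms / 1000 * requests_per_second

print(f"{cpu_seconds_per_second:.2f} CPU-seconds per wall-clock second")
print(f"about {cpu_seconds_per_second:.0f} cores busy with copying")
```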

Alternatives to deepcopy at Scale

import copy
import json
import pickle
import timeit  # needed for the benchmark calls below

data = {"layers": [{"weights": list(range(100))} for _ in range(50)]}

# Option 1: JSON round-trip (for JSON-serializable structures)
def copy_via_json(obj):
    return json.loads(json.dumps(obj))

# Option 2: pickle round-trip (handles most picklable Python objects)
def copy_via_pickle(obj):
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))

# Option 3: deepcopy (most general)
def copy_via_deepcopy(obj):
    return copy.deepcopy(obj)

t_json = timeit.timeit(lambda: copy_via_json(data), number=1000)
t_pickle = timeit.timeit(lambda: copy_via_pickle(data), number=1000)
t_deep = timeit.timeit(lambda: copy_via_deepcopy(data), number=1000)

print(f"JSON round-trip: {t_json:.3f}s")
print(f"pickle: {t_pickle:.3f}s")
print(f"deepcopy: {t_deep:.3f}s")

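Rankings depend on the data and the Python version, so run the benchmark on your own structures. On CPython, the JSON and pickle round-trips are often faster than deepcopy for plain dicts and lists of primitives, because their encoders and decoders are implemented in C while the copy module is written in Python. deepcopy wins on generality: it handles arbitrary Python objects and preserves types that JSON cannot represent.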

:::note JSON Round-Trip Limitations
json.dumps + json.loads only works for JSON-serializable types: dicts, lists, strings, numbers, booleans, and None. It does not handle custom classes, datetime objects, sets, bytes, or circular references, and it silently changes some data: tuples come back as lists and int dict keys come back as strings. Use it only when your data is provably JSON-compatible.
:::
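These limitations are easy to confirm with a short experiment. The record below is made-up sample data; the point is that the round-trip can succeed while still returning something that is not equal to the input.

```python
import json

record = {"point": (1, 2), 1: "int key", "tags": ["a", "b"]}
round_tripped = json.loads(json.dumps(record))

print(round_tripped["point"])   # [1, 2] - the tuple came back as a list
print(list(round_tripped))      # ['point', '1', 'tags'] - the int key became a string
print(round_tripped == record)  # False - the "copy" is not equal to the original

try:
    json.dumps({"ids": {1, 2, 3}})
except TypeError as exc:
    # Sets are not JSON-serializable, so the round-trip fails outright
    print(f"set is not serializable: {exc}")
```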

Part 7 - The memo Dict: How deepcopy Handles Circular References

Without memoization, deepcopy on a circular structure would recurse infinitely.

import copy

# Circular reference: a contains b, b contains a
a = []
b = [a]
a.append(b)
# a → [b], b → [a] - circular!

# This would be infinite recursion without the memo dict
deep_a = copy.deepcopy(a) # Works correctly!
print(deep_a) # [[[...]]] - circular, but terminates
print(deep_a[0][0] is deep_a) # True - the circular reference is preserved!

How the memo dict works

CPython's deepcopy maintains a dictionary called memo that maps id(original_object) to its already-created copy. Before copying any object, it checks: "have I already copied this exact object (by identity)?"

deepcopy(a) execution with circular reference:

Step 1: deepcopy(a) called. memo = {}
id(a) not in memo → create new list deep_a. memo = {id(a): deep_a}

Step 2: Recursively copy a[0] = b.
id(b) not in memo → create new list deep_b. memo = {id(a): deep_a, id(b): deep_b}

Step 3: Recursively copy b[0] = a.
id(a) IS in memo → return memo[id(a)] = deep_a (already created!)
Fill deep_b[0] = deep_a

Step 4: Fill deep_a[0] = deep_b

Result: deep_a → [deep_b], deep_b → [deep_a] - circular structure reproduced!
No infinite recursion - memo prevented revisiting a.

# You can pass your own memo dict to share it across calls
# (advanced: for copying groups of mutually-referencing objects atomically)
x = [1, 2]
y = {"ref": x}  # y holds a reference to x
memo = {}
deep_x = copy.deepcopy(x, memo)
deep_y = copy.deepcopy(y, memo) # References to x within y will point to deep_x
print(deep_y["ref"] is deep_x) # True
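To make the memo mechanism concrete, here is a deliberately simplified re-implementation of the idea for lists and dicts only. It is a teaching sketch, not the code in the standard library's copy module: simple_deepcopy is a hypothetical name, and anything that is not a list or dict is returned unchanged.

```python
def simple_deepcopy(obj, memo=None):
    """Copy nested lists/dicts, using memo to survive cycles (sketch only)."""
    if memo is None:
        memo = {}

    existing = memo.get(id(obj))
    if existing is not None:
        return existing        # Already copied: reuse it, do not recurse again

    if isinstance(obj, list):
        clone = []
        memo[id(obj)] = clone  # Register BEFORE recursing into the elements
        for item in obj:
            clone.append(simple_deepcopy(item, memo))
        return clone

    if isinstance(obj, dict):
        clone = {}
        memo[id(obj)] = clone
        for key, value in obj.items():
            clone[key] = simple_deepcopy(value, memo)
        return clone

    return obj                 # Treated as an immutable leaf in this sketch


a = []
b = [a]
a.append(b)                    # a -> [b], b -> [a]

clone_a = simple_deepcopy(a)
print(clone_a is a)              # False - a new outer list
print(clone_a[0] is b)           # False - b was copied as well
print(clone_a[0][0] is clone_a)  # True - the cycle is reproduced
```

Registering the clone in memo before filling it is what lets the second visit to a return immediately instead of recursing forever.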

Part 8 - Custom __copy__ and __deepcopy__ Protocols

Python lets you control exactly how your objects are copied by implementing special methods.

Custom __copy__

import copy

class Config:
    def __init__(self, host, port, tags):
        self.host = host  # str - immutable, safe to share
        self.port = port  # int - immutable, safe to share
        self.tags = tags  # list - mutable! a shallow copy will share it

    def __copy__(self):
        """Shallow copy: new Config object, every field shared by reference."""
        new_obj = Config.__new__(Config)
        new_obj.host = self.host  # Share the string (immutable)
        new_obj.port = self.port  # Share the int (immutable)
        new_obj.tags = self.tags  # Share the list reference (shallow!)
        return new_obj

    def __repr__(self):
        return f"Config(host={self.host!r}, port={self.port}, tags={self.tags})"


original = Config("localhost", 5432, ["primary", "write"])
shallow = copy.copy(original)

shallow.tags.append("replica") # Mutates shared tags list
print(original.tags) # ['primary', 'write', 'replica'] - shared!

Custom __deepcopy__

import copy

class Config:
    def __init__(self, host, port, tags):
        self.host = host
        self.port = port
        self.tags = tags

    def __deepcopy__(self, memo):
        """Deep copy: fully independent copy of all mutable fields."""
        # IMPORTANT: register self in memo BEFORE recursing,
        # to handle potential circular references
        new_obj = Config.__new__(Config)
        memo[id(self)] = new_obj

        new_obj.host = copy.deepcopy(self.host, memo)  # str - immutable, deepcopy returns it as-is
        new_obj.port = copy.deepcopy(self.port, memo)  # int - immutable, deepcopy returns it as-is
        new_obj.tags = copy.deepcopy(self.tags, memo)  # list - fully new list
        return new_obj

    def __repr__(self):
        return f"Config(host={self.host!r}, port={self.port}, tags={self.tags})"


original = Config("localhost", 5432, ["primary", "write"])
deep = copy.deepcopy(original)

deep.tags.append("replica")
print(original.tags) # ['primary', 'write'] - unchanged
print(deep.tags) # ['primary', 'write', 'replica']

Performance-Optimized Custom deepcopy

A common pattern: skip deepcopy for fields you know are immutable, reducing recursion cost.

import copy

class FastConfig:
    """Configuration with optimized copy - avoids deepcopy on immutable fields."""
    __slots__ = ("host", "port", "allowed_ips", "settings")

    def __init__(self, host, port, allowed_ips, settings):
        self.host = host                # str - immutable
        self.port = port                # int - immutable
        self.allowed_ips = allowed_ips  # list - mutable
        self.settings = settings        # dict - mutable

    def __deepcopy__(self, memo):
        new_obj = FastConfig.__new__(FastConfig)
        memo[id(self)] = new_obj

        # Immutable fields: no need to deepcopy, just assign
        new_obj.host = self.host  # str is immutable - safe shortcut
        new_obj.port = self.port  # int is immutable - safe shortcut

        # Mutable fields: must deepcopy
        new_obj.allowed_ips = copy.deepcopy(self.allowed_ips, memo)
        new_obj.settings = copy.deepcopy(self.settings, memo)
        return new_obj
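Another reason to write __deepcopy__ by hand is an attribute that cannot or should not be cloned. The sketch below uses a threading.Lock as a stand-in for such a resource; the EventLog class is hypothetical. The default deepcopy raises TypeError on a lock, so the custom method gives each copy a fresh one instead.

```python
import copy
import threading


class EventLog:
    def __init__(self, events):
        self.events = events           # list - mutable, must be copied
        self._lock = threading.Lock()  # cannot be deep-copied by default

    def __deepcopy__(self, memo):
        new_obj = EventLog.__new__(EventLog)
        memo[id(self)] = new_obj
        new_obj.events = copy.deepcopy(self.events, memo)
        new_obj._lock = threading.Lock()  # fresh lock rather than a clone
        return new_obj


log = EventLog(["started"])
clone = copy.deepcopy(log)
clone.events.append("stopped")

print(log.events)                # ['started']
print(clone.events)              # ['started', 'stopped']
print(clone._lock is log._lock)  # False - each copy owns its own lock
```

Whether a copy should get a fresh resource, share the original, or refuse to be copied at all is a design decision that depends on the resource; a fresh lock is only one reasonable choice.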

Part 9 - Copy Mechanisms Reference

There are multiple ways to copy in Python. Here is what each one does at the memory level:

import copy

original = [[1, 2], [3, 4]]

# === ASSIGNMENT - not a copy at all ===
alias = original
# alias IS original (same id, same mutations)

# === SHALLOW COPIES - new container, shared elements ===
s1 = original.copy() # list method (dict and set have their own .copy())
s2 = original[:] # slice - equivalent to copy()
s3 = list(original) # list constructor - equivalent to copy()
s4 = copy.copy(original) # explicit - works on any type
# All of: s1[0] is original[0] → True

# === DICT-SPECIFIC SHALLOW COPY ===
d = {"a": [1, 2], "b": [3, 4]}
d_shallow = d.copy() # dict method - same as copy.copy(d)
# d_shallow["a"] is d["a"] → True

# === DEEP COPY - fully independent ===
deep = copy.deepcopy(original)
# deep[0] is original[0] → False - completely independent

| Copy Method | Outer Container | Inner Objects |
| --- | --- | --- |
| Assignment (a = b) | SAME | SAME |
| Shallow copy | NEW | SAME |
| Deep copy | NEW | NEW (recursively) |

Part 10 - Real-World Patterns

Pattern 1: Defensive Copying in APIs

class UserRecord:
    """Public-facing API that defensively copies mutable input."""

    def __init__(self, name: str, roles: list[str]):
        self.name = name
        self._roles = list(roles)  # Defensive copy: don't hold reference to caller's list

    def get_roles(self) -> list[str]:
        return list(self._roles)  # Defensive copy: caller can't mutate internal state

    def add_role(self, role: str) -> None:
        if role not in self._roles:
            self._roles.append(role)

# Usage - internal state is protected
user = UserRecord("Alice", ["read", "write"])
external_roles = user.get_roles()
external_roles.append("admin") # Modify the copy
print(user.get_roles()) # ['read', 'write'] - internal state protected

Pattern 2: Simulation Checkpointing

import copy

class SimulationState:
    def __init__(self):
        self.agents = [{"id": i, "pos": [0, 0], "health": 100} for i in range(10)]
        self.tick = 0

    def checkpoint(self):
        """Save a deep copy of current state for rollback."""
        return copy.deepcopy(self)

    def rollback(self, checkpoint):
        """Restore from checkpoint."""
        self.agents = copy.deepcopy(checkpoint.agents)
        self.tick = checkpoint.tick

# Run a simulation with checkpointing
state = SimulationState()
checkpoint = state.checkpoint() # Save state before risky operation

# Simulate a turn that may corrupt state
state.agents[0]["pos"] = [10, 20]
state.tick = 1

# If something went wrong, rollback
state.rollback(checkpoint)
print(state.agents[0]["pos"]) # [0, 0] - restored
print(state.tick) # 0 - restored

Pattern 3: Immutable Value Objects (Avoiding Copy Entirely)

The best copy is no copy. For configuration and value objects, use frozen dataclasses:

import dataclasses
from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    port: int
    name: str

# frozen=True makes all fields read-only after construction
# You never need to copy this - it cannot be mutated
config = DatabaseConfig(host="localhost", port=5432, name="appdb")

# "Modify" by creating a new instance with replace
prod_config = dataclasses.replace(config, host="prod-db.internal")

print(config.host) # "localhost" - original unchanged
print(prod_config.host) # "prod-db.internal"
# No copy needed - immutability guarantees safety

:::tip Design Insight
The best way to avoid copy bugs is to use immutable data structures. @dataclass(frozen=True) gives you a read-only record type. Use tuple instead of list, and frozenset instead of set. When objects cannot be mutated, aliasing is harmless - no defensive copying needed.
:::
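One caveat worth verifying: frozen=True prevents reassigning fields, but it does not freeze the objects those fields refer to. A frozen dataclass that holds a list can still be mutated through that list, which is why the tuple and frozenset advice matters. The class names LeakyConfig and SolidConfig below are hypothetical.

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakyConfig:
    host: str
    replicas: list   # mutable field inside a frozen dataclass


@dataclass(frozen=True)
class SolidConfig:
    host: str
    replicas: tuple  # immutable all the way down


leaky = LeakyConfig("localhost", ["r1"])
alias = leaky
alias.replicas.append("r2")  # allowed: the list itself is not frozen
print(leaky.replicas)        # ['r1', 'r2']

try:
    leaky.host = "other"     # blocked: attribute reassignment
except dataclasses.FrozenInstanceError:
    print("cannot reassign a field on a frozen dataclass")

solid = SolidConfig("localhost", ("r1",))
bigger = dataclasses.replace(solid, replicas=solid.replicas + ("r2",))
print(solid.replicas)        # ('r1',)
print(bigger.replicas)       # ('r1', 'r2')
```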

Interview Questions

Q1: What is the difference between a = b, a = b.copy(), and a = copy.deepcopy(b) for a nested list?

Answer: a = b creates an alias - both names refer to the same object in memory, id(a) == id(b). Mutating the list through either name affects both. a = b.copy() creates a new outer list but fills it with references to the same inner objects - id(a) != id(b) but id(a[0]) == id(b[0]). Mutating an inner list through a will affect b and vice versa. a = copy.deepcopy(b) recursively copies every object - id(a) != id(b) and id(a[0]) != id(b[0]). No shared mutable references at any depth. The right choice depends on whether nested objects will be mutated.

Q2: When is shallow copy sufficient and when is deep copy required?

Answer: Shallow copy is sufficient when all elements at depth > 1 are immutable types (int, float, str, tuple of immutables, frozenset). Because immutable objects cannot be changed in-place, sharing references to them is harmless. Deep copy is required when any nested object is mutable (list, dict, set, custom mutable class) AND you intend to modify those nested objects independently. Practical test: if you copy a structure and then mutate a nested item in the copy, does the original change? If yes, you need deep copy.

Q3: How does copy.deepcopy() handle circular references? What is the memo dict?

Answer: Without memoization, deepcopy would infinitely recurse on circular structures like a = []; a.append(a). CPython's deepcopy passes a memo dictionary through all recursive calls, mapping id(original_object) to its already-created deep copy. Before copying any object, it checks memo[id(obj)]. If found, it returns the existing copy immediately (breaking the recursion). If not found, it creates the copy, registers it in memo, then recurses into the object's contents. This ensures each object is copied exactly once, circular references are reproduced correctly, and the recursion terminates.

Q4: What does dict.copy() do differently from copy.deepcopy()?

Answer: dict.copy() is a shallow copy - it creates a new dict object with the same key-value pairs, but the keys and values themselves are shared references. For a dict like {"db": {"host": "localhost"}}, dict.copy() produces a new outer dict but the inner {"host": "localhost"} dict is shared. Modifying copy["db"]["host"] mutates the original. copy.deepcopy() recursively copies the inner dict too, producing complete independence. For dicts containing only immutable values (strings, numbers, tuples), dict.copy() is safe and much faster.

Q5: How do you implement __deepcopy__ for a custom class, and why would you do this?

Answer: You implement __deepcopy__(self, memo) to control exactly how your object is deep-copied - typically to skip expensive recursion on known-immutable attributes or to handle resources (file handles, database connections) that should not be cloned. The implementation must: (1) create a new instance without calling __init__, (2) register memo[id(self)] = new_obj immediately to handle circular references, (3) call copy.deepcopy(attr, memo) for mutable attributes and directly assign immutable ones. The memo parameter must be passed through to every recursive deepcopy call.

Q6: What are the performance alternatives to copy.deepcopy() for large structures?

Answer: (1) JSON round-trip: json.loads(json.dumps(obj)) - often faster for JSON-serializable flat/simple structures but loses type fidelity (tuples become lists, non-serializable types fail). (2) pickle round-trip: pickle.loads(pickle.dumps(obj, protocol=5)) - handles more types than JSON and is often faster than deepcopy for plain data, but only works for picklable objects. (3) Immutable design: use @dataclass(frozen=True) or namedtuple so no copy is needed - aliasing is harmless. (4) Manual selective copy: only deepcopy the mutable parts, assign immutable parts directly. (5) Structural sharing: functional programming patterns where "modified" copies share unchanged sub-structures. Whichever option you pick, benchmark it on your own data before relying on it.
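Options (4) and (5) can be combined into a path-copying update: copy only the containers along the path being changed and share everything else. A minimal sketch, assuming a two-level dict of dicts; with_nested_value is a hypothetical helper name.

```python
def with_nested_value(config, section, key, value):
    """Return a new config with config[section][key] set; the input is untouched."""
    new_config = dict(config)            # new outer dict (shallow)
    new_section = dict(config[section])  # new dict for the changed section only
    new_section[key] = value
    new_config[section] = new_section
    return new_config


default_config = {
    "database": {"host": "localhost", "port": 5432},
    "cache": {"ttl": 300, "max_size": 1000},
}

prod_config = with_nested_value(default_config, "database", "host", "prod-db.internal")

print(default_config["database"]["host"])               # localhost
print(prod_config["database"]["host"])                  # prod-db.internal
print(prod_config["cache"] is default_config["cache"])  # True - unchanged section is shared
```

The shared cache dict is only safe as long as nobody mutates it in place, so this pattern works best when every update goes through a helper like this one.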

Practice Challenges

Beginner - Predict the Output

What does this code print?

import copy

a = [1, [2, 3], 4]
b = copy.copy(a)
b[0] = 99
b[1].append(99)

print(a)
print(b)

Solution

Output:

[1, [2, 3, 99], 4]
[99, [2, 3, 99], 4]

b = copy.copy(a) creates a new outer list but the inner [2, 3] is shared. b[0] = 99 rebinds b[0] to a new integer - this does not affect a[0] because integers are immutable and we are changing the pointer in b, not the object. b[1].append(99) mutates the shared list object - both a[1] and b[1] are the same list, so both see the change.

Intermediate - Fix the Bug

The following function has a subtle aliasing bug. Identify it and provide the corrected version.

def make_request_headers(auth_token: str, base_headers: dict) -> dict:
    """Build request headers by adding auth token to base headers."""
    base_headers["Authorization"] = f"Bearer {auth_token}"
    return base_headers

# Shared base headers for all requests
BASE_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

headers_req1 = make_request_headers("token_abc", BASE_HEADERS)
headers_req2 = make_request_headers("token_xyz", BASE_HEADERS)

print(headers_req1.get("Authorization")) # What does this print?
print(headers_req2.get("Authorization")) # And this?
print(BASE_HEADERS) # And this?

Solution

Current output (buggy):

Bearer token_xyz ← req1's token was overwritten!
Bearer token_xyz
{'Content-Type': 'application/json', 'Accept': 'application/json', 'Authorization': 'Bearer token_xyz'}

BASE_HEADERS is mutated on the first call, then mutated again on the second. Both headers_req1 and headers_req2 are the same dict object as BASE_HEADERS. All three point to the same object.

Fixed version - defensive copy:

def make_request_headers(auth_token: str, base_headers: dict) -> dict:
    """Build request headers by adding auth token to base headers."""
    headers = base_headers.copy()  # Shallow copy is sufficient here
    headers["Authorization"] = f"Bearer {auth_token}"
    return headers

BASE_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

headers_req1 = make_request_headers("token_abc", BASE_HEADERS)
headers_req2 = make_request_headers("token_xyz", BASE_HEADERS)

print(headers_req1.get("Authorization")) # Bearer token_abc - correct
print(headers_req2.get("Authorization")) # Bearer token_xyz - correct
print(BASE_HEADERS) # No Authorization key - untouched

# Shallow copy is sufficient because dict values (strings) are immutable.
# If base_headers contained nested dicts (e.g., "metadata": {...}),
# we would need copy.deepcopy to prevent nested aliasing.

Advanced - Implement a Copy-on-Write Cache

Implement a TemplateCache class that stores configuration templates and provides isolated copies to callers. Requirements:

  1. Templates are stored internally and must not be modifiable by callers
  2. get_template() should return an independent copy that callers can modify freely
  3. update_template() should replace a template (takes a new dict - store an independent copy)
  4. The class must handle templates with circular references correctly

# Expected behavior:
cache = TemplateCache()
cache.add_template("api_request", {
    "headers": {"Content-Type": "application/json"},
    "retry": {"max_attempts": 3, "delays": [1, 2, 4]},
})

# Caller gets an independent copy
t1 = cache.get_template("api_request")
t1["headers"]["Authorization"] = "Bearer secret"
t1["retry"]["delays"].append(8)

# Original template is unaffected
t2 = cache.get_template("api_request")
print("Authorization" in t2["headers"]) # False
print(t2["retry"]["delays"]) # [1, 2, 4]

Solution

import copy
from typing import Any

class TemplateCache:
    """
    A cache that stores templates and provides isolated copies.
    Uses deepcopy to ensure templates cannot be corrupted by callers.
    """

    def __init__(self):
        self._templates: dict[str, Any] = {}

    def add_template(self, name: str, template: Any) -> None:
        """Store an independent copy of the template."""
        # Deep copy on write: we own our copy, caller can't mutate it
        self._templates[name] = copy.deepcopy(template)

    def get_template(self, name: str) -> Any:
        """Return an independent copy of the template."""
        if name not in self._templates:
            raise KeyError(f"Template '{name}' not found")
        # Deep copy on read: caller gets their own independent copy
        return copy.deepcopy(self._templates[name])

    def update_template(self, name: str, new_template: Any) -> None:
        """Replace a template with an independent copy of new_template."""
        self._templates[name] = copy.deepcopy(new_template)

    def list_templates(self) -> list[str]:
        return list(self._templates.keys())

    def __len__(self) -> int:
        return len(self._templates)


# Test the implementation
cache = TemplateCache()
cache.add_template("api_request", {
    "headers": {"Content-Type": "application/json"},
    "retry": {"max_attempts": 3, "delays": [1, 2, 4]},
})

# Verify caller gets independent copy
t1 = cache.get_template("api_request")
t1["headers"]["Authorization"] = "Bearer secret"
t1["retry"]["delays"].append(8)

t2 = cache.get_template("api_request")
print("Authorization" in t2["headers"]) # False - internal template unaffected
print(t2["retry"]["delays"]) # [1, 2, 4] - internal template unaffected

# Verify add_template copies the input (caller can mutate original after adding)
raw = {"settings": [1, 2, 3]}
cache.add_template("raw_test", raw)
raw["settings"].append(99) # Mutate original after adding

t3 = cache.get_template("raw_test")
print(t3["settings"]) # [1, 2, 3] - not [1, 2, 3, 99]

# Circular reference test
circular = {}
circular["self_ref"] = circular
cache.add_template("circular", circular) # Must not infinite-loop
t4 = cache.get_template("circular")
print(t4["self_ref"] is t4) # True - circular structure reproduced correctly

Design analysis:

  • "Copy on write, copy on read" pattern: both add_template and get_template use deepcopy
  • This guarantees complete isolation at the cost of two deep copies per "use" cycle
  • For performance-critical scenarios with immutable leaf data, a shallow copy might suffice
  • The deepcopy call with memo handles circular references automatically
  • __slots__ is not used here: it would only shrink the single TemplateCache instance, while the stored templates are plain dicts, so the two deepcopy calls dominate the cost

Quick Reference

| Operation | Creates New Container? | Elements Shared? | Handles Nested Mutables? | Performance |
| --- | --- | --- | --- | --- |
| a = b | No | Yes (alias) | N/A | O(1) |
| a = b.copy() (list) | Yes | Yes | No | O(n) |
| a = b[:] | Yes | Yes | No | O(n) |
| a = list(b) | Yes | Yes | No | O(n) |
| a = copy.copy(b) | Yes | Yes | No | O(n) |
| a = copy.deepcopy(b) | Yes | No | Yes | O(graph size) |
| a = json.loads(json.dumps(b)) | Yes | No | Yes (JSON types only) | Varies |

| Scenario | Recommended Approach |
| --- | --- |
| Flat list/dict of primitives | list.copy() or dict.copy() |
| Nested structure, no nested mutation | Shallow copy |
| Nested structure, independent mutation needed | copy.deepcopy() |
| Configuration object | @dataclass(frozen=True) + dataclasses.replace() |
| Test fixtures | copy.deepcopy() in setUp() |
| High-performance path, JSON-safe data | json.loads(json.dumps(obj)) |
| Custom class with expensive attributes | __deepcopy__ with manual field selection |
| Circular references | copy.deepcopy() (handles via memo dict) |

Key Takeaways

  • Python assignment (=) never copies - it creates an alias; both names point to the same object, and mutation through either name affects the one shared object
  • Shallow copy creates a new container but fills it with shared references to the same inner objects - safe only when inner objects are immutable
  • Deep copy (copy.deepcopy()) traverses and copies the entire object graph recursively - the only way to guarantee complete independence of nested mutable structures
  • The memo dict inside deepcopy maps id(original) to its copy, enabling correct handling of circular references without infinite recursion
  • deepcopy is expensive - benchmark before using it in hot paths; consider immutable design with @dataclass(frozen=True) as a zero-copy alternative
  • The custom __deepcopy__ protocol allows you to skip copying known-immutable fields, register in memo before recursing, and handle non-copyable resources (connections, file handles)
  • The safest API design copies defensively on both input and output - callers cannot corrupt internal state
© 2026 EngineersOfAI. All rights reserved.