Python Memory References Practice Problems & Exercises

Practice: Memory References and Aliasing

11 problems4 Easy4 Medium3 Hard⏱ 40–55 min

Easy

#1Identity vs EqualityEasy

is-operatorequalityidentity

Predict the output of each comparison. Think carefully about when Python creates new objects versus reusing existing ones.

Python

a = [1, 2, 3]
b = a
print(a is b)

c = [1, 2, 3]
print(a is c)

print(a == c)

x = None
y = None
print(x is y)

p = [10, 20]
q = [10, 20]
print(p is q)

Solution

True
False
True
True
False

How it works:

a is b is True because b = a makes b an alias — both names point to the same list object in memory.
a is c is False because c = [1, 2, 3] creates a brand new list object, even though the contents are identical.
a == c is True because == compares values, and both lists contain [1, 2, 3].
x is y is True because None is a singleton — there is exactly one None object in the interpreter.
p is q is False because each [10, 20] literal creates a separate list object.

Key insight: Use is only for singletons (None, True, False) or when you explicitly need identity checks. Use == for value comparison in all other cases.

Expected Output

True\nFalse\nTrue\nTrue\nFalse

Hints

Hint 1: The `is` operator checks whether two names point to the exact same object in memory (same id).

Hint 2: `==` checks value equality by calling __eq__. Two distinct list objects with the same contents are equal but not identical.

#2Trace the Reference CountEasy

sys-getrefcountreference-countingtrace

Predict the output at each print call. Remember that sys.getrefcount() adds 1 to the true count.

Python

import sys

x = []
print(sys.getrefcount(x))

y = x
z = [x]
print(sys.getrefcount(x))

del y
print(sys.getrefcount(x))

z.clear()
print(sys.getrefcount(x))

Solution

Step-by-step trace:

x = [] — refcount is 1 (just x). sys.getrefcount(x) adds its own temporary reference, so prints 2.
y = x adds +1. z = [x] adds +1 (stored in z[0]). Now the object has 3 real references (x, y, z[0]). With getrefcount's temporary: 4.
del y removes one reference. Real count: 2 (x, z[0]). With getrefcount: 3.
z.clear() removes the list element, dropping the z[0] reference. Real count: 1 (x). With getrefcount: 2.

The +1 rule: Always subtract 1 from sys.getrefcount() to get the actual number of references in your code. The function itself holds a temporary reference to its argument for the duration of the call.

Expected Output

2\n4\n3\n2

Hints

Hint 1: sys.getrefcount() always returns one more than the actual count because the function argument itself is a temporary reference.

Hint 2: Storing an object in a list increments its reference count. Removing it from the list decrements it.

#3del Does Not DestroyEasy

del-keywordname-bindingreference-counting

Predict the output. Three different ways to "remove" a list — only one actually destroys the data for all aliases.

Python

# Scenario 1: del removes the name, not the object
a = [1, 2, 3]
b = a
del a
print(b)

# Scenario 2: rebinding creates a new object
c = [1, 2, 3]
d = c
c = []
print(d)

# Scenario 3: clear() mutates in place
e = [1, 2, 3]
f = e
e.clear()
print(f)

# Scenario 4: reassignment does not affect aliases
g = [1, 2, 3]
h = g
g = [4, 5, 6]
print(h)

Solution

[1, 2, 3]
[1, 2, 3]
[]
[1, 2, 3]

What each operation actually does:

Scenario 1 — del a: Removes the name a from the namespace. The list object still exists because b holds a reference. del never destroys the object directly.
Scenario 2 — c = []: Rebinds the name c to a new empty list. The original [1, 2, 3] object is unchanged and still referenced by d.
Scenario 3 — e.clear(): Mutates the list object in place, emptying it. Since f is an alias to the same object, f also sees an empty list.
Scenario 4 — g = [4, 5, 6]: Rebinds g to a completely new list. h still references the original [1, 2, 3].

Key insight: Only in-place mutation (clear(), append(), extend(), [i] = val) affects all aliases. Rebinding (=) and del only change the name, not the object.

Expected Output

[1, 2, 3]\n[1, 2, 3]\n[]\n[1, 2, 3]

Hints

Hint 1: del removes the name binding, not the object. If another name still references the object, it stays alive.

Hint 2: list.clear() mutates the object in place — all aliases see the change. Rebinding with = [] creates a new object.

#4The Classic Matrix Aliasing BugEasy

aliasingnested-structureslist-multiplication

Predict the output. This is one of the most common aliasing bugs in Python.

Python

# The buggy way
bad_matrix = [[0] * 3] * 3
print(bad_matrix[0] is bad_matrix[1])

bad_matrix[0][0] = 9
print(bad_matrix)

# The correct way
good_matrix = [[0] * 3 for _ in range(3)]
print(good_matrix[0] is good_matrix[1])

good_matrix[0][0] = 9
print(good_matrix)

Solution

True
[[9, 0, 0], [9, 0, 0], [9, 0, 0]]
False
[[9, 0, 0], [0, 0, 0], [0, 0, 0]]

Why the bug happens:

[[0] * 3] * 3 first creates a single inner list [0, 0, 0], then the outer * 3 puts three references to that same object in the outer list. All three "rows" are the same object in memory, so modifying one modifies all of them.

The list comprehension [[0] * 3 for _ in range(3)] evaluates [0] * 3 three separate times, creating three independent list objects. Modifying one row has no effect on the others.

Memory picture:

Bad:   outer -> [ptr, ptr, ptr]  --> [0, 0, 0]  (ONE object)
Good:  outer -> [ptr1, ptr2, ptr3]
                  |      |      |
                  v      v      v
               [0,0,0] [0,0,0] [0,0,0]  (THREE objects)

Rule of thumb: Never use * n to duplicate mutable objects inside a list. Always use a comprehension or explicit loop.

Expected Output

True\n[[9, 0, 0], [9, 0, 0], [9, 0, 0]]\nFalse\n[[9, 0, 0], [0, 0, 0], [0, 0, 0]]

Hints

Hint 1: `[[0] * 3] * 3` creates one inner list object and puts three references to it in the outer list.

Hint 2: A list comprehension `[[0] * 3 for _ in range(3)]` creates three independent inner lists.

Medium

#5Function Aliasing TrapMedium

aliasingfunction-argumentsmutation

Predict the output. Pay attention to which functions mutate the argument and which rebind the local name.

Python

def append_to(lst, val):
    lst.append(val)

def replace_list(lst, new_lst):
    lst = new_lst  # Rebinds the local name only

def extend_copy(lst, val):
    copy = list(lst)
    copy.append(val)
    return copy

data = [1, 2, 3]
append_to(data, 99)
print(data)

alias = data
print(alias)

original = [10, 20]
replace_list(original, [77, 88])
print(original)

result = extend_copy(original, 99)
print(result)

Solution

[1, 2, 3, 99]
[1, 2, 3, 99]
[10, 20]
[10, 20, 99]

Analysis:

append_to(data, 99) — lst is an alias for data. Calling lst.append(val) mutates the original list. data is now [1, 2, 3, 99].
alias = data was set before append_to, but since alias and data point to the same object, alias also shows [1, 2, 3, 99].
replace_list(original, [77, 88]) — inside the function, lst = new_lst rebinds the local variable lst to [77, 88]. The caller's original is untouched because rebinding a local name does not affect the caller's reference.
extend_copy(original, 99) — creates a new list via list(lst), appends to the copy, and returns it. The original is unchanged.

Key insight: Python uses "pass by object reference" — the function receives a reference to the same object, not a copy. Mutation through that reference is visible to the caller. Rebinding the local parameter is not.

Expected Output

[1, 2, 3, 99]\n[1, 2, 3, 99]\n[10, 20]\n[10, 20, 99]

Hints

Hint 1: When a mutable object is passed to a function, the parameter is an alias to the same object. Mutations via the parameter are visible to the caller.

Hint 2: Rebinding the parameter with = inside the function only changes the local name — the caller does not see the rebinding.

#6Reference Count ContributionsMedium

sys-getrefcountcontainersreference-counting

Write a function that traces the reference count of an object through a series of operations, correctly accounting for the sys.getrefcount() overhead. Show how each container operation increments or decrements the count.

Python

import sys

def trace_refcount(label, obj):
    count = sys.getrefcount(obj) - 1  # Subtract getrefcount's own reference
    print(f"{label:<20}{count}")

def demonstrate_contributions():
    obj = object()
    trace_refcount("After creation:", obj)

    alias = obj
    trace_refcount("After alias:", obj)

    container = [obj]
    trace_refcount("After list store:", obj)

    d = {"key": obj}
    trace_refcount("After dict store:", obj)

    container.pop()
    trace_refcount("After list remove:", obj)

    d.clear()
    trace_refcount("After dict clear:", obj)

    del alias
    trace_refcount("After del alias:", obj)

demonstrate_contributions()

Solution

import sys

def trace_refcount(label, obj):
    count = sys.getrefcount(obj) - 1
    print(f"{label:<20}{count}")

def demonstrate_contributions():
    obj = object()
    trace_refcount("After creation:", obj)      # 1: just 'obj'

    alias = obj
    trace_refcount("After alias:", obj)          # 2: obj + alias

    container = [obj]
    trace_refcount("After list store:", obj)     # 3: obj + alias + container[0]

    d = {"key": obj}
    trace_refcount("After dict store:", obj)     # 4: obj + alias + container[0] + d["key"]

    container.pop()
    trace_refcount("After list remove:", obj)    # 3: obj + alias + d["key"]

    d.clear()
    trace_refcount("After dict clear:", obj)     # 2: obj + alias

    del alias
    trace_refcount("After del alias:", obj)      # 1: just obj

demonstrate_contributions()

Reference count sources:

Each of these operations increments ob_refcnt:

Assignment to a name (alias = obj)
Storage in a list (container = [obj] or lst.append(obj))
Storage as a dict value (d["key"] = obj)
Passing as a function argument (temporary, during the call)
Storage as an attribute (instance.attr = obj)

Each of these decrements ob_refcnt:

del name
list.pop(), list.clear(), del lst[i]
dict.clear(), del d[key], dict.pop(key)
Function argument going out of scope (when the call returns)
Rebinding a name (alias = something_else)

When ob_refcnt reaches 0, CPython immediately deallocates the object.

Starter Code

import sys

def trace_refcount(label, obj):
    """Print the reference count minus the getrefcount overhead."""
    # TODO: return the actual reference count (subtract getrefcount's +1)
    pass

def demonstrate_contributions():
    """Show how different operations affect reference count."""
    obj = object()
    # TODO: print refcount after each step:
    # 1. After creation (just one name)
    # 2. After assigning to another name
    # 3. After storing in a list
    # 4. After storing in a dict value
    # 5. After removing from the list
    # 6. After clearing the dict
    # 7. After deleting the alias
    pass

demonstrate_contributions()

Expected Output

After creation:     1\nAfter alias:        2\nAfter list store:   3\nAfter dict store:   4\nAfter list remove:  3\nAfter dict clear:   2\nAfter del alias:    1

Hints

Hint 1: sys.getrefcount(obj) always returns the actual count + 1 because the function argument is itself a reference.

Hint 2: Storing an object in a list (append, index assignment) increments refcount. Removing it (pop, clear, del lst[i]) decrements it.

Hint 3: Dict values also hold references. Clearing the dict releases those references.

#7Aliasing Bug DetectorMedium

aliasingidis-operatordebugging

Write utility functions that detect aliasing bugs in nested data structures. find_shared_references finds all index pairs that point to the same object. is_safe_matrix checks if a matrix has any aliased rows.

Python

def find_shared_references(data):
    shared = []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            if data[i] is data[j]:
                shared.append((i, j))
    return shared

def is_safe_matrix(matrix):
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if matrix[i] is matrix[j]:
                return False
    return True

# Test with aliased structure
row_a = [1, 2, 3]
row_b = [4, 5, 6]
data = [row_a, row_b, row_a, row_b, row_a]

print(f"Shared pairs: {find_shared_references(data)}")
print(f"Is safe: {is_safe_matrix(data)}")

# Fix it
fixed = [list(row) for row in data]
print(f"Fixed matrix safe: {is_safe_matrix(fixed)}")

Solution

def find_shared_references(data):
    """Find all pairs of indices sharing the same object."""
    shared = []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            if data[i] is data[j]:
                shared.append((i, j))
    return shared

def is_safe_matrix(matrix):
    """Return True if no two rows are the same object."""
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if matrix[i] is matrix[j]:
                return False
    return True

Output:

Shared pairs: [(0, 2), (0, 4), (2, 4), (1, 3)]
Is safe: False
Fixed matrix safe: True

Why this matters in production:

Aliasing bugs are silent — they do not raise exceptions. The program runs, produces wrong results, and the bug is extremely hard to trace because the mutation happens in a completely different part of the code from where the corrupted data is read.

The fix pattern: list(row) creates a shallow copy of each row, breaking the alias. For deeply nested structures, use copy.deepcopy(). The is_safe_matrix check can be added as a debug assertion:

assert is_safe_matrix(my_matrix), "Matrix has aliased rows — mutation will corrupt data"

Performance note: The find_shared_references function is O(n^2) where n is the number of elements. For large structures, you can optimize by grouping elements by id():

from collections import defaultdict

def find_shared_fast(data):
    id_to_indices = defaultdict(list)
    for i, item in enumerate(data):
        id_to_indices[id(item)].append(i)
    return [(indices[i], indices[j])
            for indices in id_to_indices.values() if len(indices) > 1
            for i in range(len(indices)) for j in range(i + 1, len(indices))]

Starter Code

def find_shared_references(data):
    """Given a list of lists, find all pairs of indices that share
    the same inner list object (are aliases, not just equal).
    Return a list of (i, j) tuples where i < j and data[i] is data[j].
    """
    # TODO: implement
    pass

def is_safe_matrix(matrix):
    """Return True if no two rows in the matrix are the same object.
    A matrix is 'safe' if modifying one row cannot affect another.
    """
    # TODO: implement
    pass

Expected Output

Shared pairs: [(0, 2), (0, 4), (2, 4), (1, 3)]\nIs safe: False\nFixed matrix safe: True

Hints

Hint 1: Use the `is` operator to check object identity, not `==` which checks value equality.

Hint 2: Compare every pair (i, j) where i < j. If data[i] is data[j], they are aliases.

Hint 3: A matrix is safe if no pair of rows shares identity.

#8Cyclic Reference DetectionMedium

gc-modulecyclic-referencesgarbage-collection

Demonstrate the difference between objects with circular references (which require the cyclic GC) and objects without cycles (which are cleaned up by reference counting alone).

Python

import gc

def create_cycle():
    gc.disable()
    gc.collect()  # Clear any existing garbage first

    a = []
    b = []
    a.append(b)
    b.append(a)

    del a
    del b
    # a and b keep each other alive — refcount never reaches 0

    collected = gc.collect()
    gc.enable()
    return collected

def create_clean():
    gc.disable()
    gc.collect()

    a = [1, 2, 3]
    b = [4, 5, 6]
    c = a  # alias, no cycle

    del a
    del b
    del c
    # All refcounts reach 0 — cleaned up by reference counting

    collected = gc.collect()
    gc.enable()
    return collected

cycle_count = create_cycle()
print(f"Cycle collection: collected > 0: {cycle_count > 0}")

clean_count = create_clean()
print(f"Clean collection: collected == 0: {clean_count == 0}")

Solution

import gc

def create_cycle():
    gc.disable()
    gc.collect()  # Clear existing garbage

    a = []
    b = []
    a.append(b)   # a[0] -> b, so b.refcount = 2 (b, a[0])
    b.append(a)   # b[0] -> a, so a.refcount = 2 (a, b[0])

    del a          # a.refcount = 1 (only b[0])
    del b          # b.refcount = 1 (only a[0])
    # Neither reaches 0 — they hold each other alive

    collected = gc.collect()
    gc.enable()
    return collected

def create_clean():
    gc.disable()
    gc.collect()

    a = [1, 2, 3]
    b = [4, 5, 6]
    c = a

    del a   # refcount of [1,2,3] drops to 1 (just c)
    del b   # refcount of [4,5,6] drops to 0 — immediately deallocated
    del c   # refcount of [1,2,3] drops to 0 — immediately deallocated

    collected = gc.collect()
    gc.enable()
    return collected

Why the cyclic GC is necessary:

Reference counting is fast and deterministic — objects are freed the instant their refcount hits zero. But it cannot detect "islands" of objects that reference each other with no external references. The cyclic GC uses a "trial deletion" algorithm:

For every container object tracked by GC, temporarily subtract the internal references.
Objects whose adjusted count is zero are unreachable — they form an island.
Collect the entire island at once.

Performance implication: The cyclic GC runs periodically (controlled by gc.get_threshold()), introducing brief "stop-the-world" pauses. For most programs this is negligible. For latency-sensitive applications, you can tune or disable the GC — but only if you are confident no cycles exist.

Starter Code

import gc

def create_cycle():
    """Create a circular reference and return the count of objects
    collected when gc.collect() is called.
    """
    # TODO:
    # 1. Disable automatic GC
    # 2. Create two objects that reference each other
    # 3. Delete the external references
    # 4. Run gc.collect() and return the number of collected objects
    # 5. Re-enable GC
    pass

def create_clean():
    """Create objects WITHOUT circular references.
    Show that reference counting alone handles cleanup.
    Return the count from gc.collect() (should be 0 or very small).
    """
    # TODO: create and delete objects with no cycles
    pass

Expected Output

Cycle collection: collected > 0: True\nClean collection: collected == 0: True

Hints

Hint 1: Disable GC with gc.disable() before creating cycles so they are not automatically collected.

Hint 2: Two lists that append each other form a cycle: a.append(b); b.append(a)

Hint 3: gc.collect() returns the number of unreachable objects found and collected.

Hard

#9Weak Reference CacheHard

weakrefWeakValueDictionarycachegc

Implement a WeakCache that stores objects using weak references. Cached objects are automatically evicted when all strong references outside the cache are deleted.

Python

import weakref
import gc

class HeavyObject:
    """Simulates a large object that supports weak references."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"HeavyObject({self.name})"

class WeakCache:
    def __init__(self):
        self._store = weakref.WeakValueDictionary()

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value

    def size(self):
        return len(self._store)

    def contains(self, key):
        return key in self._store

cache = WeakCache()

obj_a = HeavyObject("alpha")
obj_b = HeavyObject("beta")
cache.put("alpha", obj_a)
cache.put("beta", obj_b)

print(f"After put: size={cache.size()}, contains alpha: {cache.contains('alpha')}")
print(f"Retrieved: {cache.get('alpha')}")

del obj_a
gc.collect()
print(f"After del + GC: size={cache.size()}, contains alpha: {cache.contains('alpha')}")
print(f"Beta still alive: {cache.contains('beta')}")

del obj_b
gc.collect()
print(f"After del beta + GC: size={cache.size()}")

Solution

import weakref
import gc

class HeavyObject:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"HeavyObject({self.name})"

class WeakCache:
    def __init__(self):
        self._store = weakref.WeakValueDictionary()

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value

    def size(self):
        return len(self._store)

    def contains(self, key):
        return key in self._store

How WeakValueDictionary works internally:

When you store cache["key"] = obj, the dictionary creates a weakref.ref(obj, callback) instead of a strong reference.
The callback is triggered when obj is garbage collected — it automatically removes the dictionary entry.
cache["key"] transparently dereferences the weak reference, returning the object if it is still alive or raising KeyError if it was collected.

Why basic types do not work: int, str, list, tuple, and dict are implemented in C without the __weakref__ slot that weak references require. Custom classes include __weakref__ by default. You can explicitly prevent it with __slots__ (without including __weakref__).

Production use case: Web frameworks use weak caches for expensive computations (template compilation, query plans). The cache speeds up repeated access, but does not prevent garbage collection when memory pressure is high. This is fundamentally different from functools.lru_cache, which holds strong references and requires an explicit maxsize to bound memory.

Starter Code

import weakref
import gc

class WeakCache:
    """A cache that allows entries to be garbage collected
    when no strong references remain outside the cache.
    
    Unlike a regular dict cache, this does not cause memory leaks
    for objects that are no longer in use.
    """

    def __init__(self):
        # TODO: initialize with a WeakValueDictionary
        pass

    def get(self, key):
        """Return cached object or None if not present / collected."""
        pass

    def put(self, key, value):
        """Store value in cache (weak reference)."""
        pass

    def size(self):
        """Return number of live entries."""
        pass

    def contains(self, key):
        """Check if key is in cache and object is still alive."""
        pass

Expected Output

After put: size=2, contains alpha: True\nRetrieved: HeavyObject(alpha)\nAfter del + GC: size=1, contains alpha: False\nBeta still alive: True\nAfter del beta + GC: size=0

Hints

Hint 1: weakref.WeakValueDictionary stores weak references as values. When the last strong reference to a value is deleted, the entry is automatically removed.

Hint 2: Basic types like int, str, list do not support weak references. Use a custom class as the cached object.

Hint 3: After deleting strong references, call gc.collect() to ensure the weak references are cleaned up.

#10Break the Cycle with weakrefHard

weakrefcircular-referencetreeparent-child

Implement a tree node that uses weak references for parent pointers. This prevents circular references between parent and child nodes, allowing the garbage collector to clean up entire subtrees when the root is deleted.

Python

import weakref
import gc

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self._parent_ref = None

    @property
    def parent(self):
        if self._parent_ref is None:
            return None
        return self._parent_ref()

    @parent.setter
    def parent(self, node):
        if node is None:
            self._parent_ref = None
        else:
            self._parent_ref = weakref.ref(node)

    def add_child(self, child_node):
        self.children.append(child_node)
        child_node.parent = self

    def depth(self):
        d = 0
        current = self.parent
        while current is not None:
            d += 1
            current = current.parent
        return d

    def path_to_root(self):
        path = [self.value]
        current = self.parent
        while current is not None:
            path.append(current.value)
            current = current.parent
        return path

# Build a tree
root = TreeNode("root")
child_a = TreeNode("child_a")
child_b = TreeNode("child_b")
grandchild = TreeNode("grandchild")

root.add_child(child_a)
root.add_child(child_b)
child_a.add_child(grandchild)

print(f"Root children: {len(root.children)}")
print(f"Child A parent: {child_a.parent.value}")
print(f"Grandchild parent: {grandchild.parent.value}")
print(f"Grandchild depth: {grandchild.depth()}")
print(f"Path to root: {grandchild.path_to_root()}")

# Delete the root — weak parent refs become None
del root
gc.collect()
print(f"After del root — grandchild parent alive: {grandchild.parent is not None}")
print(f"Grandchild still accessible: {grandchild.value}")

Solution

import weakref

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []
        self._parent_ref = None

    @property
    def parent(self):
        if self._parent_ref is None:
            return None
        return self._parent_ref()  # Call the weakref to get the object or None

    @parent.setter
    def parent(self, node):
        if node is None:
            self._parent_ref = None
        else:
            self._parent_ref = weakref.ref(node)

    def add_child(self, child_node):
        self.children.append(child_node)
        child_node.parent = self

    def depth(self):
        d = 0
        current = self.parent
        while current is not None:
            d += 1
            current = current.parent
        return d

    def path_to_root(self):
        path = [self.value]
        current = self.parent
        while current is not None:
            path.append(current.value)
            current = current.parent
        return path

Why weak parent references matter:

Without weak references, the parent-child relationship creates a reference cycle:

Parent holds strong references to children via self.children
Children hold strong references to parent via self.parent
Deleting the root variable does not free any node — they all keep each other alive

With weak parent references:

Parent holds strong references to children (ownership)
Children hold weak references to parent (observation only)
Deleting the root reduces its refcount to 0 (no strong references from children), which triggers immediate deallocation via reference counting
This cascades: root's deallocation removes strong references to children, reducing their refcounts

The _parent_ref() call pattern: A weakref.ref object is callable. Calling it returns the referenced object if it is still alive, or None if it has been garbage collected. This is why the parent property getter calls self._parent_ref() — the parentheses dereference the weak reference.

After del root: The grandchild's parent property returns None for child_a (assuming child_a was only referenced through the tree). But grandchild itself is still alive if we hold a direct reference to it. The weak reference cleanly breaks the cycle without losing the ability to navigate the tree while it exists.

Starter Code

import weakref
import gc
import sys

class TreeNode:
    """A tree node that uses weak references for parent pointers
    to avoid circular reference memory leaks.
    """

    def __init__(self, value):
        self.value = value
        self.children = []
        self._parent_ref = None  # weak reference to parent

    @property
    def parent(self):
        """Return the parent node, or None if no parent or parent was collected."""
        # TODO: dereference the weak reference
        pass

    @parent.setter
    def parent(self, node):
        """Set parent using a weak reference."""
        # TODO: store a weak reference to node
        pass

    def add_child(self, child_node):
        """Add a child and set its parent to self."""
        # TODO: add to children list and set parent
        pass

    def depth(self):
        """Return depth of this node (root = 0)."""
        # TODO: walk up parent chain
        pass

    def path_to_root(self):
        """Return list of values from this node to root."""
        # TODO: collect values walking up
        pass

Expected Output

Root children: 2\nChild A parent: root\nGrandchild parent: child_a\nGrandchild depth: 2\nPath to root: ['grandchild', 'child_a', 'root']\nAfter del root — grandchild parent alive: False\nGrandchild still accessible: grandchild

Hints

Hint 1: Store the parent as weakref.ref(node) in the setter. In the getter, call self._parent_ref() to dereference.

Hint 2: If the parent has been garbage collected, the weak reference returns None.

Hint 3: The depth and path_to_root methods must handle the case where parent returns None (the node is the root or parent was collected).

#11Memory Leak HunterHard

gc-modulememory-leakdebuggingget_referrers

Build a MemoryProfiler that takes snapshots of live objects, diffs them to find types that are growing unexpectedly, and traces referrers to diagnose what is keeping leaked objects alive. This is how real memory leak investigations work.

Python

import gc
import sys
from collections import Counter

class MemoryProfiler:
    def __init__(self):
        self._snapshots = []

    def snapshot(self, label=""):
        gc.collect()
        gc.collect()
        gc.collect()
        objects = gc.get_objects()
        counts = Counter(type(obj).__name__ for obj in objects)
        self._snapshots.append({"label": label, "counts": counts})

    def diff(self, idx_a=0, idx_b=-1):
        a = self._snapshots[idx_a]["counts"]
        b = self._snapshots[idx_b]["counts"]
        all_types = set(a.keys()) | set(b.keys())
        result = []
        for t in all_types:
            ca = a.get(t, 0)
            cb = b.get(t, 0)
            delta = cb - ca
            if delta != 0:
                result.append((t, ca, cb, delta))
        result.sort(key=lambda x: abs(x[3]), reverse=True)
        return result

    def find_referrers(self, obj, max_depth=2):
        gc.collect()
        referrers = gc.get_referrers(obj)
        results = []
        for ref in referrers:
            type_name = type(ref).__name__
            snippet = repr(ref)[:80]
            results.append((type_name, snippet))
        return results

    def detect_leaks(self, threshold=10):
        diffs = self.diff()
        return [(t, ca, cb, d) for t, ca, cb, d in diffs if d > threshold]

# --- Simulate a memory leak ---
class LeakyItem:
    def __init__(self, data):
        self.data = data

profiler = MemoryProfiler()
profiler.snapshot("baseline")
print(f"Snapshot 0 (baseline): types tracked > 0: {len(profiler._snapshots[0]['counts']) > 0}")

# Create a leak: store objects in a global list and "forget" about it
_leaked_storage = []
for i in range(500):
    _leaked_storage.append(LeakyItem("x" * 100))

print("Leaked 500 LeakyItem objects")
profiler.snapshot("after leak")
print(f"Snapshot 1 (after leak): types tracked > 0: {len(profiler._snapshots[1]['counts']) > 0}")

# Detect the leak
leaks = profiler.detect_leaks(threshold=100)
for type_name, before, after, delta in leaks:
    if type_name == "LeakyItem":
        print(f"Leak detected: {type_name} increased by {delta}")

# Find what is keeping a leaked item alive
referrers = profiler.find_referrers(_leaked_storage[0])
has_list = any(t == "list" for t, _ in referrers)
print(f"Referrers of a leaked item: found list referrer: {has_list}")

# Cleanup
_leaked_storage.clear()
profiler.snapshot("after cleanup")
cleanup_diffs = profiler.diff(1, 2)
leaky_delta = 0
for t, ca, cb, d in cleanup_diffs:
    if t == "LeakyItem":
        leaky_delta = d
        break
print(f"After cleanup: LeakyItem delta <= 0: {leaky_delta <= 0}")

Solution

import gc
from collections import Counter

class MemoryProfiler:
    def __init__(self):
        self._snapshots = []

    def snapshot(self, label=""):
        gc.collect()
        gc.collect()
        gc.collect()  # Three passes for thorough collection
        objects = gc.get_objects()
        counts = Counter(type(obj).__name__ for obj in objects)
        self._snapshots.append({"label": label, "counts": counts})

    def diff(self, idx_a=0, idx_b=-1):
        a = self._snapshots[idx_a]["counts"]
        b = self._snapshots[idx_b]["counts"]
        all_types = set(a.keys()) | set(b.keys())
        result = []
        for t in all_types:
            ca = a.get(t, 0)
            cb = b.get(t, 0)
            delta = cb - ca
            if delta != 0:
                result.append((t, ca, cb, delta))
        result.sort(key=lambda x: abs(x[3]), reverse=True)
        return result

    def find_referrers(self, obj, max_depth=2):
        gc.collect()
        referrers = gc.get_referrers(obj)
        results = []
        for ref in referrers:
            type_name = type(ref).__name__
            snippet = repr(ref)[:80]
            results.append((type_name, snippet))
        return results

    def detect_leaks(self, threshold=10):
        diffs = self.diff()
        return [(t, ca, cb, d) for t, ca, cb, d in diffs if d > threshold]

How this mirrors real memory leak debugging:

Baseline snapshot: Before the suspected leak, capture a count of all live objects by type.
Trigger the leak: Run the code path you suspect is leaking.
Second snapshot: Capture counts again.
Diff: Compare the two snapshots. Types with large positive deltas are potential leaks.
Find referrers: For a leaked object, gc.get_referrers() tells you exactly what is keeping it alive — usually an unbounded list, dict, or global cache.

Why three gc.collect() calls: CPython's GC uses generational collection (gen 0, gen 1, gen 2). A single gc.collect() only collects the youngest generation by default. Calling it three times ensures objects have been promoted through all generations and collected.

Production tools that use this pattern:

tracemalloc (stdlib) — tracks memory allocations with file/line info
objgraph — generates visual reference graphs
pympler — tracks object sizes and growth over time

Common leak sources found this way:

Global dictionaries that grow without eviction
Closures capturing large objects
Event listeners / callbacks registered but never unregistered
C extensions that increment refcount without decrementing

Starter Code

import gc
import sys
from collections import Counter

class MemoryProfiler:
    """A diagnostic tool that snapshots live objects and detects leaks
    by comparing snapshots over time.
    """

    def __init__(self):
        self._snapshots = []

    def snapshot(self, label=""):
        """Take a snapshot of live object counts by type.
        Store it with a label for later comparison.
        """
        # TODO:
        # 1. Force full GC collection
        # 2. Get all tracked objects
        # 3. Count by type name
        # 4. Store in self._snapshots
        pass

    def diff(self, idx_a=0, idx_b=-1):
        """Compare two snapshots and return types with changed counts.
        Returns list of (type_name, count_a, count_b, delta) sorted by |delta| descending.
        """
        # TODO: compute difference between two snapshots
        pass

    def find_referrers(self, obj, max_depth=2):
        """Find what is keeping obj alive.
        Returns a list of (type, repr_snippet) for each referrer.
        """
        # TODO: use gc.get_referrers()
        pass

    def detect_leaks(self, threshold=10):
        """Compare first and last snapshot.
        Return types whose count increased by more than threshold.
        """
        # TODO: filter diff results
        pass

Expected Output

Snapshot 0 (baseline): types tracked > 0: True\nLeaked 500 LeakyItem objects\nSnapshot 1 (after leak): types tracked > 0: True\nLeak detected: LeakyItem increased by 500\nReferrers of a leaked item: found list referrer: True\nAfter cleanup: LeakyItem delta <= 0: True

Hints

Hint 1: gc.collect() should be called before gc.get_objects() to ensure all collectable garbage is cleaned up first.

Hint 2: Use collections.Counter on type(obj).__name__ for all gc.get_objects() to build the snapshot.

Hint 3: The diff should show types that increased (potential leaks) and decreased (freed objects).

Hint 4: gc.get_referrers(obj) returns all objects that hold a reference to obj — useful for finding what is keeping a leaked object alive.

Practice: Memory References and Aliasing

Easy​

Medium​

Hard​

Easy

Medium

Hard