Reference Counting - How CPython Manages Memory at the C Level

Reading time: ~30 minutes | Level: Intermediate → Engineering

Before reading further, predict every output:

import sys

a = [1, 2, 3]
print(sys.getrefcount(a)) # ?

b = a
print(sys.getrefcount(a)) # ?

del b
print(sys.getrefcount(a)) # ?

Output:

2
3
2

Most engineers expect 1, 2, 1. The actual values are one higher than expected because sys.getrefcount(a) itself creates a temporary reference - passing a as an argument increments its reference count for the duration of the getrefcount call.

  • After a = [1, 2, 3]: one reference (a in your namespace). getrefcount adds one → 2
  • After b = a: two references (a and b). getrefcount adds one → 3
  • After del b: one reference (a again). getrefcount adds one → 2

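The rule: subtract 1 from sys.getrefcount()'s result to get the reference count as seen at the call site. Exact counts are a CPython implementation detail: the Python 3.14 release notes, for instance, say the interpreter may borrow references when loading values onto its stack, which can yield smaller counts than earlier versions.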

This is not a quirk - it reveals exactly how CPython's memory management works at the C level. Every Python object is a C struct with an ob_refcnt field. Every assignment, function call, and deletion modifies this field. Understanding it at this depth means understanding why objects sometimes live longer than expected, why __del__ is unreliable, why cycles require a separate garbage collector, and why weakref exists.

What You Will Learn

  • Every PyObject has ob_refcnt: the reference count field in C
  • How reference counts change: assignment, deletion, function calls, returns
  • sys.getrefcount() always adds 1 - why and how to account for it
  • Reading raw refcounts with ctypes (educational tool, not production code)
  • When refcount hits 0: tp_dealloc is called, memory returned immediately
  • Strengths of reference counting: deterministic destruction, immediate __del__ calls
  • Weaknesses: reference cycles are not collected by refcounting alone
  • Reference cycles: a.next = b; b.next = a - refcounts never reach zero
  • The weakref module: references that don't increment refcount
  • WeakValueDictionary and WeakKeyDictionary for caches that don't prevent GC
  • Why contextlib.contextmanager is safer than __del__ for resource cleanup

Prerequisites

  • Lesson 01: CPython Architecture - you need to understand PyObject at the C level
  • Lesson 03: Disassembly with dis - helpful for seeing LOAD/STORE operations that trigger refcount changes
  • Familiarity with Python's del statement and __del__ method

Part 1 - Every Object Is a PyObject

The C Structure

Every Python object in CPython is represented at the C level as a PyObject struct:

// From CPython's Include/object.h (simplified)
typedef struct _object {
    Py_ssize_t ob_refcnt;   // reference count
    PyTypeObject *ob_type;  // pointer to the type object
} PyObject;

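Every single Python object - integers, strings, lists, functions, classes, modules - starts with these two fields. ob_refcnt is a Py_ssize_t (a signed integer as wide as a pointer, so 64 bits on 64-bit systems). ob_type points to the object's type. This is the header of default builds; free-threaded builds use a different header layout.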

When you write a = [1, 2, 3] in Python, CPython:

  1. Allocates memory for a new list object
  2. Sets ob_refcnt = 1
  3. Sets ob_type to point to list
  4. Stores a pointer to the object in the name a in the local namespace

Why Reference Counting?

CPython chose reference counting for its simplicity and determinism:

  1. Immediate reclamation: when ob_refcnt drops to 0, memory is freed at that exact moment - no GC pause, no delay
  2. Deterministic __del__: finalizers are called immediately when the last reference drops
  3. Low overhead: incrementing/decrementing a counter is cheap - no periodic GC scans needed for simple cases
  4. Cache-friendly: small objects are freed and reused quickly, staying warm in CPU cache

The tradeoff: reference counting cannot collect cycles (objects that reference each other). Python's cyclic GC handles that separately (covered in Lesson 06).
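The "immediate reclamation" point can be checked in isolation. The sketch below switches the cyclic collector off and watches an object through a weak reference; the class name Payload is made up for illustration, and the immediate-free behavior is specific to CPython (other interpreters may reclaim the object later).

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any ordinary, acyclic object."""


gc.disable()                      # rule out the cyclic collector
try:
    p = Payload()
    probe = weakref.ref(p)        # observes p without keeping it alive
    alive_before = probe() is not None
    del p                         # the last strong reference goes away
    freed_without_gc = probe() is None
finally:
    gc.enable()

print(alive_before, freed_without_gc)
```

If both values are True, the object was reclaimed by reference counting alone, with no collector pass involved.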

Part 2 - How Reference Counts Change

The Six Operations That Change Refcounts

import sys

obj = object()
print(sys.getrefcount(obj) - 1) # 1 (subtract the getrefcount argument ref)

# 1. Assignment: +1
other = obj
print(sys.getrefcount(obj) - 1) # 2

# 2. Adding to a container: +1
lst = [obj]
print(sys.getrefcount(obj) - 1) # 3

# 3. Passing to a function: +1 (during the call)
def show_refcount(x):
    # Inside the call, the parameter x is another reference to obj.
    # (The exact value can differ by one across CPython versions.)
    print(sys.getrefcount(x) - 1)  # 4 (obj + other + lst[0] + x)

show_refcount(obj)
# After the call: x is gone, refcount back to 3
print(sys.getrefcount(obj) - 1) # 3

# 4. Removing from container: -1
lst.remove(obj)
print(sys.getrefcount(obj) - 1) # 2

# 5. del statement: -1
del other
print(sys.getrefcount(obj) - 1) # 1

# 6. Reassignment: -1 for old object
new_obj = object()
obj = new_obj # 'obj' no longer points to original object
# original object's refcount is now 0 → freed immediately

[Diagram: The Refcount Lifecycle]

Function Calls and Returns

Function calls are the highest-frequency refcount operations in typical Python programs:

import sys

def inspect_refcount(x, label):
    # x is another reference - adds 1
    count = sys.getrefcount(x) - 1  # subtract getrefcount's arg ref
    print(f"{label}: {count}")

mylist = [1, 2, 3]
inspect_refcount(mylist, "before call")  # typically 2 inside the function: 'mylist' + parameter 'x'

# After the function returns, the parameter 'x' is gone
# and the refcount drops back to 1. The second call prints the same value
# as the first, because the count is again measured inside the function.
inspect_refcount(mylist, "after call")

# Storing a return value
def get_list():
    result = [1, 2, 3]
    # result has refcount 1 (the local name 'result')
    return result  # refcount stays 1 - the reference is handed to the caller
    # 'result' local is removed from frame, but the object survives because
    # the caller's name immediately holds the reference

received = get_list()
# received now holds the reference - refcount = 1
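The other side of a return is that locals which are not returned lose their reference when the frame goes away. A minimal sketch, assuming CPython and using a made-up Buffer class, observes this through a weak reference:

```python
import weakref


class Buffer:
    """Hypothetical object that lives only inside one call."""


def work():
    buf = Buffer()
    # Return only a weak reference: it does not keep buf alive,
    # so the frame's local name is the sole strong reference.
    return weakref.ref(buf)


probe = work()
local_was_freed = probe() is None
print(local_was_freed)
```

On CPython the local is released as part of returning from work(), so the weak reference is already dead by the time the caller inspects it.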

Part 3 - sys.getrefcount and ctypes

sys.getrefcount Always Adds 1

The extra reference is not a bug - it is correct behavior:

import sys

a = "hello"
# When sys.getrefcount(a) is called:
# 1. Python evaluates the argument 'a' - this creates a reference (the argument slot)
# 2. getrefcount() receives the reference
# 3. getrefcount() increments ob_refcnt for the duration of the call
# 4. getrefcount() reads ob_refcnt and returns it
# 5. The argument reference is released - ob_refcnt decrements
# The returned value is one higher than the "true" count at the call site

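print(sys.getrefcount("hello") - 1) # subtract 1 to get the count at the call site

# Note: string literals are shared. The compiler stores "hello" as a constant and
# may intern it, so the count includes references you did not create yourself.
# On CPython 3.12+ (PEP 683), some shared objects are "immortal" and report a
# fixed, very large refcount instead of a real count.
print(sys.getrefcount("") - 1) # large - "" is shared across the whole interpreter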

Reading Raw Refcounts with ctypes (Educational Only)

For educational understanding, you can read the raw ob_refcnt field directly using ctypes:

import ctypes
import sys

a = [1, 2, 3]

# id(a) returns the memory address of the object
addr = id(a)

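# Read the first field of PyObject (ob_refcnt) as a Py_ssize_t.
# c_ssize_t matches Py_ssize_t on every platform (c_long is only 32 bits on Windows).
# Assumes a default (GIL) CPython build, where ob_refcnt is the first field.
raw_refcount = ctypes.c_ssize_t.from_address(addr).value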
getrefcount_result = sys.getrefcount(a)

print(f"Raw ob_refcnt via ctypes: {raw_refcount}")
print(f"sys.getrefcount(a): {getrefcount_result}")
print(f"getrefcount - raw: {getrefcount_result - raw_refcount}")
# Output will show getrefcount is exactly 1 higher than raw ob_refcnt

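:::danger Do Not Use ctypes.from_address in Production Code
Reading raw memory addresses bypasses all of Python's safety guarantees. CPython never moves objects, but the object at id(a) can be freed, and its address reused, between the id() call and the from_address() call if you don't hold a strong reference. The field layout is also an implementation detail that differs between builds (free-threaded builds, for example, do not start with ob_refcnt). This technique is for learning internals only. In production, use sys.getrefcount() and subtract 1.
:::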

Small Integer and String Interning

CPython maintains a cache of small integers and interns certain strings. These shared objects report very high refcounts; on CPython 3.12 and later (PEP 683) many of them are immortal and report a fixed sentinel value rather than a real count:

import sys

# Small integers (-5 to 256) are cached - refcount is very high
print(sys.getrefcount(0) - 1) # hundreds or thousands (a fixed huge value on 3.12+) - 0 is used everywhere
print(sys.getrefcount(1) - 1) # similarly high
print(sys.getrefcount(257) - 1) # small - not cached, but the code object's constants also reference it

# Interned strings also have high refcounts
print(sys.getrefcount("") - 1) # high - empty string is used pervasively
print(sys.getrefcount("hello") - 1) # depends on usage in the current session

This caching is why a = 1; b = 1; a is b returns True - a and b point to the same cached integer object.
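String interning can be observed the same way, through object identity. The sketch below builds two equal strings at runtime (so the compiler cannot share them as constants) and then interns both with sys.intern; the variable names are illustrative only.

```python
import sys

parts = ["ref", "count"]
s1 = "".join(parts)          # built at runtime, so not automatically shared
s2 = "".join(parts)

equal_value = s1 == s2       # same characters
same_object_before = s1 is s2

i1 = sys.intern(s1)          # both calls return the one interned copy
i2 = sys.intern(s2)
same_object_after = i1 is i2

print(equal_value, same_object_before, same_object_after)
```

Two equal strings start out as separate objects with separate refcounts; after interning, both names point at a single shared object, which is why interned strings tend to show high counts.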

Part 4 - When Refcount Hits Zero: tp_dealloc

Immediate Deallocation

When ob_refcnt decrements to 0, CPython calls the type's tp_dealloc function immediately - no delay, no queue:

class Tracked:
    def __init__(self, name):
        self.name = name
        print(f" Created: {self.name}")

    def __del__(self):
        print(f" Destroyed: {self.name}")

print("Creating a")
a = Tracked("A")

print("Creating b = a")
b = a

print("del a")
del a # refcount drops from 2 to 1 - NOT destroyed yet

print("del b")
del b # refcount drops from 1 to 0 - IMMEDIATELY destroyed here

print("After del b")

Output:

Creating a
Created: A
Creating b = a
del a
del b
Destroyed: A ← called IMMEDIATELY when del b executes
After del b

The destruction happens at the del b line, not at the end of the function, not at garbage collection time. This is deterministic destruction - a key advantage of reference counting over tracing GC.

What tp_dealloc Does

For a list object, list_dealloc (the C function) does:

  1. Decrements ob_refcnt of every element in the list (which may trigger their own deallocation cascade)
  2. Frees the internal array
  3. Returns the list object's memory to pymalloc

For a dict, a string, a custom class instance - each type has its own tp_dealloc that cleans up type-specific resources before returning memory.
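The deallocation cascade described for lists can be made visible from Python. This is a small sketch, assuming CPython; the Element class and the freed list are hypothetical helpers that only record which finalizers ran.

```python
freed = []


class Element:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        freed.append(self.name)


container = [Element("x"), Element("y")]
nothing_freed_yet = (freed == [])

# Dropping the only reference to the list deallocates it, which in turn
# releases its reference to each element and triggers their finalizers.
del container

print(sorted(freed))
```

The order in which the elements are released is an implementation detail, so the example sorts the names before printing.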

Part 5 - Reference Cycles: Refcounting's Weakness

The Cycle Problem

Reference counting has one fundamental weakness: it cannot collect objects involved in reference cycles.

import sys

a = []
b = []
a.append(b) # a holds reference to b
b.append(a) # b holds reference to a - cycle created

print(sys.getrefcount(a) - 1) # 2: local name 'a' + b[0]
print(sys.getrefcount(b) - 1) # 2: local name 'b' + a[0]

del a
# a's name is gone - but b[0] still holds a reference
# refcount of the list originally named 'a' is now 1

del b
# b's name is gone - but a[0] still holds a reference (a[0] IS the b list)
# refcount of the list originally named 'b' is now 1

# Both objects have refcount 1 - neither ever reaches 0
# Neither is ever freed by reference counting alone
# This is a memory leak (until the cyclic GC runs)

Self-Referential Objects

import sys

# The simplest possible cycle
a = []
a.append(a) # a contains itself

print(sys.getrefcount(a) - 1) # 2: name 'a' + a[0]
del a
# refcount drops to 1 - the list still holds a reference to itself
# it will NEVER reach 0 via refcounting
# Only the cyclic GC can collect this

Instance Cycles

Cycles are common with parent-child relationships in object graphs:

class Node:
    def __init__(self, val):
        self.val = val
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self          # child → parent reference
        self.children.append(child)  # parent → child reference
        # CYCLE: parent.children[i].parent is parent

root = Node("root")
child = Node("child")
root.add_child(child)

del root
del child
# Both objects still have refcount > 0 due to the cycle
# Memory is only reclaimed when the cyclic GC runs
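To watch a cycle survive reference counting and then be reclaimed by the cyclic collector, a weak reference can act as a probe. This is a minimal sketch, assuming CPython; automatic collection is disabled only so that the timing of the demo is predictable, and the partner attribute is a made-up name.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.partner = None


gc.disable()                       # keep automatic collection out of the demo
try:
    a, b = Node(), Node()
    a.partner, b.partner = b, a    # two-object cycle
    probe = weakref.ref(a)

    del a, b                       # both names gone, the cycle remains
    survived_refcounting = probe() is not None

    gc.collect()                   # run the cycle detector explicitly
    reclaimed_by_gc = probe() is None
finally:
    gc.enable()

print(survived_refcounting, reclaimed_by_gc)
```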

Part 6 - Weak References: References That Don't Count

The weakref Module

A weak reference points to an object without incrementing its ob_refcnt. If the only remaining references to an object are weak references, the object is freed:

import weakref
import sys

class MyClass:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"MyClass({self.name!r})"

obj = MyClass("example")
print(sys.getrefcount(obj) - 1) # 1 (just the 'obj' name)

# Create a weak reference - does NOT increment ob_refcnt
ref = weakref.ref(obj)
print(sys.getrefcount(obj) - 1) # still 1 - weak ref doesn't count

# Access the object through the weak reference
print(ref()) # MyClass('example') - dereference to get the live object

del obj # refcount drops to 0 - object freed IMMEDIATELY
print(ref()) # None - object is gone, weak ref returns None
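weakref.ref also accepts an optional callback that is invoked when the referent is about to be finalized. The sketch below, with a hypothetical Session class and an events list used purely for recording, shows that the callback receives the weak reference itself and that the referent is already unavailable at that point.

```python
import weakref


class Session:
    """Hypothetical class; any weak-referenceable object works."""


events = []


def on_dead(ref):
    # Called with the (now dead) weak reference object as its only argument.
    events.append(ref() is None)


s = Session()
watcher = weakref.ref(s, on_dead)   # keep 'watcher' alive, or the callback is dropped
del s                               # on CPython, the callback runs here

print(events)
```

The weak reference object must itself stay alive for the callback to fire, which is why the example stores it in watcher.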

WeakValueDictionary for Caches

The most practical use of weak references in production is WeakValueDictionary - a cache that does not prevent its values from being garbage collected:

import weakref

class ExpensiveObject:
    def __init__(self, key):
        self.key = key
        self.data = [0] * 10_000 # large data

    def __repr__(self):
        return f"ExpensiveObject({self.key!r})"

# Normal dict: keeps objects alive even if nothing else references them
strong_cache = {}
strong_cache["a"] = ExpensiveObject("a")
strong_cache["b"] = ExpensiveObject("b")
# Even if no code uses these objects, they stay in memory as long as
# strong_cache exists

# WeakValueDictionary: objects are freed when no strong references remain
weak_cache = weakref.WeakValueDictionary()

obj_a = ExpensiveObject("a")
obj_b = ExpensiveObject("b")
weak_cache["a"] = obj_a
weak_cache["b"] = obj_b

print(dict(weak_cache)) # {'a': ExpensiveObject('a'), 'b': ExpensiveObject('b')}

del obj_b
# obj_b's only strong reference was the name 'obj_b'
# refcount drops to 0 - freed immediately

import gc; gc.collect()
print(dict(weak_cache)) # {'a': ExpensiveObject('a')} - 'b' was cleaned up

WeakKeyDictionary

WeakKeyDictionary uses weak references for the keys - useful for attaching metadata to objects without preventing their collection:

import weakref

class Widget:
    pass

metadata = weakref.WeakKeyDictionary()

btn = Widget()
label = Widget()

metadata[btn] = {"type": "button", "text": "Click me"}
metadata[label] = {"type": "label", "text": "Hello"}

print(len(metadata)) # 2

del label
# label's refcount drops to 0 - freed
# WeakKeyDictionary automatically removes the entry

import gc; gc.collect()
print(len(metadata)) # 1 - entry for label was cleaned up automatically

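:::tip Use weakref.WeakValueDictionary for Caches That Shouldn't Prevent Garbage Collection
A WeakValueDictionary is a good backing store for caches of objects that are also held elsewhere in the application (object registries, identity maps, shared expensive instances). It prevents the cache from becoming a memory leak: an entry lives only as long as some other code holds a strong reference to the value, and the cache silently drops the entry afterwards. If the object is needed again, recompute and re-cache. Two caveats: values must support weak references (instances of ordinary classes do; int, str, list, dict, and tuple do not), and a result that nobody else holds is dropped right away, so this is not a substitute for a bounded strong cache such as functools.lru_cache.

import weakref
import functools

def weak_memoize(func):
    """Memoization that doesn't prevent GC of results."""
    cache = weakref.WeakValueDictionary()
    @functools.wraps(func)
    def wrapper(*args):
        result = cache.get(args)
        if result is None:
            # Hold the result in a local name before caching it. Writing
            # cache[args] = func(*args) and then returning cache[args] can
            # raise KeyError: the entry vanishes as soon as the temporary
            # result is released.
            result = func(*args)
            cache[args] = result
        return result
    return wrapper

:::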
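One practical limit of weak-valued caches is that not every type can be weakly referenced. The sketch below shows a plain list being rejected and a Python-level subclass being accepted; Numbers is a made-up class for illustration.

```python
import weakref

cache = weakref.WeakValueDictionary()

try:
    cache["numbers"] = [1, 2, 3]       # plain list: weak references not supported
    plain_list_ok = True
except TypeError:
    plain_list_ok = False


class Numbers(list):
    """Hypothetical subclass; instances of such subclasses can be weakly referenced."""


kept = Numbers([1, 2, 3])              # 'kept' is the strong reference
cache["numbers"] = kept
subclass_ok = cache.get("numbers") is kept

print(plain_list_ok, subclass_ok)
```

Wrapping built-in containers in a small class (or a list subclass, as here) is one way to make results cacheable in a WeakValueDictionary.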

weakref.finalize: Run Code When an Object Is Freed

weakref.finalize registers a callback to run when an object is garbage collected, without preventing collection:

import weakref

class Connection:
    def __init__(self, host):
        self.host = host
        print(f"Connected to {host}")

def on_connection_freed(host):
    print(f"Connection to {host} was freed - cleaning up")

conn = Connection("db.example.com")
weakref.finalize(conn, on_connection_freed, conn.host)

del conn # refcount → 0, object freed, finalize callback runs immediately
# Output: Connection to db.example.com was freed - cleaning up
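The object returned by weakref.finalize is also useful on its own: it exposes an alive attribute and can be called to run the cleanup early, after which it will not run again. A short sketch, with a hypothetical TempWorkspace class and an illustrative path string:

```python
import weakref


class TempWorkspace:
    """Hypothetical resource owner."""


calls = []


def cleanup(path):
    calls.append(path)


ws = TempWorkspace()
fin = weakref.finalize(ws, cleanup, "/tmp/demo-workspace")
alive_at_start = fin.alive

fin()                     # explicit early cleanup: runs the callback once
alive_after_call = fin.alive

del ws                    # the callback does not run a second time

print(calls)
```

This at-most-once guarantee lets the same finalizer serve as both an explicit close() path and a fallback for forgotten objects.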

Part 7 - __del__ and Resource Cleanup

__del__: The Finalizer

__del__ is called when an object's reference count reaches 0 (or when the cyclic GC collects it). For simple objects without cycles, this happens deterministically at the del point:

class FileWrapper:
    def __init__(self, path):
        self.path = path
        self._file = open(path, 'w')
        print(f"Opened {path}")

    def write(self, data):
        self._file.write(data)

    def __del__(self):
        if not self._file.closed:
            self._file.close()
            print(f"Closed {self.path}")

fw = FileWrapper("/tmp/test.txt")
fw.write("hello")
del fw # __del__ called immediately - file closed

Why __del__ Is Unreliable

__del__ is not guaranteed to be called in all situations:

# Problem 1: __del__ is NOT called on objects in cycles
# (until the cyclic GC runs - which may be never if GC is disabled)
class Node:
    def __init__(self):
        self.ref = None
    def __del__(self):
        print("Node deleted")

a = Node()
b = Node()
a.ref = b
b.ref = a # cycle - __del__ may never be called if GC is disabled

# Problem 2: __del__ is NOT reliably called at interpreter shutdown
# CPython clears global variables during shutdown; objects referenced only
# by globals may be in a partially torn-down state when __del__ runs

# Problem 3: Exceptions in __del__ are not propagated to the caller
class Buggy:
    def __del__(self):
        raise RuntimeError("Error in __del__") # reported on stderr, not raised
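Exceptions raised inside __del__ are routed to sys.unraisablehook (available since Python 3.8) instead of propagating. The sketch below temporarily installs a hook that records the exception type; record_unraisable and reported are illustrative names, not part of any library.

```python
import sys

reported = []


def record_unraisable(unraisable):
    # 'unraisable' carries exc_type, exc_value, exc_traceback, err_msg, object
    reported.append(type(unraisable.exc_value).__name__)


class Buggy:
    def __del__(self):
        raise RuntimeError("error in __del__")


previous_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    b = Buggy()
    del b                      # __del__ raises, but nothing propagates here
    reached_next_line = True
finally:
    sys.unraisablehook = previous_hook

print(reported, reached_next_line)
```

With the default hook the same error is only printed to stderr, which is easy to miss in production logs.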

:::warning __del__ Is Not Guaranteed to Run at Process Exit
At interpreter shutdown, the globals a __del__ method relies on (logging, database connections, file handles reached through module globals), including other modules, may already have been deleted or set to None. A __del__ that touches them can fail or silently do nothing, and some objects may never have __del__ called at all. Do not rely on __del__ for critical cleanup at process exit.

import logging

logger = logging.getLogger(__name__)

class Resource:
    def __del__(self):
        # UNRELIABLE: at process exit, logging module may already be torn down
        logger.info("Resource freed") # logger might be None!

:::

Context Managers Are Better

Use context managers for deterministic resource cleanup. They are explicit and composable, and their cleanup code runs when the with block exits, whether it exits normally or through an exception:

from contextlib import contextmanager

@contextmanager
def managed_resource(name):
    """Deterministic resource management - always releases, even on exceptions."""
    print(f"Acquiring {name}")
    resource = {"name": name, "active": True}
    try:
        yield resource
    finally:
        resource["active"] = False
        print(f"Released {name}") # ALWAYS runs, even if body raises

with managed_resource("database_connection") as conn:
    print(f"Using {conn['name']}")
    # raise RuntimeError("oops") # 'Released' still prints
# Output:
# Acquiring database_connection
# Using database_connection
# Released database_connection
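The "composable" part can be illustrated with contextlib.ExitStack, which manages a variable number of context managers and unwinds them in reverse order. The sketch below redefines a small managed_resource that records events in a list instead of printing; the resource names are made up.

```python
from contextlib import ExitStack, contextmanager

events = []


@contextmanager
def managed_resource(name):
    events.append(f"acquire {name}")
    try:
        yield name
    finally:
        events.append(f"release {name}")


try:
    with ExitStack() as stack:
        for name in ("db", "cache"):
            stack.enter_context(managed_resource(name))
        raise RuntimeError("oops")      # simulate a failure in the body
except RuntimeError:
    events.append("handled")

print(events)
```

Both resources are released, last acquired first, before the exception reaches the except clause.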

:::danger Circular References Involving __del__ in Python < 3.4 Could Leak Memory Forever
Before Python 3.4 (PEP 442), if an object with __del__ was part of a reference cycle, CPython could not collect it safely, because it had no way to pick a safe order in which to run the finalizers. These objects were placed in gc.garbage and never freed. This was a major source of memory leaks.

Python 3.4+ (PEP 442) fixed this: __del__ is now safe to call even on objects in cycles. But if you run Python < 3.4, or work with libraries that claim Python 2/3 compatibility, be aware that combining __del__ with cycles is dangerous.
:::

:::note del x Decrements Refcount; It Does Not Immediately Destroy the Object
del x removes the name x from the current namespace and decrements the referenced object's ob_refcnt by 1. If ob_refcnt drops to 0, the object is freed immediately. But if other references exist (another name, a list element, a function's local variable), the object lives on. del x only guarantees that x no longer names the object - nothing about the object's lifetime.

a = [1, 2, 3]
b = a
del a # ob_refcnt: 2 → 1 - list survives, accessible via b
print(b) # [1, 2, 3] - still alive
del b # ob_refcnt: 1 → 0 - list freed NOW

:::

Part 8 - Refcount in Practice

Memory Management Without a GC Pause

One practical advantage of reference counting is predictable memory usage. In a long-running server, acyclic objects are freed as soon as the last reference goes away, so garbage does not pile up waiting for a collector to run:

import sys

# transform() and store() are placeholders for your own functions
def process_records(records):
    """Process each record independently - no accumulation."""
    for record in records:
        # transformed has refcount 1 (the local name)
        transformed = transform(record) # refcount = 1
        store(transformed) # refcount temporarily 2 during call
        # After store() returns, if store didn't retain a reference:
        # transformed's refcount drops to 1
        # At next iteration, transformed is rebound - old object refcount → 0 → freed

# Memory usage stays bounded - each old object is freed as soon as the name is rebound

Compare this with a tracing GC (Java, Go, Ruby): unreachable objects accumulate until a collection cycle runs, so peak memory can sit well above the size of the live data, depending on the collector and its tuning.

Debugging Unexpected Memory Retention

When objects live longer than expected, sys.getrefcount can help find unexpected references:

import sys
import gc

class MyObject:
    pass

obj = MyObject()
expected_count = 1

# Why is refcount higher than expected?
count = sys.getrefcount(obj) - 1
if count > expected_count:
    print(f"Unexpected references: {count}")
    # Find who holds the references
    referrers = gc.get_referrers(obj)
    for referrer in referrers:
        print(f" Referenced by: {type(referrer).__name__}: {referrer!r}")
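In practice gc.get_referrers returns every direct referrer, including the namespace dict of the module that defines the object, so it helps to filter the result. A minimal sketch, where registry stands in for a forgotten module-level cache and Payload is a hypothetical class:

```python
import gc


class Payload:
    """Hypothetical object that is retained longer than expected."""


registry = {}                     # stands in for a forgotten cache
p = Payload()
registry["leak"] = p

# Keep only dict referrers, then check whether our suspect is among them.
holders = [r for r in gc.get_referrers(p) if isinstance(r, dict)]
found_registry = any(h is registry for h in holders)

print(found_registry, len(holders))
```

The number of dict holders depends on where the code runs (a module's globals usually count as one), so only the identity check against the suspect container is meaningful.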

Common Mistakes

Mistake 1 - Interpreting sys.getrefcount Without Subtracting 1

import sys

a = "unique_string_" + str(id({})) # unlikely to be interned
print(sys.getrefcount(a)) # prints 2 - beginner thinks there are 2 references!
# Wrong: there is only 1 reference (the name 'a'); the extra 1 is getrefcount's arg
print(sys.getrefcount(a) - 1) # correct: 1

Mistake 2 - Relying on __del__ for Critical Resource Cleanup

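import sqlite3  # stand-in database, used here only to make the example runnable

# Wrong: __del__ may not run, or may run in wrong state at shutdown
class DatabaseConnection:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")

    def __del__(self):
        self.conn.close() # UNRELIABLE

# Right: use a context manager
class DatabaseConnection:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")

    def query(self, sql):
        return self.conn.execute(sql).fetchall()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.conn.close() # runs when the with block exits, even on exceptions
        return False

with DatabaseConnection() as db:
    db.query("SELECT 1")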

Mistake 3 - Creating Cycles Unintentionally

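# Common cycle pattern: callback holds reference to object that holds callback
class EventEmitter:
    def __init__(self):
        self.handlers = []

    def on(self, handler):
        self.handlers.append(handler)

    def do_something(self):
        print("event handled")

# The setup lives inside a function so that my_handler is a real closure.
# At module level the handler would look up the global name 'emitter' instead,
# and 'del emitter' would break the chain.
def make_emitter():
    emitter = EventEmitter()

    def my_handler():
        emitter.do_something() # captures 'emitter' - creates cycle!
        # emitter → handlers → my_handler → closure cell → emitter

    emitter.on(my_handler)
    return emitter

emitter = make_emitter()
del emitter # refcount won't reach 0 due to cycle

# Fix: use weakref in the handler
import weakref

def make_emitter_without_cycle():
    emitter = EventEmitter()
    emitter_ref = weakref.ref(emitter)

    def my_handler():
        e = emitter_ref()
        if e is not None:
            e.do_something() # no cycle - weakref doesn't increment refcount

    emitter.on(my_handler)
    return emitter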

Mistake 4 - Using a Strong Cache When WeakValueDictionary Is Appropriate

# Wrong: cache holds objects alive even after all other references drop
# (load_user is a placeholder for your own loader)
_cache = {}

def get_user(user_id):
    if user_id not in _cache:
        _cache[user_id] = load_user(user_id) # user kept alive by cache forever
    return _cache[user_id]

# Right: cache drops entries when objects are no longer needed elsewhere
# (the loaded objects must support weak references, e.g. instances of a normal class)
import weakref

_cache = weakref.WeakValueDictionary()

def get_user(user_id):
    user = _cache.get(user_id)
    if user is None:
        user = load_user(user_id)
        _cache[user_id] = user
    return user

Graded Practice Challenges

Level 1 - Predict the Output

Question 1: What does this print?

import sys

x = [1, 2, 3]
y = x
z = [x, x]

print(sys.getrefcount(x) - 1)

Output: 4

The list [1, 2, 3] is referenced by:

  1. Name x
  2. Name y
  3. z[0]
  4. z[1]

So ob_refcnt = 4. sys.getrefcount(x) adds 1 for its own argument, returning 5. Subtract 1 → 4.

Question 2: What does this print, and when does "Destroyed" appear?

class Obj:
    def __init__(self, name): self.name = name
    def __del__(self): print(f"Destroyed {self.name}")

print("A")
a = Obj("alpha")
print("B")
b = a
print("C")
del a
print("D")
del b
print("E")

Output:

A
B
C
D
Destroyed alpha
E

  • After del a: refcount drops from 2 to 1. Object survives (b still holds it).
  • After del b: refcount drops from 1 to 0. __del__ called immediately.
  • "Destroyed alpha" appears between "D" and "E" - proving deterministic, immediate destruction.

Question 3: What does this print?

import weakref

class Node:
    pass

n = Node()
ref = weakref.ref(n)
print(ref() is n) # ?
del n
print(ref()) # ?

Output:

True
None

ref() dereferences the weak reference, returning the live Node object. ref() is n is True - same object. Before del n, the only strong reference was the name n; the weak reference did not count. After del n, the refcount reaches 0 and the object is freed, so ref() returns None.

Question 4: What does this print?

import sys

def f(x):
    return sys.getrefcount(x)

a = object()
print(f(a) - 1)

Output: 2

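Inside f, x is a reference to the same object as a. So ob_refcnt = 2 (name a in outer scope + parameter x in f). sys.getrefcount(x) adds 1 for its own argument → returns 3. Subtract 1 → 2. The exact number is version-dependent: some CPython versions print a value that differs by one, because of how arguments are passed between frames.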

Question 5: What does this print?

import sys

a = []
a.append(a) # self-reference

print(sys.getrefcount(a) - 1)

del a
# The list is now inaccessible from Python code
# but its refcount is still 1 (a[0] points to itself)
# What happens to it?

Output: 2

Before del a: ob_refcnt = 2 (name a + a[0] which is a itself). getrefcount adds 1 → returns 3. Subtract 1 → 2.

After del a: ob_refcnt drops from 2 to 1. The list still holds a reference to itself via a[0]. The refcount will never reach 0 through reference counting alone. The list is a memory leak until the cyclic garbage collector runs and detects the unreachable cycle. This is exactly the problem covered in Lesson 06.

Level 2 - Debug Challenge

Find and fix all issues:

import weakref

# Bug 1: cache that prevents GC of large objects
# (load_image and connect are placeholders for real I/O functions)
class ImageCache:
    def __init__(self):
        self._cache = {} # strong references - images never freed

    def get(self, path):
        if path not in self._cache:
            self._cache[path] = load_image(path)
        return self._cache[path]

# Bug 2: __del__ used for critical network cleanup
class NetworkClient:
    def __init__(self, host):
        self.socket = connect(host)

    def __del__(self):
        self.socket.close() # unreliable at shutdown

# Bug 3: unintentional cycle via callback
class Button:
    def __init__(self):
        self.on_click = None

def make_button():
    button = Button()
    def handler():
        print(f"Button at {id(button)} clicked") # captures button strongly
    button.on_click = handler
    # button → on_click → handler → button (closure captures button)
    return button

button = make_button()

# Bug 4: misreading sys.getrefcount
import sys
data = {"key": "value"}
print(f"References to data: {sys.getrefcount(data)}") # reports wrong number

Bug 1 - Strong cache prevents GC:

class ImageCache:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary() # images freed when not in use

    def get(self, path):
        img = self._cache.get(path)
        if img is None:
            img = load_image(path)
            self._cache[path] = img
        return img

Bug 2 - __del__ for critical cleanup:

class NetworkClient:
    def __init__(self, host):
        self.socket = connect(host)

    def close(self):
        self.socket.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close() # runs when the 'with' block exits, even on exceptions
        return False

with NetworkClient(host) as client:
    client.do_work()

Bug 3 - Cycle via closure capturing button:

def make_button():
    button = Button()
    button_ref = weakref.ref(button)

    def handler():
        b = button_ref() # dereference weakref - no cycle
        if b is not None:
            print(f"Button at {id(b)} clicked")

    button.on_click = handler
    return button

button = make_button()

Bug 4 - Misreading sys.getrefcount:

import sys
data = {"key": "value"}
# Always subtract 1 - getrefcount adds a temporary reference for its argument
print(f"References to data: {sys.getrefcount(data) - 1}")

Level 3 - Design Challenge

Design a RefTracker context manager that:

  1. On entry, records the id() and initial sys.getrefcount() of a given object
  2. On exit, reports whether the refcount changed
  3. Has a snapshot() method that logs the current refcount delta
  4. Raises a warning (not an error) if the refcount on exit is higher than on entry (potential leak)
  5. Works correctly with with RefTracker(obj) as tracker: ...
Reference Solution

import sys
import warnings
import weakref


class RefTracker:
    """
    Context manager that tracks reference count changes for an object.

    Useful for debugging unexpected memory retention in tests and profiling.

    Usage:
        obj = MyClass()
        with RefTracker(obj) as tracker:
            do_something_with(obj)
            tracker.snapshot("after do_something")
        # Prints report on exit
    """

    def __init__(self, obj, name: str | None = None):
        # Use weakref so RefTracker itself doesn't inflate the refcount
        # Note: not all objects are weakly referenceable (int, str are not)
        try:
            self._ref = weakref.ref(obj)
        except TypeError:
            # Fall back to strong reference for non-weakreferenceable objects.
            # This adds a constant +1 to every count, so deltas stay correct.
            self._ref = lambda: obj

        self._name = name or repr(obj)
        self._initial_count = None
        self._snapshots: list[tuple[str, int]] = []

    def _get_count(self) -> int | None:
        """Get current refcount, compensating for the tracker's own temporaries."""
        obj = self._ref()
        if obj is None:
            return None
        # getrefcount adds 1 for its argument + 1 for the local name 'obj'
        # We subtract 2 to compensate for both (CPython-specific; exact
        # counts can vary between interpreter versions)
        return sys.getrefcount(obj) - 2

    def __enter__(self):
        self._initial_count = self._get_count()
        return self

    def snapshot(self, label: str = "") -> int | None:
        """Record a refcount snapshot and print the delta."""
        count = self._get_count()
        if count is None:
            print(f" [{label}] Object '{self._name}' has been freed")
            self._snapshots.append((label, -1))
            return None

        delta = count - self._initial_count
        sign = "+" if delta >= 0 else ""
        print(f" [{label}] refcount={count} (delta={sign}{delta})")
        self._snapshots.append((label, count))
        return count

    def __exit__(self, exc_type, exc_val, exc_tb):
        final_count = self._get_count()

        print(f"\nRefTracker report for '{self._name}':")
        print(f" Initial refcount: {self._initial_count}")

        if final_count is None:
            print(" Final refcount: <freed>")
        else:
            delta = final_count - self._initial_count
            sign = "+" if delta >= 0 else ""
            print(f" Final refcount: {final_count} ({sign}{delta})")

            if delta > 0:
                warnings.warn(
                    f"RefTracker: '{self._name}' has {delta} more reference(s) "
                    f"on exit than entry. Possible memory leak.",
                    ResourceWarning,
                    stacklevel=2,
                )

        if self._snapshots:
            print(f" Snapshots: {len(self._snapshots)}")

        return False # don't suppress exceptions


# Usage
class SomeObject:
    def __init__(self, data):
        self.data = data

obj = SomeObject([1, 2, 3])

with RefTracker(obj, name="SomeObject") as tracker:
    tracker.snapshot("initial")

    extra_ref = obj # adds a reference
    tracker.snapshot("after extra_ref = obj")

    del extra_ref # removes it
    tracker.snapshot("after del extra_ref")

# Typical output (absolute counts may vary by CPython version):
# [initial] refcount=1 (delta=+0)
# [after extra_ref = obj] refcount=2 (delta=+1)
# [after del extra_ref] refcount=1 (delta=+0)
#
# RefTracker report for 'SomeObject':
# Initial refcount: 1
# Final refcount: 1 (+0)
# Snapshots: 3

Design decisions:

  • Uses weakref.ref so RefTracker itself does not inflate the refcount
  • Subtracts 2 from getrefcount result: one for the argument slot, one for the self._ref() temporary
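  • ResourceWarning (not RuntimeError) - a warning is appropriate for a diagnostic tool; Python ignores ResourceWarning by default, so run with -W default (or adjust the warnings filter) to see it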
  • Falls back to strong reference for non-weakly-referenceable objects (int, str, etc.)

Key Takeaways

  • Every Python object is a C PyObject struct with ob_refcnt (reference count) and ob_type (type pointer) as its first two fields
  • ob_refcnt increments on assignment, adding to a container, and passing as a function argument; decrements on del, removing from a container, and function return
  • sys.getrefcount(obj) generally returns a count that is 1 higher than the true count - the argument itself creates a temporary reference; subtract 1, and treat exact values as a CPython implementation detail
  • When ob_refcnt reaches 0, tp_dealloc is called immediately and memory is returned - this is deterministic destruction with no GC pause
  • Reference counting is fast and deterministic but cannot collect reference cycles - objects that reference each other in a loop will never reach ob_refcnt = 0
  • del x decrements the refcount by 1; it does not guarantee immediate destruction if other references exist
  • weakref.ref, WeakValueDictionary, and WeakKeyDictionary create references that do not increment ob_refcnt - essential for caches that should not prevent GC
  • __del__ is called immediately when refcount reaches 0 for non-cyclic objects but is unreliable at interpreter shutdown and for objects in cycles; use context managers (with / __exit__) for deterministic resource cleanup
  • Circular references involving __del__ in Python < 3.4 could leak memory forever (PEP 442 fixed this in Python 3.4+)
  • The cyclic garbage collector (Lesson 06) handles what reference counting cannot: detecting and collecting unreachable cycles

What's Next

Lesson 06 covers CPython's cyclic garbage collector - the generational, mark-and-sweep collector that handles reference cycles. You will learn how three-generation collection works, how to tune GC thresholds for batch workloads, why gc.freeze() exists for forking servers, and how to diagnose memory leaks with gc.get_referrers() and tracemalloc.

© 2026 EngineersOfAI. All rights reserved.