Python Bytecode Inspection Practice Problems & Exercises
Practice: Bytecode Inspection
Easy
Explore the __code__ attribute of a function. Print the key attributes of its code object.
def add_values(x, y):
    return x + y

code = add_values.__code__
print(f"Name: {code.co_name}")
print(f"Args: {code.co_argcount}")
print(f"Locals: {code.co_varnames}")
print(f"Constants: {code.co_consts}")
print(f"Stack size: {code.co_stacksize}")
print(f"Filename: {code.co_filename}")
Solution
def add_values(x, y):
    return x + y

code = add_values.__code__
print(f"Name: {code.co_name}")
print(f"Args: {code.co_argcount}")
print(f"Locals: {code.co_varnames}")
print(f"Constants: {code.co_consts}")
print(f"Stack size: {code.co_stacksize}")
print(f"Filename: {code.co_filename}")
The code object is the compiled form of a function body. The compiler creates it once, when the enclosing source is compiled. Executing the def statement only wraps that existing code object in a new function object, and every call shares it. Key attributes:
- co_name: the function's name as written in source
- co_argcount: number of positional parameters (excludes *args, **kwargs, and keyword-only parameters)
- co_varnames: tuple of local variable names; parameters come first, then any variables assigned inside the function
- co_consts: tuple of constant values referenced by the bytecode; on many CPython versions None sits at index 0, but the exact contents are an implementation detail
- co_stacksize: the maximum depth the evaluation stack reaches; the interpreter reserves this much stack space when the frame is created
- co_filename: the source file where the function was defined; useful for debugging and coverage tools
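To make the "compiled once, shared by every function object" point concrete, here is a minimal sketch. The factory `make_adder` is a hypothetical helper introduced only for illustration; it is not part of the exercise.

```python
def make_adder():
    # Each call to make_adder() executes this inner def statement again.
    def add_values(x, y):
        return x + y
    return add_values

first = make_adder()
second = make_adder()

# Two distinct function objects were created...
print(first is second)                    # False
# ...but both wrap the single code object produced at compile time.
print(first.__code__ is second.__code__)  # True
```

Running the def statement twice builds two function objects, yet the code object they share is the same one stored in `make_adder.__code__.co_consts`.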
Expected Output
Name: add_values
Args: 2
Locals: ('x', 'y')
Constants: (None,)
Stack size: 2
Filename: <string> or current file
Hints
Hint 1: Every Python function has a `__code__` attribute that is a code object. Access it with `func.__code__`.
Hint 2: Key attributes: `co_name` (function name), `co_argcount` (number of positional args), `co_varnames` (all local variable names including params), `co_consts` (constants), `co_stacksize` (max stack depth).
Write a function that separates a function's parameters from its non-parameter local variables using code object attributes.
def triangle_area(a, b, c):
    s = (a + b + c) / 2
    area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
    return area

code = triangle_area.__code__
params = code.co_varnames[:code.co_argcount]
non_params = list(code.co_varnames[code.co_argcount:])
print(f"Parameters: {code.co_argcount}")
print(f"Total locals: {len(code.co_varnames)}")
print(f"Non-param locals: {non_params}")
print(f"All local names: {code.co_varnames}")
Solution
def triangle_area(a, b, c):
    s = (a + b + c) / 2
    area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
    return area

code = triangle_area.__code__
params = code.co_varnames[:code.co_argcount]
non_params = list(code.co_varnames[code.co_argcount:])
print(f"Parameters: {code.co_argcount}")
print(f"Total locals: {len(code.co_varnames)}")
print(f"Non-param locals: {non_params}")
print(f"All local names: {code.co_varnames}")
Why the ordering in co_varnames matters:
CPython allocates a fixed-size array of local variable slots when a frame is created. The slot index corresponds to the position in co_varnames. Parameters are loaded via LOAD_FAST 0, LOAD_FAST 1, etc., and stored with STORE_FAST N.
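The consistent ordering (parameters first, then other locals roughly in order of first appearance) means that, for a function with only plain positional parameters such as this one, co_argcount is a clean boundary: everything before it is a parameter, everything from it onward is a body-local. Keyword-only parameters, *args, and **kwargs also occupy slots ahead of the body locals, so functions that use them need co_kwonlyargcount and co_flags to find the boundary.
For plain Python functions, this layout is also what inspect.signature() reads to reconstruct parameter information at runtime.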
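The slot layout described above can be sketched directly from co_varnames. The dictionary name `slots` is an illustrative choice, not something CPython exposes.

```python
def triangle_area(a, b, c):
    s = (a + b + c) / 2
    area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
    return area

code = triangle_area.__code__

# The slot index of each local is simply its position in co_varnames.
slots = {name: index for index, name in enumerate(code.co_varnames)}
print(slots)  # {'a': 0, 'b': 1, 'c': 2, 's': 3, 'area': 4}

# co_nlocals is the number of local slots the frame reserves.
print(code.co_nlocals == len(code.co_varnames))  # True
```

The index printed for each name is the operand that LOAD_FAST and STORE_FAST use for that variable.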
Expected Output
Parameters: 3
Total locals: 5
Non-param locals: ['s', 'area']
All local names: ('a', 'b', 'c', 's', 'area')
Hints
Hint 1: Parameters always appear first in `co_varnames`. The slice `co_varnames[:co_argcount]` gives you only the parameter names.
Hint 2: `co_nlocals` gives the number of local variable slots; for a function like this one it equals `len(co_varnames)`. Non-param locals are `co_varnames[co_argcount:]`.
Inspect the closure-related code object attributes. Show how CPython represents captured variables using cell and free variable lists.
def outer(x):
    def inner():
        return x * 2
    return inner

fn = outer(10)
outer_code = outer.__code__
inner_code = fn.__code__
print(f"outer co_cellvars: {outer_code.co_cellvars}")
print(f"outer co_freevars: {outer_code.co_freevars}")
print(f"inner co_freevars: {inner_code.co_freevars}")
print(f"inner co_cellvars: {inner_code.co_cellvars}")
print(f"closure cells: {fn.__closure__}")
print(f"current value of x: {fn.__closure__[0].cell_contents}")
Solution
def outer(x):
    def inner():
        return x * 2
    return inner

fn = outer(10)
outer_code = outer.__code__
inner_code = fn.__code__
print(f"outer co_cellvars: {outer_code.co_cellvars}")
print(f"outer co_freevars: {outer_code.co_freevars}")
print(f"inner co_freevars: {inner_code.co_freevars}")
print(f"inner co_cellvars: {inner_code.co_cellvars}")
print(f"closure cells: {fn.__closure__}")
print(f"current value of x: {fn.__closure__[0].cell_contents}")
How CPython implements closures:
When the compiler detects that a variable in outer is referenced in inner, it marks that variable as a cell variable in outer (co_cellvars) and a free variable in inner (co_freevars).
At runtime, CPython wraps the variable in a cell object — a simple box with a cell_contents pointer. Both outer's frame and inner's closure point to the SAME cell object. This is what makes closures "capture by reference": if outer reassigns x, the cell's pointer updates, and inner sees the new value.
fn.__closure__ is a tuple of cell objects corresponding to fn.__code__.co_freevars (same index). Cell objects are also how nonlocal is implemented: nonlocal x in an inner function tells the compiler to use LOAD_DEREF/STORE_DEREF (cell access) instead of LOAD_FAST/STORE_FAST (local access).
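A small sketch of "capture by reference" through a cell, using nonlocal. The names `make_counter` and `increment` are hypothetical and chosen only for this illustration.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count   # compiled to cell access rather than a fast local
        count += 1
        return count

    return increment

inc = make_counter()
cell = inc.__closure__[0]          # the cell shared with make_counter's frame

print(inc.__code__.co_freevars)    # ('count',)
print(cell.cell_contents)          # 0
inc()
inc()
print(cell.cell_contents)          # 2
```

The cell object stays the same across calls; only what it points to changes, which is why the closure observes every rebinding.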
Expected Output
outer co_cellvars: ('x',)
outer co_freevars: ()
inner co_freevars: ('x',)
inner co_cellvars: ()
closure cells: (<cell at 0x...>,)
current value of x: 10
Hints
Hint 1: When a nested function captures a variable from an enclosing scope, the enclosing function lists it in `co_cellvars` and the inner function lists it in `co_freevars`.
Hint 2: Cell objects are wrappers that allow two code objects (inner and outer) to share a mutable binding. Access the cell value with `cell.cell_contents`.
Predict what co_consts will contain for each function before running the code.
import dis

def f1(x):
    return x + 1

def f2(x):
    return x + 2 * 3  # Does CPython fold 2*3?

def f3(x, y):
    return "result: " + str(x + y)

def f4():
    return [1, 2, 3]

for fn in [f1, f2, f3, f4]:
    print(f"{fn.__name__}: co_consts = {fn.__code__.co_consts}")
Solution
import dis

def f1(x):
    return x + 1

def f2(x):
    return x + 2 * 3

def f3(x, y):
    return "result: " + str(x + y)

def f4():
    return [1, 2, 3]

for fn in [f1, f2, f3, f4]:
    print(f"{fn.__name__}: co_consts = {fn.__code__.co_consts}")
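Expected output (typical for CPython 3.9 through 3.13; the exact contents vary by version):
f1: co_consts = (None, 1)
f2: co_consts = (None, 6)
f3: co_consts = (None, 'result: ')
f4: co_consts = (None, (1, 2, 3))
On CPython 3.8 and earlier, f4 shows (None, 1, 2, 3) instead.
Explanation:
- f1: None (implicit return placeholder) plus 1 (the literal).
- f2: CPython's compile-time optimizer folds 2 * 3 to 6. The constants 2 and 3 never appear separately; only 6 does. This is constant folding. (Before Python 3.7 it ran in the peephole optimizer; since then it runs on the AST.)
- f3: The string literal "result: " is a constant. str (the builtin) and x + y are NOT constants: str is a global name lookup and x + y is a runtime computation.
- f4: The list [1, 2, 3] is NOT a constant (lists are mutable, so they cannot be embedded as constants). On 3.8 and earlier the individual elements 1, 2, 3 are constants and BUILD_LIST assembles them at runtime. On 3.9 and later the compiler stores the elements as one tuple constant (1, 2, 3) and extends a fresh empty list from it with LIST_EXTEND.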
Key insight: co_consts is not just "what literals you wrote." The optimizer transforms your source, and only immutable values can be constants. Tuples of constants CAN appear in co_consts (since tuples are immutable), but lists cannot.
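The following sketch checks the two claims above (folding, and tuples versus lists) without depending on the exact layout of co_consts. The function names are hypothetical, and larger integers are used on the assumption that some CPython versions treat very small integers specially.

```python
def seconds_per_day():
    return 60 * 60 * 24           # folded to a single int at compile time

def as_tuple():
    return (1000, 2000, 3000)     # tuple of constants: stored whole

def as_list():
    return [1000, 2000, 3000]     # list: rebuilt on every call

print(86400 in seconds_per_day.__code__.co_consts)        # True
print((1000, 2000, 3000) in as_tuple.__code__.co_consts)  # True

# No list object is ever stored in co_consts.
print(any(isinstance(c, list) for c in as_list.__code__.co_consts))  # False

# The tuple is one shared constant; the list is a fresh object each call.
print(as_tuple() is as_tuple())   # True
print(as_list() is as_list())     # False
```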
Expected Output
See solution: focus on which values appear and why
Hints
Hint 1: co_consts usually includes None (on many CPython versions at index 0), which serves as the implicit return value. Then come any literal constants from the function body. The exact layout is an implementation detail.
Hint 2: CPython performs constant folding at compile time: arithmetic on two constants is evaluated immediately. So `2 * 3` in source becomes `6` in co_consts, not `2` and `3` separately.
Medium
Write a function that reads a .pyc file and extracts its header fields and the top-level code object using the marshal module.
import marshal
import struct
import py_compile
import os

def read_pyc(pyc_path):
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
        flags = struct.unpack("<I", f.read(4))[0]
        # The next two fields assume a timestamp-based .pyc (flags == 0).
        # In a hash-based .pyc these 8 bytes hold a source hash instead.
        timestamp = struct.unpack("<I", f.read(4))[0]
        source_size = struct.unpack("<I", f.read(4))[0]
        code_bytes = f.read()
    code_obj = marshal.loads(code_bytes)
    return {
        "magic": magic.hex(),
        "flags": flags,
        "timestamp": timestamp,
        "source_size": source_size,
        "code_object": code_obj,
    }

# Create and compile a test module
with open("_test_mod.py", "w") as f:
    f.write("ANSWER = 42\ndef greet(name):\n    return f'Hello, {name}'\n")
pyc_path = py_compile.compile("_test_mod.py")
result = read_pyc(pyc_path)
print(f"magic: {result['magic']}")
print(f"timestamp: {result['timestamp']}")
print(f"source_size: {result['source_size']} bytes")
print(f"code object name: {result['code_object'].co_name}")
print(f"constants count: {len(result['code_object'].co_consts)}")
print(f"constants: {result['code_object'].co_consts}")

# Clean up
os.remove("_test_mod.py")
os.remove(pyc_path)
try:
    os.rmdir("__pycache__")  # succeeds only if the directory is now empty
except OSError:
    pass
Solution
import marshal
import struct
import py_compile
import os

def read_pyc(pyc_path):
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
        flags = struct.unpack("<I", f.read(4))[0]
        # The next two fields assume a timestamp-based .pyc (flags == 0).
        # In a hash-based .pyc these 8 bytes hold a source hash instead.
        timestamp = struct.unpack("<I", f.read(4))[0]
        source_size = struct.unpack("<I", f.read(4))[0]
        code_bytes = f.read()
    code_obj = marshal.loads(code_bytes)
    return {
        "magic": magic.hex(),
        "flags": flags,
        "timestamp": timestamp,
        "source_size": source_size,
        "code_object": code_obj,
    }

with open("_test_mod.py", "w") as f:
    f.write("ANSWER = 42\ndef greet(name):\n    return f'Hello, {name}'\n")
pyc_path = py_compile.compile("_test_mod.py")
result = read_pyc(pyc_path)
print(f"magic: {result['magic']}")
print(f"timestamp: {result['timestamp']}")
print(f"source_size: {result['source_size']} bytes")
print(f"code object name: {result['code_object'].co_name}")
print(f"constants count: {len(result['code_object'].co_consts)}")
print(f"constants: {result['code_object'].co_consts}")

os.remove("_test_mod.py")
os.remove(pyc_path)
try:
    os.rmdir("__pycache__")  # succeeds only if the directory is now empty
except OSError:
    pass
What the marshalled code object contains:
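The marshal module serializes Python objects to a compact binary format. Only a restricted set of built-in types can be marshalled: None, bool, int, float, complex, bytes, str, tuple, list, set, frozenset, dict, and code objects (containers only if everything inside them is also supported). Instances of custom classes cannot be marshalled. The format is specific to the Python version, which is why .pyc files carry a magic number.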
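A quick round trip illustrates what marshal accepts and what it rejects. The `payload` contents and the `Custom` class are made up for this sketch.

```python
import marshal

payload = {
    "numbers": [1, 2, 3],
    "letters": frozenset({"a", "b"}),
    "nested": (None, 2.5, b"raw"),
}

# Built-in containers of supported types survive a dumps/loads round trip.
restored = marshal.loads(marshal.dumps(payload))
print(restored == payload)  # True

class Custom:
    pass

# Instances of user-defined classes are rejected.
rejected = False
try:
    marshal.dumps(Custom())
except ValueError as exc:
    rejected = True
    print(f"not marshallable: {exc}")
```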
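The top-level code object's co_consts typically contains:
- None: the module's implicit return value
- 42: the module-level constant from ANSWER = 42
- A nested code object for the greet function: code objects are constants from the module's perspective
- 'greet': the function's qualified name, on CPython 3.10 and earlier where MAKE_FUNCTION takes it from the stack; on 3.11 and later the name is stored in the code object's co_qualname instead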
Why co_consts contains nested code objects: When Python encounters a def statement, the bytecode does LOAD_CONST to push the pre-compiled code object for the inner function, then MAKE_FUNCTION to wrap it in a function object. The code object itself is a compile-time constant embedded in the parent's co_consts.
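As a variation on the reader above, this sketch checks the flags word before interpreting the rest of the header and then lists the nested code objects. It assumes the 16-byte header used since Python 3.7; `read_pyc_header`, `SOURCE`, and the file names are hypothetical.

```python
import importlib.util
import marshal
import os
import py_compile
import struct
import tempfile
import types

SOURCE = "ANSWER = 42\ndef greet(name):\n    return f'Hello, {name}'\n"

def read_pyc_header(pyc_path):
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
        (flags,) = struct.unpack("<I", f.read(4))
        if flags & 0x1:
            # Hash-based .pyc: the next 8 bytes are a hash of the source.
            info = {"source_hash": f.read(8).hex()}
        else:
            # Timestamp-based .pyc: source mtime, then source size.
            mtime, size = struct.unpack("<II", f.read(8))
            info = {"timestamp": mtime, "source_size": size}
        code = marshal.loads(f.read())
    return magic, flags, info, code

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo_mod.py")
    with open(src, "w") as f:
        f.write(SOURCE)
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "demo_mod.pyc"))
    magic, flags, info, code = read_pyc_header(pyc)

print(magic == importlib.util.MAGIC_NUMBER)  # True for the running interpreter
print(info)
nested = [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]
print(nested)  # ['greet']
```

Using a temporary directory avoids having to clean up `__pycache__` by hand.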
import marshal
import struct
import py_compile
import os

def read_pyc(pyc_path):
    """Read a .pyc file and return:
    - magic_number (hex string)
    - timestamp (int)
    - source_size (int)
    - the deserialized code object
    """
    pass
Expected Output
magic: <hex bytes>
timestamp: <unix timestamp>
source_size: <bytes>
code object name: <module>
constants count: <N>
constants: <tuple of constants>
Hints
Hint 1: Since Python 3.7, a .pyc file has a 16-byte header: 4 bytes magic number, 4 bytes flags, then (for the default timestamp-based format) 4 bytes source timestamp and 4 bytes source size. After the header comes the marshalled code object.
Hint 2: Use `struct.unpack("<I", data)` to read little-endian unsigned 32-bit integers. Then call `marshal.loads()` on the remaining bytes to get the code object.
Extract the mapping from bytecode offsets to source line numbers using dis.findlinestarts(). Show how CPython knows which line number to report in tracebacks.
import dis

def multi_line_function(x):
    # line 1 of body
    y = x * 2      # line 2
    z = y + 1      # line 3
    if z > 10:     # line 4
        return z   # line 5
    return y       # line 6

code = multi_line_function.__code__
print(f"Function: {code.co_name}")
print(f"First line: {code.co_firstlineno}")
print()
print("Offset -> Line number mapping:")
print("-" * 30)
for offset, lineno in dis.findlinestarts(code):
    print(f" offset {offset:>3} -> line {lineno}")
print()
print("Full disassembly with line numbers:")
dis.dis(multi_line_function)
Solution
import dis

def multi_line_function(x):
    y = x * 2
    z = y + 1
    if z > 10:
        return z
    return y

code = multi_line_function.__code__
print(f"Function: {code.co_name}")
print(f"First line: {code.co_firstlineno}")
print()
print("Offset -> Line number mapping:")
print("-" * 30)
for offset, lineno in dis.findlinestarts(code):
    print(f" offset {offset:>3} -> line {lineno}")
print()
print("Full disassembly with line numbers:")
dis.dis(multi_line_function)
How CPython uses the line table:
The line number table is a compressed mapping from bytecode offsets to source line numbers. CPython uses it in three key situations:
- Tracebacks: When an exception is raised, CPython reads the current frame's program counter (offset into bytecode), looks it up in the line table, and uses the result to print File "x.py", line N.
- Debuggers (pdb): The sys.settrace "line" event fires when the line number changes. CPython checks the line table to detect line boundaries.
- Coverage tools: coverage.py hooks into the trace mechanism and uses line numbers to mark which lines were executed.
Python version note: Python 3.10 introduced co_linetable as the replacement for co_lnotab (PEP 626). Python 3.11 changed its encoding to also store end lines and column offsets, which tracebacks use to point at the exact failing expression (PEP 657). dis.findlinestarts() handles these formats transparently.
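The lookup a traceback performs can be approximated with dis.findlinestarts(). This is a simplified sketch, not CPython's internal routine; `line_for_offset` and `sample` are hypothetical names, and the last-instruction offset assumes the 2-byte instruction units used since Python 3.6.

```python
import dis

def line_for_offset(code, offset):
    """Return the source line active at a bytecode offset (None if unknown)."""
    current = None
    # findlinestarts yields (offset, line) pairs in increasing offset order.
    for start, lineno in dis.findlinestarts(code):
        if start > offset:
            break
        current = lineno
    return current

def sample(x):
    y = x * 2
    return y + 1

code = sample.__code__
last_offset = len(code.co_code) - 2   # offset of the final instruction

print(line_for_offset(code, 0))            # the def line or the first body line
print(line_for_offset(code, last_offset))  # the line of "return y + 1"
```

The idea is the same one the interpreter relies on: find the last line-table entry that starts at or before the current offset.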
Expected Output
See solution for line-to-offset mapping
Hints
Hint 1: In Python 3.9 and earlier, `co_lnotab` stores line number changes as pairs of bytes: (bytecode offset increment, line number increment). Python 3.10 added `co_linetable`, and 3.11 changed its format to include column information.
Hint 2: Use `dis.findlinestarts(code)` to get (offset, line_number) pairs in a portable way across Python versions.
Compare the code objects of a lambda and an equivalent def function. Identify which attributes differ and which are the same.
import dis

square_lambda = lambda x: x * x

def square_def(x):
    return x * x

def compare_code_objects(func1, func2):
    c1 = func1.__code__
    c2 = func2.__code__
    attrs = [
        "co_argcount", "co_varnames", "co_consts",
        "co_names", "co_stacksize", "co_flags",
    ]
    print(f"{'Attribute':<20} {'lambda':>25} {'def':>25} {'Match'}")
    print("-" * 75)
    for attr in attrs:
        v1 = getattr(c1, attr)
        v2 = getattr(c2, attr)
        match = "YES" if v1 == v2 else "NO "
        print(f"{attr:<20} {str(v1):>25} {str(v2):>25} {match}")
    # Compare the raw bytecode. Instruction objects from dis.get_instructions()
    # also carry line numbers, which differ between the two definitions.
    code_match = c1.co_code == c2.co_code
    print(f"\nco_name match: {c1.co_name == c2.co_name} ({c1.co_name!r} vs {c2.co_name!r})")
    print(f"Bytecode identical: {code_match}")
    return code_match

compare_code_objects(square_lambda, square_def)
Solution
import dis

square_lambda = lambda x: x * x

def square_def(x):
    return x * x

def compare_code_objects(func1, func2):
    c1 = func1.__code__
    c2 = func2.__code__
    attrs = [
        "co_argcount", "co_varnames", "co_consts",
        "co_names", "co_stacksize", "co_flags",
    ]
    print(f"{'Attribute':<20} {'lambda':>25} {'def':>25} {'Match'}")
    print("-" * 75)
    for attr in attrs:
        v1 = getattr(c1, attr)
        v2 = getattr(c2, attr)
        match = "YES" if v1 == v2 else "NO "
        print(f"{attr:<20} {str(v1):>25} {str(v2):>25} {match}")
    # Compare the raw bytecode. Instruction objects from dis.get_instructions()
    # also carry line numbers, which differ between the two definitions.
    code_match = c1.co_code == c2.co_code
    print(f"\nco_name match: {c1.co_name == c2.co_name} ({c1.co_name!r} vs {c2.co_name!r})")
    print(f"Bytecode identical: {code_match}")
    return code_match

compare_code_objects(square_lambda, square_def)
Key differences between lambda and def:
- co_name: A lambda gets the name "<lambda>". A def gets the function's actual name. This affects repr(), tracebacks, and __name__. Position metadata such as co_firstlineno also differs, simply because the two definitions sit on different source lines.
- co_flags: Usually identical. Both are regular functions (no generators, no coroutines, no cells).
- co_varnames, co_argcount, co_consts, co_stacksize: Identical. The same parameter x, the same constants, the same stack depth needed.
- Bytecode: Identical. The body is essentially LOAD_FAST 0, LOAD_FAST 0, a multiply, RETURN_VALUE; the exact opcode names vary by CPython version (for example, 3.11+ adds a leading RESUME and uses BINARY_OP).
The conclusion: lambda x: x * x and def square(x): return x * x compile to identical bytecode. The only difference is the name assigned to the code object. Lambda is purely syntactic sugar for a one-expression anonymous function — it is not a different kind of function at the interpreter level.
import dis

def compare_code_objects(func1, func2):
    """Compare two functions' code objects side by side.
    Show matching and differing attributes.
    Return True if they are semantically equivalent.
    """
    pass
Expected Output
See solution for attribute comparison table
Hints
Hint 1: Two code objects are "equivalent" if they have the same bytecode (co_code), constants (co_consts), names (co_names), variable names (co_varnames), and stack size (co_stacksize).
Hint 2: Note that co_name and co_filename may differ even for semantically identical functions. The function body bytecode (co_code) is the most reliable equivalence check.
Inspect a generator function's code object. Identify the CO_GENERATOR flag and explain how CPython distinguishes generators from regular functions at the bytecode level.
import dis

def regular_function(n):
    return list(range(n))

def generator_function(n):
    for i in range(n):
        yield i

CO_GENERATOR = 0x20
CO_COROUTINE = 0x80
CO_ASYNC_GENERATOR = 0x200

for fn in [regular_function, generator_function]:
    code = fn.__code__
    is_gen = bool(code.co_flags & CO_GENERATOR)
    print(f"--- {fn.__name__} ---")
    print(f" co_flags (hex): {hex(code.co_flags)}")
    print(f" CO_GENERATOR set: {is_gen}")
    print(f" co_stacksize: {code.co_stacksize}")
    print()

print("Generator disassembly (look for YIELD_VALUE):")
dis.dis(generator_function)
Solution
import dis

def regular_function(n):
    return list(range(n))

def generator_function(n):
    for i in range(n):
        yield i

CO_GENERATOR = 0x20

for fn in [regular_function, generator_function]:
    code = fn.__code__
    is_gen = bool(code.co_flags & CO_GENERATOR)
    print(f"--- {fn.__name__} ---")
    print(f" co_flags (hex): {hex(code.co_flags)}")
    print(f" CO_GENERATOR set: {is_gen}")
    print(f" co_stacksize: {code.co_stacksize}")
    print()

print("Generator disassembly (look for YIELD_VALUE):")
dis.dis(generator_function)
How CPython implements generators:
When the Python compiler encounters a yield expression anywhere in a function body, it sets the CO_GENERATOR bit in co_flags. This single flag changes everything:
- On call: Instead of executing the function body, CPython creates a generator object and returns it before any of the body runs.
- The generator object stores the code object, the frame (which holds local variables and the program counter), and internal state (created, running, suspended, closed).
- On next(): CPython resumes execution from where the YIELD_VALUE instruction last suspended the frame.
- YIELD_VALUE opcode: Pops the top of the stack, hands it to the caller as the yielded value, and suspends the frame with its state intact.
- RESUME opcode (Python 3.11+): A marker emitted at function entry and after each yield point, which the interpreter uses for tracing and interrupt checks. In 3.11+ the generator object itself is created by a RETURN_GENERATOR instruction at the start of the code.
The generator's frame persists between next() calls — this is the key difference from regular function calls, where the frame is created and destroyed on every call.
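The persistent frame can be observed from Python with the inspect module. This is a small sketch; the generator `countdown` is a hypothetical example.

```python
import inspect

def countdown(n):
    while n > 0:
        yield n
        n -= 1

# The flag constants are also available by name in the inspect module.
print(bool(countdown.__code__.co_flags & inspect.CO_GENERATOR))  # True

gen = countdown(3)
print(inspect.getgeneratorstate(gen))   # GEN_CREATED: no body code has run

first = next(gen)
print(first)                            # 3
print(inspect.getgeneratorstate(gen))   # GEN_SUSPENDED
print(dict(gen.gi_frame.f_locals))      # {'n': 3}: locals survive the yield

rest = list(gen)                        # drains the generator: [2, 1]
print(inspect.getgeneratorstate(gen))   # GEN_CLOSED
print(gen.gi_frame)                     # None: the frame is released
```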
Expected Output
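--- regular_function ---
 co_flags (hex): 0x...
 CO_GENERATOR set: False
 co_stacksize: <N>

--- generator_function ---
 co_flags (hex): 0x...
 CO_GENERATOR set: True
 co_stacksize: <N>

Generator disassembly (look for YIELD_VALUE):
<disassembly of generator_function>
Hints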
Hint 1: Generators have the `CO_GENERATOR` bit set in `co_flags`. The value is `0x20` (32 in decimal). Check it with a bitwise AND: `code.co_flags & 0x20`.
Hint 2: Generator functions compile to code objects just like regular functions. What marks them is the `CO_GENERATOR` bit in `co_flags` (their bytecode also contains `YIELD_VALUE`). When called, CPython creates a generator object instead of running the body immediately.
Hard
Build a recursive code object walker that prints a tree of all nested code objects inside a module or function. This is how tools like coverage.py discover all executable code units.
import dis
import types

def walk_code_objects(code, depth=0):
    indent = " " * depth
    prefix = "+-" if depth > 0 else ""
    print(f"{indent}{prefix}[{code.co_name}]")
    print(f"{indent} args={code.co_argcount}, locals={len(code.co_varnames)}, consts={len(code.co_consts)}")
    # Recurse into nested code objects found in co_consts
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            walk_code_objects(const, depth + 1)

# Test with a module-like structure
source = """
def outer(x):
    items = [i * i for i in range(x)]
    def inner(y):
        return lambda z: x + y + z
    return inner

class MyClass:
    def method(self):
        pass
"""
module_code = compile(source, "<test>", "exec")
print("Code object tree:")
walk_code_objects(module_code)
Solution
import dis
import types

def walk_code_objects(code, depth=0):
    indent = " " * depth
    prefix = "+-" if depth > 0 else ""
    print(f"{indent}{prefix}[{code.co_name}]")
    print(f"{indent} args={code.co_argcount}, locals={len(code.co_varnames)}, consts={len(code.co_consts)}")
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            walk_code_objects(const, depth + 1)

source = """
def outer(x):
    items = [i * i for i in range(x)]
    def inner(y):
        return lambda z: x + y + z
    return inner

class MyClass:
    def method(self):
        pass
"""
module_code = compile(source, "<test>", "exec")
print("Code object tree:")
walk_code_objects(module_code)
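Expected tree output (CPython 3.11 and earlier; the locals and consts counts are elided here because they vary by version):
[<module>]
 args=0, locals=..., consts=...
 +-[outer]
  args=1, locals=..., consts=...
  +-[<listcomp>]
   args=1, locals=..., consts=...
  +-[inner]
   args=1, locals=..., consts=...
   +-[<lambda>]
    args=1, locals=..., consts=...
 +-[MyClass]
  args=0, locals=..., consts=...
  +-[method]
   args=1, locals=..., consts=...
On CPython 3.12 and later the <listcomp> node does not appear, because comprehensions are inlined into the enclosing code object (PEP 709).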
Key observations:
- List comprehensions get their own code object (<listcomp>) on CPython 3.11 and earlier, where they are compiled as anonymous functions. Since 3.12 (PEP 709) list, dict, and set comprehensions are inlined into the enclosing code object, although their iteration variables are still kept from leaking into the enclosing scope.
- Classes get code objects: the class body is compiled as a code object and executed to populate the class namespace.
- Every lambda gets a code object named <lambda>.
- The nesting matches the source structure: CPython embeds inner code objects as constants in the outer code object's co_consts.
Use cases for code walkers:
- coverage.py discovers all executable code units to report untested lines
- Security scanners look for dangerous opcodes (e.g., IMPORT_NAME) in serialized code
- Bytecode obfuscators and optimisers traverse the code object tree to apply transformations
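As an illustration of the scanner use case, the walker can be turned into a generator and combined with dis.get_instructions(). This is a toy sketch, not how any particular tool is implemented; `iter_code_objects` and `find_imports` are hypothetical names.

```python
import dis
import types

def iter_code_objects(code):
    """Yield a code object and, recursively, every code object nested in it."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)

def find_imports(code):
    """Return (code object name, imported module) for each IMPORT_NAME."""
    found = []
    for unit in iter_code_objects(code):
        for ins in dis.get_instructions(unit):
            if ins.opname == "IMPORT_NAME":
                found.append((unit.co_name, ins.argval))
    return found

source = "import os\ndef load():\n    import json\n    return json\n"
module_code = compile(source, "<example>", "exec")
print(find_imports(module_code))
# [('<module>', 'os'), ('load', 'json')]
```

Yielding code objects instead of printing them keeps the traversal reusable for other checks.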
import dis

def walk_code_objects(code, depth=0):
    """Recursively walk all code objects nested inside a top-level
    code object (e.g., nested functions, classes, comprehensions).
    Print a tree of code object names and their key attributes.
    """
    pass
Expected Output
See solution for recursive code object tree
Hints
Hint 1: Nested code objects (for inner functions, lambdas, comprehensions) are stored as constants in the outer code object's `co_consts`. Filter `co_consts` for objects of type `code`.
Hint 2: Use recursion with a `depth` parameter to indent the output and show the nesting level. Each level represents one level of function nesting.
Use code.replace() to patch a function's bytecode at runtime — replacing one of its constants without recompiling the source. Then wrap the modified code object in a new function.
import dis
import types

def patch_return_value(func, new_constant):
    code = func.__code__
    # Find the first non-None constant
    old_consts = code.co_consts
    new_consts = list(old_consts)
    for i, c in enumerate(new_consts):
        if c is not None:
            new_consts[i] = new_constant
            break
    # Create patched code object
    new_code = code.replace(co_consts=tuple(new_consts))
    # Wrap in a new function object
    return types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )

# Original function
def get_answer():
    return 42

patched = patch_return_value(get_answer, 99)
print(f"Original: {get_answer()}")
print(f"Patched: {patched()}")
print(f"Original unchanged: {get_answer()}")
print("\nOriginal co_consts:", get_answer.__code__.co_consts)
print("Patched co_consts: ", patched.__code__.co_consts)
Solution
import dis
import types

def patch_return_value(func, new_constant):
    code = func.__code__
    old_consts = code.co_consts
    new_consts = list(old_consts)
    for i, c in enumerate(new_consts):
        if c is not None:
            new_consts[i] = new_constant
            break
    new_code = code.replace(co_consts=tuple(new_consts))
    return types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )

def get_answer():
    return 42

patched = patch_return_value(get_answer, 99)
print(f"Original: {get_answer()}")
print(f"Patched: {patched()}")
print(f"Original unchanged: {get_answer()}")
print("\nOriginal co_consts:", get_answer.__code__.co_consts)
print("Patched co_consts: ", patched.__code__.co_consts)
How this works:
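code.replace() (added in Python 3.8) returns a new code object with the specified fields swapped and every other field copied. It is the supported API for code modification: you do not need to call the types.CodeType constructor by hand, whose argument list changes between versions. The exercise assumes the returned constant is loaded from co_consts; newer CPython releases may encode small integers directly in the instruction, in which case patching co_consts alone does not change the result.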
types.FunctionType(code, globals, name, defaults, closure) creates a new function object wrapping the modified code object. This preserves all the runtime infrastructure (global namespace, closures, default arguments) while substituting the compiled code.
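A minimal sketch of the two calls described above, changing only metadata so that it does not depend on how constants are stored. The new name `get_answer_v2` is made up for the example.

```python
import types

def get_answer():
    return 42

code = get_answer.__code__

# Code objects are read-only: attributes cannot be assigned in place.
try:
    code.co_name = "renamed"
except AttributeError:
    print("code objects are read-only")

# replace() leaves the original untouched and returns a modified copy.
renamed_code = code.replace(co_name="get_answer_v2")
print(code.co_name, renamed_code.co_name)   # get_answer get_answer_v2

# Wrap the new code object in a fresh function sharing the same globals.
clone = types.FunctionType(renamed_code, get_answer.__globals__, "get_answer_v2")
print(clone(), clone.__name__)              # 42 get_answer_v2
```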
Real-world uses of runtime code patching:
- unittest.mock: Patches attributes and functions at test time, then restores them. It swaps whole objects rather than editing code objects.
- Debuggers and instrumentation tools: Some rewrite or replace code objects to add hooks. CPython has no BREAKPOINT opcode, so pdb relies on tracing (sys.settrace) instead.
- Monkey-patching libraries (e.g., gevent): Replace blocking I/O functions with cooperative ones.
- Python security research: Detect when production code has been tampered with by comparing current code objects against known-good hashes.
Warning: Patching code objects incorrectly (wrong stack depths, inconsistent constant tables) will crash CPython with a segfault or corrupt the interpreter state. Always test patched code thoroughly.
import dis
import types

def patch_return_value(func, new_constant):
    """Replace the first non-None constant in a function's co_consts
    with new_constant and return a new function with the patched code.
    Use code.replace() (Python 3.8+) to create a modified code object.
    """
    pass
Expected Output
Original: 42
Patched: 99
Original unchanged: 42
Hints
Hint 1: Code objects are immutable in Python, but `code.replace()` (added in Python 3.8) creates a new code object with specific fields replaced. Pass keyword arguments for the fields you want to change.
Hint 2: To swap a constant: find the index of the value in `co_consts`, build a new tuple with the replacement at that index, then call `code.replace(co_consts=new_tuple)`.
Build a complete function signature extractor using only code object attributes and function metadata — no inspect module allowed.
import types

CO_VARARGS = 0x04
CO_VARKEYWORDS = 0x08

def extract_signature(func):
    code = func.__code__
    flags = code.co_flags
    varnames = code.co_varnames
    has_varargs = bool(flags & CO_VARARGS)
    has_varkw = bool(flags & CO_VARKEYWORDS)
    n_pos = code.co_argcount
    n_kw = code.co_kwonlyargcount
    pos_params = list(varnames[:n_pos])
    # Keyword-only parameters are stored directly after the positional ones
    kwonly_params = list(varnames[n_pos: n_pos + n_kw])
    idx = n_pos + n_kw
    # *args (if present) follows the keyword-only parameters
    varargs_name = varnames[idx] if has_varargs else None
    if has_varargs:
        idx += 1
    # **kwargs (if present) occupies the last parameter slot
    varkw_name = varnames[idx] if has_varkw else None
    return {
        "positional_params": pos_params,
        "kwonly_params": kwonly_params,
        "has_var_positional": has_varargs,
        "var_positional_name": varargs_name,
        "has_var_keyword": has_varkw,
        "var_keyword_name": varkw_name,
        "defaults": func.__defaults__,
        "kwdefaults": func.__kwdefaults__,
    }

# Test functions
def simple(x, y):
    pass

def with_defaults(x, y=10, z=20):
    pass

def complex_sig(a, b, *args, key=None, **kwargs):
    pass

for fn in [simple, with_defaults, complex_sig]:
    print(f"\n--- {fn.__name__} ---")
    sig = extract_signature(fn)
    for k, v in sig.items():
        print(f" {k}: {v}")
Solution
import types

CO_VARARGS = 0x04
CO_VARKEYWORDS = 0x08

def extract_signature(func):
    code = func.__code__
    flags = code.co_flags
    varnames = code.co_varnames
    has_varargs = bool(flags & CO_VARARGS)
    has_varkw = bool(flags & CO_VARKEYWORDS)
    n_pos = code.co_argcount
    n_kw = code.co_kwonlyargcount
    pos_params = list(varnames[:n_pos])
    # Keyword-only parameters are stored directly after the positional ones
    kwonly_params = list(varnames[n_pos: n_pos + n_kw])
    idx = n_pos + n_kw
    # *args (if present) follows the keyword-only parameters
    varargs_name = varnames[idx] if has_varargs else None
    if has_varargs:
        idx += 1
    # **kwargs (if present) occupies the last parameter slot
    varkw_name = varnames[idx] if has_varkw else None
    return {
        "positional_params": pos_params,
        "kwonly_params": kwonly_params,
        "has_var_positional": has_varargs,
        "var_positional_name": varargs_name,
        "has_var_keyword": has_varkw,
        "var_keyword_name": varkw_name,
        "defaults": func.__defaults__,
        "kwdefaults": func.__kwdefaults__,
    }

def simple(x, y):
    pass

def with_defaults(x, y=10, z=20):
    pass

def complex_sig(a, b, *args, key=None, **kwargs):
    pass

for fn in [simple, with_defaults, complex_sig]:
    print(f"\n--- {fn.__name__} ---")
    sig = extract_signature(fn)
    for k, v in sig.items():
        print(f" {k}: {v}")
The co_varnames layout for a complex signature:
def f(a, b, *args, key=None, **kwargs):
co_argcount=2 (a, b)
co_kwonlyargcount=1 (key)
CO_VARARGS flag set (args)
CO_VARKEYWORDS flag set (kwargs)
co_varnames = ('a', 'b', 'key', 'args', 'kwargs', ... body locals ...)
               [0]  [1]  [2]    [3]     [4]
Note that the keyword-only parameter key is stored before args, even though *args comes first in the source.
How inspect.signature works: It reads exactly these fields — co_argcount, co_kwonlyargcount, co_flags, co_varnames — plus func.__defaults__ and func.__kwdefaults__ for defaults. It also reads __annotations__ for type hints. The inspect module is a high-level convenience wrapper around these raw code object fields.
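The slot order can be checked against inspect.signature(), used here only as a reference for comparison. The local variable `total` is added to show where body locals begin; `kinds` is an illustrative name.

```python
import inspect

def complex_sig(a, b, *args, key=None, **kwargs):
    total = a + b
    return total

code = complex_sig.__code__

# Raw slot order: positional, keyword-only, *args, **kwargs, then body locals.
print(code.co_varnames)
# ('a', 'b', 'key', 'args', 'kwargs', 'total')

# inspect reports the same parameters, in source order, with their kinds.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(complex_sig).parameters.items()
}
print(kinds)
```

The source order (a, b, args, key, kwargs) and the slot order differ only in where the keyword-only parameters sit.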
import types

def extract_signature(func):
    """Extract a complete function signature from its code object alone
    (without using inspect.signature).
    Return a dict with:
    - positional_params: list of positional parameter names
    - kwonly_params: list of keyword-only parameter names
    - has_var_positional: bool (has *args)
    - has_var_keyword: bool (has **kwargs)
    - defaults: tuple of default values (or None)
    - kwdefaults: dict of keyword-only defaults (or None)
    """
    pass
Expected Output
See solution for signature dict output
Hints
Hint 1: co_argcount = number of positional params (including those with defaults, NOT keyword-only). co_kwonlyargcount = keyword-only params. co_flags bits CO_VARARGS (0x04) and CO_VARKEYWORDS (0x08) indicate *args and **kwargs.
Hint 2: Parameter names come from co_varnames. The order is: positional params, then keyword-only params, then *args (if any), then **kwargs (if any). Defaults are stored in func.__defaults__ and func.__kwdefaults__.
