Python Execution Model in Practice: Practice Problems & Exercises
Practice: Execution Model in Practice
← Back to lessonEasy
Use Python's built-in compile() function to compile a short piece of source code into a code object. Then inspect its type and three of its attributes: co_filename, co_consts, and co_names.
source = """
x = 10
y = 20
total = x + y
print(total)
"""
# Compile the source into a code object
code_obj = compile(source, "<practice>", "exec")
# Inspect the code object
print(f"Type: {type(code_obj)}")
print(f"Filename: {code_obj.co_filename}")
print(f"Constants: {code_obj.co_consts}")
print(f"Names: {code_obj.co_names}")
print(f"Number of constants: {len(code_obj.co_consts) - 1}") # Exclude NoneSolution
source = """
x = 10
y = 20
total = x + y
print(total)
"""
code_obj = compile(source, "<practice>", "exec")
print(f"Type: {type(code_obj)}")
print(f"Filename: {code_obj.co_filename}")
print(f"Constants: {code_obj.co_consts}")
print(f"Names: {code_obj.co_names}")
print(f"Number of constants: {len(code_obj.co_consts) - 1}")
What the code object contains:
co_consts: A tuple of all literal constant values found in the bytecode.Noneis always at index 0 (it is the implicit return value). Here you also see10and20from the assignments.co_names: A tuple of all names (variables, functions) referenced in the code. These are names that need to be looked up in the global or built-in scope at runtime.x,y,totalare the variables;printis a built-in name.co_filename: Whatever string you passed as the second argument tocompile(). CPython uses this for tracebacks and error messages. When you run a.pyfile, this is the file path.
Key insight: compile() is the same function CPython calls internally when you import a module or run a script. The code object it returns is the exact intermediate representation the PVM executes. You are looking at the same data structure CPython uses.
Expected Output
Type: <class 'code'>\nFilename: <practice>\nConstants: (None, 10, 20)\nNames: ('x', 'y', 'total', 'print')\nNumber of constants: 3Hints
Hint 1: The built-in `compile()` takes three arguments: source string, filename (can be any label), and mode (`"exec"` for statements, `"eval"` for a single expression).
Hint 2: A code object has attributes like `co_consts` (tuple of constants), `co_names` (tuple of names used), and `co_filename` (the filename you passed to compile).
Use dis.get_instructions() to disassemble a simple function. Count the total number of bytecode instructions and list every opcode name.
import dis
def add(a, b):
return a + b
# Get all bytecode instructions
instructions = list(dis.get_instructions(add))
# Count and list them
total = len(instructions)
opcodes = [instr.opname for instr in instructions]
unique = len(set(opcodes))
print(f"Total instructions: {total}")
print(f"Unique opcodes: {unique}")
print(f"Opcode list: {opcodes}")Solution
import dis
def add(a, b):
return a + b
instructions = list(dis.get_instructions(add))
total = len(instructions)
opcodes = [instr.opname for instr in instructions]
unique = len(set(opcodes))
print(f"Total instructions: {total}")
print(f"Unique opcodes: {unique}")
print(f"Opcode list: {opcodes}")
Bytecode breakdown for return a + b:
RESUME 0 ← required preamble (Python 3.11+)
LOAD_FAST 0 (a) ← push local variable 'a' onto the stack
LOAD_FAST 1 (b) ← push local variable 'b' onto the stack
BINARY_OP 0 (+) ← pop both, push a + b
RETURN_VALUE ← pop the top of stack and return it
- 5 total instructions for the simplest possible binary operation.
- 4 unique opcodes because
LOAD_FASTappears twice (once for each parameter). RESUMEwas added in Python 3.11 as an entry point for tracing and debugging. It does nothing in normal execution but is always the first instruction.
Why dis.get_instructions() instead of dis.dis(): dis.dis() prints human-readable output to stdout. dis.get_instructions() returns structured Instruction namedtuples you can programmatically analyze — far more useful for building tools.
Expected Output
Total instructions: 5\nUnique opcodes: 4\nOpcode list: ['RESUME', 'LOAD_FAST', 'LOAD_FAST', 'BINARY_OP', 'RETURN_VALUE']Hints
Hint 1: `dis.get_instructions(func)` returns an iterator of `Instruction` objects. Each has an `.opname` attribute with the opcode name.
Hint 2: Convert the iterator to a list first so you can both count and inspect the instructions.
Predict the output before running this code. Python executes code at three distinct times: import/module time, definition time, and call time. Which print statements run when?
def my_decorator(func):
print("4: decorator runs")
return func
print("1: module level")
def greet():
print("5: call time")
print("2: function defined")
class Config:
print("3: class body")
@my_decorator
def process():
print("this never prints")
greet()Solution
def my_decorator(func):
print("4: decorator runs")
return func
print("1: module level")
def greet():
print("5: call time")
print("2: function defined")
class Config:
print("3: class body")
@my_decorator
def process():
print("this never prints")
greet()
The three execution times:
-
Module level (immediate):
print("1: module level")runs as soon as Python reaches it. All top-level statements execute sequentially as the interpreter reads the file. -
Definition time: When Python reaches
def greet():, it compiles the function body into a code object and binds it to the namegreet— but it does NOT execute the body.print("2: function defined")runs next because it is the next module-level statement. -
Class body (immediate): When Python reaches
class Config:, it executes the entire class body right away to build the class namespace.print("3: class body")runs at class definition time. -
Decorator execution:
@my_decoratoris syntactic sugar forprocess = my_decorator(process). The decorator function is called immediately when thedefis reached. Soprint("4: decorator runs")executes at definition time, beforegreet()is ever called. -
Call time:
greet()is the actual function call. Only now doesprint("5: call time")execute.process()is never called, so"this never prints"never appears.
Key insight for interviews: Understanding when code runs is critical for debugging import-time side effects, metaclass behavior, decorator ordering, and circular import issues. The rule is: module-level and class bodies run immediately; function bodies run only when called; decorators run at definition time.
Expected Output
1: module level\n2: function defined\n3: class body\n4: decorator runs\n5: call timeHints
Hint 1: Module-level statements run immediately as Python reads the file from top to bottom. Function bodies do NOT run until called. Class bodies run immediately when the class statement is reached.
Hint 2: Decorators execute at function definition time (when the `def` is reached), not when the function is called.
Medium
Compare the bytecode generated for x ** 2 and x * x. Do they produce the same number of instructions? What is the critical difference the bytecode reveals?
import dis
def square_pow(x):
return x ** 2
def square_mul(x):
return x * x
# Analyze x ** 2
pow_instrs = list(dis.get_instructions(square_pow))
print("=== x ** 2 ===")
print(f"Instructions: {len(pow_instrs)}")
print(f"Opcodes: {', '.join(i.opname for i in pow_instrs)}")
for i in pow_instrs:
if i.arg is not None:
print(f" {i.opname:20s} arg={i.arg} ({i.argrepr})")
# Analyze x * x
mul_instrs = list(dis.get_instructions(square_mul))
print("\n=== x * x ===")
print(f"Instructions: {len(mul_instrs)}")
print(f"Opcodes: {', '.join(i.opname for i in mul_instrs)}")
for i in mul_instrs:
if i.arg is not None:
print(f" {i.opname:20s} arg={i.arg} ({i.argrepr})")
print(f"\nSame instruction count: {len(pow_instrs) == len(mul_instrs)}")Solution
import dis
def square_pow(x):
return x ** 2
def square_mul(x):
return x * x
pow_instrs = list(dis.get_instructions(square_pow))
print("=== x ** 2 ===")
print(f"Instructions: {len(pow_instrs)}")
print(f"Opcodes: {', '.join(i.opname for i in pow_instrs)}")
for i in pow_instrs:
if i.arg is not None:
print(f" {i.opname:20s} arg={i.arg} ({i.argrepr})")
mul_instrs = list(dis.get_instructions(square_mul))
print("\n=== x * x ===")
print(f"Instructions: {len(mul_instrs)}")
print(f"Opcodes: {', '.join(i.opname for i in mul_instrs)}")
for i in mul_instrs:
if i.arg is not None:
print(f" {i.opname:20s} arg={i.arg} ({i.argrepr})")
print(f"\nSame instruction count: {len(pow_instrs) == len(mul_instrs)}")
Both produce 5 instructions, but the operand sources differ:
x ** 2: LOAD_FAST(x) → LOAD_CONST(2) → BINARY_OP(**)
x * x: LOAD_FAST(x) → LOAD_FAST(x) → BINARY_OP(*)
Two important differences the bytecode does NOT show:
-
Operator dispatch cost:
BINARY_OP(**)callsint.__pow__, which handles arbitrary exponents (negative, float, modular).BINARY_OP(*)callsint.__mul__, a much simpler operation. For small integers,__mul__is measurably faster because it does not need the exponentiation logic. -
Operand loading:
x ** 2loads a constant (LOAD_CONSTreads from the code object'sco_conststuple).x * xloads the same local variable twice (LOAD_FASTreads from the fast locals array). Both are O(1) array lookups, butLOAD_FASTis marginally faster because fast locals are stored in a C array directly on the frame object.
Practical takeaway: For squaring, x * x is slightly faster than x ** 2 in CPython due to the cheaper __mul__ dispatch. But the difference is nanoseconds — write whichever is clearer for your use case. In numerical code, NumPy vectorization dwarfs this micro-optimization.
Expected Output
=== x ** 2 ===\nInstructions: 5\nOpcodes: RESUME, LOAD_FAST, LOAD_CONST, BINARY_OP, RETURN_VALUE\n\n=== x * x ===\nInstructions: 5\nOpcodes: RESUME, LOAD_FAST, LOAD_FAST, BINARY_OP, RETURN_VALUE\n\nSame instruction count: True\nKey difference: power uses LOAD_CONST(2), multiply uses LOAD_FAST(x) twiceHints
Hint 1: Use `dis.get_instructions()` to get structured instruction data. Compare the operand sources: one loads a constant, the other loads the same variable twice.
Hint 2: Think about what happens at runtime: `**` calls `__pow__` which is more expensive than `__mul__`, even though both produce 5 bytecode instructions.
Write a function that computes the distance between two 2D points. Then inspect its __code__ object to answer: How many local variables does it have? What constants are embedded? What is the max stack depth?
import math
def distance(x1, y1, x2, y2):
"""Euclidean distance between two points."""
dx = x2 - x1
dy = y2 - y1
return math.sqrt(dx ** 2 + dy ** 2)
code = distance.__code__
print("=== Code Object Attributes ===")
print(f"co_varnames: {code.co_varnames}")
print(f"co_nlocals: {code.co_nlocals}")
print(f"co_consts: {code.co_consts}")
print(f"co_stacksize: {code.co_stacksize}")
print(f"co_argcount: {code.co_argcount}")
print(f"co_name: {code.co_name}")
print("\n=== Analysis ===")
print(f"Parameters: {code.co_varnames[:code.co_argcount]}")
print(f"Locals (non-param): {code.co_varnames[code.co_argcount:]}")
print(f"Constant 2 is used for: exponentiation (dx**2, dy**2)")Solution
import math
def distance(x1, y1, x2, y2):
"""Euclidean distance between two points."""
dx = x2 - x1
dy = y2 - y1
return math.sqrt(dx ** 2 + dy ** 2)
code = distance.__code__
print("=== Code Object Attributes ===")
print(f"co_varnames: {code.co_varnames}")
print(f"co_nlocals: {code.co_nlocals}")
print(f"co_consts: {code.co_consts}")
print(f"co_stacksize: {code.co_stacksize}")
print(f"co_argcount: {code.co_argcount}")
print(f"co_name: {code.co_name}")
print("\n=== Analysis ===")
print(f"Parameters: {code.co_varnames[:code.co_argcount]}")
print(f"Locals (non-param): {code.co_varnames[code.co_argcount:]}")
print(f"Constant 2 is used for: exponentiation (dx**2, dy**2)")
Attribute breakdown:
| Attribute | Value | Meaning |
|---|---|---|
co_varnames | ('x1', 'y1', 'x2', 'y2', 'dx', 'dy') | All local names: params first, then locals |
co_nlocals | 6 | Total local variables (4 params + 2 locals) |
co_consts | ('Euclidean...', None, 2) | Index 0 is the docstring, then None, then the literal 2 |
co_stacksize | Varies (typically 3-4) | Max items on the evaluation stack simultaneously |
co_argcount | 4 | Number of positional parameters |
co_name | 'distance' | The function name as defined |
Important details:
- When a function has a docstring, it appears as
co_consts[0]instead ofNone. The compiler stores the docstring as a constant so it can be accessed at runtime viafunc.__doc__. co_stacksizeis computed by the compiler at compile time. It represents the worst-case stack depth needed to evaluate any expression in the function. The PVM allocates this much stack space when creating the frame.co_varnames[:co_argcount]always gives you just the parameter names, andco_varnames[co_argcount:]gives the local-only variables. This is how debuggers and IDEs distinguish parameters from locals.
Expected Output
See solution for full attribute listingHints
Hint 1: Every function has a `__code__` attribute. Key fields: `co_varnames` (local variable names including parameters), `co_consts` (literal constants), `co_stacksize` (max evaluation stack depth), `co_nlocals` (number of local variables).
Hint 2: `co_varnames` lists parameters first, then local variables in order of first assignment. `co_consts` always starts with `None` (the implicit return value) unless the function has a docstring (which goes at index 0).
Use ast.parse() to parse a multi-line expression. Then walk the AST to count how many function calls, binary operations, and name references exist in the code.
import ast
source = """
result = max(a + b, min(c * d, e - f))
total = sum([x ** 2 for x in range(10)])
"""
tree = ast.parse(source)
# Print the AST structure
print("=== AST Dump (simplified) ===")
print(ast.dump(tree, indent=2))
# Walk the tree and count node types
calls = 0
binops = 0
names = 0
for node in ast.walk(tree):
if isinstance(node, ast.Call):
calls += 1
elif isinstance(node, ast.BinOp):
binops += 1
elif isinstance(node, ast.Name):
names += 1
print(f"\n=== Node Counts ===")
print(f"Function calls (ast.Call): {calls}")
print(f"Binary operations (ast.BinOp): {binops}")
print(f"Name references (ast.Name): {names}")Solution
import ast
source = """
result = max(a + b, min(c * d, e - f))
total = sum([x ** 2 for x in range(10)])
"""
tree = ast.parse(source)
print("=== AST Dump (simplified) ===")
print(ast.dump(tree, indent=2))
calls = 0
binops = 0
names = 0
for node in ast.walk(tree):
if isinstance(node, ast.Call):
calls += 1
elif isinstance(node, ast.BinOp):
binops += 1
elif isinstance(node, ast.Name):
names += 1
print(f"\n=== Node Counts ===")
print(f"Function calls (ast.Call): {calls}")
print(f"Binary operations (ast.BinOp): {binops}")
print(f"Name references (ast.Name): {names}")
What the AST reveals:
- Function calls:
max(...),min(...),sum(...),range(10)— 4ast.Callnodes. The list comprehension internally compilesrange(10)as a call too. - Binary operations:
a + b,c * d,e - f,x ** 2— 4ast.BinOpnodes. Each has.left,.op(an operator node likeast.Add,ast.Mult), and.right. - Name references:
result,a,b,max,min,c,d,e,f,total,sum,x,range,x— every variable and function name creates anast.Namenode.
How ast.walk() works: It performs a breadth-first traversal of every node in the tree. Unlike ast.NodeVisitor (which you subclass), ast.walk() is a simple generator you can filter with isinstance(). It visits every node exactly once, regardless of nesting depth.
Why this matters: The AST is the data structure that linters (flake8, pylint), formatters (black), type checkers (mypy), and security scanners (bandit) all operate on. Understanding AST structure is how you build code analysis tools.
Expected Output
See solution for AST dump and node analysisHints
Hint 1: `ast.parse()` returns a `Module` node whose `.body` contains a list of statement nodes. Each node has a `.lineno` and `.col_offset`.
Hint 2: Use `ast.walk(tree)` to visit every node in the tree. Check `isinstance(node, ast.Call)` to find function calls, `isinstance(node, ast.BinOp)` for binary operations.
Trace exactly which LOAD and STORE instructions Python generates for a chain of variable assignments. Predict the stack state after each instruction.
import dis
def chain_assign():
x = 10
y = x + 5
z = y * x
x = z - y
return x
print("=== Full Disassembly ===")
dis.dis(chain_assign)
print("\n=== LOAD/STORE Trace Only ===")
for instr in dis.get_instructions(chain_assign):
if 'LOAD' in instr.opname or 'STORE' in instr.opname:
print(f" {instr.offset:3d} {instr.opname:20s} {instr.argrepr}")After running, trace the stack state:
Instruction Stack (bottom -> top) Locals
----------- --------------------- ------
LOAD_CONST 10 [10] {}
STORE_FAST x [] {x: 10}
LOAD_FAST x [10] {x: 10}
LOAD_CONST 5 [10, 5] {x: 10}
BINARY_OP + [15] {x: 10}
STORE_FAST y [] {x: 10, y: 15}
...continue tracing...
Solution
import dis
def chain_assign():
x = 10
y = x + 5
z = y * x
x = z - y
return x
print("=== Full Disassembly ===")
dis.dis(chain_assign)
print("\n=== LOAD/STORE Trace Only ===")
for instr in dis.get_instructions(chain_assign):
if 'LOAD' in instr.opname or 'STORE' in instr.opname:
print(f" {instr.offset:3d} {instr.opname:20s} {instr.argrepr}")
Complete stack trace:
Instruction Stack Locals after
----------- ----- ------------
RESUME [] {}
LOAD_CONST (10) [10] {}
STORE_FAST (x) [] {x:10}
LOAD_FAST (x) [10] {x:10}
LOAD_CONST (5) [10, 5] {x:10}
BINARY_OP (+) [15] {x:10}
STORE_FAST (y) [] {x:10, y:15}
LOAD_FAST (y) [15] {x:10, y:15}
LOAD_FAST (x) [15, 10] {x:10, y:15}
BINARY_OP (*) [150] {x:10, y:15}
STORE_FAST (z) [] {x:10, y:15, z:150}
LOAD_FAST (z) [150] {x:10, y:15, z:150}
LOAD_FAST (y) [150, 15] {x:10, y:15, z:150}
BINARY_OP (-) [135] {x:10, y:15, z:150}
STORE_FAST (x) [] {x:135, y:15, z:150}
LOAD_FAST (x) [135] {x:135, y:15, z:150}
RETURN_VALUE [] {x:135, y:15, z:150}
Key observations:
STORE_FASTalways clears the stack (pops the top value and writes it to the locals array). After every assignment, the stack is empty.xis reassigned on the lastSTORE_FAST (x). The old value (10) is overwritten with 135. The namexis just an index into the locals array — reassignment is just writing to a different slot.- The stack never exceeds depth 2 in this function. Each binary operation needs exactly 2 operands on the stack. This matches
co_stacksizefrom the code object. LOAD_FASTvsLOAD_CONST: Constants like10and5come fromco_constsviaLOAD_CONST. Variables come from the fast locals array viaLOAD_FAST. Both are O(1) indexed lookups, but they read from different arrays.
Expected Output
See solution for full instruction traceHints
Hint 1: Each variable assignment compiles to a sequence ending in `STORE_FAST` (for locals) or `STORE_NAME` (for module-level). Each variable read is `LOAD_FAST` or `LOAD_NAME`.
Hint 2: Chained assignments like `a = b = expr` compile the expression once, then use `COPY` (or `DUP_TOP` in older Python) followed by multiple `STORE_FAST` instructions.
Hard
Build a reusable bytecode profiler that takes any function and produces a detailed instruction report: total count, opcode frequency, category breakdown (loads, stores, operations, control flow, calls), and a load-to-store ratio.
import dis
from collections import Counter
def profile_bytecode(func):
"""Profile the bytecode of any function."""
instructions = list(dis.get_instructions(func))
opcodes = Counter(i.opname for i in instructions)
# Categorize instructions
categories = {
'loads': 0,
'stores': 0,
'operations': 0,
'control_flow': 0,
'calls': 0,
'other': 0,
}
for instr in instructions:
name = instr.opname
if name.startswith('LOAD'):
categories['loads'] += 1
elif name.startswith('STORE'):
categories['stores'] += 1
elif name in ('BINARY_OP', 'COMPARE_OP', 'IS_OP', 'CONTAINS_OP') or name.startswith('UNARY'):
categories['operations'] += 1
elif 'JUMP' in name or name in ('FOR_ITER', 'GET_ITER', 'POP_JUMP_IF_TRUE',
'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_NONE',
'POP_JUMP_IF_NOT_NONE'):
categories['control_flow'] += 1
elif name in ('CALL', 'CALL_FUNCTION', 'CALL_METHOD', 'PUSH_NULL'):
categories['calls'] += 1
else:
categories['other'] += 1
# Print report
print(f"=== Bytecode Profile: {func.__name__} ===")
print(f"Total instructions: {len(instructions)}")
print(f"Unique opcodes: {len(opcodes)}")
print(f"\nTop opcodes:")
for op, count in opcodes.most_common(8):
bar = '#' * count
print(f" {op:25s} {count:3d} {bar}")
print(f"\nCategories:")
for cat, count in sorted(categories.items(), key=lambda x: -x[1]):
if count > 0:
pct = count / len(instructions) * 100
print(f" {cat:15s} {count:3d} ({pct:.0f}%)")
load_store_ratio = (categories['loads'] / categories['stores']
if categories['stores'] > 0 else float('inf'))
print(f"\nLoad/Store ratio: {load_store_ratio:.1f}")
return opcodes, categories
# Test with different function styles
def iterative_fibonacci(n):
a, b = 0, 1
for _ in range(n):
a, b = b, a + b
return a
def recursive_fibonacci(n):
if n <= 1:
return n
return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)
print("--- Iterative ---")
profile_bytecode(iterative_fibonacci)
print("\n--- Recursive ---")
profile_bytecode(recursive_fibonacci)Solution
import dis
from collections import Counter
def profile_bytecode(func):
instructions = list(dis.get_instructions(func))
opcodes = Counter(i.opname for i in instructions)
categories = {
'loads': 0, 'stores': 0, 'operations': 0,
'control_flow': 0, 'calls': 0, 'other': 0,
}
for instr in instructions:
name = instr.opname
if name.startswith('LOAD'):
categories['loads'] += 1
elif name.startswith('STORE'):
categories['stores'] += 1
elif name in ('BINARY_OP', 'COMPARE_OP', 'IS_OP', 'CONTAINS_OP') or name.startswith('UNARY'):
categories['operations'] += 1
elif 'JUMP' in name or name in ('FOR_ITER', 'GET_ITER'):
categories['control_flow'] += 1
elif name in ('CALL', 'CALL_FUNCTION', 'CALL_METHOD', 'PUSH_NULL'):
categories['calls'] += 1
else:
categories['other'] += 1
print(f"=== Bytecode Profile: {func.__name__} ===")
print(f"Total instructions: {len(instructions)}")
print(f"Unique opcodes: {len(opcodes)}")
print(f"\nTop opcodes:")
for op, count in opcodes.most_common(8):
bar = '#' * count
print(f" {op:25s} {count:3d} {bar}")
print(f"\nCategories:")
for cat, count in sorted(categories.items(), key=lambda x: -x[1]):
if count > 0:
pct = count / len(instructions) * 100
print(f" {cat:15s} {count:3d} ({pct:.0f}%)")
load_store_ratio = (categories['loads'] / categories['stores']
if categories['stores'] > 0 else float('inf'))
print(f"\nLoad/Store ratio: {load_store_ratio:.1f}")
return opcodes, categories
def iterative_fibonacci(n):
a, b = 0, 1
for _ in range(n):
a, b = b, a + b
return a
def recursive_fibonacci(n):
if n <= 1:
return n
return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)
print("--- Iterative ---")
profile_bytecode(iterative_fibonacci)
print("\n--- Recursive ---")
profile_bytecode(recursive_fibonacci)
What the profiles reveal:
-
Iterative version: More
STORE_FASTinstructions (from thea, b = b, a + bswap), control flow fromFOR_ITERandJUMP_BACKWARD, but noCALLinstructions inside the loop. All the work stays within the PVM eval loop. -
Recursive version: More
LOAD_GLOBAL(loadingrecursive_fibonacciby name for each recursive call), moreCALLinstructions (two per non-base-case invocation), and fewer stores. The load-to-store ratio is much higher because recursive code reads heavily but writes little.
Performance implication: Each CALL instruction in the recursive version creates an entirely new frame object (with its own locals array, stack, and instruction pointer). The iterative version reuses a single frame. This is why recursive Fibonacci is O(2^n) in both time and frame allocations, while iterative is O(n) with O(1) frames.
How to extend this tool: Add a compare_profiles(func1, func2) function that shows a side-by-side diff of instruction counts. This is exactly how CPython developers evaluate whether a compiler change improves code generation.
Expected Output
See solution for full profiling outputHints
Hint 1: Use `dis.get_instructions(func)` and `collections.Counter` to tally opcodes. Group them into categories: loads (LOAD_*), stores (STORE_*), operations (BINARY_OP, COMPARE_OP, UNARY_*), control flow (JUMP_*, FOR_ITER, POP_JUMP_*).
Hint 2: To compute the load-to-store ratio, count all instructions starting with LOAD vs STORE. A high ratio means the function reads much more than it writes — typical of expression-heavy code.
Implement an ast.NodeVisitor subclass that finds every function call in a piece of Python code, classifying each as a simple call (func()), method call (obj.method()), or chained call (a.b.c()). Report the call name, number of arguments, and line number.
import ast
class CallFinder(ast.NodeVisitor):
"""Find and classify all function calls in source code."""
def __init__(self):
self.calls = []
def _get_call_name(self, node):
"""Extract the full callable name from a Call node."""
func = node.func
if isinstance(func, ast.Name):
return func.id, "simple"
elif isinstance(func, ast.Attribute):
# Could be obj.method() or a.b.c()
parts = [func.attr]
value = func.value
while isinstance(value, ast.Attribute):
parts.append(value.attr)
value = value.value
if isinstance(value, ast.Name):
parts.append(value.id)
parts.reverse()
name = ".".join(parts)
call_type = "method" if len(parts) == 2 else "chained"
return name, call_type
return "<complex>", "complex"
def visit_Call(self, node):
name, call_type = self._get_call_name(node)
num_args = len(node.args) + len(node.keywords)
self.calls.append({
'name': name,
'type': call_type,
'args': num_args,
'line': node.lineno,
})
# Continue visiting child nodes (finds nested calls)
self.generic_visit(node)
# Test with realistic code
source = """
import os
import json
data = json.loads('{"key": "value"}')
path = os.path.join("/tmp", "output.txt")
items = sorted(data.items(), key=lambda x: x[0])
result = list(map(str.upper, ["hello", "world"]))
print(f"Found {len(items)} items")
os.path.exists(path)
"""
tree = ast.parse(source)
finder = CallFinder()
finder.visit(tree)
print(f"Total calls found: {len(finder.calls)}\n")
print(f"{'Name':30s} {'Type':10s} {'Args':5s} {'Line':5s}")
print("-" * 55)
for call in finder.calls:
print(f"{call['name']:30s} {call['type']:10s} {call['args']:5d} {call['line']:5d}")
# Summary
from collections import Counter
type_counts = Counter(c['type'] for c in finder.calls)
print(f"\nCall types: {dict(type_counts)}")Solution
import ast
class CallFinder(ast.NodeVisitor):
def __init__(self):
self.calls = []
def _get_call_name(self, node):
func = node.func
if isinstance(func, ast.Name):
return func.id, "simple"
elif isinstance(func, ast.Attribute):
parts = [func.attr]
value = func.value
while isinstance(value, ast.Attribute):
parts.append(value.attr)
value = value.value
if isinstance(value, ast.Name):
parts.append(value.id)
parts.reverse()
name = ".".join(parts)
call_type = "method" if len(parts) == 2 else "chained"
return name, call_type
return "<complex>", "complex"
def visit_Call(self, node):
name, call_type = self._get_call_name(node)
num_args = len(node.args) + len(node.keywords)
self.calls.append({
'name': name,
'type': call_type,
'args': num_args,
'line': node.lineno,
})
self.generic_visit(node)
source = """
import os
import json
data = json.loads('{"key": "value"}')
path = os.path.join("/tmp", "output.txt")
items = sorted(data.items(), key=lambda x: x[0])
result = list(map(str.upper, ["hello", "world"]))
print(f"Found {len(items)} items")
os.path.exists(path)
"""
tree = ast.parse(source)
finder = CallFinder()
finder.visit(tree)
print(f"Total calls found: {len(finder.calls)}\n")
print(f"{'Name':30s} {'Type':10s} {'Args':5s} {'Line':5s}")
print("-" * 55)
for call in finder.calls:
print(f"{call['name']:30s} {call['type']:10s} {call['args']:5d} {call['line']:5d}")
from collections import Counter
type_counts = Counter(c['type'] for c in finder.calls)
print(f"\nCall types: {dict(type_counts)}")
How ast.NodeVisitor works:
-
You subclass
ast.NodeVisitorand define methods namedvisit_<NodeType>. When you callvisitor.visit(tree), it walks the AST and dispatches to the matchingvisit_*method for each node type. -
Critical: You must call
self.generic_visit(node)inside yourvisit_Callmethod. Without it, the visitor stops at the first call node and never descends into nested calls. For example, insorted(data.items(), key=lambda x: x[0]), the outersorted()call contains an innerdata.items()call as an argument.generic_visitensures both are found. -
Method call resolution: For
os.path.join(...), the AST stores this as nestedAttributenodes:Attribute(value=Attribute(value=Name('os'), attr='path'), attr='join'). The_get_call_namemethod walks this chain to reconstruct the dotted name.
Real-world applications:
- Security scanning (bandit): Finds dangerous calls like
eval(),exec(),os.system(). - Dependency analysis: Finds all function calls to map which modules a codebase depends on.
- Code complexity metrics: Counting calls and their nesting depth correlates with maintenance burden.
- Linting rules: Enforce "never call X directly, use the wrapper Y instead."
Expected Output
See solution for full call analysis outputHints
Hint 1: Subclass `ast.NodeVisitor` and override `visit_Call`. The `node.func` attribute holds the callable expression. For simple calls like `print()`, it is an `ast.Name` with `.id`. For method calls like `obj.method()`, it is an `ast.Attribute` with `.attr` and `.value`.
Hint 2: To find nested calls (calls inside other calls), make sure your visitor calls `self.generic_visit(node)` to continue traversal into child nodes.
Build a comprehensive code object analyzer that computes complexity metrics for any function: bytecode instruction count, cyclomatic complexity (branch count), nesting depth (nested code objects), constant density, and a composite complexity score.
import dis
import ast
import sys
from collections import Counter
def analyze_complexity(func):
"""Compute complexity metrics for a function's code object."""
code = func.__code__
instructions = list(dis.get_instructions(func))
opcodes = Counter(i.opname for i in instructions)
# Metric 1: Raw size
total_instrs = len(instructions)
unique_opcodes = len(opcodes)
# Metric 2: Cyclomatic complexity (branches)
branch_ops = sum(1 for i in instructions
if 'JUMP' in i.opname or i.opname in ('FOR_ITER',))
cyclomatic = branch_ops + 1 # +1 for the default path
# Metric 3: Nesting depth (nested code objects)
nested_code_objects = sum(
1 for c in code.co_consts
if hasattr(c, 'co_code') # It's a code object
)
# Metric 4: Constant density (constants per instruction)
num_consts = len(code.co_consts)
const_density = num_consts / total_instrs if total_instrs > 0 else 0
# Metric 5: Variable pressure (unique locals / stack size)
var_pressure = code.co_nlocals / max(code.co_stacksize, 1)
# Composite score (weighted)
complexity_score = (
total_instrs * 0.3 +
cyclomatic * 5.0 +
nested_code_objects * 10.0 +
code.co_stacksize * 2.0
)
# Print report
print(f"{'=' * 50}")
print(f" Complexity Report: {func.__name__}")
print(f"{'=' * 50}")
print(f"\n Raw Metrics:")
print(f" Instructions: {total_instrs}")
print(f" Unique opcodes: {unique_opcodes}")
print(f" Constants: {num_consts}")
print(f" Local variables: {code.co_nlocals}")
print(f" Max stack depth: {code.co_stacksize}")
print(f" Nested code objects: {nested_code_objects}")
print(f"\n Complexity Metrics:")
print(f" Cyclomatic (branches + 1): {cyclomatic}")
print(f" Constant density: {const_density:.3f}")
print(f" Variable pressure: {var_pressure:.2f}")
print(f" Composite score: {complexity_score:.1f}")
# Classify
if complexity_score < 20:
level = "LOW - simple utility function"
elif complexity_score < 50:
level = "MEDIUM - moderate logic"
elif complexity_score < 100:
level = "HIGH - consider refactoring"
else:
level = "VERY HIGH - definitely refactor"
print(f" Classification: {level}")
print(f"\n Opcode Distribution (top 5):")
for op, count in opcodes.most_common(5):
pct = count / total_instrs * 100
print(f" {op:25s} {count:3d} ({pct:4.1f}%)")
return {
'instructions': total_instrs,
'cyclomatic': cyclomatic,
'nested': nested_code_objects,
'stack_depth': code.co_stacksize,
'score': complexity_score,
}
# Test with functions of varying complexity
def simple_add(a, b):
return a + b
def medium_search(items, target):
for i, item in enumerate(items):
if item == target:
return i
return -1
def complex_parser(text):
"""A deliberately complex function."""
results = []
lines = text.strip().split('\n')
for line in lines:
tokens = line.split()
if not tokens:
continue
cmd = tokens[0].lower()
if cmd == 'add':
results.append(sum(int(t) for t in tokens[1:]))
elif cmd == 'mul':
product = 1
for t in tokens[1:]:
product *= int(t)
results.append(product)
elif cmd == 'avg':
nums = [float(t) for t in tokens[1:]]
results.append(sum(nums) / len(nums) if nums else 0)
else:
results.append(None)
return results
metrics = {}
for fn in [simple_add, medium_search, complex_parser]:
metrics[fn.__name__] = analyze_complexity(fn)
print()
# Comparison summary
print("=== Summary Comparison ===")
print(f"{'Function':20s} {'Instrs':>8s} {'Cyclo':>6s} {'Score':>8s}")
print("-" * 44)
for name, m in metrics.items():
print(f"{name:20s} {m['instructions']:8d} {m['cyclomatic']:6d} {m['score']:8.1f}")Solution
import dis
from collections import Counter
def analyze_complexity(func):
code = func.__code__
instructions = list(dis.get_instructions(func))
opcodes = Counter(i.opname for i in instructions)
total_instrs = len(instructions)
unique_opcodes = len(opcodes)
branch_ops = sum(1 for i in instructions
if 'JUMP' in i.opname or i.opname in ('FOR_ITER',))
cyclomatic = branch_ops + 1
nested_code_objects = sum(
1 for c in code.co_consts if hasattr(c, 'co_code')
)
num_consts = len(code.co_consts)
const_density = num_consts / total_instrs if total_instrs > 0 else 0
var_pressure = code.co_nlocals / max(code.co_stacksize, 1)
complexity_score = (
total_instrs * 0.3 +
cyclomatic * 5.0 +
nested_code_objects * 10.0 +
code.co_stacksize * 2.0
)
print(f"{'=' * 50}")
print(f" Complexity Report: {func.__name__}")
print(f"{'=' * 50}")
print(f"\n Raw Metrics:")
print(f" Instructions: {total_instrs}")
print(f" Unique opcodes: {unique_opcodes}")
print(f" Constants: {num_consts}")
print(f" Local variables: {code.co_nlocals}")
print(f" Max stack depth: {code.co_stacksize}")
print(f" Nested code objects: {nested_code_objects}")
print(f"\n Complexity Metrics:")
print(f" Cyclomatic: {cyclomatic}")
print(f" Constant density: {const_density:.3f}")
print(f" Variable pressure: {var_pressure:.2f}")
print(f" Composite score: {complexity_score:.1f}")
if complexity_score < 20:
level = "LOW"
elif complexity_score < 50:
level = "MEDIUM"
elif complexity_score < 100:
level = "HIGH"
else:
level = "VERY HIGH"
print(f" Classification: {level}")
return {
'instructions': total_instrs, 'cyclomatic': cyclomatic,
'nested': nested_code_objects, 'stack_depth': code.co_stacksize,
'score': complexity_score,
}
def simple_add(a, b):
return a + b
def medium_search(items, target):
for i, item in enumerate(items):
if item == target:
return i
return -1
def complex_parser(text):
results = []
lines = text.strip().split('\n')
for line in lines:
tokens = line.split()
if not tokens:
continue
cmd = tokens[0].lower()
if cmd == 'add':
results.append(sum(int(t) for t in tokens[1:]))
elif cmd == 'mul':
product = 1
for t in tokens[1:]:
product *= int(t)
results.append(product)
elif cmd == 'avg':
nums = [float(t) for t in tokens[1:]]
results.append(sum(nums) / len(nums) if nums else 0)
else:
results.append(None)
return results
for fn in [simple_add, medium_search, complex_parser]:
analyze_complexity(fn)
print()
How each metric works:
-
Cyclomatic complexity: Counts branch points (JUMP instructions, FOR_ITER) plus 1 for the default path. This approximates McCabe's cyclomatic complexity from the bytecode level. More branches means more paths through the function, more test cases needed, and higher cognitive load.
-
Nested code objects: Comprehensions, lambda expressions, and nested
defstatements each compile to a separate code object stored inco_consts. A function with 3 comprehensions has 3 nested code objects — each adds hidden complexity because they have their own local scope, stack, and instruction stream. -
Stack depth (
co_stacksize): The maximum number of items on the evaluation stack at any point. Deep expression nesting (likef(g(h(x, y), z), w)) increases stack depth. High stack depth correlates with hard-to-read expressions. -
Variable pressure: The ratio of local variables to stack depth. High variable pressure means many named intermediate values (good for readability). Low variable pressure means complex expressions computed inline (harder to debug).
-
Constant density: Constants per instruction. High density suggests a function full of magic numbers or string literals. Low density suggests mostly variable manipulation.
Real-world tools that use similar analysis:
- radon (Python complexity tool): Computes cyclomatic complexity from the AST.
- wily (code complexity tracker): Tracks complexity metrics over git history.
- SonarQube: Uses bytecode analysis for Java; the same principles apply to Python.
The key insight is that the code object contains enough information to compute meaningful complexity metrics without ever running the code. This is the foundation of static analysis.
Expected Output
See solution for full complexity reportHints
Hint 1: Combine `dis.get_instructions()` for bytecode metrics with `ast.parse()` for structural metrics. Count branch instructions (POP_JUMP_IF_*) for cyclomatic complexity, nested code objects (co_consts containing code objects) for nesting depth.
Hint 2: A code object's `co_consts` tuple can contain other code objects — these represent nested functions, lambda expressions, and comprehensions. Count them to measure structural complexity.
