Python cProfile and pstats Practice Problems & Exercises
Practice: cProfile and pstats
Easy
Use cProfile.run() to profile a function that computes the sum of squares for numbers 0–9999.
import cProfile
def sum_of_squares(n):
    return sum(i * i for i in range(n))
# Profile sum_of_squares(10000) using cProfile.run()
# Then print: "Profile output printed with function stats"
Solution
import cProfile
def sum_of_squares(n):
    return sum(i * i for i in range(n))
cProfile.run("sum_of_squares(10000)")
print("Profile output printed with function stats")
cProfile.run() output columns:
- ncalls: how many times the function was called.
- tottime: time spent in the function body only (excluding callees).
- cumtime: total time including all nested calls (cumulative).
- percall: time per call. The column appears twice, first as tottime/ncalls and then as cumtime/ncalls.
- cProfile.run() is the quickest way to profile a one-liner. For programmatic control, use cProfile.Profile().
Expected Output
Profile output printed with function stats
Hints
Hint 1: cProfile.run(statement) profiles the statement string and prints results to stdout.
Hint 2: The output columns are: ncalls, tottime, percall, cumtime, percall, filename:lineno(function).
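The column layout described above can also be checked by capturing the report as text. The following is a minimal sketch, assuming cProfile.runctx() with an explicit namespace (cProfile.run() executes its statement in the __main__ module, which may not be where the function lives) and contextlib.redirect_stdout to collect the printed table. The names captured, report, and header are illustrative only.

```python
import contextlib
import cProfile
import io


def sum_of_squares(n):
    return sum(i * i for i in range(n))


# Collect what cProfile would normally print to stdout.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    cProfile.runctx(
        "sum_of_squares(10000)",
        {"sum_of_squares": sum_of_squares},  # explicit globals for the statement
        {},                                  # empty locals
        sort="tottime",
    )
report = captured.getvalue()

# The header row names the six columns in display order.
header = next(line for line in report.splitlines() if "ncalls" in line)
print(header.split())
```

Capturing the text this way keeps the exercise output unchanged while making the column names available to a script or test.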
Use cProfile.Profile() with enable()/disable() to profile a data pipeline, capturing stats to a StringIO buffer.
import cProfile
import pstats
import io
def process_items(items):
    return [item ** 2 + item for item in items]
def filter_evens(numbers):
    return [n for n in numbers if n % 2 == 0]
def pipeline(n):
    data = list(range(n))
    processed = process_items(data)
    return filter_evens(processed)
# Profile pipeline(5000) using pr.enable()/pr.disable() with try/finally
# Capture stats output to StringIO, sort by cumulative, print top 5
# Final print: "Profiling complete"
Solution
import cProfile
import pstats
import io
def process_items(items):
    return [item ** 2 + item for item in items]
def filter_evens(numbers):
    return [n for n in numbers if n % 2 == 0]
def pipeline(n):
    data = list(range(n))
    processed = process_items(data)
    return filter_evens(processed)
pr = cProfile.Profile()
pr.enable()
try:
    pipeline(5000)
finally:
    pr.disable()
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("cumulative")
stats.print_stats(5)
print(buf.getvalue())
print("Profiling complete")
Profile() vs run():
- cProfile.Profile() gives programmatic control: enable/disable around specific sections.
- pr.enable() and pr.disable() can be called multiple times; stats accumulate across sections.
- try/finally guarantees pr.disable() even if the profiled code raises.
- Pass stream=buf to pstats.Stats to capture output to a string, which is useful in tests and CI.
Expected Output
Profiling complete
Hints
Hint 1: Call pr.enable() before and pr.disable() after the code you want to profile.
Hint 2: Wrap in try/finally to guarantee disable() is called even if an exception occurs.
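On Python 3.8 and later, cProfile.Profile can also be used as a context manager, which enables on entry and disables on exit in the same way the try/finally block does. The sketch below assumes that Python version; the condensed pipeline function stands in for the exercise's three functions.

```python
import cProfile
import io
import pstats


def pipeline(n):
    data = list(range(n))
    processed = [item ** 2 + item for item in data]
    return [value for value in processed if value % 2 == 0]


# The with-block enables the profiler on entry and disables it on exit,
# even if the body raises.
with cProfile.Profile() as pr:
    result = pipeline(5000)

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
print("Profiling complete")
```

The explicit enable()/disable() form remains the better fit when the profiled region does not map onto a single block, for example when it starts and ends in different functions.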
Profile a nested call structure and demonstrate that tottime and cumtime pinpoint different functions.
import cProfile
import pstats
import io
def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total
def middle(n):
    return inner_loop(n)
def outer_function(n):
    return [middle(n) for _ in range(5)]
pr = cProfile.Profile()
pr.enable()
outer_function(20000)
pr.disable()
# Sort by tottime — which function is on top?
# Sort by cumtime — which function is on top?
# Print:
# "Top tottime: inner_loop"
# "Top cumtime: outer_function"
Solution
import cProfile
import pstats
import io
def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total
def middle(n):
    return inner_loop(n)
def outer_function(n):
    return [middle(n) for _ in range(5)]
pr = cProfile.Profile()
pr.enable()
outer_function(20000)
pr.disable()
# Sort by tottime — identifies function body doing the most work
buf_tt = io.StringIO()
pstats.Stats(pr, stream=buf_tt).sort_stats("tottime").print_stats(3)
# Sort by cumtime — identifies the most expensive call path
buf_ct = io.StringIO()
pstats.Stats(pr, stream=buf_ct).sort_stats("cumulative").print_stats(3)
print("Top tottime: inner_loop")
print("Top cumtime: outer_function")
print("\n--- tottime view ---")
print(buf_tt.getvalue())
print("--- cumtime view ---")
print(buf_ct.getvalue())
tottime vs cumtime decision guide:
- tottime: use to find which function body is doing the most work. Optimize that function.
- cumtime: use to find which call path is most expensive. Reduce call frequency or restructure.
- inner_loop has high tottime: the actual arithmetic loop runs there.
- outer_function has high cumtime: it orchestrates five calls through middle to inner_loop.
Expected Output
Top tottime: inner_loop
Top cumtime: outer_function
Hints
Hint 1: tottime is time in the function body only — use it to find the busiest function.
Hint 2: cumtime includes all callees — use it to find the most expensive call path.
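The two "Top ..." lines can be derived from the profile data rather than typed by hand. This is a rough sketch, assuming the raw stats.stats mapping, where each row holds tottime at index 2 and cumtime at index 3 and built-in entries use "~" as their filename. The helper top_function is hypothetical, and outer_function uses a plain loop here so no comprehension frame appears in the profile.

```python
import cProfile
import pstats


def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total


def middle(n):
    return inner_loop(n)


def outer_function(n):
    results = []
    for _ in range(5):
        results.append(middle(n))
    return results


def top_function(pr, column):
    """Return the name of the Python function with the largest value in `column`.

    Column 2 is tottime and column 3 is cumtime in each stats.stats row.
    """
    rows = pstats.Stats(pr).stats
    # Built-in entries use "~" as their filename; keep only Python functions.
    user_rows = {key: row for key, row in rows.items() if key[0] != "~"}
    best = max(user_rows, key=lambda key: user_rows[key][column])
    return best[2]


pr = cProfile.Profile()
pr.runcall(outer_function, 20000)

top_tottime = top_function(pr, 2)
top_cumtime = top_function(pr, 3)
print(f"Top tottime: {top_tottime}")
print(f"Top cumtime: {top_cumtime}")
```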
Profile a naive recursive Fibonacci function and read the ncalls column to diagnose exponential recursion.
import cProfile
import pstats
import io
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
pr = cProfile.Profile()
pr.enable()
fib(25)
pr.disable()
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("ncalls").print_stats(5)
print(buf.getvalue())
print("ncalls shows primitive/total for recursive functions")
Solution
import cProfile
import pstats
import io
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
pr = cProfile.Profile()
pr.enable()
fib(25)
pr.disable()
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("ncalls").print_stats(5)
print(buf.getvalue())
print("ncalls shows primitive/total for recursive functions")
Understanding ncalls:
- fib(25) makes 242,785 total calls. ncalls shows 242785/1, which reads as total/primitive: 242785 calls in total, 1 primitive (non-recursive) call.
- This immediately flags O(2^n) exponential recursion.
- Adding @functools.lru_cache means the function body runs only 26 times, once per unique argument from 0 to 25.
- For non-recursive functions, ncalls shows a single integer.
Expected Output
ncalls shows primitive/total for recursive functions
Hints
Hint 1: For recursive functions, ncalls is displayed as "total/primitive", e.g. "21891/1" means 21891 total calls triggered by 1 top-level (primitive) call.
Hint 2: High ncalls on a recursive function signals exponential recursion — add memoization.
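The effect of memoization on the call counts can be measured directly. The sketch below assumes the raw stats.stats rows, whose first two fields are the primitive and total call counts. The helper call_counts is hypothetical, and n=20 is used instead of 25 only to keep the run short.

```python
import cProfile
import functools
import pstats


def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


@functools.lru_cache(maxsize=None)
def fib_cached(n):
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


def call_counts(func, arg, name):
    """Profile func(arg) and return (total, primitive) calls for `name`."""
    pr = cProfile.Profile()
    pr.runcall(func, arg)
    for (_file, _line, func_name), row in pstats.Stats(pr).stats.items():
        if func_name == name:
            primitive, total = row[0], row[1]
            return total, primitive
    raise LookupError(name)


naive_total, naive_primitive = call_counts(fib, 20, "fib")
cached_total, cached_primitive = call_counts(fib_cached, 20, "fib_cached")
print(f"naive fib(20):  {naive_total}/{naive_primitive}")    # 21891/1
print(f"cached fib(20): {cached_total}/{cached_primitive}")  # 21/1
```

The cached count is the number of distinct arguments (0 through 20), since each one reaches the function body exactly once.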
Medium
Profile a multi-function program and use pstats filtering to display only functions defined in user code (not stdlib or builtins).
import cProfile
import pstats
import io
def tokenize(text):
    return text.lower().split()
def count_words(tokens):
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts
def top_words(text, n=5):
    tokens = tokenize(text)
    counts = count_words(tokens)
    return sorted(counts.items(), key=lambda x: x[1], reverse=True)[:n]
text = "the quick brown fox jumps over the lazy dog " * 500
pr = cProfile.Profile()
pr.enable()
top_words(text)
pr.disable()
# Print stats filtered to show only user-defined functions
# (exclude {built-in} and stdlib entries)
# Final line: "Only user-defined functions shown"
Solution
import cProfile
import pstats
import io
import re
def tokenize(text):
    return text.lower().split()
def count_words(tokens):
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts
def top_words(text, n=5):
    tokens = tokenize(text)
    counts = count_words(tokens)
    return sorted(counts.items(), key=lambda x: x[1], reverse=True)[:n]
text = "the quick brown fox jumps over the lazy dog " * 500
pr = cProfile.Profile()
pr.enable()
top_words(text)
pr.disable()
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
# A string argument is a regex matched against "filename:lineno(function)";
# an integer argument limits the number of rows.
# Built-ins print as "{...}" with no filename, so matching this file's path drops them.
user_file = re.escape(top_words.__code__.co_filename)
stats.print_stats(user_file, 10)
print(buf.getvalue())
print("Only user-defined functions shown")
pstats filtering patterns:
- stats.print_stats("my_module") shows only rows whose "filename:lineno(function)" text matches the regular expression "my_module".
- stats.print_stats(5) shows the top 5 rows (integer = row limit).
- Combining: stats.print_stats("my_module", 10) filters first, then limits to 10 rows.
- Strip directory prefixes with stats.strip_dirs() to make filenames readable. Call it before sort_stats(), because stripping resets the sort order.
Expected Output
Only user-defined functions shown
Hints
Hint 1: pstats.Stats.print_stats(pattern) accepts a string, which is treated as a regular expression; only rows whose "filename:lineno(function)" text matches it are printed.
Hint 2: Use the module name or a distinctive path segment to exclude stdlib and built-in entries.
Profile an ETL pipeline and use print_callers / print_callees to understand the hotspot's call context.
import cProfile
import pstats
import io
def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}
def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0
def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}
def run_etl(records):
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]
records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]
pr = cProfile.Profile()
pr.enable()
run_etl(records)
pr.disable()
# Print top 5 by tottime, then callers of parse_record, then callees of run_etl
# Final: "callers and callees printed for hotspot"
Solution
import cProfile
import pstats
import io
def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}
def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0
def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}
def run_etl(records):
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]
records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]
pr = cProfile.Profile()
pr.enable()
run_etl(records)
pr.disable()
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
stats.print_stats(5)
stats.print_callers("parse_record")
stats.print_callees("run_etl")
print(buf.getvalue())
print("callers and callees printed for hotspot")
Call graph analysis workflow:
- print_stats(10) sorted by tottime: find the hotspot function.
- print_callers("hotspot"): see which function drives the hotspot and how often.
- print_callees("hotspot"): see what the hotspot calls internally.
- Here parse_record is called 6000× (twice per record, because the list comprehension calls it once in the validate filter and again in the transform expression). Fix: call it once and reuse the result.
Expected Output
callers and callees printed for hotspot
Hints
Hint 1: stats.print_callers("fn_name") shows which functions call fn_name and how many times.
Hint 2: stats.print_callees("fn_name") shows what fn_name calls internally and the time cost.
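The suggested fix, parsing each record once and reusing the result, can be confirmed with the profiler's own call counts. This is a sketch under the assumption that a plain loop is an acceptable rewrite; run_etl_v2 and parse_calls are hypothetical names, not part of the exercise.

```python
import cProfile
import pstats


def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}


def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0


def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}


def run_etl(records):
    # Original version: parse_record runs in the filter and again in the output.
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]


def run_etl_v2(records):
    # Parse once per record and reuse the parsed dict.
    output = []
    for r in records:
        rec = parse_record(r)
        if validate(rec):
            output.append(transform(rec))
    return output


def parse_calls(etl, records):
    """Profile etl(records); return its result and the parse_record call count."""
    pr = cProfile.Profile()
    result = pr.runcall(etl, records)
    for (_file, _line, name), row in pstats.Stats(pr).stats.items():
        if name == "parse_record":
            return result, row[1]  # row[1] is the total call count
    raise LookupError("parse_record not found in profile")


records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]
result_v1, calls_v1 = parse_calls(run_etl, records)
result_v2, calls_v2 = parse_calls(run_etl_v2, records)
print(f"parse_record calls: v1={calls_v1}, v2={calls_v2}")
```

Comparing call counts instead of timings gives a deterministic check that the rewrite removed the duplicate work.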
Write a @profile_fn decorator that automatically profiles any decorated function and prints the top-5 stats by cumtime after each call.
import cProfile
import pstats
import io
import functools
def profile_fn(fn):
    """Decorator: profiles fn and prints top-5 cumtime stats after each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # YOUR CODE HERE
        pass
    return wrapper
@profile_fn
def matrix_multiply(n):
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
matrix_multiply(25)
print("decorated function profiled automatically")
Solution
import cProfile
import pstats
import io
import functools
def profile_fn(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()
        try:
            result = fn(*args, **kwargs)
        finally:
            pr.disable()
        buf = io.StringIO()
        pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
        print(buf.getvalue())
        return result
    return wrapper
@profile_fn
def matrix_multiply(n):
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
matrix_multiply(25)
print("decorated function profiled automatically")
Profiling decorator design:
- functools.wraps(fn) copies __name__ and __doc__ onto the wrapper and sets __wrapped__ to the original function.
- try/finally ensures pr.disable() is always called even if fn raises.
- For persistent profiling across multiple calls, keep pr as a closure variable and print on demand.
- This pattern works well for targeted production profiling: decorate one function during an investigation, then remove the decorator.
Expected Output
decorated function profiled automatically
Hints
Hint 1: Wrap the function body in pr.enable()/pr.disable() inside the decorator.
Hint 2: Use functools.wraps(fn) to preserve the original function name and docstring.
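The "accumulate and print on demand" variant mentioned above might look like the following. It is a sketch only: the decorator name profile_accumulated and the report attribute attached to the wrapper are assumptions made for illustration, and the code assumes the decorated function is neither recursive nor nested inside another profiled call.

```python
import cProfile
import functools
import io
import pstats


def profile_accumulated(fn):
    pr = cProfile.Profile()  # one profiler shared by every call to the wrapper

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        pr.enable()
        try:
            return fn(*args, **kwargs)
        finally:
            pr.disable()

    def report(limit=5):
        """Return the accumulated stats as text, sorted by cumulative time."""
        buf = io.StringIO()
        pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(limit)
        return buf.getvalue()

    wrapper.report = report  # hypothetical hook for printing on demand
    return wrapper


@profile_accumulated
def square_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


for size in (1000, 2000, 3000):
    square_sum(size)

summary = square_sum.report()
print(summary)
```

Because the profiler lives in the closure, the ncalls column for square_sum reflects all three calls instead of being reset each time.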
Profile an N-Queens solver, identify is_safe as the bottleneck, then implement the optimized set-based version.
import cProfile
import pstats
import io
def is_safe_v1(board, row, col):
    for r in range(row):
        if board[r] == col or abs(board[r] - col) == abs(r - row):
            return False
    return True
def solve_v1(board, row, n):
    if row == n:
        return 1
    return sum(
        solve_v1(board, row + 1, n)
        for col in range(n)
        if is_safe_v1(board, row, col) and not board.__setitem__(row, col)
    )
# Profile solve_v1 for N=10 and identify the bottleneck
# Then assert result == 724
Solution
import cProfile
import pstats
import io
def is_safe_v1(board, row, col):
    for r in range(row):
        if board[r] == col or abs(board[r] - col) == abs(r - row):
            return False
    return True
def solve_v1(board, row, n, count_ref):
    if row == n:
        count_ref[0] += 1
        return
    for col in range(n):
        if is_safe_v1(board, row, col):
            board[row] = col
            solve_v1(board, row + 1, n, count_ref)
            board[row] = -1
def count_solutions_v1(n):
    board = [-1] * n
    count = [0]
    solve_v1(board, 0, n, count)
    return count[0]
pr = cProfile.Profile()
pr.enable()
result = count_solutions_v1(10)
pr.disable()
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
stats.print_stats(5)
print(buf.getvalue())
print("Bottleneck identified: is_safe")
print(f"Solutions for N=10: {result}")
assert result == 724
N-Queens profiling insight:
- is_safe is called for every (row, col) combination the search reaches; for N=10 it is invoked ~400K times.
- Its O(row) list scan means even tiny per-call savings multiply dramatically.
- Optimized version uses three sets: cols, diag1 (holding row-col), and diag2 (holding row+col).
- Optimized check: return col not in cols and (row-col) not in diag1 and (row+col) not in diag2.
- Set-based version typically runs 3–5× faster because each check is O(1) instead of O(n).
Expected Output
Bottleneck identified: is_safe
Solutions for N=10: 724
Hints
Hint 1: Profile the solver and check which function has the highest tottime — it should be is_safe.
Hint 2: The fix: replace list-based conflict checks with three sets (cols, diag1, diag2) for O(1) lookup.
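The set-based version described in the insight list is not shown in the solution, so here is one possible sketch of it. The function names count_solutions_v2 and place are illustrative, and the structure (a nested recursive helper that returns counts) is a design choice for this example, not the exercise's required shape.

```python
def count_solutions_v2(n):
    """Count N-Queens solutions using O(1) set lookups for conflict checks."""
    cols = set()   # occupied columns
    diag1 = set()  # occupied "row - col" diagonals
    diag2 = set()  # occupied "row + col" diagonals

    def place(row):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            # Place the queen, recurse, then undo the placement.
            cols.add(col)
            diag1.add(row - col)
            diag2.add(row + col)
            total += place(row + 1)
            cols.remove(col)
            diag1.remove(row - col)
            diag2.remove(row + col)
        return total

    return place(0)


result_v2 = count_solutions_v2(10)
print(f"Solutions for N=10: {result_v2}")  # 724
```

Profiling this version the same way as v1 shows whether the conflict check still dominates tottime; the measured ratio will vary by machine and Python version.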
Hard
Use cProfile.runctx() to profile a computation referencing local data, save the profile to a temp file, then reload and print stats.
import cProfile
import pstats
import io
import tempfile
import os
def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))
def batch_weighted(batches, weights):
    return [weighted_sum(batch, weights) for batch in batches]
data_batches = [list(range(1000)) for _ in range(100)]
weight_vec = [1.0 / 1000] * 1000
# Profile using cProfile.runctx(), save to temp file, reload and print top 5 by tottime
# Final: "scoped profiling complete with context variables"
Solution
import cProfile
import pstats
import io
import tempfile
import os
def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))
def batch_weighted(batches, weights):
    return [weighted_sum(batch, weights) for batch in batches]
data_batches = [list(range(1000)) for _ in range(100)]
weight_vec = [1.0 / 1000] * 1000
with tempfile.NamedTemporaryFile(suffix=".prof", delete=False) as f:
    prof_path = f.name
try:
    cProfile.runctx(
        "batch_weighted(data_batches, weight_vec)",
        globals(),
        locals(),
        filename=prof_path,
    )
    buf = io.StringIO()
    pstats.Stats(prof_path, stream=buf).sort_stats("tottime").print_stats(5)
    print(buf.getvalue())
finally:
    os.unlink(prof_path)
print("scoped profiling complete with context variables")
runctx vs run:
- cProfile.run(stmt) executes the statement in the __main__ module namespace, so local variables in the caller are invisible.
- cProfile.runctx(stmt, globals(), locals()) passes both namespaces, giving the statement full access.
- Saving to a .prof file enables sharing with teammates and loading into visualization tools like snakeviz.
- snakeviz profile.prof renders the profile as an interactive icicle or sunburst chart in the browser.
Expected Output
scoped profiling complete with context variables
Hints
Hint 1: cProfile.runctx(stmt, globals(), locals()) profiles a string statement that references local variables.
Hint 2: Pass filename= to save the profile to a .prof file — loadable with pstats.Stats(path).
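runctx() does not require globals() and locals(); any two dictionaries work. The sketch below passes hand-built namespaces from inside a function, which also makes it possible to read back a value the statement assigns. The names profile_locally, global_ns, and local_ns are illustrative.

```python
import cProfile
import os
import pstats
import tempfile


def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))


def profile_locally():
    data = [1, 2, 3, 4]
    weights = [0.5, 0.5, 1.0, 1.0]
    # Explicit namespaces: only these names are visible to the statement.
    global_ns = {"weighted_sum": weighted_sum}
    local_ns = {"data": data, "weights": weights}

    fd, path = tempfile.mkstemp(suffix=".prof")
    os.close(fd)
    try:
        cProfile.runctx(
            "result = weighted_sum(data, weights)",
            global_ns,
            local_ns,
            filename=path,  # write stats to disk instead of printing them
        )
        stats = pstats.Stats(path)  # reload the saved profile
        # The assignment inside the statement lands in local_ns.
        return local_ns["result"], stats.total_calls
    finally:
        os.unlink(path)


result, total_calls = profile_locally()
print(f"result={result}, profiled calls={total_calls}")
```

Building the dictionaries by hand keeps the profiled statement from depending on whatever else happens to be in scope.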
Simulate 500 calls to two web endpoints on a single cProfile.Profile() object, then identify the slowest endpoint by per-call cumtime.
import cProfile
import pstats
import io
def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}
def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]
def compute_total(orders):
    return sum(o["value"] for o in orders)
def endpoint_full(uid):
    user = fetch_user(uid)
    orders = fetch_orders(uid)
    total = compute_total(orders)
    return {"user": user, "count": len(orders), "total": total}
def endpoint_lite(uid):
    user = fetch_user(uid)
    return {"name": user["name"]}
# Profile 500 calls to each endpoint using ONE cProfile.Profile() object
# Sort by cumulative time, print top 8 functions
# Final: "slowest endpoint identified by cumtime per call"
Solution
import cProfile
import pstats
import io
def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}
def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]
def compute_total(orders):
    return sum(o["value"] for o in orders)
def endpoint_full(uid):
    user = fetch_user(uid)
    orders = fetch_orders(uid)
    total = compute_total(orders)
    return {"user": user, "count": len(orders), "total": total}
def endpoint_lite(uid):
    user = fetch_user(uid)
    return {"name": user["name"]}
pr = cProfile.Profile()
pr.enable()
for i in range(500):
    endpoint_full(i)
pr.disable()
pr.enable()
for i in range(500):
    endpoint_lite(i)
pr.disable()
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("cumulative")
stats.print_stats(8)
print(buf.getvalue())
print("slowest endpoint identified by cumtime per call")
Accumulated profiling pattern:
- pr.enable()/pr.disable() accumulates stats across multiple cycles on the same Profile object.
- This mirrors production profiling middleware: enable on request entry, disable on response exit, dump stats periodically.
- cumtime / ncalls gives per-request cost; compare endpoints at the same call volume.
- endpoint_full calls fetch_orders (50 dict allocs) and compute_total on every request, so it will dominate cumtime.
Expected Output
slowest endpoint identified by cumtime per call
Hints
Hint 1: A single cProfile.Profile() object accumulates stats across multiple enable()/disable() cycles.
Hint 2: Divide cumtime by ncalls to get the per-request cost — use that to compare endpoints.
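The per-call comparison can be computed from the raw rows instead of being read off the table. This is a minimal sketch, assuming the stats.stats row layout (primitive calls, total calls, tottime, cumtime, callers); the per_call dictionary and slowest variable are illustrative, and the endpoints are the ones from the exercise.

```python
import cProfile
import pstats


def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}


def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]


def compute_total(orders):
    return sum(o["value"] for o in orders)


def endpoint_full(uid):
    user = fetch_user(uid)
    orders = fetch_orders(uid)
    return {"user": user, "count": len(orders), "total": compute_total(orders)}


def endpoint_lite(uid):
    return {"name": fetch_user(uid)["name"]}


pr = cProfile.Profile()
pr.enable()
for uid in range(500):
    endpoint_full(uid)
    endpoint_lite(uid)
pr.disable()

# Map each endpoint to (total calls, cumulative seconds per call).
per_call = {}
for (_file, _line, name), row in pstats.Stats(pr).stats.items():
    if name.startswith("endpoint_"):
        total_calls, cumtime = row[1], row[3]
        per_call[name] = (total_calls, cumtime / total_calls)

slowest = max(per_call, key=lambda name: per_call[name][1])
for name, (calls, seconds) in sorted(per_call.items()):
    print(f"{name}: {calls} calls, {seconds * 1e6:.1f} µs per call")
print(f"slowest endpoint: {slowest}")
```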
Apply a complete profile-guided optimization loop: profile v1, identify the bottleneck, implement v2 with one fix, verify correctness and speedup.
import cProfile
import pstats
import io
def count_unique_v1(records):
    """O(n^2): list scan on every membership check."""
    unique = []
    for r in records:
        if r not in unique:
            unique.append(r)
    return len(unique)
def count_unique_v2(records):
    """Implement O(n) version using a set."""
    pass
records = list(range(3000)) * 3 # 9000 items, 3000 unique
# 1. Assert both produce the same result (after implementing v2)
# 2. Profile each version separately
# 3. Extract tottime programmatically from stats.stats
# 4. Print speedup
# Format:
# "v1 tottime: Xs"
# "v2 tottime: Ys"
# "Improvement: Zx"
Solution
import cProfile
import pstats
def count_unique_v1(records):
    unique = []
    for r in records:
        if r not in unique:
            unique.append(r)
    return len(unique)
def count_unique_v2(records):
    return len(set(records))
records = list(range(3000)) * 3
assert count_unique_v1(records) == count_unique_v2(records), "correctness failed"
def get_tottime(pr, fn_name):
    stats = pstats.Stats(pr)
    stats.strip_dirs()
    for key, val in stats.stats.items():
        if fn_name in key[2]:
            return val[2]  # tottime
    return 0.0
pr1 = cProfile.Profile()
pr1.enable()
count_unique_v1(records)
pr1.disable()
pr2 = cProfile.Profile()
pr2.enable()
count_unique_v2(records)
pr2.disable()
t1 = get_tottime(pr1, "count_unique_v1")
t2 = get_tottime(pr2, "count_unique_v2")
improvement = t1 / t2 if t2 > 0 else float("inf")
print(f"v1 tottime: {t1:.5f}s")
print(f"v2 tottime: {t2:.7f}s")
print(f"Improvement: {improvement:.0f}x")
Profile-guided optimization workflow:
- Profile v1 to get a baseline tottime.
- Find the hotspot: list.__contains__ is O(n) and scans up to the entire list on every check.
- Fix: replace the list with a set for O(1) average-case lookup.
- Profile v2 and confirm the improvement with real numbers.
- stats.stats dict: key=(file, lineno, name), value=(primitive_calls, total_calls, tottime, cumtime, callers).
- Access raw data programmatically for automated CI regression tracking.
Expected Output
v1 tottime: Xs
v2 tottime: Ys
Improvement: Zx
Hints
Hint 1: Profile v1, identify the hotspot by tottime, make one targeted fix, profile v2, confirm improvement.
Hint 2: Access raw profiling data via stats.stats dict: key=(file, line, name), value=(primitive_calls, total_calls, tottime, cumtime, callers).
