Python cProfile and pstats Practice Problems & Exercises

Practice: cProfile and pstats

11 problems · 4 Easy · 4 Medium · 3 Hard · 65–95 min

Easy

#1 cProfile.run() Basics (Easy)

Tags: cProfile, run, profiling, basic

Use cProfile.run() to profile a function that computes the sum of squares for numbers 0–9999.

import cProfile

def sum_of_squares(n):
    return sum(i * i for i in range(n))

# Profile sum_of_squares(10000) using cProfile.run()
# Then print: "Profile output printed with function stats"

Solution

import cProfile

def sum_of_squares(n):
    return sum(i * i for i in range(n))

cProfile.run("sum_of_squares(10000)")
print("Profile output printed with function stats")

cProfile.run() output columns:

ncalls: how many times the function was called.
tottime: time spent in the function body only (excluding callees).
cumtime: total time including all nested calls (cumulative).
percall: appears twice. The first is tottime divided by ncalls; the second is cumtime divided by the number of primitive (non-recursive) calls.

cProfile.run() is the quickest way to profile a single statement. For programmatic control, use cProfile.Profile().
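
The column definitions can be checked against the raw numbers. The sketch below is an illustration rather than part of the exercise: it profiles the same function with Profile.runcall() (instead of cProfile.run(), so the return value is kept) and recomputes both percall values from the tuple that pstats stores for each function.

```python
import cProfile
import pstats

def sum_of_squares(n):
    return sum(i * i for i in range(n))

pr = cProfile.Profile()
result = pr.runcall(sum_of_squares, 10000)  # profile one call and keep its return value

# pstats keeps one entry per function:
#   key   = (filename, lineno, function name)
#   value = (primitive calls, total calls, tottime, cumtime, callers)
rows = pstats.Stats(pr).stats
for (filename, lineno, name), (prim, total, tottime, cumtime, _callers) in rows.items():
    if name == "sum_of_squares":
        first_percall = tottime / total   # first percall column
        second_percall = cumtime / prim   # second percall column
        print(f"{name}: ncalls={total} tottime={tottime:.6f} cumtime={cumtime:.6f}")
        print(f"percall (tottime)={first_percall:.6f}  percall (cumtime)={second_percall:.6f}")
```

Because the generator expression does the multiplications, sum_of_squares itself should show a small tottime and a much larger cumtime.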

Expected Output

Profile output printed with function stats

Hints

Hint 1: cProfile.run(statement) profiles the statement string and prints results to stdout.

Hint 2: The output columns are: ncalls, tottime, percall, cumtime, percall, filename:lineno(function).

#2 cProfile.Profile() Context Manager Pattern (Easy)

Tags: cProfile, Profile, enable-disable, pstats

Use cProfile.Profile() with enable()/disable() to profile a data pipeline, capturing stats to a StringIO buffer.

import cProfile
import pstats
import io

def process_items(items):
    return [item ** 2 + item for item in items]

def filter_evens(numbers):
    return [n for n in numbers if n % 2 == 0]

def pipeline(n):
    data = list(range(n))
    processed = process_items(data)
    return filter_evens(processed)

# Profile pipeline(5000) using pr.enable()/pr.disable() with try/finally
# Capture stats output to StringIO, sort by cumulative, print top 5
# Final print: "Profiling complete"

Solution

import cProfile
import pstats
import io

def process_items(items):
    return [item ** 2 + item for item in items]

def filter_evens(numbers):
    return [n for n in numbers if n % 2 == 0]

def pipeline(n):
    data = list(range(n))
    processed = process_items(data)
    return filter_evens(processed)

pr = cProfile.Profile()
pr.enable()
try:
    pipeline(5000)
finally:
    pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("cumulative")
stats.print_stats(5)
print(buf.getvalue())
print("Profiling complete")

Profile() vs run():

cProfile.Profile() gives programmatic control — enable/disable around specific sections.
pr.enable() and pr.disable() can be called multiple times; stats accumulate across sections.
try/finally guarantees pr.disable() even if the profiled code raises.
Pass stream=buf to pstats.Stats to capture output to a string — useful in tests and CI.
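
On Python 3.8 and later, cProfile.Profile can also be used directly as a context manager, which calls enable() on entry and disable() on exit. The following is a minimal sketch of the same pipeline profiled that way; the variable names evens and report are chosen here for illustration.

```python
import cProfile
import io
import pstats

def process_items(items):
    return [item ** 2 + item for item in items]

def filter_evens(numbers):
    return [n for n in numbers if n % 2 == 0]

def pipeline(n):
    data = list(range(n))
    processed = process_items(data)
    return filter_evens(processed)

# The with-block replaces the explicit enable()/try/finally/disable() sequence.
with cProfile.Profile() as pr:
    evens = pipeline(5000)

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
print("Profiling complete")
```

The explicit enable()/disable() form is still needed when the start and end of the profiled region live in different functions.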

Expected Output

Profiling complete

Hints

Hint 1: Call pr.enable() before and pr.disable() after the code you want to profile.

Hint 2: Wrap in try/finally to guarantee disable() is called even if an exception occurs.

#3 pstats Sorting: tottime vs cumtime (Easy)

Tags: pstats, sort_stats, tottime, cumtime

Profile a nested call structure and demonstrate that tottime and cumtime pinpoint different functions.

import cProfile
import pstats
import io

def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total

def middle(n):
    return inner_loop(n)

def outer_function(n):
    return [middle(n) for _ in range(5)]

pr = cProfile.Profile()
pr.enable()
outer_function(20000)
pr.disable()

# Sort by tottime — which function is on top?
# Sort by cumtime — which function is on top?
# Print:
# "Top tottime: inner_loop"
# "Top cumtime: outer_function"

Solution

import cProfile
import pstats
import io

def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total

def middle(n):
    return inner_loop(n)

def outer_function(n):
    return [middle(n) for _ in range(5)]

pr = cProfile.Profile()
pr.enable()
outer_function(20000)
pr.disable()

# Sort by tottime — identifies function body doing the most work
buf_tt = io.StringIO()
pstats.Stats(pr, stream=buf_tt).sort_stats("tottime").print_stats(3)
# Sort by cumtime — identifies the most expensive call path
buf_ct = io.StringIO()
pstats.Stats(pr, stream=buf_ct).sort_stats("cumulative").print_stats(3)

print("Top tottime: inner_loop")
print("Top cumtime: outer_function")
print("\n--- tottime view ---")
print(buf_tt.getvalue())
print("--- cumtime view ---")
print(buf_ct.getvalue())

tottime vs cumtime decision guide:

tottime: use to find which function body is doing the most work. Optimize that function.
cumtime: use to find which call path is most expensive. Reduce call frequency or restructure.
inner_loop has high tottime — the actual arithmetic loop runs there.
outer_function has high cumtime — it orchestrates five calls through middle to inner_loop.
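
The two "Top ..." lines can also be derived from the profile instead of being written by hand. This is a sketch under the assumption that reading the stats dictionary of a pstats.Stats object is acceptable for the exercise; it picks the function with the largest tottime and the one with the largest cumtime.

```python
import cProfile
import pstats

def inner_loop(n):
    total = 0
    for i in range(n):
        total += i
    return total

def middle(n):
    return inner_loop(n)

def outer_function(n):
    return [middle(n) for _ in range(5)]

pr = cProfile.Profile()
pr.runcall(outer_function, 20000)

# key = (filename, lineno, name); value[2] is tottime, value[3] is cumtime
rows = pstats.Stats(pr).stats
top_tottime = max(rows, key=lambda k: rows[k][2])[2]
top_cumtime = max(rows, key=lambda k: rows[k][3])[2]

print(f"Top tottime: {top_tottime}")
print(f"Top cumtime: {top_cumtime}")
```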

Expected Output

Top tottime: inner_loop
Top cumtime: outer_function

Hints

Hint 1: tottime is time in the function body only — use it to find the busiest function.

Hint 2: cumtime includes all callees — use it to find the most expensive call path.

#4 Reading ncalls for Recursive Functions (Easy)

Tags: pstats, ncalls, recursive, fibonacci

Profile a naive recursive Fibonacci function and read the ncalls column to diagnose exponential recursion.

import cProfile
import pstats
import io

def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

pr = cProfile.Profile()
pr.enable()
fib(25)
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("ncalls").print_stats(5)
print(buf.getvalue())
print("ncalls shows primitive/total for recursive functions")

Solution

import cProfile
import pstats
import io

def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

pr = cProfile.Profile()
pr.enable()
fib(25)
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("ncalls").print_stats(5)
print(buf.getvalue())
print("ncalls shows primitive/total for recursive functions")

Understanding ncalls:

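fib(25) makes 242,785 total calls. ncalls shows 242785/1: the first number is the total call count, the second is the number of primitive (non-recursive) calls.
This immediately flags exponential recursion: the call count grows roughly as O(2^n).
Adding @functools.lru_cache means the body of fib runs only 26 times, once for each n from 0 to 25.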
For non-recursive functions, ncalls shows a single integer.
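
The effect of memoization on the call count can be confirmed by profiling both versions. In this sketch, fib_cached and the helper total_calls are names introduced for illustration; the helper reads the total call count from the stats dictionary rather than parsing the printed table.

```python
import cProfile
import functools
import pstats

def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

@functools.lru_cache(maxsize=None)
def fib_cached(n):
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def total_calls(func, arg, name):
    """Profile func(arg) and return the total call count recorded for `name`."""
    pr = cProfile.Profile()
    pr.runcall(func, arg)
    # value = (primitive calls, total calls, tottime, cumtime, callers)
    for (_file, _line, fn_name), (_prim, total, *_rest) in pstats.Stats(pr).stats.items():
        if fn_name == name:
            return total
    return 0

fib_cached.cache_clear()  # start from an empty cache so every value is computed once
naive_calls = total_calls(fib, 25, "fib")
cached_calls = total_calls(fib_cached, 25, "fib_cached")

print(f"naive fib(25):  {naive_calls} calls")
print(f"cached fib(25): {cached_calls} calls")
```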

Expected Output

ncalls shows primitive/total for recursive functions

Hints

Hint 1: For recursive functions, ncalls shows "total/primitive", e.g. "242785/1" means 1 top-level call that triggered 242785 total calls.

Hint 2: High ncalls on a recursive function signals exponential recursion — add memoization.

Medium

#5 Filtering pstats Output by Filename (Medium)

Tags: pstats, print_stats, filtering, regex

Profile a multi-function program and use pstats filtering to display only functions defined in user code (not stdlib or builtins).

import cProfile
import pstats
import io

def tokenize(text):
    return text.lower().split()

def count_words(tokens):
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def top_words(text, n=5):
    tokens = tokenize(text)
    counts = count_words(tokens)
    return sorted(counts.items(), key=lambda x: x[1], reverse=True)[:n]

text = "the quick brown fox jumps over the lazy dog " * 500

pr = cProfile.Profile()
pr.enable()
top_words(text)
pr.disable()

# Print stats filtered to show only user-defined functions
# (exclude {built-in} and stdlib entries)
# Final line: "Only user-defined functions shown"

Solution

import cProfile
import pstats
import io

def tokenize(text):
    return text.lower().split()

def count_words(tokens):
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def top_words(text, n=5):
    tokens = tokenize(text)
    counts = count_words(tokens)
    return sorted(counts.items(), key=lambda x: x[1], reverse=True)[:n]

text = "the quick brown fox jumps over the lazy dog " * 500

pr = cProfile.Profile()
pr.enable()
top_words(text)
pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
# A string argument is a regex matched against "filename:lineno(function)";
# an integer argument limits the number of rows.
# Matching the user function names drops the {built-in ...} and {method ...} rows.
stats.print_stats(r"\((tokenize|count_words|top_words)\)", 10)
print(buf.getvalue())
print("Only user-defined functions shown")

pstats filtering patterns:

stats.print_stats("my_module") shows only rows whose "filename:lineno(function)" text matches the regular expression "my_module".
stats.print_stats(5) shows the top 5 rows (integer = row limit).
Combining: stats.print_stats("my_module", 10) applies the restrictions in order: filter first, then limit to 10 rows.
Strip directory prefixes with stats.strip_dirs() before printing to make filenames readable.
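
The patterns above can be combined in one call. The sketch below is one possible way to do it, not the only one: it strips directories, filters by a regex over the user function names (the variable user_functions is introduced here), and limits the result to two rows.

```python
import cProfile
import io
import pstats

def tokenize(text):
    return text.lower().split()

def count_words(tokens):
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def top_words(text, n=5):
    tokens = tokenize(text)
    counts = count_words(tokens)
    return sorted(counts.items(), key=lambda x: x[1], reverse=True)[:n]

text = "the quick brown fox jumps over the lazy dog " * 500

pr = cProfile.Profile()
pr.runcall(top_words, text)

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).strip_dirs().sort_stats("tottime")

# Restrictions apply left to right: regex filter first, then the row limit.
user_functions = r"\((tokenize|count_words|top_words)\)"
stats.print_stats(user_functions, 2)

filtered = buf.getvalue()
print(filtered)
```

Built-in rows such as str.split or sorted never reach the output, because their display text has no "(name)" suffix for the regex to match.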

Expected Output

Only user-defined functions shown

Hints

Hint 1: pstats.Stats.print_stats(pattern) accepts a string, treated as a regular expression; only rows whose filename:lineno(function) text matches it are printed.

Hint 2: Use the module name or a distinctive path segment to exclude stdlib and built-in entries.

#6 Callers and Callees Analysis (Medium)

Tags: pstats, print_callers, print_callees, call-graph

Profile an ETL pipeline and use print_callers / print_callees to understand the hotspot's call context.

import cProfile
import pstats
import io

def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}

def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0

def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}

def run_etl(records):
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]

records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]

pr = cProfile.Profile()
pr.enable()
run_etl(records)
pr.disable()

# Print top 5 by tottime, then callers of parse_record, then callees of run_etl
# Final: "callers and callees printed for hotspot"

Solution

import cProfile
import pstats
import io

def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}

def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0

def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}

def run_etl(records):
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]

records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]

pr = cProfile.Profile()
pr.enable()
run_etl(records)
pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
stats.print_stats(5)
stats.print_callers("parse_record")
stats.print_callees("run_etl")
print(buf.getvalue())
print("callers and callees printed for hotspot")

Call graph analysis workflow:

print_stats(10) sorted by tottime — find the hotspot function.
print_callers("hotspot") — see which function drives the hotspot and how often.
print_callees("hotspot") — see what the hotspot calls internally.
Here parse_record is called 6000× (twice per record due to the list comprehension calling it in both transform and validate expressions). Fix: call once and reuse the result.
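
The suggested fix can be sketched and verified with the profiler itself. The names run_etl_once and profile_calls are introduced here for illustration; the check is simply that parse_record is recorded half as often while the output stays identical.

```python
import cProfile
import pstats

def parse_record(record):
    parts = record.split(",")
    return {"id": int(parts[0]), "value": float(parts[1]), "label": parts[2]}

def validate(rec):
    return rec["value"] > 0 and len(rec["label"]) > 0

def transform(rec):
    return {"id": rec["id"], "score": rec["value"] * 1.5}

def run_etl(records):
    # Original version: parses each record twice.
    return [transform(parse_record(r)) for r in records if validate(parse_record(r))]

def run_etl_once(records):
    # Fixed version: parse once, reuse the parsed record for validate and transform.
    out = []
    for r in records:
        rec = parse_record(r)
        if validate(rec):
            out.append(transform(rec))
    return out

records = [f"{i},{i*0.5},label_{i}" for i in range(1, 3001)]

def profile_calls(etl, name):
    """Run etl(records) under cProfile; return its result and the call count for `name`."""
    pr = cProfile.Profile()
    result = pr.runcall(etl, records)
    rows = pstats.Stats(pr).stats
    total = sum(v[1] for k, v in rows.items() if k[2] == name)
    return result, total

before_result, before_calls = profile_calls(run_etl, "parse_record")
after_result, after_calls = profile_calls(run_etl_once, "parse_record")

print(f"parse_record calls before: {before_calls}, after: {after_calls}")
```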

Expected Output

callers and callees printed for hotspot

Hints

Hint 1: stats.print_callers("fn_name") shows which functions call fn_name and how many times.

Hint 2: stats.print_callees("fn_name") shows what fn_name calls internally and the time cost.

#7 Profiling Decorator (Medium)

Tags: profiling, decorator, cProfile, functools

Write a @profile_fn decorator that automatically profiles any decorated function and prints the top-5 stats by cumtime after each call.

import cProfile
import pstats
import io
import functools

def profile_fn(fn):
    """Decorator: profiles fn and prints top-5 cumtime stats after each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # YOUR CODE HERE
        pass
    return wrapper

@profile_fn
def matrix_multiply(n):
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

matrix_multiply(25)
print("decorated function profiled automatically")

Solution

import cProfile
import pstats
import io
import functools

def profile_fn(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()
        try:
            result = fn(*args, **kwargs)
        finally:
            pr.disable()
        buf = io.StringIO()
        pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
        print(buf.getvalue())
        return result
    return wrapper

@profile_fn
def matrix_multiply(n):
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

matrix_multiply(25)
print("decorated function profiled automatically")

Profiling decorator design:

functools.wraps(fn) preserves __name__ and __doc__, and sets __wrapped__ to the original function.
try/finally ensures pr.disable() is always called even if fn raises.
For persistent profiling across multiple calls, accumulate pr as a closure variable and print on demand.
This pattern suits targeted, temporary profiling: decorate one function during an investigation, then remove the decorator. cProfile adds overhead to every profiled call, so avoid leaving it enabled in production.
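
The accumulating variant mentioned above might look like the following sketch. The decorator name profile_accumulate and the attributes report and profiler attached to the wrapper are hypothetical choices made for this example, not part of cProfile.

```python
import cProfile
import functools
import io
import pstats

def profile_accumulate(fn):
    """Decorator sketch: one Profile object shared by every call to fn."""
    pr = cProfile.Profile()  # lives in the closure, so stats accumulate across calls

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        pr.enable()
        try:
            return fn(*args, **kwargs)
        finally:
            pr.disable()

    def report(limit=5):
        # Build the report only when asked, instead of after every call.
        buf = io.StringIO()
        pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(limit)
        return buf.getvalue()

    wrapper.report = report    # hypothetical attribute for on-demand output
    wrapper.profiler = pr      # hypothetical attribute exposing the raw profiler
    return wrapper

@profile_accumulate
def matrix_multiply(n):
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j for j in range(n)] for i in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

for _ in range(3):
    matrix_multiply(10)

summary = matrix_multiply.report()
print(summary)
```

After three calls, the matrix_multiply row in the report should show ncalls of 3.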

Expected Output

decorated function profiled automatically

Hints

Hint 1: Wrap the function body in pr.enable()/pr.disable() inside the decorator.

Hint 2: Use functools.wraps(fn) to preserve the original function name and docstring.

#8 Finding the N-Queens Bottleneck (Medium)

Tags: profiling, backtracking, N-queens, set-optimization

Profile an N-Queens solver, identify is_safe as the bottleneck, then implement the optimized set-based version.

import cProfile
import pstats
import io

def is_safe_v1(board, row, col):
    for r in range(row):
        if board[r] == col or abs(board[r] - col) == abs(r - row):
            return False
    return True

def solve_v1(board, row, n):
    if row == n:
        return 1
    return sum(
        solve_v1(board, row + 1, n)
        for col in range(n)
        if is_safe_v1(board, row, col) and not board.__setitem__(row, col)
    )

# Profile solve_v1 for N=10 and identify the bottleneck
# Then assert result == 724

Solution

import cProfile
import pstats
import io

def is_safe_v1(board, row, col):
    for r in range(row):
        if board[r] == col or abs(board[r] - col) == abs(r - row):
            return False
    return True

def solve_v1(board, row, n, count_ref):
    if row == n:
        count_ref[0] += 1
        return
    for col in range(n):
        if is_safe_v1(board, row, col):
            board[row] = col
            solve_v1(board, row + 1, n, count_ref)
            board[row] = -1

def count_solutions_v1(n):
    board = [-1] * n
    count = [0]
    solve_v1(board, 0, n, count)
    return count[0]

pr = cProfile.Profile()
pr.enable()
result = count_solutions_v1(10)
pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("tottime")
stats.print_stats(5)
print(buf.getvalue())
print("Bottleneck identified: is_safe")
print(f"Solutions for N=10: {result}")
assert result == 724

N-Queens profiling insight:

is_safe is called for every candidate column at every partial placement the search reaches; for N=10 that is a few hundred thousand calls.
Its O(row) list scan means even small per-call savings multiply.
Optimized version uses three sets: cols holds occupied columns, diag1 holds row-col values, diag2 holds row+col values.
Optimized check: return col not in cols and (row-col) not in diag1 and (row+col) not in diag2.
The set-based version replaces the O(row) scan with O(1) average-case lookups; profile both versions to measure the actual speedup on your machine.
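
The problem statement asks for the set-based version, so here is one way it could be written. This is a sketch following the three-set description above; the names solve_v2 and count_solutions_v2 are chosen to mirror the v1 solution.

```python
def solve_v2(row, n, cols, diag1, diag2):
    """Count completions of a partial placement, tracking attacked lines in sets."""
    if row == n:
        return 1
    count = 0
    for col in range(n):
        d1, d2 = row - col, row + col
        # O(1) average-case membership tests replace the O(row) scan of is_safe_v1.
        if col in cols or d1 in diag1 or d2 in diag2:
            continue
        cols.add(col)
        diag1.add(d1)
        diag2.add(d2)
        count += solve_v2(row + 1, n, cols, diag1, diag2)
        # Backtrack: free the column and both diagonals.
        cols.remove(col)
        diag1.remove(d1)
        diag2.remove(d2)
    return count

def count_solutions_v2(n):
    return solve_v2(0, n, set(), set(), set())

result_v2 = count_solutions_v2(10)
print(f"Solutions for N=10: {result_v2}")
```

Profiling count_solutions_v2(10) the same way as v1 and comparing tottime shows how much of the original cost came from is_safe_v1.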

Expected Output

Bottleneck identified: is_safe
Solutions for N=10: 724

Hints

Hint 1: Profile the solver and check which function has the highest tottime — it should be is_safe.

Hint 2: The fix: replace list-based conflict checks with three sets (cols, diag1, diag2) for O(1) lookup.

Hard

#9 Profiling with runctx: Scoped Profiling (Hard)

Tags: cProfile, runctx, globals, locals, file-output

Use cProfile.runctx() to profile a computation referencing local data, save the profile to a temp file, then reload and print stats.

import cProfile
import pstats
import io
import tempfile
import os

def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))

def batch_weighted(batches, weights):
    return [weighted_sum(batch, weights) for batch in batches]

data_batches = [list(range(1000)) for _ in range(100)]
weight_vec   = [1.0 / 1000] * 1000

# Profile using cProfile.runctx(), save to temp file, reload and print top 5 by tottime
# Final: "scoped profiling complete with context variables"

Solution

import cProfile
import pstats
import io
import tempfile
import os

def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))

def batch_weighted(batches, weights):
    return [weighted_sum(batch, weights) for batch in batches]

data_batches = [list(range(1000)) for _ in range(100)]
weight_vec   = [1.0 / 1000] * 1000

with tempfile.NamedTemporaryFile(suffix=".prof", delete=False) as f:
    prof_path = f.name

try:
    cProfile.runctx(
        "batch_weighted(data_batches, weight_vec)",
        globals(),
        locals(),
        filename=prof_path,
    )
    buf = io.StringIO()
    pstats.Stats(prof_path, stream=buf).sort_stats("tottime").print_stats(5)
    print(buf.getvalue())
finally:
    os.unlink(prof_path)

print("scoped profiling complete with context variables")

runctx vs run:

cProfile.run(stmt) executes the statement in the __main__ module's namespace, so local variables of the calling function are invisible to it.
cProfile.runctx(stmt, globals(), locals()) passes both namespaces, giving the statement full access.
Saving to a .prof file enables sharing with teammates and loading into visualization tools like snakeviz.
snakeviz profile.prof renders the profile as an interactive icicle or sunburst chart in the browser.
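
The difference matters most when the data lives inside a function. The sketch below assumes a hypothetical wrapper, profile_batches, whose local variables are handed to runctx through an explicit dictionary (local_ns), so the statement's result can be read back afterwards.

```python
import cProfile
import io
import os
import pstats
import tempfile

def weighted_sum(data, weights):
    return sum(d * w for d, w in zip(data, weights))

def batch_weighted(batches, weights):
    return [weighted_sum(batch, weights) for batch in batches]

def profile_batches():
    # These exist only inside this function; cProfile.run() could not see them.
    data_batches = [list(range(1000)) for _ in range(100)]
    weight_vec = [1.0 / 1000] * 1000

    # Explicit locals dict: the statement reads its inputs from it and writes `result` into it.
    local_ns = {"data_batches": data_batches, "weight_vec": weight_vec}

    with tempfile.TemporaryDirectory() as tmp:
        prof_path = os.path.join(tmp, "batches.prof")
        cProfile.runctx(
            "result = batch_weighted(data_batches, weight_vec)",
            globals(),   # supplies batch_weighted
            local_ns,    # supplies the data and receives the result
            filename=prof_path,
        )
        buf = io.StringIO()
        pstats.Stats(prof_path, stream=buf).sort_stats("tottime").print_stats(5)

    return local_ns["result"], buf.getvalue()

totals, report = profile_batches()
print(report)
print("scoped profiling complete with context variables")
```

Using a temporary directory here avoids the manual os.unlink() cleanup; the delete=False approach in the solution works equally well.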

Expected Output

scoped profiling complete with context variables

Hints

Hint 1: cProfile.runctx(stmt, globals(), locals()) profiles a string statement that references local variables.

Hint 2: Pass filename= to save the profile to a .prof file — loadable with pstats.Stats(path).

#10 Accumulated Profiling Across Multiple Requests (Hard)

Tags: cProfile, accumulated-profiling, web, cumtime, per-call

Simulate 500 calls to two web endpoints on a single cProfile.Profile() object, then identify the slowest endpoint by per-call cumtime.

import cProfile
import pstats
import io

def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}

def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]

def compute_total(orders):
    return sum(o["value"] for o in orders)

def endpoint_full(uid):
    user   = fetch_user(uid)
    orders = fetch_orders(uid)
    total  = compute_total(orders)
    return {"user": user, "count": len(orders), "total": total}

def endpoint_lite(uid):
    user = fetch_user(uid)
    return {"name": user["name"]}

# Profile 500 calls to each endpoint using ONE cProfile.Profile() object
# Sort by cumulative time, print top 8 functions
# Final: "slowest endpoint identified by cumtime per call"

Solution

import cProfile
import pstats
import io

def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}

def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]

def compute_total(orders):
    return sum(o["value"] for o in orders)

def endpoint_full(uid):
    user   = fetch_user(uid)
    orders = fetch_orders(uid)
    total  = compute_total(orders)
    return {"user": user, "count": len(orders), "total": total}

def endpoint_lite(uid):
    user = fetch_user(uid)
    return {"name": user["name"]}

pr = cProfile.Profile()

pr.enable()
for i in range(500):
    endpoint_full(i)
pr.disable()

pr.enable()
for i in range(500):
    endpoint_lite(i)
pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("cumulative")
stats.print_stats(8)
print(buf.getvalue())
print("slowest endpoint identified by cumtime per call")

Accumulated profiling pattern:

pr.enable() / pr.disable() accumulates stats across multiple cycles on the same Profile object.
This mirrors production profiling middleware: enable on request entry, disable on response exit, dump stats periodically.
cumtime / ncalls gives per-request cost — compare endpoints at the same call volume.
endpoint_full calls fetch_orders (50 dict allocs) and compute_total on every request — it will dominate cumtime.
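
The per-request comparison can be computed rather than read off the table. In this sketch, handle is a hypothetical stand-in for middleware that enables the shared profiler around each request, and per_call maps each endpoint name to cumtime divided by its call count.

```python
import cProfile
import pstats

def fetch_user(uid):
    return {"id": uid, "name": f"user_{uid}"}

def fetch_orders(uid):
    return [{"order_id": i, "user": uid, "value": i * 2.5} for i in range(50)]

def compute_total(orders):
    return sum(o["value"] for o in orders)

def endpoint_full(uid):
    user = fetch_user(uid)
    orders = fetch_orders(uid)
    total = compute_total(orders)
    return {"user": user, "count": len(orders), "total": total}

def endpoint_lite(uid):
    user = fetch_user(uid)
    return {"name": user["name"]}

pr = cProfile.Profile()  # one profiler shared by all requests

def handle(endpoint, uid):
    """Hypothetical middleware hook: profile exactly one request."""
    pr.enable()
    try:
        return endpoint(uid)
    finally:
        pr.disable()

for uid in range(500):
    handle(endpoint_full, uid)
    handle(endpoint_lite, uid)

# value = (primitive calls, total calls, tottime, cumtime, callers)
per_call = {}
for (_file, _line, name), (prim, _total, _tt, cumtime, _callers) in pstats.Stats(pr).stats.items():
    if name.startswith("endpoint_"):
        per_call[name] = cumtime / prim

slowest = max(per_call, key=per_call.get)
for name, cost in sorted(per_call.items()):
    print(f"{name}: {cost * 1e6:.1f} microseconds per call")
print(f"slowest endpoint: {slowest}")
```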

Expected Output

slowest endpoint identified by cumtime per call

Hints

Hint 1: A single cProfile.Profile() object accumulates stats across multiple enable()/disable() cycles.

Hint 2: Divide cumtime by ncalls to get the per-request cost — use that to compare endpoints.

#11 Profile-Guided Optimization: Iterative Workflow (Hard)

Tags: profiling, iterative-optimization, before-after, pstats, set

Apply a complete profile-guided optimization loop: profile v1, identify the bottleneck, implement v2 with one fix, verify correctness and speedup.

import cProfile
import pstats
import io

def count_unique_v1(records):
    """O(n^2): list scan on every membership check."""
    unique = []
    for r in records:
        if r not in unique:
            unique.append(r)
    return len(unique)

def count_unique_v2(records):
    """Implement O(n) version using a set."""
    pass

records = list(range(3000)) * 3  # 9000 items, 3000 unique

# 1. Assert both produce the same result (after implementing v2)
# 2. Profile each version separately
# 3. Extract tottime programmatically from stats.stats
# 4. Print speedup
# Format:
# "v1 tottime: Xs"
# "v2 tottime: Ys"
# "Improvement: Zx"

Solution

import cProfile
import pstats

def count_unique_v1(records):
    unique = []
    for r in records:
        if r not in unique:
            unique.append(r)
    return len(unique)

def count_unique_v2(records):
    return len(set(records))

records = list(range(3000)) * 3

assert count_unique_v1(records) == count_unique_v2(records), "correctness failed"

def get_tottime(pr, fn_name):
    stats = pstats.Stats(pr)
    stats.strip_dirs()
    for key, val in stats.stats.items():
        if fn_name in key[2]:
            return val[2]  # tottime
    return 0.0

pr1 = cProfile.Profile()
pr1.enable()
count_unique_v1(records)
pr1.disable()

pr2 = cProfile.Profile()
pr2.enable()
count_unique_v2(records)
pr2.disable()

t1 = get_tottime(pr1, "count_unique_v1")
t2 = get_tottime(pr2, "count_unique_v2")
improvement = t1 / t2 if t2 > 0 else float("inf")

print(f"v1 tottime: {t1:.5f}s")
print(f"v2 tottime: {t2:.7f}s")
print(f"Improvement: {improvement:.0f}x")

Profile-guided optimization workflow:

Profile v1: get baseline tottime.
Find the hotspot: the "r not in unique" membership test on a list is O(n), scanning up to the whole list on every check.
Fix: replace list with set for O(1) average-case lookup.
Profile v2: confirm improvement with real numbers.
stats.stats dict: key=(file, lineno, name), value=(primitive_calls, total_calls, tottime, cumtime, callers).
Access raw data programmatically for automated CI regression tracking.
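
On Python 3.9 and later, pstats.Stats.get_stats_profile() offers a named alternative to indexing the raw tuples. The sketch below assumes that API is available; function_profile is a helper name introduced here. Note that the values it returns are rounded the same way as the printed table, so the raw stats dictionary remains the better choice for very small timings.

```python
import cProfile
import pstats

def count_unique_v1(records):
    unique = []
    for r in records:
        if r not in unique:
            unique.append(r)
    return len(unique)

def count_unique_v2(records):
    return len(set(records))

records = list(range(1000)) * 3  # smaller input than the exercise, to keep the demo quick

def function_profile(func, arg):
    """Profile func(arg) and return the FunctionProfile entry for func."""
    pr = cProfile.Profile()
    pr.runcall(func, arg)
    profile = pstats.Stats(pr).get_stats_profile()  # requires Python 3.9+
    return profile.func_profiles[func.__name__]

v1 = function_profile(count_unique_v1, records)
v2 = function_profile(count_unique_v2, records)

# Fields are named, so no tuple positions need to be remembered.
print(f"v1: ncalls={v1.ncalls} tottime={v1.tottime} cumtime={v1.cumtime}")
print(f"v2: ncalls={v2.ncalls} tottime={v2.tottime} cumtime={v2.cumtime}")
```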

Expected Output

v1 tottime: Xs
v2 tottime: Ys
Improvement: Zx

Hints

Hint 1: Profile v1, identify the hotspot by tottime, make one targeted fix, profile v2, confirm improvement.

Hint 2: Access raw profiling data via stats.stats dict: key=(file, line, name), value=(primitive_calls, total_calls, tottime, cumtime, callers).
