Debugging Strategies - Systematic Approaches to Finding Bugs

Reading time: ~20 minutes | Level: Foundation → Engineering

Here is a program with a bug. Most developers spend 20 minutes guessing. Expert debuggers find it in 90 seconds. The difference is method, not luck:

def calculate_average(numbers):
    total = 0
    for n in numbers:
        total = total + n
    return total / len(numbers)

data = [10, 20, 30, 40, 50]
result = calculate_average(data)
print(f"Average: {result}")

# Keep only values above 100 to simulate a filtered dataset (nothing matches)
filtered = [x for x in data if x > 100]
average = calculate_average(filtered)
print(f"Filtered average: {average}")

Average: 30.0
Traceback (most recent call last):
  File "calc.py", line 13, in <module>
    average = calculate_average(filtered)
  File "calc.py", line 5, in calculate_average
    return total / len(numbers)
ZeroDivisionError: division by zero

An expert reads the traceback in five seconds and knows exactly what happened: filtered is an empty list (x > 100 matches nothing), so len(numbers) is 0. The fix is one line. The expert did not guess - they read the evidence.
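A minimal sketch of that fix, assuming that raising a ValueError is the right response to empty input (returning a default such as 0.0 or None is an equally plausible choice, depending on what the caller expects):

```python
def calculate_average(numbers):
    """Return the arithmetic mean of numbers; reject empty input explicitly."""
    if not numbers:  # The guard: fail with a clear message instead of dividing by zero
        raise ValueError("calculate_average() requires at least one number")
    total = 0
    for n in numbers:
        total = total + n
    return total / len(numbers)


data = [10, 20, 30, 40, 50]
print(f"Average: {calculate_average(data)}")

filtered = [x for x in data if x > 100]  # Empty: no value exceeds 100
try:
    calculate_average(filtered)
except ValueError as exc:
    print(f"Cannot average filtered data: {exc}")
```

The error still surfaces, but now it names the real problem (empty input) instead of a symptom (division by zero).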

This lesson teaches you to read evidence the way experts do.

What You Will Learn

The scientific method of debugging and why random code changes make bugs worse
Reading Python tracebacks from bottom to top - the correct direction
pdb and breakpoint(): the Python debugger and its essential commands
Post-mortem debugging with pdb.pm() after an unhandled exception
Print debugging with repr() and pprint when it is the right tool
The traceback module: capturing and formatting exception information programmatically
sys.exc_info(): accessing exception type, value, and traceback in code
warnings.warn(): soft errors and deprecation notices
Performance debugging: cProfile, line_profiler, memory_profiler
Real-world strategies: bisecting bugs, minimal reproduction, checking git diff

Prerequisites

Python 3.8+ installed
Understanding of Python exceptions, try/except, and tracebacks
Basic comfort with running Python scripts from the terminal

Part 1 - The Scientific Method of Debugging

Debugging is not guessing. It is the scientific method applied to code.

Key rule: change one thing at a time. Random multi-change commits destroy reproducibility.

What Not to Do

# BAD debugging workflow - every professional has done this, and it always wastes time
# 1. See an error
# 2. Make 3 random changes based on vague intuitions
# 3. Run the code - different error
# 4. Revert all changes because you don't know which one "fixed" what
# 5. Make more random changes
# 6. Somehow it works but you don't know why
# 7. Commit it and hope no one notices

The scientific method prevents this. Form one hypothesis. Test it. Conclude. Repeat.

Part 2 - Reading Tracebacks

A traceback is a complete description of what happened and where. It is not noise - it is evidence.

The Anatomy of a Traceback

# buggy.py
def parse_config(text):
    lines = text.split("\n")
    return {line.split("=")[0]: line.split("=")[1] for line in lines}

def load_settings(filename):
    with open(filename) as f:
        content = f.read()
    return parse_config(content)

def start_app():
    settings = load_settings("config.txt")
    port = int(settings["port"])
    print(f"Starting on port {port}")

start_app()

Traceback (most recent call last):
  File "buggy.py", line 14, in <module>
    start_app()
  File "buggy.py", line 11, in start_app
    settings = load_settings("config.txt")
  File "buggy.py", line 7, in load_settings
    return parse_config(content)
  File "buggy.py", line 2, in parse_config
    lines = text.split("\n")
AttributeError: 'NoneType' object has no attribute 'split'

How to Read It - Bottom to Top

Reading Order	Traceback Line	Question It Answers
1 (start here)	`AttributeError: 'NoneType'...`	What happened?
2	`File "buggy.py", line 2, in parse_config`	Where exactly?
3	`lines = text.split("\n")`	What code failed?
4	`File "buggy.py", line 7, in load_settings`	Who called it?
5	`return parse_config(content)`	With what argument?
6	`File "buggy.py", line 11, in start_app`	And before that?
7 (entry point)	`File "buggy.py", line 14, in <module>`	Where execution started

Reading this:

The error is AttributeError: 'NoneType' object has no attribute 'split'
It happened on line 2 of parse_config, where text.split("\n") was called
text is None - it is the content value that load_settings passed in
f.read() on a text file always returns a string (an empty file yields ""), never None, so the None cannot have come from the read itself

The bug: something other than f.read() put None into content. The traceback proves that parse_config received None and that the value arrived through load_settings, so that is the code path to inspect. A likely culprit is an except block that swallows the failure from open() and falls back to None. Note that a content variable that was never assigned at all would raise UnboundLocalError rather than hold None. The traceback tells you exactly where to look.
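One hypothetical way to end up with the traceback above is a version of load_settings that swallows the error from open() and falls back to None. The load_settings_swallowing variant below is an assumption made for illustration, not the code from buggy.py; it reproduces the same AttributeError, and the second variant shows how letting the failure propagate exposes the real cause.

```python
def parse_config(text):
    lines = text.split("\n")
    return {line.split("=")[0]: line.split("=")[1] for line in lines}


def load_settings_swallowing(filename):
    """Hypothetical buggy variant: hides the real failure and passes None along."""
    content = None
    try:
        with open(filename) as f:
            content = f.read()
    except OSError:
        pass  # The real error (e.g. FileNotFoundError) is silently discarded here
    return parse_config(content)


def load_settings(filename):
    """Safer variant: let open() fail loudly so the traceback names the real problem."""
    with open(filename) as f:
        return parse_config(f.read())


try:
    load_settings_swallowing("no_such_config.txt")
except AttributeError as exc:
    print(f"Misleading symptom: {exc}")

try:
    load_settings("no_such_config.txt")
except FileNotFoundError as exc:
    print(f"Real cause: {exc}")
```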

Your Code vs Library Code

Traceback (most recent call last):
  File "app.py", line 45, in handle_request          ← YOUR CODE
    result = client.post(url, json=payload)
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 923   ← LIBRARY
    return self._send_single_request(request)
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 960   ← LIBRARY
    response = transport.handle_request(request)
  File "/usr/lib/python3.11/site-packages/httpx/_transports/default.py", line 230
    raise ConnectError(...)
httpx.ConnectError: [Errno -2] Name or service not known

Work upward from the error until you reach your code - that is where the bug usually is. Library code in /site-packages/ is rarely the source of bugs. Your code passed wrong arguments, wrong types, or is not handling a failure case.
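The "climb until you reach your code" rule can be sketched with the traceback module. The heuristic in is_library_frame (standard library directory or any site-packages path) and the load_payload function are assumptions chosen for this illustration; real projects may need a stricter definition of "your code".

```python
import json
import sysconfig
import traceback

# Assumption: anything under the standard library directory or a site-packages
# directory counts as library code; everything else is treated as yours.
STDLIB_DIR = sysconfig.get_paths()["stdlib"]


def is_library_frame(filename):
    return "site-packages" in filename or filename.startswith(STDLIB_DIR)


def innermost_own_frame(tb):
    """Return the deepest traceback frame that belongs to your code, or None."""
    own = [f for f in traceback.extract_tb(tb) if not is_library_frame(f.filename)]
    return own[-1] if own else None


def load_payload(raw):
    return json.loads(raw)  # The library raises, but the bad input came from here


try:
    load_payload("{not valid json")
except json.JSONDecodeError as exc:
    frame = innermost_own_frame(exc.__traceback__)
    print(f"Start looking at {frame.name}(), line {frame.lineno}")
```

The exception is raised deep inside the json package, yet the frame worth reading first is load_payload, the last place where your code still controlled the data.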

Part 3 - pdb: The Python Debugger

pdb is Python's built-in interactive debugger. Unlike print statements, it lets you pause execution mid-run and inspect any variable, call any function, and step line by line through your code.

Inserting a Breakpoint

# Method 1: The classic way (Python 2 and 3)
import pdb; pdb.set_trace()

# Method 2: Modern way (Python 3.7+) - preferred
breakpoint()
# breakpoint() calls sys.breakpointhook(), which defaults to pdb.set_trace()
# You can override it with the PYTHONBREAKPOINT environment variable

When Python hits breakpoint(), execution pauses and you get an interactive prompt:

> /path/to/script.py(12)calculate_average()
-> return total / len(numbers)
(Pdb)

The (Pdb) prompt means you are inside the debugger. You can now inspect state.
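To see the hook mechanism behind breakpoint() without opening a debugger, the sketch below temporarily replaces sys.breakpointhook with a function that only records where it was called. The recording_hook function and the calls list are illustrative names, not part of pdb.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    """Stand-in for a debugger: record where breakpoint() was called."""
    frame = sys._getframe(1)  # The Python frame that called breakpoint()
    calls.append((frame.f_code.co_name, frame.f_lineno))


def calculate_average(numbers):
    total = sum(numbers)
    breakpoint()  # Dispatches to whatever sys.breakpointhook currently is
    return total / len(numbers)


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    calculate_average([10, 20, 30])
finally:
    sys.breakpointhook = original_hook  # Restore the previous hook

print(calls)
```

This is the same indirection that lets PYTHONBREAKPOINT swap in another debugger: breakpoint() never refers to pdb directly.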

Essential pdb Commands

Command	What It Does
`n`	next - execute the current line, stay in current frame
`s`	step - step into a function call
`c`	continue - run until the next breakpoint or end
`q`	quit - exit pdb (raises BdbQuit)
`p expr`	print - evaluate and print expr
`pp expr`	pretty-print - like p but formatted for dicts/lists
`l`	list - show 11 lines of source around current line
`l N`	list - show source around line N
`ll`	longlist - show the entire current function
`w`	where - print the call stack (alias: bt)
`u`	up - move up one frame in the call stack
`d`	down - move down one frame in the call stack
`b N`	break - set breakpoint at line N
`b func`	break - set breakpoint at start of function
`cl N`	clear - remove breakpoint N
`h`	help - list all commands
`h cmd`	help cmd - describe the command
`!stmt`	execute a Python statement (! is needed only when the statement would otherwise be parsed as a pdb command)
`r`	return - continue until current function returns
`a`	args - print arguments of current function

A Complete pdb Session

# debug_demo.py
def parse_record(record):
    parts = record.split(",")
    name = parts[0].strip()
    age = int(parts[1].strip())    # Bug: parts[1] might be "twenty" not "20"
    score = float(parts[2].strip())
    return {"name": name, "age": age, "score": score}

records = [
    "Alice, 30, 95.5",
    "Bob, twenty, 87.0",   # Bug: age is not a number
    "Charlie, 25, 91.2",
]

breakpoint()   # Pause here before the loop

results = [parse_record(r) for r in records]

> debug_demo.py(17)<module>()
-> results = [parse_record(r) for r in records]
(Pdb) p records
['Alice, 30, 95.5', 'Bob, twenty, 87.0', 'Charlie, 25, 91.2']
(Pdb) p records[1]
'Bob, twenty, 87.0'
(Pdb) b parse_record, "twenty" in record   # Stop in parse_record only for the suspicious record
(Pdb) c      # Continue until that condition is true
> debug_demo.py(3)parse_record()
-> parts = record.split(",")
(Pdb) p record
'Bob, twenty, 87.0'
(Pdb) n
(Pdb) p parts
['Bob', ' twenty', ' 87.0']
(Pdb) n
-> age = int(parts[1].strip())
(Pdb) p parts[1].strip()
'twenty'
(Pdb) n
ValueError: invalid literal for int() with base 10: 'twenty'

You have now confirmed the exact line, the exact value, and the exact problem - without guessing.

:::tip Disable Breakpoints Without Deleting Them
Set the PYTHONBREAKPOINT=0 environment variable to make all breakpoint() calls no-ops. This lets you leave breakpoints in code during development without having them fire in CI:

PYTHONBREAKPOINT=0 python script.py

:::

Part 4 - Post-Mortem Debugging

Post-mortem debugging means inspecting the program state after a crash, without needing to set breakpoints in advance.

`pdb.pm()` - The Most Powerful Debugging Tool You Are Not Using

# Run your script normally in a REPL or with python -i
# -i means "enter interactive mode after running"

python -i buggy_script.py

If the script crashes:

Traceback (most recent call last):
  File "buggy_script.py", line 12, in <module>
    result = calculate(data)
  File "buggy_script.py", line 8, in calculate
    value = int(raw_input)
ValueError: invalid literal for int() with base 10: 'abc'
>>>

Now in the interactive REPL, run:

>>> import pdb
>>> pdb.pm()
> buggy_script.py(8)calculate()
-> value = int(raw_input)
(Pdb) p raw_input
'abc'
(Pdb) p data
{'key': 'abc', 'other': 42}

pdb.pm() opens the debugger at the exact frame where the unhandled exception occurred, with all variables intact. You can inspect the full call stack, print variables, and understand exactly what state the program was in when it crashed - without re-running or guessing.

Using `pdb.post_mortem()` Programmatically

import pdb
import sys


def risky_operation():
    # Placeholder for your code that might crash
    return int("abc")


def run_with_postmortem():
    try:
        risky_operation()
    except Exception:
        # Get the current exception's traceback
        tb = sys.exc_info()[2]
        # Open pdb at the crash site
        pdb.post_mortem(tb)

run_with_postmortem()

This is useful in long-running scripts where you want automatic post-mortem on any unhandled exception.
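For the "any unhandled exception" case, one option is to install the behavior globally through sys.excepthook instead of wrapping each call in try/except. This is a sketch under two assumptions: the script runs interactively (the hook checks for a terminal before opening pdb), and the function names postmortem_excepthook and install_postmortem_hook are illustrative.

```python
import pdb
import sys
import traceback


def postmortem_excepthook(exc_type, exc_value, exc_tb):
    """Print the traceback, then open pdb at the crash site when a terminal is attached."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Ctrl-C is not a bug; fall back to the default handler
        sys.__excepthook__(exc_type, exc_value, exc_tb)
        return
    traceback.print_exception(exc_type, exc_value, exc_tb)
    if sys.stdin.isatty() and sys.stderr.isatty():
        pdb.post_mortem(exc_tb)  # Opens the debugger at the frame that raised


def install_postmortem_hook():
    """Route every uncaught exception through postmortem_excepthook."""
    sys.excepthook = postmortem_excepthook


install_postmortem_hook()
```

The terminal check keeps the hook from blocking a non-interactive run (a cron job or CI step) on a debugger prompt nobody can answer.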

Part 5 - Print Debugging: When It Is Valid

Print debugging has a bad reputation, but it is sometimes the right tool.

When Print Debugging Is Valid

In environments without a debugger (remote server, CI container)
When the bug is in code you cannot easily stop (threads, async, callbacks)
When the bug only happens with large data sets that take minutes to reproduce
When you need a log of ALL values over many iterations, not just one pause

How to Do Print Debugging Well

# BAD print debugging - no context, no types, hard to scan
print(x)
print("here")
print(result)

# GOOD print debugging - use repr() and label everything
print(f"[DEBUG] x={x!r} type={type(x).__name__}")
print(f"[DEBUG] After parse: result={result!r}")

# Use pprint for nested structures
from pprint import pprint
print("[DEBUG] config:")
pprint(config, indent=2)

# Mark entry/exit of functions
def process(data):
    print(f"[ENTER process] data={data!r}")
    result = _process_impl(data)
    print(f"[EXIT process] result={result!r}")
    return result

The !r conversion calls repr() on the value, which shows the exact value and usually makes its type visible:

"hello" prints as 'hello' (with quotes - you know it is a string)
None prints as None (you can distinguish it from the string "None", which prints as 'None')
[1, 2, 3] prints as [1, 2, 3], while the string "[1, 2, 3]" prints as '[1, 2, 3]' (the quotes tell a real list from its text form)

:::warning Remove Print Debugging Before Committing
Print debugging is a temporary tool. Before committing code, replace temporary print statements with logger.debug() calls for information worth keeping, or delete them entirely. Print statements in version control are a code smell.
:::
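A sketch of what that conversion can look like, using the standard logging module. The _process_impl body is a hypothetical stand-in for real work, and the log format is one choice among many.

```python
import logging

logger = logging.getLogger(__name__)


def _process_impl(data):
    # Hypothetical stand-in for the real processing step
    return [item.strip().lower() for item in data]


def process(data):
    # %r applies repr(), just like !r in an f-string, but the message is
    # only formatted if DEBUG logging is actually enabled
    logger.debug("ENTER process data=%r", data)
    result = _process_impl(data)
    logger.debug("EXIT process result=%r", result)
    return result


logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(name)s: %(message)s")
process(["  Alice ", "BOB"])
```

Unlike print(), these calls can stay in the code: raising the log level to INFO silences them without touching the source.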

Part 6 - The `traceback` Module

The traceback module gives you programmatic access to exception information - useful when you want to capture, format, or store traceback information without using a debugger.

Formatting Tracebacks as Strings

import traceback

def risky():
    x = {}
    return x["missing_key"]

try:
    risky()
except KeyError:
    # Get the full traceback as a string
    tb_string = traceback.format_exc()
    print("Captured traceback:")
    print(tb_string)

    # Or print it directly to stderr (the default)
    traceback.print_exc()

Captured traceback:
Traceback (most recent call last):
  File "demo.py", line 8, in <module>
    risky()
  File "demo.py", line 5, in risky
    return x["missing_key"]
KeyError: 'missing_key'

Storing Tracebacks for Later Inspection

import traceback
import sys

errors = []   # Collect all errors from a batch job

def process_item(item):
    try:
        return parse_and_validate(item)
    except Exception:
        # Capture traceback NOW, while exception is active
        tb = traceback.format_exc()
        errors.append({"item": item, "error": tb})
        return None

results = [process_item(x) for x in large_dataset]

# After the batch, report all failures
for error in errors:
    print(f"Failed item: {error['item']}")
    print(error["error"])

`traceback.extract_tb()` - Structured Traceback Data

import traceback
import sys

try:
    int("not a number")
except ValueError:
    exc_type, exc_value, exc_tb = sys.exc_info()

    # Extract structured frames
    frames = traceback.extract_tb(exc_tb)
    for frame in frames:
        print(f"File: {frame.filename}")
        print(f"Line: {frame.lineno}")
        print(f"Function: {frame.name}")
        print(f"Code: {frame.line}")
        print("---")

File: demo.py
Line: 5
Function: <module>
Code: int("not a number")
---

This structured data is useful for building custom error reporting systems, sending tracebacks to Sentry, or logging tracebacks as structured JSON.
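As a sketch of the structured-JSON use case, the function below turns an exception into a plain dictionary that json.dumps can serialize. The field names ("type", "message", "frames", and so on) and the parse_age helper are assumptions for illustration, not a format required by any logging service.

```python
import json
import traceback


def exception_to_record(exc):
    """Convert an exception into a JSON-serializable dict (field names are illustrative)."""
    frames = traceback.extract_tb(exc.__traceback__)
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "frames": [
            {"file": f.filename, "line": f.lineno, "function": f.name, "code": f.line}
            for f in frames
        ],
    }


def parse_age(raw):
    return int(raw)


try:
    parse_age("not a number")
except ValueError as exc:
    record = exception_to_record(exc)
    print(json.dumps(record, indent=2))
```

Because the record contains only strings, integers, and lists, it can be stored or shipped without keeping the traceback object alive.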

Part 7 - `sys.exc_info()`: Accessing Exception Data Programmatically

sys.exc_info() returns a tuple of (exc_type, exc_value, exc_traceback) for the currently active exception. It returns (None, None, None) when called outside an except block.

import sys
import traceback

def diagnose_exception():
    """Show what sys.exc_info() contains inside an except block."""
    try:
        result = 10 / 0
    except ZeroDivisionError:
        exc_type, exc_value, exc_tb = sys.exc_info()

        print(f"Exception type:  {exc_type}")          # <class 'ZeroDivisionError'>
        print(f"Exception value: {exc_value}")         # division by zero
        print(f"Exception class: {exc_type.__name__}") # ZeroDivisionError

        # Check if it is a specific type
        if issubclass(exc_type, ArithmeticError):
            print("This is an arithmetic error")

        # Get the traceback as formatted lines
        tb_lines = traceback.format_tb(exc_tb)
        print("Traceback:")
        for line in tb_lines:
            print(line, end="")

diagnose_exception()

Exception type:  <class 'ZeroDivisionError'>
Exception value: division by zero
Exception class: ZeroDivisionError
This is an arithmetic error
Traceback:
  File "demo.py", line 7, in diagnose_exception
    result = 10 / 0

:::warning Do Not Hold References to Traceback Objects
Traceback objects contain references to the frame objects of the call stack. Holding a reference to exc_tb outside an except block can create reference cycles that delay garbage collection of everything those frames reference. Always delete it when done: del exc_tb. Or use traceback.format_tb(exc_tb) to get a list of plain strings and then discard exc_tb.
:::
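In Python 3 the exception object carries its own traceback in the __traceback__ attribute, so the same information is available without sys.exc_info(). The sketch below formats the traceback inside the except block and lets only strings escape; safe_divide and divide are illustrative names.

```python
import traceback


def divide(a, b):
    return a / b


def safe_divide(a, b):
    """Return (result, None) on success or (None, formatted_traceback) on failure."""
    try:
        return divide(a, b), None
    except ZeroDivisionError as exc:
        # Format while the exception is active; only plain strings leave this block
        lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
        return None, "".join(lines)


result, report = safe_divide(10, 0)
print(report)
```

Python unbinds the name exc when the except block ends, which removes one common way of accidentally keeping a traceback alive.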

Part 8 - `warnings.warn()`: Soft Errors and Deprecation

Sometimes a condition is not an error - it is a concern. Something is deprecated, an argument combination is unusual, or a feature will change in a future version. For these cases, use warnings.warn().

import warnings

def compute_mean(values, ignore_empty=None):
    """
    Compute the mean of values.
    ignore_empty is deprecated - use handle_empty parameter instead.
    """
    if ignore_empty is not None:
        warnings.warn(
            "ignore_empty is deprecated and will be removed in v3.0. "
            "Use handle_empty='skip' instead.",
            DeprecationWarning,
            stacklevel=2,   # Point to the caller, not to this function
        )

    if not values:
        warnings.warn(
            "compute_mean received an empty sequence - returning 0.0",
            RuntimeWarning,
            stacklevel=2,
        )
        return 0.0

    return sum(values) / len(values)


# Using deprecated argument - produces a DeprecationWarning
result = compute_mean([1, 2, 3], ignore_empty=True)

demo.py:28: DeprecationWarning: ignore_empty is deprecated and will be removed in v3.0. Use handle_empty='skip' instead.
  result = compute_mean([1, 2, 3], ignore_empty=True)

Warning Categories

Category	Use When
`DeprecationWarning`	API or feature will be removed in future
`PendingDeprecationWarning`	Will be deprecated in the future
`RuntimeWarning`	Suspicious runtime behavior (not an error yet)
`UserWarning`	General-purpose warning for your users
`FutureWarning`	Behavior will change in the future
`ResourceWarning`	Resource not properly closed (Python 3.2+)

Controlling Warnings in Tests

import warnings

# Turn warnings into errors in tests - catch regressions early
warnings.filterwarnings("error", category=DeprecationWarning)

# Or with pytest: add to pyproject.toml
# [tool.pytest.ini_options]
# filterwarnings = ["error::DeprecationWarning"]
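When a test should verify that a warning is emitted, rather than fail because of it, warnings.catch_warnings(record=True) collects the warnings for inspection. The old_api function here is a hypothetical deprecated function used only to have something to test.

```python
import warnings


def old_api():
    """Hypothetical deprecated function used only for this illustration."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def test_old_api_warns():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # Make sure nothing is suppressed inside the block
        value = old_api()
    assert value == 42
    assert len(caught) == 1
    assert issubclass(caught[0].category, DeprecationWarning)
    assert "deprecated" in str(caught[0].message)


test_old_api_warns()
```

With pytest, the pytest.warns context manager serves the same purpose; the version above needs only the standard library.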

Part 9 - Debugging Performance Issues

When the bug is not a crash but "it is too slow" or "it uses too much memory", different tools apply.

`cProfile` - Finding Time Hotspots

import cProfile
import pstats
import io

def slow_function():
    """Simulate a function with nested calls."""
    result = []
    for i in range(1000):
        result.append(process_item(i))
    return result

def process_item(n):
    # Simulate expensive work
    return sum(range(n))


# Method 1: Profile the whole program
# python -m cProfile -s cumulative script.py

# Method 2: Profile a specific function
profiler = cProfile.Profile()
profiler.enable()

slow_function()

profiler.disable()

# Print sorted stats
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative")
stats.print_stats(10)   # Top 10 functions
print(stream.getvalue())

         3003 function calls in 0.012 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.012    0.012 demo.py:5(slow_function)
     1000    0.003    0.000    0.010    0.000 demo.py:12(process_item)
     1000    0.007    0.000    0.007    0.000 {built-in method builtins.sum}
     1000    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}

Read: process_item was called 1000 times, contributing 0.010 seconds of cumulative time. The inner sum(range(n)) is the bottleneck.
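Once the profile points at sum(range(n)), one possible fix for this toy workload is the closed form n(n-1)/2. That shortcut exists only because the example is arithmetic; in real code the profile tells you where to look, not what the fix is. The name process_item_fast is illustrative.

```python
import timeit


def process_item(n):
    return sum(range(n))  # O(n) work on every call


def process_item_fast(n):
    """Closed form for 0 + 1 + ... + (n - 1); valid for n >= 0."""
    return n * (n - 1) // 2  # O(1) work on every call


# Confirm the rewrite is equivalent before trusting any timing numbers
assert all(process_item(n) == process_item_fast(n) for n in range(1000))

slow = timeit.timeit(lambda: [process_item(i) for i in range(1000)], number=20)
fast = timeit.timeit(lambda: [process_item_fast(i) for i in range(1000)], number=20)
print(f"sum(range(n)): {slow:.4f}s   closed form: {fast:.4f}s")
```

Checking equivalence first matters: a faster function that returns different results is a new bug, not an optimization.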

`line_profiler` - Line-by-Line Timing

pip install line_profiler

# Add @profile decorator (provided by line_profiler, not imported)
@profile
def process_batch(records):
    results = []
    for record in records:
        parsed = parse(record)       # How long does this take?
        validated = validate(parsed) # And this?
        results.append(validated)
    return results

kernprof -l -v script.py

Line #    Hits     Time     Per Hit   % Time  Line Contents
=================================================
             1     12.0      12.0      0.2  results = []
          1000    320.0       0.3      5.4  for record in records:
          1000   4800.0       4.8     80.9  parsed = parse(record)
          1000    783.0       0.8     13.2  validated = validate(parsed)
          1000     17.0       0.0      0.3  results.append(validated)

parse() consumes 80.9% of the time. Fix parse().

`memory_profiler` - Finding Memory Leaks

pip install memory_profiler

from memory_profiler import profile

@profile
def load_large_dataset():
    data = []
    with open("large_file.csv") as f:
        for line in f:
            data.append(line.split(","))  # Accumulates in memory
    return data

Line #    Mem usage    Increment   Line Contents
================================================
     4   50.2 MiB    50.2 MiB    data = []
     5   50.2 MiB     0.0 MiB    with open("large_file.csv") as f:
     6   50.2 MiB     0.0 MiB        for line in f:
     7  312.4 MiB   262.2 MiB        data.append(...)  # 262 MB used here!

The memory increase happens at the append line - loading the entire file into memory. The fix: use a generator instead of accumulating a list.
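A sketch of the generator approach, assuming the caller can process rows one at a time (here, summing one column). The helper names iter_rows and total_of_column and the sample data are illustrative; the temporary file only makes the example self-contained.

```python
import os
import tempfile


def iter_rows(path):
    """Yield one parsed row at a time instead of building the whole list."""
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n").split(",")


def total_of_column(path, index):
    # Only one row is held in memory at any moment
    return sum(float(row[index]) for row in iter_rows(path))


with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("alice,10.5\nbob,20.0\ncarol,4.5\n")
    sample_path = tmp.name

print(total_of_column(sample_path, 1))
os.remove(sample_path)
```

If the caller really needs random access to every row, a generator does not help, and the fix has to come from a more compact data structure or from processing the file in chunks.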

Part 10 - IDE Debuggers vs pdb

VS Code Debugger - The Visual Alternative

VS Code with the Python extension provides a graphical debugger. It is built on debugpy rather than pdb, but the concepts (breakpoints, stepping, frames) carry over directly:

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": true   // Skip stepping into library code
        }
    ]
}

VS Code advantages over raw pdb:

Breakpoints set with a click (no code changes required)
Variables panel shows all local and global variables without p commands
Call stack panel shows the full frame tree visually
Watch expressions: monitor an expression across every step
Conditional breakpoints: break only when user_id == 42

pdb advantages over VS Code:

Works over SSH on remote servers with no GUI
Works inside Docker containers
Works anywhere Python runs - no IDE required
Scriptable and automatable
Faster for experienced users who know the keyboard commands

The choice depends on context. Know both.

Part 11 - Real-World Debugging Strategies

Strategy 1: Bisect the Codebase

When you do not know where the bug is, narrow it down by halving:

# You have a pipeline of 10 processing stages and the output is wrong
# Stage 1 → Stage 2 → Stage 3 → ... → Stage 10 → Wrong output

# Bisect: check the output at Stage 5
result_after_5 = run_stages(data, stages=5)
print(f"After stage 5: {result_after_5!r}")
# Is it correct? If yes, bug is in stages 6-10. If no, bug is in stages 1-5.
# Repeat with half the remaining range.

This is O(log n) - 10 stages require at most 4 checks to find the culprit stage.
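The halving procedure can be automated when each stage is a function and you have a check that tells good output from bad. This sketch assumes the input is valid, the final output is invalid, and that once the output goes bad it stays bad through later stages. The names first_bad_stage, add_one, negate, and is_valid are hypothetical.

```python
def first_bad_stage(data, stages, is_valid):
    """Return the 1-based index of the first stage whose output fails is_valid."""

    def output_after(k):
        result = data
        for stage in stages[:k]:
            result = stage(result)
        return result

    # Invariant: output after `low` stages is good, output after `high` stages is bad
    low, high = 0, len(stages)
    while high - low > 1:
        mid = (low + high) // 2
        if is_valid(output_after(mid)):
            low = mid
        else:
            high = mid
    return high


def add_one(values):
    return [v + 1 for v in values]


def negate(values):  # The planted bug: stage 7 flips every sign
    return [-v for v in values]


def is_valid(values):
    return all(v >= 0 for v in values)


stages = [add_one] * 6 + [negate] + [add_one] * 3
print(first_bad_stage([1, 2, 3], stages, is_valid))
```

Re-running the pipeline prefix for every check keeps the sketch short; with expensive stages you would cache intermediate outputs instead.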

Strategy 2: Reproduce with a Minimal Example

A minimal reproduction case:

Removes all unrelated code
Shows the bug in 10 lines or fewer
Uses hard-coded values, not external dependencies

# Original bug report: "The API fails when processing user 4471"
# Your minimal reproduction:
data = {"name": "O'Brien", "age": 42}   # Apostrophe in name!
result = build_sql_insert(data)
print(result)
# INSERT INTO users (name, age) VALUES ('O'Brien', 42)  <- SQL syntax error

Minimal reproduction cases:

Make it obvious exactly what triggers the bug
Are easy to share with teammates
Often reveal the cause just by the act of creating them ("rubber duck debugging")

Strategy 3: Check the Recent Git Diff

Most bugs were introduced recently. The diff tells you what changed:

# What changed in the last commit?
git diff HEAD~1 HEAD

# What changed in the last 5 commits, just in the relevant module?
git log --oneline -5 -- src/payments/

# Which commits added or removed occurrences of "calculate_tax"?
git log -p -S "calculate_tax" -- src/

# Find the commit that introduced a bug (binary search)
git bisect start
git bisect bad              # Current commit is broken
git bisect good v2.1.0      # This release was fine
# Git checks out commits; you test each one and mark good/bad
# Git finds the exact breaking commit in O(log n) steps

Strategy 4: Rubber Duck Debugging

Explain your code out loud, line by line, to an imaginary audience - a rubber duck, a stuffed animal, a plant, or a patient colleague.

The act of explaining forces you to:

Articulate every assumption
Notice where your explanation and the code diverge
Spot implicit "it must be doing X" beliefs that are actually wrong

This is not a joke. Senior engineers do this constantly. It works because explaining each step out loud forces you to state what the code actually does, rather than what you assume it does.

Interview Questions

Q1: How do you read a Python traceback? What is the most important part?

Answer: Read from the bottom up. The very last line is the most important - it names the exception type and its message (ValueError: invalid literal for int() with base 10: 'abc'). This tells you what went wrong.

The second-to-last block shows exactly where in your code the error occurred - the filename, line number, function name, and the actual line of code.

Work upward from there to understand the call chain: who called that function, and who called that. If the innermost frames are library code in /site-packages/, keep climbing until you reach your own code - the bug is usually in your code that passed incorrect data to the library, not in the library itself.

Q2: What is the difference between `pdb.set_trace()` and `breakpoint()`?

Answer: breakpoint() was added in Python 3.7 and is the modern equivalent of import pdb; pdb.set_trace(). The important difference is that breakpoint() calls sys.breakpointhook, which can be overridden:

Setting PYTHONBREAKPOINT=0 makes all breakpoint() calls no-ops (useful to disable them in CI without deleting them from code)
Setting PYTHONBREAKPOINT=pudb.set_trace switches to a different debugger (pudb) without changing code
pdb.set_trace() always invokes pdb specifically - you cannot redirect it with an environment variable

Use breakpoint() in all new code.

Q3: What is post-mortem debugging and when is it useful?

Answer: Post-mortem debugging means opening the debugger at the crash site after an unhandled exception, without needing to set breakpoints in advance.

It is useful when:

You do not know where the crash will happen (cannot set a breakpoint in advance)
The crash only happens with specific data that is hard to reproduce in the debugger
You want to inspect the full program state at the moment of failure without re-running

Usage:

# In interactive mode (python -i script.py):
import pdb; pdb.pm()   # Opens debugger at the crash site

# Programmatically:
import sys, pdb
try:
    main()
except Exception:
    pdb.post_mortem(sys.exc_info()[2])

Q4: What does `traceback.format_exc()` return, and how is it different from `str(exception)`?

Answer: traceback.format_exc() returns the full formatted traceback string - including the "Traceback (most recent call last):" header, all the frame entries with file/line/function info, and the final exception line.

str(exception) returns only the exception message - just the string argument passed to the exception constructor.

Example:

import traceback

try:
    int("abc")
except ValueError as e:
    print(str(e))                    # invalid literal for int() with base 10: 'abc'
    print(traceback.format_exc())    # Full multi-line traceback

Use traceback.format_exc() when you need to log or store the full traceback. Use str(e) when you only need the human-readable error message.

Q5: What is `cProfile` and how do you use it to find a performance bottleneck?

Answer: cProfile is Python's built-in deterministic profiler. Rather than sampling, it records every function call, counting how many times each function was called and measuring the time spent in each.

Usage:

# Profile a script from the command line
python -m cProfile -s cumulative my_script.py | head -20

Or in code:

import cProfile
cProfile.run("my_function(args)", sort="cumulative")

The output shows ncalls (call count), tottime (time in function, excluding callees), and cumtime (time including all nested calls). Sort by cumtime to find the highest-impact call paths. The top entries are usually entry points such as <module> or main(), so read down the list until you reach a function whose own tottime is large - that is your first optimization target.

Q6: When is print debugging appropriate, and how should it be done correctly?

Answer: Print debugging is appropriate when:

The environment does not support a debugger (remote server, CI, Docker container with no terminal)
The bug only manifests after many iterations (needs a log of all values, not a single pause)
The code uses threads, async, or callbacks that make pdb difficult to use interactively

How to do it correctly:

Label every print with a prefix like [DEBUG] so you can find and remove them later
Use the !r conversion (calls repr()) to see the exact value and usually its type: f"x={x!r}"
Use pprint.pprint() for complex nested structures
Remove or convert to logger.debug() before committing - print statements in production code are a code smell

Practice Challenges

Beginner - Read and Diagnose a Traceback

Given this traceback, identify: (1) what went wrong, (2) in which function, (3) what the probable root cause is, and (4) how to fix it.

Traceback (most recent call last):
  File "app.py", line 24, in <module>
    send_report(users)
  File "app.py", line 18, in send_report
    for user in users:
  File "app.py", line 12, in process_users
    email = user["email"].lower()
AttributeError: 'NoneType' object has no attribute 'lower'

Solution

1. What went wrong: AttributeError: 'NoneType' object has no attribute 'lower' - .lower() was called on None instead of a string.

2. In which function: process_users at line 12 in app.py, on the expression user["email"].lower().

3. Probable root cause: user["email"] returned None. This means one of the user dictionaries in the dataset has "email": None - the key exists, but its value is None (a missing key would have raised KeyError instead). The user was likely created without an email address, or the email field was explicitly set to None.

4. How to fix it:

# Option A: Guard with a check before calling .lower()
email = user.get("email")
if email is not None:
    email = email.lower()
else:
    # Handle missing email: skip user, use default, log warning
    logger.warning("User %s has no email address", user.get("id"))
    continue

# Option B: Use a default
email = (user.get("email") or "").lower()

# Option C: Validate user data at the boundary (best practice)
if not isinstance(user.get("email"), str):
    raise ValueError(f"User {user['id']} missing required email field")

The root fix is to ensure user records always have a valid email string, validated at the point of data entry or import.

Intermediate - Use pdb to Find a Hidden Bug

The following function has a bug that only appears with specific input. Use breakpoint() to investigate. Identify the exact line and condition that triggers the bug.

def merge_configs(base: dict, override: dict) -> dict:
    """
    Merge override into base (top level only). Lists in override EXTEND base lists.
    Scalars in override REPLACE base scalars.
    """
    result = dict(base)
    for key, value in override.items():
        if key in result and isinstance(result[key], list):
            result[key] = result[key] + value   # Extend list
        else:
            result[key] = value                 # Scalar override
    return result

# Test cases
base = {"host": "localhost", "port": 5432, "tags": ["prod"]}
override_a = {"port": 6000}
override_b = {"tags": ["v2", "blue"]}
override_c = {"tags": "not-a-list"}   # Tricky case

print(merge_configs(base, override_a))
print(merge_configs(base, override_b))
print(merge_configs(base, override_c))  # This one has a bug

Solution

The Bug: When override_c = {"tags": "not-a-list"} is merged:

result["tags"] is ["prod"] (a list)
value is "not-a-list" (a string)
The condition isinstance(result[key], list) is True
So the code runs result["tags"] = ["prod"] + "not-a-list" which raises TypeError: can only concatenate list (not "str") to list

Debugging with pdb:

def merge_configs(base: dict, override: dict) -> dict:
    result = dict(base)
    for key, value in override.items():
        breakpoint()   # Pause here on each iteration
        if key in result and isinstance(result[key], list):
            result[key] = result[key] + value
        else:
            result[key] = value
    return result

At the pdb prompt:

(Pdb) p key
'tags'
(Pdb) p value
'not-a-list'
(Pdb) p type(value)
<class 'str'>
(Pdb) p result[key]
['prod']
# AHA: result[key] is a list but value is a string, so the + on the next line will fail
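Stopping on every iteration becomes tedious once you have a hypothesis. One option is a conditional breakpoint. The sketch below assumes you already suspect a list/non-list mismatch; merge_configs_debug is a hypothetical name used only for this illustration:

```python
def merge_configs_debug(base: dict, override: dict) -> dict:
    result = dict(base)
    for key, value in override.items():
        # Pause only on the suspicious combination:
        # the existing value is a list but the override is not.
        if isinstance(result.get(key), list) and not isinstance(value, list):
            breakpoint()
        if key in result and isinstance(result[key], list):
            result[key] = result[key] + value
        else:
            result[key] = value
    return result
```

With this guard, the calls for override_a and override_b run without pausing, and pdb opens only for override_c.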

The Fix:

def merge_configs(base: dict, override: dict) -> dict:
    result = dict(base)
    for key, value in override.items():
        if (key in result
                and isinstance(result[key], list)
                and isinstance(value, list)):   # Both must be lists
            result[key] = result[key] + value
        else:
            result[key] = value   # Scalar override (including str replacing list)
    return result

# Now override_c replaces the list with the string
print(merge_configs(base, override_c))
# {'host': 'localhost', 'port': 5432, 'tags': 'not-a-list'}

Whether this is correct behavior depends on requirements, but at least it does not crash. The point is that the bug was only visible when result[key] was a list but value was not - a condition breakpoint() revealed immediately.
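If the requirements say a list-valued key must never be replaced by a scalar, a stricter variant could fail fast with a descriptive message instead. This is a sketch under that assumption; merge_configs_strict is a hypothetical name, not part of the original exercise:

```python
def merge_configs_strict(base: dict, override: dict) -> dict:
    result = dict(base)
    for key, value in override.items():
        if key in result and isinstance(result[key], list):
            if not isinstance(value, list):
                # Name the offending key and type so the caller can fix the config
                raise TypeError(
                    f"Config key {key!r} expects a list, got {type(value).__name__}"
                )
            result[key] = result[key] + value
        else:
            result[key] = value
    return result


base = {"host": "localhost", "port": 5432, "tags": ["prod"]}
print(merge_configs_strict(base, {"tags": ["v2"]}))
# {'host': 'localhost', 'port': 5432, 'tags': ['prod', 'v2']}

try:
    merge_configs_strict(base, {"tags": "not-a-list"})
except TypeError as exc:
    print(exc)
# Config key 'tags' expects a list, got str
```

Both variants remove the confusing concatenation error. They differ in whether a mismatched override is accepted silently or rejected loudly.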

Advanced - Profile and Optimize a Slow Function

The function below is correct but too slow for production use. Profile it with cProfile, identify the bottleneck, and rewrite it to be at least 10x faster.

import time
import random

def find_duplicates(numbers: list) -> list:
    """Return a list of numbers that appear more than once."""
    duplicates = []
    for i, num in enumerate(numbers):
        for j, other in enumerate(numbers):
            if i != j and num == other and num not in duplicates:
                duplicates.append(num)
    return duplicates

# Test with large data
data = [random.randint(1, 500) for _ in range(5000)]

start = time.perf_counter()
result = find_duplicates(data)
elapsed = time.perf_counter() - start
print(f"Found {len(result)} duplicates in {elapsed:.3f}s")

Solution

Step 1: Profile to confirm the bottleneck

import cProfile

data = [random.randint(1, 500) for _ in range(5000)]
cProfile.run("find_duplicates(data)", sort="cumulative")

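         Illustrative, abridged output (timings vary by machine): 14.2 seconds total

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   14.1    14.1     14.2     14.2    solution.py:5(find_duplicates)

The entire time is in find_duplicates itself: tottime (time spent in the function's own body, excluding the calls it makes) nearly equals cumtime. cProfile counts function calls, not loop iterations, so the loop's cost appears inside this single row rather than as millions of separate calls. The double loop is O(n squared): 5000 x 5000 = 25,000,000 iterations.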
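The same kind of profile can also be collected programmatically, which helps when only one call inside a larger program is of interest. The sketch below uses cProfile.Profile with pstats on a deliberately smaller input so it finishes quickly; the input sizes are arbitrary choices for the example:

```python
import cProfile
import io
import pstats
import random


def find_duplicates(numbers: list) -> list:
    """Same quadratic function as in the challenge."""
    duplicates = []
    for i, num in enumerate(numbers):
        for j, other in enumerate(numbers):
            if i != j and num == other and num not in duplicates:
                duplicates.append(num)
    return duplicates


data = [random.randint(1, 50) for _ in range(500)]  # small input, for speed

profiler = cProfile.Profile()
profiler.enable()
find_duplicates(data)
profiler.disable()

# Send the report to a string buffer so it can be logged or searched
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # show at most 5 rows
report = buffer.getvalue()
print(report)
```

Because the hot loop makes almost no function calls, the report stays short and find_duplicates dominates the cumulative column.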

Step 2: Identify the algorithm problem

The O(n squared) double loop compares every element to every other element. Whenever a pair matches, we also evaluate num not in duplicates, which is a linear scan of a list, so the worst case approaches O(n cubed).
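To make the quadratic growth concrete, the sketch below counts inner-loop iterations instead of measuring time, so the result is deterministic. count_inner_iterations is a hypothetical helper written for this illustration:

```python
def count_inner_iterations(n: int) -> int:
    """Count how often the inner loop body runs for an input of length n."""
    numbers = list(range(n))
    iterations = 0
    for _ in numbers:
        for _ in numbers:
            iterations += 1
    return iterations


for n in (100, 200, 400):
    print(n, count_inner_iterations(n))
# 100 10000
# 200 40000
# 400 160000
```

Doubling the input quadruples the work, which is the signature of an O(n squared) algorithm.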

Step 3: Rewrite with O(n) algorithm

from collections import Counter
import time
import random

def find_duplicates_fast(numbers: list) -> list:
    """Return a list of numbers that appear more than once. O(n) time."""
    counts = Counter(numbers)
    return [num for num, count in counts.items() if count > 1]

# Compare performance
data = [random.randint(1, 500) for _ in range(5000)]

start = time.perf_counter()
result_slow = find_duplicates(data)
slow_time = time.perf_counter() - start

start = time.perf_counter()
result_fast = find_duplicates_fast(data)
fast_time = time.perf_counter() - start

print(f"Slow: {slow_time:.3f}s   Fast: {fast_time:.6f}s")
print(f"Speedup: {slow_time / fast_time:.0f}x")
print(f"Results match: {set(result_slow) == set(result_fast)}")

# Output:
# Slow: 14.234s   Fast: 0.000892s
# Speedup: 15952x
# Results match: True

What cProfile revealed and why: The profiler showed almost all time was in find_duplicates itself. The double loop was O(n squared). Counter uses a hash map - building it is O(n), reading it is O(n). The fix is about choosing the right data structure, which cProfile makes obvious by showing you the hotspot.

Quick Reference

Technique	Command / Code	Use Case
Set breakpoint	`breakpoint()`	Pause execution for interactive inspection
Disable all breakpoints	`PYTHONBREAKPOINT=0 python script.py`	CI environments
Post-mortem debugging	`python -i script.py` then `import pdb; pdb.pm()`	Inspect state after crash
pdb: next line	`n`	Step over a line
pdb: step into	`s`	Enter a function call
pdb: continue	`c`	Run to next breakpoint
pdb: print value	`p expr`	Inspect a variable
pdb: call stack	`w`	See where you are
pdb: up/down frame	`u` / `d`	Navigate the call stack
Format traceback	`traceback.format_exc()`	Capture traceback as string
Print traceback	`traceback.print_exc()`	Print current exception traceback
Exception info	`sys.exc_info()`	Get (type, value, tb) tuple
Soft warning	`warnings.warn("msg", DeprecationWarning)`	Non-fatal advisory
Profile script	`python -m cProfile -s cumulative script.py`	Find time bottlenecks
Line profiler	`kernprof -l -v script.py`	Line-by-line timing
Pretty print	`from pprint import pprint; pprint(obj)`	Readable complex structures
repr of value	`f"x={x!r}"`	Expose quotes, whitespace, and escapes (tells '5' from 5)

Key Takeaways

Debugging is the scientific method: observe the error, form a specific hypothesis, test it with a minimal experiment, conclude - never make multiple simultaneous guesses
Read tracebacks from the bottom up - the last line names what failed, the second-to-last shows where, and you work upward to find where in your code the bad data originated
breakpoint() (Python 3.7+) is the modern way to open the interactive pdb debugger; set PYTHONBREAKPOINT=0 to disable all breakpoints without touching code
Post-mortem debugging with pdb.pm() is the most powerful technique for understanding crashes - it lets you inspect all state at the moment of failure after the fact
traceback.format_exc() captures the full traceback as a string for logging or storage; sys.exc_info() provides structured (type, value, tb) access inside except blocks
cProfile finds time bottlenecks in minutes: run it, sort by cumtime, and fix the top function - most performance bugs are algorithm or data structure choices, not micro-optimizations
The fastest path to a bug is usually: reproduce minimally, check the recent git diff, and use git bisect to narrow down which change introduced it
