Indexing and Query Optimization

Reading time: ~40 minutes | Level: Intermediate → Engineering

Before reading further, predict what is happening here:

import psycopg2
import time

conn = psycopg2.connect(DSN)
cur = conn.cursor()

# Test 1: table with 1,000 rows
cur.execute("SELECT * FROM orders WHERE customer_email = '[email protected]'")
# Elapsed: 0.050 seconds  ✓ fast enough

# Test 2: same table, same query, same hardware - now 1,000,000 rows
cur.execute("SELECT * FROM orders WHERE customer_email = '[email protected]'")
# Elapsed: 45.3 seconds  ✗ production is on fire

Identical query. Same Python code. Same server. The only thing that changed was the row count - and performance collapsed by a factor of 900.

This is not a bug in psycopg2. It is not a network problem. It is a missing index, and it is one of the most common causes of production database emergencies. Understanding exactly why this happens - and how an index fixes it - is what this lesson is about.

What You Will Learn

What a database index physically is - the B-tree data structure and how the engine traverses it
The five main index types in PostgreSQL and when to choose each one
How to create indexes: standard, unique, partial, and expression indexes
How to read EXPLAIN ANALYZE output and distinguish Seq Scan, Index Scan, and Index Only Scan
The leftmost prefix rule for composite indexes - and exactly which queries will or will not use your index
When indexes actively hurt performance - INSERT/UPDATE/DELETE overhead and index bloat
How to find slow queries in production using pg_stat_statements and auto_explain
Index maintenance: VACUUM, ANALYZE, REINDEX, and monitoring index usage

Prerequisites

Python Intermediate: Module 07, Lessons 01–04 (SQL Fundamentals through Transactions)
Basic familiarity with psycopg2 and PostgreSQL
Understanding of SQL SELECT, WHERE, and JOIN clauses

Part 1 - What an Index Actually Is

The Problem: Sequential Scans Scale Poorly

When you run SELECT * FROM orders WHERE customer_email = '[email protected]' against a table with no index on customer_email, the database has only one option: read every single row in the table, check each one against the condition, and return the matching rows.

This is called a sequential scan (or full table scan). It is O(n) - as the number of rows doubles, the time to answer the query roughly doubles. At 1,000 rows it is fast. At 1,000,000 rows it is catastrophic.

Sequential scan with 1,000,000 rows:

[row 1] [row 2] [row 3] ... [row 999,999] [row 1,000,000]
  ↑ check  ↑ check  ↑ check ...    ↑ check        ↑ check

Every row must be visited. Alice's row could be anywhere.

The Solution: A Sorted Auxiliary Structure

An index is a separate data structure maintained by the database alongside your table. It stores a subset of the data - typically just the indexed column(s) and a pointer (called a "heap tuple identifier" or ctid) back to the full row - arranged in a way that makes lookups dramatically faster.

The default and most common index type in PostgreSQL is the B-tree (Balanced Tree).

B-tree Internals

A B-tree is a self-balancing tree data structure. Every node can hold multiple keys, and the tree always remains balanced - meaning all leaf nodes are at the same depth. This guarantees that any lookup takes O(log n) time regardless of the data distribution.

The tree has three levels of nodes:

Root node - the single entry point at the top; holds key ranges that point to branch nodes
Branch nodes (also called internal nodes) - intermediate layers; hold key ranges pointing to child nodes or leaf pages
Leaf pages - the bottom layer; contain the actual indexed values paired with heap tuple identifiers (pointers to the real row location on disk)

How a Lookup Works

To find the row where id = 450:

Start at the root: 450 is between 300 and 600, so follow the pointer to the middle branch node (keys 400, 500)
At the branch node: 450 is between 400 and 500, follow the left pointer to Leaf Page 5
At the leaf page: find 450 → ctid(9,4) - this is the physical location (page 9, tuple 4) of the actual row in the heap
Fetch that single page from disk and return the row

With 1,000,000 rows, a balanced B-tree is at most ~20 levels deep (log₂(1,000,000) ≈ 20). The database reads at most 20 pages instead of up to 1,000,000. That is the source of the 900x speedup.

:::note Heap vs Index The index and the table data are stored separately. The table data lives in the "heap" - an unordered collection of pages. The index is an auxiliary structure that points into the heap. An "Index Scan" means: traverse the B-tree to find the ctid, then fetch the row from the heap. An "Index Only Scan" means the index already contains all the columns the query needs - no heap fetch is required, making it the fastest possible scan type. :::

Part 2 - Index Types: Choosing the Right Tool

PostgreSQL supports several index types. Choosing the wrong one means the index may never be used.

B-tree (Default)

Works for: =, <, >, <=, >=, BETWEEN, LIKE 'prefix%', ORDER BY, GROUP BY

B-tree is the default and correct choice for the vast majority of queries. Always start here.

CREATE INDEX idx_orders_created_at ON orders (created_at);
-- Used for: WHERE created_at > '2024-01-01'
-- Used for: ORDER BY created_at DESC
-- Used for: BETWEEN '2024-01-01' AND '2024-12-31'

Hash Index

Works for: equality (=) only - nothing else

Hash indexes are smaller and faster than B-tree for pure equality lookups, but they cannot support range queries or ordering. In PostgreSQL 10+, hash indexes are crash-safe. Use them only when you are certain you only ever need equality and the column has high cardinality.

CREATE INDEX idx_sessions_token ON sessions USING hash (session_token);
-- Used for: WHERE session_token = 'abc123'
-- NOT used for: WHERE session_token > 'abc123'  (range query)

GIN - Generalized Inverted Index

Works for: JSONB columns, arrays, full-text search (tsvector), multi-value data

GIN builds an inverted index: for each element in a document or array, it stores a list of rows that contain that element. This makes containment queries (@>, ?, @@) very fast.

-- For a JSONB column storing product tags
CREATE INDEX idx_products_tags ON products USING gin (tags);

-- Now this query is fast:
SELECT * FROM products WHERE tags @> '["python", "tutorial"]';

-- Full-text search
CREATE INDEX idx_articles_search ON articles USING gin (to_tsvector('english', body));
SELECT * FROM articles WHERE to_tsvector('english', body) @@ plainto_tsquery('database indexing');

GiST - Generalized Search Tree

Works for: geometric types (points, polygons, circles), ranges, nearest-neighbor searches

GiST is an extensible index framework used primarily for spatial data and range types.

-- For PostGIS geometry column
CREATE INDEX idx_locations_geom ON locations USING gist (geom);

-- For date ranges
CREATE INDEX idx_bookings_period ON bookings USING gist (reservation_period);
SELECT * FROM bookings WHERE reservation_period && '[2024-06-01, 2024-06-30]'::daterange;

BRIN - Block Range Index

Works for: very large tables where column values are naturally correlated with physical row order

BRIN (Block Range INdex) stores only the minimum and maximum values for each block range rather than indexing every row. The index is tiny but only useful when the data is physically sorted by the indexed column - for example, a created_at timestamp column on an append-only events table where newer rows are always added at the end.

-- Ideal for append-only time-series tables
CREATE INDEX idx_events_created_at ON events USING brin (created_at);
-- This index is ~1000x smaller than a B-tree for the same column
-- Only useful when physical order matches logical order

:::tip Choosing an Index Type When in doubt, use B-tree. It works for almost everything. Use GIN when you are querying inside JSONB documents or arrays. Use BRIN only for massive append-only tables where the column is naturally ordered. Use GiST for geometric and range data. :::

Part 3 - Creating Indexes

Standard Index

-- Basic single-column index
CREATE INDEX idx_orders_customer_email ON orders (customer_email);

-- With explicit B-tree (same as default, but explicit)
CREATE INDEX idx_orders_status ON orders USING btree (status);

-- Index for descending order (useful when you ORDER BY col DESC)
CREATE INDEX idx_orders_created_desc ON orders (created_at DESC);

Unique Index

A unique index enforces uniqueness AND speeds up lookups. PRIMARY KEY and UNIQUE constraints automatically create unique indexes.

-- Explicitly create a unique index
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- This is equivalent to:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
-- (PostgreSQL implements UNIQUE constraints using a unique index internally)

Partial Index

A partial index is built only on the rows that satisfy a WHERE clause. It is smaller, faster to maintain, and - critically - it can be used for queries that match that same filter.

This is one of the most powerful and underused features in PostgreSQL.

-- Only index orders that are not yet fulfilled.
-- If 95% of orders are fulfilled, this index covers 5% of rows
-- but answers 100% of "pending order" queries efficiently.
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';

-- Now this query uses the partial index:
SELECT * FROM orders WHERE status = 'pending' AND created_at < NOW() - INTERVAL '7 days';

-- This query does NOT use it (condition does not match the partial filter):
SELECT * FROM orders WHERE status = 'fulfilled' AND created_at < NOW() - INTERVAL '7 days';

:::tip Partial Indexes for Status Columns If you have a status column like is_active, is_deleted, or status = 'pending' and most application queries filter on a specific value, a partial index can be 10–100x smaller than a full index and is cheaper to maintain on every write. :::

Expression Index

An expression index is computed on the result of a function or expression rather than a raw column value.

-- Case-insensitive email lookups
CREATE INDEX idx_users_email_lower ON users (lower(email));

-- Now this query uses the index:
SELECT * FROM users WHERE lower(email) = lower('[email protected]');

-- Without this index, the lower() call forces a sequential scan
-- because an index on (email) does not know about lower().

-- Index on an extracted JSON field
CREATE INDEX idx_orders_metadata_region ON orders ((metadata->>'region'));
SELECT * FROM orders WHERE metadata->>'region' = 'us-west-2';

Concurrent Index Creation

CREATE INDEX by default takes a lock that blocks all writes. On a production table, always use CONCURRENTLY.

-- Safe for production: does not block writes
-- Takes longer to build but allows reads and writes throughout
CREATE INDEX CONCURRENTLY idx_orders_customer_email ON orders (customer_email);

-- CONCURRENTLY cannot run inside a transaction block.
-- Run it as a standalone statement, not inside BEGIN/COMMIT.

:::warning CREATE INDEX Locks the Table CREATE INDEX without CONCURRENTLY takes a lock that blocks all INSERT, UPDATE, and DELETE operations until the index is built. On a large production table this can lock your application for minutes. Always use CREATE INDEX CONCURRENTLY in production. :::

Part 4 - EXPLAIN ANALYZE: Reading Query Plans

EXPLAIN ANALYZE is the primary tool for understanding how PostgreSQL executes a query. It shows the execution plan the query planner chose, and after executing the query, it shows actual timings and row counts.

Basic Syntax

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = '[email protected]';

Understanding the Output

Here is typical output before adding an index:

Seq Scan on orders  (cost=0.00..28450.00 rows=5 width=312) (actual time=0.021..45312.442 rows=5 loops=1)
  Filter: ((customer_email)::text = '[email protected]'::text)
  Rows Removed by Filter: 999995
Planning Time: 0.123 ms
Execution Time: 45312.601 ms

Breaking this down:

Field	Meaning
`Seq Scan`	Node type - sequential (full table) scan
`cost=0.00..28450.00`	Estimated startup cost .. total cost (in abstract units)
`rows=5`	Planner's estimated row count
`width=312`	Estimated average row size in bytes
`actual time=0.021..45312.442`	Actual time in ms: first row .. all rows
`rows=5`	Actual rows returned
`loops=1`	How many times this node ran (relevant for joins)
`Rows Removed by Filter: 999995`	How many rows were read and discarded

The key numbers here are 45 seconds and 999,995 rows discarded - the database read every row in the table and threw away all but 5 of them.

After adding an index:

Index Scan using idx_orders_customer_email on orders
  (cost=0.42..12.48 rows=5 width=312) (actual time=0.031..0.047 rows=5 loops=1)
  Index Cond: ((customer_email)::text = '[email protected]'::text)
Planning Time: 0.156 ms
Execution Time: 0.063 ms

Now the query takes 0.06ms - the database traversed the B-tree, found 5 matching ctids, and fetched only those 5 rows from the heap.

Scan Types

Scan Type	When It Appears	What It Means
`Seq Scan`	No usable index, or optimizer judged a full scan cheaper	Reads every row in the table
`Index Scan`	Index found, but heap fetch needed for full row	Traverses index, then fetches rows from heap
`Index Only Scan`	Index contains all needed columns	No heap fetch - fastest possible
`Bitmap Heap Scan`	Multiple indexes combined, or many matching rows	Builds a bitmap of matching pages first, then fetches

Reading EXPLAIN ANALYZE in Python

import psycopg2
import json

def explain_query(conn, query, params=None):
    """
    Run EXPLAIN ANALYZE on a query and return the plan as structured data.
    Uses JSON format for easier programmatic analysis.
    """
    explain_sql = f"EXPLAIN (ANALYZE, FORMAT JSON) {query}"
    with conn.cursor() as cur:
        cur.execute(explain_sql, params)
        plan = cur.fetchone()[0]  # Returns a list with one element: the JSON plan
    return plan[0]  # The plan dict

def print_plan_summary(plan):
    """Print a human-readable summary of the top-level plan node."""
    node = plan["Plan"]
    print(f"Node Type:       {node['Node Type']}")
    print(f"Relation:        {node.get('Relation Name', 'N/A')}")
    print(f"Actual Time:     {node['Actual Total Time']:.3f} ms")
    print(f"Actual Rows:     {node['Actual Rows']}")
    print(f"Planning Time:   {plan['Planning Time']:.3f} ms")
    print(f"Execution Time:  {plan['Execution Time']:.3f} ms")

# Example usage
conn = psycopg2.connect(DSN)

# Before index
plan = explain_query(conn, "SELECT * FROM orders WHERE customer_email = %s", ('[email protected]',))
print("--- Before Index ---")
print_plan_summary(plan)

# Create the index
with conn.cursor() as cur:
    cur.execute("CREATE INDEX CONCURRENTLY idx_orders_email ON orders (customer_email)")
conn.commit()

# After index
plan = explain_query(conn, "SELECT * FROM orders WHERE customer_email = %s", ('[email protected]',))
print("\n--- After Index ---")
print_plan_summary(plan)

Sample output:

--- Before Index ---
Node Type:       Seq Scan
Relation:        orders
Actual Time:     45312.442 ms
Actual Rows:     5
Planning Time:   0.123 ms
Execution Time:  45312.601 ms

--- After Index ---
Node Type:       Index Scan
Relation:        orders
Actual Time:     0.047 ms
Actual Rows:     5
Planning Time:   0.156 ms
Execution Time:  0.063 ms

:::note When the Optimizer Ignores Your Index PostgreSQL's query planner may choose a sequential scan even when an index exists. This happens when: (1) the table is small enough that a seq scan is faster, (2) the query matches a large fraction of the table (roughly more than 10–20% of rows), or (3) statistics are stale and the planner underestimates selectivity. Run ANALYZE table_name to refresh statistics, then re-run EXPLAIN ANALYZE. :::

Part 5 - Composite Indexes and the Leftmost Prefix Rule

A composite index (also called a multi-column index) is an index on two or more columns. Understanding how composite indexes work - and when they are not used - is one of the most practically important concepts in database performance.

Creating a Composite Index

-- Index on (last_name, first_name, date_of_birth)
CREATE INDEX idx_customers_name ON customers (last_name, first_name, date_of_birth);

This creates a single B-tree where rows are sorted first by last_name, then by first_name within the same last_name, then by date_of_birth within the same last_name and first_name.

The Leftmost Prefix Rule

A composite index (a, b, c) can be used by a query only if the query filters on a prefix of the column list starting from the leftmost column. The database can use the index for a, for (a, b), or for (a, b, c) - but NOT for b alone, c alone, or (b, c) without a.

Column Order Matters Enormously

The order of columns in a composite index determines which queries can use it. The general rule: put the most selective column first (the column that eliminates the most rows), unless your query patterns require a different order.

-- Scenario: query for orders by status and created_at
-- status has ~5 distinct values (low cardinality)
-- created_at has millions of distinct values (high cardinality)

-- Option A: (status, created_at)
CREATE INDEX idx_a ON orders (status, created_at);
-- Good for: WHERE status = 'pending' AND created_at > '2024-01-01'
-- Good for: WHERE status = 'pending'  (uses leftmost prefix)
-- Bad for:  WHERE created_at > '2024-01-01'  (skips leftmost column)

-- Option B: (created_at, status)
CREATE INDEX idx_b ON orders (created_at, status);
-- Good for: WHERE created_at > '2024-01-01' AND status = 'pending'
-- Good for: WHERE created_at > '2024-01-01'  (uses leftmost prefix)
-- Bad for:  WHERE status = 'pending'  (skips leftmost column)

Which is better? It depends entirely on your query patterns. If you almost always filter by status first, Option A is better. If you almost always filter by date ranges, Option B is better.

Range Conditions Break the Chain

A range condition on a column in the middle of a composite index stops the index from being used for subsequent columns:

CREATE INDEX idx_orders ON orders (customer_id, created_at, status);

-- customer_id uses equality → index used fully for this column
-- created_at uses range → index still used, but breaks the chain
-- status filter is applied AFTER the index scan, not during traversal
SELECT * FROM orders
WHERE customer_id = 42            -- uses index (equality)
  AND created_at > '2024-01-01'   -- uses index (range - breaks chain)
  AND status = 'pending';         -- cannot use index (after range column)

:::tip Equality Columns Before Range Columns In a composite index, put columns used in equality conditions (=) before columns used in range conditions (>, <, BETWEEN). This maximizes the number of columns the index traversal can use before the chain breaks. :::

Part 6 - When Indexes Hurt Performance

Indexes are not free. Every index you create is paid for on every write. Understanding this cost is essential for avoiding over-indexing, one of the most common database anti-patterns.

Write Overhead

When you INSERT a new row, every index on that table must be updated. The database must:

Find the correct position in each B-tree for the new value
Insert the new key-pointer pair into the leaf page
Potentially split a page if it is full (B-tree page splits are expensive)

The same applies to UPDATE (which is internally a delete + insert in PostgreSQL due to MVCC) and DELETE.

-- A table with 12 indexes
-- Every INSERT must update 12 separate B-trees.
-- A high-throughput INSERT workload can spend 60–80% of its time
-- on index maintenance rather than actual data writes.

-- Rough rule of thumb:
-- 0–3 indexes: minimal write overhead
-- 4–8 indexes: noticeable but acceptable for read-heavy tables
-- 9+ indexes: significant write overhead; audit carefully

Index Bloat

PostgreSQL uses MVCC (Multi-Version Concurrency Control). When a row is updated or deleted, the old version is not immediately removed - it is marked as dead. Dead index entries accumulate over time, consuming space and slowing down index scans. This is called index bloat.

VACUUM removes dead tuples from the heap and dead entries from indexes. Without regular vacuuming, indexes grow continuously even if the table size is stable.

-- Check for index bloat (requires pgstattuple extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
    indexrelname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    (pgstattuple(indexrelid)).dead_tuple_percent AS dead_tuple_pct
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;

Over-Indexing Anti-Patterns

-- Anti-pattern 1: Indexing every column
-- Adding an index "just in case" on every column makes writes very slow.
-- Most of those indexes will never be used by any query.

-- Anti-pattern 2: Redundant indexes
-- If you have a composite index (a, b), a separate index on (a) is redundant.
-- The composite index already handles queries on (a) alone via the leftmost prefix rule.
CREATE INDEX idx_a   ON orders (customer_id);           -- redundant if idx_ab exists
CREATE INDEX idx_ab  ON orders (customer_id, status);   -- this covers both queries

-- Anti-pattern 3: Indexing a low-cardinality column alone
-- A column with only 3 distinct values (pending/active/closed) will match ~33% of rows.
-- The optimizer will often ignore a full index on it and prefer a seq scan.
-- Better: use a partial index or make it a secondary composite column.
CREATE INDEX idx_bad  ON orders (status);               -- often ignored by optimizer
CREATE INDEX idx_good ON orders (status) WHERE status = 'pending';  -- partial index

-- Anti-pattern 4: Unused indexes
-- Indexes that no query ever uses still cost write overhead on every INSERT/UPDATE/DELETE.
-- Audit and drop them regularly (see Part 7 and Part 8).

:::warning Indexes Are Not Free Every index you create is a tax on every INSERT, UPDATE, and DELETE on that table. A table with 15 indexes can have 3–5x higher write latency than the same table with 3 targeted indexes. Always measure before adding an index, and regularly audit unused indexes with pg_stat_user_indexes. :::

The Index Decision Framework

Part 7 - Finding Slow Queries in Production

pg_stat_statements

pg_stat_statements is a PostgreSQL extension that records statistics about every SQL statement executed. It is the first place to look when diagnosing production performance problems.

-- Enable the extension (requires superuser, done once per cluster)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find the top 10 slowest queries by total execution time
SELECT
    round(total_exec_time::numeric, 2)   AS total_ms,
    calls,
    round(mean_exec_time::numeric, 2)    AS mean_ms,
    round(stddev_exec_time::numeric, 2)  AS stddev_ms,
    round((100 * total_exec_time /
        sum(total_exec_time) OVER ())::numeric, 2) AS pct_of_total,
    left(query, 100)                     AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

import psycopg2
from tabulate import tabulate

def find_slow_queries(conn, limit=10):
    """
    Query pg_stat_statements to find the slowest queries by total execution time.
    Returns a list of dicts with query stats.
    """
    sql = """
        SELECT
            round(total_exec_time::numeric, 2)   AS total_ms,
            calls,
            round(mean_exec_time::numeric, 2)    AS mean_ms,
            round((100 * total_exec_time /
                sum(total_exec_time) OVER ())::numeric, 2) AS pct_total,
            left(query, 120) AS query_preview
        FROM pg_stat_statements
        WHERE query NOT ILIKE '%%pg_stat%%'
        ORDER BY total_exec_time DESC
        LIMIT %s;
    """
    with conn.cursor() as cur:
        cur.execute(sql, (limit,))
        cols = [desc[0] for desc in cur.description]
        rows = cur.fetchall()

    results = [dict(zip(cols, row)) for row in rows]
    print(tabulate(results, headers="keys", tablefmt="psql"))
    return results


conn = psycopg2.connect(DSN)
find_slow_queries(conn, limit=10)

auto_explain

auto_explain is a PostgreSQL module that automatically logs EXPLAIN ANALYZE output for any query that exceeds a time threshold. It captures slow queries in production without you having to know in advance which queries to explain.

# In postgresql.conf:
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000   # Log any query taking > 1 second
auto_explain.log_analyze = on          # Include actual timings
auto_explain.log_buffers = on          # Include buffer hit/miss stats
auto_explain.log_format = json         # Structured output for log parsers

After reloading PostgreSQL, slow queries are logged automatically to the PostgreSQL log file with their full execution plans - invaluable for diagnosing intermittent production slowdowns without modifying application code.

Finding Missing Indexes via Sequential Scan Stats

-- Find tables with the most sequential scans.
-- High seq_scan count on large tables suggests queries lacking proper indexes.
SELECT
    relname                         AS table_name,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_scan - idx_scan             AS missing_index_score,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY missing_index_score DESC
LIMIT 20;

Part 8 - Index Maintenance

VACUUM and ANALYZE

VACUUM reclaims space from dead tuples created by MVCC. Without vacuuming, tables and indexes grow indefinitely even when data is being deleted.

ANALYZE updates the statistics used by the query planner. Stale statistics cause the planner to make poor decisions - for example, choosing a sequential scan when an index scan would be faster, because the planner thinks the table has 1,000 rows when it actually has 1,000,000.

PostgreSQL's autovacuum daemon runs both automatically in the background. In most cases you do not need to run them manually. However, after a large bulk load or mass delete, running them manually accelerates recovery.

-- Vacuum a specific table (removes dead tuples, updates visibility map)
VACUUM orders;

-- Vacuum and update statistics in one pass
VACUUM ANALYZE orders;

-- Full vacuum: compacts the table and rebuilds indexes from scratch
-- NOTE: acquires an exclusive lock - use only in maintenance windows
VACUUM FULL orders;

import psycopg2
from psycopg2 import sql

def vacuum_analyze_table(conn, table_name):
    """
    Run VACUUM ANALYZE on a table.
    VACUUM cannot run inside a transaction block, so we temporarily
    set autocommit=True on the connection.
    """
    old_autocommit = conn.autocommit
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            cur.execute(
                sql.SQL("VACUUM ANALYZE {}").format(sql.Identifier(table_name))
            )
        print(f"VACUUM ANALYZE completed for {table_name}")
    finally:
        conn.autocommit = old_autocommit

REINDEX

REINDEX rebuilds an index from scratch. Use it when an index is corrupted or has excessive bloat that VACUUM cannot reclaim.

-- Rebuild a specific index (takes exclusive lock - blocks reads and writes)
REINDEX INDEX idx_orders_customer_email;

-- Rebuild all indexes on a table
REINDEX TABLE orders;

-- Safe for production: rebuild concurrently (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_orders_customer_email;

Monitoring Index Usage

Not all indexes are actually used by any query. Unused indexes waste disk space and add write overhead with zero benefit. Regularly audit index usage and drop indexes that are never used.

-- Find indexes that have never been used since the last statistics reset
SELECT
    schemaname,
    relname         AS table_name,
    indexrelname    AS index_name,
    idx_scan        AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'   -- exclude primary keys
ORDER BY pg_relation_size(indexrelid) DESC;

import psycopg2
from tabulate import tabulate

def find_unused_indexes(conn, schema='public'):
    """
    Find indexes that have never been scanned since the last statistics reset.
    These are candidates for removal after verifying they are not needed.
    """
    sql = """
        SELECT
            relname                                          AS table_name,
            indexrelname                                     AS index_name,
            idx_scan                                         AS scans,
            pg_size_pretty(pg_relation_size(indexrelid))     AS index_size,
            pg_size_pretty(pg_total_relation_size(relid))    AS table_size
        FROM pg_stat_user_indexes
        WHERE schemaname = %s
          AND idx_scan = 0
          AND indexrelname NOT LIKE '%%_pkey'
          AND indexrelname NOT LIKE '%%_key'
        ORDER BY pg_relation_size(indexrelid) DESC;
    """
    with conn.cursor() as cur:
        cur.execute(sql, (schema,))
        cols = [d[0] for d in cur.description]
        rows = cur.fetchall()

    results = [dict(zip(cols, row)) for row in rows]
    if results:
        print(f"Found {len(results)} unused index(es):")
        print(tabulate(results, headers="keys", tablefmt="psql"))
    else:
        print("No unused indexes found.")
    return results


conn = psycopg2.connect(DSN)
find_unused_indexes(conn)

:::danger Do Not Drop Indexes Blindly pg_stat_user_indexes.idx_scan = 0 means the index has not been used since the last statistics reset - which happens on server restart or when pg_stat_reset() is called. An index used only by a nightly batch job or a monthly report may show zero scans on a recently restarted server. Before dropping any index, verify it is not needed by checking query logs and application code. Always drop indexes in a non-production environment first and monitor for regressions. :::

Part 9 - Full Working Example

This example creates a table with 1,000,000 rows, demonstrates the before/after performance difference, and shows how to read EXPLAIN ANALYZE output programmatically in Python.

import psycopg2
import time
import random
import string
from psycopg2.extras import execute_values

DSN = "host=localhost dbname=testdb user=postgres password=postgres"


def random_email():
    name = ''.join(random.choices(string.ascii_lowercase, k=8))
    domain = random.choice(['gmail.com', 'yahoo.com', 'example.com', 'acme.org'])
    return f"{name}@{domain}"


def setup_table(conn):
    """Create the demo table and populate it with 1,000,000 rows."""
    with conn.cursor() as cur:
        cur.execute("DROP TABLE IF EXISTS orders_demo")
        cur.execute("""
            CREATE TABLE orders_demo (
                id             SERIAL PRIMARY KEY,
                customer_email TEXT NOT NULL,
                status         TEXT NOT NULL DEFAULT 'pending',
                amount         NUMERIC(10, 2),
                created_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
            )
        """)
    conn.commit()
    print("Table created. Inserting 1,000,000 rows...")

    batch_size = 10_000
    total_rows = 1_000_000

    with conn.cursor() as cur:
        for batch_start in range(0, total_rows, batch_size):
            rows = [
                (
                    random_email(),
                    random.choice(['pending', 'shipped', 'delivered', 'cancelled']),
                    round(random.uniform(10, 500), 2),
                )
                for _ in range(batch_size)
            ]
            execute_values(
                cur,
                "INSERT INTO orders_demo (customer_email, status, amount) VALUES %s",
                rows,
            )
            if (batch_start // batch_size) % 10 == 0:
                print(f"  Inserted {batch_start + batch_size:,} rows...")
    conn.commit()
    print("Bulk insert complete.")


def explain_query(conn, email):
    """Return the EXPLAIN ANALYZE JSON plan for the target email lookup."""
    with conn.cursor() as cur:
        cur.execute(
            "EXPLAIN (ANALYZE, FORMAT JSON) "
            "SELECT id, customer_email, status, amount "
            "FROM orders_demo "
            "WHERE customer_email = %s",
            (email,),
        )
        plan_json = cur.fetchone()[0]
    return plan_json[0]


def print_plan_summary(label, plan):
    node = plan["Plan"]
    print(f"\n--- {label} ---")
    print(f"  Scan type:      {node['Node Type']}")
    print(f"  Actual rows:    {node['Actual Rows']}")
    rows_removed = node.get('Rows Removed by Filter', 0)
    if rows_removed:
        print(f"  Rows discarded: {rows_removed:,}")
    print(f"  Execution time: {plan['Execution Time']:.3f} ms")


def main():
    conn = psycopg2.connect(DSN)

    # 1. Set up the table with 1 million random rows
    setup_table(conn)

    # Insert the specific email we will benchmark
    target_email = "[email protected]"
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO orders_demo (customer_email, status, amount) VALUES (%s, 'pending', 99.99)",
            (target_email,),
        )
    conn.commit()

    # 2. Benchmark WITHOUT index
    plan_before = explain_query(conn, target_email)
    print_plan_summary("Query WITHOUT index", plan_before)

    # 3. Create the index (CONCURRENTLY - safe for production)
    print("\nCreating index on customer_email (CONCURRENTLY)...")
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute(
            "CREATE INDEX CONCURRENTLY idx_orders_demo_email "
            "ON orders_demo (customer_email)"
        )
    conn.autocommit = False

    # 4. Refresh planner statistics
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("ANALYZE orders_demo")
    conn.autocommit = False
    print("Index created and statistics updated.")

    # 5. Benchmark WITH index
    plan_after = explain_query(conn, target_email)
    print_plan_summary("Query WITH index", plan_after)

    # 6. Show the speedup
    before_ms = plan_before["Execution Time"]
    after_ms  = plan_after["Execution Time"]
    speedup   = before_ms / after_ms if after_ms > 0 else float('inf')
    print(f"\nResult: {before_ms:,.1f} ms → {after_ms:.3f} ms  ({speedup:,.0f}x faster)")

    conn.close()


if __name__ == "__main__":
    main()

Expected output:

Table created. Inserting 1,000,000 rows...
  Inserted 100,000 rows...
  Inserted 200,000 rows...
  ...
Bulk insert complete.

--- Query WITHOUT index ---
  Scan type:      Seq Scan
  Actual rows:    1
  Rows discarded: 1,000,000
  Execution time: 45312.101 ms

Creating index on customer_email (CONCURRENTLY)...
Index created and statistics updated.

--- Query WITH index ---
  Scan type:      Index Scan
  Actual rows:    1
  Execution time: 0.063 ms

Result: 45,312.1 ms → 0.063 ms  (719,241x faster)

Practice Challenges

Beginner - Read a Query Plan

Challenge: Identify the scan type and bottleneck

Given this EXPLAIN ANALYZE output:

Seq Scan on products  (cost=0.00..15230.00 rows=3 width=128)
                      (actual time=0.018..28441.225 rows=3 loops=1)
  Filter: ((category)::text = 'electronics'::text AND price < 50.00)
  Rows Removed by Filter: 750997
Planning Time: 0.094 ms
Execution Time: 28441.301 ms

Answer the following questions:

What scan type is being used?
How many rows were examined in total?
How many rows matched the filter?
What index (or indexes) would you create to improve this query?
After adding the index, what scan type would you expect to see?

Solution:

1. Seq Scan (Sequential Scan) - the database read every row in the table.

2. Total rows examined: 3 (returned) + 750,997 (removed by filter) = 751,000 rows.

3. 3 rows matched the filter.

4. The query filters on both `category` (equality) and `price` (range).
   Following the rule "equality columns before range columns":

   CREATE INDEX idx_products_category_price ON products (category, price);

   A partial index is also worth considering if most queries target
   a specific category:

   CREATE INDEX idx_products_electronics_price
       ON products (price)
       WHERE category = 'electronics';

5. After adding the composite index, EXPLAIN ANALYZE should show:
   Index Scan using idx_products_category_price on products
   with execution time well under 1 ms.

Intermediate - Design a Composite Index Strategy

Challenge: Choose the correct index column order

You have an events table with 10 million rows and the following columns:

CREATE TABLE events (
    id          BIGSERIAL PRIMARY KEY,
    user_id     BIGINT NOT NULL,
    event_type  TEXT NOT NULL,   -- ~20 distinct values
    occurred_at TIMESTAMPTZ NOT NULL,
    payload     JSONB
);

Your application runs three common queries:

-- Query A (runs 10,000 times/minute):
SELECT * FROM events
WHERE user_id = $1
ORDER BY occurred_at DESC
LIMIT 20;

-- Query B (runs 500 times/minute):
SELECT * FROM events
WHERE user_id = $1
  AND event_type = $2
ORDER BY occurred_at DESC
LIMIT 20;

-- Query C (runs 50 times/minute - analytics dashboard):
SELECT event_type, count(*) FROM events
WHERE occurred_at >= NOW() - INTERVAL '24 hours'
GROUP BY event_type;

Design the minimum set of indexes to cover all three queries efficiently. Justify each index and explain why you chose the column order.

Solution:

-- Index 1: Covers Query A and Query B
-- Column order: (user_id, event_type, occurred_at DESC)
--
-- Justification:
--   - user_id is an equality filter in both queries → goes first (leftmost prefix)
--   - event_type is equality in Query B → goes second (equality before ordering)
--   - occurred_at DESC supports ORDER BY occurred_at DESC for both queries
--
-- Query A uses the prefix (user_id) and leverages occurred_at for ordering.
-- Query B uses (user_id, event_type) and leverages occurred_at for ordering.
-- With LIMIT 20, PostgreSQL stops traversing after 20 matching index entries -
-- making both queries extremely fast.
CREATE INDEX idx_events_user_type_time
    ON events (user_id, event_type, occurred_at DESC);

-- Index 2: Covers Query C
-- A B-tree on occurred_at supports the range filter in Query C.
-- Assuming roughly 10M rows over 30 days, the last 24 hours is ~333K rows
-- (~3% of the table) - well within the threshold where an index helps.
CREATE INDEX idx_events_occurred_at
    ON events (occurred_at DESC);

-- Why no separate index for Query A alone?
-- idx_events_user_type_time with prefix (user_id) already covers Query A.
-- A separate (user_id, occurred_at) index would be partially redundant
-- and would increase write overhead unnecessarily.

-- Final result: 2 targeted indexes + 1 primary key = 3 indexes total.
-- This is a lean set that covers all three query patterns without over-indexing.

Advanced - Build an Index Advisor

Challenge: Write a Python function that analyzes a table and suggests missing indexes

Write a Python function suggest_indexes(conn, table_name) that:

Queries pg_stat_user_tables to find the sequential scan count and table size
Queries pg_stat_user_indexes to list existing indexes and their usage
Queries pg_stat_statements to find the top 5 slowest queries that reference that table
Prints a formatted report with current indexes, scan statistics, slow queries, and actionable recommendations

Solution:

import psycopg2
from tabulate import tabulate


def suggest_indexes(conn, table_name: str, schema: str = 'public') -> None:
    """
    Analyze a table and print an index advisory report.

    Data sources:
    - pg_stat_user_tables  - sequential scan and row statistics
    - pg_stat_user_indexes - existing indexes and their usage counts
    - pg_stat_statements   - slow queries referencing this table
    """

    # 1. Sequential scan statistics
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT
                seq_scan,
                seq_tup_read,
                idx_scan,
                n_live_tup,
                pg_size_pretty(pg_total_relation_size(relid)) AS total_size
            FROM pg_stat_user_tables
            WHERE schemaname = %s AND relname = %s
            """,
            (schema, table_name),
        )
        row = cur.fetchone()

    if row is None:
        print(f"Table {schema}.{table_name} not found in pg_stat_user_tables.")
        return

    seq_scan, seq_tup_read, idx_scan, n_live_tup, total_size = row

    print(f"\n{'='*60}")
    print(f"Index Advisory Report: {schema}.{table_name}")
    print(f"{'='*60}")
    print(f"Live rows:        {n_live_tup:,}")
    print(f"Table size:       {total_size}")
    print(f"Sequential scans: {seq_scan:,}")
    print(f"Index scans:      {idx_scan:,}")

    if seq_scan > 0 and idx_scan is not None and idx_scan > 0:
        seq_ratio = seq_scan / (seq_scan + idx_scan) * 100
        flag = "  WARNING: high seq scan ratio" if seq_ratio > 30 else "  OK"
        print(f"Seq scan ratio:   {seq_ratio:.1f}%{flag}")

    # 2. Existing indexes
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT
                indexrelname                                     AS index_name,
                idx_scan                                         AS scans,
                idx_tup_read                                     AS tuples_read,
                pg_size_pretty(pg_relation_size(indexrelid))     AS index_size
            FROM pg_stat_user_indexes
            WHERE schemaname = %s AND relname = %s
            ORDER BY idx_scan DESC
            """,
            (schema, table_name),
        )
        indexes = cur.fetchall()

    print(f"\nExisting indexes ({len(indexes)}):")
    if indexes:
        print(
            tabulate(
                indexes,
                headers=["Index Name", "Scans", "Tuples Read", "Size"],
                tablefmt="psql",
            )
        )
    else:
        print("  (none)")

    # 3. Slow queries referencing this table from pg_stat_statements
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT
                round(mean_exec_time::numeric, 1)  AS mean_ms,
                calls,
                round(total_exec_time::numeric, 0) AS total_ms,
                left(query, 200)                   AS query_preview
            FROM pg_stat_statements
            WHERE query ILIKE %s
              AND query NOT ILIKE '%%pg_stat%%'
              AND query NOT ILIKE '%%EXPLAIN%%'
            ORDER BY mean_exec_time DESC
            LIMIT 5
            """,
            (f'%%{table_name}%%',),
        )
        slow_queries = cur.fetchall()

    print(f"\nTop 5 slowest queries referencing {table_name}:")
    if slow_queries:
        print(
            tabulate(
                slow_queries,
                headers=["Mean (ms)", "Calls", "Total (ms)", "Query (truncated)"],
                tablefmt="psql",
                maxcolwidths=[None, None, None, 60],
            )
        )
    else:
        print("  (no data - pg_stat_statements may not be enabled)")

    # 4. Actionable recommendations
    print("\nRecommendations:")
    recommendations = []

    if seq_scan > 1000 and n_live_tup > 100_000:
        recommendations.append(
            f"High sequential scan count ({seq_scan:,}) on a large table ({n_live_tup:,} rows).\n"
            "  Run EXPLAIN ANALYZE on the slow queries above to identify missing indexes."
        )

    unused = [i for i in indexes if i[1] == 0 and '_pkey' not in i[0]]
    if unused:
        names = ', '.join(i[0] for i in unused)
        recommendations.append(
            f"{len(unused)} index(es) have zero scans (possibly unused):\n  {names}\n"
            "  Verify these are not needed before dropping them."
        )

    if not slow_queries:
        recommendations.append(
            "Enable pg_stat_statements for query-level diagnostics:\n"
            "  CREATE EXTENSION pg_stat_statements;\n"
            "  Add 'pg_stat_statements' to shared_preload_libraries and restart."
        )

    if not recommendations:
        recommendations.append("No obvious issues detected. Monitor over time as data grows.")

    for i, rec in enumerate(recommendations, 1):
        print(f"  {i}. {rec}")


# Usage:
# conn = psycopg2.connect(DSN)
# suggest_indexes(conn, 'orders')
# suggest_indexes(conn, 'events', schema='analytics')

Key Takeaways

What an index is. A B-tree index is a separate, self-balancing data structure that maps column values to physical row locations (ctids). It enables O(log n) lookups instead of O(n) sequential scans. For a table with 1,000,000 rows, a B-tree traversal reads at most ~20 pages; a sequential scan reads up to 1,000,000.

Index types and their use cases. B-tree handles almost everything - equality, ranges, ordering, prefix matching. GIN is for JSONB documents, arrays, and full-text search. GiST is for geometric types and ranges. BRIN is a tiny index for massive append-only tables with naturally ordered columns. Hash is for equality-only lookups on high-cardinality columns.

Creating indexes correctly. Use CREATE INDEX CONCURRENTLY in production to avoid blocking writes. Partial indexes reduce size and write overhead when only a subset of rows is queried. Expression indexes are required when queries use functions like lower() or JSON extraction operators.

Reading EXPLAIN ANALYZE. Seq Scan means every row is being read - look for a missing index. Index Scan means the B-tree is used but heap fetches are still needed. Index Only Scan is the fastest - all needed columns are in the index itself. Pay attention to Rows Removed by Filter to understand waste.

The leftmost prefix rule. A composite index (a, b, c) supports queries on a, on (a, b), or on (a, b, c) - but not on b alone or c alone. Put equality columns before range columns to maximize how many columns the traversal uses before the chain breaks.

When indexes hurt. Every index is a tax on every INSERT, UPDATE, and DELETE. Over-indexing - adding indexes on every column or adding redundant indexes - can make write-heavy workloads 3–5x slower. Audit unused indexes with pg_stat_user_indexes and drop them after verification.

Maintenance matters. Run VACUUM ANALYZE after large bulk operations to reclaim space and refresh planner statistics. Use REINDEX CONCURRENTLY to rebuild bloated indexes in production. Monitor index health regularly with pg_stat_user_indexes and pg_stat_statements.

What's Next

You now understand how databases use indexes to turn 45-second queries into sub-millisecond lookups - and the real costs that come with that power. You can read query plans, design composite indexes, find slow queries in production, and maintain indexes over time.

The next lesson introduces SQLAlchemy ORM, Python's most powerful and widely-used object-relational mapper. You will learn how SQLAlchemy defines models as Python classes, how it generates SQL from method chains, how to manage schema migrations with Alembic, and - critically - how to apply everything you know about indexes and EXPLAIN ANALYZE when SQLAlchemy-generated queries are unexpectedly slow.

Next: 06 - ORM with SQLAlchemy

Module 07: Databases | Lesson 5 of 7

What You Will Learn​

Prerequisites​

Part 1 - What an Index Actually Is​

The Problem: Sequential Scans Scale Poorly​

The Solution: A Sorted Auxiliary Structure​

B-tree Internals​

How a Lookup Works​

Part 2 - Index Types: Choosing the Right Tool​

B-tree (Default)​

Hash Index​

GIN - Generalized Inverted Index​

GiST - Generalized Search Tree​

BRIN - Block Range Index​

Part 3 - Creating Indexes​

Standard Index​

Unique Index​

Partial Index​

Expression Index​

Concurrent Index Creation​

Part 4 - EXPLAIN ANALYZE: Reading Query Plans​

Basic Syntax​

Understanding the Output​

Scan Types​

Reading EXPLAIN ANALYZE in Python​

Part 5 - Composite Indexes and the Leftmost Prefix Rule​

Creating a Composite Index​

The Leftmost Prefix Rule​

Column Order Matters Enormously​

Range Conditions Break the Chain​

Part 6 - When Indexes Hurt Performance​

Write Overhead​

Index Bloat​

Over-Indexing Anti-Patterns​

The Index Decision Framework​

Part 7 - Finding Slow Queries in Production​

pg_stat_statements​

auto_explain​

Finding Missing Indexes via Sequential Scan Stats​

Part 8 - Index Maintenance​

VACUUM and ANALYZE​

REINDEX​

Monitoring Index Usage​

Part 9 - Full Working Example​

Practice Challenges​

Beginner - Read a Query Plan​

Intermediate - Design a Composite Index Strategy​

Advanced - Build an Index Advisor​

Key Takeaways​

What's Next​

What You Will Learn

Prerequisites

Part 1 - What an Index Actually Is

The Problem: Sequential Scans Scale Poorly

The Solution: A Sorted Auxiliary Structure

B-tree Internals

How a Lookup Works

Part 2 - Index Types: Choosing the Right Tool

B-tree (Default)

Hash Index

GIN - Generalized Inverted Index

GiST - Generalized Search Tree

BRIN - Block Range Index

Part 3 - Creating Indexes

Standard Index

Unique Index

Partial Index

Expression Index

Concurrent Index Creation

Part 4 - EXPLAIN ANALYZE: Reading Query Plans

Basic Syntax

Understanding the Output

Scan Types

Reading EXPLAIN ANALYZE in Python

Part 5 - Composite Indexes and the Leftmost Prefix Rule

Creating a Composite Index

The Leftmost Prefix Rule

Column Order Matters Enormously

Range Conditions Break the Chain

Part 6 - When Indexes Hurt Performance

Write Overhead

Index Bloat

Over-Indexing Anti-Patterns

The Index Decision Framework

Part 7 - Finding Slow Queries in Production

pg_stat_statements

auto_explain

Finding Missing Indexes via Sequential Scan Stats

Part 8 - Index Maintenance

VACUUM and ANALYZE

REINDEX

Monitoring Index Usage

Part 9 - Full Working Example

Practice Challenges

Beginner - Read a Query Plan

Intermediate - Design a Composite Index Strategy

Advanced - Build an Index Advisor

Key Takeaways

What's Next