Indexing and Query Optimization
Reading time: ~40 minutes | Level: Intermediate → Engineering
Before reading further, predict what is happening here:
import psycopg2
import time
conn = psycopg2.connect(DSN)
cur = conn.cursor()
# Test 1: table with 1,000 rows
# Elapsed: 0.050 seconds ✓ fast enough
# Test 2: same table, same query, same hardware - now 1,000,000 rows
# Elapsed: 45.3 seconds ✗ production is on fire
Identical query. Same Python code. Same server. The only thing that changed was the row count - and performance collapsed by a factor of 900.
This is not a bug in psycopg2. It is not a network problem. It is a missing index, and it is one of the most common causes of production database emergencies. Understanding exactly why this happens - and how an index fixes it - is what this lesson is about.
What You Will Learn
- What a database index physically is - the B-tree data structure and how the engine traverses it
- The five main index types in PostgreSQL and when to choose each one
- How to create indexes: standard, unique, partial, and expression indexes
- How to read
EXPLAIN ANALYZEoutput and distinguish Seq Scan, Index Scan, and Index Only Scan - The leftmost prefix rule for composite indexes - and exactly which queries will or will not use your index
- When indexes actively hurt performance - INSERT/UPDATE/DELETE overhead and index bloat
- How to find slow queries in production using
pg_stat_statementsandauto_explain - Index maintenance:
VACUUM,ANALYZE,REINDEX, and monitoring index usage
Prerequisites
- Python Intermediate: Module 07, Lessons 01–04 (SQL Fundamentals through Transactions)
- Basic familiarity with
psycopg2and PostgreSQL - Understanding of SQL
SELECT,WHERE, andJOINclauses
Part 1 - What an Index Actually Is
The Problem: Sequential Scans Scale Poorly
When you run SELECT * FROM orders WHERE customer_email = '[email protected]' against a table with no index on customer_email, the database has only one option: read every single row in the table, check each one against the condition, and return the matching rows.
This is called a sequential scan (or full table scan). It is O(n) - as the number of rows doubles, the time to answer the query roughly doubles. At 1,000 rows it is fast. At 1,000,000 rows it is catastrophic.
Sequential scan with 1,000,000 rows:
[row 1] [row 2] [row 3] ... [row 999,999] [row 1,000,000]
↑ check ↑ check ↑ check ... ↑ check ↑ check
Every row must be visited. Alice's row could be anywhere.
The Solution: A Sorted Auxiliary Structure
An index is a separate data structure maintained by the database alongside your table. It stores a subset of the data - typically just the indexed column(s) and a pointer (called a "heap tuple identifier" or ctid) back to the full row - arranged in a way that makes lookups dramatically faster.
The default and most common index type in PostgreSQL is the B-tree (Balanced Tree).
B-tree Internals
A B-tree is a self-balancing tree data structure. Every node can hold multiple keys, and the tree always remains balanced - meaning all leaf nodes are at the same depth. This guarantees that any lookup takes O(log n) time regardless of the data distribution.
The tree has three levels of nodes:
- Root node - the single entry point at the top; holds key ranges that point to branch nodes
- Branch nodes (also called internal nodes) - intermediate layers; hold key ranges pointing to child nodes or leaf pages
- Leaf pages - the bottom layer; contain the actual indexed values paired with heap tuple identifiers (pointers to the real row location on disk)
How a Lookup Works
To find the row where id = 450:
- Start at the root:
450is between300and600, so follow the pointer to the middle branch node (keys 400, 500) - At the branch node:
450is between400and500, follow the left pointer to Leaf Page 5 - At the leaf page: find
450 → ctid(9,4)- this is the physical location (page 9, tuple 4) of the actual row in the heap - Fetch that single page from disk and return the row
With 1,000,000 rows, a balanced B-tree is at most ~20 levels deep (log₂(1,000,000) ≈ 20). The database reads at most 20 pages instead of up to 1,000,000. That is the source of the 900x speedup.
:::note Heap vs Index The index and the table data are stored separately. The table data lives in the "heap" - an unordered collection of pages. The index is an auxiliary structure that points into the heap. An "Index Scan" means: traverse the B-tree to find the ctid, then fetch the row from the heap. An "Index Only Scan" means the index already contains all the columns the query needs - no heap fetch is required, making it the fastest possible scan type. :::
Part 2 - Index Types: Choosing the Right Tool
PostgreSQL supports several index types. Choosing the wrong one means the index may never be used.
B-tree (Default)
Works for: =, <, >, <=, >=, BETWEEN, LIKE 'prefix%', ORDER BY, GROUP BY
B-tree is the default and correct choice for the vast majority of queries. Always start here.
CREATE INDEX idx_orders_created_at ON orders (created_at);
-- Used for: WHERE created_at > '2024-01-01'
-- Used for: ORDER BY created_at DESC
-- Used for: BETWEEN '2024-01-01' AND '2024-12-31'
Hash Index
Works for: equality (=) only - nothing else
Hash indexes are smaller and faster than B-tree for pure equality lookups, but they cannot support range queries or ordering. In PostgreSQL 10+, hash indexes are crash-safe. Use them only when you are certain you only ever need equality and the column has high cardinality.
CREATE INDEX idx_sessions_token ON sessions USING hash (session_token);
-- Used for: WHERE session_token = 'abc123'
-- NOT used for: WHERE session_token > 'abc123' (range query)
GIN - Generalized Inverted Index
Works for: JSONB columns, arrays, full-text search (tsvector), multi-value data
GIN builds an inverted index: for each element in a document or array, it stores a list of rows that contain that element. This makes containment queries (@>, ?, @@) very fast.
-- For a JSONB column storing product tags
CREATE INDEX idx_products_tags ON products USING gin (tags);
-- Now this query is fast:
SELECT * FROM products WHERE tags @> '["python", "tutorial"]';
-- Full-text search
CREATE INDEX idx_articles_search ON articles USING gin (to_tsvector('english', body));
SELECT * FROM articles WHERE to_tsvector('english', body) @@ plainto_tsquery('database indexing');
GiST - Generalized Search Tree
Works for: geometric types (points, polygons, circles), ranges, nearest-neighbor searches
GiST is an extensible index framework used primarily for spatial data and range types.
-- For PostGIS geometry column
CREATE INDEX idx_locations_geom ON locations USING gist (geom);
-- For date ranges
CREATE INDEX idx_bookings_period ON bookings USING gist (reservation_period);
SELECT * FROM bookings WHERE reservation_period && '[2024-06-01, 2024-06-30]'::daterange;
BRIN - Block Range Index
Works for: very large tables where column values are naturally correlated with physical row order
BRIN (Block Range INdex) stores only the minimum and maximum values for each block range rather than indexing every row. The index is tiny but only useful when the data is physically sorted by the indexed column - for example, a created_at timestamp column on an append-only events table where newer rows are always added at the end.
-- Ideal for append-only time-series tables
CREATE INDEX idx_events_created_at ON events USING brin (created_at);
-- This index is ~1000x smaller than a B-tree for the same column
-- Only useful when physical order matches logical order
:::tip Choosing an Index Type When in doubt, use B-tree. It works for almost everything. Use GIN when you are querying inside JSONB documents or arrays. Use BRIN only for massive append-only tables where the column is naturally ordered. Use GiST for geometric and range data. :::
Part 3 - Creating Indexes
Standard Index
-- Basic single-column index
CREATE INDEX idx_orders_customer_email ON orders (customer_email);
-- With explicit B-tree (same as default, but explicit)
CREATE INDEX idx_orders_status ON orders USING btree (status);
-- Index for descending order (useful when you ORDER BY col DESC)
CREATE INDEX idx_orders_created_desc ON orders (created_at DESC);
Unique Index
A unique index enforces uniqueness AND speeds up lookups. PRIMARY KEY and UNIQUE constraints automatically create unique indexes.
-- Explicitly create a unique index
CREATE UNIQUE INDEX idx_users_email ON users (email);
-- This is equivalent to:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
-- (PostgreSQL implements UNIQUE constraints using a unique index internally)
Partial Index
A partial index is built only on the rows that satisfy a WHERE clause. It is smaller, faster to maintain, and - critically - it can be used for queries that match that same filter.
This is one of the most powerful and underused features in PostgreSQL.
-- Only index orders that are not yet fulfilled.
-- If 95% of orders are fulfilled, this index covers 5% of rows
-- but answers 100% of "pending order" queries efficiently.
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
-- Now this query uses the partial index:
SELECT * FROM orders WHERE status = 'pending' AND created_at < NOW() - INTERVAL '7 days';
-- This query does NOT use it (condition does not match the partial filter):
SELECT * FROM orders WHERE status = 'fulfilled' AND created_at < NOW() - INTERVAL '7 days';
:::tip Partial Indexes for Status Columns
If you have a status column like is_active, is_deleted, or status = 'pending' and most application queries filter on a specific value, a partial index can be 10–100x smaller than a full index and is cheaper to maintain on every write.
:::
Expression Index
An expression index is computed on the result of a function or expression rather than a raw column value.
-- Case-insensitive email lookups
CREATE INDEX idx_users_email_lower ON users (lower(email));
-- Now this query uses the index:
-- Without this index, the lower() call forces a sequential scan
-- because an index on (email) does not know about lower().
-- Index on an extracted JSON field
CREATE INDEX idx_orders_metadata_region ON orders ((metadata->>'region'));
SELECT * FROM orders WHERE metadata->>'region' = 'us-west-2';
Concurrent Index Creation
CREATE INDEX by default takes a lock that blocks all writes. On a production table, always use CONCURRENTLY.
-- Safe for production: does not block writes
-- Takes longer to build but allows reads and writes throughout
CREATE INDEX CONCURRENTLY idx_orders_customer_email ON orders (customer_email);
-- CONCURRENTLY cannot run inside a transaction block.
-- Run it as a standalone statement, not inside BEGIN/COMMIT.
:::warning CREATE INDEX Locks the Table
CREATE INDEX without CONCURRENTLY takes a lock that blocks all INSERT, UPDATE, and DELETE operations until the index is built. On a large production table this can lock your application for minutes. Always use CREATE INDEX CONCURRENTLY in production.
:::
Part 4 - EXPLAIN ANALYZE: Reading Query Plans
EXPLAIN ANALYZE is the primary tool for understanding how PostgreSQL executes a query. It shows the execution plan the query planner chose, and after executing the query, it shows actual timings and row counts.
Basic Syntax
Understanding the Output
Here is typical output before adding an index:
Seq Scan on orders (cost=0.00..28450.00 rows=5 width=312) (actual time=0.021..45312.442 rows=5 loops=1)
Filter: ((customer_email)::text = '[email protected]'::text)
Rows Removed by Filter: 999995
Planning Time: 0.123 ms
Execution Time: 45312.601 ms
Breaking this down:
| Field | Meaning |
|---|---|
Seq Scan | Node type - sequential (full table) scan |
cost=0.00..28450.00 | Estimated startup cost .. total cost (in abstract units) |
rows=5 | Planner's estimated row count |
width=312 | Estimated average row size in bytes |
actual time=0.021..45312.442 | Actual time in ms: first row .. all rows |
rows=5 | Actual rows returned |
loops=1 | How many times this node ran (relevant for joins) |
Rows Removed by Filter: 999995 | How many rows were read and discarded |
The key numbers here are 45 seconds and 999,995 rows discarded - the database read every row in the table and threw away all but 5 of them.
After adding an index:
Index Scan using idx_orders_customer_email on orders
(cost=0.42..12.48 rows=5 width=312) (actual time=0.031..0.047 rows=5 loops=1)
Index Cond: ((customer_email)::text = '[email protected]'::text)
Planning Time: 0.156 ms
Execution Time: 0.063 ms
Now the query takes 0.06ms - the database traversed the B-tree, found 5 matching ctids, and fetched only those 5 rows from the heap.
Scan Types
| Scan Type | When It Appears | What It Means |
|---|---|---|
Seq Scan | No usable index, or optimizer judged a full scan cheaper | Reads every row in the table |
Index Scan | Index found, but heap fetch needed for full row | Traverses index, then fetches rows from heap |
Index Only Scan | Index contains all needed columns | No heap fetch - fastest possible |
Bitmap Heap Scan | Multiple indexes combined, or many matching rows | Builds a bitmap of matching pages first, then fetches |
Reading EXPLAIN ANALYZE in Python
import psycopg2
import json
def explain_query(conn, query, params=None):
"""
Run EXPLAIN ANALYZE on a query and return the plan as structured data.
Uses JSON format for easier programmatic analysis.
"""
explain_sql = f"EXPLAIN (ANALYZE, FORMAT JSON) {query}"
with conn.cursor() as cur:
cur.execute(explain_sql, params)
plan = cur.fetchone()[0] # Returns a list with one element: the JSON plan
return plan[0] # The plan dict
def print_plan_summary(plan):
"""Print a human-readable summary of the top-level plan node."""
node = plan["Plan"]
print(f"Node Type: {node['Node Type']}")
print(f"Relation: {node.get('Relation Name', 'N/A')}")
print(f"Actual Time: {node['Actual Total Time']:.3f} ms")
print(f"Actual Rows: {node['Actual Rows']}")
print(f"Planning Time: {plan['Planning Time']:.3f} ms")
print(f"Execution Time: {plan['Execution Time']:.3f} ms")
# Example usage
conn = psycopg2.connect(DSN)
# Before index
plan = explain_query(conn, "SELECT * FROM orders WHERE customer_email = %s", ('[email protected]',))
print("--- Before Index ---")
print_plan_summary(plan)
# Create the index
with conn.cursor() as cur:
cur.execute("CREATE INDEX CONCURRENTLY idx_orders_email ON orders (customer_email)")
conn.commit()
# After index
plan = explain_query(conn, "SELECT * FROM orders WHERE customer_email = %s", ('[email protected]',))
print("\n--- After Index ---")
print_plan_summary(plan)
Sample output:
--- Before Index ---
Node Type: Seq Scan
Relation: orders
Actual Time: 45312.442 ms
Actual Rows: 5
Planning Time: 0.123 ms
Execution Time: 45312.601 ms
--- After Index ---
Node Type: Index Scan
Relation: orders
Actual Time: 0.047 ms
Actual Rows: 5
Planning Time: 0.156 ms
Execution Time: 0.063 ms
:::note When the Optimizer Ignores Your Index
PostgreSQL's query planner may choose a sequential scan even when an index exists. This happens when: (1) the table is small enough that a seq scan is faster, (2) the query matches a large fraction of the table (roughly more than 10–20% of rows), or (3) statistics are stale and the planner underestimates selectivity. Run ANALYZE table_name to refresh statistics, then re-run EXPLAIN ANALYZE.
:::
Part 5 - Composite Indexes and the Leftmost Prefix Rule
A composite index (also called a multi-column index) is an index on two or more columns. Understanding how composite indexes work - and when they are not used - is one of the most practically important concepts in database performance.
Creating a Composite Index
-- Index on (last_name, first_name, date_of_birth)
CREATE INDEX idx_customers_name ON customers (last_name, first_name, date_of_birth);
This creates a single B-tree where rows are sorted first by last_name, then by first_name within the same last_name, then by date_of_birth within the same last_name and first_name.
The Leftmost Prefix Rule
A composite index (a, b, c) can be used by a query only if the query filters on a prefix of the column list starting from the leftmost column. The database can use the index for a, for (a, b), or for (a, b, c) - but NOT for b alone, c alone, or (b, c) without a.
Column Order Matters Enormously
The order of columns in a composite index determines which queries can use it. The general rule: put the most selective column first (the column that eliminates the most rows), unless your query patterns require a different order.
-- Scenario: query for orders by status and created_at
-- status has ~5 distinct values (low cardinality)
-- created_at has millions of distinct values (high cardinality)
-- Option A: (status, created_at)
CREATE INDEX idx_a ON orders (status, created_at);
-- Good for: WHERE status = 'pending' AND created_at > '2024-01-01'
-- Good for: WHERE status = 'pending' (uses leftmost prefix)
-- Bad for: WHERE created_at > '2024-01-01' (skips leftmost column)
-- Option B: (created_at, status)
CREATE INDEX idx_b ON orders (created_at, status);
-- Good for: WHERE created_at > '2024-01-01' AND status = 'pending'
-- Good for: WHERE created_at > '2024-01-01' (uses leftmost prefix)
-- Bad for: WHERE status = 'pending' (skips leftmost column)
Which is better? It depends entirely on your query patterns. If you almost always filter by status first, Option A is better. If you almost always filter by date ranges, Option B is better.
Range Conditions Break the Chain
A range condition on a column in the middle of a composite index stops the index from being used for subsequent columns:
CREATE INDEX idx_orders ON orders (customer_id, created_at, status);
-- customer_id uses equality → index used fully for this column
-- created_at uses range → index still used, but breaks the chain
-- status filter is applied AFTER the index scan, not during traversal
SELECT * FROM orders
WHERE customer_id = 42 -- uses index (equality)
AND created_at > '2024-01-01' -- uses index (range - breaks chain)
AND status = 'pending'; -- cannot use index (after range column)
:::tip Equality Columns Before Range Columns
In a composite index, put columns used in equality conditions (=) before columns used in range conditions (>, <, BETWEEN). This maximizes the number of columns the index traversal can use before the chain breaks.
:::
Part 6 - When Indexes Hurt Performance
Indexes are not free. Every index you create is paid for on every write. Understanding this cost is essential for avoiding over-indexing, one of the most common database anti-patterns.
Write Overhead
When you INSERT a new row, every index on that table must be updated. The database must:
- Find the correct position in each B-tree for the new value
- Insert the new key-pointer pair into the leaf page
- Potentially split a page if it is full (B-tree page splits are expensive)
The same applies to UPDATE (which is internally a delete + insert in PostgreSQL due to MVCC) and DELETE.
-- A table with 12 indexes
-- Every INSERT must update 12 separate B-trees.
-- A high-throughput INSERT workload can spend 60–80% of its time
-- on index maintenance rather than actual data writes.
-- Rough rule of thumb:
-- 0–3 indexes: minimal write overhead
-- 4–8 indexes: noticeable but acceptable for read-heavy tables
-- 9+ indexes: significant write overhead; audit carefully
Index Bloat
PostgreSQL uses MVCC (Multi-Version Concurrency Control). When a row is updated or deleted, the old version is not immediately removed - it is marked as dead. Dead index entries accumulate over time, consuming space and slowing down index scans. This is called index bloat.
VACUUM removes dead tuples from the heap and dead entries from indexes. Without regular vacuuming, indexes grow continuously even if the table size is stable.
-- Check for index bloat (requires pgstattuple extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
(pgstattuple(indexrelid)).dead_tuple_percent AS dead_tuple_pct
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
Over-Indexing Anti-Patterns
-- Anti-pattern 1: Indexing every column
-- Adding an index "just in case" on every column makes writes very slow.
-- Most of those indexes will never be used by any query.
-- Anti-pattern 2: Redundant indexes
-- If you have a composite index (a, b), a separate index on (a) is redundant.
-- The composite index already handles queries on (a) alone via the leftmost prefix rule.
CREATE INDEX idx_a ON orders (customer_id); -- redundant if idx_ab exists
CREATE INDEX idx_ab ON orders (customer_id, status); -- this covers both queries
-- Anti-pattern 3: Indexing a low-cardinality column alone
-- A column with only 3 distinct values (pending/active/closed) will match ~33% of rows.
-- The optimizer will often ignore a full index on it and prefer a seq scan.
-- Better: use a partial index or make it a secondary composite column.
CREATE INDEX idx_bad ON orders (status); -- often ignored by optimizer
CREATE INDEX idx_good ON orders (status) WHERE status = 'pending'; -- partial index
-- Anti-pattern 4: Unused indexes
-- Indexes that no query ever uses still cost write overhead on every INSERT/UPDATE/DELETE.
-- Audit and drop them regularly (see Part 7 and Part 8).
:::warning Indexes Are Not Free
Every index you create is a tax on every INSERT, UPDATE, and DELETE on that table. A table with 15 indexes can have 3–5x higher write latency than the same table with 3 targeted indexes. Always measure before adding an index, and regularly audit unused indexes with pg_stat_user_indexes.
:::
The Index Decision Framework
Part 7 - Finding Slow Queries in Production
pg_stat_statements
pg_stat_statements is a PostgreSQL extension that records statistics about every SQL statement executed. It is the first place to look when diagnosing production performance problems.
-- Enable the extension (requires superuser, done once per cluster)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the top 10 slowest queries by total execution time
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
round((100 * total_exec_time /
sum(total_exec_time) OVER ())::numeric, 2) AS pct_of_total,
left(query, 100) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
import psycopg2
from tabulate import tabulate
def find_slow_queries(conn, limit=10):
"""
Query pg_stat_statements to find the slowest queries by total execution time.
Returns a list of dicts with query stats.
"""
sql = """
SELECT
round(total_exec_time::numeric, 2) AS total_ms,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round((100 * total_exec_time /
sum(total_exec_time) OVER ())::numeric, 2) AS pct_total,
left(query, 120) AS query_preview
FROM pg_stat_statements
WHERE query NOT ILIKE '%%pg_stat%%'
ORDER BY total_exec_time DESC
LIMIT %s;
"""
with conn.cursor() as cur:
cur.execute(sql, (limit,))
cols = [desc[0] for desc in cur.description]
rows = cur.fetchall()
results = [dict(zip(cols, row)) for row in rows]
print(tabulate(results, headers="keys", tablefmt="psql"))
return results
conn = psycopg2.connect(DSN)
find_slow_queries(conn, limit=10)
auto_explain
auto_explain is a PostgreSQL module that automatically logs EXPLAIN ANALYZE output for any query that exceeds a time threshold. It captures slow queries in production without you having to know in advance which queries to explain.
# In postgresql.conf:
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000 # Log any query taking > 1 second
auto_explain.log_analyze = on # Include actual timings
auto_explain.log_buffers = on # Include buffer hit/miss stats
auto_explain.log_format = json # Structured output for log parsers
After reloading PostgreSQL, slow queries are logged automatically to the PostgreSQL log file with their full execution plans - invaluable for diagnosing intermittent production slowdowns without modifying application code.
Finding Missing Indexes via Sequential Scan Stats
-- Find tables with the most sequential scans.
-- High seq_scan count on large tables suggests queries lacking proper indexes.
SELECT
relname AS table_name,
seq_scan,
seq_tup_read,
idx_scan,
seq_scan - idx_scan AS missing_index_score,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY missing_index_score DESC
LIMIT 20;
Part 8 - Index Maintenance
VACUUM and ANALYZE
VACUUM reclaims space from dead tuples created by MVCC. Without vacuuming, tables and indexes grow indefinitely even when data is being deleted.
ANALYZE updates the statistics used by the query planner. Stale statistics cause the planner to make poor decisions - for example, choosing a sequential scan when an index scan would be faster, because the planner thinks the table has 1,000 rows when it actually has 1,000,000.
PostgreSQL's autovacuum daemon runs both automatically in the background. In most cases you do not need to run them manually. However, after a large bulk load or mass delete, running them manually accelerates recovery.
-- Vacuum a specific table (removes dead tuples, updates visibility map)
VACUUM orders;
-- Vacuum and update statistics in one pass
VACUUM ANALYZE orders;
-- Full vacuum: compacts the table and rebuilds indexes from scratch
-- NOTE: acquires an exclusive lock - use only in maintenance windows
VACUUM FULL orders;
import psycopg2
from psycopg2 import sql
def vacuum_analyze_table(conn, table_name):
"""
Run VACUUM ANALYZE on a table.
VACUUM cannot run inside a transaction block, so we temporarily
set autocommit=True on the connection.
"""
old_autocommit = conn.autocommit
conn.autocommit = True
try:
with conn.cursor() as cur:
cur.execute(
sql.SQL("VACUUM ANALYZE {}").format(sql.Identifier(table_name))
)
print(f"VACUUM ANALYZE completed for {table_name}")
finally:
conn.autocommit = old_autocommit
REINDEX
REINDEX rebuilds an index from scratch. Use it when an index is corrupted or has excessive bloat that VACUUM cannot reclaim.
-- Rebuild a specific index (takes exclusive lock - blocks reads and writes)
REINDEX INDEX idx_orders_customer_email;
-- Rebuild all indexes on a table
REINDEX TABLE orders;
-- Safe for production: rebuild concurrently (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_orders_customer_email;
Monitoring Index Usage
Not all indexes are actually used by any query. Unused indexes waste disk space and add write overhead with zero benefit. Regularly audit index usage and drop indexes that are never used.
-- Find indexes that have never been used since the last statistics reset
SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey' -- exclude primary keys
ORDER BY pg_relation_size(indexrelid) DESC;
import psycopg2
from tabulate import tabulate
def find_unused_indexes(conn, schema='public'):
"""
Find indexes that have never been scanned since the last statistics reset.
These are candidates for removal after verifying they are not needed.
"""
sql = """
SELECT
relname AS table_name,
indexrelname AS index_name,
idx_scan AS scans,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
pg_size_pretty(pg_total_relation_size(relid)) AS table_size
FROM pg_stat_user_indexes
WHERE schemaname = %s
AND idx_scan = 0
AND indexrelname NOT LIKE '%%_pkey'
AND indexrelname NOT LIKE '%%_key'
ORDER BY pg_relation_size(indexrelid) DESC;
"""
with conn.cursor() as cur:
cur.execute(sql, (schema,))
cols = [d[0] for d in cur.description]
rows = cur.fetchall()
results = [dict(zip(cols, row)) for row in rows]
if results:
print(f"Found {len(results)} unused index(es):")
print(tabulate(results, headers="keys", tablefmt="psql"))
else:
print("No unused indexes found.")
return results
conn = psycopg2.connect(DSN)
find_unused_indexes(conn)
:::danger Do Not Drop Indexes Blindly
pg_stat_user_indexes.idx_scan = 0 means the index has not been used since the last statistics reset - which happens on server restart or when pg_stat_reset() is called. An index used only by a nightly batch job or a monthly report may show zero scans on a recently restarted server. Before dropping any index, verify it is not needed by checking query logs and application code. Always drop indexes in a non-production environment first and monitor for regressions.
:::
Part 9 - Full Working Example
This example creates a table with 1,000,000 rows, demonstrates the before/after performance difference, and shows how to read EXPLAIN ANALYZE output programmatically in Python.
import psycopg2
import time
import random
import string
from psycopg2.extras import execute_values
DSN = "host=localhost dbname=testdb user=postgres password=postgres"
def random_email():
name = ''.join(random.choices(string.ascii_lowercase, k=8))
domain = random.choice(['gmail.com', 'yahoo.com', 'example.com', 'acme.org'])
return f"{name}@{domain}"
def setup_table(conn):
"""Create the demo table and populate it with 1,000,000 rows."""
with conn.cursor() as cur:
cur.execute("DROP TABLE IF EXISTS orders_demo")
cur.execute("""
CREATE TABLE orders_demo (
id SERIAL PRIMARY KEY,
customer_email TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
amount NUMERIC(10, 2),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
)
""")
conn.commit()
print("Table created. Inserting 1,000,000 rows...")
batch_size = 10_000
total_rows = 1_000_000
with conn.cursor() as cur:
for batch_start in range(0, total_rows, batch_size):
rows = [
(
random_email(),
random.choice(['pending', 'shipped', 'delivered', 'cancelled']),
round(random.uniform(10, 500), 2),
)
for _ in range(batch_size)
]
execute_values(
cur,
"INSERT INTO orders_demo (customer_email, status, amount) VALUES %s",
rows,
)
if (batch_start // batch_size) % 10 == 0:
print(f" Inserted {batch_start + batch_size:,} rows...")
conn.commit()
print("Bulk insert complete.")
def explain_query(conn, email):
"""Return the EXPLAIN ANALYZE JSON plan for the target email lookup."""
with conn.cursor() as cur:
cur.execute(
"EXPLAIN (ANALYZE, FORMAT JSON) "
"SELECT id, customer_email, status, amount "
"FROM orders_demo "
"WHERE customer_email = %s",
(email,),
)
plan_json = cur.fetchone()[0]
return plan_json[0]
def print_plan_summary(label, plan):
node = plan["Plan"]
print(f"\n--- {label} ---")
print(f" Scan type: {node['Node Type']}")
print(f" Actual rows: {node['Actual Rows']}")
rows_removed = node.get('Rows Removed by Filter', 0)
if rows_removed:
print(f" Rows discarded: {rows_removed:,}")
print(f" Execution time: {plan['Execution Time']:.3f} ms")
def main():
conn = psycopg2.connect(DSN)
# 1. Set up the table with 1 million random rows
setup_table(conn)
# Insert the specific email we will benchmark
with conn.cursor() as cur:
cur.execute(
"INSERT INTO orders_demo (customer_email, status, amount) VALUES (%s, 'pending', 99.99)",
(target_email,),
)
conn.commit()
# 2. Benchmark WITHOUT index
plan_before = explain_query(conn, target_email)
print_plan_summary("Query WITHOUT index", plan_before)
# 3. Create the index (CONCURRENTLY - safe for production)
print("\nCreating index on customer_email (CONCURRENTLY)...")
conn.autocommit = True
with conn.cursor() as cur:
cur.execute(
"CREATE INDEX CONCURRENTLY idx_orders_demo_email "
"ON orders_demo (customer_email)"
)
conn.autocommit = False
# 4. Refresh planner statistics
conn.autocommit = True
with conn.cursor() as cur:
cur.execute("ANALYZE orders_demo")
conn.autocommit = False
print("Index created and statistics updated.")
# 5. Benchmark WITH index
plan_after = explain_query(conn, target_email)
print_plan_summary("Query WITH index", plan_after)
# 6. Show the speedup
before_ms = plan_before["Execution Time"]
after_ms = plan_after["Execution Time"]
speedup = before_ms / after_ms if after_ms > 0 else float('inf')
print(f"\nResult: {before_ms:,.1f} ms → {after_ms:.3f} ms ({speedup:,.0f}x faster)")
conn.close()
if __name__ == "__main__":
main()
Expected output:
Table created. Inserting 1,000,000 rows...
Inserted 100,000 rows...
Inserted 200,000 rows...
...
Bulk insert complete.
--- Query WITHOUT index ---
Scan type: Seq Scan
Actual rows: 1
Rows discarded: 1,000,000
Execution time: 45312.101 ms
Creating index on customer_email (CONCURRENTLY)...
Index created and statistics updated.
--- Query WITH index ---
Scan type: Index Scan
Actual rows: 1
Execution time: 0.063 ms
Result: 45,312.1 ms → 0.063 ms (719,241x faster)
Practice Challenges
Beginner - Read a Query Plan
Challenge: Identify the scan type and bottleneck
Given this EXPLAIN ANALYZE output:
Seq Scan on products (cost=0.00..15230.00 rows=3 width=128)
(actual time=0.018..28441.225 rows=3 loops=1)
Filter: ((category)::text = 'electronics'::text AND price < 50.00)
Rows Removed by Filter: 750997
Planning Time: 0.094 ms
Execution Time: 28441.301 ms
Answer the following questions:
- What scan type is being used?
- How many rows were examined in total?
- How many rows matched the filter?
- What index (or indexes) would you create to improve this query?
- After adding the index, what scan type would you expect to see?
Solution:
1. Seq Scan (Sequential Scan) - the database read every row in the table.
2. Total rows examined: 3 (returned) + 750,997 (removed by filter) = 751,000 rows.
3. 3 rows matched the filter.
4. The query filters on both `category` (equality) and `price` (range).
Following the rule "equality columns before range columns":
CREATE INDEX idx_products_category_price ON products (category, price);
A partial index is also worth considering if most queries target
a specific category:
CREATE INDEX idx_products_electronics_price
ON products (price)
WHERE category = 'electronics';
5. After adding the composite index, EXPLAIN ANALYZE should show:
Index Scan using idx_products_category_price on products
with execution time well under 1 ms.
Intermediate - Design a Composite Index Strategy
Challenge: Choose the correct index column order
You have an events table with 10 million rows and the following columns:
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL,
event_type TEXT NOT NULL, -- ~20 distinct values
occurred_at TIMESTAMPTZ NOT NULL,
payload JSONB
);
Your application runs three common queries:
-- Query A (runs 10,000 times/minute):
SELECT * FROM events
WHERE user_id = $1
ORDER BY occurred_at DESC
LIMIT 20;
-- Query B (runs 500 times/minute):
SELECT * FROM events
WHERE user_id = $1
AND event_type = $2
ORDER BY occurred_at DESC
LIMIT 20;
-- Query C (runs 50 times/minute - analytics dashboard):
SELECT event_type, count(*) FROM events
WHERE occurred_at >= NOW() - INTERVAL '24 hours'
GROUP BY event_type;
Design the minimum set of indexes to cover all three queries efficiently. Justify each index and explain why you chose the column order.
Solution:
-- Index 1: Covers Query A and Query B
-- Column order: (user_id, event_type, occurred_at DESC)
--
-- Justification:
-- - user_id is an equality filter in both queries → goes first (leftmost prefix)
-- - event_type is equality in Query B → goes second (equality before ordering)
-- - occurred_at DESC supports ORDER BY occurred_at DESC for both queries
--
-- Query A uses the prefix (user_id) and leverages occurred_at for ordering.
-- Query B uses (user_id, event_type) and leverages occurred_at for ordering.
-- With LIMIT 20, PostgreSQL stops traversing after 20 matching index entries -
-- making both queries extremely fast.
CREATE INDEX idx_events_user_type_time
ON events (user_id, event_type, occurred_at DESC);
-- Index 2: Covers Query C
-- A B-tree on occurred_at supports the range filter in Query C.
-- Assuming roughly 10M rows over 30 days, the last 24 hours is ~333K rows
-- (~3% of the table) - well within the threshold where an index helps.
CREATE INDEX idx_events_occurred_at
ON events (occurred_at DESC);
-- Why no separate index for Query A alone?
-- idx_events_user_type_time with prefix (user_id) already covers Query A.
-- A separate (user_id, occurred_at) index would be partially redundant
-- and would increase write overhead unnecessarily.
-- Final result: 2 targeted indexes + 1 primary key = 3 indexes total.
-- This is a lean set that covers all three query patterns without over-indexing.
Advanced - Build an Index Advisor
Challenge: Write a Python function that analyzes a table and suggests missing indexes
Write a Python function suggest_indexes(conn, table_name) that:
- Queries
pg_stat_user_tablesto find the sequential scan count and table size - Queries
pg_stat_user_indexesto list existing indexes and their usage - Queries
pg_stat_statementsto find the top 5 slowest queries that reference that table - Prints a formatted report with current indexes, scan statistics, slow queries, and actionable recommendations
Solution:
import psycopg2
from tabulate import tabulate
def suggest_indexes(conn, table_name: str, schema: str = 'public') -> None:
"""
Analyze a table and print an index advisory report.
Data sources:
- pg_stat_user_tables - sequential scan and row statistics
- pg_stat_user_indexes - existing indexes and their usage counts
- pg_stat_statements - slow queries referencing this table
"""
# 1. Sequential scan statistics
with conn.cursor() as cur:
cur.execute(
"""
SELECT
seq_scan,
seq_tup_read,
idx_scan,
n_live_tup,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE schemaname = %s AND relname = %s
""",
(schema, table_name),
)
row = cur.fetchone()
if row is None:
print(f"Table {schema}.{table_name} not found in pg_stat_user_tables.")
return
seq_scan, seq_tup_read, idx_scan, n_live_tup, total_size = row
print(f"\n{'='*60}")
print(f"Index Advisory Report: {schema}.{table_name}")
print(f"{'='*60}")
print(f"Live rows: {n_live_tup:,}")
print(f"Table size: {total_size}")
print(f"Sequential scans: {seq_scan:,}")
print(f"Index scans: {idx_scan:,}")
if seq_scan > 0 and idx_scan is not None and idx_scan > 0:
seq_ratio = seq_scan / (seq_scan + idx_scan) * 100
flag = " WARNING: high seq scan ratio" if seq_ratio > 30 else " OK"
print(f"Seq scan ratio: {seq_ratio:.1f}%{flag}")
# 2. Existing indexes
with conn.cursor() as cur:
cur.execute(
"""
SELECT
indexrelname AS index_name,
idx_scan AS scans,
idx_tup_read AS tuples_read,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = %s AND relname = %s
ORDER BY idx_scan DESC
""",
(schema, table_name),
)
indexes = cur.fetchall()
print(f"\nExisting indexes ({len(indexes)}):")
if indexes:
print(
tabulate(
indexes,
headers=["Index Name", "Scans", "Tuples Read", "Size"],
tablefmt="psql",
)
)
else:
print(" (none)")
# 3. Slow queries referencing this table from pg_stat_statements
with conn.cursor() as cur:
cur.execute(
"""
SELECT
round(mean_exec_time::numeric, 1) AS mean_ms,
calls,
round(total_exec_time::numeric, 0) AS total_ms,
left(query, 200) AS query_preview
FROM pg_stat_statements
WHERE query ILIKE %s
AND query NOT ILIKE '%%pg_stat%%'
AND query NOT ILIKE '%%EXPLAIN%%'
ORDER BY mean_exec_time DESC
LIMIT 5
""",
(f'%%{table_name}%%',),
)
slow_queries = cur.fetchall()
print(f"\nTop 5 slowest queries referencing {table_name}:")
if slow_queries:
print(
tabulate(
slow_queries,
headers=["Mean (ms)", "Calls", "Total (ms)", "Query (truncated)"],
tablefmt="psql",
maxcolwidths=[None, None, None, 60],
)
)
else:
print(" (no data - pg_stat_statements may not be enabled)")
# 4. Actionable recommendations
print("\nRecommendations:")
recommendations = []
if seq_scan > 1000 and n_live_tup > 100_000:
recommendations.append(
f"High sequential scan count ({seq_scan:,}) on a large table ({n_live_tup:,} rows).\n"
" Run EXPLAIN ANALYZE on the slow queries above to identify missing indexes."
)
unused = [i for i in indexes if i[1] == 0 and '_pkey' not in i[0]]
if unused:
names = ', '.join(i[0] for i in unused)
recommendations.append(
f"{len(unused)} index(es) have zero scans (possibly unused):\n {names}\n"
" Verify these are not needed before dropping them."
)
if not slow_queries:
recommendations.append(
"Enable pg_stat_statements for query-level diagnostics:\n"
" CREATE EXTENSION pg_stat_statements;\n"
" Add 'pg_stat_statements' to shared_preload_libraries and restart."
)
if not recommendations:
recommendations.append("No obvious issues detected. Monitor over time as data grows.")
for i, rec in enumerate(recommendations, 1):
print(f" {i}. {rec}")
# Usage:
# conn = psycopg2.connect(DSN)
# suggest_indexes(conn, 'orders')
# suggest_indexes(conn, 'events', schema='analytics')
Key Takeaways
What an index is. A B-tree index is a separate, self-balancing data structure that maps column values to physical row locations (ctids). It enables O(log n) lookups instead of O(n) sequential scans. For a table with 1,000,000 rows, a B-tree traversal reads at most ~20 pages; a sequential scan reads up to 1,000,000.
Index types and their use cases. B-tree handles almost everything - equality, ranges, ordering, prefix matching. GIN is for JSONB documents, arrays, and full-text search. GiST is for geometric types and ranges. BRIN is a tiny index for massive append-only tables with naturally ordered columns. Hash is for equality-only lookups on high-cardinality columns.
Creating indexes correctly.
Use CREATE INDEX CONCURRENTLY in production to avoid blocking writes. Partial indexes reduce size and write overhead when only a subset of rows is queried. Expression indexes are required when queries use functions like lower() or JSON extraction operators.
Reading EXPLAIN ANALYZE.
Seq Scan means every row is being read - look for a missing index. Index Scan means the B-tree is used but heap fetches are still needed. Index Only Scan is the fastest - all needed columns are in the index itself. Pay attention to Rows Removed by Filter to understand waste.
The leftmost prefix rule.
A composite index (a, b, c) supports queries on a, on (a, b), or on (a, b, c) - but not on b alone or c alone. Put equality columns before range columns to maximize how many columns the traversal uses before the chain breaks.
When indexes hurt.
Every index is a tax on every INSERT, UPDATE, and DELETE. Over-indexing - adding indexes on every column or adding redundant indexes - can make write-heavy workloads 3–5x slower. Audit unused indexes with pg_stat_user_indexes and drop them after verification.
Maintenance matters.
Run VACUUM ANALYZE after large bulk operations to reclaim space and refresh planner statistics. Use REINDEX CONCURRENTLY to rebuild bloated indexes in production. Monitor index health regularly with pg_stat_user_indexes and pg_stat_statements.
What's Next
You now understand how databases use indexes to turn 45-second queries into sub-millisecond lookups - and the real costs that come with that power. You can read query plans, design composite indexes, find slow queries in production, and maintain indexes over time.
The next lesson introduces SQLAlchemy ORM, Python's most powerful and widely-used object-relational mapper. You will learn how SQLAlchemy defines models as Python classes, how it generates SQL from method chains, how to manage schema migrations with Alembic, and - critically - how to apply everything you know about indexes and EXPLAIN ANALYZE when SQLAlchemy-generated queries are unexpectedly slow.
Next: 06 - ORM with SQLAlchemy
Module 07: Databases | Lesson 5 of 7
