Database Migrations with Alembic
Reading time: ~40 minutes | Level: Intermediate → Engineering
Before reading further, consider this puzzle:
# Friday, 5:00 PM. You add a new column to the users table:
# ALTER TABLE users ADD COLUMN last_login TIMESTAMP NOT NULL;
# You deploy the new application code. The app crashes immediately.
# What happened?
Your old application pod (still running during the rolling deployment) executes this query:
INSERT INTO users (email, username, password_hash)
The database rejects it: null value in column "last_login" violates not-null constraint. The old code does not know about last_login. It does not supply a value. The column has no default. The database enforces the constraint.
The app crashes - under load, on a Friday.
This puzzle captures the core challenge of database schema evolution: your application code and your database schema are two separate things that must stay compatible across deployments. You cannot update both atomically. If you touch the schema before deploying code, old code breaks. If you deploy code before touching the schema, new code breaks. The solution requires a disciplined, ordered process - and tooling designed for exactly this problem.
That tool is Alembic.
What You Will Learn
- What database migrations are and why
create_all()is not enough for production - Setting up Alembic:
alembic init,env.py,alembic.ini, and folder structure - Creating migrations with
--autogenerateand what it can and cannot detect - Reading a migration file:
upgrade(),downgrade(), and theopobject - Running migrations:
upgrade head,downgrade -1,current,history - Migration best practices: immutability, downgrade testing, data migrations
- Zero-downtime deployments: the expand/contract pattern for column renames
- Running migrations in CI/CD: Docker entrypoints and pre-start scripts
- Multi-environment setup: dev, staging, production with environment variable URLs
Prerequisites
- Lesson 06 (ORM with SQLAlchemy) - Alembic reads your SQLAlchemy models to autogenerate migrations
- Lesson 04 (Transactions) - migrations run inside transactions; understanding rollback is essential
- Basic command-line familiarity - Alembic is a CLI tool
Part 1 - What Migrations Are and Why You Need Them
The Limitation of create_all()
In development, you likely call Base.metadata.create_all(engine) at startup. This is convenient: it inspects your models and creates any tables that do not exist yet. But it has a critical limitation - it only creates; it never alters.
If your User model gains a new column phone_number, create_all() does nothing to the existing users table. Your model says the column exists; your database says it does not. Queries that reference phone_number fail.
The naive fix - drop_all() then create_all() - deletes all your data. In production, that is not a fix; it is a disaster.
Schema as Code
A migration is a versioned, executable script that describes a single schema change - "add this column", "create this index", "rename this table". Each migration has an upgrade() function (apply the change) and a downgrade() function (reverse it).
Migrations give you:
| Property | What it means |
|---|---|
| Version history | Every schema change is recorded and auditable |
| Reproducibility | Any environment can reach the current schema by running all migrations in order |
| Rollback | A broken deployment can revert the schema change with downgrade |
| Collaboration | Multiple engineers' schema changes are tracked individually, like Git commits |
| CI/CD integration | Schema changes are deployed automatically, not manually |
Think of migrations as Git for your database schema. Each migration file is a commit. alembic upgrade head is git pull. alembic downgrade -1 is git revert.
Alembic and SQLAlchemy
Alembic is the official migration tool for SQLAlchemy, maintained by the same author (Mike Bayer). It integrates tightly with SQLAlchemy's metadata system: it can inspect your model definitions and compare them against the live database schema to automatically generate migration scripts.
pip install alembic sqlalchemy
Part 2 - Alembic Setup
Initializing a Project
Run this once in your project root:
alembic init alembic
This creates the following structure:
my_project/
├── alembic/
│ ├── env.py # Migration environment - controls how Alembic connects
│ ├── script.py.mako # Template for new migration files
│ └── versions/ # Generated migration scripts live here
│ └── (empty at first)
├── alembic.ini # Alembic configuration file
├── app/
│ ├── models.py # Your SQLAlchemy models
│ └── database.py # Engine and session setup
└── main.py
alembic.ini
The alembic.ini file is the primary configuration file. The most important setting is sqlalchemy.url:
# alembic.ini
[alembic]
# Path to the migration scripts
script_location = alembic
# The database URL - we will override this with env.py for multi-environment setups
sqlalchemy.url = postgresql+psycopg2://user:password@localhost:5432/mydb
# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARN
handlers = console
qualname =
[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
Never commit real database credentials in alembic.ini. The sqlalchemy.url in alembic.ini is a fallback. In the next section, you will configure env.py to read the URL from an environment variable instead. Commit alembic.ini with a placeholder URL like postgresql://user:password@localhost/mydb.
env.py - The Migration Environment
env.py is a Python script that Alembic executes every time you run a migration command. It is responsible for:
- Connecting to the database
- Pointing Alembic at your model metadata (for autogenerate)
- Running the migration in the correct context (online vs offline)
Here is a production-ready env.py for a FastAPI application:
# alembic/env.py
import os
from logging.config import fileConfig
from sqlalchemy import engine_from_config, pool
from alembic import context
# Import your Base so Alembic can compare models vs live schema
# Adjust this import to match your project layout
from app.models import Base
# -----------------------------------------------------------------
# Alembic Config object - gives access to alembic.ini values
# -----------------------------------------------------------------
config = context.config
# Override sqlalchemy.url with the DATABASE_URL environment variable.
# This is the correct multi-environment pattern - never hardcode credentials.
database_url = os.environ.get("DATABASE_URL")
if database_url:
config.set_main_option("sqlalchemy.url", database_url)
# Set up Python logging from alembic.ini
if config.config_file_name is not None:
fileConfig(config.config_file_name)
# -----------------------------------------------------------------
# target_metadata tells Alembic what your models look like.
# Without this, autogenerate cannot work.
# -----------------------------------------------------------------
target_metadata = Base.metadata
def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode.
Offline mode generates SQL scripts without connecting to the database.
Useful for reviewing what will be executed before running it.
"""
url = config.get_main_option("sqlalchemy.url")
context.configure(
url=url,
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
# Compare server defaults so autogenerate picks up default changes
compare_server_defaults=True,
)
with context.begin_transaction():
context.run_migrations()
def run_migrations_online() -> None:
"""Run migrations in 'online' mode - connects to the DB directly."""
connectable = engine_from_config(
config.get_section(config.config_ini_section, {}),
prefix="sqlalchemy.",
poolclass=pool.NullPool, # NullPool: don't reuse connections in migrations
)
with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_server_defaults=True,
)
with context.begin_transaction():
context.run_migrations()
if context.is_offline_mode():
run_migrations_offline()
else:
run_migrations_online()
pool.NullPool is intentional. The Alembic process is a short-lived script, not a long-running server. Using a connection pool would open connections and never return them. NullPool creates a fresh connection each time and closes it immediately - the correct behaviour for a migration runner.
Part 3 - Creating Migrations
autogenerate: The Core Workflow
With target_metadata pointing at your models, Alembic can compare your model definitions against the live database schema and generate a migration script automatically:
# Make sure your DATABASE_URL is set, then:
alembic revision --autogenerate -m "add users table"
This command:
- Connects to the database
- Reads the live schema (using
INFORMATION_SCHEMAor dialect-specific introspection) - Compares it against your
Base.metadata - Generates a migration script in
alembic/versions/describing the diff
The output looks like:
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.autogenerate.compare] Detected added table 'users'
INFO [alembic.autogenerate.compare] Detected added index 'ix_users_email' on '['email']'
Generating /my_project/alembic/versions/3a8f1c2d4b7e_add_users_table.py ... done
What autogenerate CAN Detect
| Change | Detected? |
|---|---|
| Added table | Yes |
| Dropped table | Yes |
| Added column | Yes |
| Dropped column | Yes |
| Changed column type | Yes (with caveats) |
| Added index | Yes |
| Dropped index | Yes |
| Added unique constraint | Yes |
| Added foreign key | Yes |
Changed server_default | Yes (if compare_server_defaults=True) |
| Added check constraint | Yes (SQLAlchemy-defined only) |
What autogenerate CANNOT Detect
autogenerate has real limitations. Always review generated migration files before applying them. Never blindly run an autogenerated migration in production.
The following changes are NOT detected:
- Column renames - Alembic sees a dropped column and an added column, not a rename. If you run this migration, you lose the column's data.
- Table renames - same problem. Alembic drops the old table (and all its data) and creates a new one.
- Custom SQL types not registered with SQLAlchemy (e.g., PostgreSQL enums created outside SQLAlchemy)
- Stored procedures and functions
- Triggers
- Views
- Partial indexes (PostgreSQL
WHEREclause on an index) - Row-level security policies
For these changes, you must write the migration manually.
Writing a Migration Manually
# Generate an empty migration file
alembic revision -m "add phone_number index"
Then edit the generated file:
def upgrade() -> None:
op.create_index(
"ix_users_phone_number",
"users",
["phone_number"],
unique=False,
postgresql_where=sa.text("phone_number IS NOT NULL"), # partial index
)
def downgrade() -> None:
op.drop_index("ix_users_phone_number", table_name="users")
Part 4 - Reading a Migration File
Here is a complete, annotated migration file. This is what Alembic generates when you add a users table and an orders table to your models:
# alembic/versions/3a8f1c2d4b7e_add_users_and_orders_tables.py
"""add users and orders tables
Revision ID: 3a8f1c2d4b7e
Revises: # empty string means this is the first migration
Create Date: 2026-03-05 10:23:41.123456
"""
from typing import Annotated
import sqlalchemy as sa
from alembic import op
# ----------------------------------------------------------------
# These two values form the migration chain.
# revision = this migration's unique ID (randomly generated)
# down_revision = the ID of the migration this one builds on top of.
# None means "this is the root migration".
# ----------------------------------------------------------------
revision: str = "3a8f1c2d4b7e"
down_revision: str | None = None
branch_labels: str | None = None
depends_on: str | None = None
def upgrade() -> None:
"""Apply the schema change - runs on 'alembic upgrade'."""
# op.create_table() is a context manager AND returns a Table object.
# All columns and constraints are defined inline.
op.create_table(
"users",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("email", sa.String(length=255), nullable=False),
sa.Column("username", sa.String(length=100), nullable=False),
sa.Column("password_hash", sa.String(length=255), nullable=False),
sa.Column("is_active", sa.Boolean(), nullable=False, server_default="true"),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("bio", sa.Text(), nullable=True),
sa.PrimaryKeyConstraint("id"), # PRIMARY KEY (id)
sa.UniqueConstraint("email"), # UNIQUE (email)
sa.UniqueConstraint("username"), # UNIQUE (username)
)
# Indexes are created separately from the table
op.create_index(op.f("ix_users_email"), "users", ["email"], unique=True)
op.create_table(
"orders",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("user_id", sa.Integer(), nullable=False),
sa.Column("total_amount", sa.Numeric(precision=10, scale=2), nullable=False),
sa.Column("status", sa.String(length=50), nullable=False, server_default="pending"),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.ForeignKeyConstraint(["user_id"], ["users.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(op.f("ix_orders_user_id"), "orders", ["user_id"], unique=False)
def downgrade() -> None:
"""Reverse the schema change - runs on 'alembic downgrade'."""
# IMPORTANT: reverse order of creation.
# Orders must be dropped before Users because of the foreign key constraint.
op.drop_index(op.f("ix_orders_user_id"), table_name="orders")
op.drop_table("orders")
op.drop_index(op.f("ix_users_email"), table_name="users")
op.drop_table("users")
The Second Migration - Adding a Column
# alembic/versions/9c3e5f8a1d2b_add_phone_number_to_users.py
"""add phone_number to users
Revision ID: 9c3e5f8a1d2b
Revises: 3a8f1c2d4b7e # builds on top of the first migration
Create Date: 2026-03-10 14:05:22.654321
"""
import sqlalchemy as sa
from alembic import op
revision: str = "9c3e5f8a1d2b"
down_revision: str | None = "3a8f1c2d4b7e" # points to previous migration
branch_labels: str | None = None
depends_on: str | None = None
def upgrade() -> None:
op.add_column(
"users",
sa.Column(
"phone_number",
sa.String(length=20),
nullable=True, # NULLABLE - so existing rows are not broken
),
)
# Add an index after the column exists
op.create_index("ix_users_phone_number", "users", ["phone_number"], unique=False)
def downgrade() -> None:
op.drop_index("ix_users_phone_number", table_name="users")
op.drop_column("users", "phone_number")
Common op Operations Reference
| Operation | Method |
|---|---|
| Create a table | op.create_table("name", *columns) |
| Drop a table | op.drop_table("name") |
| Add a column | op.add_column("table", sa.Column(...)) |
| Drop a column | op.drop_column("table", "column_name") |
| Rename a column | op.alter_column("table", "old_name", new_column_name="new_name") |
| Change column type | op.alter_column("table", "col", type_=sa.Text()) |
| Set/clear server default | op.alter_column("table", "col", server_default="now()") |
| Set/clear nullable | op.alter_column("table", "col", nullable=False) |
| Create an index | op.create_index("name", "table", ["col"]) |
| Drop an index | op.drop_index("name", table_name="table") |
| Add a foreign key | op.create_foreign_key("fk_name", "source", "referent", ["col"], ["col"]) |
| Execute raw SQL | op.execute("UPDATE users SET is_active = true WHERE created_at > '2025-01-01'") |
Part 5 - Running Migrations
The Migration Version Table
When Alembic first runs, it creates a table called alembic_version in your database:
CREATE TABLE alembic_version (
version_num VARCHAR(32) NOT NULL,
CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
);
This table has exactly one row: the revision ID of the currently applied migration. Alembic reads this table to know where you are and where you need to go.
Core Commands
Check current state:
$ alembic current
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
3a8f1c2d4b7e (head)
The (head) label means you are at the latest migration.
View migration history:
$ alembic history --verbose
Rev: 9c3e5f8a1d2b (head)
Parent: 3a8f1c2d4b7e
Path: alembic/versions/9c3e5f8a1d2b_add_phone_number_to_users.py
add phone_number to users
Create Date: 2026-03-10 14:05:22.654321
Rev: 3a8f1c2d4b7e
Parent: <base>
Path: alembic/versions/3a8f1c2d4b7e_add_users_and_orders_tables.py
add users and orders tables
Create Date: 2026-03-05 10:23:41.123456
Apply all pending migrations:
$ alembic upgrade head
INFO [alembic.runtime.migration] Running upgrade -> 3a8f1c2d4b7e, add users and orders tables
INFO [alembic.runtime.migration] Running upgrade 3a8f1c2d4b7e -> 9c3e5f8a1d2b, add phone_number to users
Apply a specific number of steps:
$ alembic upgrade +1 # apply one migration forward
$ alembic upgrade +2 # apply two migrations forward
Apply to a specific revision:
$ alembic upgrade 3a8f1c2d4b7e # upgrade to (or stay at) this exact revision
Roll back one migration:
$ alembic downgrade -1
INFO [alembic.runtime.migration] Running downgrade 9c3e5f8a1d2b -> 3a8f1c2d4b7e, add phone_number to users
Roll back to the very beginning (no migrations applied):
$ alembic downgrade base
INFO [alembic.runtime.migration] Running downgrade 9c3e5f8a1d2b -> 3a8f1c2d4b7e, add phone_number to users
INFO [alembic.runtime.migration] Running downgrade 3a8f1c2d4b7e -> , add users and orders tables
Generate SQL without executing (offline mode - useful for review):
$ alembic upgrade head --sql
-- Running upgrade -> 3a8f1c2d4b7e
CREATE TABLE users (
id SERIAL NOT NULL,
email VARCHAR(255) NOT NULL,
...
);
-- output continues...
alembic upgrade head --sql is extremely useful in two situations: (1) reviewing exactly what SQL will run before a production deployment, and (2) in environments where your application does not have DDL permissions - you can pipe the output to a DBA who runs it manually.
Migration Workflow Diagram
Part 6 - Migration Best Practices
Rule 1: Migration Files Are Immutable
Once a migration has been applied to any shared environment (staging, production), it must never be edited. Alembic tracks the revision ID (a hash of the file). If you edit the file content without changing the revision ID, other developers and environments will see it as already applied - and your edit will silently never run.
If you made a mistake in a migration:
- Write a new migration that corrects the mistake
- Never edit the old one
Rule 2: Always Write and Test downgrade()
downgrade() is your emergency eject button. The moment you need it is the worst moment to discover it does not work.
After creating any migration, test the full round-trip locally:
alembic upgrade head # apply
alembic downgrade -1 # reverse
alembic upgrade head # apply again
# Verify your application still works
Many engineers write pass in downgrade() to skip the work. This is a trap. The first time you need to roll back a bad production deployment, a no-op downgrade() will leave your database in a state that the old code cannot run against - making the outage worse, not better.
Rule 3: Data Migrations Live in Migrations
If a schema change requires transforming existing data (backfilling a new column, splitting a column, normalising data), the data migration should live in the migration file itself - in upgrade().
def upgrade() -> None:
# Step 1: add the column nullable first
op.add_column("users", sa.Column("full_name", sa.String(200), nullable=True))
# Step 2: backfill from existing columns using bulk UPDATE
op.execute("""
UPDATE users
SET full_name = first_name || ' ' || last_name
WHERE first_name IS NOT NULL OR last_name IS NOT NULL
""")
# Step 3: now that all rows are populated, enforce NOT NULL
op.alter_column("users", "full_name", nullable=False)
# Step 4 (in a separate migration): drop the old columns
# (see expand/contract pattern in Part 7)
Rule 4: One Logical Change Per Migration
Resist the temptation to bundle many schema changes into one migration file. Small, focused migrations are:
- Easier to review
- Safer to roll back (you roll back only the one change that broke things, not five)
- Easier to understand in
alembic history
Rule 5: Keep Migration Files in Version Control
The alembic/versions/ directory must be committed to Git alongside your application code. A migration that exists only on one developer's laptop is a deployment accident waiting to happen.
Part 7 - Zero-Downtime Migrations
The Core Problem
A zero-downtime deployment uses rolling restarts: at any point during a deployment, both the old version of your application and the new version are running simultaneously against the same database. This means your database schema must be simultaneously compatible with both the old code and the new code.
This rules out many straightforward schema changes:
| Change | Zero-downtime safe? | Why |
|---|---|---|
| Add nullable column with default | Yes | Old code ignores it; new code uses it |
| Add NOT NULL column without default | No | Old code's INSERT breaks immediately |
| Drop a column | No | Old code still references it |
| Rename a column | No | Old code uses old name, new code uses new name |
| Add an index (CONCURRENT) | Yes | Non-blocking; does not hold a table lock |
| Drop a table | No | Old code still queries it |
The Expand/Contract Pattern
The expand/contract pattern is the standard technique for making breaking schema changes safely. It replaces one risky migration with three safe phases:
Phase 1 - Expand (deploy before new code): Add the new column alongside the old one. Both columns exist. Both old and new code can run.
Phase 2 - Migrate (run with new code): Deploy the new application code that writes to both columns. Backfill the new column for all existing rows.
Phase 3 - Contract (deploy after old code is gone): Once no old-code pods remain, remove the old column.
Zero-Downtime Column Rename: Example
Goal: rename users.full_name to users.display_name.
Migration 1 - Expand:
def upgrade() -> None:
# Add the new column - nullable so old INSERTs still work
op.add_column(
"users",
sa.Column("display_name", sa.String(200), nullable=True),
)
def downgrade() -> None:
op.drop_column("users", "display_name")
Application code update (deploy alongside Migration 1):
class User(Base):
__tablename__ = "users"
# ... other columns ...
full_name: Mapped[Optional[str]] = mapped_column(String(200), nullable=True)
display_name: Mapped[Optional[str]] = mapped_column(String(200), nullable=True)
@property
def effective_display_name(self) -> str:
"""During transition: prefer new column, fall back to old."""
return self.display_name or self.full_name or ""
Migration 2 - Backfill + enforce NOT NULL:
def upgrade() -> None:
# Backfill the new column from the old one
op.execute("""
UPDATE users
SET display_name = full_name
WHERE display_name IS NULL AND full_name IS NOT NULL
""")
# Now safe to enforce NOT NULL
op.alter_column("users", "display_name", nullable=False, server_default="")
def downgrade() -> None:
op.alter_column("users", "display_name", nullable=True, server_default=None)
Migration 3 - Contract (after old code is fully retired):
def upgrade() -> None:
op.drop_column("users", "full_name")
def downgrade() -> None:
# Restore the column - data is gone, but schema is reversible
op.add_column(
"users",
sa.Column("full_name", sa.String(200), nullable=True),
)
The expand/contract pattern feels slow - three migrations instead of one. But each phase is independently safe and independently reversible. The single-migration rename is fast to write and almost guaranteed to cause a production outage during a rolling deployment.
Adding an Index Without Locking the Table
A regular CREATE INDEX acquires a full table lock, blocking all reads and writes until the index build completes. On a large table, this can take minutes. Use PostgreSQL's CONCURRENTLY option:
def upgrade() -> None:
op.create_index(
"ix_users_email",
"users",
["email"],
unique=True,
postgresql_concurrently=True, # Non-blocking index build
)
def downgrade() -> None:
op.drop_index(
"ix_users_email",
table_name="users",
postgresql_concurrently=True,
)
CREATE INDEX CONCURRENTLY cannot run inside a transaction. Alembic wraps migrations in transactions by default. To use CONCURRENTLY, you need to execute the statement with op.execute() and set transactional_ddl = False at the module level in the migration file, or use with op.get_context().autocommit_block(): in newer versions of Alembic. Check the Alembic docs for your specific version.
Part 8 - Running Migrations in CI/CD
Where Migrations Should Run
Migrations must run after the database is reachable and before the new application code starts serving traffic. The correct order is:
1. Database is running and accessible
2. Run: alembic upgrade head
3. Wait for successful exit (exit code 0)
4. Start / restart the application pods
If the migration fails (bad SQL, constraint violation, etc.), the deployment stops. Old application code continues running against the old schema - your users are unaffected.
Docker Entrypoint Pattern
The most common pattern in containerised applications is a startup script:
#!/bin/bash
# entrypoint.sh - runs before the application starts
set -e # exit immediately if any command fails
echo "Waiting for database to be ready..."
# Wait for PostgreSQL to accept connections
python -c "
import time, psycopg2, os
db_url = os.environ['DATABASE_URL']
for attempt in range(30):
try:
conn = psycopg2.connect(db_url)
conn.close()
print('Database is ready.')
break
except psycopg2.OperationalError:
print(f'Attempt {attempt + 1}/30 - database not ready, waiting 2s...')
time.sleep(2)
else:
print('Database did not become ready in time. Exiting.')
exit(1)
"
echo "Running database migrations..."
alembic upgrade head
echo "Starting application..."
exec uvicorn app.main:app --host 0.0.0.0 --port 8000
# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Make entrypoint executable
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
Kubernetes Init Container Pattern
For Kubernetes deployments, a cleaner separation is to run migrations in an init container that must succeed before the main application container starts:
# deployment.yaml (excerpt)
spec:
initContainers:
- name: run-migrations
image: myapp:latest
command: ["alembic", "upgrade", "head"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
containers:
- name: app
image: myapp:latest
command: ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
Kubernetes will not start the app container until run-migrations exits with code 0. If migrations fail, the deployment halts with a clear error message.
Always back up your production database before running migrations.
Even with perfect migrations, unexpected things happen: a bug in a data migration corrupts rows, a DROP COLUMN discards data you thought was safe to remove, a migration takes 45 minutes and locks a critical table. A backup means "we can recover." No backup means the damage is permanent.
# PostgreSQL backup before migration
pg_dump -Fc --no-acl --no-owner \
-h $DB_HOST -U $DB_USER $DB_NAME \
> backup_$(date +%Y%m%d_%H%M%S).dump
# Then run the migration
alembic upgrade head
For managed PostgreSQL services (AWS RDS, Google Cloud SQL, Supabase), take a manual snapshot through the console before any significant migration. It takes 2 minutes and has saved production data many times.
Part 9 - Multi-Environment Setup
The Problem
You have three environments: local development, staging, and production. Each has a different database URL. You cannot hardcode any of them.
Environment Variable Pattern
The env.py configuration shown in Part 2 already reads from DATABASE_URL. The full pattern looks like this:
Environment DATABASE_URL
───────────── ──────────────────────────────────────────────
Local dev postgresql://dev_user:dev_pass@localhost/myapp_dev
Staging postgresql://staging_user:[email protected]/myapp_staging
Production postgresql://prod_user:[email protected]/myapp_prod
Set the variable before running Alembic:
# Local development - often set in .env file or shell profile
export DATABASE_URL="postgresql://dev_user:dev_pass@localhost/myapp_dev"
alembic upgrade head
# Staging - set in CI/CD pipeline secrets
DATABASE_URL="postgresql://..." alembic upgrade head
# Production - set in Kubernetes secret (see Part 8)
Using python-dotenv for Local Development
In local development, it is convenient to load DATABASE_URL from a .env file automatically when running Alembic:
# alembic/env.py - add near the top, after imports
from dotenv import load_dotenv
# Load .env file if it exists - for local development convenience
# In CI/CD and production, real environment variables take precedence
load_dotenv()
# .env (never commit this file - add to .gitignore)
DATABASE_URL=postgresql://dev_user:dev_pass@localhost/myapp_dev
Multi-Database Projects
Some applications use separate databases for different services. Alembic supports this via named configuration sections in alembic.ini and a custom env.py. For most projects, a single alembic/ directory per application service is cleaner.
my_project/
├── services/
│ ├── user_service/
│ │ ├── alembic/ # migrations for user_service DB
│ │ └── alembic.ini
│ └── order_service/
│ ├── alembic/ # migrations for order_service DB
│ └── alembic.ini
Full Example - Alembic with FastAPI
This is the complete, runnable setup for a FastAPI application with Alembic.
Project Structure
fastapi_alembic_demo/
├── alembic/
│ ├── env.py
│ ├── script.py.mako
│ └── versions/
│ └── 3a8f1c2d4b7e_initial_schema.py
├── app/
│ ├── __init__.py
│ ├── database.py # Engine and session factory
│ ├── models.py # SQLAlchemy models
│ └── main.py # FastAPI application
├── alembic.ini
├── .env
└── requirements.txt
app/database.py
# app/database.py
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
from typing import Generator
DATABASE_URL = os.environ["DATABASE_URL"]
engine = create_engine(
DATABASE_URL,
pool_size=10,
max_overflow=20,
pool_pre_ping=True,
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
def get_db() -> Generator[Session, None, None]:
"""FastAPI dependency that provides a database session per request."""
db = SessionLocal()
try:
yield db
db.commit()
except Exception:
db.rollback()
raise
finally:
db.close()
app/models.py
# app/models.py
from datetime import datetime
from typing import Optional
from sqlalchemy import String, Text, DateTime, ForeignKey, Numeric, func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False, index=True)
username: Mapped[str] = mapped_column(String(100), unique=True, nullable=False)
password_hash: Mapped[str] = mapped_column(String(255), nullable=False)
is_active: Mapped[bool] = mapped_column(default=True, nullable=False)
bio: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), nullable=False
)
orders: Mapped[list["Order"]] = relationship(back_populates="user")
def __repr__(self) -> str:
return f"<User id={self.id} email={self.email!r}>"
class Order(Base):
__tablename__ = "orders"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
user_id: Mapped[int] = mapped_column(ForeignKey("users.id", ondelete="CASCADE"))
total_amount: Mapped[float] = mapped_column(Numeric(10, 2), nullable=False)
status: Mapped[str] = mapped_column(String(50), nullable=False, server_default="pending")
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), nullable=False
)
user: Mapped["User"] = relationship(back_populates="orders")
def __repr__(self) -> str:
return f"<Order id={self.id} user_id={self.user_id} status={self.status!r}>"
app/main.py
# app/main.py
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from sqlalchemy import select
from app.database import get_db
from app.models import User, Order
app = FastAPI(title="Alembic Demo API")
@app.get("/health")
def health_check():
return {"status": "ok"}
@app.get("/users/{user_id}")
def get_user(user_id: int, db: Session = Depends(get_db)):
user = db.scalar(select(User).where(User.id == user_id))
if user is None:
raise HTTPException(status_code=404, detail="User not found")
return {
"id": user.id,
"email": user.email,
"username": user.username,
"is_active": user.is_active,
}
@app.get("/users/{user_id}/orders")
def get_user_orders(user_id: int, db: Session = Depends(get_db)):
orders = db.scalars(
select(Order).where(Order.user_id == user_id)
).all()
return [
{
"id": o.id,
"total_amount": float(o.total_amount),
"status": o.status,
"created_at": o.created_at.isoformat(),
}
for o in orders
]
alembic/versions/3a8f1c2d4b7e_initial_schema.py
"""initial schema
Revision ID: 3a8f1c2d4b7e
Revises:
Create Date: 2026-03-05 10:23:41.123456
"""
import sqlalchemy as sa
from alembic import op
revision: str = "3a8f1c2d4b7e"
down_revision: str | None = None
branch_labels: str | None = None
depends_on: str | None = None
def upgrade() -> None:
op.create_table(
"users",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("email", sa.String(length=255), nullable=False),
sa.Column("username", sa.String(length=100), nullable=False),
sa.Column("password_hash", sa.String(length=255), nullable=False),
sa.Column("is_active", sa.Boolean(), nullable=False, server_default="true"),
sa.Column("bio", sa.Text(), nullable=True),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("email"),
sa.UniqueConstraint("username"),
)
op.create_index(op.f("ix_users_email"), "users", ["email"], unique=True)
op.create_table(
"orders",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("user_id", sa.Integer(), nullable=False),
sa.Column("total_amount", sa.Numeric(precision=10, scale=2), nullable=False),
sa.Column("status", sa.String(length=50), nullable=False, server_default="pending"),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.ForeignKeyConstraint(["user_id"], ["users.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(op.f("ix_orders_user_id"), "orders", ["user_id"], unique=False)
def downgrade() -> None:
op.drop_index(op.f("ix_orders_user_id"), table_name="orders")
op.drop_table("orders")
op.drop_index(op.f("ix_users_email"), table_name="users")
op.drop_table("users")
Running the Full Setup
# 1. Install dependencies
pip install fastapi uvicorn sqlalchemy alembic psycopg2-binary python-dotenv
# 2. Set up your .env file
echo 'DATABASE_URL=postgresql://dev_user:dev_pass@localhost/myapp_dev' > .env
# 3. Create the database (if it doesn't exist)
createdb myapp_dev
# 4. Run migrations
alembic upgrade head
# 5. Start the application
uvicorn app.main:app --reload
# 6. Verify
curl http://localhost:8000/health
# {"status": "ok"}
Practice Challenges
Beginner - Add a Profile Picture Column
Your User model needs a profile_picture_url column to store the URL of a user's avatar. The column should be optional (users may not have a profile picture).
Requirements:
- Column name:
profile_picture_url - Type:
String(500) - Nullable:
True - Safe for zero-downtime deployment (existing rows must not break)
Solution - Beginner
Step 1: Update the model
# app/models.py - add to User class
profile_picture_url: Mapped[Optional[str]] = mapped_column(
String(500), nullable=True
)
Step 2: Generate the migration
alembic revision --autogenerate -m "add profile_picture_url to users"
Step 3: Review the generated migration
# alembic/versions/abc123_add_profile_picture_url_to_users.py
def upgrade() -> None:
op.add_column(
"users",
sa.Column("profile_picture_url", sa.String(length=500), nullable=True),
)
def downgrade() -> None:
op.drop_column("users", "profile_picture_url")
Step 4: Apply
alembic upgrade head
Step 5: Test the downgrade
alembic downgrade -1
alembic upgrade head
This migration is safe for zero-downtime deployment because the column is nullable - old application code does not know about it and will not include it in INSERT statements, which is fine since the database will store NULL.
Intermediate - Add an Order Status Audit Table
You want to track every time an order's status changes. Design and create a migration for an order_status_history table that records the old status, new status, changed timestamp, and which user made the change.
Requirements:
id(primary key)order_id(foreign key toorders.id,ON DELETE CASCADE)changed_by_user_id(foreign key tousers.id,ON DELETE SET NULL, nullable)old_status(String 50, nullable - first change has no old status)new_status(String 50, not null)changed_at(DateTime with timezone, server defaultnow())- Index on
order_idfor fast lookup
Solution - Intermediate
Step 1: Update the model
# app/models.py - new model
class OrderStatusHistory(Base):
__tablename__ = "order_status_history"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
order_id: Mapped[int] = mapped_column(
ForeignKey("orders.id", ondelete="CASCADE"), nullable=False
)
changed_by_user_id: Mapped[Optional[int]] = mapped_column(
ForeignKey("users.id", ondelete="SET NULL"), nullable=True
)
old_status: Mapped[Optional[str]] = mapped_column(String(50), nullable=True)
new_status: Mapped[str] = mapped_column(String(50), nullable=False)
changed_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), server_default=func.now(), nullable=False
)
order: Mapped["Order"] = relationship()
changed_by: Mapped[Optional["User"]] = relationship()
Step 2: Generate the migration
alembic revision --autogenerate -m "add order_status_history table"
Step 3: Expected migration content
def upgrade() -> None:
op.create_table(
"order_status_history",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("order_id", sa.Integer(), nullable=False),
sa.Column("changed_by_user_id", sa.Integer(), nullable=True),
sa.Column("old_status", sa.String(length=50), nullable=True),
sa.Column("new_status", sa.String(length=50), nullable=False),
sa.Column(
"changed_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.ForeignKeyConstraint(
["changed_by_user_id"], ["users.id"], ondelete="SET NULL"
),
sa.ForeignKeyConstraint(
["order_id"], ["orders.id"], ondelete="CASCADE"
),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
op.f("ix_order_status_history_order_id"),
"order_status_history",
["order_id"],
unique=False,
)
def downgrade() -> None:
op.drop_index(
op.f("ix_order_status_history_order_id"),
table_name="order_status_history",
)
op.drop_table("order_status_history")
Step 4: Test the round-trip
alembic upgrade head
# Verify table exists: psql -c "\d order_status_history"
alembic downgrade -1
# Verify table is gone
alembic upgrade head
Advanced - Zero-Downtime Username Split Migration
The users table currently has a single username column (String(100)). The product team wants to split it into first_name (String(100)) and last_name (String(100)), while keeping the application running without downtime.
The current table has 500,000 rows. Usernames are in the format "firstname_lastname" (e.g., "john_doe"). Some usernames have no underscore.
Requirements:
- Design the full expand/contract migration sequence (at least 3 migrations)
- Write each migration file with correct
upgrade()anddowngrade() - The data migration must handle the edge case where there is no underscore in the username
- Each phase must be independently deployable without breaking running application code
Solution - Advanced
Overview of the three-phase plan:
Migration A: Add first_name and last_name columns (nullable)
→ Deploy app code that writes to all three columns, reads from new columns
Migration B: Backfill from username, enforce NOT NULL
→ All code now exclusively uses first_name/last_name
Migration C: Drop username column
→ Old code is completely retired
Migration A - Expand:
# alembic/versions/expand_add_name_columns.py
"""expand: add first_name and last_name to users
Revision ID: e1a2b3c4d5f6
Revises: 3a8f1c2d4b7e
"""
import sqlalchemy as sa
from alembic import op
revision = "e1a2b3c4d5f6"
down_revision = "3a8f1c2d4b7e"
def upgrade() -> None:
op.add_column("users", sa.Column("first_name", sa.String(100), nullable=True))
op.add_column("users", sa.Column("last_name", sa.String(100), nullable=True))
def downgrade() -> None:
op.drop_column("users", "last_name")
op.drop_column("users", "first_name")
Application code during transition (deploy with Migration A):
class User(Base):
__tablename__ = "users"
# Keep all three columns during transition
username: Mapped[str] = mapped_column(String(100), nullable=False)
first_name: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
last_name: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
@classmethod
def create(
cls,
username: str,
first_name: str,
last_name: str,
**kwargs,
) -> "User":
"""During transition, write all three columns."""
return cls(
username=username,
first_name=first_name,
last_name=last_name,
**kwargs,
)
Migration B - Backfill + enforce NOT NULL:
# alembic/versions/backfill_name_columns.py
"""migrate: backfill first_name and last_name from username
Revision ID: f2b3c4d5e6a7
Revises: e1a2b3c4d5f6
"""
import sqlalchemy as sa
from alembic import op
revision = "f2b3c4d5e6a7"
down_revision = "e1a2b3c4d5f6"
def upgrade() -> None:
# Backfill rows where first_name is still NULL.
# Handles "firstname_lastname" format and fallback for no-underscore usernames.
op.execute("""
UPDATE users
SET
first_name = CASE
WHEN username LIKE '%\_%' ESCAPE '\\'
THEN SPLIT_PART(username, '_', 1)
ELSE username
END,
last_name = CASE
WHEN username LIKE '%\_%' ESCAPE '\\'
THEN SUBSTRING(username FROM POSITION('_' IN username) + 1)
ELSE ''
END
WHERE first_name IS NULL
""")
# Enforce NOT NULL now that all rows are populated
op.alter_column("users", "first_name", nullable=False, server_default="")
op.alter_column("users", "last_name", nullable=False, server_default="")
def downgrade() -> None:
# Revert to nullable - data is still intact
op.alter_column("users", "first_name", nullable=True, server_default=None)
op.alter_column("users", "last_name", nullable=True, server_default=None)
Migration C - Contract:
# alembic/versions/contract_drop_username.py
"""contract: drop username column from users
Revision ID: a7b8c9d0e1f2
Revises: f2b3c4d5e6a7
"""
import sqlalchemy as sa
from alembic import op
revision = "a7b8c9d0e1f2"
down_revision = "f2b3c4d5e6a7"
def upgrade() -> None:
# Drop the unique constraint before dropping the column
op.drop_constraint("users_username_key", "users", type_="unique")
op.drop_column("users", "username")
def downgrade() -> None:
# Restore the column - populate it from first_name + last_name
op.add_column(
"users",
sa.Column("username", sa.String(100), nullable=True),
)
op.execute("""
UPDATE users
SET username = LOWER(first_name) || '_' || LOWER(last_name)
WHERE username IS NULL
""")
op.alter_column("users", "username", nullable=False)
op.create_unique_constraint("users_username_key", "users", ["username"])
Key points in this solution:
- Each migration is independently deployable and reversible
- Migration B runs the bulk UPDATE in a single SQL statement - efficient on 500,000 rows
- Migration C's
downgrade()reconstructsusernamefromfirst_name/last_name- it may not perfectly reverse the original data but restores schema compatibility - The edge case (no underscore) is handled by the
CASEexpression in Migration B last_namedefaults to an empty string for single-word usernames - a product decision you would document in the migration comment
Key Takeaways
-
create_all()is for development only. In production, use Alembic migrations to evolve your schema in a versioned, reversible, auditable way. -
The migration chain is a linked list. Each migration file has a
revisionID and adown_revisionpointing to its parent.alembic upgrade headwalks this list forward;alembic downgrade basewalks it back. -
autogenerate is a powerful assistant, not a reliable oracle. It detects most structural changes but misses column renames, table renames, stored procedures, triggers, views, and partial indexes. Always review generated files before applying them.
-
Migration files are immutable once applied. Edit them freely before they hit any shared environment. After that, write a new migration to correct mistakes - never modify the existing one.
-
Always write and test
downgrade(). An untested downgrade is a false safety net. Test the full round-trip (upgrade → downgrade → upgrade) in your local environment for every migration. -
Zero-downtime migrations require the expand/contract pattern. Breaking schema changes (column rename, column drop, NOT NULL without a default) must be split into multiple phases, each independently deployable.
-
Migrations run before the application starts. In Docker, use an entrypoint script. In Kubernetes, use an init container. The migration must succeed (exit 0) before the app container starts.
-
Never hardcode database credentials. Read
DATABASE_URLfrom an environment variable inenv.py. Use.envfiles for local development; use secrets management (Kubernetes Secrets, AWS Secrets Manager) for staging and production. -
Back up production before every significant migration. A
pg_dumpor managed service snapshot takes two minutes. It has unlimited value if something goes wrong.
What's Next
You have now completed the full databases module. You know how to write raw SQL, use SQLite and PostgreSQL, manage transactions, design indexes for performance, build an ORM layer with SQLAlchemy, and deploy schema changes safely with Alembic.
Continue to:
Module Projects - Apply everything from this module to build a production-quality data layer, including a FastAPI REST API backed by PostgreSQL with full Alembic migration history, connection pooling, and a zero-downtime deployment strategy.
