The 12-Factor App - Building Deployable Python Apps
Here is a Python application that violates 10 of the 12 factors; each violation is flagged in a comment. It runs on the developer's laptop, but deploying it to a second server is a week-long project.
# app.py
import sqlite3
import pickle
import logging
# Factor I (Codebase): no version control mentioned
# Factor II (Dependencies): uses system-installed packages, no requirements.txt
# Factor III (Config): hardcoded credentials
DB_PATH = "/home/dev/data/myapp.sqlite"
API_KEY = "sk_live_hardcoded_key"
# Factor IV (Backing Services): tightly coupled to local SQLite file
db = sqlite3.connect(DB_PATH)
# Factor VI (Processes): stores state in memory between requests
_user_cache = {}
# Factor VII (Port Binding): no self-contained server
# Factor X (Dev/Prod Parity): uses SQLite locally, PostgreSQL in production
# Factor XI (Logs): writes to a local file
logging.basicConfig(filename="/var/log/myapp.log", level=logging.INFO)
# Factor XII (Admin Processes): database migrations run manually via SQL files
def get_user(user_id):
    if user_id in _user_cache:
        return _user_cache[user_id]
    cursor = db.execute("SELECT * FROM users WHERE id = ?", (user_id,))
    user = cursor.fetchone()
    _user_cache[user_id] = user  # Factor VI violation: stateful process
    # Factor IX (Disposability): no graceful shutdown, cache lost on restart
    return user
The 12-Factor methodology, published by Heroku engineers, codifies the practices that make applications deployable, scalable, and maintainable in modern cloud environments. Every factor addresses a specific deployment pain point.
What You Will Learn
- All 12 factors explained with Python-specific context and code
- How FastAPI, Docker, and PostgreSQL naturally satisfy many factors
- Concrete implementation patterns for each factor
- Common violations in Python codebases and how to fix them
- How the factors interact and reinforce each other
Prerequisites
- Configuration management with pydantic-settings (previous lesson)
- Experience with FastAPI, SQLAlchemy, and Docker basics
- Understanding of dependency injection and Clean Architecture (earlier lessons)
- Familiarity with virtual environments and pip
Factor I - Codebase: One Codebase, Many Deploys
One application has one codebase tracked in version control. The same codebase is deployed to dev, staging, and production.
Violation
Multiple copies of the app with different code paths per environment:
# BAD: different code per environment
if ENVIRONMENT == "production":
    from prod_config import *
    # 200 lines of production-specific code
else:
    from dev_config import *
    # 200 lines of development-specific code
Correct Approach
# GOOD: same code, different config
from config import Settings
settings = Settings() # reads from environment
app = create_app(settings)
The codebase is identical across environments. Only configuration differs.
:::tip Multiple Services = Multiple Repos (or Monorepo)
If your system has separate services (API, worker, scheduler), each may have its own codebase. Alternatively, use a monorepo with shared libraries. But one codebase should never produce two fundamentally different applications based on config flags.
:::
Factor II - Dependencies: Explicitly Declare and Isolate
Never rely on system-wide packages. Declare all dependencies and isolate them in a virtual environment.
# pyproject.toml - explicit dependency declaration
[project]
name = "myapp"
version = "1.0.0"
requires-python = ">=3.11"
dependencies = [
"fastapi>=0.104.0,<1.0",
"uvicorn[standard]>=0.24.0",
"sqlalchemy>=2.0,<3.0",
"asyncpg>=0.29.0",
"pydantic-settings>=2.0",
"alembic>=1.12.0",
"httpx>=0.25.0",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0",
"pytest-asyncio>=0.23.0",
"ruff>=0.1.0",
"mypy>=1.7",
]
# Dockerfile - isolation via container
FROM python:3.12-slim
WORKDIR /app
# Install dependencies first (layer caching)
COPY pyproject.toml .
RUN pip install --no-cache-dir .
# Copy application code
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Violation
# BAD: "just install these packages globally"
sudo pip install fastapi sqlalchemy
# Which version? What if another app needs a different version?
Lock Files
# Pin exact versions for reproducible deploys
pip freeze > requirements.txt
# Or use pip-tools
pip-compile pyproject.toml -o requirements.lock
pip install -r requirements.lock
Factor III - Config: Store Config in the Environment
Everything that varies between deploys must come from environment variables. We covered this extensively in the previous lesson.
# config.py
from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import SecretStr
class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")
    database_url: str
    secret_key: SecretStr
    redis_url: str = "redis://localhost:6379/0"
    debug: bool = False
    log_level: str = "INFO"
The Litmus Test
Could you open-source the codebase right now without compromising any credentials?
If yes, your config is properly externalized. If no, secrets are still in your code.
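The same idea can be sketched with the standard library alone, which makes the fail-fast behavior explicit. This is a minimal sketch, not a replacement for pydantic-settings; `AppSettings` and `load_settings` are illustrative names, not part of any library.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppSettings:
    """Illustrative settings container; field names are assumptions."""
    database_url: str
    secret_key: str
    debug: bool = False


def load_settings(env: Mapping[str, str] = os.environ) -> AppSettings:
    """Build settings from environment variables, failing fast on gaps."""
    required = ("DATABASE_URL", "SECRET_KEY")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return AppSettings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        # Optional values get a safe default instead of a hardcoded secret
        debug=env.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    )
```

Passing the mapping as a parameter keeps the function testable: a test can hand in a plain dict, while the app relies on the default `os.environ`.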
Factor IV - Backing Services: Treat as Attached Resources
Backing services (databases, caches, message queues, email services) are attached resources accessed via URLs. Swapping a local PostgreSQL for a managed RDS instance should require only changing a URL.
# main.py - backing services as URLs from config
from sqlalchemy import create_engine
from redis import Redis
from config import Settings
settings = Settings()
# Database - local or remote, same code
engine = create_engine(settings.database_url)
# Redis - local or ElastiCache, same code
redis_client = Redis.from_url(settings.redis_url)
Violation
# BAD: hardcoded local paths, not swappable
import sqlite3
db = sqlite3.connect("/home/dev/data/local.db")
# Cannot swap to PostgreSQL without rewriting code
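To make "only the URL changes" concrete, the sketch below parses two connection URLs with the standard library and shows that they differ only in their components, not in the code that handles them. The helper name `describe_backing_service` and both URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def describe_backing_service(url: str) -> dict:
    """Split a connection URL into the parts a client library would use."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "name": parts.path.lstrip("/"),
    }


# Hypothetical URLs: a local database and a managed one
local = describe_backing_service("postgresql://dev:devpass@localhost:5432/myapp_dev")
managed = describe_backing_service("postgresql://app:secret@db.example.com:5432/myapp")
```

Both values go through the same function; swapping one for the other is a config change rather than a code change.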
Factor V - Build, Release, Run: Strictly Separate Stages
# BUILD: create a Docker image
docker build -t myapp:v1.2.3 .
# RELEASE: tag with config (image + env vars)
# (In practice, done by CI/CD or Kubernetes manifests)
# RUN: start the process
docker run -d \
--env-file .env.production \
-p 8000:8000 \
myapp:v1.2.3
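The release stage can be modeled as "build artifact plus config, frozen under a unique ID". The following is a rough sketch of that idea only; `Release` and `make_release` are hypothetical names, and real platforms track releases in their own way.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class Release:
    """An immutable pairing of one build with one set of config values."""
    image: str                 # build artifact, e.g. "myapp:v1.2.3"
    config: Mapping[str, str]  # environment variables for this deploy
    release_id: str            # unique identifier, used for rollbacks


def make_release(image: str, config: Mapping[str, str], sequence: int) -> Release:
    """Combine a build and its config into a read-only release."""
    frozen_config = MappingProxyType(dict(config))  # read-only view of a copy
    return Release(image=image, config=frozen_config, release_id=f"{image}-r{sequence}")
```

Because neither the dataclass nor its config mapping can be mutated, changing a setting means creating a new release, which mirrors the rule that nothing is edited at run time.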
In CI/CD (GitLab CI)
# .gitlab-ci.yml
stages:
  - build
  - release
  - deploy
build:
  stage: build
  script:
    - docker build -t registry.example.com/myapp:$CI_COMMIT_SHA .
    - docker push registry.example.com/myapp:$CI_COMMIT_SHA
release:
  stage: release
  script:
    # Jobs may not share a local image cache, so pull the build output first
    - docker pull registry.example.com/myapp:$CI_COMMIT_SHA
    - docker tag registry.example.com/myapp:$CI_COMMIT_SHA registry.example.com/myapp:latest
    - docker push registry.example.com/myapp:latest
deploy:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=registry.example.com/myapp:$CI_COMMIT_SHA
Violation
# BAD: editing code on the production server
ssh prod-server
vim /opt/myapp/config.py # modifying code directly
systemctl restart myapp
Factor VI - Processes: Execute as Stateless Processes
Application processes are stateless. Any data that needs to persist lives in a backing service (database, Redis, S3).
# BAD: in-memory state shared across requests
_user_cache = {}  # lost on restart, not shared across workers

@app.get("/users/{user_id}")
def get_user(user_id: int):
    if user_id in _user_cache:
        return _user_cache[user_id]
    user = db.get(User, user_id)
    _user_cache[user_id] = user  # stored in process memory
    return user

# GOOD: use Redis as a cache shared by all workers that survives process restarts
import json
from fastapi import Depends
from redis import Redis
from sqlalchemy.orm import Session

redis_client = Redis.from_url(settings.redis_url)

@app.get("/users/{user_id}")
def get_user(user_id: int, db: Session = Depends(get_db)):
    cached = redis_client.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    user = db.get(User, user_id)  # Session.get replaces the legacy Query.get
    if user is None:
        return None
    payload = user.to_dict()
    redis_client.setex(f"user:{user_id}", 300, json.dumps(payload))
    return payload  # same shape on a cache hit and a cache miss
Why Statelessness Matters
With stateless processes, you can:
- Scale horizontally (add more workers without coordination)
- Restart any process without data loss
- Deploy without downtime (rolling restarts)
:::tip Sticky Sessions Are a Code Smell
If your load balancer needs sticky sessions (routing a user to the same worker), your application is stateful. Extract that state into Redis or the database.
:::
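One way to keep request handlers stateless and still testable is to pass the cache in as a dependency. The sketch below is an assumption-laden illustration: `CacheBackend`, `InMemoryFakeCache`, and `get_user_cached` are hypothetical names, and the fake exists only as a test double, not as something to run in production.

```python
import json
from typing import Callable, Dict, Optional, Protocol


class CacheBackend(Protocol):
    """The two operations the handler needs; a Redis client offers both."""
    def get(self, key: str) -> Optional[str]: ...
    def setex(self, key: str, ttl_seconds: int, value: str) -> None: ...


class InMemoryFakeCache:
    """Test double only; a real deployment would use a shared store."""
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = value  # the fake ignores the TTL


def get_user_cached(
    user_id: int,
    cache: CacheBackend,
    load_user: Callable[[int], Optional[dict]],
    ttl_seconds: int = 300,
) -> Optional[dict]:
    """Cache-aside lookup: the process itself holds no state between calls."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = load_user(user_id)
    if user is not None:
        cache.setex(key, ttl_seconds, json.dumps(user))
    return user
```

Because all state lives behind `cache`, any worker can serve any request, and restarting a worker loses nothing.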
Factor VII - Port Binding: Export Services via Port Binding
The application is self-contained. It binds to a port and serves requests. No external web server (Apache, nginx) is required to run it.
# main.py - self-contained with Uvicorn
import os
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

if __name__ == "__main__":
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=int(os.getenv("PORT", "8000")),  # the platform chooses the port
    )
# Run directly - no Apache/nginx needed
uvicorn main:app --host 0.0.0.0 --port 8000
Nginx or a cloud load balancer may still sit in front as a reverse proxy (for example, for TLS termination or load balancing), but the app does not depend on it to serve HTTP.
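Port binding does not depend on any particular framework. As a minimal sketch using only the standard library, the server below reads its port from the `PORT` environment variable and answers a health check; `HealthHandler` and `make_server` are illustrative names.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Optional


class HealthHandler(BaseHTTPRequestHandler):
    """Serves GET /health and returns 404 for every other path."""

    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0", port: Optional[int] = None) -> ThreadingHTTPServer:
    """Create a server bound to the given port, or to $PORT (default 8000)."""
    if port is None:
        port = int(os.getenv("PORT", "8000"))
    return ThreadingHTTPServer((host, port), HealthHandler)
```

Calling `make_server().serve_forever()` starts the service. The process is the web server; nothing else has to be installed on the host for it to accept requests.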
Factor VIII - Concurrency: Scale Out via the Process Model
Scale by running more processes, not by making one process bigger.
# Scale Uvicorn workers
uvicorn main:app --workers 4 --host 0.0.0.0 --port 8000
# Or use Gunicorn with Uvicorn workers
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
# docker-compose.yml - different process types
services:
  web:
    build: .
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
    # A fixed host port can be bound by only one container per host, so running
    # 2 replicas typically needs Swarm's routing mesh or a load balancer in front
    ports:
      - "8000:8000"
    deploy:
      replicas: 2
  worker:
    build: .
    command: celery -A tasks worker --loglevel=info --concurrency=4
    deploy:
      replicas: 2
  scheduler:
    build: .
    command: celery -A tasks beat --loglevel=info
    deploy:
      replicas: 1  # only one scheduler
Violation
# BAD: trying to handle everything in one process with threads
import threading

class MonolithicApp:
    def __init__(self):
        self.web_thread = threading.Thread(target=self.serve_web)
        self.worker_thread = threading.Thread(target=self.run_workers)
        self.scheduler_thread = threading.Thread(target=self.run_scheduler)
Factor IX - Disposability: Fast Startup, Graceful Shutdown
Processes should start quickly and shut down gracefully. Handle SIGTERM to clean up resources.
# main.py - graceful shutdown with FastAPI
from contextlib import asynccontextmanager
from fastapi import FastAPI
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from config import Settings

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    settings = Settings()
    engine = create_engine(settings.database_url)
    app.state.engine = engine
    app.state.session_factory = sessionmaker(bind=engine)
    print("Application started")
    yield  # Application runs here
    # Shutdown: runs once the server has received SIGTERM and stopped serving.
    # Close connections and flush logs here.
    engine.dispose()
    print("Database connections closed")

app = FastAPI(lifespan=lifespan)
# Celery worker - graceful shutdown
from celery.signals import worker_shutting_down

@worker_shutting_down.connect
def on_shutdown(sig, how, exitcode, **kwargs):
    # Finish current task, do not pick up new ones
    # Close database connections
    # Flush metrics/logs
    print("Worker shutting down gracefully")
Why It Matters
- Fast startup: new instances can be spun up quickly during scaling events.
- Graceful shutdown: in-flight requests complete, database transactions commit, and no data is lost during deploys.
- Crash resilience: if a process dies unexpectedly, the system recovers, provided processes are stateless (Factor VI) and tasks are idempotent so that they can be retried safely.
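For a plain Python worker without a framework, graceful shutdown usually comes down to a SIGTERM handler that sets a flag, plus a loop that checks the flag between jobs. The sketch below assumes jobs are short callables; `GracefulWorker` is a hypothetical class, and the handler must be installed from the main thread.

```python
import signal
from typing import Any, Callable, Iterable, List


class GracefulWorker:
    """Finishes the job in progress, then stops picking up new ones."""

    def __init__(self) -> None:
        self.stop_requested = False
        self.completed: List[Any] = []

    def request_stop(self, signum: int, frame: Any) -> None:
        # Only set a flag here; the real cleanup happens in the main loop
        self.stop_requested = True

    def install(self) -> None:
        """Register the SIGTERM handler (main thread only)."""
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self, jobs: Iterable[Callable[[], Any]]) -> List[Any]:
        for job in jobs:
            if self.stop_requested:
                break  # leave the remaining jobs for another process
            self.completed.append(job())
        return self.completed
```

The job that is running when SIGTERM arrives still completes; jobs that were not started stay in the queue, which is safe as long as they are idempotent.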
Factor X - Dev/Prod Parity: Keep Development, Staging, and Production Similar
Minimize gaps between environments:
| Gap | Problem | Solution |
|---|---|---|
| Time gap | Weeks between dev and deploy | CI/CD: deploy daily or per-commit |
| Personnel gap | Developers write, ops deploy | DevOps culture: developers own deployment |
| Tools gap | SQLite in dev, PostgreSQL in prod | Use the same backing services in all environments |
Docker Compose for Local Development
# docker-compose.dev.yml - same services as production
services:
  api:
    build: .
    volumes:
      - .:/app  # hot reload for development
    environment:
      DATABASE_URL: postgresql://dev:devpass@db:5432/myapp_dev
      REDIS_URL: redis://redis:6379/0
      DEBUG: "true"
    ports:
      - "8000:8000"
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
  db:
    image: postgres:16  # same version as production
    environment:
      POSTGRES_DB: myapp_dev
      POSTGRES_USER: dev
      POSTGRES_PASSWORD: devpass
  redis:
    image: redis:7-alpine  # same version as production
:::danger The SQLite Trap
Using SQLite in development and PostgreSQL in production introduces subtle bugs: different SQL dialects, different constraint behaviors, different performance characteristics. Always use the same database engine in all environments.
:::
Factor XI - Logs: Treat Logs as Event Streams
The application should not manage log files. It writes to stdout, and the execution environment handles routing.
# GOOD: log to stdout/stderr
import logging
import sys
logging.basicConfig(
stream=sys.stdout,
level=logging.INFO,
format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("myapp")
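# Note: fields passed via `extra` are attached to the log record, but they only
# appear in the output if the format string or a custom formatter references them.
logger.info("Request processed", extra={"user_id": 42, "duration_ms": 150})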
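# For structured logging (JSON), use structlog
import logging
import structlog

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    # PrintLoggerFactory writes to stdout; pair it with structlog's own
    # filtering wrapper rather than the stdlib-oriented BoundLogger
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    logger_factory=structlog.PrintLoggerFactory(),
)
logger = structlog.get_logger()
logger.info("request_processed", user_id=42, duration_ms=150, path="/api/users")
# Example output (key order and timestamp precision may differ):
# {"event": "request_processed", "user_id": 42, "duration_ms": 150,
#  "path": "/api/users", "level": "info", "timestamp": "2025-01-15T10:30:00Z"}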
Violation
# BAD: application manages its own log files
logging.basicConfig(
filename="/var/log/myapp/app.log", # where does this go in Docker?
level=logging.INFO,
)
# What about log rotation? Disk space? Log shipping?
Log Routing in Production
The app writes to stdout. Docker captures the stream. The log driver routes it to whatever log aggregation service you use.
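If adding a dependency is not an option, the standard library can also emit one JSON object per line to stdout. The sketch below is one possible approach, not the only one; `JsonFormatter` and `build_logger` are illustrative names, and the set of reserved attributes is derived from a dummy record so that only fields passed via `extra` are added.

```python
import json
import logging
import sys
from typing import TextIO

# Attribute names every LogRecord has; anything else came in through `extra`
_RESERVED = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)
_RESERVED |= {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line, including `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        for key, value in record.__dict__.items():
            if key not in _RESERVED:
                payload[key] = value
        return json.dumps(payload, default=str)


def build_logger(name: str, stream: TextIO = sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

A call such as `build_logger("myapp").info("request_processed", extra={"user_id": 42})` then produces a line that a log aggregator can parse without custom patterns.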
Factor XII - Admin Processes: Run as One-Off Processes
Management tasks (database migrations, data fixes, reports) run as one-off processes in the same environment as the app.
# Database migrations
docker exec myapp-api alembic upgrade head
# Or as a separate container with the same image
docker run --rm --env-file .env.production myapp:v1.2.3 alembic upgrade head
# Django-style management commands
docker run --rm --env-file .env.production myapp:v1.2.3 python manage.py create_admin
# One-off scripts
docker run --rm --env-file .env.production myapp:v1.2.3 python -m scripts.backfill_data
# scripts/backfill_data.py - uses the same code and config as the running app
from config import Settings
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# Assumed module layout: adjust these two imports to wherever User and
# compute_completeness live in your project.
from models import User
from services.profiles import compute_completeness

def main():
    settings = Settings()
    engine = create_engine(settings.database_url)
    Session = sessionmaker(bind=engine)
    # The context manager closes the session even if the backfill fails
    with Session() as session:
        # Backfill logic using the same models and config as the app
        users = session.query(User).filter(User.profile_complete.is_(False)).all()
        for user in users:
            user.profile_complete = compute_completeness(user)
        session.commit()
    print(f"Updated {len(users)} users")

if __name__ == "__main__":
    main()
Alembic Migrations
# alembic/env.py - reads from the same config as the app
from alembic import context
from config import Settings

config = context.config  # Alembic's config object for the current run
settings = Settings()
config.set_main_option("sqlalchemy.url", settings.database_url)
Violation
# BAD: running SQL directly on the production database
psql -U admin production_db -c "UPDATE users SET role = 'premium' WHERE id = 42"
# No audit trail, no version control, no reproducibility
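A versioned alternative to ad hoc SQL is a small admin entry point that lives in the repository and reads the same environment variables as the app. The sketch below only shows the command-line shape; the `backfill` body is a placeholder, and the names `main` and `backfill` are assumptions rather than an established convention.

```python
import argparse
import os
from typing import Optional, Sequence


def backfill(database_url: str, dry_run: bool) -> int:
    """Placeholder task; a real one would open a session on database_url."""
    action = "Would update" if dry_run else "Updating"
    # Print only the part after '@' so credentials never reach the logs
    print(f"{action} rows on {database_url.split('@')[-1]}")
    return 0


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse the admin command and run it with config from the environment."""
    parser = argparse.ArgumentParser(prog="admin")
    subcommands = parser.add_subparsers(dest="command", required=True)
    backfill_parser = subcommands.add_parser("backfill")
    backfill_parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)

    database_url = os.environ["DATABASE_URL"]  # same variable the app reads
    if args.command == "backfill":
        return backfill(database_url, args.dry_run)
    return 2
```

Run inside the release image, a command like this is reviewed, versioned, and repeatable in a way that a hand-typed `psql` statement is not.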
Factor Summary - FastAPI + Docker + PostgreSQL
| Factor | Python Implementation |
|---|---|
| I. Codebase | Git repository, one repo per service |
| II. Dependencies | pyproject.toml, pip-compile, Docker |
| III. Config | pydantic-settings with BaseSettings |
| IV. Backing Services | Connection URLs from environment variables |
| V. Build/Release/Run | docker build / CI tag / docker run |
| VI. Processes | Stateless FastAPI + Uvicorn workers |
| VII. Port Binding | Uvicorn binds to $PORT |
| VIII. Concurrency | Gunicorn workers, Celery workers, replicas |
| IX. Disposability | FastAPI lifespan, signal handlers |
| X. Dev/Prod Parity | Docker Compose with same services |
| XI. Logs | structlog or logging to stdout |
| XII. Admin | Alembic migrations, management scripts |
Key Takeaways
- The 12 factors are not abstract ideals - they are practical engineering decisions that eliminate entire categories of deployment bugs.
- Config in the environment (Factor III) is the foundation: it enables dev/prod parity, safe credential management, and environment-specific behavior without code changes.
- Stateless processes (Factor VI) enable horizontal scaling: if your process holds state in memory, that state is invisible to other workers and is lost on every restart.
- Dev/prod parity (Factor X) prevents "works on my machine" bugs: use the same database engine, the same queue system, and the same cache in every environment.
- Logs to stdout (Factor XI) simplifies operations: let the platform handle routing, aggregation, and retention. The application should not concern itself with log files.
- Admin processes (Factor XII) use the same code and config: migrations, backfills, and scripts run in the same environment as the web process, ensuring consistency.
- Docker naturally satisfies many factors: explicit dependencies (II), strict build/release/run (V), port binding (VII), disposability (IX), and dev/prod parity (X).
Graded Practice Challenges
Level 1 - Identify the Violation
Question 1: Which 12-factor principles does this Dockerfile violate?
FROM python:3.12
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
ENV DATABASE_URL=postgresql://admin:secret@db:5432/myapp
ENV SECRET_KEY=my-production-secret
CMD ["python", "app.py"]
Answer
Factor III (Config): Production credentials (DATABASE_URL, SECRET_KEY) are hardcoded in the Dockerfile. These should come from environment variables at runtime, not baked into the image. The same image should work in dev, staging, and production - only the environment variables change. Additionally, Factor V (Build/Release/Run) is violated because config is embedded in the build artifact.
Question 2: A developer stores uploaded files at /tmp/uploads/ on the web server's filesystem. Which factor is violated and what is the fix?
Answer
Factor VI (Stateless Processes). Files stored on the local filesystem are lost when the process restarts or when a different worker handles the next request. The fix: store uploads in a backing service (Factor IV) such as S3, Google Cloud Storage, or a shared volume. The upload path URL should come from environment variables.
Question 3: An application writes logs to /var/log/myapp/app.log and uses logrotate to manage file rotation. Which factor does this violate?
Answer
Factor XI (Logs). The application should not manage log files or rotation. Logs should be written to stdout as an event stream. The execution environment (Docker, systemd, Kubernetes) captures and routes the stream to log aggregation services. Writing to files couples the application to the host filesystem and complicates containerized deployments.
Level 2 - Refactoring Challenge
You have an application that violates multiple 12-factor principles:
# app.py
import sqlite3
import pickle
import logging
DB = sqlite3.connect("./data.db")
SESSION_STORE = {}
logging.basicConfig(filename="app.log")
def login(username, password):
    cursor = DB.execute("SELECT * FROM users WHERE username=?", (username,))
    user = cursor.fetchone()
    if user and user[2] == password:  # plaintext password comparison
        session_id = generate_session_id()
        SESSION_STORE[session_id] = user  # in-memory state
        return session_id
    return None
Refactor to satisfy all 12 factors: externalize config, use PostgreSQL (accessed via URL), store sessions in Redis, log to stdout, and structure for Docker deployment. Produce the complete file set: config.py, main.py, Dockerfile, docker-compose.yml.
Level 3 - Design Challenge
Design the 12-factor-compliant deployment architecture for an e-learning platform (like EngineersOfAI) with:
- FastAPI web service (user-facing API)
- Celery workers (video transcoding, certificate generation)
- Celery Beat scheduler (daily email digests, course reminders)
- PostgreSQL database
- Redis (cache + Celery broker)
- S3 (video and certificate storage)
Produce: (a) the docker-compose.yml for local development, (b) the Dockerfile (multi-stage build), (c) the config.py with all environment variables, (d) the CI/CD pipeline definition. Map each component to the 12 factors.
What's Next
In the final lesson of this module, Microservices vs Monolith - Making the Right Choice, we will explore the architectural decision that determines how you apply these factors at scale: do you build one 12-factor app or many?
