Validation with Pydantic - Production Request and Response Models
Reading time: ~35 minutes | Level: Intermediate → Engineering
Before reading further, predict the output:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

u = User(name="Alice", age="25", email="not-an-email")
print(u.age)    # ?
print(u.email)  # ?
25
not-an-email
Two surprises in one model. First: age="25" coerces to 25 - Pydantic validates that "25" can become an int and does the conversion silently. This is type coercion, not type checking, and it is intentional. Second: email="not-an-email" is accepted without error - str means "any string," not "a valid email address." For email validation, you need EmailStr from pydantic[email].
This is the fundamental tension in Pydantic: it is very helpful by default (coercion) but only as strict as you ask it to be (no semantic validation unless you add it). This lesson shows you where that line is, how to move it, and what tools exist on both sides.
What You Will Learn
- BaseModel: field declaration, type coercion vs strict mode
- All built-in validated types: EmailStr, HttpUrl, UUID, IPvAnyAddress
- Field(): every constraint that matters in production
- field_validator and model_validator: single-field and cross-field validation
- The complete Pydantic validation pipeline from raw data to model
- Nested models and composition patterns
- model_config with ConfigDict: ORM mode, whitespace stripping, enum handling
- model_dump() and model_dump_json(): the mode='json' gotcha and serialization options
- model_json_schema(): OpenAPI docs generation
- Partial updates with model_copy(update=...) for PATCH endpoints
- RootModel for validating lists and primitives
- Annotated types for reusable constraints
- Discriminated unions for polymorphic request bodies
Prerequisites
- Lesson 04 (FastAPI) - Pydantic is FastAPI's request validation engine
- Lesson 07 (JSON Serialization) - understanding what model_dump(mode='json') does
- Basic Python type hints (str, int, list[str], Optional, Union)
Part 1 - BaseModel and Type Coercion
Pydantic v2's BaseModel is the foundation. Every field is declared as a class attribute with a type annotation:
from pydantic import BaseModel
from datetime import datetime
from uuid import UUID

class Task(BaseModel):
    id: UUID
    title: str
    priority: int
    created_at: datetime
    completed: bool = False  # field with default
Coercion: What Pydantic Does by Default
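Pydantic attempts to coerce input values to the declared type. In v2 the default (lax) conversions are deliberately limited: numeric strings become numbers and the integers 0 and 1 become booleans, but a number is not converted to str unless you opt in with ConfigDict(coerce_numbers_to_str=True):
from pydantic import BaseModel, ValidationError

class Task(BaseModel):
    title: str
    priority: int
    active: bool

# Coercion examples
task = Task(
    title="write tests",
    priority="5",  # str coerces to int: 5
    active=1,      # int coerces to bool: True (only 0 and 1 are accepted)
)
print(task.priority)  # 5
print(task.active)    # True

# int is NOT coerced to str in Pydantic v2
try:
    Task(title=123, priority=5, active=True)
except ValidationError as e:
    print(e)
# title: Input should be a valid string [type=string_type, ...]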
Strict Mode: Disable Coercion
When you need exact types (e.g., an API that must reject "5" for an int field):
from pydantic import BaseModel, ConfigDict

class StrictTask(BaseModel):
    model_config = ConfigDict(strict=True)
    title: str
    priority: int

# Now coercion is disabled
try:
    StrictTask(title="write tests", priority="5")
except Exception as e:
    print(e)
# priority: Input should be a valid integer [type=int_type, ...]

# Per-field strict mode (override model-level)
from pydantic import Field
from typing import Annotated

class HybridTask(BaseModel):
    title: str                                    # coercion enabled
    priority: Annotated[int, Field(strict=True)]  # this field: strict
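To see the per-field behavior directly, here is a minimal sketch. The model name HybridEstimate and its hours field are made up for illustration; it assumes Pydantic v2 and shows a lax int field accepting "3" while the strict field in the same model rejects "5".

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class HybridEstimate(BaseModel):
    """Hypothetical model: one lax int field next to one strict int field."""

    hours: int                                    # lax: "3" is converted to 3
    priority: Annotated[int, Field(strict=True)]  # strict: must already be an int


accepted = HybridEstimate(hours="3", priority=5)

try:
    HybridEstimate(hours="3", priority="5")
except ValidationError as exc:
    strict_errors = exc.errors()
else:
    strict_errors = []

print(accepted.hours)
print([(err["loc"], err["type"]) for err in strict_errors])
```

Only the strict field is reported, so strictness can be introduced one field at a time without changing how the rest of the model parses input.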
Part 2 - Built-In Validated Types
These are Pydantic types that enforce semantic correctness beyond Python's type system:
from pydantic import BaseModel, EmailStr, HttpUrl, AnyUrl, IPvAnyAddress
from uuid import UUID
from datetime import datetime, date
from typing import Optional

class UserProfile(BaseModel):
    id: UUID                                    # validates UUID format
    email: EmailStr                             # validates email syntax (requires pydantic[email])
    website: Optional[HttpUrl] = None           # validates URL with http/https scheme
    api_endpoint: Optional[AnyUrl] = None       # validates URL with any scheme
    ip_address: Optional[IPvAnyAddress] = None  # validates IPv4 or IPv6
    created_at: datetime                        # parses ISO 8601 strings
    birth_date: Optional[date] = None           # parses YYYY-MM-DD strings

# In Pydantic v2, Optional[X] without a default is still a required field,
# so the optional fields above need "= None" to be omittable.

# All of these parse correctly:
user = UserProfile(
    id="550e8400-e29b-41d4-a716-446655440000",  # str → UUID
    email="alice@example.com",
    website="https://alice.dev",
    ip_address="192.168.1.1",
    created_at="2024-03-15T12:00:00Z",          # str → datetime
    birth_date="1990-05-20",                    # str → date
)
print(type(user.id))          # <class 'uuid.UUID'>
print(type(user.created_at))  # <class 'datetime.datetime'>

# These raise ValidationError:
try:
    UserProfile(id="not-a-uuid", email="not-an-email", created_at="2024-03-15")
except Exception as e:
    print(e)
EmailStr requires the email-validator package: pip install pydantic[email]. Without it, defining a model with an EmailStr field raises an ImportError that tells you to install email-validator. Add pydantic[email] to your pyproject.toml dependencies for any project that validates email addresses.
Part 3 - Field(): Every Constraint That Matters
Field() provides metadata and validation constraints for individual fields:
from pydantic import BaseModel, Field
from typing import Optional
from uuid import UUID
from datetime import datetime
import re

class TaskCreate(BaseModel):
    # String constraints
    title: str = Field(
        min_length=1,
        max_length=200,
        description="Task title - must be non-empty",
        examples=["Write unit tests", "Review PR #42"],
    )
    description: Optional[str] = Field(
        default=None,
        max_length=2000,
        description="Optional detailed description",
    )

    # Numeric constraints
    priority: int = Field(
        default=2,
        ge=1,  # greater than or equal
        le=5,  # less than or equal
        description="Priority 1 (lowest) to 5 (highest)",
    )
    estimated_hours: float = Field(
        default=1.0,
        gt=0,    # strictly greater than - excludes 0
        lt=168,  # strictly less than 168 hours (one week)
    )

    # Pattern validation
    reference_code: Optional[str] = Field(
        default=None,
        pattern=r"^TASK-\d{4,6}$",  # e.g., "TASK-0042" or "TASK-123456"
        description="Optional ticket reference code",
    )

    # Alias: accept a different key name in the input
    due_date: Optional[datetime] = Field(
        default=None,
        alias="dueDate",  # input JSON key is "dueDate", attribute is "due_date"
        description="Optional due date in ISO 8601 format",
    )

    # default_factory: runs once per instance (never share mutable defaults)
    tags: list[str] = Field(
        default_factory=list,
        description="List of tags",
    )

    # Title for OpenAPI schema display
    assignee_id: Optional[UUID] = Field(
        default=None,
        title="Assignee User ID",
        description="UUID of the user assigned to this task",
    )
ge/le/gt/lt vs min_length/max_length
| Constraint | Applies to | Meaning |
|---|---|---|
| ge=N | numbers | value ≥ N (inclusive) |
| le=N | numbers | value ≤ N (inclusive) |
| gt=N | numbers | value > N (exclusive) |
| lt=N | numbers | value < N (exclusive) |
| min_length=N | str, list, bytes | length ≥ N |
| max_length=N | str, list, bytes | length ≤ N |
| pattern=r"..." | str | regex must match somewhere in the string; anchor with ^ and $ to require a full match |
Use model_config = ConfigDict(str_strip_whitespace=True) on request models that accept free-text input. Form submissions often carry accidental leading or trailing whitespace. Without stripping, " Alice " passes min_length=1 validation and creates a user named " Alice " in your database, and a whitespace-only value such as "   " passes too. Stripping runs before the length check, so it prevents this class of data quality bug at negligible cost.
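The difference is easy to check side by side. The sketch below assumes Pydantic v2; LooseName and StrippedName are illustrative names, not part of the lesson's API.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class LooseName(BaseModel):
    """Hypothetical model without whitespace stripping."""

    name: str = Field(min_length=1)


class StrippedName(BaseModel):
    """Same field, but strings are stripped before constraints are checked."""

    model_config = ConfigDict(str_strip_whitespace=True)

    name: str = Field(min_length=1)


loose = LooseName(name="  Alice  ")
stripped = StrippedName(name="  Alice  ")
blank_loose = LooseName(name="   ")  # three spaces satisfy min_length=1

try:
    StrippedName(name="   ")
    blank_rejected = False
except ValidationError:
    blank_rejected = True  # stripped to "", which violates min_length=1

print(repr(loose.name), repr(stripped.name), blank_rejected)
```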
Part 4 - The Pydantic Validation Pipeline
Pydantic collects all errors before raising - one call gives you every problem, not just the first one:
from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)
    stock: int = Field(ge=0)

try:
    Product(name="", price=-5.0, stock=-1)
except ValidationError as e:
    print(e.error_count())  # 3
    for error in e.errors():
        print(f" {'.'.join(str(x) for x in error['loc'])}: {error['msg']}")
# name: String should have at least 1 character
# price: Input should be greater than 0
# stock: Input should be greater than or equal to 0
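Because every error arrives in one exception, it is straightforward to reshape them for an API response. The helper below, errors_by_field, is a hypothetical name introduced for this sketch; it relies only on the loc and msg keys of ValidationError.errors().

```python
from collections import defaultdict

from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)
    stock: int = Field(ge=0)


def errors_by_field(exc: ValidationError) -> dict[str, list[str]]:
    """Group error messages under a dotted field path (hypothetical helper)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for err in exc.errors():
        path = ".".join(str(part) for part in err["loc"])
        grouped[path].append(err["msg"])
    return dict(grouped)


try:
    Product(name="", price=-5.0, stock=-1)
    report = {}
except ValidationError as exc:
    report = errors_by_field(exc)

print(report)
```

A dict keyed by field path is one reasonable shape for a form-style client; other clients may prefer the raw list from errors().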
Part 5 - field_validator: Single-Field Custom Validation
from pydantic import BaseModel, field_validator
from datetime import datetime, timezone

class BookingCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    guest_email: str

    @field_validator("title")
    @classmethod
    def title_must_not_be_only_whitespace(cls, v: str) -> str:
        """mode='after' (default): runs after type coercion."""
        if v.strip() == "":
            raise ValueError("title must contain non-whitespace characters")
        return v.strip()  # validators can transform the value

    @field_validator("guest_email", mode="before")
    @classmethod
    def normalize_email(cls, v) -> str:
        """mode='before': runs before type coercion - v may not be a str yet."""
        if isinstance(v, str):
            return v.lower().strip()
        return v  # let type coercion handle non-str input (will fail validation)

    @field_validator("start_time", "end_time", mode="after")
    @classmethod
    def must_be_future(cls, v: datetime) -> datetime:
        """Same validator applied to multiple fields."""
        now = datetime.now(tz=timezone.utc)
        if v.tzinfo is None:
            raise ValueError("datetime must be timezone-aware")
        if v <= now:
            raise ValueError("datetime must be in the future")
        return v
mode='before' vs mode='after'
| Mode | When it runs | Receives | Use for |
|---|---|---|---|
| 'before' | Before type coercion | Raw input (any type) | Input normalization, format conversion |
| 'after' (default) | After type coercion | Already-coerced type | Business rule validation on the final value |
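# These two methods belong inside a model class; they need `import re`
# and `from datetime import date` at module level.

# mode='before': raw value - use for normalization
@field_validator("phone", mode="before")
@classmethod
def strip_phone_formatting(cls, v):
    if isinstance(v, str):
        return re.sub(r"[\s\-\(\)]", "", v)  # "+1 (555) 123-4567" → "+15551234567"
    return v

# mode='after': typed value - use for business rules
@field_validator("birth_date")
@classmethod
def must_be_adult(cls, v: date) -> date:
    today = date.today()
    # Subtract one year if this year's birthday has not happened yet
    # (dividing days by 365 drifts because of leap years)
    age = today.year - v.year - ((today.month, today.day) < (v.month, v.day))
    if age < 18:
        raise ValueError(f"User must be at least 18 years old (got age {age})")
    return v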
Part 6 - model_validator: Cross-Field Validation
When validation requires comparing multiple fields, use @model_validator:
from pydantic import BaseModel, model_validator
from datetime import datetime
from typing import Optional, Self  # Self: Python 3.11+ (typing_extensions on older versions)

class EventCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    max_attendees: Optional[int] = None
    waitlist_enabled: bool = False

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> Self:
        """Cross-field validation: end_time must be after start_time."""
        if self.end_time <= self.start_time:
            raise ValueError(
                f"end_time ({self.end_time.isoformat()}) must be after "
                f"start_time ({self.start_time.isoformat()})"
            )
        return self

    @model_validator(mode="after")
    def waitlist_requires_max_attendees(self) -> Self:
        """Another cross-field rule: waitlist only makes sense with a capacity limit."""
        if self.waitlist_enabled and self.max_attendees is None:
            raise ValueError(
                "waitlist_enabled=True requires max_attendees to be set"
            )
        return self
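One practical detail when handling these errors: a ValueError raised from a mode="after" model validator is not attached to any single field. The sketch below uses a hypothetical Window model and assumes Pydantic v2; it inspects the resulting error entry.

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError, model_validator


class Window(BaseModel):
    """Hypothetical two-field model with one cross-field rule."""

    start: datetime
    end: datetime

    @model_validator(mode="after")
    def end_after_start(self):
        if self.end <= self.start:
            raise ValueError("end must be after start")
        return self


try:
    Window(start="2030-01-02T00:00:00Z", end="2030-01-01T00:00:00Z")
    model_error = None
except ValidationError as exc:
    model_error = exc.errors()[0]

print(model_error["loc"], model_error["type"], model_error["msg"])
```

The empty loc tuple is how a client can tell a model-level rule from a field-level one when rendering errors.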
mode='before' for Model Validators
mode='before' receives the raw input before any field processing - useful for data transformation:
from decimal import Decimal
from pydantic import BaseModel, model_validator

class LegacyOrderAdapter(BaseModel):
    """Accepts either new format or legacy format from old API."""
    order_id: str
    total_cents: int  # internal representation

    @model_validator(mode="before")
    @classmethod
    def handle_legacy_format(cls, data):
        """Normalize legacy 'amount_dollars' to 'total_cents'."""
        if not isinstance(data, dict):
            return data  # not a dict (e.g. an existing instance): leave it to Pydantic
        data = dict(data)  # copy so the caller's dict is not mutated
        if "amount_dollars" in data and "total_cents" not in data:
            # Decimal avoids float truncation (19.99 * 100 evaluates to 1998.9999999999998)
            data["total_cents"] = int(Decimal(str(data.pop("amount_dollars"))) * 100)
        if "orderId" in data and "order_id" not in data:
            data["order_id"] = data.pop("orderId")
        return data

# Works with both formats:
LegacyOrderAdapter(order_id="ORD-001", total_cents=999)
LegacyOrderAdapter(orderId="ORD-002", amount_dollars="19.99")  # total_cents == 1999
Part 7 - Nested Models and Composition
from pydantic import BaseModel, Field, EmailStr
from typing import Optional
from uuid import UUID, uuid4
from datetime import datetime, timezone

class Address(BaseModel):
    street: str
    city: str
    country: str = Field(min_length=2, max_length=2, description="ISO 3166-1 alpha-2 country code")
    postal_code: Optional[str] = None

class ContactInfo(BaseModel):
    email: EmailStr
    phone: Optional[str] = None
    address: Optional[Address] = None

class UserCreate(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    contact: ContactInfo  # nested model
    preferences: dict[str, str] = Field(default_factory=dict)

# Validation cascades through nested models.
# Nested data is passed as dicts here so the error path starts at the outer model;
# constructing Address(...) directly would raise from Address itself.
try:
    UserCreate(
        name="Alice",
        contact={
            "email": "alice@example.com",  # required by ContactInfo
            "address": {
                "street": "123 Main St",
                "city": "Springfield",
                "country": "USA",  # 3 letters - violates max_length=2
            },
        },
    )
except Exception as e:
    print(e)
# contact.address.country: String should have at most 2 characters

# Valid nested model - input can use dicts, no need to pre-construct nested models
user = UserCreate(
    name="Alice",
    contact={  # dict is automatically validated into ContactInfo
        "email": "alice@example.com",
        "address": {
            "street": "123 Main St",
            "city": "Springfield",
            "country": "US",
        },
    },
)
print(type(user.contact))          # <class '__main__.ContactInfo'>
print(type(user.contact.address))  # <class '__main__.Address'>
Part 8 - model_config with ConfigDict
ConfigDict controls model behavior globally:
from pydantic import BaseModel, ConfigDict, Field
from enum import Enum

class TaskStatus(str, Enum):
    PENDING = "pending"
    DONE = "done"

class TaskRequest(BaseModel):
    model_config = ConfigDict(
        # Strip whitespace from all str fields - prevents " Alice " in the DB
        str_strip_whitespace=True,
        # Lowercase all str fields (useful for email normalization)
        # str_to_lower=True,  # uncomment if you want lowercase
        # Store the enum's value (e.g. "pending") on the model instead of the Enum member
        use_enum_values=True,
        # ORM mode: read attributes from SQLAlchemy model instances
        from_attributes=True,
        # Populate fields by field name AND by alias (default: alias only when alias is set)
        populate_by_name=True,
        # Validate default values (not just user-supplied values)
        validate_default=False,  # True adds overhead - use only when needed
    )

    title: str
    status: TaskStatus = TaskStatus.PENDING

# str_strip_whitespace in action
task = TaskRequest(title=" Write tests ", status="pending")
print(repr(task.title))  # 'Write tests' - stripped
print(task.status)       # 'pending' - use_enum_values returns the value, not the Enum
ORM Mode (from_attributes=True)
from pydantic import BaseModel, ConfigDict
from sqlalchemy import Column, String, Integer
from sqlalchemy.orm import DeclarativeBase

class Base(DeclarativeBase):
    pass

class TaskORM(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    priority = Column(Integer)

class TaskResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    title: str
    priority: int

# Without from_attributes=True, model_validate(orm_task) raises ValidationError:
# the input is neither a dict nor a TaskResponse instance.
orm_task = TaskORM(id=1, title="Write tests", priority=3)
response = TaskResponse.model_validate(orm_task)  # reads attributes, not dict
print(response.id)     # 1
print(response.title)  # "Write tests"
from_attributes=True (ORM mode) lets Pydantic read SQLAlchemy model attributes directly using model_validate(orm_object). Without it, you must convert the ORM object to a dict first, which requires knowing which attributes to include and risks triggering lazy-loaded relationships. With from_attributes=True, Pydantic accesses only the attributes it needs for the declared fields.
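The same mechanism works with any object that exposes attributes, which makes it easy to try without a database. In this sketch TaskRow is a hypothetical stand-in for an ORM row, and the per-call from_attributes argument to model_validate is used as an alternative to the config flag.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class TaskRow:
    """Hypothetical stand-in for an ORM row: plain attributes, no dict interface."""

    def __init__(self, id: int, title: str, priority: int, internal_note: str):
        self.id = id
        self.title = title
        self.priority = priority
        self.internal_note = internal_note  # not declared on the response model


class TaskOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    title: str
    priority: int


class TaskOutDictOnly(BaseModel):
    id: int
    title: str
    priority: int


row = TaskRow(1, "Write tests", 3, "not part of the response")

out = TaskOut.model_validate(row)  # reads only id, title, priority

try:
    TaskOutDictOnly.model_validate(row)
    rejected = False
except ValidationError:
    rejected = True  # arbitrary objects are refused without from_attributes

per_call = TaskOutDictOnly.model_validate(row, from_attributes=True)

print(out.model_dump(), rejected, per_call.id)
```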
Part 9 - model_dump() and model_dump_json()
from pydantic import BaseModel, ConfigDict, Field
from datetime import datetime, timezone
from uuid import UUID, uuid4
from typing import Optional

class TaskResponse(BaseModel):
    id: UUID = Field(default_factory=uuid4)
    title: str
    created_at: datetime = Field(default_factory=lambda: datetime.now(tz=timezone.utc))
    completed: bool = False
    secret: Optional[str] = None  # never expose this

task = TaskResponse(title="Write tests", secret="internal-note")

# model_dump() → Python dict (datetime stays as datetime, UUID stays as UUID)
d = task.model_dump()
print(type(d["id"]))          # <class 'uuid.UUID'>
print(type(d["created_at"]))  # <class 'datetime.datetime'>
# Passing this to json.dumps() without a custom encoder → TypeError

# model_dump(mode="json") → dict with JSON-serializable values
d_json = task.model_dump(mode="json")
print(type(d_json["id"]))          # <class 'str'> - UUID serialized to string
print(type(d_json["created_at"]))  # <class 'str'> - datetime serialized to ISO 8601

# model_dump_json() → JSON string directly (fastest, bypasses dict step)
json_str = task.model_dump_json()
print(json_str)

# Exclude fields from output
d = task.model_dump(exclude={"secret"})
# id, title, created_at, completed - no "secret"

# Exclude None values (useful for sparse responses)
task2 = TaskResponse(title="No secret task")
task2.model_dump(exclude_none=True)  # "secret" omitted because it's None

# Exclude fields that were not explicitly set (useful for PATCH responses)
task3 = TaskResponse(title="Explicit title")
task3.model_dump(exclude_unset=True)  # only "title" - id, created_at, completed have defaults

# Use alias in output
class AliasedTask(BaseModel):
    task_id: UUID = Field(alias="taskId")
    model_config = ConfigDict(populate_by_name=True)

at = AliasedTask(task_id=uuid4())
at.model_dump()               # {"task_id": ...} - uses field name
at.model_dump(by_alias=True)  # {"taskId": ...} - uses alias
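Aliases affect input as well as output, and the input side has a quiet failure mode worth seeing once. The sketch below assumes Pydantic v2 defaults (unknown keys are ignored); AliasOnly and NameOrAlias are illustrative names.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class AliasOnly(BaseModel):
    """Hypothetical model: the field can only be populated through its alias."""

    due_date: Optional[str] = Field(default=None, alias="dueDate")


class NameOrAlias(BaseModel):
    """Same field, but populate_by_name also accepts the Python attribute name."""

    model_config = ConfigDict(populate_by_name=True)

    due_date: Optional[str] = Field(default=None, alias="dueDate")


by_alias = AliasOnly.model_validate({"dueDate": "2030-01-01"})

# The field name is treated as an unknown key and ignored, so the default wins
by_name = AliasOnly.model_validate({"due_date": "2030-01-01"})

either = NameOrAlias.model_validate({"due_date": "2030-01-01"})

print(by_alias.due_date, by_name.due_date, either.due_date)
print(either.model_dump(), either.model_dump(by_alias=True))
```

Because the ignored key produces no error, a client that sends snake_case to an alias-only model loses data silently; populate_by_name is one way to accept both spellings.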
Part 10 - Partial Updates: PATCH Endpoint Pattern
PATCH requests update only specified fields. Pydantic handles this with optional fields + model_copy:
from pydantic import BaseModel, ConfigDict
from typing import Optional
from uuid import UUID

class TaskUpdate(BaseModel):
    """All fields optional - only provided fields are updated."""
    title: Optional[str] = None
    priority: Optional[int] = None
    completed: Optional[bool] = None
    assignee_id: Optional[UUID] = None

class Task(BaseModel):
    # Needed so model_validate() below can read an ORM object's attributes
    model_config = ConfigDict(from_attributes=True)

    id: UUID
    title: str
    priority: int
    completed: bool = False
    assignee_id: Optional[UUID] = None

# PATCH endpoint pattern
async def patch_task(task_id: UUID, updates: TaskUpdate, db) -> Task:
    existing = await db.get_task(task_id)  # fetch current state
    # Convert existing ORM object to Pydantic model
    current = Task.model_validate(existing)
    # Apply only the fields that were explicitly provided in the request
    # exclude_unset=True: only fields the client sent, not all Optional=None fields
    update_data = updates.model_dump(exclude_unset=True)
    # model_copy(update=...) produces a new model with specified fields replaced
    updated = current.model_copy(update=update_data)
    await db.update_task(task_id, updated.model_dump(exclude={"id"}))
    return updated

# Example:
# PATCH /tasks/123 {"title": "Updated title"}
# update_data = {"title": "Updated title"} - only title, not priority/completed/assignee_id
# current.model_copy(update={"title": "Updated title"})
# → Task with new title, all other fields unchanged
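The key is model_dump(exclude_unset=True): it returns only fields that were explicitly included in the request body - not all fields that happen to be None. If the client sends {"title": "New"}, then exclude_unset=True returns {"title": "New"}, not {"title": "New", "priority": None, "completed": None, "assignee_id": None}. Without exclude_unset=True, a PATCH would set every unspecified optional field to None in the database. One caveat: model_copy(update=...) does not re-validate the values it is given, so an explicit {"title": null} in the request passes TaskUpdate and would land on Task.title unchecked unless you validate the merged data or reject nulls for non-nullable fields.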
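A compact way to explore the PATCH semantics without a database is sketched below. It is one possible approach, not the lesson's prescribed one: the merged dict is passed back through Task.model_validate so the target model's rules apply, and model_copy is shown alongside for contrast. The two models are trimmed-down illustrations.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class TaskUpdate(BaseModel):
    title: Optional[str] = None
    priority: Optional[int] = None


class Task(BaseModel):
    title: str
    priority: int


current = Task(title="Old", priority=3)

# Client sent only "title": exclude_unset keeps just that key
changes = TaskUpdate.model_validate({"title": "New"}).model_dump(exclude_unset=True)
merged = Task.model_validate({**current.model_dump(), **changes})

# Client sent an explicit null: it counts as "set", so it survives exclude_unset
null_changes = TaskUpdate.model_validate({"priority": None}).model_dump(exclude_unset=True)

# model_copy applies the update without validation
unchecked = current.model_copy(update=null_changes)

# Re-validating the merged data catches the invalid null
try:
    Task.model_validate({**current.model_dump(), **null_changes})
    null_rejected = False
except ValidationError:
    null_rejected = True

print(changes, merged, null_changes, unchecked.priority, null_rejected)
```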
Part 11 - RootModel for Lists and Primitives
When you need to validate a JSON array or a single value at the top level:
from pydantic import RootModel, Field
from typing import Annotated

# Validate a list of strings
class TagList(RootModel[list[str]]):
    root: list[str] = Field(min_length=1, max_length=20)

tags = TagList.model_validate(["python", "fastapi", "pydantic"])
print(tags.root)        # ['python', 'fastapi', 'pydantic']
tags.model_dump_json()  # '["python","fastapi","pydantic"]'

# Validate a list of UUIDs
from uuid import UUID

class BulkTaskIds(RootModel[list[UUID]]):
    pass

# DELETE /tasks with body ["uuid1", "uuid2", "uuid3"]
ids = BulkTaskIds.model_validate([
    "550e8400-e29b-41d4-a716-446655440000",
    "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
])
print(ids.root)  # [UUID('550e8400...'), UUID('6ba7b810...')]

# Validate a primitive with constraints
class PositiveScore(RootModel[Annotated[int, Field(ge=0, le=100)]]):
    pass

score = PositiveScore.model_validate(87)
print(score.root)  # 87
Part 12 - Annotated Types for Reusable Constraints
Instead of repeating Field(ge=1, le=5) everywhere, define reusable annotated types:
from typing import Annotated
from pydantic import Field, BaseModel
from uuid import UUID

# Reusable constrained types
PositiveInt = Annotated[int, Field(gt=0)]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]
ShortStr = Annotated[str, Field(min_length=1, max_length=200)]
NonEmptyStr = Annotated[str, Field(min_length=1)]
Percentage = Annotated[float, Field(ge=0.0, le=100.0)]
TaskRef = Annotated[str, Field(pattern=r"^TASK-\d{4,6}$")]

# Use them across multiple models
class TaskCreate(BaseModel):
    title: ShortStr
    description: Annotated[str, Field(max_length=2000)] = ""
    priority: Priority = 2
    estimated_hours: PositiveInt = 1

class ProjectCreate(BaseModel):
    name: ShortStr
    completion_percentage: Percentage = 0.0
    default_priority: Priority = 2

class TaskUpdate(BaseModel):
    title: ShortStr | None = None
    priority: Priority | None = None

# The constraint is defined once - changing PositiveInt changes it everywhere
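To confirm that a shared alias enforces the same rule wherever it is used, including inside an Optional, here is a small sketch. TaskIn, TaskPatch, and the error_types helper are hypothetical names introduced for the example.

```python
from typing import Annotated, Optional

from pydantic import BaseModel, Field, ValidationError

Priority = Annotated[int, Field(ge=1, le=5)]


class TaskIn(BaseModel):
    priority: Priority = 2


class TaskPatch(BaseModel):
    priority: Optional[Priority] = None  # None is allowed; ints are still range-checked


def error_types(model: type[BaseModel], payload: dict) -> list[str]:
    """Return the Pydantic error type codes for a payload (hypothetical helper)."""
    try:
        model.model_validate(payload)
    except ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []


too_high = error_types(TaskIn, {"priority": 9})
too_low = error_types(TaskPatch, {"priority": 0})
null_ok = error_types(TaskPatch, {"priority": None})

print(too_high, too_low, null_ok)
```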
Part 13 - Discriminated Unions for Polymorphic Bodies
When an endpoint accepts different payload shapes depending on a type field:
from pydantic import BaseModel, Field
from typing import Literal, Union, Annotated
from uuid import UUID

class EmailNotification(BaseModel):
    type: Literal["email"]
    recipient: str
    subject: str
    body: str

class SlackNotification(BaseModel):
    type: Literal["slack"]
    channel: str
    message: str
    thread_ts: str | None = None

class WebhookNotification(BaseModel):
    type: Literal["webhook"]
    url: str
    payload: dict
    secret: str | None = None

# Discriminated union: Pydantic uses "type" field to pick the correct model
Notification = Annotated[
    Union[EmailNotification, SlackNotification, WebhookNotification],
    Field(discriminator="type"),
]

class SendNotificationRequest(BaseModel):
    notification: Notification
    scheduled_at: str | None = None

# FastAPI endpoint
from fastapi import FastAPI

app = FastAPI()

@app.post("/notifications")
async def send_notification(request: SendNotificationRequest):
    match request.notification:
        case EmailNotification():
            return {"action": "email", "to": request.notification.recipient}
        case SlackNotification():
            return {"action": "slack", "channel": request.notification.channel}
        case WebhookNotification():
            return {"action": "webhook", "url": request.notification.url}

# Input: {"notification": {"type": "email", "recipient": "user@example.com", "subject": "Hi", "body": "..."}}
# Pydantic sees type="email" → validates as EmailNotification
# Input: {"notification": {"type": "slack", "channel": "#general", "message": "Hello"}}
# Pydantic sees type="slack" → validates as SlackNotification
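Without a discriminator, Pydantic's default "smart" union mode may validate the input against several member models before choosing one, which is slower for large unions and produces errors that mention every member. With a discriminator, it reads the type field and validates against the matching model only - a single lookup, and errors that refer to one model.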
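The dispatch can be observed without FastAPI. The sketch below uses two trimmed-down, hypothetical member models and assumes Pydantic v2; it also shows what an unknown tag produces.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Email(BaseModel):
    type: Literal["email"]
    recipient: str


class Slack(BaseModel):
    type: Literal["slack"]
    channel: str


Notification = Annotated[Union[Email, Slack], Field(discriminator="type")]


class Request(BaseModel):
    notification: Notification


req = Request.model_validate(
    {"notification": {"type": "slack", "channel": "#general"}}
)

try:
    Request.model_validate({"notification": {"type": "sms", "to": "123"}})
    tag_error = None
except ValidationError as exc:
    tag_error = exc.errors()[0]

print(type(req.notification).__name__)
print(tag_error["loc"], tag_error["type"])
```

An unknown tag yields one error at the union's location rather than one block of errors per member model.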
Part 14 - model_json_schema() and OpenAPI Integration
Pydantic generates JSON Schema from your models, which FastAPI uses to build OpenAPI docs:
from pydantic import BaseModel, Field
from typing import Optional
import json

class TaskCreate(BaseModel):
    title: str = Field(
        min_length=1,
        max_length=200,
        description="Task title",
        examples=["Write unit tests"],
    )
    priority: int = Field(
        default=2,
        ge=1,
        le=5,
        description="Priority 1 (lowest) to 5 (highest)",
    )
    assignee_id: Optional[str] = Field(
        default=None,
        description="Assignee user ID",
    )

schema = TaskCreate.model_json_schema()
print(json.dumps(schema, indent=2))
# {
#   "type": "object",
#   "title": "TaskCreate",
#   "properties": {
#     "title": {
#       "type": "string",
#       "minLength": 1,
#       "maxLength": 200,
#       "description": "Task title",
#       "examples": ["Write unit tests"]
#     },
#     "priority": {
#       "type": "integer",
#       "minimum": 1,
#       "maximum": 5,
#       "default": 2,
#       "description": "Priority 1 (lowest) to 5 (highest)"
#     },
#     ...
#   },
#   "required": ["title"]
# }
FastAPI generates GET /openapi.json and renders it at GET /docs (Swagger UI) and GET /redoc automatically. Every Field(description=..., examples=[...]) appears in the Swagger docs.
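Since the schema is a plain dict, the mapping from Field constraints to JSON Schema keywords can be checked programmatically, for example in a contract test. A minimal sketch, assuming Pydantic v2 and a trimmed copy of the model above:

```python
from typing import Optional

from pydantic import BaseModel, Field


class TaskCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200, description="Task title")
    priority: int = Field(default=2, ge=1, le=5)
    assignee_id: Optional[str] = Field(default=None)


schema = TaskCreate.model_json_schema()

required = schema["required"]                          # only fields without defaults
title_schema = schema["properties"]["title"]           # min/max_length → minLength/maxLength
priority_schema = schema["properties"]["priority"]     # ge/le → minimum/maximum
assignee_schema = schema["properties"]["assignee_id"]  # Optional[str] → anyOf with null

print(required)
print(priority_schema)
print(assignee_schema)
```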
Graded Practice
Level 1 - Predict the Validation Result
For each snippet, predict: does it succeed or raise ValidationError? If it succeeds, what are the field values?
1a:
from pydantic import BaseModel

class Item(BaseModel):
    name: str
    count: int
    active: bool

item = Item(name=42, count="10", active=0)
print(item.name, item.count, item.active)
1b:
from pydantic import BaseModel, Field

class Product(BaseModel):
    price: float = Field(gt=0, lt=1000)

p = Product(price=0)
1c:
from pydantic import BaseModel
from typing import Optional

class Task(BaseModel):
    title: str
    priority: Optional[int] = None

t = Task(title="test")
print(t.model_dump(exclude_unset=True))
print(t.model_dump(exclude_none=True))
Show Answer
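1a - Raises ValidationError (one error, on name):
name: Input should be a valid string [type=string_type, ...]
- name=42: Pydantic v2 does not coerce int to str in lax mode, so this field fails (opt in with ConfigDict(coerce_numbers_to_str=True) if you want the conversion)
- count="10": str coerces to int → 10
- active=0: int coerces to bool → False
Note: active=1 would give True, but active=2 is rejected. Pydantic v2 accepts only 0 and 1 as integer booleans, not any truthy value.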
1b - Raises ValidationError:
price: Input should be greater than 0 [type=greater_than, ...]
gt=0 means strictly greater than - 0 itself is not allowed. Use ge=0 if you want to allow zero.
1c - Succeeds, two different outputs:
t.model_dump(exclude_unset=True)
# {'title': 'test'}
# "priority" was never set - it only has a default
t.model_dump(exclude_none=True)
# {'title': 'test'}
# "priority" is None - excluded
# Contrast:
t2 = Task(title="test", priority=None) # explicitly set to None
t2.model_dump(exclude_unset=True)
# {'title': 'test', 'priority': None} - priority was SET (to None)
t2.model_dump(exclude_none=True)
# {'title': 'test'} - None values excluded regardless
Level 2 - Debug the Validator
A developer built this model for a booking API. Find and fix all problems:
from pydantic import BaseModel, field_validator, model_validator
from datetime import datetime

class BookingCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    attendees: list[str]

    @field_validator("attendees")
    def validate_attendees(cls, v):  # Bug 1: missing @classmethod
        if len(v) == 0:
            raise ValueError("must have at least one attendee")
        if len(v) > 100:
            raise ValueError("cannot exceed 100 attendees")
        return v

    @model_validator(mode="after")
    def check_times(self):
        if self.end_time < self.start_time:  # Bug 2: should be <=, not <
            raise ValueError("end_time must be after start_time")
        return self

    @field_validator("start_time", mode="after")
    @classmethod
    def start_must_be_future(cls, v: datetime) -> datetime:
        if v < datetime.now():  # Bug 3: timezone-naive comparison
            raise ValueError("start_time must be in the future")
        return v
Show Answer
Bug 1 - Missing @classmethod on field_validator:
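Pydantic v2 documents field validators as class methods, and the explicit @classmethod is what type checkers and IDEs expect. At runtime, current v2 releases auto-wrap a validator whose first parameter is named cls, so this omission may not fail immediately; a first parameter named self does raise PydanticUserError (`@field_validator` cannot be applied to instance methods). Do not rely on the auto-wrapping. The fix:
@field_validator("attendees")
@classmethod
def validate_attendees(cls, v: list[str]) -> list[str]:
    if len(v) == 0:
        raise ValueError("must have at least one attendee")
    if len(v) > 100:
        raise ValueError("cannot exceed 100 attendees")
    return v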
Note: @field_validator goes first, @classmethod goes second. This is the required order in Pydantic v2.
Bug 2 - < should be <= in the time comparison:
A booking where start_time == end_time would be accepted - a zero-duration booking makes no sense. The condition should exclude equal times:
@model_validator(mode="after")
def check_times(self) -> "BookingCreate":
    if self.end_time <= self.start_time:  # <=: end must be strictly after start
        raise ValueError("end_time must be strictly after start_time")
    return self
Bug 3 - Timezone-naive comparison in start_must_be_future:
datetime.now() returns a naive datetime (no timezone info). If start_time is parsed from an ISO 8601 string with a timezone offset (e.g., "2024-03-15T12:00:00Z"), v will be timezone-aware. Comparing a timezone-aware datetime to a naive datetime raises TypeError: can't compare offset-naive and offset-aware datetimes.
The fix:
from datetime import datetime, timezone

@field_validator("start_time", mode="after")
@classmethod
def start_must_be_future(cls, v: datetime) -> datetime:
    if v.tzinfo is None:
        raise ValueError("start_time must be timezone-aware (include UTC offset or Z)")
    now = datetime.now(tz=timezone.utc)
    if v <= now:
        raise ValueError("start_time must be in the future")
    return v
Always use datetime.now(tz=timezone.utc) for the comparison baseline. Require all incoming datetimes to be timezone-aware - reject naive datetimes at the validation layer.
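Both halves of that advice can be demonstrated briefly. The sketch below first reproduces the TypeError with the standard library alone, then uses Pydantic v2's built-in AwareDatetime type as one alternative to a hand-written tzinfo check; the Slot model is a hypothetical example.

```python
from datetime import datetime, timezone

from pydantic import AwareDatetime, BaseModel, ValidationError

# 1. Comparing a naive datetime with an aware one fails at runtime
try:
    datetime(2030, 1, 1) < datetime.now(tz=timezone.utc)
    naive_compare_failed = False
except TypeError:
    naive_compare_failed = True


# 2. AwareDatetime rejects naive input at the validation layer
class Slot(BaseModel):
    """Hypothetical model that only accepts timezone-aware datetimes."""

    start: AwareDatetime


aware = Slot(start="2030-01-01T12:00:00Z")

try:
    Slot(start="2030-01-01T12:00:00")  # no offset, so the parsed value is naive
    naive_error = None
except ValidationError as exc:
    naive_error = exc.errors()[0]["type"]

print(naive_compare_failed, aware.start.tzinfo is not None, naive_error)
```

A custom validator is still needed for the "must be in the future" rule if you want to control the comparison baseline yourself.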
Level 3 - Design the Request and Response Models
You are building a task management API with these endpoints:
- POST /tasks - create a task
- GET /tasks/{id} - get a task (response includes computed fields not in create)
- PATCH /tasks/{id} - partial update
- POST /tasks/bulk - create up to 100 tasks in one request
Business rules:
- Title: 1–200 chars, stripped of whitespace
- Priority: 1–5 (default 2)
- Due date: must be in the future if provided; must be timezone-aware
- Assignee: optional UUID; if provided, must be a valid v4 UUID
- Tags: list of strings, each 1–50 chars, maximum 10 tags, all lowercase
- GET response includes: id (UUID), created_at (datetime UTC), updated_at (datetime UTC), title, priority, due_date, tags, assignee_id, status (an Enum)
- PATCH must only update provided fields, not set unprovided fields to null
- Bulk create: array of TaskCreate, minimum 1, maximum 100
Design all models with full validation, config, and correct types.
Show Answer
from pydantic import BaseModel, Field, ConfigDict, UUID4, field_validator, model_validator, RootModel
from typing import Annotated, Optional
from uuid import UUID, uuid4
from datetime import datetime, timezone
from enum import Enum

# ── Reusable annotated types ───────────────────────────────────────────────────
Title = Annotated[str, Field(min_length=1, max_length=200, description="Task title")]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]

class TaskStatus(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    CANCELLED = "cancelled"

# ── Request models ─────────────────────────────────────────────────────────────
class TaskCreate(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True)

    title: Title
    priority: Priority = 2
    due_date: Optional[datetime] = None
    assignee_id: Optional[UUID4] = None  # UUID4 rejects UUIDs that are not version 4
    tags: list[str] = Field(default_factory=list, max_length=10)

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v: list) -> list:
        """Lowercase and strip each tag before length validation."""
        if isinstance(v, list):
            return [str(tag).lower().strip() for tag in v]
        return v

    @field_validator("tags")
    @classmethod
    def validate_tag_lengths(cls, v: list[str]) -> list[str]:
        for tag in v:
            if not (1 <= len(tag) <= 50):
                raise ValueError(
                    f"each tag must be 1–50 characters, got '{tag}' ({len(tag)} chars)"
                )
        return v

    @field_validator("due_date", mode="after")
    @classmethod
    def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
        if v is None:
            return v
        if v.tzinfo is None:
            raise ValueError("due_date must be timezone-aware (e.g. '2024-12-01T12:00:00Z')")
        if v <= datetime.now(tz=timezone.utc):
            raise ValueError("due_date must be in the future")
        return v
class TaskUpdate(BaseModel):
    """All fields optional for PATCH - only provided fields are applied."""
    model_config = ConfigDict(str_strip_whitespace=True)

    title: Optional[Title] = None
    priority: Optional[Priority] = None
    due_date: Optional[datetime] = None
    assignee_id: Optional[UUID] = None
    tags: Optional[list[str]] = None
    status: Optional[TaskStatus] = None

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v) -> list | None:
        if v is None:
            return None
        return [str(tag).lower().strip() for tag in v]

    @field_validator("due_date", mode="after")
    @classmethod
    def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
        if v is None:
            return v
        if v.tzinfo is None:
            raise ValueError("due_date must be timezone-aware")
        if v <= datetime.now(tz=timezone.utc):
            raise ValueError("due_date must be in the future")
        return v
# ── Response model ─────────────────────────────────────────────────────────────
class TaskResponse(BaseModel):
    """Response model - reads from SQLAlchemy ORM object via from_attributes."""
    model_config = ConfigDict(from_attributes=True, use_enum_values=True)

    id: UUID
    title: str
    priority: int
    status: TaskStatus
    due_date: Optional[datetime]
    assignee_id: Optional[UUID]
    tags: list[str]
    created_at: datetime
    updated_at: datetime

    def to_json(self) -> str:
        """Serialize to JSON with ISO 8601 datetimes."""
        return self.model_dump_json()
# ── Bulk create ────────────────────────────────────────────────────────────────
class BulkTaskCreate(RootModel[list[TaskCreate]]):
    """Validates a JSON array of TaskCreate - minimum 1, maximum 100."""

    @model_validator(mode="after")
    def validate_count(self) -> "BulkTaskCreate":
        if len(self.root) == 0:
            raise ValueError("bulk create requires at least 1 task")
        if len(self.root) > 100:
            raise ValueError(
                f"bulk create allows at most 100 tasks, got {len(self.root)}"
            )
        return self
# ── FastAPI endpoints ──────────────────────────────────────────────────────────
# NOTE: `db` stands in for the application's data-access layer; it is not defined here.
from fastapi import FastAPI, HTTPException

app = FastAPI()


@app.post("/tasks", response_model=TaskResponse, status_code=201)
async def create_task(body: TaskCreate):
    # body is already validated: title stripped, tags lowercased, due_date timezone-aware
    task = await db.create_task(body.model_dump())
    return TaskResponse.model_validate(task)


@app.get("/tasks/{task_id}", response_model=TaskResponse)
async def get_task(task_id: UUID):
    task = await db.get_task(task_id)
    if not task:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
    return TaskResponse.model_validate(task)


@app.patch("/tasks/{task_id}", response_model=TaskResponse)
async def update_task(task_id: UUID, body: TaskUpdate):
    existing = await db.get_task(task_id)
    if not existing:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
    current = TaskResponse.model_validate(existing)
    # Only update fields the client actually provided
    update_data = body.model_dump(exclude_unset=True)
    updated = current.model_copy(update=update_data)
    await db.update_task(task_id, updated.model_dump(exclude={"id", "created_at", "updated_at"}))
    return updated


@app.post("/tasks/bulk", response_model=list[TaskResponse], status_code=201)
async def bulk_create_tasks(body: BulkTaskCreate):
    tasks = await db.bulk_create_tasks([t.model_dump() for t in body.root])
    return [TaskResponse.model_validate(t) for t in tasks]
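The PATCH merge can be exercised without FastAPI or a database. The sketch below uses two small hypothetical models, `Item` and `ItemUpdate` (stand-ins for `TaskResponse` and `TaskUpdate`, not part of the exercise), to compare a plain `model_dump()` with `model_dump(exclude_unset=True)` before merging through `model_copy(update=...)`.

```python
from typing import Optional

from pydantic import BaseModel


class Item(BaseModel):
    """Hypothetical stored record (stand-in for TaskResponse)."""
    title: str
    priority: int
    notes: Optional[str] = None


class ItemUpdate(BaseModel):
    """Hypothetical PATCH body: every field optional."""
    title: Optional[str] = None
    priority: Optional[int] = None
    notes: Optional[str] = None


stored = Item(title="Write docs", priority=3, notes="draft in progress")

# The client sends a new priority and an explicit null for notes.
# "title" is not sent at all.
patch = ItemUpdate.model_validate({"priority": 5, "notes": None})

full_dump = patch.model_dump()                    # includes the unsent title as None
sent_only = patch.model_dump(exclude_unset=True)  # only what the client sent

wrong = stored.model_copy(update=full_dump)   # overwrites title with None
merged = stored.model_copy(update=sent_only)  # keeps title, applies the rest

print(full_dump)
print(sent_only)
print(wrong.title, "|", merged.title)
```

An explicit `null` from the client survives `exclude_unset=True`, so a field can still be cleared on purpose. Note that `model_copy(update=...)` does not re-run validation on the updated values, which is why the update data should come from an already validated model such as `TaskUpdate`.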
Key design decisions:
- Separate TaskCreate and TaskUpdate: different validation semantics - TaskCreate has required fields and defaults; TaskUpdate has all-optional fields. Never use a single model for both.
- exclude_unset=True in PATCH: critical - otherwise PATCH sets every unspecified optional field to None in the database, destroying existing values.
- mode='before' for tag normalization: normalization (lowercase, strip) must run before the 1–50 character check in validate_tag_lengths, so the check sees the value that will actually be stored. A whitespace-only tag such as "   " strips to an empty string and is rejected instead of passing as a three-character tag.
- RootModel for bulk create: the top-level JSON is an array - RootModel is the correct Pydantic type for this. The model_validator on the root model validates the count.
- from_attributes=True only on TaskResponse: request models do not need ORM mode; only the response model reads from SQLAlchemy objects. Keeping them separate avoids accidental attribute access at the wrong layer.
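Two of these decisions are easy to check in isolation: the before/after validator ordering and the count check on a root model. The following is a minimal sketch using hypothetical `TaggedItem` and `TaggedBatch` models (reduced stand-ins for `TaskCreate` and `BulkTaskCreate`, with an illustrative batch limit of 3 instead of 100).

```python
from pydantic import (
    BaseModel,
    Field,
    RootModel,
    ValidationError,
    field_validator,
    model_validator,
)


class TaggedItem(BaseModel):
    """Hypothetical reduced version of TaskCreate: tags only."""
    tags: list[str] = Field(default_factory=list)

    @field_validator("tags", mode="before")
    @classmethod
    def normalize(cls, v):
        # Runs on the raw input, before Pydantic checks the declared type.
        if isinstance(v, list):
            return [str(t).strip().lower() for t in v]
        return v

    @field_validator("tags")
    @classmethod
    def check_lengths(cls, v: list[str]) -> list[str]:
        # Runs after normalization, on the already validated list[str].
        for t in v:
            if not (1 <= len(t) <= 50):
                raise ValueError(f"each tag must be 1-50 characters, got {t!r}")
        return v


class TaggedBatch(RootModel[list[TaggedItem]]):
    """Hypothetical bulk payload: a top-level JSON array of 1 to 3 items."""

    @model_validator(mode="after")
    def check_count(self) -> "TaggedBatch":
        if not (1 <= len(self.root) <= 3):
            raise ValueError(f"expected 1 to 3 items, got {len(self.root)}")
        return self


def is_rejected(model_cls, payload) -> bool:
    """Return True when validation of payload fails."""
    try:
        model_cls.model_validate(payload)
    except ValidationError:
        return True
    return False


normalized = TaggedItem.model_validate({"tags": ["  Python ", "API"]}).tags
blank_tag_rejected = is_rejected(TaggedItem, {"tags": ["   "]})
empty_batch_rejected = is_rejected(TaggedBatch, [])
oversized_batch_rejected = is_rejected(TaggedBatch, [{"tags": []}] * 4)
valid_batch = TaggedBatch.model_validate([{"tags": ["One"]}, {}])

print(normalized, blank_tag_rejected, empty_batch_rejected, oversized_batch_rejected)
```

Because normalization runs first, the length check only ever sees the stripped, lowercased form, which is also the form that gets stored.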
Key Takeaways
- Pydantic v2 coerces by default: "25" becomes 25 for an int field. To disable coercion, use ConfigDict(strict=True) for a whole model or Annotated[int, Field(strict=True)] for a single field.
- str does not validate email format: use EmailStr from pydantic[email]. The same applies to HttpUrl and IPvAnyAddress - always use the semantic type, not str, when you have a format requirement.
- Write @field_validator(...) first and @classmethod directly beneath it. Field validators in Pydantic v2 are class-level functions, and this is the documented decorator order. Do not swap the two: the reversed order is not what Pydantic documents, and the validator is not guaranteed to run.
- mode='before' for normalization, mode='after' for business rules: before-validators receive raw input (may not be the declared type); after-validators receive the already-coerced value.
- model_dump(exclude_unset=True) is essential for PATCH: without it, every optional field not sent by the client becomes None in the database, silently destroying data.
- model_dump(mode="json") vs model_dump(): only mode="json" produces a dict with JSON-serializable values. Plain model_dump() returns Python objects - passing it to json.dumps() raises TypeError for datetime, UUID, and Decimal fields.
- from_attributes=True (ORM mode) enables model_validate(orm_object) - Pydantic reads attributes directly instead of requiring a dict. Use it on response models, not request models.
- ConfigDict(str_strip_whitespace=True) on all request models: user input often carries accidental whitespace. Stripping it at the validation layer prevents data quality bugs in the database.
- Discriminated unions (Field(discriminator="type")) select the correct model directly from a literal field value. This is faster and produces clearer errors than a bare Union[A, B, C], where Pydantic has to try the members to find a match.
- Annotated types make constraints reusable: define Priority = Annotated[int, Field(ge=1, le=5)] once, use it across all models. Change the constraint in one place, apply everywhere.
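The coercion and serialization takeaways can be checked side by side. This is a small sketch with hypothetical `LaxEvent` and `StrictEvent` models (illustrative names, not part of the lesson's exercise): the same string input is accepted in the default mode and rejected in strict mode, and only the `mode="json"` dump survives `json.dumps()`.

```python
import json
from datetime import datetime
from uuid import UUID

from pydantic import BaseModel, ConfigDict, ValidationError


class LaxEvent(BaseModel):
    """Hypothetical model using Pydantic's default (coercing) mode."""
    id: UUID
    count: int
    at: datetime


class StrictEvent(BaseModel):
    """Hypothetical model with coercion disabled for every field."""
    model_config = ConfigDict(strict=True)
    count: int


# Default mode: strings are coerced to UUID, int, and datetime.
lax = LaxEvent(
    id="12345678-1234-5678-1234-567812345678",
    count="25",
    at="2030-01-01T00:00:00Z",
)

# Strict mode: the same string is rejected for an int field.
try:
    StrictEvent(count="25")
    strict_rejected = False
except ValidationError:
    strict_rejected = True

python_dump = lax.model_dump()           # values are UUID / datetime objects
json_dump = lax.model_dump(mode="json")  # values are plain strings and ints

try:
    json.dumps(python_dump)
    plain_dump_serializable = True
except TypeError:
    plain_dump_serializable = False

encoded = json.dumps(json_dump)  # works: every value is JSON-compatible

print(lax.count, strict_rejected, plain_dump_serializable)
print(encoded)
```

When the goal is a JSON string rather than a dict, `model_dump_json()` does both steps in one call.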
What's Next
You have completed Module 06 - APIs and Web Basics. You can now:
- Build HTTP APIs with FastAPI - routes, dependency injection, middleware, error handling
- Serialize production data with custom encoders, orjson, and Pydantic's JSON layer
- Validate request and response models with Pydantic v2 - constraints, custom validators, cross-field rules, ORM mode
Module 07 - Databases builds directly on this foundation:
- SQLAlchemy Core and ORM - the from_attributes=True pattern you used in Pydantic becomes the primary integration point; defining ORM models that Pydantic can validate directly
- Alembic migrations - database schema evolution, the same models evolve over time
- Async database access - async with session: patterns that work correctly in FastAPI's async endpoints
- Connection pooling - why engine = create_engine(...) is called once and shared, not once per request
- N+1 query problems - the SQLAlchemy lazy-loading issue referenced in Lesson 07 (JSON Serialization), solved with selectinload and joinedload
