Validation with Pydantic - Production Request and Response Models

Reading time: ~35 minutes | Level: Intermediate → Engineering

Before reading further, predict the output:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

u = User(name="Alice", age="25", email="not-an-email")
print(u.age)    # ?
print(u.email)  # ?

Output:

25
not-an-email

Two surprises in one model. First: age="25" coerces to 25 - Pydantic validates that "25" can become an int and does the conversion silently. This is type coercion, not type checking, and it is intentional. Second: email="not-an-email" is accepted without error - str means "any string," not "a valid email address." For email validation, you need EmailStr from pydantic[email].

This is the fundamental tension in Pydantic: it is very helpful by default (coercion) but only as strict as you ask it to be (no semantic validation unless you add it). This lesson shows you where that line is, how to move it, and what tools exist on both sides.

What You Will Learn

  • BaseModel: field declaration, type coercion vs strict mode
  • All built-in validated types: EmailStr, HttpUrl, UUID, IPvAnyAddress
  • Field(): every constraint that matters in production
  • field_validator and model_validator: single-field and cross-field validation
  • The complete Pydantic validation pipeline from raw data to model
  • Nested models and composition patterns
  • model_config with ConfigDict: ORM mode, whitespace stripping, enum handling
  • model_dump() and model_dump_json(): the mode='json' gotcha and serialization options
  • model_json_schema(): OpenAPI docs generation
  • Partial updates with model_copy(update=...) for PATCH endpoints
  • RootModel for validating lists and primitives
  • Annotated types for reusable constraints
  • Discriminated unions for polymorphic request bodies

Prerequisites

  • Lesson 04 (FastAPI) - Pydantic is FastAPI's request validation engine
  • Lesson 07 (JSON Serialization) - understanding what model_dump(mode='json') does
  • Basic Python type hints (str, int, list[str], Optional, Union)

Part 1 - BaseModel and Type Coercion

Pydantic v2's BaseModel is the foundation. Every field is declared as a class attribute with a type annotation:

from pydantic import BaseModel
from datetime import datetime
from uuid import UUID

class Task(BaseModel):
id: UUID
title: str
priority: int
created_at: datetime
completed: bool = False # field with default

Coercion: What Pydantic Does by Default

Pydantic attempts to coerce input values to the declared type:

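from pydantic import BaseModel

class Task(BaseModel):
    title: str
    priority: int
    active: bool

# Coercion examples
task = Task(
    title="Write tests",
    priority="5",  # numeric str coerces to int: 5
    active=1,      # 1 coerces to bool: True (only the ints 0 and 1 are accepted)
)
print(task.title)     # Write tests
print(task.priority)  # 5
print(task.active)    # True

# Coercion has limits: Pydantic v2 does not turn numbers into strings by default.
# Task(title=123, priority="5", active=1) raises ValidationError for title
# unless the model sets ConfigDict(coerce_numbers_to_str=True).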

Strict Mode: Disable Coercion

When you need exact types (e.g., an API that must reject "5" for an int field):

from pydantic import BaseModel, ConfigDict

class StrictTask(BaseModel):
model_config = ConfigDict(strict=True)

title: str
priority: int

# Now coercion is disabled
try:
StrictTask(title="write tests", priority="5")
except Exception as e:
print(e)
# priority: Input should be a valid integer [type=int_type, ...]

# Per-field strict mode (override model-level)
from pydantic import Field
from typing import Annotated

class HybridTask(BaseModel):
title: str # coercion enabled
priority: Annotated[int, Field(strict=True)] # this field: strict
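
Strictness can also be chosen per validation call instead of being fixed on the model. The sketch below assumes a hypothetical LaxTask model and uses the strict argument of model_validate; it is an illustration, not the only way to structure this:

```python
from pydantic import BaseModel, ValidationError


class LaxTask(BaseModel):  # hypothetical model, lax by default
    title: str
    priority: int


payload = {"title": "write tests", "priority": "5"}

# Default (lax) validation coerces the numeric string to an int.
lax = LaxTask.model_validate(payload)

# strict=True applies strict rules to this one call only.
try:
    LaxTask.model_validate(payload, strict=True)
    strict_error_types = []
except ValidationError as exc:
    strict_error_types = [err["type"] for err in exc.errors()]
```

This keeps one model definition while letting trusted internal callers stay lax and public endpoints opt in to strict checks.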

Part 2 - Built-In Validated Types

These are Pydantic types that enforce semantic correctness beyond Python's type system:

from pydantic import BaseModel, EmailStr, HttpUrl, AnyUrl, IPvAnyAddress, ValidationError
from uuid import UUID
from datetime import datetime, date
from typing import Optional

class UserProfile(BaseModel):
    id: UUID                                    # validates UUID format
    email: EmailStr                             # validates email syntax (requires pydantic[email])
    website: Optional[HttpUrl] = None           # validates URL with http/https scheme
    api_endpoint: Optional[AnyUrl] = None       # validates URL with any scheme
    ip_address: Optional[IPvAnyAddress] = None  # validates IPv4 or IPv6
    created_at: datetime                        # parses ISO 8601 strings
    birth_date: Optional[date] = None           # parses YYYY-MM-DD strings

# In Pydantic v2, Optional[X] only means "X or None". A field can be omitted
# only when it also has a default, hence the "= None" above.

# All of these parse correctly:
user = UserProfile(
    id="550e8400-e29b-41d4-a716-446655440000",  # str → UUID
    email="alice@example.com",
    website="https://alice.dev",
    ip_address="192.168.1.1",
    created_at="2024-03-15T12:00:00Z",  # str → datetime
    birth_date="1990-05-20",  # str → date
)
print(type(user.id))          # <class 'uuid.UUID'>
print(type(user.created_at))  # <class 'datetime.datetime'>

# This raises ValidationError (invalid id and invalid email):
try:
    UserProfile(id="not-a-uuid", email="not-an-email", created_at="2024-03-15")
except ValidationError as e:
    print(e)

Warning

EmailStr requires the email-validator package: pip install pydantic[email]. Without it, using EmailStr raises an ImportError telling you that email-validator is not installed. Add pydantic[email] to your pyproject.toml dependencies for any project that validates email addresses.

Part 3 - Field(): Every Constraint That Matters

Field() provides metadata and validation constraints for individual fields:

from pydantic import BaseModel, Field
from typing import Optional
from uuid import UUID
from datetime import datetime
import re

class TaskCreate(BaseModel):
# String constraints
title: str = Field(
min_length=1,
max_length=200,
description="Task title - must be non-empty",
examples=["Write unit tests", "Review PR #42"],
)
description: Optional[str] = Field(
default=None,
max_length=2000,
description="Optional detailed description",
)

# Numeric constraints
priority: int = Field(
default=2,
ge=1, # greater than or equal
le=5, # less than or equal
description="Priority 1 (lowest) to 5 (highest)",
)
estimated_hours: float = Field(
default=1.0,
gt=0, # strictly greater than - excludes 0
lt=168, # strictly less than 168 hours (one week)
)

# Pattern validation
reference_code: Optional[str] = Field(
default=None,
pattern=r"^TASK-\d{4,6}$", # e.g., "TASK-0042" or "TASK-123456"
description="Optional ticket reference code",
)

# Alias: accept a different key name in the input
due_date: Optional[datetime] = Field(
default=None,
alias="dueDate", # input JSON key is "dueDate", attribute is "due_date"
description="Optional due date in ISO 8601 format",
)

# default_factory: runs once per instance (never share mutable defaults)
tags: list[str] = Field(
default_factory=list,
description="List of tags",
)

# Title for OpenAPI schema display
assignee_id: Optional[UUID] = Field(
default=None,
title="Assignee User ID",
description="UUID of the user assigned to this task",
)

ge/le/gt/lt vs min_length/max_length

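| Constraint | Applies to | Meaning |
| --- | --- | --- |
| ge=N | numbers | value ≥ N (inclusive) |
| le=N | numbers | value ≤ N (inclusive) |
| gt=N | numbers | value > N (exclusive) |
| lt=N | numbers | value < N (exclusive) |
| min_length=N | str, list, bytes | length ≥ N |
| max_length=N | str, list, bytes | length ≤ N |
| pattern=r"..." | str | must match the regex; with the default regex engine the match is unanchored, so use ^ and $ to require a full match |

Tip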

Use model_config = ConfigDict(str_strip_whitespace=True) on request models. Form input often carries accidental leading or trailing whitespace. Without stripping, " Alice " passes min_length=1 validation and is stored as " Alice " in your database, and a whitespace-only value passes too. Stripping runs before the length check, so a single config line prevents this class of data quality bugs.
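
A small sketch of the stripping behaviour described in the tip, assuming a hypothetical NameForm model. It illustrates that the whitespace is removed before min_length is checked:

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class NameForm(BaseModel):  # hypothetical request model
    model_config = ConfigDict(str_strip_whitespace=True)

    name: str = Field(min_length=1)


# Surrounding whitespace is removed before the value is stored.
cleaned = NameForm(name="  Alice  ").name

# A whitespace-only value becomes "" and then fails min_length=1.
try:
    NameForm(name="   ")
    blank_rejected = False
except ValidationError:
    blank_rejected = True
```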

Part 4 - The Pydantic Validation Pipeline

Pydantic collects all errors before raising - one call gives you every problem, not just the first one:

from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
name: str = Field(min_length=1)
price: float = Field(gt=0)
stock: int = Field(ge=0)

try:
Product(name="", price=-5.0, stock=-1)
except ValidationError as e:
print(e.error_count()) # 3
for error in e.errors():
print(f" {'.'.join(str(x) for x in error['loc'])}: {error['msg']}")
# name: String should have at least 1 character
# price: Input should be greater than 0
# stock: Input should be greater than or equal to 0
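
Because every error arrives in one exception, the list from e.errors() can be reshaped into whatever an API client expects. The helper below, errors_by_field, is a hypothetical name introduced for illustration; it groups messages by dotted field path:

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)
    stock: int = Field(ge=0)


def errors_by_field(exc: ValidationError) -> dict[str, list[str]]:
    """Group error messages by dotted field path (hypothetical helper)."""
    grouped: dict[str, list[str]] = {}
    for err in exc.errors():
        path = ".".join(str(part) for part in err["loc"])
        grouped.setdefault(path, []).append(err["msg"])
    return grouped


try:
    Product(name="", price=-5.0, stock=-1)
    problems = {}
except ValidationError as exc:
    problems = errors_by_field(exc)
```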

Part 5 - field_validator: Single-Field Custom Validation

from pydantic import BaseModel, field_validator
from datetime import datetime, timezone

class BookingCreate(BaseModel):
title: str
start_time: datetime
end_time: datetime
guest_email: str

@field_validator("title")
@classmethod
def title_must_not_be_only_whitespace(cls, v: str) -> str:
"""mode='after' (default): runs after type coercion."""
if v.strip() == "":
raise ValueError("title must contain non-whitespace characters")
return v.strip() # validators can transform the value

@field_validator("guest_email", mode="before")
@classmethod
def normalize_email(cls, v) -> str:
"""mode='before': runs before type coercion - v may not be a str yet."""
if isinstance(v, str):
return v.lower().strip()
return v # let type coercion handle non-str input (will fail validation)

@field_validator("start_time", "end_time", mode="after")
@classmethod
def must_be_future(cls, v: datetime) -> datetime:
"""Same validator applied to multiple fields."""
now = datetime.now(tz=timezone.utc)
if v.tzinfo is None:
raise ValueError("datetime must be timezone-aware")
if v <= now:
raise ValueError("datetime must be in the future")
return v

mode='before' vs mode='after'

| Mode | When it runs | Receives | Use for |
| --- | --- | --- | --- |
| 'before' | Before type coercion | Raw input (any type) | Input normalization, format conversion |
| 'after' (default) | After type coercion | Already-coerced type | Business rule validation on the final value |

# These are methods to place inside a model class. They need:
# import re
# from datetime import date

# mode='before': raw value - use for normalization
@field_validator("phone", mode="before")
@classmethod
def strip_phone_formatting(cls, v):
    if isinstance(v, str):
        return re.sub(r"[\s\-\(\)]", "", v)  # "+1 (555) 123-4567" → "+15551234567"
    return v

# mode='after': typed value - use for business rules
@field_validator("birth_date")
@classmethod
def must_be_adult(cls, v: date) -> date:
    today = date.today()
    # Subtract one if this year's birthday has not happened yet
    # (dividing days by 365 drifts because of leap years)
    age = today.year - v.year - ((today.month, today.day) < (v.month, v.day))
    if age < 18:
        raise ValueError(f"User must be at least 18 years old (got age {age})")
    return v
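
To make the ordering concrete, here is a minimal sketch with a hypothetical Reading model that records what each validator receives. The calls list exists only for illustration:

```python
from pydantic import BaseModel, field_validator

calls: list[tuple[str, str]] = []  # (mode, type name) seen by each validator


class Reading(BaseModel):  # hypothetical model
    value: int

    @field_validator("value", mode="before")
    @classmethod
    def strip_raw(cls, v):
        # Runs first and sees the raw input, here a str.
        calls.append(("before", type(v).__name__))
        return v.strip() if isinstance(v, str) else v

    @field_validator("value")
    @classmethod
    def non_negative(cls, v: int) -> int:
        # Runs after coercion and sees an int.
        calls.append(("after", type(v).__name__))
        if v < 0:
            raise ValueError("value must be non-negative")
        return v


reading = Reading(value=" 42 ")
```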

Part 6 - model_validator: Cross-Field Validation

When validation requires comparing multiple fields, use @model_validator:

from pydantic import BaseModel, model_validator
from datetime import datetime
from typing import Optional, Self  # Self needs Python 3.11+; on older versions import it from typing_extensions

class EventCreate(BaseModel):
title: str
start_time: datetime
end_time: datetime
max_attendees: Optional[int] = None
waitlist_enabled: bool = False

@model_validator(mode="after")
def end_must_be_after_start(self) -> Self:
"""Cross-field validation: end_time must be after start_time."""
if self.end_time <= self.start_time:
raise ValueError(
f"end_time ({self.end_time.isoformat()}) must be after "
f"start_time ({self.start_time.isoformat()})"
)
return self

@model_validator(mode="after")
def waitlist_requires_max_attendees(self) -> Self:
"""Another cross-field rule: waitlist only makes sense with a capacity limit."""
if self.waitlist_enabled and self.max_attendees is None:
raise ValueError(
"waitlist_enabled=True requires max_attendees to be set"
)
return self
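
One consequence worth knowing when you surface these errors to clients: an error raised from a model-level validator is not attached to any single field. The sketch below uses a hypothetical, trimmed-down Capacity model to inspect the error's location:

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class Capacity(BaseModel):  # hypothetical, trimmed-down event model
    max_attendees: Optional[int] = None
    waitlist_enabled: bool = False

    @model_validator(mode="after")
    def waitlist_requires_capacity(self):
        if self.waitlist_enabled and self.max_attendees is None:
            raise ValueError("waitlist_enabled=True requires max_attendees")
        return self


try:
    Capacity(waitlist_enabled=True)
    model_errors = []
except ValidationError as exc:
    model_errors = exc.errors()

# The location tuple is empty because the error belongs to the whole model.
model_error_loc = model_errors[0]["loc"] if model_errors else None
```

If a client needs to highlight a specific input, the error message itself has to name the fields involved.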

mode='before' for Model Validators

mode='before' receives the raw input before any field processing - useful for data transformation:

from decimal import Decimal
from pydantic import BaseModel, model_validator

class LegacyOrderAdapter(BaseModel):
    """Accepts either new format or legacy format from old API."""
    order_id: str
    total_cents: int  # internal representation

    @model_validator(mode="before")
    @classmethod
    def handle_legacy_format(cls, data):
        """Normalize legacy 'amount_dollars' to 'total_cents'."""
        if not isinstance(data, dict):
            return data  # not a dict (e.g. a model instance): leave it to Pydantic
        data = dict(data)  # copy so the caller's dict is not mutated
        if "amount_dollars" in data and "total_cents" not in data:
            # Decimal avoids float rounding: int(float("19.99") * 100) gives 1998
            data["total_cents"] = int(Decimal(str(data.pop("amount_dollars"))) * 100)
        if "orderId" in data and "order_id" not in data:
            data["order_id"] = data.pop("orderId")
        return data

# Works with both formats:
LegacyOrderAdapter(order_id="ORD-001", total_cents=999)
LegacyOrderAdapter(orderId="ORD-002", amount_dollars="19.99")  # total_cents is 1999
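
When the legacy difference is only a renamed key, a declarative alias can replace the custom validator. This is a sketch of that alternative, assuming a hypothetical OrderIn model; AliasChoices lists every input key the field accepts:

```python
from pydantic import AliasChoices, BaseModel, Field


class OrderIn(BaseModel):  # hypothetical model
    # Accept either the current key or the legacy camelCase key.
    order_id: str = Field(validation_alias=AliasChoices("order_id", "orderId"))
    total_cents: int


new_style = OrderIn.model_validate({"order_id": "ORD-001", "total_cents": 999})
legacy = OrderIn.model_validate({"orderId": "ORD-002", "total_cents": 1999})
```

A before-validator remains the better fit when values need converting, as with the dollars-to-cents case.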

Part 7 - Nested Models and Composition

from pydantic import BaseModel, Field, EmailStr
from typing import Optional
from uuid import UUID, uuid4
from datetime import datetime, timezone

class Address(BaseModel):
street: str
city: str
country: str = Field(min_length=2, max_length=2, description="ISO 3166-1 alpha-2 country code")
postal_code: Optional[str] = None

class ContactInfo(BaseModel):
email: EmailStr
phone: Optional[str] = None
address: Optional[Address] = None

class UserCreate(BaseModel):
name: str = Field(min_length=1, max_length=100)
contact: ContactInfo # nested model
preferences: dict[str, str] = Field(default_factory=dict)

# Validation cascades through nested models.
# Pass dicts so that UserCreate validates the whole tree; constructing
# Address(...) directly would raise at that call, with the error located
# at "country" rather than "contact.address.country".
try:
    UserCreate(
        name="Alice",
        contact={
            "email": "alice@example.com",
            "address": {
                "street": "123 Main St",
                "city": "Springfield",
                "country": "USA",  # 3 letters - violates max_length=2
            },
        },
    )
except Exception as e:
    print(e)
# contact.address.country: String should have at most 2 characters

# Valid nested model - input can use dicts, no need to pre-construct nested models
user = UserCreate(
    name="Alice",
    contact={  # dict is automatically validated into ContactInfo
        "email": "alice@example.com",
        "address": {
            "street": "123 Main St",
            "city": "Springfield",
            "country": "US",
        },
    },
)
print(type(user.contact))          # <class '__main__.ContactInfo'>
print(type(user.contact.address))  # <class '__main__.Address'>
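
The same cascade applies to lists of nested models, where the error location also includes the list index. A minimal sketch with hypothetical Order and LineItem models:

```python
from pydantic import BaseModel, Field, ValidationError


class LineItem(BaseModel):  # hypothetical models for illustration
    sku: str
    quantity: int = Field(gt=0)


class Order(BaseModel):
    items: list[LineItem]


try:
    Order(items=[{"sku": "A1", "quantity": 2}, {"sku": "B2", "quantity": 0}])
    bad_paths = []
except ValidationError as exc:
    # Each loc is a tuple of field names and list indexes.
    bad_paths = [err["loc"] for err in exc.errors()]
```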

Part 8 - model_config with ConfigDict

ConfigDict controls model behavior globally:

from pydantic import BaseModel, ConfigDict, Field
from enum import Enum

class TaskStatus(str, Enum):
PENDING = "pending"
DONE = "done"

class TaskRequest(BaseModel):
    model_config = ConfigDict(
        # Strip whitespace from all str fields - prevents " Alice " in the DB
        str_strip_whitespace=True,

        # Lowercase all str fields (useful for email normalization)
        # str_to_lower=True, # uncomment if you want lowercase

        # Store the enum's value (e.g. "pending") on the model instead of the enum member
        use_enum_values=True,

        # ORM mode: read attributes from SQLAlchemy model instances
        from_attributes=True,

        # Accept the field name as well as the alias in input (default: alias only when an alias is set)
        populate_by_name=True,

        # Validate default values (not just user-supplied values)
        validate_default=False, # True adds overhead - use only when needed
    )

    title: str
    status: TaskStatus = TaskStatus.PENDING

# str_strip_whitespace in action
task = TaskRequest(title=" Write tests ", status="pending")
print(repr(task.title))   # 'Write tests' - stripped
print(repr(task.status))  # 'pending' - use_enum_values stores the value, not the Enum member
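
One interaction between these options deserves a check in your own code: use_enum_values is applied during validation, and defaults are not validated unless validate_default=True. The sketch below, with a hypothetical Status enum and two hypothetical models, compares the two settings:

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict


class Status(str, Enum):  # hypothetical enum
    PENDING = "pending"
    DONE = "done"


class Unvalidated(BaseModel):
    model_config = ConfigDict(use_enum_values=True)
    status: Status = Status.PENDING


class Validated(BaseModel):
    model_config = ConfigDict(use_enum_values=True, validate_default=True)
    status: Status = Status.PENDING


raw_default = Unvalidated().status            # default skips validation
converted_default = Validated().status        # default goes through validation
explicit = Unvalidated(status="done").status  # supplied input is always validated
```

If downstream code relies on plain strings, enabling validate_default on such models keeps the default consistent with supplied values.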

ORM Mode (from_attributes=True)

from pydantic import BaseModel, ConfigDict
from sqlalchemy import Column, String, Integer
from sqlalchemy.orm import DeclarativeBase

class Base(DeclarativeBase):
pass

class TaskORM(Base):
__tablename__ = "tasks"
id = Column(Integer, primary_key=True)
title = Column(String)
priority = Column(Integer)

class TaskResponse(BaseModel):
model_config = ConfigDict(from_attributes=True)

id: int
title: str
priority: int

# Without from_attributes=True, model_validate(orm_task) raises ValidationError:
# the input is neither a dict nor a TaskResponse instance.
orm_task = TaskORM(id=1, title="Write tests", priority=3)
response = TaskResponse.model_validate(orm_task) # reads attributes, not dict
print(response.id) # 1
print(response.title) # Write tests

Note

from_attributes=True (ORM mode) lets Pydantic read SQLAlchemy model attributes directly using model_validate(orm_object). Without it, you must convert the ORM object to a dict first, which requires knowing which attributes to include and risks triggering lazy-loaded relationships. With from_attributes=True, Pydantic accesses only the attributes it needs for the declared fields.
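
from_attributes is not tied to SQLAlchemy; it reads attributes from any object. The sketch below uses a plain class, TaskRow, as a hypothetical stand-in for an ORM row so that it runs without a database:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class TaskRow:  # hypothetical stand-in for an ORM row
    def __init__(self, id: int, title: str, priority: int) -> None:
        self.id = id
        self.title = title
        self.priority = priority
        self.internal_notes = "not part of the API"


class TaskOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    title: str
    priority: int


class TaskOutNoAttrs(BaseModel):  # same fields, attribute reading not enabled
    id: int
    title: str
    priority: int


row = TaskRow(id=1, title="Write tests", priority=3)
out = TaskOut.model_validate(row)  # only the declared fields are read

try:
    TaskOutNoAttrs.model_validate(row)
    rejected = False
except ValidationError:
    rejected = True
```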

Part 9 - model_dump() and model_dump_json()

from pydantic import BaseModel, ConfigDict, Field
from datetime import datetime, timezone
from uuid import UUID, uuid4
from typing import Optional

class TaskResponse(BaseModel):
id: UUID = Field(default_factory=uuid4)
title: str
created_at: datetime = Field(default_factory=lambda: datetime.now(tz=timezone.utc))
completed: bool = False
secret: Optional[str] = None # never expose this

task = TaskResponse(title="Write tests", secret="internal-note")

# model_dump() → Python dict (datetime stays as datetime, UUID stays as UUID)
d = task.model_dump()
print(type(d["id"])) # <class 'uuid.UUID'>
print(type(d["created_at"])) # <class 'datetime.datetime'>
# Passing this to json.dumps() without a custom encoder → TypeError

# model_dump(mode="json") → dict with JSON-serializable values
d_json = task.model_dump(mode="json")
print(type(d_json["id"])) # <class 'str'> - UUID serialized to string
print(type(d_json["created_at"])) # <class 'str'> - datetime serialized to ISO 8601

# model_dump_json() → JSON string directly (fastest, bypasses dict step)
json_str = task.model_dump_json()
print(json_str)

# Exclude fields from output
d = task.model_dump(exclude={"secret"})
# id, title, created_at, completed - no "secret"

# Exclude None values (useful for sparse responses)
task2 = TaskResponse(title="No secret task")
task2.model_dump(exclude_none=True) # "secret" omitted because it's None

# Exclude fields that were not explicitly set (useful for PATCH responses)
task3 = TaskResponse(title="Explicit title")
task3.model_dump(exclude_unset=True) # only "title" - id, created_at, completed and secret were not set explicitly

# Use alias in output
class AliasedTask(BaseModel):
task_id: UUID = Field(alias="taskId")

model_config = ConfigDict(populate_by_name=True)

at = AliasedTask(task_id=uuid4())
at.model_dump() # {"task_id": ...} - uses field name
at.model_dump(by_alias=True) # {"taskId": ...} - uses alias
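
The difference between the dump modes shows up as soon as the result is handed to the standard json module. A minimal sketch with a hypothetical Event model and fixed values so the result is reproducible:

```python
import json
from datetime import datetime, timezone
from uuid import UUID

from pydantic import BaseModel


class Event(BaseModel):  # hypothetical model
    id: UUID
    at: datetime


event = Event(
    id=UUID("550e8400-e29b-41d4-a716-446655440000"),
    at=datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc),
)

# Python mode keeps UUID and datetime objects, which json.dumps rejects.
try:
    json.dumps(event.model_dump())
    python_mode_ok = True
except TypeError:
    python_mode_ok = False

# JSON mode converts them to strings first.
json_mode_text = json.dumps(event.model_dump(mode="json"))

# model_dump_json output can be parsed straight back into the model.
round_trip = Event.model_validate_json(event.model_dump_json())
```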

Part 10 - Partial Updates: PATCH Endpoint Pattern

PATCH requests update only specified fields. Pydantic handles this with optional fields + model_copy:

from pydantic import BaseModel, ConfigDict
from typing import Optional
from uuid import UUID

class TaskUpdate(BaseModel):
"""All fields optional - only provided fields are updated."""
title: Optional[str] = None
priority: Optional[int] = None
completed: Optional[bool] = None
assignee_id: Optional[UUID] = None

class Task(BaseModel):
id: UUID
title: str
priority: int
completed: bool = False
assignee_id: Optional[UUID] = None

# PATCH endpoint pattern
async def patch_task(task_id: UUID, updates: TaskUpdate, db) -> Task:
existing = await db.get_task(task_id) # fetch current state

# Convert existing ORM object to Pydantic model
current = Task.model_validate(existing)

# Apply only the fields that were explicitly provided in the request
# exclude_unset=True: only fields the client sent, not all Optional=None fields
update_data = updates.model_dump(exclude_unset=True)

# model_copy(update=...) produces a new model with specified fields replaced
updated = current.model_copy(update=update_data)

await db.update_task(task_id, updated.model_dump(exclude={"id"}))
return updated

# Example:
# PATCH /tasks/123 {"title": "Updated title"}
# update_data = {"title": "Updated title"} - only title, not priority/completed/assignee_id
# current.model_copy(update={"title": "Updated title"})
# → Task with new title, all other fields unchanged

Tip

The key is model_dump(exclude_unset=True): it returns only fields that were explicitly included in the request body - not all fields that happen to be None. If the client sends {"title": "New"}, then exclude_unset=True returns {"title": "New"}, not {"title": "New", "priority": None, "completed": None, "assignee_id": None}. Without exclude_unset=True, a PATCH would set every unspecified optional field to None in the database.
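
One caveat with this pattern: model_copy(update=...) does not validate the new values against the target model. The sketch below uses trimmed-down, hypothetical versions of Task and TaskUpdate to show the gap and one possible way to close it by re-validating the merged data:

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Task(BaseModel):  # trimmed-down version for illustration
    title: str
    priority: int = Field(default=2, ge=1, le=5)


class TaskUpdate(BaseModel):
    title: Optional[str] = None
    priority: Optional[int] = None  # note: no ge/le constraint here


current = Task(title="Write tests", priority=3)
update_data = TaskUpdate(priority=99).model_dump(exclude_unset=True)

# model_copy applies the update without running Task's validation.
unchecked = current.model_copy(update=update_data)

# Re-validating the merged data enforces Task's constraints again.
try:
    Task.model_validate({**current.model_dump(), **update_data})
    merged_ok = True
except ValidationError:
    merged_ok = False
```

Putting the same constraints on the update model is another option; re-validation has the advantage of also running cross-field validators on the merged state.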

Part 11 - RootModel for Lists and Primitives

When you need to validate a JSON array or a single value at the top level:

from pydantic import RootModel, Field
from typing import Annotated

# Validate a list of strings
class TagList(RootModel[list[str]]):
root: list[str] = Field(min_length=1, max_length=20)

tags = TagList.model_validate(["python", "fastapi", "pydantic"])
print(tags.root) # ['python', 'fastapi', 'pydantic']
tags.model_dump_json() # '["python","fastapi","pydantic"]'

# Validate a list of UUIDs
from uuid import UUID

class BulkTaskIds(RootModel[list[UUID]]):
pass

# DELETE /tasks with body ["uuid1", "uuid2", "uuid3"]
ids = BulkTaskIds.model_validate([
"550e8400-e29b-41d4-a716-446655440000",
"6ba7b810-9dad-11d1-80b4-00c04fd430c8",
])
print(ids.root) # [UUID('550e8400...'), UUID('6ba7b810...')]

# Validate a primitive with constraints
class PositiveScore(RootModel[Annotated[int, Field(ge=0, le=100)]]):
pass

score = PositiveScore.model_validate(87)
print(score.root) # 87

Part 12 - Annotated Types for Reusable Constraints

Instead of repeating Field(ge=1, le=5) everywhere, define reusable annotated types:

from typing import Annotated
from pydantic import Field, BaseModel
from uuid import UUID

# Reusable constrained types
PositiveInt = Annotated[int, Field(gt=0)]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]
ShortStr = Annotated[str, Field(min_length=1, max_length=200)]
NonEmptyStr = Annotated[str, Field(min_length=1)]
Percentage = Annotated[float, Field(ge=0.0, le=100.0)]
TaskRef = Annotated[str, Field(pattern=r"^TASK-\d{4,6}$")]

# Use them across multiple models
class TaskCreate(BaseModel):
title: ShortStr
description: Annotated[str, Field(max_length=2000)] = ""
priority: Priority = 2
estimated_hours: PositiveInt = 1

class ProjectCreate(BaseModel):
name: ShortStr
completion_percentage: Percentage = 0.0
default_priority: Priority = 2

class TaskUpdate(BaseModel):
title: ShortStr | None = None
priority: Priority | None = None

# The constraint is defined once - changing PositiveInt changes it everywhere
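
Reusable annotated types can also be validated on their own, without wrapping them in a model. A minimal sketch using TypeAdapter with a Priority alias like the one defined above:

```python
from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

Priority = Annotated[int, Field(ge=1, le=5)]

priority_adapter = TypeAdapter(Priority)

# Lax mode coerces the numeric string, then applies ge/le.
ok_value = priority_adapter.validate_python("3")

try:
    priority_adapter.validate_python(9)
    out_of_range = False
except ValidationError:
    out_of_range = True
```

This is handy for validating query parameters, configuration values, or unit-testing a constraint in isolation.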

Part 13 - Discriminated Unions for Polymorphic Bodies

When an endpoint accepts different payload shapes depending on a type field:

from pydantic import BaseModel, Field
from typing import Literal, Union, Annotated
from uuid import UUID

class EmailNotification(BaseModel):
type: Literal["email"]
recipient: str
subject: str
body: str

class SlackNotification(BaseModel):
type: Literal["slack"]
channel: str
message: str
thread_ts: str | None = None

class WebhookNotification(BaseModel):
type: Literal["webhook"]
url: str
payload: dict
secret: str | None = None

# Discriminated union: Pydantic uses "type" field to pick the correct model
Notification = Annotated[
Union[EmailNotification, SlackNotification, WebhookNotification],
Field(discriminator="type"),
]

class SendNotificationRequest(BaseModel):
notification: Notification
scheduled_at: str | None = None

# FastAPI endpoint
from fastapi import FastAPI
app = FastAPI()

@app.post("/notifications")
async def send_notification(request: SendNotificationRequest):
match request.notification:
case EmailNotification():
return {"action": "email", "to": request.notification.recipient}
case SlackNotification():
return {"action": "slack", "channel": request.notification.channel}
case WebhookNotification():
return {"action": "webhook", "url": request.notification.url}

# Input: {"notification": {"type": "email", "recipient": "alice@example.com", "subject": "Hi", "body": "..."}}
# Pydantic sees type="email" → validates as EmailNotification
# Input: {"notification": {"type": "slack", "channel": "#general", "message": "Hello"}}
# Pydantic sees type="slack" → validates as SlackNotification

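Without a discriminator, Pydantic's default "smart" union mode validates the input against the member models and picks the best match, which is slower for large unions and reports errors for every member that failed. With a discriminator, it reads the type field and validates against that one model only, so dispatch is effectively a constant-time lookup and errors refer to the model the client intended.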
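
The dispatch can be exercised without FastAPI by validating plain dicts through a TypeAdapter. The sketch below uses hypothetical, trimmed-down EmailMsg and SlackMsg models and also shows what happens when the tag is unknown:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class EmailMsg(BaseModel):  # hypothetical, trimmed-down models
    type: Literal["email"]
    recipient: str


class SlackMsg(BaseModel):
    type: Literal["slack"]
    channel: str


Message = Annotated[Union[EmailMsg, SlackMsg], Field(discriminator="type")]
adapter = TypeAdapter(Message)

# The "type" value selects the model to validate against.
slack = adapter.validate_python({"type": "slack", "channel": "#general"})

# An unknown tag produces one error about the tag, not one per member model.
try:
    adapter.validate_python({"type": "sms", "number": "555"})
    tag_errors = []
except ValidationError as exc:
    tag_errors = [err["type"] for err in exc.errors()]
```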

Part 14 - model_json_schema() and OpenAPI Integration

Pydantic generates JSON Schema from your models, which FastAPI uses to build OpenAPI docs:

from pydantic import BaseModel, Field
from typing import Optional
import json

class TaskCreate(BaseModel):
title: str = Field(
min_length=1,
max_length=200,
description="Task title",
examples=["Write unit tests"],
)
priority: int = Field(
default=2,
ge=1,
le=5,
description="Priority 1 (lowest) to 5 (highest)",
)
assignee_id: Optional[str] = Field(
default=None,
description="Assignee user ID",
)

schema = TaskCreate.model_json_schema()
print(json.dumps(schema, indent=2))
# {
# "type": "object",
# "title": "TaskCreate",
# "properties": {
# "title": {
# "type": "string",
# "minLength": 1,
# "maxLength": 200,
# "description": "Task title",
# "examples": ["Write unit tests"]
# },
# "priority": {
# "type": "integer",
# "minimum": 1,
# "maximum": 5,
# "default": 2,
# "description": "Priority 1 (lowest) to 5 (highest)"
# },
# ...
# },
# "required": ["title"]
# }

FastAPI generates GET /openapi.json and renders it at GET /docs (Swagger UI) and GET /redoc automatically. Every Field(description=..., examples=[...]) appears in the Swagger docs.

Graded Practice

Level 1 - Predict the Validation Result

For each snippet, predict: does it succeed or raise ValidationError? If it succeeds, what are the field values?

1a:

from pydantic import BaseModel

class Item(BaseModel):
name: str
count: int
active: bool

item = Item(name=42, count="10", active=0)
print(item.name, item.count, item.active)

1b:

from pydantic import BaseModel, Field

class Product(BaseModel):
price: float = Field(gt=0, lt=1000)

p = Product(price=0)

1c:

from pydantic import BaseModel
from typing import Optional

class Task(BaseModel):
title: str
priority: Optional[int] = None

t = Task(title="test")
print(t.model_dump(exclude_unset=True))
print(t.model_dump(exclude_none=True))
Show Answer

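1a - Raises ValidationError in Pydantic v2:

name: Input should be a valid string [type=string_type, ...]
  • name=42: rejected - Pydantic v2 does not coerce int to str unless the model sets coerce_numbers_to_str=True
  • count="10": str coerces to int 10
  • active=0: int coerces to bool False

Note: with name="42" the model validates and prints 42 10 False. For bool fields, the only ints accepted are 0 and 1: active=1 gives True, but active=2 raises ValidationError.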

1b - Raises ValidationError:

price: Input should be greater than 0 [type=greater_than, ...]

gt=0 means strictly greater than - 0 itself is not allowed. Use ge=0 if you want to allow zero.

1c - Succeeds, two different outputs:

t.model_dump(exclude_unset=True)
# {'title': 'test'}
# "priority" was never set - it only has a default

t.model_dump(exclude_none=True)
# {'title': 'test'}
# "priority" is None - excluded

# Contrast:
t2 = Task(title="test", priority=None) # explicitly set to None
t2.model_dump(exclude_unset=True)
# {'title': 'test', 'priority': None} - priority was SET (to None)
t2.model_dump(exclude_none=True)
# {'title': 'test'} - None values excluded regardless

Level 2 - Debug the Validator

A developer built this model for a booking API. Find and fix all problems:

from pydantic import BaseModel, field_validator, model_validator
from datetime import datetime

class BookingCreate(BaseModel):
title: str
start_time: datetime
end_time: datetime
attendees: list[str]

@field_validator("attendees")
def validate_attendees(cls, v): # Bug 1: missing @classmethod
if len(v) == 0:
raise ValueError("must have at least one attendee")
if len(v) > 100:
raise ValueError("cannot exceed 100 attendees")
return v

@model_validator(mode="after")
def check_times(self):
if self.end_time < self.start_time: # Bug 2: should be <=, not <
raise ValueError("end_time must be after start_time")
return self

@field_validator("start_time", mode="after")
@classmethod
def start_must_be_future(cls, v: datetime) -> datetime:
if v < datetime.now(): # Bug 3: timezone-naive comparison
raise ValueError("start_time must be in the future")
return v
Show Answer

Bug 1 - Missing @classmethod on field_validator:

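In Pydantic v2, @field_validator methods are class methods. When @classmethod is missing, Pydantic still wraps a function whose first parameter is cls, so this code may run, but the explicit decorator is what type checkers and readers rely on; if the first parameter is named self, Pydantic raises PydanticUserError. The fix: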

@field_validator("attendees")
@classmethod
def validate_attendees(cls, v: list[str]) -> list[str]:
if len(v) == 0:
raise ValueError("must have at least one attendee")
if len(v) > 100:
raise ValueError("cannot exceed 100 attendees")
return v

Note: @field_validator goes first, @classmethod goes second. This is the required order in Pydantic v2.

Bug 2 - < should be <= in the time comparison:

A booking where start_time == end_time would be accepted - a zero-duration booking makes no sense. The condition should exclude equal times:

@model_validator(mode="after")
def check_times(self) -> "BookingCreate":
if self.end_time <= self.start_time: # <=: end must be strictly after start
raise ValueError("end_time must be strictly after start_time")
return self

Bug 3 - Timezone-naive comparison in start_must_be_future:

datetime.now() returns a naive datetime (no timezone info). If start_time is parsed from an ISO 8601 string with a timezone offset (e.g., "2024-03-15T12:00:00Z"), v will be timezone-aware. Comparing a timezone-aware datetime to a naive datetime raises TypeError: can't compare offset-naive and offset-aware datetimes.

The fix:

from datetime import datetime, timezone

@field_validator("start_time", mode="after")
@classmethod
def start_must_be_future(cls, v: datetime) -> datetime:
if v.tzinfo is None:
raise ValueError("start_time must be timezone-aware (include UTC offset or Z)")
now = datetime.now(tz=timezone.utc)
if v <= now:
raise ValueError("start_time must be in the future")
return v

Always use datetime.now(tz=timezone.utc) for the comparison baseline. Require all incoming datetimes to be timezone-aware - reject naive datetimes at the validation layer.

Level 3 - Design the Request and Response Models

You are building a task management API with these endpoints:

  • POST /tasks - create a task
  • GET /tasks/{id} - get a task (response includes computed fields not in create)
  • PATCH /tasks/{id} - partial update
  • POST /tasks/bulk - create up to 100 tasks in one request

Business rules:

  • Title: 1–200 chars, stripped of whitespace
  • Priority: 1–5 (default 2)
  • Due date: must be in the future if provided; must be timezone-aware
  • Assignee: optional UUID; if provided, must be a valid v4 UUID
  • Tags: list of strings, each 1–50 chars, maximum 10 tags, all lowercase
  • GET response includes: id (UUID), created_at (datetime UTC), updated_at (datetime UTC), title, priority, due_date, tags, assignee_id, status (an Enum)
  • PATCH must only update provided fields, not set unprovided fields to null
  • Bulk create: array of TaskCreate, minimum 1, maximum 100

Design all models with full validation, config, and correct types.

Show Answer
from pydantic import BaseModel, Field, ConfigDict, field_validator, model_validator, RootModel
from typing import Annotated, Optional
from uuid import UUID, uuid4
from datetime import datetime, timezone
from enum import Enum

# ── Reusable annotated types ───────────────────────────────────────────────────

Title = Annotated[str, Field(min_length=1, max_length=200, description="Task title")]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]

class TaskStatus(str, Enum):
PENDING = "pending"
IN_PROGRESS = "in_progress"
DONE = "done"
CANCELLED = "cancelled"

# ── Request models ─────────────────────────────────────────────────────────────

class TaskCreate(BaseModel):
model_config = ConfigDict(str_strip_whitespace=True)

title: Title
priority: Priority = 2
due_date: Optional[datetime] = None
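assignee_id: Optional[UUID] = None # plain UUID accepts any version; use pydantic's UUID4 type (from pydantic import UUID4) to enforce the v4 rule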
tags: list[str] = Field(default_factory=list, max_length=10)

@field_validator("tags", mode="before")
@classmethod
def normalize_tags(cls, v: list) -> list:
"""Lowercase and strip each tag before length validation."""
if isinstance(v, list):
return [str(tag).lower().strip() for tag in v]
return v

@field_validator("tags")
@classmethod
def validate_tag_lengths(cls, v: list[str]) -> list[str]:
for tag in v:
if not (1 <= len(tag) <= 50):
raise ValueError(
f"each tag must be 1–50 characters, got '{tag}' ({len(tag)} chars)"
)
return v

@field_validator("due_date", mode="after")
@classmethod
def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
if v is None:
return v
if v.tzinfo is None:
raise ValueError("due_date must be timezone-aware (e.g. '2024-12-01T12:00:00Z')")
if v <= datetime.now(tz=timezone.utc):
raise ValueError("due_date must be in the future")
return v


class TaskUpdate(BaseModel):
    """All fields optional for PATCH - only provided fields are applied."""
    model_config = ConfigDict(str_strip_whitespace=True)

    title: Optional[Title] = None
    priority: Optional[Priority] = None
    due_date: Optional[datetime] = None
    assignee_id: Optional[UUID] = None
    tags: Optional[list[str]] = None
    status: Optional[TaskStatus] = None

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v) -> list | None:
        # Only normalize real lists. Iterating a bare string here would
        # split it into single characters and accept it as a tag list.
        if isinstance(v, list):
            return [str(tag).lower().strip() for tag in v]
        return v

    @field_validator("due_date", mode="after")
    @classmethod
    def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
        if v is None:
            return v
        if v.tzinfo is None:
            raise ValueError("due_date must be timezone-aware")
        if v <= datetime.now(tz=timezone.utc):
            raise ValueError("due_date must be in the future")
        return v


# ── Response model ─────────────────────────────────────────────────────────────

class TaskResponse(BaseModel):
    """Response model - reads from SQLAlchemy ORM object via from_attributes."""
    model_config = ConfigDict(from_attributes=True, use_enum_values=True)

    id: UUID
    title: str
    priority: int
    status: TaskStatus
    due_date: Optional[datetime]
    assignee_id: Optional[UUID]
    tags: list[str]
    created_at: datetime
    updated_at: datetime

    def to_json(self) -> str:
        """Serialize to JSON with ISO 8601 datetimes."""
        return self.model_dump_json()


# ── Bulk create ────────────────────────────────────────────────────────────────

class BulkTaskCreate(RootModel[list[TaskCreate]]):
    """Validates a JSON array of TaskCreate - minimum 1, maximum 100."""

    @model_validator(mode="after")
    def validate_count(self) -> "BulkTaskCreate":
        if len(self.root) == 0:
            raise ValueError("bulk create requires at least 1 task")
        if len(self.root) > 100:
            raise ValueError(
                f"bulk create allows at most 100 tasks, got {len(self.root)}"
            )
        return self


# ── FastAPI endpoints ──────────────────────────────────────────────────────────

from fastapi import FastAPI, HTTPException

app = FastAPI()

# NOTE: `db` is a placeholder for your data-access layer; it is not defined in this answer.

@app.post("/tasks", response_model=TaskResponse, status_code=201)
async def create_task(body: TaskCreate):
    # body is already validated: title stripped, tags lowercased, due_date timezone-aware
    task = await db.create_task(body.model_dump())
    return TaskResponse.model_validate(task)

@app.get("/tasks/{task_id}", response_model=TaskResponse)
async def get_task(task_id: UUID):
    task = await db.get_task(task_id)
    if not task:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
    return TaskResponse.model_validate(task)

@app.patch("/tasks/{task_id}", response_model=TaskResponse)
async def update_task(task_id: UUID, body: TaskUpdate):
    existing = await db.get_task(task_id)
    if not existing:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")

    current = TaskResponse.model_validate(existing)

    # Only update fields the client actually provided
    update_data = body.model_dump(exclude_unset=True)
    # model_copy(update=...) does not re-run validation; the values were
    # already validated by TaskUpdate when the request body was parsed.
    updated = current.model_copy(update=update_data)

    await db.update_task(task_id, updated.model_dump(exclude={"id", "created_at", "updated_at"}))
    return updated

@app.post("/tasks/bulk", response_model=list[TaskResponse], status_code=201)
async def bulk_create_tasks(body: BulkTaskCreate):
    tasks = await db.bulk_create_tasks([t.model_dump() for t in body.root])
    return [TaskResponse.model_validate(t) for t in tasks]
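
The RootModel approach used for bulk create can be checked in isolation. The following is a minimal sketch, not part of the answer: Note and BulkNotes are hypothetical stand-ins for TaskCreate and BulkTaskCreate, and the cap is lowered to 3 so the failure case stays short.

```python
from pydantic import BaseModel, RootModel, ValidationError, model_validator


class Note(BaseModel):
    """Hypothetical item model, standing in for TaskCreate."""
    text: str


class BulkNotes(RootModel[list[Note]]):
    """Hypothetical bulk wrapper: the top-level JSON value is an array."""

    @model_validator(mode="after")
    def validate_count(self) -> "BulkNotes":
        # Cap of 3 is an assumption for the demo; the exercise uses 100.
        if not 1 <= len(self.root) <= 3:
            raise ValueError(f"expected 1 to 3 notes, got {len(self.root)}")
        return self


def error_count(payload) -> int:
    """Return how many validation errors the payload produces (0 if valid)."""
    try:
        BulkNotes.model_validate(payload)
    except ValidationError as exc:
        return exc.error_count()
    return 0


ok = BulkNotes.model_validate([{"text": "a"}, {"text": "b"}])
texts = [note.text for note in ok.root]  # items live on .root

empty_errors = error_count([])                                    # too few
too_many_errors = error_count([{"text": str(i)} for i in range(4)])  # too many
wrong_shape_errors = error_count({"text": "a"})                   # object, not array

print(texts, empty_errors, too_many_errors, wrong_shape_errors)
```

Each invalid payload yields a single error: the count check fails for the empty and oversized lists, and the object payload is rejected by the list type check before the count validator runs.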

Key design decisions:

  • Separate TaskCreate and TaskUpdate: different validation semantics - TaskCreate has required fields and defaults; TaskUpdate has all-optional fields. Never use a single model for both.
  • exclude_unset=True in PATCH: critical - otherwise PATCH sets every unspecified optional field to None in the database, destroying existing values.
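  • mode='before' for tag normalization: normalization (lowercase, strip) must run before the 1-50 character check in validate_tag_lengths. If it ran afterwards, a whitespace-only tag such as "   " would pass the length check on its raw length of 3 and then be stripped to an empty string.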
  • RootModel for bulk create: the top-level JSON is an array - RootModel is the correct Pydantic type for this. model_validator on the root model validates the count.
  • from_attributes=True only on TaskResponse: request models do not need ORM mode; only the response model reads from SQLAlchemy objects. Keeping them separate avoids accidental attribute access at the wrong layer.
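
Two of these decisions, exclude_unset=True and before-mode normalization, are easy to verify in a few lines. This is a minimal sketch under stated assumptions: Item and ItemUpdate are hypothetical models invented for the illustration, and the stored object stands in for a database row.

```python
from typing import Optional

from pydantic import BaseModel, field_validator


class Item(BaseModel):
    """Hypothetical stored record, standing in for a database row."""
    title: str
    priority: int
    tags: list[str]


class ItemUpdate(BaseModel):
    """Hypothetical PATCH body: every field is optional."""
    title: Optional[str] = None
    priority: Optional[int] = None
    tags: Optional[list[str]] = None

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v):
        # Runs on the raw input, before list[str] validation.
        if isinstance(v, list):
            return [str(tag).strip().lower() for tag in v]
        return v


stored = Item(title="Write docs", priority=3, tags=["python"])

# The client sends only "tags"; title and priority are absent from the body.
patch = ItemUpdate.model_validate({"tags": ["  API "]})

full_dump = patch.model_dump()                   # includes the None defaults
sent_only = patch.model_dump(exclude_unset=True)  # only what the client sent

print(full_dump)  # {'title': None, 'priority': None, 'tags': ['api']}
print(sent_only)  # {'tags': ['api']}

# Applying sent_only leaves title and priority untouched.
updated = stored.model_copy(update=sent_only)
print(updated.title, updated.priority, updated.tags)
```

Applying full_dump instead of sent_only would overwrite title and priority with None, which is exactly the data loss the PATCH bullet warns about.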

Key Takeaways

  • Pydantic v2 coerces by default: "25" becomes 25 for an int field. Use ConfigDict(strict=True) or Annotated[int, Field(strict=True)] to disable coercion per-model or per-field.
  • str does not validate email format: use EmailStr from pydantic[email]. The same applies to HttpUrl and IPvAnyAddress - always use the semantic type, not str, when you have a format requirement.
  • @field_validator methods are class methods in Pydantic v2. Put @field_validator(...) on top and @classmethod directly beneath it, and do not reverse that order. Writing the validator as an instance method (first parameter self) raises a PydanticUserError.
  • mode='before' for normalization, mode='after' for business rules: before-validators receive raw input (may not be the declared type); after-validators receive the already-coerced value.
  • model_dump(exclude_unset=True) is essential for PATCH: without it, every optional field not sent by the client becomes None in the database, silently destroying data.
  • model_dump(mode="json") vs model_dump(): only mode="json" produces a dict with JSON-serializable values. Plain model_dump() returns Python objects - passing it to json.dumps() raises TypeError for datetime, UUID, and Decimal fields.
  • from_attributes=True (ORM mode) enables model_validate(orm_object) - Pydantic reads attributes directly instead of requiring a dict. Use it on response models, not request models.
  • ConfigDict(str_strip_whitespace=True) on all request models: users routinely submit values with accidental leading or trailing whitespace. Stripping it at the validation layer prevents data quality bugs in the database.
  • Discriminated unions (Field(discriminator="type")) use a literal field value to pick the one model to validate against, so errors refer only to that model. A bare Union[A, B, C] has to attempt validation against several members, which is slower and produces noisier error messages.
  • Annotated types make constraints reusable: define Priority = Annotated[int, Field(ge=1, le=5)] once, use it across all models. Change the constraint in one place, apply everywhere.
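
The coercion and model_dump takeaways can be confirmed with a short script. This is a minimal sketch, not code from the lesson: the Lax, Strict, PerField, and Event models and the rejects helper are hypothetical names introduced for the illustration.

```python
import json
from datetime import datetime, timezone
from typing import Annotated
from uuid import UUID

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Lax(BaseModel):
    age: int  # default behaviour: "25" is coerced to 25


class Strict(BaseModel):
    model_config = ConfigDict(strict=True)  # strict for every field
    age: int


class PerField(BaseModel):
    age: Annotated[int, Field(strict=True)]  # strict for this field only
    score: float                             # still coerces


def rejects(model, **data) -> bool:
    """Return True if constructing the model raises ValidationError."""
    try:
        model(**data)
    except ValidationError:
        return True
    return False


coerced = Lax(age="25").age
strict_rejects = rejects(Strict, age="25")
field_rejects = rejects(PerField, age="25", score="1.5")
lax_score = PerField(age=25, score="1.5").score


class Event(BaseModel):
    id: UUID
    at: datetime


event = Event(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)

python_dump = event.model_dump()           # UUID and datetime objects
json_dump = event.model_dump(mode="json")  # plain strings

try:
    json.dumps(python_dump)
    plain_dump_serializable = True
except TypeError:
    plain_dump_serializable = False

encoded = json.dumps(json_dump)  # works: every value is JSON-serializable
print(coerced, strict_rejects, field_rejects, lax_score, plain_dump_serializable)
```

When the goal is a JSON string rather than a dict, model_dump_json() skips the intermediate dict entirely.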

What's Next

You have completed Module 06 - APIs and Web Basics. You can now:

  • Build HTTP APIs with FastAPI - routes, dependency injection, middleware, error handling
  • Serialize production data with custom encoders, orjson, and Pydantic's JSON layer
  • Validate request and response models with Pydantic v2 - constraints, custom validators, cross-field rules, ORM mode

Module 07 - Databases builds directly on this foundation:

  • SQLAlchemy Core and ORM - the from_attributes=True pattern you used in Pydantic becomes the primary integration point; defining ORM models that Pydantic can validate directly
  • Alembic migrations - database schema evolution, the same models evolve over time
  • Async database access - async with session: patterns that work correctly in FastAPI's async endpoints
  • Connection pooling - why engine = create_engine(...) is called once and shared, not once per request
  • N+1 query problems - the SQLAlchemy lazy-loading issue referenced in Lesson 07 (JSON Serialization), solved with selectinload and joinedload
© 2026 EngineersOfAI. All rights reserved.