Master Pydantic v2 at engineering depth - BaseModel, Field constraints, field and model validators, ORM mode, discriminated unions, partial updates for PATCH endpoints, JSON Schema generation, and the model_dump gotchas that silently corrupt production data.

Validation with Pydantic - Production Request and Response Models

Reading time: ~35 minutes | Level: Intermediate → Engineering

Before reading further, predict the output:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

u = User(name="Alice", age="25", email="not-an-email")
print(u.age)    # ?
print(u.email)  # ?

25
not-an-email

Two surprises in one model. First: age="25" coerces to 25 - Pydantic validates that "25" can become an int and does the conversion silently. This is type coercion, not type checking, and it is intentional. Second: email="not-an-email" is accepted without error - str means "any string," not "a valid email address." For email validation, you need EmailStr from pydantic[email].

This is the fundamental tension in Pydantic: it is very helpful by default (coercion) but only as strict as you ask it to be (no semantic validation unless you add it). This lesson shows you where that line is, how to move it, and what tools exist on both sides.

What You Will Learn

BaseModel: field declaration, type coercion vs strict mode
All built-in validated types: EmailStr, HttpUrl, UUID, IPvAnyAddress
Field(): every constraint that matters in production
field_validator and model_validator: single-field and cross-field validation
The complete Pydantic validation pipeline from raw data to model
Nested models and composition patterns
model_config with ConfigDict: ORM mode, whitespace stripping, enum handling
model_dump() and model_dump_json(): the mode='json' gotcha and serialization options
model_json_schema(): OpenAPI docs generation
Partial updates with model_copy(update=...) for PATCH endpoints
RootModel for validating lists and primitives
Annotated types for reusable constraints
Discriminated unions for polymorphic request bodies

Prerequisites

Lesson 04 (FastAPI) - Pydantic is FastAPI's request validation engine
Lesson 07 (JSON Serialization) - understanding what model_dump(mode='json') does
Basic Python type hints (str, int, list[str], Optional, Union)

Part 1 - `BaseModel` and Type Coercion

Pydantic v2's BaseModel is the foundation. Every field is declared as a class attribute with a type annotation:

from pydantic import BaseModel
from datetime import datetime
from uuid import UUID

class Task(BaseModel):
    id: UUID
    title: str
    priority: int
    created_at: datetime
    completed: bool = False  # field with default

Coercion: What Pydantic Does by Default

Pydantic attempts to coerce input values to the declared type:

from pydantic import BaseModel

class Task(BaseModel):
    title: str
    priority: int
    active: bool

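# Coercion examples (lax mode, the default)
task = Task(
    title="Write tests",  # str stays str; v2 does NOT coerce int → str by default
    priority="5",         # str coerces to int: 5
    active=1,             # int 0/1 coerces to bool: True
)
print(task.title)    # Write tests
print(task.priority) # 5
print(task.active)   # True

# Task(title=123, priority="5", active=1) raises ValidationError:
# title: Input should be a valid string [type=string_type, ...]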

Strict Mode: Disable Coercion

When you need exact types (e.g., an API that must reject "5" for an int field):

from pydantic import BaseModel, ConfigDict

class StrictTask(BaseModel):
    model_config = ConfigDict(strict=True)

    title: str
    priority: int

# Now coercion is disabled
try:
    StrictTask(title="write tests", priority="5")
except Exception as e:
    print(e)
    # priority: Input should be a valid integer [type=int_type, ...]

# Per-field strict mode (override model-level)
from pydantic import Field
from typing import Annotated

class HybridTask(BaseModel):
    title: str                                  # coercion enabled
    priority: Annotated[int, Field(strict=True)]  # this field: strict
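
Strictness can also be chosen per call rather than per model. The sketch below assumes a hypothetical Reading model; passing strict=True to model_validate applies strict rules to that one validation while the model itself stays lax:

```python
from pydantic import BaseModel, ValidationError

class Reading(BaseModel):  # hypothetical model for illustration
    sensor: str
    value: int

payload = {"sensor": "t1", "value": "42"}

# Lax (default): the numeric string is coerced to int
lax = Reading.model_validate(payload)

# Strict for this call only: the same payload is rejected
try:
    Reading.model_validate(payload, strict=True)
    strict_rejected = False
    strict_error_type = None
except ValidationError as exc:
    strict_rejected = True
    strict_error_type = exc.errors()[0]["type"]
```

This can be handy when the same model serves a lenient form endpoint and a strict machine-to-machine endpoint.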

Part 2 - Built-In Validated Types

These are Pydantic types that enforce semantic correctness beyond Python's type system:

from pydantic import BaseModel, EmailStr, HttpUrl, AnyUrl, IPvAnyAddress
from uuid import UUID
from datetime import datetime, date
from typing import Optional

class UserProfile(BaseModel):
    id: UUID                        # validates UUID format
    email: EmailStr                 # validates email syntax (requires pydantic[email])
    # In v2, Optional[X] without a default is still a required field,
    # so each optional field gets an explicit default of None.
    website: Optional[HttpUrl] = None      # validates URL with http/https scheme
    api_endpoint: Optional[AnyUrl] = None  # validates URL with any scheme
    ip_address: Optional[IPvAnyAddress] = None  # validates IPv4 or IPv6
    created_at: datetime            # parses ISO 8601 strings
    birth_date: Optional[date] = None      # parses YYYY-MM-DD strings

# All of these parse correctly:
user = UserProfile(
    id="550e8400-e29b-41d4-a716-446655440000",  # str → UUID
    email="alice@example.com",
    website="https://alice.dev",
    ip_address="192.168.1.1",
    created_at="2024-03-15T12:00:00Z",          # str → datetime
    birth_date="1990-05-20",                     # str → date
)
print(type(user.id))           # <class 'uuid.UUID'>
print(type(user.created_at))   # <class 'datetime.datetime'>

# These raise ValidationError:
try:
    UserProfile(id="not-a-uuid", email="not-an-email", created_at="2024-03-15")
except Exception as e:
    print(e)

warning

EmailStr requires the email-validator package: pip install pydantic[email]. Without it, using EmailStr raises an ImportError telling you that email-validator is not installed. Add pydantic[email] to your pyproject.toml dependencies for any project that validates email addresses.

Part 3 - `Field()`: Every Constraint That Matters

Field() provides metadata and validation constraints for individual fields:

from pydantic import BaseModel, Field
from typing import Optional
from uuid import UUID
from datetime import datetime
import re

class TaskCreate(BaseModel):
    # String constraints
    title: str = Field(
        min_length=1,
        max_length=200,
        description="Task title - must be non-empty",
        examples=["Write unit tests", "Review PR #42"],
    )
    description: Optional[str] = Field(
        default=None,
        max_length=2000,
        description="Optional detailed description",
    )

    # Numeric constraints
    priority: int = Field(
        default=2,
        ge=1,   # greater than or equal
        le=5,   # less than or equal
        description="Priority 1 (lowest) to 5 (highest)",
    )
    estimated_hours: float = Field(
        default=1.0,
        gt=0,   # strictly greater than - excludes 0
        lt=168, # strictly less than 168 hours (one week)
    )

    # Pattern validation
    reference_code: Optional[str] = Field(
        default=None,
        pattern=r"^TASK-\d{4,6}$",  # e.g., "TASK-0042" or "TASK-123456"
        description="Optional ticket reference code",
    )

    # Alias: accept a different key name in the input
    due_date: Optional[datetime] = Field(
        default=None,
        alias="dueDate",  # input JSON key is "dueDate", attribute is "due_date"
        description="Optional due date in ISO 8601 format",
    )

    # default_factory: runs once per instance (never share mutable defaults)
    tags: list[str] = Field(
        default_factory=list,
        description="List of tags",
    )

    # Title for OpenAPI schema display
    assignee_id: Optional[UUID] = Field(
        default=None,
        title="Assignee User ID",
        description="UUID of the user assigned to this task",
    )

`ge`/`le`/`gt`/`lt` vs `min_length`/`max_length`

Constraint	Applies to	Meaning
`ge=N`	numbers	value ≥ N (inclusive)
`le=N`	numbers	value ≤ N (inclusive)
`gt=N`	numbers	value > N (exclusive)
`lt=N`	numbers	value < N (exclusive)
`min_length=N`	`str`, `list`, `bytes`	length ≥ N
`max_length=N`	`str`, `list`, `bytes`	length ≤ N
`pattern=r"..."`	`str`	must match regex (not anchored by default; use ^ and $ for a full match)

tip

Use model_config = ConfigDict(str_strip_whitespace=True) on request models that accept free text. Users submitting forms often add accidental leading or trailing whitespace. Without stripping, " Alice " passes min_length=1 validation but creates a user named " Alice " in your database, and a whitespace-only value passes min_length=1 as well. Stripping prevents an entire class of data quality bugs at negligible cost.
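
A small sketch of how stripping interacts with min_length, assuming a hypothetical SignupForm model. Whitespace is stripped before the length constraint is checked, so a blank-looking value is rejected instead of stored:

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class SignupForm(BaseModel):  # hypothetical request model
    model_config = ConfigDict(str_strip_whitespace=True)

    name: str = Field(min_length=1, max_length=100)

form = SignupForm(name="  Alice  ")
stripped_name = form.name  # surrounding whitespace removed

try:
    SignupForm(name="   ")  # stripped to "" first, then fails min_length=1
    blank_accepted = True
except ValidationError:
    blank_accepted = False
```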

Part 4 - The Pydantic Validation Pipeline

Pydantic collects all errors before raising, so one call gives you every problem, not just the first one:

from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)
    stock: int = Field(ge=0)

try:
    Product(name="", price=-5.0, stock=-1)
except ValidationError as e:
    print(e.error_count())  # 3
    for error in e.errors():
        print(f"  {'.'.join(str(x) for x in error['loc'])}: {error['msg']}")
    # name: String should have at least 1 character
    # price: Input should be greater than 0
    # stock: Input should be greater than or equal to 0
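
Each entry in e.errors() is a dict with loc, msg, type, and input keys, which makes it straightforward to build a per-field error payload for an API response. The sketch below assumes a hypothetical helper named errors_by_field; it is one possible shape, not a prescribed one:

```python
from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)
    stock: int = Field(ge=0)

def errors_by_field(exc: ValidationError) -> dict[str, str]:
    """Hypothetical helper: map dotted field path -> machine-readable error type."""
    return {
        ".".join(str(part) for part in err["loc"]): err["type"]
        for err in exc.errors()
    }

try:
    Product(name="", price=-5.0, stock=-1)
    field_errors = {}
except ValidationError as exc:
    field_errors = errors_by_field(exc)
```

Keying on the type field rather than msg keeps client code stable if the human-readable wording changes.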

Part 5 - `field_validator`: Single-Field Custom Validation

from pydantic import BaseModel, field_validator
from datetime import datetime, timezone

class BookingCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    guest_email: str

    @field_validator("title")
    @classmethod
    def title_must_not_be_only_whitespace(cls, v: str) -> str:
        """mode='after' (default): runs after type coercion."""
        if v.strip() == "":
            raise ValueError("title must contain non-whitespace characters")
        return v.strip()  # validators can transform the value

    @field_validator("guest_email", mode="before")
    @classmethod
    def normalize_email(cls, v) -> str:
        """mode='before': runs before type coercion - v may not be a str yet."""
        if isinstance(v, str):
            return v.lower().strip()
        return v  # let type coercion handle non-str input (will fail validation)

    @field_validator("start_time", "end_time", mode="after")
    @classmethod
    def must_be_future(cls, v: datetime) -> datetime:
        """Same validator applied to multiple fields."""
        now = datetime.now(tz=timezone.utc)
        if v.tzinfo is None:
            raise ValueError("datetime must be timezone-aware")
        if v <= now:
            raise ValueError("datetime must be in the future")
        return v

`mode='before'` vs `mode='after'`

Mode	When it runs	Receives	Use for
`'before'`	Before type coercion	Raw input (any type)	Input normalization, format conversion
`'after'` (default)	After type coercion	Already-coerced type	Business rule validation on the final value

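import re
from datetime import date

from pydantic import BaseModel, field_validator

class Signup(BaseModel):  # illustrative model so the validators have a home
    phone: str
    birth_date: date

    # mode='before': raw value - use for normalization
    @field_validator("phone", mode="before")
    @classmethod
    def strip_phone_formatting(cls, v):
        if isinstance(v, str):
            return re.sub(r"[\s\-\(\)]", "", v)  # "+1 (555) 123-4567" → "+15551234567"
        return v

    # mode='after': typed value - use for business rules
    @field_validator("birth_date")
    @classmethod
    def must_be_adult(cls, v: date) -> date:
        today = date.today()
        # Subtract one if this year's birthday has not happened yet
        age = today.year - v.year - ((today.month, today.day) < (v.month, v.day))
        if age < 18:
            raise ValueError(f"User must be at least 18 years old (got age {age})")
        return v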

Part 6 - `model_validator`: Cross-Field Validation

When validation requires comparing multiple fields, use @model_validator:

from pydantic import BaseModel, model_validator
from datetime import datetime
from typing import Optional, Self

class EventCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    max_attendees: Optional[int] = None
    waitlist_enabled: bool = False

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> Self:
        """Cross-field validation: end_time must be after start_time."""
        if self.end_time <= self.start_time:
            raise ValueError(
                f"end_time ({self.end_time.isoformat()}) must be after "
                f"start_time ({self.start_time.isoformat()})"
            )
        return self

    @model_validator(mode="after")
    def waitlist_requires_max_attendees(self) -> Self:
        """Another cross-field rule: waitlist only makes sense with a capacity limit."""
        if self.waitlist_enabled and self.max_attendees is None:
            raise ValueError(
                "waitlist_enabled=True requires max_attendees to be set"
            )
        return self
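
Errors raised from a model-level validator are not attached to a single field, so their loc is empty. A short sketch with a hypothetical DateRange model shows what a caller sees:

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator

class DateRange(BaseModel):  # hypothetical model for illustration
    start: date
    end: date

    @model_validator(mode="after")
    def end_after_start(self):
        if self.end <= self.start:
            raise ValueError("end must be after start")
        return self

try:
    DateRange(start="2024-03-15", end="2024-03-01")
    range_errors = []
except ValidationError as exc:
    range_errors = exc.errors()
```

API error formatters that join loc into a field path should therefore handle the empty case, for example by reporting the error at the body level.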

`mode='before'` for Model Validators

mode='before' receives the raw input before any field processing - useful for data transformation. The input is usually a dict, but it can be any object, so check the type before touching keys:

from decimal import Decimal
from typing import Any

from pydantic import BaseModel, model_validator

class LegacyOrderAdapter(BaseModel):
    """Accepts either new format or legacy format from old API."""
    order_id: str
    total_cents: int  # internal representation

    @model_validator(mode="before")
    @classmethod
    def handle_legacy_format(cls, data: Any) -> Any:
        """Normalize legacy 'amount_dollars' to 'total_cents'."""
        if not isinstance(data, dict):
            return data  # let normal validation report the problem
        data = dict(data)  # copy so the caller's dict is not mutated
        if "amount_dollars" in data and "total_cents" not in data:
            # Decimal avoids float rounding: int(float("19.99") * 100) == 1998
            data["total_cents"] = int(Decimal(str(data.pop("amount_dollars"))) * 100)
        if "orderId" in data and "order_id" not in data:
            data["order_id"] = data.pop("orderId")
        return data

# Works with both formats:
LegacyOrderAdapter(order_id="ORD-001", total_cents=999)
LegacyOrderAdapter(orderId="ORD-002", amount_dollars="19.99")  # total_cents == 1999

Part 7 - Nested Models and Composition

from pydantic import BaseModel, Field, EmailStr
from typing import Optional
from uuid import UUID, uuid4
from datetime import datetime, timezone

class Address(BaseModel):
    street: str
    city: str
    country: str = Field(min_length=2, max_length=2, description="ISO 3166-1 alpha-2 country code")
    postal_code: Optional[str] = None

class ContactInfo(BaseModel):
    email: EmailStr
    phone: Optional[str] = None
    address: Optional[Address] = None

class UserCreate(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    contact: ContactInfo                    # nested model
    preferences: dict[str, str] = Field(default_factory=dict)

# Validation cascades through nested models, and error locations carry the full path
try:
    UserCreate(
        name="Alice",
        contact={
            "email": "alice@example.com",
            "address": {
                "street": "123 Main St",
                "city": "Springfield",
                "country": "USA",  # 3 letters - violates max_length=2
            },
        },
    )
except Exception as e:
    print(e)
    # contact.address.country: String should have at most 2 characters

# Valid nested model - input can use dicts, no need to pre-construct nested models
user = UserCreate(
    name="Alice",
    contact={                             # dict is automatically validated into ContactInfo
        "email": "alice@example.com",
        "address": {
            "street": "123 Main St",
            "city": "Springfield",
            "country": "US",
        }
    }
)
print(type(user.contact))           # <class 'ContactInfo'>
print(type(user.contact.address))   # <class 'Address'>

Part 8 - `model_config` with `ConfigDict`

ConfigDict controls model behavior globally:

from pydantic import BaseModel, ConfigDict, Field
from enum import Enum

class TaskStatus(str, Enum):
    PENDING = "pending"
    DONE = "done"

class TaskRequest(BaseModel):
    model_config = ConfigDict(
        # Strip whitespace from all str fields - prevents "  Alice  " in the DB
        str_strip_whitespace=True,

        # Lowercase all str fields (useful for email normalization)
        # str_to_lower=True,  # uncomment if you want lowercase

        # Store the enum's value (e.g. "pending") on the model instead of the Enum member
        use_enum_values=True,

        # ORM mode: read attributes from SQLAlchemy model instances
        from_attributes=True,

        # Populate fields by field name AND by alias (default: alias only when alias is set)
        populate_by_name=True,

        # Validate default values (not just user-supplied values)
        validate_default=False,  # True adds overhead - use only when needed
    )

    title: str
    status: TaskStatus = TaskStatus.PENDING

# str_strip_whitespace in action
task = TaskRequest(title="  Write tests  ", status="pending")
print(repr(task.title))   # 'Write tests'  - stripped
print(repr(task.status))  # 'pending'  - use_enum_values stores the value, not the Enum
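
One subtlety with use_enum_values is that defaults are not validated unless validate_default=True, so a default written as an Enum member stays an Enum member. The sketch below uses hypothetical Color, Unvalidated, and Validated names to show the difference:

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict

class Color(str, Enum):  # hypothetical enum
    RED = "red"
    BLUE = "blue"

class Unvalidated(BaseModel):
    model_config = ConfigDict(use_enum_values=True)
    color: Color = Color.RED

class Validated(BaseModel):
    model_config = ConfigDict(use_enum_values=True, validate_default=True)
    color: Color = Color.RED

from_input = Unvalidated(color="blue").color  # validated input -> plain value
raw_default = Unvalidated().color             # default skipped validation
checked_default = Validated().color           # default went through validation
```

If downstream code expects plain strings everywhere, either enable validate_default or write the default as the raw value.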

ORM Mode (`from_attributes=True`)

from pydantic import BaseModel, ConfigDict
from sqlalchemy import Column, String, Integer
from sqlalchemy.orm import DeclarativeBase

class Base(DeclarativeBase):
    pass

class TaskORM(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    priority = Column(Integer)

class TaskResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    title: str
    priority: int

# Without from_attributes=True, model_validate(orm_task) raises ValidationError:
# the input is neither a dict nor a TaskResponse instance.
orm_task = TaskORM(id=1, title="Write tests", priority=3)
response = TaskResponse.model_validate(orm_task)  # reads attributes, not dict
print(response.id)     # 1
print(response.title)  # "Write tests"

note

from_attributes=True (ORM mode) lets Pydantic read SQLAlchemy model attributes directly using model_validate(orm_object). Without it, you must convert the ORM object to a dict first, which requires knowing which attributes to include and risks triggering lazy-loaded relationships. With from_attributes=True, Pydantic accesses only the attributes it needs for the declared fields.
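
The same behavior can be seen without a database. The sketch below uses a hypothetical plain class, TaskRow, as a stand-in for an ORM row; only the declared fields are read, and a model without from_attributes rejects the object:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class TaskRow:  # hypothetical stand-in for an ORM row
    def __init__(self, id: int, title: str, priority: int) -> None:
        self.id = id
        self.title = title
        self.priority = priority
        self.internal_notes = "not exposed"  # not declared on the response model

class TaskOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    title: str
    priority: int

class TaskOutNoAttrs(BaseModel):  # same fields, no from_attributes
    id: int
    title: str
    priority: int

row = TaskRow(id=1, title="Write tests", priority=3)
dumped = TaskOut.model_validate(row).model_dump()

try:
    TaskOutNoAttrs.model_validate(row)
    rejected = False
except ValidationError:
    rejected = True
```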

Part 9 - `model_dump()` and `model_dump_json()`

from pydantic import BaseModel, ConfigDict, Field
from datetime import datetime, timezone
from uuid import UUID, uuid4
from typing import Optional

class TaskResponse(BaseModel):
    id: UUID = Field(default_factory=uuid4)
    title: str
    created_at: datetime = Field(default_factory=lambda: datetime.now(tz=timezone.utc))
    completed: bool = False
    secret: Optional[str] = None  # never expose this

task = TaskResponse(title="Write tests", secret="internal-note")

# model_dump() → Python dict (datetime stays as datetime, UUID stays as UUID)
d = task.model_dump()
print(type(d["id"]))           # <class 'uuid.UUID'>
print(type(d["created_at"]))   # <class 'datetime.datetime'>
# Passing this to json.dumps() without a custom encoder → TypeError

# model_dump(mode="json") → dict with JSON-serializable values
d_json = task.model_dump(mode="json")
print(type(d_json["id"]))          # <class 'str'>  - UUID serialized to string
print(type(d_json["created_at"]))  # <class 'str'>  - datetime serialized to ISO 8601

# model_dump_json() → JSON string directly (fastest, bypasses dict step)
json_str = task.model_dump_json()
print(json_str)

# Exclude fields from output
d = task.model_dump(exclude={"secret"})
# id, title, created_at, completed - no "secret"

# Exclude None values (useful for sparse responses)
task2 = TaskResponse(title="No secret task")
task2.model_dump(exclude_none=True)  # "secret" omitted because it's None

# Exclude fields that were not explicitly set (useful for PATCH responses)
task3 = TaskResponse(title="Explicit title")
task3.model_dump(exclude_unset=True)  # only "title" - id, created_at, completed have defaults

# Use alias in output
class AliasedTask(BaseModel):
    task_id: UUID = Field(alias="taskId")

    model_config = ConfigDict(populate_by_name=True)

at = AliasedTask(task_id=uuid4())
at.model_dump()              # {"task_id": ...}  - uses field name
at.model_dump(by_alias=True) # {"taskId": ...}   - uses alias
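
The mode="json" gotcha is easy to reproduce. The sketch below assumes a hypothetical Event model: the default dump contains UUID and datetime objects that json.dumps cannot encode, while the JSON-mode dump contains plain strings and round-trips back into an equal model:

```python
import json
from datetime import datetime, timezone
from uuid import UUID

from pydantic import BaseModel

class Event(BaseModel):  # hypothetical model
    id: UUID
    occurred_at: datetime

event = Event(
    id="550e8400-e29b-41d4-a716-446655440000",
    occurred_at=datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc),
)

try:
    json.dumps(event.model_dump())  # dict still holds UUID and datetime objects
    python_mode_ok = True
except TypeError:
    python_mode_ok = False

json_text = json.dumps(event.model_dump(mode="json"))  # plain strings only
round_tripped = Event.model_validate_json(json_text)
```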

Part 10 - Partial Updates: PATCH Endpoint Pattern

PATCH requests update only specified fields. Pydantic handles this with optional fields + model_copy:

from pydantic import BaseModel, ConfigDict
from typing import Optional
from uuid import UUID

class TaskUpdate(BaseModel):
    """All fields optional - only provided fields are updated."""
    title: Optional[str] = None
    priority: Optional[int] = None
    completed: Optional[bool] = None
    assignee_id: Optional[UUID] = None

class Task(BaseModel):
    # from_attributes lets model_validate() read the ORM object returned by the DB layer
    model_config = ConfigDict(from_attributes=True)

    id: UUID
    title: str
    priority: int
    completed: bool = False
    assignee_id: Optional[UUID] = None

# PATCH endpoint pattern
async def patch_task(task_id: UUID, updates: TaskUpdate, db) -> Task:
    existing = await db.get_task(task_id)  # fetch current state

    # Convert existing ORM object to Pydantic model
    current = Task.model_validate(existing)

    # Apply only the fields that were explicitly provided in the request
    # exclude_unset=True: only fields the client sent, not all Optional=None fields
    update_data = updates.model_dump(exclude_unset=True)

    # model_copy(update=...) produces a new model with specified fields replaced.
    # The update values are not re-validated, so only pass data you already trust.
    updated = current.model_copy(update=update_data)

    await db.update_task(task_id, updated.model_dump(exclude={"id"}))
    return updated

# Example:
# PATCH /tasks/123 {"title": "Updated title"}
# update_data = {"title": "Updated title"}  - only title, not priority/completed/assignee_id
# current.model_copy(update={"title": "Updated title"})
# → Task with new title, all other fields unchanged

tip

The key is model_dump(exclude_unset=True): it returns only fields that were explicitly included in the request body - not all fields that happen to be None. If the client sends {"title": "New"}, then exclude_unset=True returns {"title": "New"}, not {"title": "New", "priority": None, "completed": None, "assignee_id": None}. Without exclude_unset=True, a PATCH would set every unspecified optional field to None in the database.
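
One caveat is that model_copy(update=...) does not run validation on the update values. A PATCH body with an explicit null, such as {"title": null}, is valid for an all-optional update model and would land in a non-nullable field. A hedged alternative, using simplified hypothetical Item and ItemUpdate models, is to merge the dicts and re-validate:

```python
from typing import Optional

from pydantic import BaseModel, ValidationError

class Item(BaseModel):  # hypothetical stored model
    title: str
    priority: int = 2

class ItemUpdate(BaseModel):  # hypothetical PATCH body
    title: Optional[str] = None
    priority: Optional[int] = None

current = Item(title="Write tests")
patch = ItemUpdate.model_validate({"title": None})  # client sent an explicit null
update_data = patch.model_dump(exclude_unset=True)  # {"title": None}

# model_copy applies the update without validation: title becomes None
copied = current.model_copy(update=update_data)

# Merging and re-validating catches the invalid state
try:
    Item.model_validate({**current.model_dump(), **update_data})
    revalidation_failed = False
except ValidationError:
    revalidation_failed = True
```

Whether to reject explicit nulls or treat them as "clear this field" is an API design decision; the point of the sketch is only that model_copy will not make it for you.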

Part 11 - `RootModel` for Lists and Primitives

When you need to validate a JSON array or a single value at the top level:

from pydantic import RootModel, Field
from typing import Annotated

# Validate a list of strings
class TagList(RootModel[list[str]]):
    root: list[str] = Field(min_length=1, max_length=20)

tags = TagList.model_validate(["python", "fastapi", "pydantic"])
print(tags.root)  # ['python', 'fastapi', 'pydantic']
tags.model_dump_json()  # '["python","fastapi","pydantic"]'

# Validate a list of UUIDs
from uuid import UUID

class BulkTaskIds(RootModel[list[UUID]]):
    pass

# DELETE /tasks with body ["uuid1", "uuid2", "uuid3"]
ids = BulkTaskIds.model_validate([
    "550e8400-e29b-41d4-a716-446655440000",
    "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
])
print(ids.root)  # [UUID('550e8400...'), UUID('6ba7b810...')]

# Validate a primitive with constraints
class PositiveScore(RootModel[Annotated[int, Field(ge=0, le=100)]]):
    pass

score = PositiveScore.model_validate(87)
print(score.root)  # 87
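
Constraints on the root value and on its items are enforced like any other field. The sketch below reuses the TagList shape from above and captures the first error of two invalid inputs:

```python
from pydantic import Field, RootModel, ValidationError

class TagList(RootModel[list[str]]):
    root: list[str] = Field(min_length=1, max_length=20)

try:
    TagList.model_validate([])  # violates min_length=1 on the list itself
    empty_error = None
except ValidationError as exc:
    empty_error = exc.errors()[0]

try:
    TagList.model_validate(["ok", 123])  # second item is not a str
    item_error = None
except ValidationError as exc:
    item_error = exc.errors()[0]
```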

Part 12 - `Annotated` Types for Reusable Constraints

Instead of repeating Field(ge=1, le=5) everywhere, define reusable annotated types:

from typing import Annotated
from pydantic import Field, BaseModel
from uuid import UUID

# Reusable constrained types
PositiveInt = Annotated[int, Field(gt=0)]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]
ShortStr = Annotated[str, Field(min_length=1, max_length=200)]
NonEmptyStr = Annotated[str, Field(min_length=1)]
Percentage = Annotated[float, Field(ge=0.0, le=100.0)]
TaskRef = Annotated[str, Field(pattern=r"^TASK-\d{4,6}$")]

# Use them across multiple models
class TaskCreate(BaseModel):
    title: ShortStr
    description: Annotated[str, Field(max_length=2000)] = ""
    priority: Priority = 2
    estimated_hours: PositiveInt = 1

class ProjectCreate(BaseModel):
    name: ShortStr
    completion_percentage: Percentage = 0.0
    default_priority: Priority = 2

class TaskUpdate(BaseModel):
    title: ShortStr | None = None
    priority: Priority | None = None

# The constraint is defined once - changing PositiveInt changes it everywhere
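
A quick check that a shared alias really carries its constraint into every model, including optional uses. The model names and the is_valid helper below are hypothetical:

```python
from typing import Annotated, Optional

from pydantic import BaseModel, Field, ValidationError

Priority = Annotated[int, Field(ge=1, le=5)]

class TaskPatch(BaseModel):  # hypothetical model
    priority: Optional[Priority] = None

class ProjectDefaults(BaseModel):  # hypothetical model
    default_priority: Priority = 2

def is_valid(model_cls, **data) -> bool:
    """Hypothetical helper: True if the model accepts the data."""
    try:
        model_cls(**data)
        return True
    except ValidationError:
        return False

results = {
    "patch_none": is_valid(TaskPatch, priority=None),
    "patch_in_range": is_valid(TaskPatch, priority=5),
    "patch_out_of_range": is_valid(TaskPatch, priority=9),
    "project_out_of_range": is_valid(ProjectDefaults, default_priority=0),
}
```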

Part 13 - Discriminated Unions for Polymorphic Bodies

When an endpoint accepts different payload shapes depending on a type field:

from pydantic import BaseModel, Field
from typing import Literal, Union, Annotated
from uuid import UUID

class EmailNotification(BaseModel):
    type: Literal["email"]
    recipient: str
    subject: str
    body: str

class SlackNotification(BaseModel):
    type: Literal["slack"]
    channel: str
    message: str
    thread_ts: str | None = None

class WebhookNotification(BaseModel):
    type: Literal["webhook"]
    url: str
    payload: dict
    secret: str | None = None

# Discriminated union: Pydantic uses "type" field to pick the correct model
Notification = Annotated[
    Union[EmailNotification, SlackNotification, WebhookNotification],
    Field(discriminator="type"),
]

class SendNotificationRequest(BaseModel):
    notification: Notification
    scheduled_at: str | None = None

# FastAPI endpoint
from fastapi import FastAPI
app = FastAPI()

@app.post("/notifications")
async def send_notification(request: SendNotificationRequest):
    match request.notification:
        case EmailNotification():
            return {"action": "email", "to": request.notification.recipient}
        case SlackNotification():
            return {"action": "slack", "channel": request.notification.channel}
        case WebhookNotification():
            return {"action": "webhook", "url": request.notification.url}

# Input: {"notification": {"type": "email", "recipient": "alice@example.com", "subject": "Hi", "body": "..."}}
# Pydantic sees type="email" → validates as EmailNotification
# Input: {"notification": {"type": "slack", "channel": "#general", "message": "Hello"}}
# Pydantic sees type="slack" → validates as SlackNotification

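Without a discriminator, Pydantic has to attempt validation against the union members to find a match (v2's default "smart" mode may try every member), which is slower for large unions and produces confusing errors that list failures from each member. With a discriminator, it reads the type field and jumps directly to the correct model - O(1) dispatch.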
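
The error-reporting benefit can be seen in isolation. The sketch below uses two hypothetical, trimmed-down models and pydantic's TypeAdapter to validate the union directly; an unknown tag produces a single error about the discriminator instead of one error per union member:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError

class Email(BaseModel):  # hypothetical, trimmed-down payload
    type: Literal["email"]
    recipient: str

class Slack(BaseModel):  # hypothetical, trimmed-down payload
    type: Literal["slack"]
    channel: str

Message = Annotated[Union[Email, Slack], Field(discriminator="type")]
adapter = TypeAdapter(Message)  # validates the union without a wrapper model

parsed = adapter.validate_python({"type": "slack", "channel": "#general"})

try:
    adapter.validate_python({"type": "sms", "number": "555"})
    tag_errors = []
except ValidationError as exc:
    tag_errors = exc.errors()
```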

Part 14 - `model_json_schema()` and OpenAPI Integration

Pydantic generates JSON Schema from your models, which FastAPI uses to build OpenAPI docs:

from pydantic import BaseModel, Field
from typing import Optional
import json

class TaskCreate(BaseModel):
    title: str = Field(
        min_length=1,
        max_length=200,
        description="Task title",
        examples=["Write unit tests"],
    )
    priority: int = Field(
        default=2,
        ge=1,
        le=5,
        description="Priority 1 (lowest) to 5 (highest)",
    )
    assignee_id: Optional[str] = Field(
        default=None,
        description="Assignee user ID",
    )

schema = TaskCreate.model_json_schema()
print(json.dumps(schema, indent=2))
# Abridged output (key order and auto-generated per-field "title" entries may differ):
# {
#   "type": "object",
#   "title": "TaskCreate",
#   "properties": {
#     "title": {
#       "type": "string",
#       "minLength": 1,
#       "maxLength": 200,
#       "description": "Task title",
#       "examples": ["Write unit tests"]
#     },
#     "priority": {
#       "type": "integer",
#       "minimum": 1,
#       "maximum": 5,
#       "default": 2,
#       "description": "Priority 1 (lowest) to 5 (highest)"
#     },
#     ...
#   },
#   "required": ["title"]
# }

FastAPI serves the generated schema at GET /openapi.json and renders it at GET /docs (Swagger UI) and GET /redoc automatically. Every Field(description=..., examples=[...]) appears in the Swagger docs.
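
Because the schema is an ordinary dict, the mapping from Field constraints to JSON Schema keywords can be checked directly, for example in a test that guards the public API contract. A sketch with a hypothetical NoteCreate model:

```python
from pydantic import BaseModel, Field

class NoteCreate(BaseModel):  # hypothetical model
    title: str = Field(min_length=1, max_length=200, description="Note title")
    priority: int = Field(default=2, ge=1, le=5)

schema = NoteCreate.model_json_schema()
title_schema = schema["properties"]["title"]        # minLength / maxLength / description
priority_schema = schema["properties"]["priority"]  # minimum / maximum / default
```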

Graded Practice

Level 1 - Predict the Validation Result

For each snippet, predict: does it succeed or raise ValidationError? If it succeeds, what are the field values?

1a:

from pydantic import BaseModel

class Item(BaseModel):
    name: str
    count: int
    active: bool

item = Item(name=42, count="10", active=0)
print(item.name, item.count, item.active)

1b:

from pydantic import BaseModel, Field

class Product(BaseModel):
    price: float = Field(gt=0, lt=1000)

p = Product(price=0)

1c:

from pydantic import BaseModel
from typing import Optional

class Task(BaseModel):
    title: str
    priority: Optional[int] = None

t = Task(title="test")
print(t.model_dump(exclude_unset=True))
print(t.model_dump(exclude_none=True))

Show Answer

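1a - Raises ValidationError:

name: Input should be a valid string [type=string_type, ...]

name=42: Pydantic v2 does not coerce numbers to str by default (that requires ConfigDict(coerce_numbers_to_str=True))
count="10": str coerces to int → 10, so this field is fine
active=0: int coerces to bool → False, so this field is fine

Note: only 0 and 1 coerce to bool. active=1 would give True, but active=2 would raise ValidationError because Pydantic v2 does not apply Python truthiness.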

1b - Raises ValidationError:

price: Input should be greater than 0 [type=greater_than, ...]

gt=0 means strictly greater than - 0 itself is not allowed. Use ge=0 if you want to allow zero.

1c - Succeeds, two different outputs:

t.model_dump(exclude_unset=True)
# {'title': 'test'}
# "priority" was never set - it only has a default

t.model_dump(exclude_none=True)
# {'title': 'test'}
# "priority" is None - excluded

# Contrast:
t2 = Task(title="test", priority=None)  # explicitly set to None
t2.model_dump(exclude_unset=True)
# {'title': 'test', 'priority': None}   - priority was SET (to None)
t2.model_dump(exclude_none=True)
# {'title': 'test'}                     - None values excluded regardless

Level 2 - Debug the Validator

A developer built this model for a booking API. Find and fix all problems:

from pydantic import BaseModel, field_validator, model_validator
from datetime import datetime

class BookingCreate(BaseModel):
    title: str
    start_time: datetime
    end_time: datetime
    attendees: list[str]

    @field_validator("attendees")
    def validate_attendees(cls, v):           # Bug 1: missing @classmethod
        if len(v) == 0:
            raise ValueError("must have at least one attendee")
        if len(v) > 100:
            raise ValueError("cannot exceed 100 attendees")
        return v

    @model_validator(mode="after")
    def check_times(self):
        if self.end_time < self.start_time:   # Bug 2: should be <=, not <
            raise ValueError("end_time must be after start_time")
        return self

    @field_validator("start_time", mode="after")
    @classmethod
    def start_must_be_future(cls, v: datetime) -> datetime:
        if v < datetime.now():                # Bug 3: timezone-naive comparison
            raise ValueError("start_time must be in the future")
        return v

Show Answer

Bug 1 - Missing @classmethod on field_validator:

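In Pydantic v2, @field_validator methods are class methods. At runtime Pydantic applies classmethod automatically when the first parameter is named cls, so this code runs, but type checkers and readers expect the explicit decorator, and a validator whose first parameter is named self is rejected with a PydanticUserError. The fix: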

@field_validator("attendees")
@classmethod
def validate_attendees(cls, v: list[str]) -> list[str]:
    if len(v) == 0:
        raise ValueError("must have at least one attendee")
    if len(v) > 100:
        raise ValueError("cannot exceed 100 attendees")
    return v

Note: @field_validator goes first, @classmethod goes second. This is the required order in Pydantic v2.

Bug 2 - < should be <= in the time comparison:

A booking where start_time == end_time would be accepted - a zero-duration booking makes no sense. The condition should exclude equal times:

@model_validator(mode="after")
def check_times(self) -> "BookingCreate":
    if self.end_time <= self.start_time:  # <=: end must be strictly after start
        raise ValueError("end_time must be strictly after start_time")
    return self

Bug 3 - Timezone-naive comparison in start_must_be_future:

datetime.now() returns a naive datetime (no timezone info). If start_time is parsed from an ISO 8601 string with a timezone offset (e.g., "2024-03-15T12:00:00Z"), v will be timezone-aware. Comparing a timezone-aware datetime to a naive datetime raises TypeError: can't compare offset-naive and offset-aware datetimes.

The fix:

from datetime import datetime, timezone

@field_validator("start_time", mode="after")
@classmethod
def start_must_be_future(cls, v: datetime) -> datetime:
    if v.tzinfo is None:
        raise ValueError("start_time must be timezone-aware (include UTC offset or Z)")
    now = datetime.now(tz=timezone.utc)
    if v <= now:
        raise ValueError("start_time must be in the future")
    return v

Always use datetime.now(tz=timezone.utc) for the comparison baseline. Require all incoming datetimes to be timezone-aware - reject naive datetimes at the validation layer.
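The underlying failure does not depend on Pydantic at all. The following standard-library sketch reproduces it; compare_raises_type_error is a hypothetical helper written for this illustration.

```python
from datetime import datetime, timezone

# A naive datetime (no tzinfo) and an aware one parsed from an ISO 8601 string.
naive = datetime(2030, 3, 15, 12, 0, 0)
aware = datetime.fromisoformat("2030-03-15T12:00:00+00:00")


def compare_raises_type_error(a: datetime, b: datetime) -> bool:
    """True if ordering a against b raises TypeError."""
    try:
        a < b
    except TypeError:
        return True
    return False


# Mixing naive and aware values in an ordering comparison fails.
mixed_fails = compare_raises_type_error(naive, aware)

# Comparing two aware values works, which is why the baseline must be aware too.
now_utc = datetime.now(tz=timezone.utc)
aware_ok = not compare_raises_type_error(aware, now_utc)

print(mixed_fails, aware_ok)
```

Note that only ordering comparisons raise; an equality check between a naive and an aware datetime returns False instead of failing, which can hide the problem.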

Level 3 - Design the Request and Response Models

You are building a task management API with these endpoints:

POST /tasks - create a task
GET /tasks/{id} - get a task (response includes computed fields not in create)
PATCH /tasks/{id} - partial update
POST /tasks/bulk - create up to 100 tasks in one request

Business rules:

Title: 1–200 chars, stripped of whitespace
Priority: 1–5 (default 2)
Due date: must be in the future if provided; must be timezone-aware
Assignee: optional UUID; if provided, must be a valid v4 UUID
Tags: list of strings, each 1–50 chars, maximum 10 tags, all lowercase
GET response includes: id (UUID), created_at (datetime UTC), updated_at (datetime UTC), title, priority, due_date, tags, assignee_id, status (an Enum)
PATCH must only update provided fields, not set unprovided fields to null
Bulk create: array of TaskCreate, minimum 1, maximum 100

Design all models with full validation, config, and correct types.

Show Answer

from pydantic import UUID4, BaseModel, ConfigDict, Field, RootModel, field_validator, model_validator
from typing import Annotated, Optional
from uuid import UUID
from datetime import datetime, timezone
from enum import Enum

# ── Reusable annotated types ───────────────────────────────────────────────────

Title = Annotated[str, Field(min_length=1, max_length=200, description="Task title")]
Priority = Annotated[int, Field(ge=1, le=5, description="1=lowest, 5=highest")]

class TaskStatus(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    CANCELLED = "cancelled"

# ── Request models ─────────────────────────────────────────────────────────────

class TaskCreate(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True)

    title: Title
    priority: Priority = 2
    due_date: Optional[datetime] = None
    assignee_id: Optional[UUID4] = None  # UUID4 rejects UUIDs that are not version 4
    tags: list[str] = Field(default_factory=list, max_length=10)

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v: list) -> list:
        """Lowercase and strip each tag before length validation."""
        if isinstance(v, list):
            return [str(tag).lower().strip() for tag in v]
        return v

    @field_validator("tags")
    @classmethod
    def validate_tag_lengths(cls, v: list[str]) -> list[str]:
        for tag in v:
            if not (1 <= len(tag) <= 50):
                raise ValueError(
                    f"each tag must be 1–50 characters, got '{tag}' ({len(tag)} chars)"
                )
        return v

    @field_validator("due_date", mode="after")
    @classmethod
    def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
        if v is None:
            return v
        if v.tzinfo is None:
            raise ValueError("due_date must be timezone-aware (e.g. '2024-12-01T12:00:00Z')")
        if v <= datetime.now(tz=timezone.utc):
            raise ValueError("due_date must be in the future")
        return v


class TaskUpdate(BaseModel):
    """All fields optional for PATCH - only provided fields are applied."""
    model_config = ConfigDict(str_strip_whitespace=True)

    title: Optional[Title] = None
    priority: Optional[Priority] = None
    due_date: Optional[datetime] = None
    assignee_id: Optional[UUID4] = None  # same v4-only rule as TaskCreate
    tags: Optional[list[str]] = None
    status: Optional[TaskStatus] = None

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v) -> list | None:
        if v is None:
            return None
        return [str(tag).lower().strip() for tag in v]

    @field_validator("due_date", mode="after")
    @classmethod
    def due_date_must_be_future_and_aware(cls, v: Optional[datetime]) -> Optional[datetime]:
        if v is None:
            return v
        if v.tzinfo is None:
            raise ValueError("due_date must be timezone-aware")
        if v <= datetime.now(tz=timezone.utc):
            raise ValueError("due_date must be in the future")
        return v


# ── Response model ─────────────────────────────────────────────────────────────

class TaskResponse(BaseModel):
    """Response model - reads from SQLAlchemy ORM object via from_attributes."""
    model_config = ConfigDict(from_attributes=True, use_enum_values=True)

    id: UUID
    title: str
    priority: int
    status: TaskStatus
    due_date: Optional[datetime]
    assignee_id: Optional[UUID]
    tags: list[str]
    created_at: datetime
    updated_at: datetime

    def to_json(self) -> str:
        """Serialize to JSON with ISO 8601 datetimes."""
        return self.model_dump_json()


# ── Bulk create ────────────────────────────────────────────────────────────────

class BulkTaskCreate(RootModel[list[TaskCreate]]):
    """Validates a JSON array of TaskCreate - minimum 1, maximum 100."""

    @model_validator(mode="after")
    def validate_count(self) -> "BulkTaskCreate":
        if len(self.root) == 0:
            raise ValueError("bulk create requires at least 1 task")
        if len(self.root) > 100:
            raise ValueError(
                f"bulk create allows at most 100 tasks, got {len(self.root)}"
            )
        return self


# ── FastAPI endpoints ──────────────────────────────────────────────────────────

from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.post("/tasks", response_model=TaskResponse, status_code=201)
async def create_task(body: TaskCreate):
    # body is already validated: title stripped, tags lowercased, due_date timezone-aware
    task = await db.create_task(body.model_dump())
    return TaskResponse.model_validate(task)

@app.get("/tasks/{task_id}", response_model=TaskResponse)
async def get_task(task_id: UUID):
    task = await db.get_task(task_id)
    if not task:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")
    return TaskResponse.model_validate(task)

@app.patch("/tasks/{task_id}", response_model=TaskResponse)
async def update_task(task_id: UUID, body: TaskUpdate):
    existing = await db.get_task(task_id)
    if not existing:
        raise HTTPException(status_code=404, detail=f"Task {task_id} not found")

    current = TaskResponse.model_validate(existing)

    # Only update fields the client actually provided
    update_data = body.model_dump(exclude_unset=True)
    updated = current.model_copy(update=update_data)

    await db.update_task(task_id, updated.model_dump(exclude={"id", "created_at", "updated_at"}))
    return updated

@app.post("/tasks/bulk", response_model=list[TaskResponse], status_code=201)
async def bulk_create_tasks(body: BulkTaskCreate):
    tasks = await db.bulk_create_tasks([t.model_dump() for t in body.root])
    return [TaskResponse.model_validate(t) for t in tasks]
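The tag handling can be exercised without the rest of the API. This is a sketch assuming Pydantic v2; TagHolder and is_rejected are hypothetical names, and the model keeps only the tags field and its two validators from TaskCreate.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class TagHolder(BaseModel):  # hypothetical reduced version of TaskCreate
    tags: list[str] = Field(default_factory=list, max_length=10)

    @field_validator("tags", mode="before")
    @classmethod
    def normalize_tags(cls, v):
        # Runs on raw input, before type and length validation.
        if isinstance(v, list):
            return [str(tag).lower().strip() for tag in v]
        return v

    @field_validator("tags")
    @classmethod
    def validate_tag_lengths(cls, v: list[str]) -> list[str]:
        # Runs on the already-normalized list.
        for tag in v:
            if not (1 <= len(tag) <= 50):
                raise ValueError("each tag must be 1-50 characters")
        return v


def is_rejected(tags: list) -> bool:
    """True if Pydantic rejects the given tag list."""
    try:
        TagHolder(tags=tags)
    except ValidationError:
        return True
    return False


normalized = TagHolder(tags=["  Python ", "API"]).tags
blank_rejected = is_rejected(["   "])  # strips to "", then fails the length check
too_many_rejected = is_rejected([f"t{i}" for i in range(11)])  # exceeds max_length=10

print(normalized, blank_rejected, too_many_rejected)
```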

Key design decisions:

Separate TaskCreate and TaskUpdate: different validation semantics - TaskCreate has required fields and defaults; TaskUpdate has all-optional fields. Never use a single model for both.
exclude_unset=True in PATCH: critical - otherwise PATCH sets every unspecified optional field to None in the database, destroying existing values.
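mode='before' for tag normalization: normalization (lowercase, strip) must run before the 1–50 character length check. Otherwise a whitespace-only tag such as "   " passes the length check and is then stripped to an empty string, and a 50-character tag padded with spaces is rejected as too long.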
RootModel for bulk create: the top-level JSON is an array - RootModel is the correct Pydantic type for this. model_validator on the root model validates the count.
from_attributes=True only on TaskResponse: request models do not need ORM mode; only the response model reads from SQLAlchemy objects. Keeping them separate avoids accidental attribute access at the wrong layer.
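The effect of exclude_unset=True is easiest to see side by side with a plain model_dump(). The sketch below assumes Pydantic v2; TaskPatch and StoredTask are hypothetical two-field stand-ins for TaskUpdate and TaskResponse.

```python
from typing import Optional

from pydantic import BaseModel


class TaskPatch(BaseModel):  # hypothetical reduced TaskUpdate
    title: Optional[str] = None
    priority: Optional[int] = None


class StoredTask(BaseModel):  # hypothetical reduced TaskResponse
    title: str
    priority: int


stored = StoredTask(title="Write report", priority=2)

# The client sends only the field it wants to change.
patch = TaskPatch.model_validate({"priority": 4})

naive_update = patch.model_dump()                   # includes title=None
safe_update = patch.model_dump(exclude_unset=True)  # only what the client sent

merged = stored.model_copy(update=safe_update)

# An explicit null IS "set", so it survives exclude_unset.
explicit_null = TaskPatch.model_validate({"title": None}).model_dump(exclude_unset=True)

print(naive_update)
print(safe_update)
print(merged)
print(explicit_null)
```

Two caveats follow from this. model_copy(update=...) does not re-run validation on the updated values, so the values must already have been validated by the patch model. And because an explicit null counts as set, a client sending "title": null would still overwrite the stored title unless the handler rejects nulls for non-nullable columns.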

Key Takeaways

Pydantic v2 coerces by default: "25" becomes 25 for an int field. Use ConfigDict(strict=True) or Annotated[int, Field(strict=True)] to disable coercion per-model or per-field.
str does not validate email format: use EmailStr from pydantic[email]. The same applies to HttpUrl, IPvAnyAddress - always use the semantic type, not str, when you have a format requirement.
@field_validator functions are class methods in Pydantic v2. Put @field_validator(...) first and @classmethod directly below it. A validator written as an instance method (first parameter self) fails with a PydanticUserError when the class is defined.
mode='before' for normalization, mode='after' for business rules: before-validators receive raw input (may not be the declared type); after-validators receive the already-coerced value.
model_dump(exclude_unset=True) is essential for PATCH: without it, every optional field not sent by the client becomes None in the database, silently destroying data.
model_dump(mode="json") vs model_dump(): only mode="json" produces a dict with JSON-serializable values. Plain model_dump() returns Python objects - passing it to json.dumps() raises TypeError for datetime, UUID, and Decimal fields.
from_attributes=True (ORM mode) enables model_validate(orm_object) - Pydantic reads attributes directly instead of requiring a dict. Use it on response models, not request models.
ConfigDict(str_strip_whitespace=True) on all request models: user input often carries accidental leading or trailing whitespace. Stripping it at the validation layer prevents data quality bugs in the database.
Discriminated unions (Field(discriminator="type")) select the correct model directly from a literal field value, so validation errors point at a single model. This is faster and clearer than a bare Union[A, B, C], where Pydantic has to attempt validation against the candidate models to find a match.
Annotated types make constraints reusable: define Priority = Annotated[int, Field(ge=1, le=5)] once, use it across all models. Change the constraint in one place, apply everywhere.
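Two of these takeaways, coercion versus strict mode and model_dump() versus model_dump(mode="json"), can be confirmed in a few lines. This is a sketch assuming Pydantic v2; Lenient, Strict, and Event are hypothetical models defined only for the demonstration.

```python
import json
from datetime import datetime, timezone
from uuid import UUID

from pydantic import BaseModel, ConfigDict, ValidationError


class Lenient(BaseModel):
    age: int


class Strict(BaseModel):
    model_config = ConfigDict(strict=True)
    age: int


# Default (lax) mode converts the numeric string; strict mode refuses it.
coerced = Lenient(age="25").age

try:
    Strict(age="25")
    strict_rejected = False
except ValidationError:
    strict_rejected = True


class Event(BaseModel):
    id: UUID
    at: datetime


event = Event(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)

python_dump = event.model_dump()           # values stay UUID / datetime objects
json_dump = event.model_dump(mode="json")  # values become JSON-safe strings

try:
    json.dumps(python_dump)
    plain_dump_serializable = True
except TypeError:
    plain_dump_serializable = False

encoded = json.dumps(json_dump)

print(coerced, strict_rejected, plain_dump_serializable)
print(encoded)
```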

What's Next

You have completed Module 06 - APIs and Web Basics. You can now:

Build HTTP APIs with FastAPI - routes, dependency injection, middleware, error handling
Serialize production data with custom encoders, orjson, and Pydantic's JSON layer
Validate request and response models with Pydantic v2 - constraints, custom validators, cross-field rules, ORM mode

Module 07 - Databases builds directly on this foundation:

SQLAlchemy Core and ORM - the from_attributes=True pattern you used in Pydantic becomes the primary integration point; defining ORM models that Pydantic can validate directly
Alembic migrations - database schema evolution, the same models evolve over time
Async database access - async with session: patterns that work correctly in FastAPI's async endpoints
Connection pooling - why engine = create_engine(...) is called once and shared, not once per request
N+1 query problems - the SQLAlchemy lazy-loading issue referenced in Lesson 07 (JSON Serialization), solved with selectinload and joinedload
