Protocol and Structural Subtyping
Predict the outcome. Does this code pass mypy?
from typing import Protocol
class Drawable(Protocol):
def draw(self) -> str: ...
class Circle:
def draw(self) -> str:
return "O"
class DatabaseConnection:
def draw(self) -> str:
return "SELECT * FROM connections"
def render(shape: Drawable) -> None:
print(shape.draw())
render(Circle()) # Line A
render(DatabaseConnection()) # Line B
Both Line A and Line B pass mypy with no errors. DatabaseConnection has nothing to do with shapes, but it satisfies the Drawable protocol because it has a draw() -> str method. This is structural subtyping -- compatibility is determined by structure (what methods/attributes exist), not by inheritance. This is powerful and dangerous in equal measure. This lesson teaches you to wield it precisely.
What You Will Learn
- How
typing.Protocoldiffers from abstract base classes (nominal subtyping) - How to define, compose, and extend protocols
- Using
runtime_checkablefor isinstance checks - Combining protocols with generics
- When Protocol is the right tool and when ABC is better
- Real-world protocols: file-like objects, Django querysets, plugin systems
Prerequisites
- Solid understanding of ABCs (
abc.ABC,@abstractmethod) - TypeVar and Generic from the previous lesson
- Experience with Python duck typing in practice
- Basic familiarity with the
typingmodule
Part 1 -- Nominal vs Structural Subtyping
Python has always had two subtyping philosophies in tension.
Nominal Subtyping (ABCs)
With ABCs, a class is a subtype only if it explicitly inherits from the base:
from abc import ABC, abstractmethod
class Serializable(ABC):
@abstractmethod
def serialize(self) -> bytes: ...
class UserRecord(Serializable):
def __init__(self, name: str) -> None:
self.name = name
def serialize(self) -> bytes:
return self.name.encode()
class Config:
def serialize(self) -> bytes:
return b"{}"
def save(obj: Serializable) -> None:
data = obj.serialize()
save(UserRecord("Alice")) # OK
save(Config()) # ERROR: Config is not a subclass of Serializable
Config has the right method signature but does not inherit from Serializable, so it fails. This is nominal -- the name (inheritance chain) matters.
Structural Subtyping (Protocol)
With Protocol, a class is a subtype if it has the required structure:
from typing import Protocol
class Serializable(Protocol):
def serialize(self) -> bytes: ...
class UserRecord:
def __init__(self, name: str) -> None:
self.name = name
def serialize(self) -> bytes:
return self.name.encode()
class Config:
def serialize(self) -> bytes:
return b"{}"
def save(obj: Serializable) -> None:
data = obj.serialize()
save(UserRecord("Alice")) # OK
save(Config()) # OK -- has serialize() -> bytes
No inheritance needed. If it has the methods, it satisfies the protocol.
When to Use Which
| Criterion | Protocol | ABC |
|---|---|---|
| Third-party classes you cannot modify | Protocol | Cannot use ABC |
| Need runtime enforcement of contract | ABC | Protocol only with runtime_checkable |
| Interop with existing duck-typed code | Protocol | Requires refactoring |
| Need shared implementation (mixin logic) | ABC | Protocol can have default methods but it is unconventional |
| Want to document "this IS-A relationship" | ABC | Protocol implies "behaves-like" |
Default to Protocol when defining interfaces for function parameters. Use ABC when you own the entire hierarchy and need shared implementation or runtime enforcement.
Part 2 -- Defining Protocols
Basic Protocol
from typing import Protocol
class SupportsClose(Protocol):
def close(self) -> None: ...
The ... (ellipsis) in the method body means "this is a specification, not an implementation." Any class with a close(self) -> None method satisfies this protocol.
Protocol with Attributes
Protocols can require attributes, not just methods:
from typing import Protocol
class Named(Protocol):
name: str
class User:
def __init__(self, name: str) -> None:
self.name = name
class Server:
name: str = "default"
def greet(entity: Named) -> str:
return f"Hello, {entity.name}"
greet(User("Alice")) # OK
greet(Server()) # OK
Protocol attribute checking has a subtlety: class-level annotations without values create instance attribute requirements. The conforming class must have the attribute accessible on instances, whether set in __init__, as a class variable, or as a property.
class Bad:
pass
# no 'name' attribute at all
greet(Bad()) # ERROR: Bad has no attribute "name"
Protocol with Properties
from typing import Protocol
class SizedAndNamed(Protocol):
@property
def name(self) -> str: ...
@property
def size(self) -> int: ...
class File:
def __init__(self, name: str, data: bytes) -> None:
self._name = name
self._data = data
@property
def name(self) -> str:
return self._name
@property
def size(self) -> int:
return len(self._data)
def describe(item: SizedAndNamed) -> str:
return f"{item.name}: {item.size} bytes"
describe(File("readme.txt", b"hello")) # OK
Protocol Methods with Default Implementations
Protocols can include methods with implementations. These serve as documentation, but the key rule remains: conforming classes do NOT need to inherit from the protocol.
from typing import Protocol
class Loggable(Protocol):
def log_prefix(self) -> str: ...
def log(self, message: str) -> str:
"""Default implementation -- but conforming classes need their own."""
return f"[{self.log_prefix()}] {message}"
A method with a body in a Protocol is still part of the structural contract. The conforming class must have a compatible method signature, but it does not inherit the implementation unless it explicitly subclasses the protocol (which defeats the purpose of structural subtyping).
Part 3 -- Composing Protocols
Protocol Inheritance
Protocols can extend other protocols to build richer interfaces:
from typing import Protocol
class SupportsRead(Protocol):
def read(self, n: int = -1) -> bytes: ...
class SupportsWrite(Protocol):
def write(self, data: bytes) -> int: ...
class SupportsClose(Protocol):
def close(self) -> None: ...
class ReadWriteCloseable(SupportsRead, SupportsWrite, SupportsClose, Protocol):
"""A file-like object that supports reading, writing, and closing."""
...
When composing protocols through inheritance, you must include Protocol in the base classes. Without it, you create a regular abstract class, not a protocol.
# WRONG -- not a protocol, just a regular class
class ReadWrite(SupportsRead, SupportsWrite):
...
# CORRECT -- still a protocol
class ReadWrite(SupportsRead, SupportsWrite, Protocol):
...
Intersection via Composition
from typing import Protocol
class Hashable(Protocol):
def __hash__(self) -> int: ...
class Comparable(Protocol):
def __lt__(self, other: object) -> bool: ...
def __eq__(self, other: object) -> bool: ...
class SortableHashable(Hashable, Comparable, Protocol):
"""Can be used as dict key and sorted."""
...
def dedupe_and_sort(items: list[SortableHashable]) -> list[SortableHashable]:
return sorted(set(items))
Small, Focused Protocols
The Interface Segregation Principle applies directly. Prefer many small protocols over one large one:
from typing import Protocol
# BAD: monolithic protocol
class Repository(Protocol):
def get(self, id: int) -> object: ...
def save(self, obj: object) -> None: ...
def delete(self, id: int) -> None: ...
def list_all(self) -> list[object]: ...
def count(self) -> int: ...
def exists(self, id: int) -> bool: ...
# GOOD: composed from focused protocols
class Readable(Protocol):
def get(self, id: int) -> object: ...
def list_all(self) -> list[object]: ...
class Writable(Protocol):
def save(self, obj: object) -> None: ...
def delete(self, id: int) -> None: ...
class Queryable(Protocol):
def count(self) -> int: ...
def exists(self, id: int) -> bool: ...
Functions that only read take Readable. Functions that write take Writable. This reduces coupling and makes testing simpler.
Part 4 -- runtime_checkable
By default, protocols are static-only. Adding @runtime_checkable enables isinstance checks:
from typing import Protocol, runtime_checkable
@runtime_checkable
class SupportsLen(Protocol):
def __len__(self) -> int: ...
print(isinstance([1, 2, 3], SupportsLen)) # True
print(isinstance("hello", SupportsLen)) # True
print(isinstance(42, SupportsLen)) # False
Limitations of runtime_checkable
Runtime checks only verify method/attribute existence, not signatures:
from typing import Protocol, runtime_checkable
@runtime_checkable
class Transformer(Protocol):
def transform(self, data: bytes) -> str: ...
class BadTransformer:
def transform(self) -> int: # Wrong signature entirely
return 42
print(isinstance(BadTransformer(), Transformer)) # True! (only checks name exists)
runtime_checkable + isinstance checks only that the method/attribute names exist on the object. It does not verify:
- Parameter types or counts
- Return types
- Property vs method distinction
Use it for coarse filtering, not for type safety guarantees. Rely on static type checking (mypy/pyright) for full signature verification.
Performance Consideration
isinstance with a runtime-checkable Protocol is slower than isinstance with a regular class, because it must inspect the object's attributes dynamically via hasattr checks. Avoid using it in performance-critical hot loops. For high-throughput code, consider caching the check result or using a different dispatch mechanism.
Part 5 -- Generic Protocols
Protocols and generics combine to create powerful, reusable interfaces:
from typing import Protocol, TypeVar
T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
class Mapper(Protocol[T, T_co]):
"""A protocol for objects that map from T to some output type."""
def map(self, value: T) -> T_co: ...
class StringMapper:
def map(self, value: str) -> int:
return len(value)
class IntMapper:
def map(self, value: int) -> str:
return str(value)
def apply_mapper(mapper: Mapper[str, int], value: str) -> int:
return mapper.map(value)
apply_mapper(StringMapper(), "hello") # OK, returns 5
apply_mapper(IntMapper(), "hello") # ERROR: IntMapper is Mapper[int, str]
Generic Repository Protocol
from typing import Protocol, TypeVar, Sequence
T = TypeVar("T")
class Repository(Protocol[T]):
def get(self, id: int) -> T | None: ...
def save(self, entity: T) -> T: ...
def list_all(self) -> Sequence[T]: ...
class User:
def __init__(self, id: int, name: str) -> None:
self.id = id
self.name = name
class InMemoryUserRepo:
"""Satisfies Repository[User] without inheriting from it."""
def __init__(self) -> None:
self._store: dict[int, User] = {}
def get(self, id: int) -> User | None:
return self._store.get(id)
def save(self, entity: User) -> User:
self._store[entity.id] = entity
return entity
def list_all(self) -> Sequence[User]:
return list(self._store.values())
def count_entities(repo: Repository[User]) -> int:
return len(repo.list_all())
count_entities(InMemoryUserRepo()) # OK -- structural match
Callable Protocol (for Complex Callable Signatures)
When Callable[[...], ...] is not expressive enough, use a Protocol with __call__:
from typing import Protocol
class EventHandler(Protocol):
def __call__(self, event_type: str, *, payload: dict[str, object]) -> bool:
"""Handle an event. Keyword-only payload argument."""
...
def on_click(event_type: str, *, payload: dict[str, object]) -> bool:
print(f"Clicked: {event_type}")
return True
def register(handler: EventHandler) -> None:
handler("click", payload={"x": 10, "y": 20})
register(on_click) # OK -- matches the Protocol's __call__ signature
Use a __call__ Protocol instead of Callable when you need:
- Keyword-only arguments
- Overloaded call signatures
- Docstring documentation on the callable interface
- Complex parameter patterns (variadic, defaults)
Part 6 -- Real-World Protocol Patterns
Pattern: File-Like Objects
The stdlib has many functions that accept "file-like objects." Protocol makes this explicit:
from typing import Protocol
class ReadableFile(Protocol):
def read(self, n: int = -1) -> bytes: ...
def seek(self, offset: int, whence: int = 0) -> int: ...
def tell(self) -> int: ...
def compute_hash(f: ReadableFile) -> str:
import hashlib
h = hashlib.sha256()
while chunk := f.read(8192):
h.update(chunk)
return h.hexdigest()
# Works with actual files:
# with open("data.bin", "rb") as f:
# compute_hash(f) # OK
# Works with BytesIO:
import io
compute_hash(io.BytesIO(b"hello world")) # OK
# Works with any object that has read/seek/tell:
class S3Object:
def __init__(self, data: bytes) -> None:
self._data = data
self._pos = 0
def read(self, n: int = -1) -> bytes:
if n == -1:
result = self._data[self._pos:]
self._pos = len(self._data)
else:
result = self._data[self._pos:self._pos + n]
self._pos += n
return result
def seek(self, offset: int, whence: int = 0) -> int:
self._pos = offset
return self._pos
def tell(self) -> int:
return self._pos
compute_hash(S3Object(b"cloud data")) # OK -- structural match
Pattern: Django-Style QuerySet Protocol
from typing import Protocol, TypeVar, Sequence, Iterator
T = TypeVar("T")
class QuerySetLike(Protocol[T]):
def filter(self, **kwargs: object) -> "QuerySetLike[T]": ...
def exclude(self, **kwargs: object) -> "QuerySetLike[T]": ...
def first(self) -> T | None: ...
def count(self) -> int: ...
def __iter__(self) -> Iterator[T]: ...
def get_active_items(qs: QuerySetLike[T]) -> Sequence[T]:
"""Works with Django QuerySets, test fakes, or any conforming container."""
return list(qs.filter(active=True))
This pattern lets you write business logic that works with Django's ORM in production and simple in-memory fakes in tests -- without importing Django at all.
Pattern: Plugin System
from typing import Protocol, runtime_checkable
@runtime_checkable
class Plugin(Protocol):
name: str
version: str
def initialize(self) -> None: ...
def execute(self, context: dict[str, object]) -> object: ...
def teardown(self) -> None: ...
class MetricsPlugin:
name = "metrics"
version = "1.0.0"
def initialize(self) -> None:
print("Metrics initialized")
def execute(self, context: dict[str, object]) -> object:
return {"requests": 42}
def teardown(self) -> None:
print("Metrics stopped")
def load_plugins(candidates: list[object]) -> list[Plugin]:
"""Filter objects that satisfy the Plugin protocol at runtime."""
return [c for c in candidates if isinstance(c, Plugin)]
plugins = load_plugins([MetricsPlugin(), "not a plugin", 42])
# plugins contains only MetricsPlugin()
Part 7 -- Protocol Gotchas and Advanced Details
Gotcha: Mutable Attribute vs Read-Only Property
from typing import Protocol
class HasName(Protocol):
@property
def name(self) -> str: ...
class MutableUser:
def __init__(self, name: str) -> None:
self.name = name # Regular mutable attribute
def greet(obj: HasName) -> str:
return f"Hi {obj.name}"
greet(MutableUser("Alice")) # OK -- mutable attribute satisfies read-only property
A mutable attribute satisfies a read-only property protocol (it can be read). But a read-only property does NOT satisfy a mutable attribute protocol (it cannot be assigned to).
Gotcha: Class Methods in Protocols
from typing import Protocol
class Creatable(Protocol):
@classmethod
def create(cls) -> "Creatable": ...
class Widget:
@classmethod
def create(cls) -> "Widget":
return cls()
def factory(creator: type[Creatable]) -> Creatable:
return creator.create()
factory(Widget) # OK
Gotcha: Protocol Cannot Be Instantiated
from typing import Protocol
class MyProto(Protocol):
def method(self) -> int: ...
# obj = MyProto() # TypeError at runtime -- Protocol classes cannot be instantiated
Gotcha: Self-Referencing Protocols
from typing import Protocol, Self
class Chainable(Protocol):
def then(self, other: Self) -> Self: ...
class Step:
def then(self, other: "Step") -> "Step":
return other
def chain(a: Chainable, b: Chainable) -> Chainable:
return a.then(b)
chain(Step(), Step()) # OK
Gotcha: Protocol Inheritance from Non-Protocol
A protocol class can inherit from non-protocol classes, but all methods from the non-protocol base become part of the structural contract:
from typing import Protocol
class Base:
def base_method(self) -> int:
return 0
class MyProtocol(Base, Protocol):
def proto_method(self) -> str: ...
# Now any conforming class must have BOTH base_method() -> int AND proto_method() -> str
Inheriting from non-protocol classes in a protocol is rare and can be confusing. Prefer composing from other protocols or defining methods directly.
Key Takeaways
- Protocol enables structural subtyping: a class satisfies a protocol if it has the right methods/attributes, no inheritance needed
- ABCs use nominal subtyping: explicit inheritance is required
- Prefer Protocol for function parameter types, especially when working with third-party code you cannot modify
- Compose protocols through inheritance -- always include
Protocolin bases when combining protocols - Follow the Interface Segregation Principle: many small protocols over one large one
@runtime_checkableenablesisinstancechecks but only verifies attribute/method existence, not signatures- Generic protocols (
Protocol[T]) combine structural subtyping with parametric polymorphism - Use
__call__protocols for complex callable signatures thatCallablecannot express - A mutable attribute satisfies a read-only property protocol, but not vice versa
- Protocol classes cannot be instantiated directly
Graded Practice Challenges
Level 1 -- Predict the Type Checker Output
Question 1: Does this pass mypy?
from typing import Protocol
class Closeable(Protocol):
def close(self) -> None: ...
class Connection:
def close(self, force: bool = False) -> None:
pass
def cleanup(resource: Closeable) -> None:
resource.close()
cleanup(Connection())
Answer
Yes, this passes mypy. Connection.close has an additional parameter force with a default value. Since it can be called as close() (with no arguments beyond self), it satisfies the protocol's close(self) -> None signature. A method with additional optional parameters is compatible with a protocol requiring fewer parameters.
Question 2: Does this pass mypy?
from typing import Protocol
class Sized(Protocol):
def __len__(self) -> int: ...
class Weighted(Protocol):
weight: float
class SizedWeighted(Sized, Weighted, Protocol):
...
class Package:
weight: float = 1.0
def __len__(self) -> int:
return 1
class Feather:
weight: float = 0.01
def ship(item: SizedWeighted) -> None:
print(f"Shipping item of size {len(item)} and weight {item.weight}")
ship(Package())
ship(Feather())
Answer
ship(Package()) passes -- Package has both __len__ and weight.
ship(Feather()) fails -- Feather has weight but no __len__ method. It does not satisfy the Sized part of the composed SizedWeighted protocol. mypy reports: Argument 1 to "ship" has incompatible type "Feather"; expected "SizedWeighted".
Question 3: What happens at runtime?
from typing import Protocol, runtime_checkable
@runtime_checkable
class Drawable(Protocol):
def draw(self, x: int, y: int) -> None: ...
class Pen:
def draw(self) -> str:
return "line"
print(isinstance(Pen(), Drawable))
Answer
It prints True. runtime_checkable only checks that the draw attribute exists on the instance. It does not verify the method signature (parameter types/counts or return type). This is a known limitation. Static type checkers would correctly reject Pen as not conforming to Drawable, but the runtime isinstance check passes.
Level 2 -- Debug and Fix
This code is supposed to define a protocol for objects that can be serialized to JSON and deserialized back. It has multiple errors. Find and fix all of them.
from typing import Protocol, runtime_checkable, TypeVar, Type
T = TypeVar("T")
class JsonSerializable(Protocol):
def to_json(self) -> str: ...
@classmethod
def from_json(cls: Type[T], data: str) -> T: ...
class UserDTO:
def __init__(self, name: str, age: int) -> None:
self.name = name
self.age = age
def to_json(self) -> str:
import json
return json.dumps({"name": self.name, "age": self.age})
@classmethod
def from_json(cls, data: str) -> "UserDTO":
import json
d = json.loads(data)
return cls(d["name"], d["age"])
def round_trip(obj: JsonSerializable) -> JsonSerializable:
json_str = obj.to_json()
return type(obj).from_json(json_str)
result = round_trip(UserDTO("Alice", 30))
print(result.name) # Error: JsonSerializable has no attribute 'name'
Answer
There are two issues:
-
Return type of
round_trip: It returnsJsonSerializable, which has nonameattribute. The function loses the concrete type. -
from_jsonclassmethod in Protocol: The TypeVarTin the classmethod is disconnected from the protocol's structural contract, making it hard for type checkers to track.
Fix -- make round_trip generic and separate concerns:
from typing import Protocol, TypeVar, Callable
class JsonSerializable(Protocol):
def to_json(self) -> str: ...
T = TypeVar("T")
def round_trip(
obj: JsonSerializable,
deserialize: Callable[[str], T],
) -> T:
json_str = obj.to_json()
return deserialize(json_str)
result = round_trip(UserDTO("Alice", 30), UserDTO.from_json)
print(result.name) # OK: result is UserDTO
The deeper lesson: classmethods in protocols are tricky because the protocol describes instance behavior, but from_json is a class-level constructor. Separating the serialization protocol (instance) from the deserialization callable (class-level) produces cleaner types.
Level 3 -- Design Challenge
Design a middleware system using protocols:
# Desired usage:
@dataclass
class Request:
path: str
headers: dict[str, str]
body: bytes
@dataclass
class Response:
status: int
body: bytes
# Middleware should be composable:
app = compose_middleware(
logging_middleware,
auth_middleware,
rate_limit_middleware,
handler,
)
response = app(Request(path="/api/data", headers={}, body=b""))
Requirements:
- Define a
Handlerprotocol and aMiddlewareprotocol - Middleware takes a
Handlerand returns a newHandler(wrapping pattern) compose_middlewarechains multiple middleware around a base handler- All type-safe -- mypy should verify the chain
- No inheritance required for concrete middleware implementations
Hint
from typing import Protocol
from dataclasses import dataclass
@dataclass
class Request:
path: str
headers: dict[str, str]
body: bytes
@dataclass
class Response:
status: int
body: bytes
class Handler(Protocol):
def __call__(self, request: Request) -> Response: ...
class Middleware(Protocol):
def __call__(self, next_handler: Handler) -> Handler: ...
def compose_middleware(
*layers: Middleware,
handler: Handler,
) -> Handler:
result = handler
for mw in reversed(layers):
result = mw(result)
return result
def logging_middleware(next_handler: Handler) -> Handler:
def wrapped(request: Request) -> Response:
print(f"-> {request.path}")
response = next_handler(request)
print(f"<- {response.status}")
return response
return wrapped
def handler(request: Request) -> Response:
return Response(status=200, body=b"OK")
app = compose_middleware(logging_middleware, handler=handler)
app(Request("/api", {}, b""))
What's Next
In the next lesson, ParamSpec and Concatenate, we tackle the hardest typing problem in Python: preserving function signatures through decorators. You will learn how ParamSpec and Concatenate make decorators fully type-safe.
