HTTP Deep Dive - What Actually Travels Over the Wire
Reading time: ~35 minutes | Level: Intermediate → Engineering
Before reading further, predict what this code does step by step:
import socket
# Raw HTTP request - no library, no abstraction
sock = socket.create_connection(("httpbin.org", 80))
sock.sendall(b"GET /get HTTP/1.1\r\nHost: httpbin.org\r\nConnection: close\r\n\r\n")
response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()
print(response[:500].decode())
Show Answer
This sends a raw HTTP/1.1 GET request directly over a TCP socket and receives the full HTTP response. The output will look like:
HTTP/1.1 200 OK
Date: Thu, 05 Mar 2026 10:00:00 GMT
Content-Type: application/json
Content-Length: 312
Connection: close
Server: gunicorn/19.9.0

{
  "args": {},
  "headers": {
    "Host": "httpbin.org"
  },
  "url": "http://httpbin.org/get"
}
This is all requests does - wrap this socket interaction. The requests library adds connection pooling, keep-alive, retry logic, cookie management, authentication helpers, SSL/TLS, and a cleaner API. But at the wire level, every HTTP request is exactly this: a formatted text message over a TCP connection.
Understanding this demystifies every HTTP library, every framework error message, and every network debugging session.
When requests.get("https://api.github.com/users/python") hangs in production and you have no idea why, you are debugging a socket. When a ConnectionResetError fires after 300 requests, it is the connection pool telling you the server closed a keep-alive connection. When you see SSLError: certificate verify failed, it is the TLS handshake failing before a single byte of HTTP is sent. These are not library bugs - they are the wire asserting itself. This lesson makes the wire legible.
What You Will Learn
- HTTP/1.1 wire format: every field, every delimiter, exact byte representation
- HTTP method semantics: safe, idempotent - why these properties matter for caching and retries
- Status code families: what each family means and which codes matter in production
- Critical headers: Content-Type, Authorization, Cache-Control, ETag, X-Request-ID
- HTTP request/response lifecycle: DNS → TCP → TLS → HTTP → response → connection reuse
- HTTP/2: multiplexing and header compression - why it matters for high-throughput APIs
- HTTP/3 and QUIC: what changed and why
- requests: Session, HTTPAdapter, retry, timeout, hooks
- httpx: async-capable, HTTP/2 support, drop-in familiar API
- Connection pooling: why requests.get() in a loop is a performance anti-pattern
Prerequisites
- Python functions, loops, and basic networking concepts (TCP/IP at a high level)
- Basic familiarity with requests.get() at the usage level
- Comfort with dictionaries and string formatting
Part 1 - The HTTP/1.1 Wire Format
Anatomy of an HTTP Request
An HTTP/1.1 request is plain text with a precise structure. Every field matters:
GET /users/123?include=orders HTTP/1.1\r\n
Host: api.example.com\r\n
Accept: application/json\r\n
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...\r\n
Accept-Encoding: gzip, deflate, br\r\n
User-Agent: python-requests/2.31.0\r\n
X-Request-ID: f47ac10b-58cc-4372-a567-0e02b2c3d479\r\n
\r\n
Breaking it down:
| Component | Description |
|---|---|
| GET | HTTP method |
| /users/123?include=orders | Request target (path + query string) |
| HTTP/1.1 | Protocol version |
| \r\n | CRLF - carriage return + line feed - required between each line |
| Host: api.example.com | Required header in HTTP/1.1; tells the server which virtual host to route to |
| Blank line (\r\n\r\n) | Signals end of headers; everything after is the body |
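The table can be turned into a few lines of code. The following is a minimal sketch, assuming a body-less request; `build_request` is a hypothetical helper written for this lesson, not part of any library.

```python
from typing import Optional

CRLF = "\r\n"


def build_request(method: str, target: str, host: str,
                  headers: Optional[dict] = None) -> bytes:
    """Serialize a body-less HTTP/1.1 request into the bytes sent on the socket."""
    # Request line first, then the mandatory Host header.
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra CRLF (the blank line) ends the headers.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


raw = build_request(
    "GET",
    "/users/123?include=orders",
    "api.example.com",
    {"Accept": "application/json"},
)
print(raw)
```

Passing `raw` to `sock.sendall()` would put the same request on the wire as the hand-written bytes in the opening example.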
For a POST request with a JSON body:
POST /users HTTP/1.1\r\n
Host: api.example.com\r\n
Content-Type: application/json\r\n
Content-Length: 38\r\n
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...\r\n
\r\n
{"name": "Alice", "email": "[email protected]"}
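Content-Length tells the server how many bytes to read as the body. If it is wrong, the request is broken: too short and the body is truncated (on a keep-alive connection the leftover bytes are misread as the start of the next request); too long and the server waits for bytes that never arrive until it times out.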
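A common way to get Content-Length wrong is to count characters instead of bytes. The sketch below uses a made-up payload with non-ASCII characters to show the difference; the payload values are illustrative only.

```python
import json

# Hypothetical payload containing characters that need two bytes each in UTF-8
payload = {"name": "Zoë", "city": "München"}

text = json.dumps(payload, ensure_ascii=False)  # str: counted in characters
body = text.encode("utf-8")                     # bytes: what goes on the wire

# Content-Length must be the byte count of the encoded body, not len(text)
headers = {
    "Content-Type": "application/json; charset=utf-8",
    "Content-Length": str(len(body)),
}

print(len(text), len(body), headers["Content-Length"])
```

Libraries such as requests compute this header for you; the distinction matters when you build requests by hand or debug a proxy that reports a length mismatch.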
Anatomy of an HTTP Response
HTTP/1.1 201 Created\r\n
Content-Type: application/json\r\n
Content-Length: 67\r\n
Location: /users/456\r\n
X-Request-ID: f47ac10b-58cc-4372-a567-0e02b2c3d479\r\n
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d"\r\n
\r\n
{"id": 456, "name": "Alice", "email": "[email protected]", "created_at": "2026-03-05"}
The status line has three parts: protocol version, status code, and reason phrase. The reason phrase is informational and ignored by machines - the status code is what matters.
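To make the response structure concrete, here is a small parsing sketch. `parse_response` is a hypothetical helper, and it deliberately simplifies: it assumes the whole response is already in memory and keeps only the last value of a repeated header.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line parts, headers, and body."""
    # The first blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, status code, optional reason phrase
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, code, reason, headers, body


sample = (
    b"HTTP/1.1 201 Created\r\n"
    b"Content-Type: application/json\r\n"
    b"Location: /users/456\r\n"
    b"\r\n"
    b'{"id": 456}'
)
version, code, reason, headers, body = parse_response(sample)
print(version, code, reason, headers["location"], body)
```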
Part 2 - HTTP Methods: Semantics, Not Just Conventions
HTTP methods have two properties that define how infrastructure (proxies, CDNs, browsers, load balancers) treats requests:
- Safe: the request does not modify server state (read-only). Infrastructure can retry safe requests freely.
- Idempotent: repeating the request N times has the same effect as sending it once. Infrastructure can retry idempotent requests after a network failure.
| Method | Safe | Idempotent | Use |
|---|---|---|---|
| GET | Yes | Yes | Read a resource. Cache-able by default. |
| HEAD | Yes | Yes | Like GET but no body. Used to check existence, get headers, validate cache. |
| OPTIONS | Yes | Yes | Discover allowed methods. Used in CORS preflight. |
| POST | No | No | Create a resource or trigger an action. Never cache. Never auto-retry. |
| PUT | No | Yes | Full replace of a resource. Idempotent: sending the same PUT twice results in the same state. |
| PATCH | No | No (by default) | Partial update. Not idempotent unless the patch is defined that way. |
| DELETE | No | Yes | Delete a resource. First call deletes; subsequent calls are 404 or 204. Same end state. |
:::danger Never Use GET for State-Changing Operations
Proxies, CDNs, and browsers cache GET requests. A browser may serve a cached response for a GET. A CDN may serve the same cached GET to a million users. A monitoring tool may probe your GET endpoint every 30 seconds. If your GET route deletes data, creates records, or triggers payments, any of these will cause production incidents. Use POST, PUT, PATCH, or DELETE for operations with side effects - no exceptions.
:::
:::note Idempotency in Practice
PUT /users/123 with {"name": "Alice"} is idempotent - sending it ten times leaves the user named Alice. POST /users with {"name": "Alice"} is not - sending it ten times creates ten Alice records. This is why payment APIs use idempotency keys on POST requests: the key makes the server treat repeated POSTs as idempotent on the application layer.
:::
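The idempotency-key idea can be sketched with an in-memory stand-in for the server. `PaymentService` and its method names are hypothetical and exist only for this illustration; a real service would persist keys in a database and expire them.

```python
import uuid


class PaymentService:
    """Hypothetical in-memory server-side handler; not a real payment API."""

    def __init__(self):
        self.charges = []   # every charge actually created
        self._seen = {}     # idempotency key -> response already returned

    def post_charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key returns the stored response instead of charging again.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        charge = {"id": len(self.charges) + 1, "amount_cents": amount_cents}
        self.charges.append(charge)
        self._seen[idempotency_key] = charge
        return charge


service = PaymentService()
key = str(uuid.uuid4())  # the client generates one key per logical operation

# The client retries the same POST three times (for example after timeouts)
responses = [service.post_charge(key, 1999) for _ in range(3)]
print(len(service.charges), responses[0] == responses[2])
```

The client reuses the same key for every retry of one operation and generates a new key for each new operation.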
Part 3 - Status Code Families
1xx - Informational
Rarely seen in Python client code. 100 Continue tells the client it may send a large body after confirming the server will accept it.
2xx - Success
| Code | Name | When to use |
|---|---|---|
| 200 OK | Success | GET, PUT, PATCH responses with a body |
| 201 Created | Created | POST response after creating a resource; include Location header |
| 204 No Content | No content | DELETE, or PUT/PATCH with no response body |
3xx - Redirection
| Code | Name | Behaviour |
|---|---|---|
| 301 Moved Permanently | Permanent redirect | Browser/client updates bookmarks; method may change to GET |
| 302 Found | Temporary redirect | Method may change to GET on redirect (legacy) |
| 307 Temporary Redirect | Temporary, method preserved | Client must repeat with same method (POST stays POST) |
| 308 Permanent Redirect | Permanent, method preserved | Same as 307 but permanent |
requests follows 301/302 redirects automatically and changes POST to GET - which is the standard browser behaviour but wrong for API clients that POST to an endpoint that has moved. Set allow_redirects=False and handle redirects explicitly for POST/PUT.
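The method-rewriting rule can be written as a pure function. This is a sketch modelled on the behaviour described above and on how requests is generally understood to treat 303 See Other as well; `method_after_redirect` is a hypothetical name, not a requests API.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a browser-like client would use on the redirected request."""
    method = method.upper()
    # 303 See Other and the legacy 302 turn everything except HEAD into GET.
    if status in (302, 303) and method != "HEAD":
        return "GET"
    # 301 historically rewrites POST to GET.
    if status == 301 and method == "POST":
        return "GET"
    # 307 and 308 exist precisely to preserve the original method and body.
    return method


for status in (301, 302, 307, 308):
    print(status, "POST ->", method_after_redirect(status, "POST"))
```

An API client that handles redirects itself (with allow_redirects=False) can use a table like this to decide whether re-sending the body is safe.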
4xx - Client Error
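The client sent a request the server could not process. Retrying without changing the request is pointless, with one exception in the table below: 429, where the same request can succeed once the rate-limit window has passed.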
| Code | Name | When |
|---|---|---|
| 400 Bad Request | Malformed request | Syntax error, missing field |
| 401 Unauthorized | Not authenticated | No credentials or invalid credentials (misleading name) |
| 403 Forbidden | Not authorized | Authenticated but lacks permission |
| 404 Not Found | Resource not found | Resource does not exist at this URL |
| 409 Conflict | Conflict | Duplicate resource, version conflict |
| 422 Unprocessable Entity | Validation failed | Syntactically valid but semantically invalid (FastAPI uses this) |
| 429 Too Many Requests | Rate limited | Check Retry-After header before retrying |
:::warning 401 vs 403: The Naming Confusion
401 Unauthorized should have been called "Unauthenticated" - it means the request lacks valid credentials. 403 Forbidden should have been called "Unauthorized" - it means the request has credentials but lacks permission. The naming is historical and cannot be changed. In your API, return 401 when credentials are missing or invalid, and 403 when the authenticated identity lacks permission.
:::
5xx - Server Error
The server failed to process a valid request. Clients should retry with backoff; requests may succeed after the server recovers.
| Code | Name | When |
|---|---|---|
| 500 Internal Server Error | Unhandled exception | Bug in server code |
| 502 Bad Gateway | Gateway received invalid response | Upstream service returned garbage |
| 503 Service Unavailable | Server temporarily down | Overloaded or in maintenance |
| 504 Gateway Timeout | Gateway timed out waiting | Upstream service too slow |
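Combining the status families with the method properties from Part 2 gives a simple retry decision. The sketch below is one possible policy, not a standard; the names `status_family`, `TRANSIENT_STATUSES`, and `should_retry` are hypothetical, and treating 429 as transient is an assumption based on the Retry-After guidance above.

```python
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client_error",
    5: "server_error",
}

# Assumption: statuses worth retrying after a delay
TRANSIENT_STATUSES = frozenset({429, 500, 502, 503, 504})
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})


def status_family(status: int) -> str:
    """Map a status code to its family using the first digit."""
    return FAMILIES.get(status // 100, "unknown")


def should_retry(method: str, status: int) -> bool:
    """Retry only transient failures, and only for idempotent methods."""
    return status in TRANSIENT_STATUSES and method.upper() in IDEMPOTENT_METHODS


for method, status in [("GET", 503), ("POST", 503), ("GET", 404), ("DELETE", 429)]:
    print(method, status, status_family(status), should_retry(method, status))
```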
Part 4 - Critical Headers
Request Headers
import requests

response = requests.get(
    "https://api.example.com/users/123",
    headers={
        # Tell the server what format you want back
        "Accept": "application/json",
        # Tell the server you accept compressed responses
        "Accept-Encoding": "gzip, deflate, br",
        # Bearer token authentication
        "Authorization": "Bearer eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ1c2VyXzEyMyJ9...",
        # Distributed tracing - attach to every request in a chain
        "X-Request-ID": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
        # Conditional GET - only return body if resource changed since ETag
        "If-None-Match": '"33a64df551425fcc55e4d42a148795d9f25f89d"',
    }
)
Response Headers
| Header | Meaning |
|---|---|
| Content-Type: application/json; charset=utf-8 | Body is JSON, UTF-8 encoded |
| Content-Length: 1024 | Body is exactly 1024 bytes |
| Location: /users/456 | Used with 201 Created or 3xx redirects |
| ETag: "abc123" | Opaque identifier for the current version of the resource |
| Last-Modified: Thu, 05 Mar 2026 10:00:00 GMT | When the resource was last changed |
| Cache-Control: max-age=3600, private | Cache this response for 1 hour, only in private (browser) caches |
| Cache-Control: no-store | Never cache this response |
| Retry-After: 60 | Wait 60 seconds before retrying (used with 429 and 503) |
| X-Request-ID: f47ac10... | Echo back the request ID for distributed tracing |
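Retry-After deserves care because it can carry either a number of seconds or an HTTP date. The following sketch handles both forms with the standard library; `parse_retry_after` is a hypothetical helper, and the `now` parameter exists only to make the example deterministic.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the number of seconds to wait, or None if the header is unusable."""
    value = value.strip()
    if value.isdigit():                 # delta-seconds form, e.g. "60"
        return float(value)
    try:
        when = parsedate_to_datetime(value)   # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:             # treat a zone-less date as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


fixed_now = datetime(2026, 3, 5, 10, 0, 0, tzinfo=timezone.utc)
print(parse_retry_after("60"))
print(parse_retry_after("Thu, 05 Mar 2026 10:01:00 GMT", now=fixed_now))
print(parse_retry_after("soon"))
```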
Authorization Patterns
import base64
import requests

# Placeholder values for illustration only
url = "https://api.example.com/users/123"
access_token = "your-access-token"
key = "your-api-key-here"

# Basic Auth - base64-encode "username:password" (encoding, not encryption: only safe over HTTPS)
credentials = base64.b64encode(b"alice:secret_password").decode()
headers = {"Authorization": f"Basic {credentials}"}
# OR use requests' built-in:
response = requests.get(url, auth=("alice", "secret_password"), timeout=10)
# Bearer Token - JWT or opaque token
headers = {"Authorization": f"Bearer {access_token}"}
# API Key - varies by service; often in a custom header
headers = {"X-API-Key": key}
# OR in query string (less secure - ends up in logs):
response = requests.get(url, params={"api_key": key}, timeout=10)
ETag and Conditional Requests
ETags enable efficient cache validation - the client only downloads the body when it has changed:
import requests
session = requests.Session()
# First request - server returns ETag
response = session.get("https://api.example.com/users/123")
etag = response.headers.get("ETag") # e.g. '"33a64df551..."'
data = response.json() # parse and cache locally
# Later - conditional GET: only download if resource changed
response = session.get(
    "https://api.example.com/users/123",
    headers={"If-None-Match": etag}
)
if response.status_code == 304:  # Not Modified
    # Use cached data - body is empty, ETag still valid
    pass
elif response.status_code == 200:
    # Resource changed - parse fresh data and update local cache
    data = response.json()
    etag = response.headers.get("ETag")
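The server side of this exchange is easy to sketch without a network. The handler below is a hypothetical stand-in for an API endpoint: it derives a strong ETag by hashing the serialized body (one common choice, assumed here) and returns 304 with an empty body when the client's If-None-Match matches.

```python
import hashlib
import json
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(resource: dict, if_none_match: Optional[str] = None):
    """Return (status, headers, body) the way a conditional-GET endpoint would."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""      # client copy is still current
    return 200, {"ETag": etag}, body


user = {"id": 123, "name": "Alice"}

status1, headers1, body1 = handle_get(user)                    # first fetch
status2, headers2, body2 = handle_get(user, headers1["ETag"])  # revalidation

user["name"] = "Alicia"                                        # resource changes
status3, headers3, body3 = handle_get(user, headers1["ETag"])

print(status1, status2, status3)
```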
Part 5 - HTTP Request/Response Lifecycle
For a cold HTTPS request to a new host, three network exchanges happen before the first byte of HTTP is sent:
- DNS resolution (0–300ms depending on TTL and resolver)
- TCP handshake (one RTT - typically 20–200ms)
- TLS 1.3 handshake (one RTT for a full handshake; a resumed session can send 0-RTT early data, while TLS 1.2 needs two RTTs)
Then the HTTP request and response (one RTT minimum for small responses). This is why connection reuse (keep-alive) and connection pooling matter so much - a reused connection skips the DNS lookup and both handshakes for subsequent requests to the same host.
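The lifecycle can be turned into back-of-the-envelope arithmetic. This sketch assumes a full TLS 1.3 handshake costs one RTT, ignores server processing time, and uses made-up numbers; `cold_request_ms` and `warm_request_ms` are hypothetical helpers.

```python
def cold_request_ms(rtt_ms: float, dns_ms: float = 0.0, tls_rtts: int = 1) -> float:
    """Estimated time to first response byte on a brand-new HTTPS connection."""
    tcp_handshake = rtt_ms              # SYN / SYN-ACK / ACK
    tls_handshake = tls_rtts * rtt_ms   # 1 RTT for TLS 1.3, 2 for TLS 1.2
    http_exchange = rtt_ms              # request out, first response bytes back
    return dns_ms + tcp_handshake + tls_handshake + http_exchange


def warm_request_ms(rtt_ms: float) -> float:
    """On a reused connection only the HTTP exchange remains."""
    return rtt_ms


rtt = 80   # assumed round-trip time in milliseconds
dns = 20   # assumed DNS lookup time in milliseconds
print("cold, TLS 1.3:", cold_request_ms(rtt, dns))
print("cold, TLS 1.2:", cold_request_ms(rtt, dns, tls_rtts=2))
print("warm:", warm_request_ms(rtt))
```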
Part 6 - Connection Management and the requests Library
Why requests.get() in a Loop Is Wrong
import requests

users = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# WRONG - creates a new TCP + TLS connection for each request
# 10 requests = 10 DNS lookups + 10 TCP handshakes + 10 TLS handshakes
for user_id in users:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    process(response.json())  # process() stands in for your own handler

import requests

# RIGHT - Session reuses the connection pool
session = requests.Session()
# Session persists: cookies, auth headers, base headers, connection pool
session.headers.update({
    "Authorization": "Bearer your-token",
    "Accept": "application/json",
})
for user_id in users:
    response = session.get(f"https://api.example.com/users/{user_id}")
    process(response.json())
:::warning requests.get() Creates a New Connection Each Time
Every call to requests.get(), requests.post(), etc. at the module level creates a new Session internally and tears it down after the call. For a single request, this is fine. For multiple requests to the same host - even two - use a Session explicitly. The connection pool reuses the TCP and TLS layer, so each subsequent request skips the handshakes and costs roughly one round trip instead of three.
:::
Timeouts: Always Set Them Explicitly
import requests
# WRONG - no timeout; this can block forever
response = requests.get("https://slow-api.example.com/data")
# RIGHT - connect timeout and read timeout separately
response = requests.get(
    "https://api.example.com/data",
    timeout=(3.05, 30)
    # (connect_timeout, read_timeout)
    # connect: how long to wait to establish the connection
    # read: how long to wait between bytes once connected
)
# OR a single value - applies to both connect and read
response = requests.get("https://api.example.com/data", timeout=10)
:::tip Always Set Explicit Timeouts
A requests call without a timeout will block indefinitely if the server stops responding mid-transfer. In production, this means a single slow upstream API can exhaust your thread pool, taking down the entire service. Set timeout=(connect_seconds, read_seconds) on every request. The connect timeout should be short (1–5 seconds); the read timeout depends on the expected response size and server processing time.
:::
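What a read timeout does at the socket level can be shown on the loopback interface. In this sketch a listening socket that never accepts or replies stands in for a hung server; the 0.2 second value is arbitrary.

```python
import socket

# A listener that never calls accept() or sends data simulates a stalled server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

# The timeout applies to the connect and to every later recv() on this socket.
client = socket.create_connection((host, port), timeout=0.2)
try:
    client.recv(4096)           # no data ever arrives
    timed_out = False
except socket.timeout:
    timed_out = True            # without a timeout this call would block forever
finally:
    client.close()
    server.close()

print("timed out:", timed_out)
```

The read timeout in requests behaves the same way in spirit: it bounds the wait for the next bytes, not the total duration of the download.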
Retry Logic with HTTPAdapter
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def create_session_with_retry() -> requests.Session:
    """
    Production-grade requests Session with:
    - 3 retries on transient failures
    - Exponential backoff between retries
    - Retry only on 5xx errors and connection failures
    - NOT retrying on 4xx (client errors are not transient)
    """
    session = requests.Session()
    retry_strategy = Retry(
        total=3,
        # Delay doubles on each retry; urllib3 documents no delay before the
        # first retry, so this value gives roughly 0s, 1s, 2s
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["GET", "HEAD", "OPTIONS"],  # safe + idempotent only
        raise_on_status=False,  # don't raise; let the caller check status
    )
    adapter = HTTPAdapter(
        max_retries=retry_strategy,
        pool_connections=10,  # number of per-host connection pools to cache
        pool_maxsize=20,      # max connections per pool
    )
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = create_session_with_retry()
response = session.get("https://api.example.com/data", timeout=(3, 30))
response.raise_for_status()  # raises HTTPError for 4xx/5xx after retries exhausted
data = response.json()
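The idea behind exponential backoff is independent of any library. The sketch below is a generic illustration, not urllib3's Retry implementation (whose exact schedule may differ); `backoff_delays`, `call_with_retry`, and `flaky` are hypothetical names.

```python
import time


def backoff_delays(base: float, retries: int, cap: float = 30.0) -> list:
    """Delay before each retry: base, 2*base, 4*base, ... never above cap."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]


def call_with_retry(func, retries=3, base=0.5,
                    retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again up to `retries` times."""
    for delay in backoff_delays(base, retries):
        try:
            return func()
        except retry_on:
            sleep(delay)
    return func()  # final attempt: any exception propagates to the caller


# A simulated operation that fails twice with a transient error, then succeeds
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


slept = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky, sleep=slept.append)
print(result, slept)
```

Production retry loops usually add random jitter to each delay so that many clients recovering from the same outage do not retry in lockstep.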
:::danger Never Disable SSL Verification in Production
requests.get(url, verify=False) disables certificate validation entirely. It silences the SSL warning, but it does not fix the underlying problem - it opens the connection to man-in-the-middle attacks. Any intermediate network node (corporate proxy, cloud load balancer, ISP) can intercept and read the traffic. Fix SSL errors properly: update the CA bundle (pip install certifi), specify the correct CA file (verify="/path/to/ca-bundle.crt"), or debug the certificate chain. Never ship verify=False in production code.
:::
Request Hooks
import logging
import requests

logger = logging.getLogger(__name__)


def log_response_time(response, *args, **kwargs):
    """Hook called after every response. Logs timing for observability."""
    elapsed = response.elapsed.total_seconds()
    logger.info(
        "HTTP %s %s → %d in %.3fs",
        response.request.method,
        response.request.url,
        response.status_code,
        elapsed,
    )
    if elapsed > 1.0:
        logger.warning("Slow response: %.3fs for %s", elapsed, response.request.url)


session = requests.Session()
session.hooks["response"].append(log_response_time)
# Every response through this session logs timing automatically
response = session.get("https://api.example.com/users/123", timeout=10)
Part 7 - httpx: The Modern Alternative
httpx is a near drop-in replacement for requests that adds async support and HTTP/2:
import asyncio
import httpx

# Synchronous - near-identical API to requests
with httpx.Client(timeout=10) as client:
    response = client.get("https://api.example.com/users/123")
    data = response.json()


# Async - same interface, async methods
async def fetch_users(user_ids: list[int]) -> list[dict]:
    """Fetch multiple users concurrently on one event loop, without threads."""
    async with httpx.AsyncClient(timeout=10) as client:
        tasks = [
            client.get(f"https://api.example.com/users/{uid}")
            for uid in user_ids
        ]
        responses = await asyncio.gather(*tasks)
        return [r.json() for r in responses]


# HTTP/2 - multiplexes multiple requests over one connection
# (requires the optional extra: pip install "httpx[http2]")
async def fetch_with_http2():
    async with httpx.AsyncClient(http2=True, timeout=10) as client:
        response = await client.get("https://api.example.com/users/123")
        return response.json()


asyncio.run(fetch_with_http2())
| Feature | requests | httpx |
|---|---|---|
| Synchronous | Yes | Yes |
| Async | No | Yes |
| HTTP/2 | No | Yes (with httpx[http2]) |
| API familiarity | Reference | Very similar to requests |
| Streaming | Yes | Yes |
| Test client | No (use responses) | Yes (built-in httpx.MockTransport) |
Part 8 - HTTP/2 and HTTP/3
HTTP/2: Multiplexing and Header Compression
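HTTP/1.1 has two performance problems:
- Head-of-line blocking: requests on a single connection are serialized. If response #1 is large, response #2 waits.
- Header repetition: every request sends the full header set (often 500–1000 bytes of repeated Cookie, User-Agent, Authorization).
HTTP/2 solves both:
Multiplexing: each request/response pair travels as an independent stream, and frames from many streams are interleaved over a single TCP connection. A large response no longer blocks the smaller ones queued behind it.
HPACK header compression maintains a shared table of previously sent headers. After the first request sends Authorization: Bearer eyJ..., subsequent requests send a short index reference (typically one or two bytes) to that header instead of the full value. For APIs with large auth tokens, this can reduce per-request header overhead by 60–90%, depending on how much of the header set repeats.
Server push (rarely used in APIs): the server can send responses the client has not asked for yet. It was designed for delivering CSS/JS assets to web browsers, where support has since been largely withdrawn, and it is almost never used in API design.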
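The indexing idea behind header compression can be illustrated with a toy model. This is emphatically not the real HPACK wire format (which also has a static table, Huffman coding, and size limits); `ToyHeaderTable` and `encoded_size` are hypothetical names for this sketch.

```python
class ToyHeaderTable:
    """Toy model of indexed header compression; NOT the real HPACK encoding."""

    def __init__(self):
        self._index = {}  # (name, value) -> small integer reference

    def encode(self, headers: dict) -> list:
        encoded = []
        for name, value in headers.items():
            entry = (name.lower(), value)
            if entry in self._index:
                # Seen before on this connection: send only the reference.
                encoded.append(self._index[entry])
            else:
                # First occurrence: send the literal and remember it.
                self._index[entry] = len(self._index) + 1
                encoded.append(f"{name.lower()}: {value}")
        return encoded


def encoded_size(items: list) -> int:
    """Count 1 byte per reference and the UTF-8 length of each literal."""
    return sum(1 if isinstance(item, int) else len(item.encode("utf-8")) for item in items)


request_headers = {
    "Authorization": "Bearer " + "x" * 200,   # stand-in for a long token
    "Accept": "application/json",
}

table = ToyHeaderTable()
first = table.encode(request_headers)
second = table.encode(request_headers)
print(encoded_size(first), encoded_size(second), second)
```

Both peers must keep identical tables for the references to make sense, which is why this state is scoped to a single connection.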
HTTP/3 and QUIC
HTTP/3 replaces TCP with QUIC (UDP-based, with reliability built in):
- Eliminates TCP head-of-line blocking at the transport layer (HTTP/2 still suffers this at TCP level)
- Faster connection establishment (0-RTT for resumed sessions)
- Better performance on lossy mobile networks (packet loss only blocks one stream, not all)
Python support for HTTP/3 is early-stage as of 2026. Most production Python APIs are HTTP/1.1 or HTTP/2. QUIC matters more for CDN edge nodes and mobile clients than for server-to-server API calls.
Part 9 - A Production HTTP Client Pattern
import uuid
import logging
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from typing import Any

logger = logging.getLogger(__name__)


class APIClient:
    """
    Production HTTP client for a single upstream API.

    Features:
    - Connection pooling via Session
    - Retry with exponential backoff on 5xx
    - Explicit timeouts on every request
    - Request ID injection for distributed tracing
    - Response time logging
    - Consistent error handling
    """

    def __init__(self, base_url: str, api_key: str, timeout: tuple = (3, 30)):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self._session = requests.Session()
        self._session.headers.update({
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        })
        retry = Retry(
            total=3,
            backoff_factor=0.5,
            status_forcelist=[500, 502, 503, 504],
            allowed_methods=["GET", "HEAD", "DELETE"],
        )
        adapter = HTTPAdapter(max_retries=retry, pool_maxsize=20)
        self._session.mount("https://", adapter)
        self._session.mount("http://", adapter)
        self._session.hooks["response"].append(self._log_response)

    def _log_response(self, response, *args, **kwargs):
        elapsed = response.elapsed.total_seconds()
        logger.info(
            "API %s %s → %d (%.3fs)",
            response.request.method,
            response.request.url,
            response.status_code,
            elapsed,
        )

    def _request(self, method: str, path: str, **kwargs) -> requests.Response:
        url = f"{self.base_url}/{path.lstrip('/')}"
        request_id = str(uuid.uuid4())
        # Copy so the caller's headers dict is never mutated
        headers = dict(kwargs.pop("headers", None) or {})
        headers["X-Request-ID"] = request_id
        response = self._session.request(
            method,
            url,
            headers=headers,
            timeout=self.timeout,
            **kwargs,
        )
        response.raise_for_status()
        return response

    def get(self, path: str, **kwargs) -> Any:
        return self._request("GET", path, **kwargs).json()

    def post(self, path: str, body: dict, **kwargs) -> Any:
        return self._request("POST", path, json=body, **kwargs).json()

    def patch(self, path: str, body: dict, **kwargs) -> Any:
        return self._request("PATCH", path, json=body, **kwargs).json()

    def delete(self, path: str, **kwargs) -> None:
        self._request("DELETE", path, **kwargs)

    def close(self):
        self._session.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


# Usage
with APIClient("https://api.example.com", api_key="sk-...") as client:
    user = client.get("/users/123")
    client.delete("/users/456")
:::note HTTP Is Stateless
Every HTTP request is independent. The server has no memory of previous requests from the same client. Cookies, sessions, JWTs, and API keys are all application-layer illusions built on top of a fundamentally stateless protocol. Each request must carry all the context the server needs to process it. This is a feature - it makes HTTP services horizontally scalable, because any server instance can handle any request.
:::
Graded Practice Challenges
Level 1 - Predict and Identify
Question 1: What is the minimum number of network round trips needed before an HTTP response body can start arriving on a cold HTTPS connection?
Show Answer
Three round trips before HTTP:
- DNS resolution (not strictly a round trip in the HTTP sense, but a network call)
- TCP handshake (SYN / SYN-ACK / ACK = one RTT)
- TLS 1.3 handshake (one RTT for the key exchange)
Then one more RTT for the HTTP request and the first response bytes to arrive.
Total: 3–4 RTTs on a cold connection. On a warm connection (Session with keep-alive), the TCP and TLS handshakes are skipped - the HTTP request goes out immediately. This is why Session reuse matters.
Question 2: Which HTTP methods are both safe AND idempotent? Which are idempotent but not safe? Which are neither?
Show Answer
Safe AND idempotent (read-only, repeatable): GET, HEAD, OPTIONS
Idempotent but NOT safe (write operations with stable end state): PUT, DELETE
Neither safe nor idempotent: POST, PATCH
The practical consequence: GET, HEAD, and OPTIONS can be retried automatically by any infrastructure (proxy, load balancer, retry adapter). PUT and DELETE can be retried by the application if a network error occurs mid-flight. POST and PATCH must not be automatically retried - a duplicate POST creates two records; a duplicate PATCH may apply the change twice (unless the server implements idempotency keys).
Question 3: What does this code actually do differently from a plain requests.get() call?
session = requests.Session()
session.mount("https://", HTTPAdapter(pool_maxsize=20))
response = session.get("https://api.example.com/users/1", timeout=(3, 30))
Show Answer
Three differences from requests.get():
- Connection pooling: the Session holds up to 20 open HTTPS connections to api.example.com. Subsequent requests to the same host reuse existing connections, skipping the TCP + TLS handshake (saving 50–400ms per request depending on RTT).
- Timeout: (3, 30) sets a 3-second connect timeout and a 30-second read timeout. Plain requests.get() without timeout blocks indefinitely. This is the most important difference for production safety.
- Reusable configuration: Session persists headers, auth, and cookies across requests. Any subsequent call to session.get() or session.post() reuses the same pool and configuration without re-initialization overhead.
Question 4: A service makes 1000 API calls per hour to the same external host. Currently it uses requests.get() at the module level. Estimate how much time is wasted on connection overhead compared to using a Session.
Show Answer
Assumptions: typical RTT to a public HTTPS endpoint is 80ms. Each cold connection requires:
- TCP handshake: ~80ms (1 RTT)
- TLS 1.3 handshake: ~80ms (1 RTT)
- Total overhead per request: ~160ms
With requests.get() (new connection each time):
- 1000 requests × 160ms = 160 seconds of wasted connection overhead per hour
With Session (connection pool reused):
- First request: 160ms overhead
- Subsequent 999 requests: ~0ms connection overhead
- Total overhead: ~160ms
Net savings: ~160 seconds per hour - wasted purely on redundant TCP + TLS handshakes. At higher request rates this becomes a significant latency and resource cost.
Level 2 - Debug and Fix
Find and fix all problems in this HTTP client code:
import requests


def fetch_all_users(user_ids):
    results = []
    for uid in user_ids:
        resp = requests.get(f"https://api.example.com/users/{uid}")
        if resp.status_code == 200:
            results.append(resp.json())
        elif resp.status_code == 500:
            results.append(fetch_all_users([uid]))  # retry
    return results


def create_user(name, email):
    resp = requests.get(
        "https://api.example.com/users",
        params={"name": name, "email": email}
    )
    return resp.json()


def delete_user(uid):
    resp = requests.post(
        "https://api.example.com/users",
        json={"user_id": uid, "action": "delete"},
        verify=False
    )
    return resp.status_code == 200
Show Solution
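Bug 1 - No Session, no timeout:
requests.get() at module level creates a new connection for every call, which is expensive for a list of user IDs. With no timeout, a single unresponsive server blocks the loop indefinitely.
Bug 2 - Naive retry causes unbounded recursion:
fetch_all_users([uid]) recurses on every 500 response with no limit and no backoff - it ends in a RecursionError if the server stays down. It also appends a list rather than a user dict, so results ends up nested, and any other status (404, 503) is silently dropped.
Bug 3 - Wrong method for create:
requests.get() for creating a user is wrong semantically and dangerous - GET is defined as safe, so caches, crawlers, and retrying proxies may replay it. It also puts the name and email in the URL, where they end up in logs. Use POST with a JSON body.
Bug 4 - Wrong method for delete:
requests.post() with an "action": "delete" field violates REST conventions and hides the operation from infrastructure that relies on method semantics. POST is not idempotent, so the call cannot be retried safely. Use DELETE on /users/{uid}.
Bug 5 - verify=False:
SSL verification disabled - man-in-the-middle vulnerability in production.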
Fixed version:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session() -> requests.Session:
    session = requests.Session()
    retry = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["GET", "DELETE"],
    )
    session.mount("https://", HTTPAdapter(max_retries=retry, pool_maxsize=10))
    return session


_SESSION = make_session()


def fetch_all_users(user_ids: list[int]) -> list[dict]:
    results = []
    for uid in user_ids:
        response = _SESSION.get(
            f"https://api.example.com/users/{uid}",
            timeout=(3, 30),
        )
        response.raise_for_status()  # raises on 4xx/5xx; retry handles 5xx
        results.append(response.json())
    return results


def create_user(name: str, email: str) -> dict:
    response = _SESSION.post(
        "https://api.example.com/users",
        json={"name": name, "email": email},
        timeout=(3, 30),
    )
    response.raise_for_status()
    return response.json()


def delete_user(uid: int) -> None:
    response = _SESSION.delete(
        f"https://api.example.com/users/{uid}",
        timeout=(3, 30),
    )
    response.raise_for_status()
Level 3 - Design Challenge
Design a RateLimitedHTTPClient class that:
- Wraps requests.Session
- Respects Retry-After response headers on 429 responses
- Maintains a configurable per-host request rate (e.g. max 10 requests/second)
- Logs all rate limiting events with host, wait duration, and request URL
- Raises RateLimitExceeded after a configurable maximum wait time
Show Reference Solution
import time
import logging
import threading
import requests
from collections import defaultdict, deque
from typing import Optional
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


class RateLimitExceeded(Exception):
    pass


class RateLimitedHTTPClient:
    """
    HTTP client that enforces per-host rate limits and respects Retry-After.

    Args:
        requests_per_second: Maximum requests per second per host (sliding window)
        max_wait_seconds: Maximum time to wait on a 429 before raising
        session: Optional pre-configured requests.Session
    """

    def __init__(
        self,
        requests_per_second: float = 10.0,
        max_wait_seconds: float = 60.0,
        session: Optional[requests.Session] = None,
    ):
        self.requests_per_second = requests_per_second
        self.max_wait_seconds = max_wait_seconds
        self._session = session or requests.Session()
        # Per-host sliding window of recent request timestamps
        self._lock = threading.Lock()
        self._request_times: dict[str, deque] = defaultdict(deque)

    def _host(self, url: str) -> str:
        return urlparse(url).netloc

    def _throttle(self, url: str) -> None:
        """Block until the per-host rate limit allows the next request."""
        host = self._host(url)
        # Allow `limit` requests per `window` seconds (e.g. 10 per 1.0s)
        limit = max(1, int(self.requests_per_second))
        window = limit / self.requests_per_second
        while True:
            with self._lock:
                now = time.monotonic()
                times = self._request_times[host]
                # Drop timestamps that have left the window
                while times and now - times[0] >= window:
                    times.popleft()
                if len(times) < limit:
                    times.append(now)
                    return
                wait = times[0] + window - now
            # Sleep outside the lock so requests to other hosts are not blocked
            logger.debug("Rate throttle: %.3fs for %s (%s)", wait, host, url)
            time.sleep(wait)

    def _handle_429(self, response: requests.Response, url: str) -> None:
        retry_after = response.headers.get("Retry-After", "").strip()
        try:
            wait = float(retry_after)
        except (ValueError, TypeError):
            # Missing, invalid, or HTTP-date form: fall back to a default backoff
            wait = 5.0
        logger.warning(
            "Rate limited (429): %s - waiting %.1fs (Retry-After: %r)",
            url, wait, retry_after
        )
        if wait > self.max_wait_seconds:
            raise RateLimitExceeded(
                f"Server requested {wait}s wait on {url}; "
                f"max_wait_seconds={self.max_wait_seconds}"
            )
        time.sleep(wait)

    def request(self, method: str, url: str, **kwargs) -> requests.Response:
        kwargs.setdefault("timeout", (3, 30))
        max_attempts = 5
        for attempt in range(1, max_attempts + 1):
            self._throttle(url)
            response = self._session.request(method, url, **kwargs)
            if response.status_code != 429:
                return response
            if attempt < max_attempts:
                self._handle_429(response, url)
        raise RateLimitExceeded(
            f"Still rate limited after {max_attempts} attempts: {url}"
        )

    def get(self, url: str, **kwargs) -> requests.Response:
        return self.request("GET", url, **kwargs)

    def post(self, url: str, **kwargs) -> requests.Response:
        return self.request("POST", url, **kwargs)

    def close(self):
        self._session.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


# Usage (process() stands in for your own handler)
with RateLimitedHTTPClient(requests_per_second=5, max_wait_seconds=30) as client:
    for user_id in range(100):
        response = client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        process(response.json())
Design decisions:
- Per-host sliding window prevents one busy host from affecting requests to other hosts
- _handle_429 separates server-requested rate limiting (via Retry-After) from client-side throttling - they have different semantics and different wait durations
- max_wait_seconds provides a circuit breaker for server-requested waits that would block the caller for unacceptable durations
- Thread lock on the request time window makes the client safe for multi-threaded use
- The request() method is the single dispatch point - all HTTP methods go through it, making rate limiting and retry logic uniform
Key Takeaways
- HTTP is plain text over a TCP socket: request line + headers + blank line + optional body. Every HTTP library, framework, and debugging tool operates on this format
- GET, HEAD, OPTIONS are safe and idempotent - infrastructure can retry them freely. PUT, DELETE are idempotent but not safe. POST and PATCH are neither - never auto-retry them
- Status code families: 2xx (success), 3xx (redirect), 4xx (client error - fix the request), 5xx (server error - retry with backoff)
- 401 means unauthenticated (missing/invalid credentials); 403 means unauthorized (lacks permission). 429 means rate limited - check Retry-After before retrying
- ETags enable conditional GET: send If-None-Match with the ETag and get 304 Not Modified (with no body) when the resource has not changed
- A cold HTTPS connection requires DNS + TCP handshake + TLS handshake before the first HTTP byte - 3+ round trips. Session reuse eliminates handshake overhead for subsequent requests
- Always set explicit timeouts: timeout=(connect_seconds, read_seconds). A hanging request without a timeout blocks a thread or coroutine indefinitely
- Use Session for multiple requests to the same host. requests.get() at module level creates a new connection every time
- Never use verify=False in production. Fix SSL errors properly with correct CA bundles
- HTTPAdapter with Retry handles transient 5xx failures with exponential backoff - only safe for idempotent methods
- httpx provides near-identical API to requests with added async support and HTTP/2 multiplexing
- HTTP/2 multiplexes multiple streams over one connection and compresses repeated headers - significant throughput improvement for high-request-rate API clients
What's Next
Lesson 02 covers REST principles - the architectural style that gives HTTP methods their meaning in API design. You will see two designs for the same API, understand why one is "RESTful" and the other is not, and learn the specific rules (URL design, method semantics, status codes, error formats, pagination, versioning) that make REST APIs predictable, cacheable, and maintainable at scale.
