Master HTTP/1.1 at the byte level - request/response wire format, method semantics, status code families, critical headers, connection pooling, the requests and httpx libraries, HTTP/2 multiplexing, and why every production client needs explicit timeouts.

HTTP Deep Dive - What Actually Travels Over the Wire

Reading time: ~35 minutes | Level: Intermediate → Engineering

Before reading further, predict what this code does step by step:

import socket

# Raw HTTP request - no library, no abstraction
sock = socket.create_connection(("httpbin.org", 80))
sock.sendall(b"GET /get HTTP/1.1\r\nHost: httpbin.org\r\nConnection: close\r\n\r\n")
response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()
print(response[:500].decode())

Show Answer

This sends a raw HTTP/1.1 GET request directly over a TCP socket and receives the full HTTP response. The output will look like:

HTTP/1.1 200 OK
Date: Wed, 05 Mar 2026 10:00:00 GMT
Content-Type: application/json
Content-Length: 312
Connection: close
Server: gunicorn/19.9.0

{
  "args": {},
  "headers": {
    "Host": "httpbin.org"
  },
  "url": "http://httpbin.org/get"
}

This is all requests does - wrap this socket interaction. The requests library adds connection pooling, keep-alive, retry logic, cookie management, authentication helpers, SSL/TLS, and a cleaner API. But at the wire level, every HTTP request is exactly this: a formatted text message over a TCP connection.

Understanding this demystifies every HTTP library, every framework error message, and every network debugging session.

When requests.get("https://api.github.com/users/python") hangs in production and you have no idea why, you are debugging a socket. When a ConnectionResetError fires after 300 requests, it is the connection pool telling you the server closed a keep-alive connection. When you see SSLError: certificate verify failed, it is the TLS handshake failing before a single byte of HTTP is sent. These are not library bugs - they are the wire asserting itself. This lesson makes the wire legible.

What You Will Learn

HTTP/1.1 wire format: every field, every delimiter, exact byte representation
HTTP method semantics: safe, idempotent - why these properties matter for caching and retries
Status code families: what each family means and which codes matter in production
Critical headers: Content-Type, Authorization, Cache-Control, ETag, X-Request-ID
HTTP request/response lifecycle: DNS → TCP → TLS → HTTP → response → connection reuse
HTTP/2: multiplexing and header compression - why it matters for high-throughput APIs
HTTP/3 and QUIC: what changed and why
requests: Session, HTTPAdapter, retry, timeout, hooks
httpx: async-capable, HTTP/2 support, drop-in familiar API
Connection pooling: why requests.get() in a loop is a performance anti-pattern

Prerequisites

Python functions, loops, and basic networking concepts (TCP/IP at a high level)
Basic familiarity with requests.get() at the usage level
Comfort with dictionaries and string formatting

Part 1 - The HTTP/1.1 Wire Format

Anatomy of an HTTP Request

An HTTP/1.1 request is plain text with a precise structure. Every field matters:

GET /users/123?include=orders HTTP/1.1\r\n
Host: api.example.com\r\n
Accept: application/json\r\n
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...\r\n
Accept-Encoding: gzip, deflate, br\r\n
User-Agent: python-requests/2.31.0\r\n
X-Request-ID: f47ac10b-58cc-4372-a567-0e02b2c3d479\r\n
\r\n

Breaking it down:

Component	Description
`GET`	HTTP method
`/users/123?include=orders`	Request target (path + query string)
`HTTP/1.1`	Protocol version
`\r\n`	CRLF - carriage return + line feed - required between each line
`Host: api.example.com`	Required header in HTTP/1.1; tells the server which virtual host to route to
Blank line (`\r\n\r\n`)	Signals end of headers; everything after is the body
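To make the delimiters concrete, here is a minimal sketch that serializes a request head by hand. The helper name `build_request_head` is hypothetical (it is not part of any library), and it covers only the request line and headers, not a body.

```python
from typing import Optional


def build_request_head(method: str, target: str, host: str,
                       headers: Optional[dict] = None) -> bytes:
    """Serialize an HTTP/1.1 request line and headers (no body)."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends in CRLF; one extra CRLF (an empty line) ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


raw = build_request_head(
    "GET", "/users/123?include=orders", "api.example.com",
    {"Accept": "application/json"},
)
print(raw)
```

Printing the bytes object shows the `\r\n` sequences literally, which is exactly what a packet capture of the request would contain.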

For a POST request with a JSON body:

POST /users HTTP/1.1\r\n
Host: api.example.com\r\n
Content-Type: application/json\r\n
Content-Length: 47\r\n
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...\r\n
\r\n
{"name": "Alice", "email": "[email protected]"}

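Content-Length tells the server how many bytes to read as the body. If it is too short, the server truncates the body and may treat the leftover bytes as the start of the next request. If it is too long, the server waits for bytes that never arrive until it times out. Either way the request is broken.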
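Content-Length counts bytes, not characters, which matters as soon as the body contains non-ASCII text. The sketch below uses a made-up payload to show the difference; the names and values are illustrative only.

```python
import json

payload = {"name": "Zoë", "city": "München"}  # illustrative data

# ensure_ascii=False keeps non-ASCII characters as-is, so each of them
# takes more than one byte once the text is encoded as UTF-8.
body_text = json.dumps(payload, ensure_ascii=False)
body_bytes = body_text.encode("utf-8")

print(len(body_text))   # number of characters
print(len(body_bytes))  # number of bytes: this is the Content-Length value

headers = {
    "Content-Type": "application/json; charset=utf-8",
    "Content-Length": str(len(body_bytes)),
}
```

Libraries such as requests compute this header for you; the point is that a hand-built request must measure the encoded body.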

Anatomy of an HTTP Response

HTTP/1.1 201 Created\r\n
Content-Type: application/json\r\n
Content-Length: 86\r\n
Location: /users/456\r\n
X-Request-ID: f47ac10b-58cc-4372-a567-0e02b2c3d479\r\n
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d"\r\n
\r\n
{"id": 456, "name": "Alice", "email": "[email protected]", "created_at": "2026-03-05"}

The status line has three parts: protocol version, status code, and reason phrase. The reason phrase is informational and ignored by machines - the status code is what matters.
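Going the other way, a response can be split back into its parts with a few string operations. This is a simplified sketch: `parse_response` is a hypothetical helper that assumes a complete response with a reason phrase, and it ignores chunked transfer encoding and repeated headers.

```python
def parse_response(raw: bytes):
    """Split a complete HTTP/1.1 response into its parts."""
    # The first empty line (CRLF CRLF) separates headers from body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


raw = (
    b"HTTP/1.1 201 Created\r\n"
    b"Content-Type: application/json\r\n"
    b"Location: /users/456\r\n"
    b"\r\n"
    b'{"id": 456}'
)
version, status, reason, headers, body = parse_response(raw)
print(status, reason, headers["location"], body)
```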

Part 2 - HTTP Methods: Semantics, Not Just Conventions

HTTP methods have two properties that define how infrastructure (proxies, CDNs, browsers, load balancers) treats requests:

Safe: the request does not modify server state (read-only). Infrastructure can retry safe requests freely.
Idempotent: repeating the request N times has the same effect as sending it once. Infrastructure can retry idempotent requests after a network failure.

Method	Safe	Idempotent	Use
`GET`	Yes	Yes	Read a resource. Cache-able by default.
`HEAD`	Yes	Yes	Like GET but no body. Used to check existence, get headers, validate cache.
`OPTIONS`	Yes	Yes	Discover allowed methods. Used in CORS preflight.
`POST`	No	No	Create a resource or trigger an action. Never cache. Never auto-retry.
`PUT`	No	Yes	Full replace of a resource. Idempotent: sending the same PUT twice results in the same state.
`PATCH`	No	No (by default)	Partial update. Not idempotent unless the patch is defined that way.
`DELETE`	No	Yes	Delete a resource. First call deletes; subsequent calls are 404 or 204. Same end state.
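The table above can be encoded as a small lookup that a client consults before retrying. This is a sketch only; `METHOD_PROPERTIES` and `can_auto_retry` are hypothetical names, and unknown methods are treated conservatively as non-idempotent.

```python
# (safe, idempotent) per method, following the table above.
METHOD_PROPERTIES = {
    "GET": (True, True),
    "HEAD": (True, True),
    "OPTIONS": (True, True),
    "PUT": (False, True),
    "DELETE": (False, True),
    "POST": (False, False),
    "PATCH": (False, False),
}


def can_auto_retry(method: str) -> bool:
    """Automatic retry is only reasonable for idempotent methods."""
    _safe, idempotent = METHOD_PROPERTIES.get(method.upper(), (False, False))
    return idempotent


for m in ("GET", "PUT", "POST"):
    print(m, can_auto_retry(m))
```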

:::danger Never Use GET for State-Changing Operations

Proxies, CDNs, and browsers cache GET requests. A browser may serve a cached response for a GET. A CDN may serve the same cached GET to a million users. A monitoring tool may probe your GET endpoint every 30 seconds. If your GET route deletes data, creates records, or triggers payments, any of these will cause production incidents. Use POST, PUT, PATCH, or DELETE for operations with side effects - no exceptions.

:::

:::note Idempotency in Practice

PUT /users/123 with {"name": "Alice"} is idempotent - sending it ten times leaves the user named Alice. POST /users with {"name": "Alice"} is not - sending it ten times creates ten Alice records. This is why payment APIs use idempotency keys on POST requests: the key makes the server treat repeated POSTs as idempotent on the application layer.

:::
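The idempotency-key idea can be simulated without any network. In this sketch `FakePaymentServer` is an in-memory stand-in invented for illustration; a real API would receive the key in a request header (the name varies by provider) and store results durably.

```python
import uuid


class FakePaymentServer:
    """In-memory stand-in for a server that honours idempotency keys."""

    def __init__(self):
        self.charges = []
        self._seen = {}  # idempotency key -> stored result

    def post_charge(self, idempotency_key: str, amount: int) -> dict:
        if idempotency_key in self._seen:
            # Replay: return the stored result instead of charging again.
            return self._seen[idempotency_key]
        charge = {"id": len(self.charges) + 1, "amount": amount}
        self.charges.append(charge)
        self._seen[idempotency_key] = charge
        return charge


server = FakePaymentServer()
key = str(uuid.uuid4())                 # generated once per logical operation
first = server.post_charge(key, 500)
retry = server.post_charge(key, 500)    # e.g. the client retried after a timeout
print(first == retry, len(server.charges))
```

The client's only job is to reuse the same key for every retry of the same logical operation and to generate a fresh key for a new one.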

Part 3 - Status Code Families

1xx - Informational

Rarely seen in Python client code. 100 Continue tells the client it may send a large body after confirming the server will accept it.

2xx - Success

Code	Name	When to use
`200 OK`	Success	GET, PUT, PATCH responses with a body
`201 Created`	Created	POST response after creating a resource; include `Location` header
`204 No Content`	No content	DELETE, or PUT/PATCH with no response body

3xx - Redirection

Code	Name	Behaviour
`301 Moved Permanently`	Permanent redirect	Browser/client updates bookmarks; method may change to GET
`302 Found`	Temporary redirect	Method may change to GET on redirect (legacy)
`307 Temporary Redirect`	Temporary, method preserved	Client must repeat with same method (POST stays POST)
`308 Permanent Redirect`	Permanent, method preserved	Same as 307 but permanent

requests follows 301/302 redirects automatically and changes POST to GET - which is the standard browser behaviour but wrong for API clients that POST to an endpoint that has moved. Set allow_redirects=False and handle redirects explicitly for POST/PUT.
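The method-rewriting rules can be written down as a small pure function. This sketch is modelled on the browser-compatible behaviour described above, with 303 See Other added; `method_after_redirect` is a hypothetical helper, so treat it as an illustration, not as the library's code.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a browser-like client uses after following a redirect."""
    method = method.upper()
    if status == 303 and method != "HEAD":
        return "GET"      # See Other: always fetch the new location with GET
    if status == 302 and method != "HEAD":
        return "GET"      # legacy behaviour: 302 is treated like 303
    if status == 301 and method == "POST":
        return "GET"      # legacy behaviour: POST becomes GET on 301
    return method         # 307/308 preserve the method (and the body)


for status in (301, 302, 307, 308):
    print(status, method_after_redirect(status, "POST"))
```

If an API you call may relocate POST endpoints, checking for 307/308 versus 301/302 in the response tells you whether the original request would have been replayed intact.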

4xx - Client Error

The client sent a request the server could not process. Retrying without changing the request is pointless.

Code	Name	When
`400 Bad Request`	Malformed request	Syntax error, missing field
`401 Unauthorized`	Not authenticated	No credentials or invalid credentials (misleading name)
`403 Forbidden`	Not authorized	Authenticated but lacks permission
`404 Not Found`	Resource not found	Resource does not exist at this URL
`409 Conflict`	Conflict	Duplicate resource, version conflict
`422 Unprocessable Entity`	Validation failed	Syntactically valid but semantically invalid (FastAPI uses this)
`429 Too Many Requests`	Rate limited	Check `Retry-After` header before retrying

:::warning 401 vs 403: The Naming Confusion

401 Unauthorized should have been called "Unauthenticated" - it means the request lacks valid credentials. 403 Forbidden should have been called "Unauthorized" - it means the request has credentials but lacks permission. The naming is historical and cannot be changed. In your API, return 401 when credentials are missing or invalid, and 403 when the authenticated identity lacks permission.

:::

5xx - Server Error

The server failed to process a valid request. Clients should retry with backoff; requests may succeed after the server recovers.

Code	Name	When
`500 Internal Server Error`	Unhandled exception	Bug in server code
`502 Bad Gateway`	Gateway received invalid response	Upstream service returned garbage
`503 Service Unavailable`	Server temporarily down	Overloaded or in maintenance
`504 Gateway Timeout`	Gateway timed out waiting	Upstream service too slow
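Putting the 4xx and 5xx guidance together, a client needs one decision: retry or not, and after how long. The sketch below is one possible policy, not a standard one; `RETRYABLE_STATUSES`, `retry_delay`, and the 1-second default are assumptions chosen for illustration. Retry-After may be either a number of seconds or an HTTP date, so both forms are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def retry_delay(status: int, retry_after: Optional[str] = None,
                default: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if retrying is pointless."""
    if status not in RETRYABLE_STATUSES:
        return None
    if not retry_after:
        return default
    if retry_after.strip().isdigit():          # delta-seconds form, e.g. "60"
        return float(retry_after)
    try:                                        # HTTP-date form
        when = parsedate_to_datetime(retry_after)
    except (TypeError, ValueError):
        return default                          # unparseable header: fall back
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


print(retry_delay(429, "60"))
print(retry_delay(404))
```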

Part 4 - Critical Headers

Request Headers

import requests

response = requests.get(
    "https://api.example.com/users/123",
    headers={
        # Tell the server what format you want back
        "Accept": "application/json",

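        # Tell the server you accept compressed responses. requests sets this
        # header on its own; only advertise "br" if a Brotli decoder is installed,
        # otherwise the body arrives compressed and is not decoded for you.
        "Accept-Encoding": "gzip, deflate, br",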

        # Bearer token authentication
        "Authorization": "Bearer eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ1c2VyXzEyMyJ9...",

        # Distributed tracing - attach to every request in a chain
        "X-Request-ID": "f47ac10b-58cc-4372-a567-0e02b2c3d479",

        # Conditional GET - only return body if resource changed since ETag
        "If-None-Match": '"33a64df551425fcc55e4d42a148795d9f25f89d"',
    }
)

Response Headers

Header	Meaning
`Content-Type: application/json; charset=utf-8`	Body is JSON, UTF-8 encoded
`Content-Length: 1024`	Body is exactly 1024 bytes
`Location: /users/456`	Used with 201 Created or 3xx redirects
`ETag: "abc123"`	Opaque identifier for the current version of the resource
`Last-Modified: Wed, 05 Mar 2026 10:00:00 GMT`	When the resource was last changed
`Cache-Control: max-age=3600, private`	Cache this response for 1 hour, only in private (browser) caches
`Cache-Control: no-store`	Never cache this response
`Retry-After: 60`	Wait 60 seconds before retrying (used with 429 and 503)
`X-Request-ID: f47ac10...`	Echo back the request ID for distributed tracing
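Cache-Control is a comma-separated list of directives, some with a value and some without. A minimal parser, assuming no quoted values that themselves contain commas, might look like this (`parse_cache_control` is a hypothetical helper):

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: value-or-True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (private, no-store, ...) are flags.
        directives[name.lower()] = arg.strip('"') if sep else True
    return directives


print(parse_cache_control("max-age=3600, private"))
print(parse_cache_control("no-store"))
```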

Authorization Patterns

import base64
import requests

# Placeholder values so the snippet is self-contained
url = "https://api.example.com/users/123"
access_token = "eyJhbGciOiJSUzI1NiJ9..."
key = "your-api-key-here"

# Basic Auth - base64-encode "username:password"
credentials = base64.b64encode(b"alice:secret_password").decode()
headers = {"Authorization": f"Basic {credentials}"}
# OR use requests' built-in:
response = requests.get(url, auth=("alice", "secret_password"), timeout=10)

# Bearer Token - JWT or opaque token
headers = {"Authorization": f"Bearer {access_token}"}

# API Key - varies by service; often in a custom header
headers = {"X-API-Key": key}
# OR in query string (less secure - ends up in logs):
response = requests.get(url, params={"api_key": key}, timeout=10)
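It is worth seeing how little protection Basic Auth provides on its own: base64 is an encoding, not encryption, so anyone who can read the header can recover the password. The short round trip below demonstrates this, which is why Basic Auth is only acceptable over TLS.

```python
import base64

# What the client sends
header_value = "Basic " + base64.b64encode(b"alice:secret_password").decode("ascii")
print(header_value)

# What anyone who sees the header can recover
scheme, _, encoded = header_value.partition(" ")
username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
print(scheme, username, password)
```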

ETag and Conditional Requests

ETags enable efficient cache validation - the client only downloads the body when it has changed:

import requests

session = requests.Session()

# First request - server returns ETag
response = session.get("https://api.example.com/users/123")
etag = response.headers.get("ETag")  # e.g. '"33a64df551..."'
data = response.json()               # parse and cache locally

# Later - conditional GET: only download if resource changed
response = session.get(
    "https://api.example.com/users/123",
    headers={"If-None-Match": etag}
)

if response.status_code == 304:      # Not Modified
    # Use cached data - body is empty, ETag still valid
    pass
elif response.status_code == 200:
    # Resource changed - parse fresh data and update local cache
    data = response.json()
    etag = response.headers.get("ETag")
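The server side of this exchange can be simulated in a few lines. This is a sketch under simplifying assumptions: `make_etag` and `conditional_get` are hypothetical helpers, the ETag is derived from a hash of the body, and weak validators, ETag lists, and `*` are ignored.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body (quotes are part of the value)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, body) the way a validating server might."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""     # unchanged: no body is sent
    return 200, {"ETag": etag}, body


resource = b'{"id": 123, "name": "Alice"}'
status, headers, body = conditional_get(resource)
status2, _, body2 = conditional_get(resource, headers["ETag"])
status3, _, _ = conditional_get(b'{"id": 123, "name": "Bob"}', headers["ETag"])
print(status, status2, status3)
```

The second call carries the ETag from the first and gets an empty 304; once the resource changes, the same ETag no longer matches and the full body is returned again.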

Part 5 - HTTP Request/Response Lifecycle

For a cold HTTPS request to a new host, there are three round trips before the first byte of HTTP:

DNS resolution (0–300ms depending on TTL and resolver)
TCP handshake (one RTT - typically 20–200ms)
TLS 1.3 handshake (one RTT for a full handshake; a resumed session can send data with 0-RTT, while the older TLS 1.2 needs two RTTs)

Then the HTTP request and response (one RTT minimum for small responses). This is why connection reuse (keep-alive) and connection pooling matter so much - they eliminate round trips 2 and 3 for subsequent requests to the same host.
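A back-of-the-envelope model makes the cost of a cold connection easy to compare with a warm one. The function below is a rough sketch: `first_byte_latency_ms` is a hypothetical helper, the 80ms RTT and 20ms DNS figures are assumed example values, and server processing time is ignored.

```python
def first_byte_latency_ms(rtt_ms: float, dns_ms: float = 0.0, cold: bool = True) -> float:
    """Rough time to the first response byte, ignoring server processing time."""
    # Cold: TCP handshake (1 RTT) + full TLS 1.3 handshake (1 RTT).
    handshake_rtts = 2 if cold else 0
    dns = dns_ms if cold else 0.0
    # The HTTP request/response itself always costs one more RTT.
    return dns + (handshake_rtts + 1) * rtt_ms


cold = first_byte_latency_ms(rtt_ms=80, dns_ms=20, cold=True)
warm = first_byte_latency_ms(rtt_ms=80, cold=False)
print(cold, warm)
```

Under these assumptions the cold request takes more than three times as long as the warm one, and all of the difference is setup work that a reused connection skips.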

Part 6 - Connection Management and the `requests` Library

Why `requests.get()` in a Loop Is Wrong

import requests

users = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# WRONG - creates a new TCP + TLS connection for each request
# 10 requests = 10 DNS lookups + 10 TCP handshakes + 10 TLS handshakes
for user_id in users:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    process(response.json())

import requests

# RIGHT - Session reuses the connection pool
session = requests.Session()

# Session persists: cookies, auth headers, base headers, connection pool
session.headers.update({
    "Authorization": "Bearer your-token",
    "Accept": "application/json",
})

for user_id in users:
    response = session.get(f"https://api.example.com/users/{user_id}")
    process(response.json())

:::warning requests.get() Creates a New Connection Each Time

Every call to requests.get(), requests.post(), etc. at the module level creates a new Session internally and tears it down after the call. For a single request, this is fine. For multiple requests to the same host - even two - use a Session explicitly. The connection pool reuses the TCP and TLS layer, so every request after the first skips both handshakes and pays only for the HTTP exchange itself, saving roughly two round trips (about 160ms at an 80ms RTT) per request.

:::

Timeouts: Always Set Them Explicitly

import requests

# WRONG - no timeout; this can block forever
response = requests.get("https://slow-api.example.com/data")

# RIGHT - connect timeout and read timeout separately
response = requests.get(
    "https://api.example.com/data",
    timeout=(3.05, 30)
    # (connect_timeout, read_timeout)
    # connect: how long to wait to establish the connection
    # read: how long to wait between bytes once connected
)

# OR a single value - applies to both connect and read
response = requests.get("https://api.example.com/data", timeout=10)

:::tip Always Set Explicit Timeouts

A requests call without a timeout will block indefinitely if the server stops responding mid-transfer. In production, this means a single slow upstream API can exhaust your thread pool, taking down the entire service. Set timeout=(connect_seconds, read_seconds) on every request. The connect timeout should be short (1–5 seconds); the read timeout depends on the expected response size and server processing time.

:::
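The read timeout is ultimately a socket timeout, and it can be observed without any external service. The sketch below uses only the standard library: a local listener accepts the TCP connection but never replies, standing in for a stalled upstream. The port is chosen by the OS, and `fetch_with_timeout` is a hypothetical helper.

```python
import socket

# A local listener that completes the TCP handshake but never sends a response.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
host, port = listener.getsockname()


def fetch_with_timeout(timeout_s: float) -> str:
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        try:
            sock.recv(4096)          # blocks until data arrives or the timeout fires
            return "got data"
        except socket.timeout:       # alias of TimeoutError on Python 3.10+
            return "timed out"


result = fetch_with_timeout(0.2)
listener.close()
print(result)
```

Without the `timeout` argument, the `recv` call would block for as long as the listener stays open, which is the failure mode the tip above describes.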

Retry Logic with `HTTPAdapter`

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retry() -> requests.Session:
    """
    Production-grade requests Session with:
    - 3 retries on transient failures
    - Exponential backoff between retries (the delay doubles each time)
    - Retry only on 5xx errors and connection failures
    - NOT retrying on 4xx (client errors are not transient)
    """
    session = requests.Session()

    retry_strategy = Retry(
        total=3,
        backoff_factor=0.5,        # roughly 0s, 1s, 2s in current urllib3: the first retry is immediate, then the delay doubles
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["GET", "HEAD", "OPTIONS"],  # safe + idempotent only
        raise_on_status=False,     # don't raise; let the caller check status
    )

    adapter = HTTPAdapter(
        max_retries=retry_strategy,
        pool_connections=10,       # number of connection pools (one per host)
        pool_maxsize=20,           # max connections per pool
    )

    session.mount("https://", adapter)
    session.mount("http://", adapter)

    return session


session = create_session_with_retry()
response = session.get("https://api.example.com/data", timeout=(3, 30))
response.raise_for_status()  # raises HTTPError for 4xx/5xx after retries exhausted
data = response.json()
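To see what an exponential backoff schedule looks like in isolation, here is a generic sketch. It is not urllib3's exact formula: `backoff_delays` is a hypothetical helper, and the optional "full jitter" variant (a random delay between zero and the computed value) is a common addition that spreads out retries from many clients.

```python
import random


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0,
                   jitter: bool = False, seed=None) -> list:
    """Delays before each retry: base * 2**n, capped, optionally with full jitter."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            # Full jitter: pick anywhere between 0 and the computed delay.
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


print(backoff_delays(4))
print(backoff_delays(4, jitter=True, seed=1))
```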

:::danger Never Disable SSL Verification in Production

requests.get(url, verify=False) disables certificate validation entirely. It makes the SSL error go away, but it does not fix the underlying problem - it opens the connection to man-in-the-middle attacks. Any intermediate network node (corporate proxy, cloud load balancer, ISP) can intercept and read the traffic. Fix SSL errors properly: update the CA bundle (pip install --upgrade certifi), specify the correct CA file (verify="/path/to/ca-bundle.crt"), or debug the certificate chain. Never ship verify=False in production code.

:::

Request Hooks

import requests
import time
import logging

logger = logging.getLogger(__name__)

def log_response_time(response, *args, **kwargs):
    """Hook called after every response. Logs timing for observability."""
    elapsed = response.elapsed.total_seconds()
    logger.info(
        "HTTP %s %s → %d in %.3fs",
        response.request.method,
        response.request.url,
        response.status_code,
        elapsed,
    )
    if elapsed > 1.0:
        logger.warning("Slow response: %.3fs for %s", elapsed, response.request.url)

session = requests.Session()
session.hooks["response"].append(log_response_time)

# Every response through this session logs timing automatically
response = session.get("https://api.example.com/users/123", timeout=10)

Part 7 - `httpx`: The Modern Alternative

httpx is a near drop-in replacement for requests that adds async support and HTTP/2:

import httpx

# Synchronous - identical API to requests
with httpx.Client(timeout=10) as client:
    response = client.get("https://api.example.com/users/123")
    data = response.json()

# Async - same client, async methods
import asyncio

async def fetch_users(user_ids: list[int]) -> list[dict]:
    """Fetch multiple users concurrently - impossible with synchronous requests."""
    async with httpx.AsyncClient(timeout=10) as client:
        tasks = [
            client.get(f"https://api.example.com/users/{uid}")
            for uid in user_ids
        ]
        responses = await asyncio.gather(*tasks)
    return [r.json() for r in responses]

# HTTP/2 - multiplexes multiple requests over one connection
# (needs the optional extra: pip install "httpx[http2]")
async def fetch_with_http2():
    async with httpx.AsyncClient(http2=True, timeout=10) as client:
        response = await client.get("https://api.example.com/users/123")
        return response.json()

asyncio.run(fetch_with_http2())
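Launching every request at once with `asyncio.gather` can overwhelm an upstream API when the list is long. A common way to cap concurrency is an `asyncio.Semaphore`. In this sketch `fake_fetch` is a stand-in that sleeps instead of calling `client.get(...)`, and the limit of 3 is an arbitrary example value.

```python
import asyncio


async def fake_fetch(uid: int) -> dict:
    """Stand-in for an HTTP call; sleeps instead of doing network I/O."""
    await asyncio.sleep(0.01)
    return {"id": uid}


async def fetch_bounded(user_ids, limit: int = 3):
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(uid):
        nonlocal in_flight, peak
        async with semaphore:          # at most `limit` workers run past this line
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_fetch(uid)
            finally:
                in_flight -= 1

    # gather returns results in the same order as the inputs.
    results = await asyncio.gather(*(worker(uid) for uid in user_ids))
    return results, peak


results, peak = asyncio.run(fetch_bounded(range(10), limit=3))
print(len(results), peak)
```

With a real `httpx.AsyncClient`, the body of `worker` would await `client.get(...)` in place of `fake_fetch`.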

Feature	`requests`	`httpx`
Synchronous	Yes	Yes
Async	No	Yes
HTTP/2	No	Yes (with `httpx[http2]`)
API familiarity	Reference	Very similar to requests
Streaming	Yes	Yes
Test client	No (use `responses`)	Yes (built-in `httpx.MockTransport`)

Part 8 - HTTP/2 and HTTP/3

HTTP/2: Multiplexing and Header Compression

HTTP/1.1 has two performance problems:

Head-of-line blocking: requests on a single connection are serialized. If response #1 is large, response #2 waits.
Header repetition: every request sends the full header set (often 500–1000 bytes of repeated Cookie, User-Agent, Authorization).

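HTTP/2 solves both:

Multiplexing splits every request and response into frames tagged with a stream ID and interleaves those frames on a single TCP connection. A large response #1 no longer holds up response #2, because both are in flight at the same time.

HPACK header compression maintains a shared table of previously sent headers. After the first request sends Authorization: Bearer eyJ..., subsequent requests can send a short index reference (often a single byte) to that header instead of the full value. For APIs with large auth tokens, this can reduce per-request header overhead by 60–90%.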

Server push (rarely used in APIs): the server can send responses the client has not asked for yet. Used in web browsers for CSS/JS assets but almost never in API design.

HTTP/3 and QUIC

HTTP/3 replaces TCP with QUIC (UDP-based, with reliability built in):

Eliminates TCP head-of-line blocking at the transport layer (HTTP/2 still suffers this at TCP level)
Faster connection establishment (0-RTT for resumed sessions)
Better performance on lossy mobile networks (packet loss only blocks one stream, not all)

Python support for HTTP/3 is early-stage as of 2026. Most production Python APIs are HTTP/1.1 or HTTP/2. QUIC matters more for CDN edge nodes and mobile clients than for server-to-server API calls.

Part 9 - A Production HTTP Client Pattern

import uuid
import logging
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from typing import Any

logger = logging.getLogger(__name__)


class APIClient:
    """
    Production HTTP client for a single upstream API.

    Features:
    - Connection pooling via Session
    - Retry with exponential backoff on 5xx
    - Explicit timeouts on every request
    - Request ID injection for distributed tracing
    - Response time logging
    - Consistent error handling
    """

    def __init__(self, base_url: str, api_key: str, timeout: tuple = (3, 30)):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout

        self._session = requests.Session()
        self._session.headers.update({
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        })

        retry = Retry(
            total=3,
            backoff_factor=0.5,
            status_forcelist=[500, 502, 503, 504],
            allowed_methods=["GET", "HEAD", "DELETE"],
        )
        adapter = HTTPAdapter(max_retries=retry, pool_maxsize=20)
        self._session.mount("https://", adapter)
        self._session.mount("http://", adapter)

        self._session.hooks["response"].append(self._log_response)

    def _log_response(self, response, *args, **kwargs):
        elapsed = response.elapsed.total_seconds()
        logger.info(
            "API %s %s → %d (%.3fs)",
            response.request.method,
            response.request.url,
            response.status_code,
            elapsed,
        )

    def _request(self, method: str, path: str, **kwargs) -> requests.Response:
        url = f"{self.base_url}/{path.lstrip('/')}"
        request_id = str(uuid.uuid4())

        # Copy the caller's headers so their dict is not mutated
        headers = dict(kwargs.pop("headers", None) or {})
        headers["X-Request-ID"] = request_id

        response = self._session.request(
            method,
            url,
            headers=headers,
            timeout=self.timeout,
            **kwargs,
        )
        response.raise_for_status()
        return response

    def get(self, path: str, **kwargs) -> Any:
        return self._request("GET", path, **kwargs).json()

    def post(self, path: str, body: dict, **kwargs) -> Any:
        return self._request("POST", path, json=body, **kwargs).json()

    def patch(self, path: str, body: dict, **kwargs) -> Any:
        return self._request("PATCH", path, json=body, **kwargs).json()

    def delete(self, path: str, **kwargs) -> None:
        self._request("DELETE", path, **kwargs)

    def close(self):
        self._session.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


# Usage
with APIClient("https://api.example.com", api_key="sk-...") as client:
    user = client.get("/users/123")
    client.patch("/users/123", body={"email": "[email protected]"})
    client.delete("/users/456")

:::note HTTP Is Stateless

Every HTTP request is independent. The server has no memory of previous requests from the same client. Cookies, sessions, JWTs, and API keys are all application-layer illusions built on top of a fundamentally stateless protocol. Each request must carry all the context the server needs to process it. This is a feature - it makes HTTP services horizontally scalable, because any server instance can handle any request.

:::

Graded Practice Challenges

Level 1 - Predict and Identify

Question 1: What is the minimum number of network round trips needed before an HTTP response body can start arriving on a cold HTTPS connection?

Show Answer

Three round trips before HTTP:

DNS resolution (not strictly a round trip in the HTTP sense, but a network call)
TCP handshake (SYN / SYN-ACK / ACK = one RTT)
TLS 1.3 handshake (one RTT for the key exchange)

Then one more RTT for the HTTP request and the first response bytes to arrive.

Total: 3–4 RTTs on a cold connection. On a warm connection (Session with keep-alive), steps 2 and 3 are skipped - the HTTP request goes out immediately. This is why Session reuse matters.

Question 2: Which HTTP methods are both safe AND idempotent? Which are idempotent but not safe? Which are neither?

Show Answer

Safe AND idempotent (read-only, repeatable): GET, HEAD, OPTIONS

Idempotent but NOT safe (write operations with stable end state): PUT, DELETE

Neither safe nor idempotent: POST, PATCH

The practical consequence: GET, HEAD, and OPTIONS can be retried automatically by any infrastructure (proxy, load balancer, retry adapter). PUT and DELETE can be retried by the application if a network error occurs mid-flight. POST and PATCH must not be automatically retried - a duplicate POST creates two records; a duplicate PATCH may apply the change twice (unless the server implements idempotency keys).

Question 3: What does this code actually do differently from a plain requests.get() call?

import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
session.mount("https://", HTTPAdapter(pool_maxsize=20))
response = session.get("https://api.example.com/users/1", timeout=(3, 30))

Show Answer

Three differences from requests.get():

Connection pooling: the Session holds up to 20 open HTTPS connections to api.example.com. Subsequent requests to the same host reuse existing connections, skipping the TCP + TLS handshake (saving 50–400ms per request depending on RTT).
Timeout: (3, 30) sets a 3-second connect timeout and a 30-second read timeout. Plain requests.get() without timeout blocks indefinitely. This is the most important difference for production safety.
Reusable configuration: Session persists headers, auth, and cookies across requests. Any subsequent call to session.get() or session.post() reuses the same pool and configuration without re-initialization overhead.

Question 4: A service makes 1000 API calls per hour to the same external host. Currently it uses requests.get() at the module level. Estimate how much time is wasted on connection overhead compared to using a Session.

Show Answer

Assumptions: typical RTT to a public HTTPS endpoint is 80ms. Each cold connection requires:

TCP handshake: ~80ms (1 RTT)
TLS 1.3 handshake: ~80ms (1 RTT)
Total overhead per request: ~160ms

With requests.get() (new connection each time):

1000 requests × 160ms = 160 seconds of wasted connection overhead per hour

With Session (connection pool reused):

First request: 160ms overhead
Subsequent 999 requests: ~0ms connection overhead
Total overhead: ~160ms

Net savings: ~160 seconds per hour that would otherwise be spent on redundant TCP + TLS handshakes, assuming the server keeps idle connections open between calls. At higher request rates this becomes a significant latency and resource cost.

Level 2 - Debug and Fix

Find and fix all problems in this HTTP client code:

import requests

def fetch_all_users(user_ids):
    results = []
    for uid in user_ids:
        resp = requests.get(f"https://api.example.com/users/{uid}")
        if resp.status_code == 200:
            results.append(resp.json())
        elif resp.status_code == 500:
            results.append(fetch_all_users([uid]))  # retry
    return results

def create_user(name, email):
    resp = requests.get(
        "https://api.example.com/users",
        params={"name": name, "email": email}
    )
    return resp.json()

def delete_user(uid):
    resp = requests.post(
        "https://api.example.com/users",
        json={"user_id": uid, "action": "delete"},
        verify=False
    )
    return resp.status_code == 200

Show Solution

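Bug 1 - No Session, no timeout: requests.get() at module level creates a new connection for every call, which is expensive for a list of user IDs. With no timeout, a single stalled response blocks the whole loop indefinitely.

Bug 2 - Naive retry recurses without bound: fetch_all_users([uid]) recurses on every 500 response, so a server that stays down ends in a RecursionError. The retry also appends a list, not a user dict, so results ends up with nested lists.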

Bug 3 - Wrong method for create: requests.get() for creating a user is wrong semantically and dangerous - GET is cacheable and safe. Use POST.

Bug 4 - Wrong method for delete: requests.post() for a delete operation violates REST conventions and is not idempotent as expressed.

Bug 5 - verify=False: SSL verification disabled - man-in-the-middle vulnerability in production.

Fixed version:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session() -> requests.Session:
    session = requests.Session()
    retry = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["GET", "DELETE"],
    )
    session.mount("https://", HTTPAdapter(max_retries=retry, pool_maxsize=10))
    return session

_SESSION = make_session()

def fetch_all_users(user_ids: list[int]) -> list[dict]:
    results = []
    for uid in user_ids:
        response = _SESSION.get(
            f"https://api.example.com/users/{uid}",
            timeout=(3, 30),
        )
        response.raise_for_status()  # raises on 4xx/5xx; retry handles 5xx
        results.append(response.json())
    return results

def create_user(name: str, email: str) -> dict:
    response = _SESSION.post(
        "https://api.example.com/users",
        json={"name": name, "email": email},
        timeout=(3, 30),
    )
    response.raise_for_status()
    return response.json()

def delete_user(uid: int) -> None:
    response = _SESSION.delete(
        f"https://api.example.com/users/{uid}",
        timeout=(3, 30),
    )
    response.raise_for_status()

Level 3 - Design Challenge

Design a RateLimitedHTTPClient class that:

Wraps requests.Session
Respects Retry-After response headers on 429 responses
Maintains a configurable per-host request rate (e.g. max 10 requests/second)
Logs all rate limiting events with host, wait duration, and request URL
Raises RateLimitExceeded after a configurable maximum wait time

Show Reference Solution

import time
import logging
import threading
import requests
from collections import defaultdict, deque

logger = logging.getLogger(__name__)


class RateLimitExceeded(Exception):
    pass


class RateLimitedHTTPClient:
    """
    HTTP client that enforces per-host rate limits and respects Retry-After.

    Args:
        requests_per_second: Maximum requests per second per host (sliding window)
        max_wait_seconds: Maximum time to wait on a 429 before raising
        session: Optional pre-configured requests.Session
    """

    def __init__(
        self,
        requests_per_second: float = 10.0,
        max_wait_seconds: float = 60.0,
        session: requests.Session = None,
    ):
        self.requests_per_second = requests_per_second
        self.max_wait_seconds = max_wait_seconds
        self._session = session or requests.Session()

        # Per-host sliding window of request timestamps
        self._lock = threading.Lock()
        self._request_times: dict[str, deque] = defaultdict(
            lambda: deque(maxlen=int(requests_per_second * 2))
        )

    def _host(self, url: str) -> str:
        from urllib.parse import urlparse
        return urlparse(url).netloc

    def _throttle(self, url: str) -> None:
        """Block until the per-host rate limit allows the next request."""
        host = self._host(url)
        window = 1.0 / self.requests_per_second  # seconds between requests

        with self._lock:
            now = time.monotonic()
            times = self._request_times[host]

            if len(times) >= self.requests_per_second:
                oldest = times[0]
                wait = oldest + 1.0 - now
                if wait > 0:
                    logger.debug("Rate throttle: %.3fs for %s", wait, host)
                    time.sleep(wait)

            self._request_times[host].append(time.monotonic())

    def _handle_429(self, response: requests.Response, url: str) -> None:
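        retry_after = response.headers.get("Retry-After", "").strip()
        try:
            # Only the delta-seconds form is parsed; an HTTP-date value falls
            # through to the default. Clamp at 0 so sleep() never gets a negative.
            wait = max(0.0, float(retry_after))
        except (ValueError, TypeError):
            wait = 5.0  # default backoff if Retry-After is missing or not a number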

        logger.warning(
            "Rate limited (429): %s - waiting %.1fs (Retry-After: %r)",
            url, wait, retry_after
        )

        if wait > self.max_wait_seconds:
            raise RateLimitExceeded(
                f"Server requested {wait}s wait on {url}; "
                f"max_wait_seconds={self.max_wait_seconds}"
            )

        time.sleep(wait)

    def request(self, method: str, url: str, **kwargs) -> requests.Response:
        kwargs.setdefault("timeout", (3, 30))
        attempts = 0
        max_attempts = 5

        while attempts < max_attempts:
            self._throttle(url)
            response = self._session.request(method, url, **kwargs)

            if response.status_code == 429:
                attempts += 1
                if attempts >= max_attempts:
                    raise RateLimitExceeded(
                        f"Still rate limited after {max_attempts} attempts: {url}"
                    )
                self._handle_429(response, url)
                continue

            return response

        raise RateLimitExceeded(f"Max retry attempts exceeded for {url}")

    def get(self, url: str, **kwargs) -> requests.Response:
        return self.request("GET", url, **kwargs)

    def post(self, url: str, **kwargs) -> requests.Response:
        return self.request("POST", url, **kwargs)

    def close(self):
        self._session.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


# Usage
with RateLimitedHTTPClient(requests_per_second=5, max_wait_seconds=30) as client:
    for user_id in range(100):
        response = client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        process(response.json())  # process() stands in for your own handler

Design decisions:

Per-host sliding window prevents one busy host from affecting requests to other hosts
_handle_429 separates server-requested rate limiting (via Retry-After) from client-side throttling - they have different semantics and different wait durations
max_wait_seconds provides a circuit breaker for server-requested waits that would block the caller for unacceptable durations
Thread lock on the request time window makes the client safe for multi-threaded use
The request() method is the single dispatch point - all HTTP methods go through it, making rate limiting and retry logic uniform
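Retry-After can carry either a number of seconds or an HTTP-date. The following is a minimal sketch of a parser that accepts both forms, using only the standard library. The function name parse_retry_after, the injectable now argument, and the 5-second default are assumptions made for illustration; they are not part of requests.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(
    value: str,
    now: Optional[datetime] = None,
    default: float = 5.0,  # hypothetical fallback, mirrors the solution above
) -> float:
    """Return how many seconds to wait for a Retry-After header value."""
    value = (value or "").strip()
    if not value:
        return default

    # Form 1: delta-seconds, e.g. "120"
    if value.isascii() and value.isdigit():
        return float(value)

    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # neither form matched
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait
    return max(0.0, (when - now).total_seconds())
```

Passing now explicitly keeps the function deterministic in tests; in production code it can be omitted so the current UTC time is used.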

Key Takeaways

HTTP is plain text over a TCP socket: request line + headers + blank line + optional body. Every HTTP library, framework, and debugging tool operates on this format
GET, HEAD, OPTIONS are safe and idempotent - infrastructure can retry them freely. PUT, DELETE are idempotent but not safe. POST and PATCH are neither - never auto-retry them
Status code families: 2xx (success), 3xx (redirect), 4xx (client error - fix the request), 5xx (server error - retry with backoff)
401 means unauthenticated (missing or invalid credentials), despite its "Unauthorized" reason phrase; 403 means forbidden (authenticated but lacking permission). 429 means rate limited - check Retry-After before retrying
ETags enable conditional GET: send If-None-Match with the ETag and get 304 Not Modified (with no body) when the resource has not changed
A cold HTTPS connection requires DNS + TCP handshake + TLS handshake before the first HTTP byte - 3+ round trips. Session reuse eliminates handshake overhead for subsequent requests
Always set explicit timeouts: timeout=(connect_seconds, read_seconds). A hanging request without a timeout blocks a thread or coroutine indefinitely
Use Session for multiple requests to the same host. The module-level requests.get() creates and discards a Session on every call, so each call opens a new connection
Never use verify=False in production. Fix SSL errors properly with correct CA bundles
HTTPAdapter with Retry handles transient 5xx failures with exponential backoff - only safe for idempotent methods
httpx provides a near-identical API to requests, with added async support and opt-in HTTP/2 multiplexing (http2=True, which requires the httpx[http2] extra)
HTTP/2 multiplexes multiple streams over one connection and compresses repeated headers - significant throughput improvement for high-request-rate API clients
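The method and status-code takeaways above can be combined into a small retry policy. This is a sketch under stated assumptions: the helper names, the set of retryable statuses, and the backoff base and cap are illustrative choices, not values taken from requests or urllib3.

```python
# Methods that are idempotent per the takeaways above (safe ones included)
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})

# Assumed set of transient statuses worth retrying (illustrative choice)
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})

_FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_family(status: int) -> str:
    """Map a status code to its family using the first digit."""
    return _FAMILIES.get(status // 100, "unknown")


def should_retry(method: str, status: int) -> bool:
    """Retry only idempotent methods, and only on transient statuses."""
    return method.upper() in IDEMPOTENT_METHODS and status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    for method, status in [("GET", 503), ("POST", 503), ("GET", 404)]:
        print(method, status, status_family(status), should_retry(method, status))
```

POST and PATCH are excluded entirely here, which follows the takeaway literally; a client that attaches idempotency keys to its requests could reasonably relax that rule.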

What's Next

Lesson 02 covers REST principles - the architectural style that gives HTTP methods their meaning in API design. You will see two designs for the same API, understand why one is "RESTful" and the other is not, and learn the specific rules (URL design, method semantics, status codes, error formats, pagination, versioning) that make REST APIs predictable, cacheable, and maintainable at scale.
