
MCP Architecture - Client-Server

The IDE plugin had been working perfectly in local testing. The engineer connected Claude to a filesystem MCP server, asked it to read some files, ran some queries - smooth. Then the team deployed the same server in a shared cloud environment so all developers could use it, changed the transport from stdio to HTTP, and everything broke. The server would receive requests, but responses were arriving out of order. Some were never arriving at all. The issue took two days to debug: the engineer had assumed the HTTP+SSE transport was a drop-in replacement for stdio, that you could swap transports and the behavior would be identical. It is not. The two transports have fundamentally different connection lifecycles, different session models, and different behavior when connections drop. Understanding the architecture at this level would have saved two days.

MCP's architecture appears simple from the outside: a client sends requests, a server responds. But the internal structure - the three-role model, the two transport options, the initialization handshake that establishes capabilities, the session lifecycle that governs what happens from connection to disconnection - has specific behaviors and failure modes that matter in production. This lesson walks through every layer of the architecture with enough depth that you understand not just how it works but why it was designed the way it was.

By the end of this lesson you will understand the Host/Client/Server triad, be able to implement both stdio and HTTP+SSE transports, understand the JSON-RPC 2.0 message format and how MCP uses it, trace the full session lifecycle from initialization to termination, and recognize the failure modes that affect each transport in production environments.


Why This Architecture

The Problem With Direct Model-to-Tool Communication

The naive approach to AI tool integration is direct: the model produces a function call, the application executes it, the result goes back to the model. This works for simple cases but breaks down in two ways.

First, it couples tool implementations to AI application code. If five AI applications need access to the same GitHub API, all five need to include GitHub integration code. When GitHub changes their API, all five need to be updated simultaneously.

Second, it limits reuse. A tool built for one application cannot be used by another without copying and adapting the code. The ecosystem fragments.

MCP's client-server architecture separates concerns: tool implementations live in dedicated server processes, and AI applications connect to those servers through a protocol. This decoupling is the architectural insight that enables the N+M reduction described in Lesson 01.

Why Process Separation

MCP servers run as separate processes, not as libraries imported into the AI application. This choice was deliberate and has important consequences:

Security isolation. A malicious or buggy MCP server cannot directly corrupt the AI application's memory space. The process boundary provides isolation.

Language independence. The server can be written in any language - Python, TypeScript, Rust, Go - as long as it speaks the MCP protocol. The client does not care about the server's implementation language.

Composability. Multiple MCP servers can be connected to the same AI application simultaneously. Each runs in its own process, so one server cannot read or corrupt another server's memory.

Lifecycle independence. The server can be started, stopped, and restarted independently of the AI application.



The Three-Role Model: Host, Client, Server

MCP uses three roles, not two. Understanding the distinction between Host and Client is important for understanding how MCP integrates with AI applications.

The Host

The Host is the AI application - Claude Desktop, a VS Code extension, your custom application. The host:

  • Manages the conversation with the language model
  • Decides which MCP servers to connect to (based on user configuration)
  • Creates and manages one MCP client per server
  • Aggregates tool definitions from all connected servers and presents them to the model
  • Routes model tool calls to the appropriate MCP client
  • Presents results to the user

The host is the orchestrator. It owns the user experience and the model interaction. It treats MCP clients as infrastructure components.
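The aggregation and routing duties listed above can be sketched without any SDK. The names below (`FakeClient`, `Host`, `register`, `call`) are hypothetical stand-ins, not part of an MCP library, and the sketch assumes tool names are unique across servers. A real host would need a policy for name collisions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeClient:
    """Hypothetical stand-in for one MCP client bound to one server."""
    server_name: str
    tools: dict  # tool name -> callable taking the arguments dict

    def list_tools(self) -> list[str]:
        return sorted(self.tools)

    def call_tool(self, name: str, arguments: dict):
        return self.tools[name](arguments)


@dataclass
class Host:
    """Aggregates tools from every client and routes calls by tool name."""
    routes: dict = field(default_factory=dict)  # tool name -> owning client

    def register(self, client: FakeClient) -> None:
        for tool in client.list_tools():
            if tool in self.routes:
                raise ValueError(f"duplicate tool name: {tool}")
            self.routes[tool] = client

    def tool_catalog(self) -> list[str]:
        """The combined tool list the host would present to the model."""
        return sorted(self.routes)

    def call(self, tool: str, arguments: dict):
        client = self.routes.get(tool)
        if client is None:
            raise KeyError(f"no connected server provides {tool!r}")
        return client.call_tool(tool, arguments)


fs = FakeClient("filesystem", {"list_files": lambda a: ["notes.txt"]})
gh = FakeClient("github", {"open_issue": lambda a: f"opened: {a['title']}"})

host = Host()
host.register(fs)
host.register(gh)
print(host.tool_catalog())
print(host.call("open_issue", {"title": "Fix transport"}))
```

The model only ever sees the merged catalog. Which server owns a tool is a detail the host resolves at call time.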

The MCP Client

The MCP client is a protocol implementation embedded within the host. There is one client per connected server. The client:

  • Manages the session lifecycle with its MCP server (connect, initialize, maintain, disconnect)
  • Translates the model's tool call requests into MCP protocol messages
  • Handles the transport layer (stdio or HTTP+SSE)
  • Implements the JSON-RPC 2.0 message format
  • Surfaces the server's capabilities (tools, resources, prompts) to the host

The client is the protocol adapter. It abstracts the MCP server behind a clean interface that the host uses without needing to know transport details.

The MCP Server

The MCP server is the tool provider - a separate process (local or remote) that exposes tools, resources, and/or prompts through the MCP protocol. The server:

  • Registers the capabilities it provides (list of tools, resources, prompts)
  • Responds to tool call requests by executing functions and returning results
  • Responds to resource read requests by returning data
  • Handles its own authentication with external systems (GitHub, Postgres, Slack)
  • Manages its own lifecycle independently of the host

Transport Layers

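The protocol revision used in this lesson (2024-11-05) defines two transport mechanisms: stdio (standard input/output) and HTTP with Server-Sent Events (SSE). Both transport the same JSON-RPC 2.0 messages - the difference is in how they establish and manage the connection. Later revisions of the specification replace HTTP+SSE with a Streamable HTTP transport, which this lesson does not cover.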

stdio Transport

In stdio transport, the MCP client launches the MCP server as a child process and communicates through the subprocess's stdin/stdout streams. Messages are newline-delimited JSON.

Host process
└── MCP Client
    ├── Spawns server subprocess: python my_server.py
    ├── Writes JSON to server's stdin
    └── Reads JSON from server's stdout

Server subprocess
├── Reads requests from stdin
├── Executes tool/resource operations
└── Writes responses to stdout
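As a rough illustration of the newline-delimited framing, the sketch below writes and reads messages through an in-memory buffer that stands in for the subprocess pipe. The helper names `write_message` and `read_messages` are illustrative, not SDK functions.

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message onto a single line."""
    # json.dumps escapes newlines inside strings, so the payload never
    # contains a raw line break that would split the frame.
    stream.write(json.dumps(message).encode("utf-8") + b"\n")


def read_messages(stream):
    """Yield one parsed message per non-empty line until EOF."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory buffer stands in for the subprocess pipe.
pipe = io.BytesIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "ping"})
write_message(pipe, {"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
pipe.seek(0)

received = list(read_messages(pipe))
print([m["method"] for m in received])
```

Because every line on stdout is treated as a protocol message, a stdio server should send its own logging to stderr rather than printing to stdout.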

Advantages of stdio:

  • No network configuration required
  • Inherits the user's file system permissions naturally
  • Subprocess isolation - server crash does not crash the host
  • Simple deployment - server is just a Python or Node script
  • No authentication needed for local use

Limitations of stdio:

  • Single user only - the subprocess is owned by one host process
  • Cannot be shared across machines or users
  • Requires the server to be installable on the user's machine

When to use stdio: Local development, single-user tools, tools that access local resources (filesystem, local database), tools that need to run with the user's permissions.

# How Claude Desktop launches a stdio MCP server (conceptual)
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def connect_stdio_server():
    server_params = StdioServerParameters(
        command="python",
        args=["./my_mcp_server.py"],
        env={"GITHUB_TOKEN": "ghp_..."},  # Passed to the subprocess as env vars
    )

    # stdio_client spawns the subprocess and exposes its stdin/stdout as streams
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the session
            await session.initialize()

            # List available tools
            tools = await session.list_tools()
            print(f"Connected to server with {len(tools.tools)} tools")

            # Call a tool
            result = await session.call_tool(
                "list_files",
                {"path": "/home/user/documents", "recursive": False},
            )
            print(result.content)


asyncio.run(connect_stdio_server())

HTTP + SSE Transport

In HTTP+SSE transport, the MCP server runs as an independent HTTP server (FastAPI, Flask, or the MCP HTTP server SDK). The client sends each request as an HTTP POST, and the server delivers responses and notifications over a long-lived Server-Sent Events (SSE) stream rather than in the body of the POST response.

Host process
└── MCP Client
    ├── HTTP POST requests → server endpoint
    └── SSE stream ← server pushes responses/notifications

MCP Server (separate process/machine)
├── HTTP endpoint: POST /messages
├── SSE endpoint: GET /sse
└── Handles authentication, TLS
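To make the split between the POST channel and the SSE channel concrete, here is a simplified parser for the `text/event-stream` wire format. It is a sketch that handles only the `event` and `data` fields. The sample stream, including the `session_id` query parameter, is a hypothetical illustration of a server first announcing where to POST and then delivering a JSON-RPC response.

```python
import json


def parse_sse(raw: str):
    """Yield (event, data) pairs from a text/event-stream payload.

    Simplified: ignores comments and the `id` and `retry` fields.
    """
    event, data_lines = "message", []
    for line in raw.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, "\n".join(data_lines)
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # The format allows one optional space after the colon.
            data_lines.append(line[len("data:"):].removeprefix(" "))


# Hypothetical stream: the POST endpoint first, then a JSON-RPC response.
stream = (
    "event: endpoint\n"
    "data: /messages/?session_id=abc123\n"
    "\n"
    "event: message\n"
    'data: {"jsonrpc": "2.0", "id": 1, "result": {}}\n'
    "\n"
)

events = list(parse_sse(stream))
post_url = events[0][1]
response = json.loads(events[1][1])
print(post_url, response["id"])
```

A client that waits for the answer in the POST response body, as it would with an ordinary REST call, never sees the result. The answer arrives on the stream and is matched to its request by `id`.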

Advantages of HTTP+SSE:

  • Multi-user: multiple clients can connect to one server simultaneously
  • Remote: server can run on a different machine or in the cloud
  • Scalable: can be deployed behind a load balancer, run in containers
  • Monitorable: standard HTTP logging, metrics, health checks

Limitations of HTTP+SSE:

  • More infrastructure required (TLS, authentication, reverse proxy)
  • Network dependency - latency and reliability depend on network
  • Must implement authentication carefully (no implicit user-level isolation)

When to use HTTP+SSE: Shared team tools, production deployments, tools hosted in the cloud, tools that need to be used by multiple people simultaneously.

# MCP server configured for HTTP+SSE transport
from mcp.server import Server
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.responses import Response
from starlette.routing import Mount, Route
import uvicorn

# Create the MCP server
app = Server("team-tools-server")


@app.list_tools()
async def list_tools():
    return [
        # Tool definitions here
    ]


@app.call_tool()
async def call_tool(name: str, arguments: dict):
    # Tool execution here
    pass


# Configure HTTP+SSE transport; clients POST their messages to this path
sse = SseServerTransport("/messages/")


async def handle_sse(request):
    # Each GET /sse connection gets its own pair of streams and its own session
    async with sse.connect_sse(
        request.scope, request.receive, request._send
    ) as streams:
        await app.run(
            streams[0], streams[1],
            app.create_initialization_options()
        )
    # Return an empty response once the SSE stream has closed
    return Response()


# Starlette app
starlette_app = Starlette(
    routes=[
        Route("/sse", endpoint=handle_sse),
        # handle_post_message is an ASGI app, so it is mounted rather than
        # registered as a request handler
        Mount("/messages/", app=sse.handle_post_message),
    ]
)

if __name__ == "__main__":
    uvicorn.run(starlette_app, host="0.0.0.0", port=8080)

JSON-RPC 2.0 Message Format

MCP uses JSON-RPC 2.0 as its message format. JSON-RPC is a lightweight remote procedure call protocol that uses JSON for encoding. Understanding the message format helps you debug protocol issues and implement custom clients or servers.

Request Message

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_files",
    "arguments": {
      "path": "/home/user/documents",
      "recursive": false
    }
  }
}

Fields:

  • jsonrpc: Always "2.0"
  • id: Unique identifier that pairs requests to responses. String or integer. Must be included if a response is expected.
  • method: The RPC method being called
  • params: Method parameters (object or array)
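Since the `id` is what pairs a response with its request, a client needs some bookkeeping for in-flight requests. The sketch below shows one minimal way to do it. `RequestTracker` is a hypothetical helper, not an SDK class.

```python
import itertools
import json


class RequestTracker:
    """Assigns ids to outgoing requests and matches responses back to them."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # id -> method name of the in-flight request

    def build_request(self, method: str, params: dict) -> str:
        request_id = next(self._ids)
        self._pending[request_id] = method
        return json.dumps(
            {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
        )

    def resolve(self, raw_response: str) -> tuple[str, dict]:
        """Return (method, response) for the request this response answers."""
        response = json.loads(raw_response)
        method = self._pending.pop(response["id"])  # KeyError on an unknown id
        return method, response


tracker = RequestTracker()
tracker.build_request("tools/list", {})
tracker.build_request("tools/call", {"name": "list_files", "arguments": {}})

# Responses may arrive in a different order than the requests were sent.
second = '{"jsonrpc": "2.0", "id": 2, "result": {"content": []}}'
first = '{"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}'
print(tracker.resolve(second)[0])
print(tracker.resolve(first)[0])
```

Matching on `id` rather than on arrival order is what keeps a client correct when a slow tool call finishes after a fast one that was sent later.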

Response Message (Success)

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "documents/\n report.pdf\n notes.txt\n analysis.xlsx"
      }
    ],
    "isError": false
  }
}

Response Message (Error)

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32602,
    "message": "Invalid params",
    "data": {
      "details": "Missing required argument: path"
    }
  }
}

Standard JSON-RPC error codes:

  • -32700: Parse error - invalid JSON
  • -32600: Invalid request - missing required fields
  • -32601: Method not found
  • -32602: Invalid params
  • -32603: Internal error
  • -32000 to -32099: Server-defined errors
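The code table above translates directly into a small lookup, which is handy when logging or building error responses. This is a sketch. `describe_error` and `error_response` are illustrative names, and the `data.details` shape simply mirrors the error example in this lesson.

```python
STANDARD_ERRORS = {
    -32700: "Parse error",
    -32600: "Invalid request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}


def describe_error(code: int) -> str:
    """Map a JSON-RPC error code to a human-readable category."""
    if code in STANDARD_ERRORS:
        return STANDARD_ERRORS[code]
    if -32099 <= code <= -32000:
        return "Server-defined error"
    if -32768 <= code <= -32000:
        # The rest of this range is reserved by the JSON-RPC 2.0 spec.
        return "Reserved by JSON-RPC"
    return "Application-defined error"


def error_response(request_id, code: int, detail: str = "") -> dict:
    """Build an error response for the request with the given id."""
    error = {"code": code, "message": describe_error(code)}
    if detail:
        error["data"] = {"details": detail}
    return {"jsonrpc": "2.0", "id": request_id, "error": error}


print(error_response(1, -32602, "Missing required argument: path"))
```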

Notification Message

Notifications are messages that do not expect a response. They have no id field.

{
  "jsonrpc": "2.0",
  "method": "notifications/tools/list_changed",
  "params": {}
}

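MCP uses notifications for one-way events. Most flow from server to client (tool list changes, resource content updates, progress reports for long-running operations), but the client sends them too, such as the initialized notification during the handshake.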
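The four message shapes can be told apart by which fields are present. A minimal classifier, written here as an illustrative helper rather than an SDK function, might look like this:

```python
def classify(message: dict) -> str:
    """Return 'request', 'notification', 'result', or 'error'."""
    if "method" in message:
        # Requests carry an id; notifications do not.
        return "request" if "id" in message else "notification"
    if "error" in message:
        return "error"
    if "result" in message:
        return "result"
    raise ValueError("not a JSON-RPC 2.0 request, notification, or response")


samples = [
    {"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {}},
    {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"},
    {"jsonrpc": "2.0", "id": 1, "result": {"content": []}},
    {"jsonrpc": "2.0", "id": 1, "error": {"code": -32601, "message": "Method not found"}},
]
print([classify(m) for m in samples])
```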


MCP Protocol Methods

MCP defines a specific set of methods that clients and servers implement.

Lifecycle Methods

| Method | Direction | Purpose |
|---|---|---|
| initialize | Client → Server | Start session, negotiate capabilities |
| notifications/initialized | Client → Server | Notification confirming initialization |
| ping | Either direction | Health check |

Tool Methods

| Method | Direction | Purpose |
|---|---|---|
| tools/list | Client → Server | List available tools |
| tools/call | Client → Server | Execute a tool |
| notifications/tools/list_changed | Server → Client | Notification: tool list has changed |

Resource Methods

| Method | Direction | Purpose |
|---|---|---|
| resources/list | Client → Server | List available resources |
| resources/read | Client → Server | Read a resource's content |
| resources/subscribe | Client → Server | Subscribe to resource change notifications |
| resources/unsubscribe | Client → Server | Unsubscribe |
| notifications/resources/list_changed | Server → Client | Notification: resource list changed |
| notifications/resources/updated | Server → Client | Notification: specific resource updated |

Prompt Methods

| Method | Direction | Purpose |
|---|---|---|
| prompts/list | Client → Server | List available prompts |
| prompts/get | Client → Server | Get rendered prompt with arguments |
| notifications/prompts/list_changed | Server → Client | Notification: prompt list changed |
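On the server side, a method table like the ones above usually ends up as a dispatch dictionary. The following is a minimal sketch with two placeholder handlers. The handler names and the empty results they return are assumptions for illustration only.

```python
def handle_ping(params: dict) -> dict:
    return {}


def handle_tools_list(params: dict) -> dict:
    return {"tools": []}  # placeholder: a real server returns its tool definitions


HANDLERS = {
    "ping": handle_ping,
    "tools/list": handle_tools_list,
}


def dispatch(message: dict):
    """Return a response dict for a request, or None for a notification."""
    method = message.get("method")
    handler = HANDLERS.get(method)
    is_request = "id" in message

    if handler is None:
        if not is_request:
            return None  # unknown notifications are dropped silently
        return {
            "jsonrpc": "2.0",
            "id": message["id"],
            "error": {"code": -32601, "message": f"Method not found: {method}"},
        }

    result = handler(message.get("params", {}))
    if not is_request:
        return None
    return {"jsonrpc": "2.0", "id": message["id"], "result": result}


print(dispatch({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}))
print(dispatch({"jsonrpc": "2.0", "id": 2, "method": "tools/delete"}))
```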

Initialization Handshake and Capability Negotiation

The initialization handshake is the first exchange in every MCP session. It establishes which protocol version both sides support and what capabilities each side offers.

The Initialize Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "roots": {"listChanged": true},
      "sampling": {}
    },
    "clientInfo": {
      "name": "claude-desktop",
      "version": "1.0.0"
    }
  }
}

Client capabilities tell the server what the client can handle:

  • roots: The client supports workspace roots (file system paths the client cares about)
  • sampling: The client supports server-initiated sampling (server can ask client to generate LLM completions)

The Initialize Response

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "tools": {"listChanged": true},
      "resources": {"subscribe": true, "listChanged": true},
      "prompts": {"listChanged": true},
      "logging": {}
    },
    "serverInfo": {
      "name": "filesystem-server",
      "version": "0.6.2"
    }
  }
}

Server capabilities tell the client what the server supports:

  • tools: Server has tools; listChanged means it will notify the client if the tool list changes
  • resources: Server has resources; subscribe means clients can subscribe to resource updates
  • prompts: Server has prompts
  • logging: Server will send log messages to the client
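A client should consult these capabilities before using an optional feature. The helper below is a small sketch operating on the raw capabilities dictionary from the initialize response shown above. `supports` is an illustrative name, not an SDK function.

```python
from typing import Optional


def supports(capabilities: dict, primitive: str, feature: Optional[str] = None) -> bool:
    """True if the server advertises the primitive (and, optionally, the feature)."""
    section = capabilities.get(primitive)
    if section is None:
        return False  # primitive not advertised at all
    if feature is None:
        return True  # an empty object such as "logging": {} still counts
    return bool(section.get(feature, False))


server_caps = {
    "tools": {"listChanged": True},
    "resources": {"subscribe": True, "listChanged": True},
    "prompts": {"listChanged": True},
    "logging": {},
}

if supports(server_caps, "resources", "subscribe"):
    print("safe to send resources/subscribe")
if not supports(server_caps, "tools", "subscribe"):
    print("tools have no subscribe feature; rely on list_changed instead")
```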

Protocol Version Negotiation

The client states the protocol version it wants in the initialize request. If the server supports that version, it responds with the same version; otherwise it responds with a version it does support, and the client disconnects if it cannot work with that version. This mechanism allows the protocol to evolve while maintaining backward compatibility.
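One way to sketch the negotiation follows the behavior in the MCP specification, where the server answers with a version it supports and the client decides whether to continue. The function names are illustrative, and the version lists are example inputs rather than a statement of what any particular client or server supports.

```python
def server_choose_version(requested: str, supported: list[str]) -> str:
    """Echo the requested version if supported, else offer the server's newest."""
    if requested in supported:
        return requested
    # Versions are YYYY-MM-DD strings, so string order is chronological order.
    return max(supported)


def client_accepts(offered: str, supported: list[str]) -> bool:
    """The client continues only if it also speaks the offered version."""
    return offered in supported


client_versions = ["2024-11-05", "2025-03-26"]
old_server_versions = ["2024-11-05"]

offered = server_choose_version("2025-03-26", old_server_versions)
print(offered, client_accepts(offered, client_versions))
```

Here a newer client asks for a version the older server does not know. The server offers the one it does support, and the session proceeds because the client still speaks it.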


Session Lifecycle

A complete MCP session moves through five phases:

Phase 1: Connect

stdio: The client spawns the server as a subprocess. The subprocess's stdin and stdout become the communication channels.

HTTP+SSE: The client opens an SSE connection to the server's SSE endpoint (GET /sse). The server assigns a session ID and the connection remains open for the duration of the session.

Phase 2: Initialize

The client sends the initialize request. The server responds with its capabilities. The client sends the initialized notification. The session is now active.

If initialization fails (the server returns an error, for example on authentication failure, or the two sides cannot agree on a protocol version), the connection is closed immediately.

Phase 3: Discovery

After initialization, the client typically calls tools/list, resources/list, and prompts/list to discover what the server offers. The host aggregates these from all connected servers and presents them to the language model.

Discovery happens once per session (not once per tool call). The results are cached by the client and refreshed when the server sends a list_changed notification.

Phase 4: Operation

The session enters its operational phase. Tool calls, resource reads, and prompt retrievals happen here, interleaved with any notifications from the server. This phase is the "steady state" of the MCP session.

Phase 5: Shutdown

stdio: The client closes the subprocess's stdin, which signals the server to terminate. The server can also exit on its own (crash or intentional shutdown), which the client detects via the broken subprocess pipe.

HTTP+SSE: The client closes the SSE connection. The server cleans up the session state. Either side can terminate - the server by closing the SSE stream, the client by dropping the connection.
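The five phases can be modeled as a small state machine that rejects out-of-order transitions, such as calling a tool before initialization. This is a conceptual sketch. `Phase`, `ALLOWED`, and `SessionState` are hypothetical names, and the transition from Operation back to Discovery models a refresh after a list_changed notification.

```python
from enum import Enum


class Phase(Enum):
    CONNECT = 1
    INITIALIZE = 2
    DISCOVERY = 3
    OPERATION = 4
    SHUTDOWN = 5


# Which phases may follow each phase. Shutdown is reachable from anywhere.
ALLOWED = {
    Phase.CONNECT: {Phase.INITIALIZE, Phase.SHUTDOWN},
    Phase.INITIALIZE: {Phase.DISCOVERY, Phase.SHUTDOWN},
    Phase.DISCOVERY: {Phase.OPERATION, Phase.SHUTDOWN},
    Phase.OPERATION: {Phase.DISCOVERY, Phase.SHUTDOWN},
    Phase.SHUTDOWN: set(),
}


class SessionState:
    def __init__(self):
        self.phase = Phase.CONNECT

    def advance(self, target: Phase) -> None:
        if target not in ALLOWED[self.phase]:
            raise RuntimeError(
                f"cannot move from {self.phase.name} to {target.name}"
            )
        self.phase = target


state = SessionState()
for phase in (Phase.INITIALIZE, Phase.DISCOVERY, Phase.OPERATION):
    state.advance(phase)
print(state.phase.name)
```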


Python Client Implementation

Here is a complete implementation of an MCP client using the official Python SDK, showing the full lifecycle:

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.types import TextContent


async def run_mcp_client():
    """Complete MCP client example: connect, discover, use tools, disconnect."""

    # Configure the server to connect to
    server_params = StdioServerParameters(
        command="python",
        args=["filesystem_server.py"],
        env={
            "ALLOWED_PATHS": "/home/user/documents:/tmp"
        }
    )

    print("Connecting to MCP server...")

    # Phase 1: Connect (stdio_client spawns the server subprocess)
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:

            # Phase 2: Initialize
            init_result = await session.initialize()
            print(f"Connected to: {init_result.serverInfo.name} v{init_result.serverInfo.version}")
            print(f"Protocol: {init_result.protocolVersion}")
            print(f"Capabilities: {init_result.capabilities}")

            # Phase 3: Discovery
            tools_response = await session.list_tools()
            print(f"\nAvailable tools ({len(tools_response.tools)}):")
            for tool in tools_response.tools:
                print(f" - {tool.name}: {tool.description}")

            resources_response = await session.list_resources()
            print(f"\nAvailable resources ({len(resources_response.resources)}):")
            for resource in resources_response.resources:
                print(f" - {resource.uri}: {resource.name}")

            # Phase 4: Operation - call a tool
            print("\nCalling list_files tool...")
            result = await session.call_tool(
                "list_files",
                {"path": "/home/user/documents", "recursive": False}
            )

            for content_item in result.content:
                if isinstance(content_item, TextContent):
                    print(f"Result:\n{content_item.text}")

            # Read a resource
            if resources_response.resources:
                resource = resources_response.resources[0]
                print(f"\nReading resource: {resource.uri}")
                resource_content = await session.read_resource(resource.uri)
                for content in resource_content.contents:
                    if hasattr(content, 'text'):
                        print(f"Content preview: {content.text[:200]}...")

            # Phase 5: Disconnect (automatic when context managers exit)
            print("\nDisconnecting from MCP server...")

    print("Session closed.")


asyncio.run(run_mcp_client())

Error Handling and Resilience

Production MCP clients need to handle three categories of failures:

Transport Failures

stdio: Server process crashes or exits unexpectedly. The client detects this via EOF on the subprocess stdout stream. Response: log the failure, attempt to restart the server, surface error to user.

HTTP+SSE: Network interruption drops the SSE connection. The client detects this via connection close. Response: reconnect with exponential backoff, reinitialize the session, re-discover capabilities.

import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def run_with_reconnect(
    server_params: StdioServerParameters,
    work: Callable[[ClientSession], Awaitable[Any]],
    max_retries: int = 3,
) -> Any:
    """Run work(session) on a fresh session, reconnecting on transport failure.

    work is re-run from the start after a reconnect, so it should be safe to repeat.
    """
    for attempt in range(max_retries):
        try:
            async with stdio_client(server_params) as (read, write):
                async with ClientSession(read, write) as session:
                    await session.initialize()
                    return await work(session)  # Success - exit retry loop
        except Exception as e:
            if attempt < max_retries - 1:
                wait = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
                print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
                await asyncio.sleep(wait)
            else:
                raise RuntimeError(f"Failed after {max_retries} attempts") from e

Protocol Errors

Protocol errors are JSON-RPC errors returned by the server in the error field of a response. They indicate a problem with the request itself (invalid params, unknown tool, unknown method), not a failure inside a tool that ran.

from mcp import ClientSession, McpError


async def safe_tool_call(session: ClientSession, tool_name: str, args: dict) -> str | None:
    """Call a tool with error handling. Returns None on any error."""
    try:
        result = await session.call_tool(tool_name, args)
        if result.isError:
            # Tool executed but returned an application-level error
            print(f"Tool error: {result.content}")
            return None
        # Extract text content
        texts = [c.text for c in result.content if hasattr(c, 'text')]
        return "\n".join(texts)
    except McpError as e:
        # Protocol-level error (invalid params, tool not found, etc.)
        print(f"MCP protocol error {e.error.code}: {e.error.message}")
        return None
    except Exception as e:
        print(f"Unexpected error calling {tool_name}: {e}")
        return None

Application Errors

Tool execution errors - the tool ran, but the underlying operation failed (file not found, API rate limit, permission denied). MCP reports these as successful protocol responses with isError: true in the result. The application must check this flag.


Production Engineering Notes

Session Pooling for HTTP+SSE

In high-throughput deployments, maintaining a pool of pre-established MCP sessions avoids the overhead of initialize handshakes on every request. A session pool should maintain N warm connections to each server, health-check them periodically, and replace failed sessions automatically.
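A pool along these lines can be sketched with an `asyncio.Queue`. `SessionPool` and its `factory` argument are hypothetical. The factory stands in for "connect and run the initialize handshake". Periodic health checks are left out to keep the sketch short.

```python
import asyncio
import itertools
from contextlib import asynccontextmanager


class SessionPool:
    """Keeps `size` warm sessions and replaces any that fail while in use."""

    def __init__(self, factory, size: int):
        self._factory = factory  # async callable returning a ready session
        self._size = size
        self._idle = None  # created in start(), inside the running event loop

    async def start(self) -> None:
        self._idle = asyncio.Queue()
        for _ in range(self._size):
            await self._idle.put(await self._factory())

    def idle_count(self) -> int:
        return self._idle.qsize()

    @asynccontextmanager
    async def acquire(self):
        session = await self._idle.get()  # waits if every session is busy
        try:
            yield session
        except Exception:
            # Assume the session is broken and swap in a fresh one.
            session = await self._factory()
            raise
        finally:
            await self._idle.put(session)


async def demo():
    ids = itertools.count(1)

    async def fake_factory():
        # Hypothetical stand-in for "connect, then initialize".
        return f"session-{next(ids)}"

    pool = SessionPool(fake_factory, size=2)
    await pool.start()
    async with pool.acquire() as session:
        print("using", session)


asyncio.run(demo())
```

Treating every exception as a broken session is deliberately conservative. A production pool would likely distinguish transport failures from ordinary tool errors before discarding a connection.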

Capability Caching

Cache the results of tools/list, resources/list, and prompts/list and only invalidate the cache when the server sends a list_changed notification. Re-discovering capabilities on every tool call adds unnecessary latency.
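A minimal version of this cache is sketched below. `ToolCache` is a hypothetical helper, and `fetch` stands in for whatever performs the real tools/list round trip.

```python
class ToolCache:
    """Caches a tool list until the server announces that it changed."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable performing the tools/list round trip
        self._tools = None

    def get(self) -> list:
        if self._tools is None:
            self._tools = self._fetch()
        return self._tools

    def on_notification(self, method: str) -> None:
        if method == "notifications/tools/list_changed":
            self._tools = None  # next get() re-queries the server


calls = []


def fake_fetch():
    # Stand-in for the network call; records how often it is invoked.
    calls.append(1)
    return ["list_files"]


cache = ToolCache(fake_fetch)
cache.get()
cache.get()  # served from cache
cache.on_notification("notifications/tools/list_changed")
cache.get()  # refetched after invalidation
print(len(calls))
```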

Timeout Configuration

Set explicit timeouts for all protocol operations:

  • Initialize: 10–30 seconds (server startup may be slow)
  • Tool calls: 10–300 seconds depending on the tool (database queries vs. web scraping)
  • Resource reads: 5–30 seconds depending on resource size

import asyncio

from mcp import ClientSession


async def tool_call_with_timeout(
    session: ClientSession,
    tool_name: str,
    args: dict,
    timeout_seconds: float = 30.0
):
    try:
        # wait_for cancels the pending call if the deadline passes
        result = await asyncio.wait_for(
            session.call_tool(tool_name, args),
            timeout=timeout_seconds
        )
        return result
    except asyncio.TimeoutError:
        raise TimeoutError(
            f"Tool '{tool_name}' exceeded {timeout_seconds}s timeout. "
            "Check server health and tool implementation."
        )

:::warning stdio Server Startup Time
When using stdio transport, the MCP client must wait for the server subprocess to start and initialize before sending any requests. Server startup time varies: Python servers with many imports may take 2–5 seconds. Factor this into connection establishment timeouts and avoid spawning new subprocess connections per request - establish the session once and reuse it.
:::

:::danger Do Not Mix Transport Assumptions
stdio sessions are 1:1 (one client, one server process). HTTP+SSE sessions are N:1 (many clients, one server). Authentication, state management, and concurrency assumptions differ fundamentally between the two. Do not deploy code written and tested with stdio to an HTTP+SSE environment without auditing every assumption about session isolation and user identity.
:::
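A concrete case of such an assumption is per-process state. Under stdio, one process serves one client, so a module-level variable is effectively per-user. Under HTTP+SSE it is shared by everyone. The two classes below are hypothetical and only contrast the patterns.

```python
class SharedStateServer:
    """Fine under stdio, a bug under HTTP+SSE: one value for every client."""

    def __init__(self):
        self.cwd = "/"

    def change_dir(self, session_id: str, path: str) -> None:
        self.cwd = path  # session_id is ignored

    def current_dir(self, session_id: str) -> str:
        return self.cwd


class PerSessionServer:
    """State keyed by session id, so clients cannot see each other's state."""

    def __init__(self):
        self._cwd = {}

    def change_dir(self, session_id: str, path: str) -> None:
        self._cwd[session_id] = path

    def current_dir(self, session_id: str) -> str:
        return self._cwd.get(session_id, "/")


shared, isolated = SharedStateServer(), PerSessionServer()
for server in (shared, isolated):
    server.change_dir("alice", "/home/alice/secrets")

print(shared.current_dir("bob"))
print(isolated.current_dir("bob"))
```

With the shared variant, one user's change is visible to another as soon as a second client connects.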


Interview Q&A

Q1: Explain the Host, Client, and Server roles in MCP architecture.

The Host is the AI application - Claude Desktop, VS Code, or a custom application. It manages the conversation with the language model, decides which MCP servers to connect to based on user configuration, creates and manages one MCP Client per server, and aggregates all tool definitions for the model. The MCP Client is a protocol implementation embedded in the host. Each client manages the connection lifecycle with one MCP server: initialization handshake, capability negotiation, translating model tool calls into protocol messages, and handling the transport layer. The Server is the tool provider - a separate process that exposes tools, resources, and prompts through the MCP protocol. It handles its own authentication with external systems and its own lifecycle. The separation of Host from Client is what enables the host to connect to multiple servers simultaneously while each client handles one server relationship cleanly.

Q2: What are the two MCP transport options and when would you use each?

stdio transport runs the MCP server as a subprocess of the host and communicates through the process's stdin/stdout pipes. Use stdio for local development, single-user tools, and tools that access local resources like the filesystem. It inherits the user's permissions naturally, requires no network configuration, and is simple to deploy. HTTP+SSE transport runs the MCP server as an HTTP server that clients connect to over the network, with responses sent via Server-Sent Events. Use HTTP+SSE for shared team tools, production cloud deployments, and tools that need to serve multiple users simultaneously. It supports standard HTTP infrastructure (load balancers, monitoring, authentication) but requires explicit auth implementation and TLS configuration.

Q3: Describe the MCP initialization handshake.

The client sends an initialize request containing the protocol version it supports and its capabilities (e.g., whether it supports server-initiated sampling or workspace roots). The server responds with the protocol version it will use for the session and its capabilities (which primitives it exposes and which features of those primitives it supports). The client then sends an initialized notification confirming it received the response and is ready. After this three-message exchange, the session is active and the client typically calls tools/list, resources/list, and prompts/list to discover what the server offers. Capability negotiation is important because it allows a newer client to work with an older server gracefully - each side only advertises features it actually supports.

Q4: How does MCP handle tool execution errors vs. protocol errors?

These are two distinct error paths. Protocol errors are JSON-RPC errors returned in the error field of a response - they indicate problems at the protocol level: invalid request format, method not found, invalid parameters, internal server error. These use standard JSON-RPC error codes. Application errors - a tool executed but the underlying operation failed (file not found, API rate limit exceeded, permission denied) - are returned as successful protocol responses with isError: true in the result content. The client must check result.isError to distinguish a successful tool call from a tool call that executed but failed. This distinction matters because retry logic differs: protocol errors may indicate a broken session, while application errors indicate an issue with the specific request.

Q5: How would you implement resilience in a production MCP client?

Three layers. First, transport resilience: for stdio, detect subprocess crashes via EOF on stdout and restart with exponential backoff (1s, 2s, 4s...). For HTTP+SSE, detect connection drops and reconnect, then reinitialize the session. Second, operation-level resilience: wrap every tool call and resource read in a timeout using asyncio.wait_for, with timeouts appropriate for each operation type. Retry transient failures (network errors, rate limits) up to N times with jitter. Third, capability caching: cache the results of tools/list, resources/list, and prompts/list and only re-query when the server sends list_changed notifications, avoiding unnecessary round trips on every invocation. Add structured logging of every operation - tool name, arguments, result, latency - so failures can be diagnosed from logs without needing to reproduce the issue.

© 2026 EngineersOfAI. All rights reserved.