
MCP vs Function Calling

The engineering team at a startup had been using Anthropic's function calling for six months. They had 12 tools defined - database queries, API calls, file operations, search functions. Everything was working. The tools were defined in a big dictionary at the top of their agent code, the handlers were functions in the same file, and it all ran as a single Python process.

Then a second team started building a different AI assistant for a different part of the business. They wanted to use the same 12 tools. The first team offered to share the integration code. The second team looked at it and realized: the tools were deeply coupled to the first team's agent code. The database connection pooling, the authentication headers, the error handling - all of it was tangled up in the agent logic. Extracting the tools would require a significant refactor. The second team wrote their own versions.

Three months later, the company had three AI products and three independent implementations of the same 12 tool integrations. When the Postgres schema changed, all three needed updating. When they discovered a rate limiting issue with the search API, all three needed the same fix. The integration work was being done three times.

This is the problem MCP solves - but it is not the only problem that matters. Function calling solves a different problem: the model-level interface for requesting tool execution. Both exist for a reason. Both have their appropriate use cases. The question is not "which is better?" - it is "which is appropriate for this situation?"

Choosing between them is one of the most practically important architectural decisions in applied AI engineering, and it starts with understanding the difference. Get it right and you build systems that are maintainable, reusable, and well-structured. Get it wrong and you build either over-engineered infrastructure for simple use cases or fragmented integrations that need rewriting for every new application.


Why This Exists: Two Different Problems

Function calling and MCP solve problems at different layers of the AI application stack. This is the most important distinction and the source of most confusion.

Function calling solves the problem of how the language model communicates that it wants to call a function. It defines the interface between the model and the application: the model produces a structured tool_use block, the application reads it, executes the function, and returns the result. Function calling is a model-level protocol.

MCP solves the problem of how the application communicates with the tools it needs to execute. It defines the interface between the application and the tool servers: the MCP client sends a JSON-RPC request to the MCP server, which executes the tool and returns the result. MCP is an infrastructure-level protocol.

They are not competing alternatives. They are layers in the same stack. In a full MCP deployment, function calling is still happening - it is how the model tells the application to call a tool. MCP is how the application then calls the actual tool server. Both are present.
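To make the two layers concrete, here is a minimal sketch of the two messages involved. The `tool_use` block follows the shape returned by Anthropic's Messages API, and the envelope follows MCP's `tools/call` JSON-RPC method. The helper name `to_mcp_request` and the sample IDs are illustrative assumptions; in practice an MCP client SDK builds this envelope for you.

```python
import json

# Layer 1 (function calling): what the model hands to the application.
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_example_01",  # illustrative ID, not a real one
    "name": "search_database",
    "input": {"query": "laptop", "max_price": 1000},
}


def to_mcp_request(block: dict, request_id: int) -> dict:
    """Wrap a model tool_use block in an MCP `tools/call` JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": block["name"], "arguments": block["input"]},
    }


# Layer 2 (MCP): what the application sends to the tool server.
mcp_request = to_mcp_request(tool_use_block, request_id=1)
print(json.dumps(mcp_request, indent=2))
```

The model never sees the JSON-RPC envelope, and the tool server never sees the `tool_use` block. The application sits between the two and translates.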



Historical Context

Function calling (tool use) arrived in large language model APIs in 2023. OpenAI's function calling API debuted in June 2023. Anthropic introduced tool use for Claude as a beta feature alongside Claude 2.1 in November 2023. The concept was formalized in both APIs: the model receives tool definitions in JSON Schema format, can request tool calls via structured output, and processes tool results returned by the application.

For more than a year, function calling was the only standardized mechanism for tool use. Every application that needed tools wrote its own tool definitions and handlers. This worked well for single-application deployments but created the fragmentation problem as organizations built multiple AI products.

MCP, announced in November 2024, was explicitly designed to solve the infrastructure layer that function calling does not address: how do you reuse tool implementations across multiple AI applications? The answer: standardize the interface between applications and tool servers with a protocol.

The two coexist: function calling handles model-to-application communication; MCP handles application-to-tool-server communication.


Deep Comparison: 12 Dimensions

| Dimension | Function Calling | MCP |
| --- | --- | --- |
| Layer | Model ↔ Application | Application ↔ Tool Server |
| Where tools live | In application code | In separate server processes |
| Discovery | Defined per-request (sent in API call) | Discovered at runtime via tools/list |
| Transport | API request/response (HTTP) | stdio or Streamable HTTP (earlier spec revisions used HTTP+SSE) |
| Format | JSON Schema in API payload | JSON Schema in server registration |
| Lifecycle | Stateless per-request | Persistent sessions |
| Reusability | No (per-application) | Yes (any MCP client) |
| Primitives | Tools only | Tools + Resources + Prompts |
| Multi-model | Tied to specific model API | Model-agnostic |
| Setup complexity | Low (inline code) | Higher (separate server process) |
| Maintenance | Per-application | Centralized per server |
| Ecosystem | No standard ecosystem | Growing open ecosystem |

Function Calling: Detailed Behavior

In function calling, tool definitions are passed in every API request alongside the conversation. The model reads these definitions, reasons about whether any tool should be called, and if so, returns a tool_use block instead of (or in addition to) text.

import anthropic
import json

client = anthropic.Anthropic()

# Tool definitions are passed with every API request
tools = [
    {
        "name": "search_database",
        "description": (
            "Search the product database for items matching a query. "
            "Returns product names, prices, and stock levels. "
            "Use for: finding products, checking prices, checking availability."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search terms for products"
                },
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "books", "all"],
                    "default": "all"
                },
                "max_price": {
                    "type": "number",
                    "description": "Maximum price filter (optional)"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "get_order_status",
        "description": "Get the status of a customer order by order ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Order ID (format: ORD-XXXXXXXX)"
                }
            },
            "required": ["order_id"]
        }
    }
]

# `db` and `orders_api` stand for the application's own database and
# orders clients; they are assumed to exist and are not defined here.

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute tool calls inline in the application."""
    if tool_name == "search_database":
        # Direct function call - tool lives in application code
        results = db.search_products(
            query=tool_input["query"],
            category=tool_input.get("category", "all"),
            max_price=tool_input.get("max_price")
        )
        return json.dumps(results)
    elif tool_name == "get_order_status":
        order = orders_api.get_order(tool_input["order_id"])
        return json.dumps(order)
    raise ValueError(f"Unknown tool: {tool_name}")

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,
            tools=tools,  # Tool definitions sent with every request
            messages=messages
        )

        if response.stop_reason != "tool_use":
            # end_turn, max_tokens, etc.: return the text produced so far
            # instead of looping forever on an unhandled stop reason
            return "".join(b.text for b in response.content if b.type == "text")

        # Model requested one or more tool calls
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

Key characteristics:

  • Tools defined in the application code - the definitions, the handlers, and the logic are all together
  • Definitions sent with every API call - if you have 20 tools, 20 tool definitions go in every request (consuming input tokens)
  • Stateless - each request is independent; the model sees the same tool definitions every time
  • Simple to set up - no separate server process, no protocol, just add tools to the API call

MCP Tool Calling: Detailed Behavior

With MCP, the AI application connects to one or more MCP servers at startup. After the initialization handshake, the application asks each server for its tools with a tools/list request. When the model requests a tool call, the application routes it to the appropriate MCP server over the MCP protocol.

import asyncio
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Async client so model calls do not block the event loop
client = anthropic.AsyncAnthropic()

async def run_agent_with_mcp(user_message: str) -> str:
    """
    Agent that uses MCP for tool execution.
    The application connects to MCP servers, discovers their tools,
    and routes model tool calls through MCP.
    """

    # Connect to MCP servers at startup
    server_params = StdioServerParameters(
        command="python",
        args=["product_tools_server.py"],
        env={"DATABASE_URL": "postgresql://..."}
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as mcp_session:
            # Initialize and discover tools from the MCP server
            await mcp_session.initialize()
            tools_response = await mcp_session.list_tools()

            # Convert MCP tool format to Anthropic API tool format
            anthropic_tools = [
                {
                    "name": tool.name,
                    "description": tool.description,
                    "input_schema": tool.inputSchema
                }
                for tool in tools_response.tools
            ]

            # Run the agent loop
            messages = [{"role": "user", "content": user_message}]

            while True:
                response = await client.messages.create(
                    model="claude-opus-4-6",
                    max_tokens=1024,
                    tools=anthropic_tools,
                    messages=messages
                )

                if response.stop_reason != "tool_use":
                    # end_turn, max_tokens, etc.: return the text produced
                    return "".join(
                        b.text for b in response.content if b.type == "text"
                    )

                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        # Route tool call through MCP protocol
                        # Instead of calling a local function,
                        # we send an MCP request to the server
                        mcp_result = await mcp_session.call_tool(
                            block.name,
                            block.input
                        )
                        # Keep only text content; other content types are skipped
                        result_text = "\n".join(
                            c.text for c in mcp_result.content if c.type == "text"
                        )
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result_text,
                            # Tell the model when the server reported a failure
                            "is_error": bool(mcp_result.isError)
                        })

                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": tool_results})

print(asyncio.run(run_agent_with_mcp("Find me laptops under $1000")))

Key characteristics:

  • Tools live in the MCP server - the application does not implement tool logic
  • Tools discovered at runtime - the application asks "what tools do you have?" rather than defining them inline
  • Persistent session - the MCP connection stays open across multiple tool calls
  • Routing required - the application must match model tool calls to the right MCP server (when multiple servers are connected)
  • Setup more involved - requires a running MCP server process
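The routing requirement can be sketched as a lookup table built once after discovery. This is a minimal illustration, not part of any SDK: the function names, the `server__tool` prefix convention, and the sample server contents are all assumptions.

```python
def build_routing_table(
    tools_by_server: dict[str, list[str]],
) -> dict[str, tuple[str, str]]:
    """Map each exposed tool name to (server, original_name).

    Names are prefixed with the server name so two servers can both
    expose, say, a `search` tool without colliding.
    """
    table: dict[str, tuple[str, str]] = {}
    for server, tool_names in tools_by_server.items():
        for name in tool_names:
            table[f"{server}__{name}"] = (server, name)
    return table


def route(table: dict[str, tuple[str, str]], exposed_name: str) -> tuple[str, str]:
    """Resolve a model-requested tool name to its server and original name."""
    try:
        return table[exposed_name]
    except KeyError:
        raise ValueError(f"No MCP server exposes tool: {exposed_name}") from None


# Hypothetical discovery results from two connected servers.
discovered = {"github": ["search", "create_issue"], "postgres": ["search", "query"]}
table = build_routing_table(discovered)
print(route(table, "postgres__search"))  # ('postgres', 'search')
```

The application presents the prefixed names to the model, then strips the prefix before calling `call_tool` on the matching session.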

When to Use Function Calling

Function calling without MCP is the right choice when:

1. Single-application deployment. You have one AI application and no plans to share tools with other applications. The overhead of MCP's server infrastructure is not justified by the reuse it enables.

2. Simple or bespoke tools. Tools that are specific to your application's logic - custom business rules, application-specific data transformations - that would not make sense outside this context.

3. Low tool count. Fewer than 5–10 tools. The token overhead of sending tool definitions with every request is small. The coordination overhead of MCP servers is not justified.

4. Rapid prototyping. You want to test whether tools are useful before committing to MCP infrastructure.

5. Tight latency requirements. Eliminating the MCP protocol overhead (subprocess communication, JSON-RPC serialization) reduces latency for time-sensitive applications.

6. Stateless serverless deployments. If your application runs as serverless functions (AWS Lambda, Cloudflare Workers), maintaining persistent MCP server connections is complicated. Function calling is simpler.

import anthropic

# When function calling is correct: single app, few tools, simple deployment
class SimpleSearchAgent:
    """
    Self-contained agent with inline tool implementations.
    Appropriate for: single-app deployment, small tool count.
    """

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.tools = self._define_tools()

    def _define_tools(self) -> list[dict]:
        return [
            {
                "name": "search_web",
                "description": "Search the web for information.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "Search query"}
                    },
                    "required": ["query"]
                }
            }
        ]

    def _execute_tool(self, name: str, arguments: dict) -> str:
        if name == "search_web":
            return self._search(arguments["query"])
        raise ValueError(f"Unknown tool: {name}")

    def _search(self, query: str) -> str:
        # Direct implementation - no MCP needed for single-app deployment
        # ... call search API (for example with httpx) ...
        return f"Results for: {query}"

When to Use MCP

MCP is the right choice when:

1. Multiple AI applications need the same tools. If two or more applications need GitHub access, build one GitHub MCP server. Both applications connect to it. Maintenance happens once.

2. Tool implementations are substantial. Complex database integrations, multi-step API workflows, tools with sophisticated error handling and retry logic - these are best maintained as dedicated server implementations rather than embedded in application code.

3. IDE or Claude Desktop integration. These are MCP clients. If you want your tools to work in Claude Desktop, VS Code, Cursor, or Zed, you must build MCP servers.

4. Team ownership separation. Platform team owns the tools; product teams build the AI applications. MCP provides the boundary: platform team builds and maintains MCP servers, product teams connect to them.

5. Large tool sets. When you have 30+ tools, grouping them by MCP server gives the application a natural unit for selection: it can pass only the relevant servers' tools to the model instead of all 30 definitions on every API call. MCP does not reduce tokens by itself; whatever the application puts in the request is still counted.

6. Resources and prompts. When you need MCP's Resources and Prompts primitives - not just tool execution - MCP is required.


The Decision Matrix
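One way to read the criteria from the two lists above is as a small decision function. The sketch below is an interpretation rather than a rule from any specification: the `Situation` fields, the precedence order, and the `large_tool_set` threshold of 20 are assumptions chosen to mirror the text.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    consuming_apps: int = 1              # applications that need these tools
    tool_count: int = 5
    needs_desktop_or_ide: bool = False   # Claude Desktop, VS Code, Cursor, Zed
    needs_resources_or_prompts: bool = False
    separate_platform_team: bool = False
    serverless: bool = False
    prototyping: bool = False


def recommend(s: Situation, large_tool_set: int = 20) -> str:
    """Return 'MCP' or 'Function Calling' for the given situation."""
    # Hard requirements: only MCP provides these.
    if s.needs_desktop_or_ide or s.needs_resources_or_prompts:
        return "MCP"
    # Reuse and ownership boundaries are the main reasons to adopt MCP.
    if s.consuming_apps >= 2 or s.separate_platform_team:
        return "MCP"
    # Without a reuse need, deployment constraints favor the simpler option.
    if s.prototyping or s.serverless:
        return "Function Calling"
    # A large tool set benefits from per-server grouping.
    if s.tool_count >= large_tool_set:
        return "MCP"
    return "Function Calling"


print(recommend(Situation()))                  # Function Calling
print(recommend(Situation(consuming_apps=2)))  # MCP
```

Latency requirements are left out on purpose, since they are better settled by measurement than by a flag.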


Hybrid Patterns

Many production deployments use both approaches simultaneously, each for its appropriate purpose.

Pattern 1: Core Tools as Function Calling, Shared Tools as MCP

import asyncio
import json
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Async client so model calls do not block the event loop
client = anthropic.AsyncAnthropic()

# Application-specific tools (function calling - not shared with other apps)
APPLICATION_SPECIFIC_TOOLS = [
    {
        "name": "get_user_context",
        "description": "Get the current user's account details and preferences.",
        "input_schema": {
            "type": "object",
            "properties": {
                "user_id": {"type": "string"}
            },
            "required": ["user_id"]
        }
    }
]

async def build_hybrid_agent(user_id: str, user_message: str) -> str:
    """
    Hybrid agent: application-specific tools via function calling,
    shared tools (GitHub, Postgres) via MCP.
    """
    # Connect to shared MCP servers
    github_params = StdioServerParameters(
        command="npx",
        # Pin an explicit package version in production
        args=["-y", "@modelcontextprotocol/server-github"],
        env={"GITHUB_PERSONAL_ACCESS_TOKEN": "..."}
    )

    async with stdio_client(github_params) as (read, write):
        async with ClientSession(read, write) as github_session:
            await github_session.initialize()
            github_tools = await github_session.list_tools()

            # Combine: application-specific (function calling) + shared (MCP)
            all_tools = APPLICATION_SPECIFIC_TOOLS + [
                {
                    "name": f"github_{t.name}",  # Prefix to avoid name collisions
                    "description": t.description,
                    "input_schema": t.inputSchema
                }
                for t in github_tools.tools
            ]

            messages = [{"role": "user", "content": user_message}]

            while True:
                response = await client.messages.create(
                    model="claude-opus-4-6",
                    max_tokens=1024,
                    tools=all_tools,
                    messages=messages
                )

                if response.stop_reason != "tool_use":
                    # end_turn, max_tokens, etc.: return the text produced
                    return "".join(
                        b.text for b in response.content if b.type == "text"
                    )

                results = []
                for block in response.content:
                    if block.type == "tool_use":
                        if block.name == "get_user_context":
                            # Application-specific: handled locally
                            user = get_user(block.input["user_id"])
                            result = json.dumps(user)
                        elif block.name.startswith("github_"):
                            # Shared tool: routed through MCP
                            actual_name = block.name[len("github_"):]
                            mcp_result = await github_session.call_tool(
                                actual_name, block.input
                            )
                            result = "\n".join(
                                c.text for c in mcp_result.content if c.type == "text"
                            )
                        else:
                            result = f"Unknown tool: {block.name}"

                        results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })

                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": results})

def get_user(user_id: str) -> dict:
    return {"id": user_id, "name": "Test User", "plan": "pro"}

print(asyncio.run(build_hybrid_agent("user-123", "Check my latest GitHub PRs")))

Pattern 2: MCP for Infrastructure, Function Calling for Business Logic

This pattern is common in enterprise deployments:

Database access (complex, shared by many apps) → MCP server
Authentication service (shared, maintained by auth team) → MCP server

Order processing logic (specific to this app) → Function calling
User preference retrieval (app-specific) → Function calling
Notification routing (app-specific business rules) → Function calling

The infrastructure tools that multiple teams need are MCP servers. The application logic that is specific to one product remains as function calling. This gives you the maintenance benefits of MCP where they matter most without over-engineering the parts that are genuinely application-specific.


Migrating from Function Calling to MCP

When you have existing function calling tools and want to move them to MCP:

"""
Migration helper: extract function calling tools into MCP server format.
This is the pattern for migrating an existing agent to use MCP.
"""

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import asyncio

# BEFORE: Function calling tools defined in application
OLD_TOOLS = [
    {
        "name": "search_products",
        "description": "Search the product catalog",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "integer", "default": 10}
            },
            "required": ["query"]
        }
    }
]

# BEFORE: Handler in application code
def handle_search_products(query: str, limit: int = 10) -> str:
    # ... database query ...
    return "results"

# ──────────────────────────────────────────────────────────────
# AFTER: Same tools as MCP server
# ──────────────────────────────────────────────────────────────

app = Server("product-tools-server")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="search_products",
            # Same description - model behavior unchanged
            description="Search the product catalog",
            # Same schema - model sends same arguments
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "limit": {"type": "integer", "default": 10}
                },
                "required": ["query"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    # Return a list of content blocks; the low-level Server wraps it
    # in a tool result for the client.
    if name == "search_products":
        # Same implementation - just wrapped in MCP protocol
        result = handle_search_products(**arguments)
        return [TextContent(type="text", text=result)]
    # Raising lets the SDK report the failure to the client as a tool error
    raise ValueError(f"Unknown tool: {name}")

# The model sees the same tool with the same behavior.
# The difference: now any MCP client can use it, not just this one application.

async def main():
    async with stdio_server() as (read, write):
        await app.run(read, write, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

Migration steps:

  1. Extract tool definitions (JSON Schema) - they are identical in both systems
  2. Move handler functions into the MCP server's call_tool handler
  3. Update the application to connect to the MCP server and route tool calls through it
  4. Test: the model should see the same tools and behavior; only the execution path changed
  5. Once verified, the same MCP server can be connected to other applications
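Step 4 can be partly automated by comparing the definitions the model saw before the migration with the ones discovered from the new server. The helper below is a sketch under the assumption that both lists have already been converted to Anthropic-style dictionaries (`name`, `description`, `input_schema`); the function name `diff_tool_definitions` is hypothetical.

```python
def diff_tool_definitions(old_tools: list[dict], new_tools: list[dict]) -> list[str]:
    """Return human-readable differences between two tool lists.

    An empty result means the model will see identical tools.
    """
    old_by_name = {t["name"]: t for t in old_tools}
    new_by_name = {t["name"]: t for t in new_tools}
    problems = []
    for name in sorted(old_by_name.keys() - new_by_name.keys()):
        problems.append(f"missing after migration: {name}")
    for name in sorted(new_by_name.keys() - old_by_name.keys()):
        problems.append(f"unexpected new tool: {name}")
    for name in sorted(old_by_name.keys() & new_by_name.keys()):
        for key in ("description", "input_schema"):
            if old_by_name[name].get(key) != new_by_name[name].get(key):
                problems.append(f"{name}: {key} differs")
    return problems


old_tools = [{
    "name": "search_products",
    "description": "Search the product catalog",
    "input_schema": {"type": "object", "required": ["query"]},
}]
# Simulates a migration that accidentally reworded the description.
migrated = [{**old_tools[0], "description": "Search products"}]
print(diff_tool_definitions(old_tools, migrated))
# ['search_products: description differs']
```

This only checks what the model sees. Confirming that tool outputs match still needs a behavioral test against the running server.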

Token Efficiency Comparison

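Tool definitions count as input tokens on every API request, whether the tools are implemented inline or behind MCP. The saving comes from sending only the tools relevant to the current task. That filtering is possible with plain function calling too, but MCP's per-server grouping and runtime discovery make it easier to organize.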

import json

def estimate_tool_tokens(tools: list[dict]) -> int:
    """Rough estimate: each character ≈ 0.25 tokens."""
    return len(json.dumps(tools)) // 4

# Scenario: 30 tools defined
all_tools = [
    {
        "name": f"tool_{i}",
        "description": f"Tool {i} does specific thing {i} with these parameters",
        "input_schema": {
            "type": "object",
            "properties": {
                "param_a": {"type": "string", "description": f"Parameter for tool {i}"},
                "param_b": {"type": "integer", "default": 10}
            },
            "required": ["param_a"]
        }
    }
    for i in range(30)
]

all_token_estimate = estimate_tool_tokens(all_tools)
print(f"All 30 tools in every request: ~{all_token_estimate} tokens")

# MCP with selective routing: only send relevant tools
relevant_tools = all_tools[:5]  # Only tools relevant to current task
selective_token_estimate = estimate_tool_tokens(relevant_tools)
print(f"5 relevant tools (MCP selective): ~{selective_token_estimate} tokens")
print(f"Token savings per request: ~{all_token_estimate - selective_token_estimate}")
print(f"At 1000 requests/day: ~{(all_token_estimate - selective_token_estimate) * 1000:,} tokens saved")

For large tool counts, selective routing reduces per-request token consumption. The application decides which tools are relevant for which types of tasks (by server, by keyword, or by a learned classifier) and sends only those in the API call.


Production Engineering Notes

Latency Trade-offs

Function calling avoids the MCP protocol layer (inter-process communication plus JSON-RPC serialization). As a rough, unmeasured estimate, that layer adds on the order of 10–50ms per tool call. Server startup is a separate one-time cost paid when the session opens, not on each call. For interactive applications where tool calls happen during a conversation, the per-call overhead can be noticeable. For batch processing or background agents, it is irrelevant.

Measure the actual latency impact in your application before making architecture decisions based on latency concerns.
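A measurement along these lines can be as simple as timing the same tool call through both paths. The sketch below assumes a synchronous callable; `measure_latency_ms` and `fake_tool_call` are illustrative names, and the sleep stands in for real work.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(call: Callable[[], object], runs: int = 20) -> dict[str, float]:
    """Time `call` repeatedly and return median and p95 latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {"median_ms": statistics.median(samples), "p95_ms": samples[p95_index]}


# Stand-in for a tool call; swap in the inline handler or an MCP round trip.
def fake_tool_call() -> str:
    time.sleep(0.002)
    return "ok"


stats = measure_latency_ms(fake_tool_call, runs=10)
print(f"median={stats['median_ms']:.1f}ms p95={stats['p95_ms']:.1f}ms")
```

For an MCP call, wrap the awaited `call_tool` in a small function that runs it on an already-open session, so that server startup is not counted in every sample.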

Context Management

When using MCP with multiple servers, the application must manage which tools it presents to the model. Presenting all tools from all servers in every request:

  • Increases input token consumption linearly with tool count
  • Can confuse the model when many similar tools are available

Better approach: Use semantic routing to present only the tools relevant to the current task:

import re

async def get_relevant_tools(
    user_message: str,
    all_tools: dict[str, list[dict]]
) -> list[dict]:
    """
    Select only the tools relevant to this specific user message.
    Reduces token consumption and improves tool selection accuracy.
    """
    # Simple keyword-based routing
    keywords_to_servers = {
        "github": ["github", "pr", "issue", "code", "repository", "commit"],
        "database": ["database", "query", "record", "table", "data"],
        "filesystem": ["file", "directory", "read", "write", "document"],
    }

    # Match whole words so that "pr" does not fire on "price" or "product"
    words = set(re.findall(r"[a-z0-9]+", user_message.lower()))
    relevant_servers = set()

    for server, keywords in keywords_to_servers.items():
        if any(kw in words for kw in keywords):
            relevant_servers.add(server)

    # If no specific server detected, return all tools
    if not relevant_servers:
        return [tool for tools in all_tools.values() for tool in tools]

    return [
        tool
        for server in relevant_servers
        for tool in all_tools.get(server, [])
    ]

:::tip The Practical Rule
If you are writing the tool definition and the tool handler in the same file as your agent code, use function calling - it is simpler and more appropriate. If you are writing a tool that will be shared across multiple applications, or you want it to work in Claude Desktop or IDE plugins, build an MCP server. Start with function calling; migrate to MCP when you see the reuse need emerge.
:::

:::warning Do Not Over-Engineer Early
Building MCP servers for every tool from day one is over-engineering in most cases. MCP's benefits (reuse, ecosystem integration, team ownership separation) only materialize when multiple applications need the same tools. For a first AI product, function calling is almost always the right starting point. Add MCP infrastructure when you have a concrete need for it, not in anticipation of a need that may not emerge.
:::


Summary: The Practical Decision

| Situation | Recommendation |
| --- | --- |
| First AI product, few tools | Function Calling |
| Building for Claude Desktop / IDE | MCP (required) |
| Second team wants same tools | Migrate to MCP |
| Platform team / product team separation | MCP |
| Serverless deployment | Function Calling |
| More than 20 tools | MCP |
| Latency under 200ms required | Function Calling |
| Resources or Prompts needed | MCP |
| Prototyping | Function Calling |
| Production, multi-application | MCP |

Interview Q&A

Q1: What is the fundamental difference between MCP and function calling?

They operate at different layers of the stack. Function calling is the model-to-application interface: the protocol by which the language model communicates "call this function with these arguments" to the application that called it. It defines how the model requests tool execution. MCP is the application-to-tool-server interface: the protocol by which an AI application calls out to external tool servers. It defines how the application executes tools. In a full MCP deployment, both are present - function calling happens between the model and the application; MCP happens between the application and the tool servers. They are complementary layers, not competing alternatives.

Q2: When would you choose function calling over MCP?

Function calling is more appropriate when: you have a single AI application with no plans to share tools with others (no reuse benefit from MCP infrastructure); the tool count is small (under 10, where the per-request token overhead is negligible); you are deploying in a serverless environment where persistent MCP connections are complex; you have strict latency requirements where eliminating the MCP protocol overhead matters; or you are in a rapid prototyping phase and want to validate tool utility before committing to MCP infrastructure. The key question is: will these tools be used by multiple applications? If no, function calling is simpler and more appropriate.

Q3: Walk through how you would migrate an existing function calling implementation to MCP.

Four steps. First, extract the tool definitions (JSON Schema) - these are identical in both systems, so nothing changes in what the model sees or how it calls tools. Second, move the tool handler functions into an MCP server using the mcp Python SDK: create a Server instance, register the same tools via @server.list_tools(), implement execution via @server.call_tool(). Third, update the application: instead of calling handler functions directly, connect to the MCP server at startup, discover tools via session.list_tools(), and route tool calls via session.call_tool(). Fourth, test: the model behavior should be identical - same tool names, same descriptions, same input schemas, same outputs. The only thing that changed is the execution path. Once verified, other applications can connect to the same MCP server.

Q4: How does using MCP affect token consumption compared to function calling?

In a naive implementation, MCP does not change token consumption - you still send the same tool definitions in the API request. What MCP makes easier is an optimization: because tools are grouped by server and discovered at runtime, the application can present only the relevant servers' tools for each user request. If you have 30 tools across 3 MCP servers but a specific user question is clearly about GitHub, you can send only the 10 GitHub tools rather than all 30. With inline function calling you can filter the list too, but nothing in the structure prompts you to, and most applications send one static list. At scale with large tool sets, this selective routing can meaningfully reduce input token consumption.

Q5: A team has been using function calling for 6 months with 8 tools. A new team wants to use the same tools. How do you advise them?

This is exactly the scenario where migrating to MCP makes sense. The new team wanting the same tools creates the concrete reuse need that justifies MCP infrastructure. Recommended approach: the platform team builds an MCP server containing the 8 tools, with proper error handling, authentication, and logging. Both teams connect to it as an MCP client. Ongoing maintenance happens once in the MCP server - when the underlying API changes, both teams benefit automatically. For the migration: the tool definitions transfer directly since JSON Schema is the same format. The handler functions move into the MCP server. Each team's application code changes to connect to the MCP server and route tool calls through it. Total migration effort is typically 1–2 days for 8 tools, and the maintenance savings compound over time.

© 2026 EngineersOfAI. All rights reserved.