
AI Letters #28 - Agent Error Handling: LangChain Wins on Features, But What Does It Actually Catch?

8 min read
EngineersOfAI
AI Engineering Education

LangChain wins on both dimensions - fewest lines (5) and most built-in error features (6/7). But its ToolException converts failures into LLM observations, making the model your error handler. SynapseKit's CircuitBreaker stops broken services from being hammered. LlamaIndex ships 1/7 features and expects you to bring the rest.

The Numbers

Lines of error-handling code (imports + error-specific lines):

Framework     Imports   Error lines   Total
--------------------------------------------
LangChain        2           3           5
SynapseKit       2           5           7
LlamaIndex       2           6           8

What those lines actually give you (feature depth score):

Feature                      LangChain   SynapseKit   LlamaIndex
-----------------------------------------------------------------
Dedicated exception type        Yes          No           No
Error → LLM observation         Yes          No           No
Handle LLM parse errors         Yes          No           No
LLM fallback chain              Yes          Yes          No
Circuit breaker                 No           Yes          No
Max iterations guard            Yes          Yes          Yes
Custom error handler fn         Yes          No           No

Score (out of 7):               6            3            1

The score gap is wide. LangChain ships 6/7 error handling features out of the box. LlamaIndex ships 1. That 1 is max_iterations - a last-resort stop, not a recovery mechanism.


The Three Design Philosophies

What happens when a tool throws an exception?

LangChain                  SynapseKit                 LlamaIndex
──────────────────────     ──────────────────────     ──────────────────────
ToolException raised       try/except in              try/except wrapper
        ↓                  tool.run() function        (manual)
caught by the tool                 ↓                          ↓
(handle_tool_error=True)   return error string        return error string
        ↓                          ↓                          ↓
Error becomes LLM          Check CircuitState;        Propagates up
observation                FallbackChain              (uncaught = crash)
        ↓                  if LLM fails
LLM tries to recover

LangChain turns tool errors into LLM observations. Raise a ToolException inside a tool, set handle_tool_error=True on that tool (it's a tool-level setting, not an AgentExecutor one), and the exception message becomes a new observation in the agent's thought/action/observation loop. The LLM sees it as: "The tool returned an error: API timeout." It can then reason about it - retry, use a different tool, or tell the user. This is elegant. It's also the source of a subtle failure mode: the LLM will try to reason its way through errors it cannot fix.
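
Here's a minimal sketch of that wiring - the search endpoint and error text are illustrative, and the import paths assume a recent langchain_core:

    import requests
    from langchain_core.tools import StructuredTool, ToolException

    def search_api(query: str) -> str:
        """Search the web for a query."""
        try:
            resp = requests.get("https://example.com/search", params={"q": query}, timeout=5)
            resp.raise_for_status()
            return resp.text
        except requests.RequestException:
            # ToolException signals a tool-level failure without crashing the agent
            raise ToolException(
                "The search API is temporarily unavailable. "
                "Retry with a shorter query or answer from your own knowledge."
            )

    # handle_tool_error lives on the tool, not on AgentExecutor: when True, the
    # ToolException message is returned as the tool's output and becomes the
    # next observation in the loop.
    search_tool = StructuredTool.from_function(func=search_api, handle_tool_error=True)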

SynapseKit handles errors at both layers. Manual try/except in tool.run() for tool-level failures (return a fallback string). FallbackChain for model-level failures - if gpt-4o-mini fails, automatically retry with gpt-3.5-turbo. CircuitState tracks repeated failures and can short-circuit a tool that keeps breaking. Fewer convenience features. More explicit control over what happens when the model itself is the problem.

LlamaIndex provides no built-in error primitives. Max iterations as a last resort. Everything else is a wrapper function you write yourself. FunctionTool.from_defaults(fn=safe_search) where safe_search is just a try/except you added manually. The framework makes no distinction between a tool that errored and a tool that returned normally - both return strings.
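
The wrapper pattern looks roughly like this - the search body is illustrative, and FunctionTool.from_defaults just sees an ordinary function:

    import requests
    from llama_index.core.tools import FunctionTool

    def safe_search(query: str) -> str:
        """Search the web. Returns results or a plain-text error description."""
        try:
            resp = requests.get("https://example.com/search", params={"q": query}, timeout=5)
            resp.raise_for_status()
            return resp.text
        except requests.RequestException as exc:
            # There is no separate error channel: this fallback string looks exactly
            # like a normal result to the agent. Distinguishing the two is on you.
            return f"Search failed ({exc}). Try a different query or answer without it."

    search_tool = FunctionTool.from_defaults(fn=safe_search)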


What LangChain's handle_tool_error Actually Does

This is the mechanism most engineers misunderstand. When you set handle_tool_error=True:

  1. Your tool raises ToolException("Search failed: API timeout")
  2. Because handle_tool_error is set on the tool, the tool catches it and returns the message as its output instead of crashing
  3. AgentExecutor records that message as the next Observation in the ReAct loop
  4. The LLM reads: Observation: Search failed: API timeout
  5. The LLM decides what to do next

The LLM is now your error handler. For recoverable errors ("Search failed, try a different query"), this works well. For unrecoverable errors ("Database credentials invalid"), the LLM will loop - trying variations, rephrasing the query, eventually hitting max_iterations. You need both handle_tool_error=True and max_iterations to prevent infinite loops on hard failures.

handle_tool_error can also accept a string (fixed message to the LLM) or a callable (function that takes the exception and returns a message). The callable pattern is the most production-safe: you can inspect the exception type and give the LLM targeted instructions for specific error classes.
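
A sketch of the callable pattern, reusing search_api from the earlier snippet - the exception inspection, messages, and the agent variable are illustrative placeholders:

    from langchain.agents import AgentExecutor
    from langchain_core.tools import StructuredTool, ToolException

    def describe_tool_error(error: ToolException) -> str:
        """Turn the exception into targeted instructions for the LLM."""
        message = str(error)
        if "credentials" in message.lower():
            # Unrecoverable: tell the model to stop instead of looping on retries.
            return ("The tool is misconfigured and will keep failing. "
                    "Do not retry it; report the problem to the user.")
        # Recoverable: give the model a concrete next step.
        return f"{message} You may retry once with different arguments."

    search_tool = StructuredTool.from_function(
        func=search_api,                        # the tool function from the earlier sketch
        handle_tool_error=describe_tool_error,  # callable: ToolException -> str
    )

    executor = AgentExecutor(
        agent=agent,                 # whatever ReAct agent you already construct
        tools=[search_tool],
        max_iterations=6,            # hard stop for errors the LLM cannot fix
        handle_parsing_errors=True,  # retry on malformed LLM output instead of crashing
    )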


What This Means for Engineers

  1. For tool-level failures, LangChain's ToolException is the fastest path. Three lines, immediate recovery loop, no custom code. If your tools are external APIs that occasionally fail, ToolException + handle_tool_error=True gets you working recovery behavior in minutes.

  2. For model-level failures, LangChain gives you .with_fallbacks(). Chain multiple models: primary_llm.with_fallbacks([backup_llm]). This is built-in but not wired into AgentExecutor automatically - you need to apply it at the LLM construction step, not the agent step (see the sketch after this list).

  3. SynapseKit's CircuitBreaker is the only primitive that stops compounding failures. If a tool fails three times in a row, CircuitState can mark it as open and refuse subsequent calls until a timeout passes. No LLM framework besides SynapseKit ships this by default. In production systems that call external APIs, a circuit breaker is the difference between "the agent degraded gracefully" and "the agent hammered a failing endpoint 47 times."

  4. LlamaIndex's 1/7 score is a design choice, not a bug. LlamaIndex's philosophy is composability: you bring your own retry logic, your own circuit breaker, your own fallback chain. The framework won't make assumptions about your error handling policy. For teams with existing resilience infrastructure (Polly, Tenacity, custom retry decorators), this is actually fine - LlamaIndex slots in without conflict.

  5. The absence of LangChain's parse error handling in the others is significant. handle_parsing_errors=True catches malformed LLM outputs - when the model returns something that doesn't match the expected ReAct format. This is common with weaker models or unusual prompts. SynapseKit and LlamaIndex both crash on malformed output. LangChain retries with a parsing error message injected back to the LLM.
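
A sketch of point 2 above - the model names are illustrative, and the agent constructor is whichever one you already use:

    from langchain_openai import ChatOpenAI

    primary = ChatOpenAI(model="gpt-4o-mini")
    backup = ChatOpenAI(model="gpt-3.5-turbo")

    # The fallback is applied when the LLM is constructed, before any agent exists.
    llm = primary.with_fallbacks([backup])

    # Then hand `llm` to your agent constructor as usual, e.g.:
    #   agent = create_react_agent(llm, tools, prompt)
    #   executor = AgentExecutor(agent=agent, tools=tools)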


The Thing Most People Miss

Error handling in LLM agents is not the same problem as error handling in deterministic software.

In a REST API, an error is a signal: something failed, here's the status code, the client decides what to do. The error is the end of the interaction.

In an LLM agent, an error is an observation: something failed, the model reads the error message, and the model decides what to do next. The error is the beginning of a new reasoning step.

LangChain's design is built for this. ToolException is not a crash - it's a structured message to the reasoning loop. The implication: you need to write error messages for an LLM audience, not a developer audience. "API timeout" is poor. "The search API is temporarily unavailable. You can either retry the same query or answer from your training knowledge." is better. The LLM will use that context to make a better decision.

The circuit breaker fills a gap this reasoning loop cannot. If the search API is down for 30 minutes, no amount of LLM reasoning will fix it. The circuit breaker stops the agent from trying 20 more times before giving up. It's the only error primitive that operates outside the reasoning loop entirely - which is exactly why LangChain doesn't have one. LangChain's model is: route everything through the LLM. SynapseKit's model is: some failures should never reach the LLM.
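
A rough plain-Python sketch of the idea (this is the generic pattern, not SynapseKit's actual API) - wrap any tool function in it and repeated failures never re-enter the reasoning loop:

    import time

    class SimpleCircuitBreaker:
        """Open after `threshold` consecutive failures; refuse calls until `cooldown` passes."""

        def __init__(self, threshold: int = 3, cooldown: float = 60.0):
            self.threshold = threshold
            self.cooldown = cooldown
            self.failures = 0
            self.opened_at = None

        def call(self, fn, *args, **kwargs):
            if self.opened_at is not None:
                if time.monotonic() - self.opened_at < self.cooldown:
                    # Short-circuit: the failing service is not called at all.
                    return ("This tool is temporarily disabled after repeated failures. "
                            "Use another tool or answer without it.")
                self.opened_at = None   # cooldown elapsed: allow one trial call
                self.failures = 0
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                self.failures += 1
                if self.failures >= self.threshold:
                    self.opened_at = time.monotonic()
                return f"Tool call failed ({exc}). Try different arguments."
            self.failures = 0
            return result

    # Usage: breaker = SimpleCircuitBreaker(); result = breaker.call(search_api, "query")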


Three Things Worth Doing This Week

  1. Add handle_parsing_errors=True to every AgentExecutor you have in production. Malformed LLM outputs are silent failures without this. One extra kwarg, zero code changes.

  2. Audit your tool exception messages for LLM readability. If you're using handle_tool_error=True, the error message is going to the model. Rewrite your ToolException strings as instructions: what happened, what the LLM can try instead.

  3. Count how many times each external tool is called in a single agent run. If any tool can be called more than 5 times, you need a circuit breaker or a call cap. Without one, a single stuck agent can exhaust an API quota. A small counting wrapper like the sketch below is enough.
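
For that third item, the cap can be as small as this (a minimal sketch - construct a fresh wrapper per agent run so the count resets):

    def with_call_cap(fn, max_calls: int = 5):
        """Wrap a tool function so one agent run cannot call it more than max_calls times."""
        count = {"n": 0}

        def capped(*args, **kwargs):
            count["n"] += 1
            if count["n"] > max_calls:
                # Returned as a normal observation so the LLM wraps up instead of retrying.
                return (f"This tool has already been called {max_calls} times this run "
                        "and is now disabled. Answer with what you have.")
            return fn(*args, **kwargs)

        return capped

    # Usage: capped_search = with_call_cap(search_api, max_calls=5)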

The five-line win is real. What you do with it determines whether errors become recoverable observations or infinite loops.


Engineers of AI

Read more: www.engineersofai.com

If this was useful, forward it to one engineer who should be reading it.

Want to Think Like an AI Architect?

Join engineers receiving weekly breakdowns of AI systems, production failures, and architectural decisions.