Agent Error Handling: Three Design Philosophies

How LangChain, SynapseKit, and LlamaIndex each approach failure.

LangChain — Error as LLM Observation
Raise ToolException inside a tool. Set handle_tool_error=True on the tool (e.g. via StructuredTool.from_function), and the exception message becomes the next Observation in the ReAct loop — the LLM reads it and decides what to do. handle_parsing_errors=True on AgentExecutor catches malformed LLM outputs and feeds a correction back to the model. The model is the error handler.
from langchain_core.tools import StructuredTool, ToolException
from langchain_classic.agents import AgentExecutor

def flaky_search(query: str) -> str:
    raise ToolException("Search failed: API timeout. Try a different approach.")

tool = StructuredTool.from_function(
    func=flaky_search,
    handle_tool_error=True,  # catches ToolException → LLM observation
)

executor = AgentExecutor(
    agent=agent,  # an agent runnable built elsewhere
    tools=[tool],
    handle_parsing_errors=True,  # malformed LLM output
    max_iterations=5,
)
What it catches
✓ ToolException (dedicated type)
✓ Error → LLM observation (auto)
✓ Malformed LLM parse errors
✓ LLM model fallback (.with_fallbacks())
✓ Custom error handler callable
✓ Max iterations guard
✗ Circuit breaker (requires LangSmith or custom)
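Beyond True, handle_parsing_errors also accepts a callable: it receives the parsing exception and returns the string shown to the model as the next observation. A minimal sketch, with illustrative wording:

```python
def handle_parse_error(error) -> str:
    """Turn a parse failure into corrective instructions for the model.
    The message wording here is an assumption, not LangChain's built-in text."""
    return (
        "Your last reply was not valid ReAct format. Respond with "
        "'Thought:', then 'Action:' and 'Action Input:', or give a "
        f"'Final Answer:'. The parser reported: {error}"
    )

# Hooked up via: AgentExecutor(..., handle_parsing_errors=handle_parse_error)
```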
The Error Flow
1. Tool raises ToolException with a descriptive message
2. AgentExecutor catches it — wraps the message as an Observation
3. The LLM reads the error — reasons about the next action
4. The LLM decides: retry, different tool, or give up
5. max_iterations stops infinite loops
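The five steps above can be sketched as a plain-Python loop; llm_decide and tools are hypothetical stand-ins for the model and the tool registry:

```python
def run_react(llm_decide, tools, question, max_iterations=5):
    """Toy version of the loop: a failed tool call is not fatal; its
    message becomes the next observation for the model to reason over."""
    scratchpad = [("question", question)]
    for _ in range(max_iterations):
        action, arg = llm_decide(scratchpad)
        if action == "finish":
            return arg
        try:
            observation = tools[action](arg)
        except Exception as e:  # stands in for ToolException handling
            observation = f"Error: {e}"  # error string fed back as observation
        scratchpad.append((action, observation))
    return "Stopped: max_iterations reached."
```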
Key tradeoff
Writing error messages for an LLM audience changes the quality of recovery. "API timeout" is poor — the LLM doesn't know what to do. "The search API is unavailable. You can answer from training knowledge or ask the user to retry later." gives the LLM a recovery path. LangChain wins on built-in features (6/7), but the quality of recovery depends entirely on how you write ToolException messages.
SynapseKit — LLM-Level Resilience
Manual try/except for tool-level errors. FallbackChain for model-level failures — if the primary model fails, automatically retry with a backup. CircuitState tracks per-tool failure counts and short-circuits after threshold. Some errors should never reach the LLM — stop them at the source.
from synapsekit import Agent, Tool, AgentConfig
from synapsekit import FallbackChain, FallbackChainConfig, CircuitState

# Tool-level: manual try/except
class SearchTool(Tool):
    async def run(self, query: str) -> str:
        try:
            return await self._fetch(query)
        except Exception:
            return "Search unavailable. Answering from training data."

# Model-level: FallbackChain
fallback = FallbackChain(FallbackChainConfig(
    models=['gpt-4o-mini', 'gpt-3.5-turbo']  # try in order
))

# Repeated-failure guard: CircuitState
circuit = CircuitState(failure_threshold=3, timeout_seconds=60)
agent = Agent(config=AgentConfig(circuit_breaker=circuit))
What it catches
✗ Dedicated exception type
✗ Auto error → LLM observation
✗ Parse error handling
✓ LLM model fallback (FallbackChain)
✓ Circuit breaker (CircuitState)
✓ Max iterations guard
✗ Custom error handler fn
The Error Flow
1. Tool throws → manual try/except returns a fallback string
2. CircuitState checks the failure count for this tool
3. If the threshold is exceeded → short-circuit, skip the tool entirely
4. LLM call fails → FallbackChain tries the next model
5. Explicit control — nothing happens automatically
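The circuit-breaker step can be approximated in plain Python; the class and method names below are illustrative, not SynapseKit's actual API:

```python
import time

class CircuitBreaker:
    """Minimal sketch of the CircuitState idea: after failure_threshold
    consecutive failures, skip calls to the tool until timeout_seconds
    have elapsed, then allow one trial call."""

    def __init__(self, failure_threshold=3, timeout_seconds=60):
        self.failure_threshold = failure_threshold
        self.timeout_seconds = timeout_seconds
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.timeout_seconds:
                # Open circuit: short-circuit without touching the tool
                return "Tool temporarily disabled. Proceed without it."
            # Half-open: allow one trial call after the timeout
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            return "Tool call failed."
        self.failures = 0
        return result
```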
Key tradeoff
SynapseKit's 3/7 feature score masks where it actually wins. The circuit breaker and FallbackChain cover failure modes LangChain ignores: what happens when the model itself is rate-limited or unavailable? These are the production failures that cause the most damage — an agent that keeps calling a broken service, or keeps waiting for a model that won't respond. SynapseKit handles these. LangChain requires custom code or LangSmith for the same coverage.
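The model-fallback idea itself is framework-independent. A stdlib sketch, where each entry in models is assumed to be any callable that takes a prompt:

```python
def call_with_fallbacks(models, prompt):
    """Try each model client in order; the first success wins.
    Mirrors the FallbackChain idea without any framework."""
    errors = []
    for model in models:
        try:
            return model(prompt)
        except Exception as e:  # rate limit, outage, timeout, etc.
            errors.append(e)
    raise RuntimeError(f"all {len(models)} models failed: {errors[-1]}")
```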
LlamaIndex — Bring Your Own Everything
No built-in error primitives beyond max_iterations. Wrap your tool function in try/except, return an error string, and the agent treats it like any other output. The framework makes no distinction between success and failure. Composability over convention — you attach whatever resilience library you already use.
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

# Manual wrapper — all error handling is yours to write
def safe_search(query: str) -> str:
    try:
        return flaky_search(query)
    except Exception as e:
        return f"Search unavailable: {e}. Proceeding from knowledge."

tool = FunctionTool.from_defaults(fn=safe_search)
agent = ReActAgent.from_tools(
    [tool],
    max_iterations=5,  # the only built-in guard
)
What it catches
✗ Dedicated exception type
✗ Auto error → LLM observation
✗ Parse error handling
✗ LLM model fallback
✗ Circuit breaker
✓ Max iterations guard
✗ Custom error handler fn
The Error Flow
1. Uncaught exception → propagates up, the agent crashes
2. Manual try/except → return an error string as the tool output
3. The LLM sees the string — same as any other tool result
4. No framework assistance — you own all logic beyond this point
5. max_iterations as the last-resort stop
Key tradeoff
LlamaIndex's 1/7 score reflects a deliberate choice. The framework assumes you have (or will build) your own resilience layer. For teams that already use Tenacity for retries or have a circuit breaker in their infrastructure, LlamaIndex slots in without conflict. For teams building production agents from scratch with no existing resilience library, LlamaIndex means writing every error handling pattern manually. The most DIY of the three — the most composable, and the least opinionated.
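A minimal stand-in for that resilience layer, stdlib only (Tenacity gives you the same pattern with less code); the with_retries helper and its return message are illustrative, not part of LlamaIndex:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5):
    """Retry with exponential backoff; on final failure, return an error
    string the agent can read instead of letting the exception crash it."""
    def wrapped(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if attempt == attempts - 1:
                    return (f"Tool failed after {attempts} attempts: {e}. "
                            "Proceeding from knowledge.")
                time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return wrapped

# Hooked up like any plain function:
# tool = FunctionTool.from_defaults(fn=with_retries(flaky_search))
```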
www.engineersofai.com · AI Letters #28 · LLM Showdown Notebook #20