Error Handling in Software: A History

From exception hierarchies to LLM-native error recovery.

1960s · Exception handling: errors as first-class objects · General
Early languages like PL/I introduced structured exception handling: the idea that errors should be caught somewhere up the call stack rather than crashing the process. Java's checked exceptions (1995) made error handling contractual: methods must declare what they throw, and callers must catch or re-declare it. Python's built-in exception hierarchy made catching errors by class idiomatic. The core model: errors propagate up the call stack until caught; uncaught exceptions terminate the program. The key assumption: errors are rare, deterministic, and fixable by code.
→ LlamaIndex follows this model exactly — exceptions propagate up. The developer writes the try/except. No framework magic.
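The propagation model above can be sketched in a few lines of Python. All names here are illustrative; the point is that the error surfaces two frames above where it was raised, and nothing in between intervenes.

```python
def fetch_record(record_id: int) -> dict:
    """Raises if the id is invalid; otherwise returns the record."""
    if record_id < 0:
        raise ValueError(f"invalid record id: {record_id}")
    return {"id": record_id}

def load(record_id: int) -> dict:
    # No try/except here: the exception passes straight through this frame.
    return fetch_record(record_id)

def handler(record_id: int) -> str:
    try:
        record = load(record_id)  # error raised two frames down lands here
        return f"ok: {record['id']}"
    except ValueError as exc:
        return f"error: {exc}"

print(handler(7))    # ok: 7
print(handler(-1))   # error: invalid record id: -1
```

If `handler` had no `except`, the uncaught `ValueError` would terminate the program, exactly as the 1960s model prescribes.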
1990s · Retry patterns: idempotency and exponential backoff · General
As networked systems became common, a new failure class emerged: transient failures. The server is temporarily unavailable. The database is under load. Retry with backoff became the standard pattern. AWS's SDK baked in exponential backoff. Tenacity (Python) made retry decorators idiomatic. The key insight: not all errors mean the operation failed — some mean "try again later." Retry logic assumes the operation is idempotent or that you can tolerate duplicate attempts.
→ Neither LangChain nor LlamaIndex ship built-in retry at the tool level. SynapseKit's CircuitState is the inverse: "stop trying" rather than "keep trying".
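A minimal retry-with-exponential-backoff sketch, not tied to any framework (the function and exception names are ours). It assumes the wrapped operation is idempotent, per the caveat above.

```python
import time

class TransientError(Exception):
    """Stands in for e.g. an HTTP 503 or a rate-limit response."""

def retry_with_backoff(fn, max_attempts=4, base_delay=0.05, sleep=time.sleep):
    """Call fn(); on TransientError, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the error
            sleep(base_delay * (2 ** attempt))  # 0.05s, 0.1s, 0.2s, ...

# Usage: a service that fails twice, then recovers.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("service busy")
    return "ok"

print(retry_with_backoff(flaky, sleep=lambda _: None))  # ok
```

Production versions (tenacity, the AWS SDKs) add jitter to the delay so that many clients retrying at once don't synchronize into a thundering herd.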
2007 · Circuit breaker pattern: stop compounding failures · Distributed
Michael Nygard's "Release It!" (2007) formalized the circuit breaker pattern. If a downstream service fails repeatedly, stop calling it — trip the circuit, return a fallback immediately, and check again after a timeout. Three states: Closed (normal), Open (failing — reject calls), Half-Open (testing — try one call to see if recovery happened). Netflix Hystrix (2011) popularized it for microservices. Hystrix was later deprecated in favor of Resilience4j. The key insight: a failing service under load from retry storms fails harder. Stop the storm at the source.
→ SynapseKit's CircuitState is the only LLM framework primitive that implements this pattern. LangChain routes all failures through the LLM instead — the LLM becomes the circuit breaker, which has obvious limits.
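The three states can be captured in a small class. This is an illustrative sketch of the pattern, not SynapseKit's actual CircuitState API; the clock is injectable so the timeout is testable.

```python
import time

class CircuitBreaker:
    """Closed → (threshold failures) → Open → (timeout) → Half-Open."""

    def __init__(self, threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: allow one trial call
        return "open"

    def call(self, fn):
        if self.state == "open":
            raise RuntimeError("circuit open: call rejected")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold or self.state == "half-open":
                self.opened_at = self.clock()  # trip (or re-trip) the circuit
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

In the Open state the downstream service is never touched, which is the whole point: rejecting locally is what stops the retry storm at the source.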
2012 · Fallback chains: degrade gracefully under pressure · Distributed
Cascade failures in distributed systems led to the fallback chain pattern: if the primary path fails, try a cheaper/simpler/cached secondary path. In recommendation systems: try personalized recs → fall back to collaborative filtering → fall back to editorial picks. In APIs: try premium tier → fall back to cached response → fall back to static default. The pattern acknowledges that "somewhat degraded" is almost always better than "completely unavailable." Hystrix Fallback, then Polly's fallback policy, codified this.
→ Both SynapseKit (FallbackChain) and LangChain (.with_fallbacks()) implement this for LLM calls. LlamaIndex does not — no built-in model fallback.
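The recommendation-system cascade above can be sketched generically: try each handler in order and return the first success. The names are illustrative, not any framework's API.

```python
def with_fallbacks(*handlers):
    """Return a callable that tries each handler in order until one succeeds."""
    def run(*args, **kwargs):
        last_exc = None
        for handler in handlers:
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                last_exc = exc  # degrade to the next, cheaper path
        raise last_exc  # every tier failed: nothing left to degrade to
    return run

# Usage: personalized recs → collaborative filtering → editorial picks.
def personalized(user):
    raise TimeoutError("recs service down")

def collaborative(user):
    raise TimeoutError("also down")

def editorial(user):
    return ["staff pick A", "staff pick B"]

recommend = with_fallbacks(personalized, collaborative, editorial)
print(recommend("u42"))  # ['staff pick A', 'staff pick B']
```

The ordering encodes the "somewhat degraded beats completely unavailable" trade-off: each tier is cheaper and less tailored than the one before it.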
2023 Q2 · LangChain ToolException: errors as LLM observations · LLM Era
LangChain introduced ToolException — a dedicated exception type that, when raised inside a tool, gets caught by AgentExecutor and converted into an Observation in the ReAct loop. Set handle_tool_error=True (or a string, or a callable) and the LLM sees the error as structured input to reason about. This is a fundamentally different model from traditional exception handling: the error is not a crash signal but a new data point for the reasoning loop. handle_parsing_errors=True extends this to malformed LLM outputs — a second distinct failure class LangChain explicitly handles.
→ The LLM becomes the error recovery mechanism. Elegant for recoverable, semantically meaningful errors. A liability for hard failures where the LLM will loop futilely.
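A framework-agnostic sketch of the errors-as-observations pattern: the tool's exception is caught by the executor and handed back to the reasoning loop as text rather than raised to the caller. `ToolError` and `run_tool` are our illustrative names (LangChain's actual type is ToolException, caught when handle_tool_error is set).

```python
class ToolError(Exception):
    """A recoverable, semantically meaningful tool failure."""

def run_tool(tool, tool_input):
    """Execute a tool; convert ToolError into an observation string."""
    try:
        return tool(tool_input)
    except ToolError as exc:
        # The agent loop sees this string as the tool's output and can
        # reason about it (e.g. "city not found, try a corrected spelling").
        return f"Tool error: {exc}"

def weather(city):
    if city != "Paris":
        raise ToolError(f"unknown city {city!r}")
    return "18C, clear"

print(run_tool(weather, "Paris"))  # 18C, clear
print(run_tool(weather, "Pairs"))  # typo becomes an observation, not a crash
```

Note that a hard failure (say, the weather API being down entirely) would produce the same kind of observation, which is exactly the liability described above: the LLM may keep retrying a tool that cannot succeed.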
2023 Q4 · LlamaIndex: no built-in primitives, manual composition · LLM Era
LlamaIndex's error handling philosophy is bring-your-own. FunctionTool.from_defaults() accepts any Python function — if you wrap it in try/except and return an error string, the agent sees that string as the tool output. No dedicated exception type. No automatic conversion to observations. No fallback chain. max_iterations is the only built-in guard. The design favors composability: you attach whatever retry/circuit-breaker/fallback library you already use. LlamaIndex makes no assumptions about your resilience stack. Score: 1/7 built-in features.
→ For teams with mature resilience infrastructure, this is fine. For teams building from scratch, this means writing all error handling logic manually.
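The manual wrapping described above amounts to a small decorator. `errors_as_strings` is our name, not a LlamaIndex API; the wrapped function would then be passed to FunctionTool.from_defaults(fn=...) as usual.

```python
import functools

def errors_as_strings(fn):
    """Convert any exception from fn into a string the agent can read."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # The agent sees this string as the tool output.
            return f"Error: {type(exc).__name__}: {exc}"
    return wrapper

@errors_as_strings
def divide(a: float, b: float) -> float:
    """Divide a by b."""
    return a / b

print(divide(6, 3))  # 2.0
print(divide(1, 0))  # Error: ZeroDivisionError: division by zero
```

This is the whole of LlamaIndex's "no framework magic" position: the conversion from exception to observation is a dozen lines you own and can swap out.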
2024 Q1 · SynapseKit FallbackChain + CircuitState: LLM-native resilience · LLM Era
SynapseKit shipped FallbackChain (try gpt-4o-mini, fall back to gpt-3.5-turbo on failure) and CircuitState (track per-tool failure counts, short-circuit after threshold) as first-class framework primitives. Tool-level errors still require manual try/except — SynapseKit doesn't convert exceptions to observations. But model-level failures (rate limits, model unavailable, API errors) are handled by FallbackChain without any code beyond the configuration. CircuitState prevents the agent from hammering a broken tool on repeated calls. Score: 3/7 built-in features, but the 3 it has are the ones LangChain lacks.
→ The two frameworks are complementary: LangChain handles tool-level + parse errors; SynapseKit handles model-level + circuit breaking. Neither covers everything.
2024–2025 · Structured outputs: eliminating parse errors by design · Modern
OpenAI's structured outputs (JSON mode, function calling with schema enforcement) and Anthropic's forced tool use (tool_choice) eliminated an entire class of parse errors: malformed LLM responses. If the model must return valid JSON matching a schema, handle_parsing_errors=True becomes less necessary. The error class LangChain specifically handles (malformed ReAct-format output) is partly mitigated by constrained generation. But not eliminated: structured output APIs can still fail at the API level, and the model can still hallucinate field values that pass schema validation but break downstream logic.
→ The future of error handling in LLM agents is likely structured outputs reducing the parse error surface, combined with circuit breakers and fallback chains for infrastructure failures.
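A small sketch of why schema enforcement shrinks but does not close the error surface. The model reply and the required-fields table below are hypothetical; the reply parses cleanly and matches the expected shape, yet a field value is still nonsense that only a semantic check would catch.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED = {"city": str, "temperature_c": float}

def validate(raw: str) -> dict:
    """Parse and shape-check a structured model reply."""
    data = json.loads(raw)  # constrained generation makes this step reliable
    for key, typ in REQUIRED.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"field {key!r} missing or wrong type")
    return data

reply = '{"city": "Paris", "temperature_c": 5000.0}'  # schema-valid nonsense
data = validate(reply)  # passes: the shape and types are correct
# Downstream logic still needs its own sanity checks on the values:
assert data["temperature_c"] > 100  # a plausibility bound would reject this
```

Structured outputs move the failure from "unparseable text" (a parse error) to "plausible-looking wrong values" (a semantic error), which is a better class of problem but still a problem.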
www.engineersofai.com · AI Letters #28 · LLM Showdown Notebook #20