
AI Letters #31 - Graph Workflows: When Chains Break and DAGs Take Over

· 10 min read
EngineersOfAI
AI Engineering Education

A linear chain handles most tasks. Research, generate, done. But production workflows branch. If the query is complex, run a deeper research step. If it is simple, take the fast path. If quality is insufficient, loop back. This requires a graph, not a chain. Notebook #23 of the LLM Showdown tests which frameworks ship graph primitives - and which force you to build infrastructure from scratch.

"The difference between a framework with graph primitives and one without is the difference between declaring your workflow and implementing your workflow engine."

A chain is a sequence. Step 1 feeds step 2. Step 2 feeds step 3. No decisions. No branches. No loops. For a simple RAG pipeline - retrieve, augment, generate - a chain is all you need.

Then requirements arrive. Route complex queries to a deep research path and simple queries to a fast path. Retry if the answer confidence is below a threshold. Run web search and database lookup in parallel, then merge results. Pause for human approval before executing a tool call.

Each of these patterns requires a directed acyclic graph (or a cyclic one, for loops). You need nodes, edges, conditional routing, state that persists across steps, and an execution engine that handles branching and merging. The question is whether your framework ships this as a primitive or whether you build it yourself.

Notebook #23 builds the same conditional 3-node workflow in all three frameworks: a research node, a conditional router that branches to either a detailed or quick answer path, and terminal nodes. Same logic, same behavior, different APIs.

The results split cleanly into two tiers.

What We Measured

Each framework implements a conditional pipeline: research -> router -> (detailed answer OR quick answer). The router branches based on query length (a proxy for complexity). We measured four things.
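
The router itself is small. A minimal sketch, with illustrative node names (not any specific framework's API), of the length-based routing described above:

```python
# Toy router: query length as a cheap proxy for complexity.
# Node names ("detailed_answer", "quick_answer") are illustrative.
def router(state: dict) -> str:
    """Pick the next node based on how long the query is."""
    return "detailed_answer" if len(state["query"]) > 20 else "quick_answer"
```

In all three frameworks the branching decision is this same function; what differs is how it gets wired into the workflow.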

Metric            What it captures
---------------------------------------------------------
Lines of code     LoC to build the conditional 3-node graph
Feature coverage  7 graph capabilities: StateGraph, conditional
                  edges, parallel branches, cycles,
                  checkpointing, streaming, visualization
API clarity       How readable is the graph definition?
Native support    Does the framework ship graph primitives or
                  require manual Python?

Frameworks: SynapseKit 1.4 (StateGraph), LangChain 1.2 + LangGraph (StateGraph), LlamaIndex Core 0.14 (manual routing)


The Numbers

Lines of code: Conditional 3-node graph

Framework Imports Code Total
----------------------------------------
SynapseKit 1 19 20
LangChain 2 18 20
LlamaIndex 3 12 15

LlamaIndex has the fewest lines. But those 15 lines implement only the happy path - manual if/else routing with no state schema, no checkpointing, no streaming, no visualization. Fewer lines of application code, more lines of infrastructure you will write later.

SynapseKit and LangChain are identical at 20 lines each. The APIs are so similar that porting code from one to the other takes minutes.


The Feature Matrix

This is the real story.

Graph Feature Support (7 features):

Feature SynapseKit LangChain LlamaIndex
---------------------------------------------------------
StateGraph primitive Yes Yes No
Conditional edges Yes Yes No
Parallel branches Yes Yes No
Cycle / loop support Yes Yes No
Built-in checkpointing Yes Yes No
Stream graph events Yes Yes No
Graph visualization Yes Yes No
---------------------------------------------------------
Score 7/7 7/7 0/7

SynapseKit: 7 out of 7. LangChain: 7 out of 7. LlamaIndex: 0 out of 7.

This is not a close race with a narrow winner. This is a binary split. Two frameworks ship a complete graph runtime. One framework ships nothing.


The API Comparison

The most surprising finding: SynapseKit and LangGraph have nearly identical APIs.

SynapseKit:
graph = StateGraph(schema)
graph.add_node('research', research_fn)
graph.add_conditional_edge('research', router, mapping)
graph.add_edge('detailed_answer', END)
app = graph.compile()
result = app.run_sync(initial_state)

LangGraph:
graph = StateGraph(State)
graph.add_node('research', research_fn)
graph.add_conditional_edges('research', router, mapping)
graph.add_edge('detailed_answer', END)
app = graph.compile()
result = app.invoke(initial_state)

The differences: add_conditional_edge (singular) vs add_conditional_edges (plural). run_sync vs invoke. TypedState(fields={...}) vs TypedDict. That is it. The graph definition pattern is identical.

LlamaIndex:
research_result = research_fn(query)
if len(query) > 20:
    result = detailed_fn(research_result)
else:
    result = quick_fn(research_result)

No graph object. No state schema. No conditional edge declaration. Just Python control flow. This works for the simple case. But when you need to add checkpointing, streaming, parallel branches, or cycle detection, you are building a graph engine, not using one.
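
For a sense of what "building a graph engine" means, here is a toy hand-rolled runner for the same conditional workflow. All names are illustrative, and it deliberately omits checkpointing, streaming, cycle detection, and visualization - exactly the pieces a framework primitive would provide:

```python
# Toy graph runner: nodes are functions over state, edges are either a
# terminal marker or a routing function. Illustrative only.
END = "__end__"

def run_graph(nodes, edges, state, entry):
    current = entry
    while current != END:
        state = nodes[current](state)       # execute the node
        route = edges[current]              # find the outgoing edge
        current = route(state) if callable(route) else route
    return state

nodes = {
    "research": lambda s: {**s, "notes": f"notes on {s['query']}"},
    "quick": lambda s: {**s, "result": "short answer"},
    "detailed": lambda s: {**s, "result": "long answer"},
}
edges = {
    "research": lambda s: "detailed" if len(s["query"]) > 20 else "quick",
    "quick": END,
    "detailed": END,
}
final = run_graph(nodes, edges, {"query": "hi"}, "research")
```

Even this skeleton is already as long as the benchmark implementations, and it still does none of the seven features in the matrix above.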


The One Meaningful Difference

Where SynapseKit and LangChain diverge is state definition.

LangGraph uses a plain TypedDict:

class State(TypedDict):
    query: str
    result: str

SynapseKit uses TypedState with explicit StateField declarations:

schema = TypedState(fields={
    'query': StateField(default=''),
    'result': StateField(default=''),
})

For simple last-write-wins state, LangGraph's TypedDict is cleaner and more Pythonic. For parallel branches that merge state - where two nodes independently append to a shared list, for example - SynapseKit's StateField reducers handle the merge logic declaratively. You define how concurrent writes resolve instead of writing merge code.

If your workflows are linear with conditional branches, LangGraph's state model is simpler. If your workflows have parallel fan-out/fan-in patterns, SynapseKit's reducer model prevents merge bugs.
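
The reducer idea itself is framework-independent. A minimal sketch in plain Python - hypothetical helper names, not SynapseKit's or LangGraph's actual API - of how per-field reducers resolve concurrent writes:

```python
import operator

# Hypothetical illustration of reducer-based state merging: each field
# declares how concurrent writes combine; fields without a reducer fall
# back to last-write-wins.
def merge_states(base, updates, reducers):
    merged = dict(base)
    for update in updates:
        for key, value in update.items():
            reducer = reducers.get(key)
            merged[key] = reducer(merged[key], value) if reducer else value
    return merged

base = {"sources": [], "answer": ""}
branch_a = {"sources": ["web:result1"]}   # parallel web-search branch
branch_b = {"sources": ["db:result2"]}    # parallel DB-lookup branch
merged = merge_states(base, [branch_a, branch_b], {"sources": operator.add})
```

With `operator.add` as the reducer, "sources" accumulates both branches' writes; without one, the second branch would silently overwrite the first.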


When You Need a Graph

Not every pipeline needs graph primitives. A simple retrieve-augment-generate chain is fine as a chain. Reach for a graph when one of the following patterns appears.

Pattern Example
---------------------------------------------------------
Conditional routing Route to different models by query
complexity or topic domain

Retry loops Re-run generation if confidence < 0.8,
up to 3 times

Parallel branches Web search + DB lookup simultaneously,
merge results before generation

Human-in-the-loop Pause at review node, wait for
approval, resume or reject

Quality gates Evaluate output against criteria,
loop back to improve if insufficient

Multi-step agents Agent reasons, acts, observes, decides
whether to continue or terminate

If none of these patterns apply to your workflow, a chain is simpler, easier to debug, and sufficient. Do not adopt graph complexity for linear pipelines.
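
The retry-loop pattern from the table above can be sketched in a few lines of plain Python. `generate` and `score` are stand-ins for a real model call and confidence estimator:

```python
# Stand-in generation: returns a draft labeled by attempt number.
def generate(attempt):
    return f"draft-{attempt}"

# Stand-in confidence score that improves with each attempt.
def score(draft):
    return 0.5 + 0.2 * int(draft.split("-")[1])

def generate_with_retries(threshold=0.8, max_attempts=3):
    """Re-run generation until confidence clears the threshold."""
    for attempt in range(1, max_attempts + 1):
        draft = generate(attempt)
        if score(draft) >= threshold:
            return draft, attempt
    return draft, max_attempts
```

In a graph framework this becomes a cycle: an edge from the quality-gate node back to the generation node, with the attempt counter living in shared state.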


What This Means for Engineers

  1. SynapseKit and LangChain tie on graph workflows. Both ship a complete StateGraph primitive with 7/7 features. The APIs are nearly identical. If graph workflows are your primary concern, both frameworks are equivalent choices.

  2. LlamaIndex has no graph primitive. Zero out of 7 features. If your workflow requires conditional routing, loops, or parallel branches, you will build the orchestration layer yourself. This is a significant gap for complex pipeline architectures.

  3. LangGraph's TypedDict state is simpler for basic cases. Plain Python TypedDict with no special imports. For last-write-wins state, this is cleaner than SynapseKit's StateField approach.

  4. SynapseKit's StateField reducers win for parallel merging. When two branches write to the same state key concurrently, reducers define how to merge. Without reducers, you write merge logic manually and hope you handle every edge case.

  5. Fewer lines does not mean simpler. LlamaIndex's 15-line implementation has less code but also less capability. The missing 5 lines buy you state schemas, streaming, checkpointing, visualization, and cycle detection - things you will eventually build by hand.


The Thing Most People Miss

Graph workflows are not about replacing chains. They are about making conditional logic declarative instead of imperative.

You can build any graph workflow in raw Python. If/else for routing. While loops for retries. Threading for parallel branches. A dict for state. It works. But the moment you need to debug a failed run at 3am, you want to see the graph structure, replay from a checkpoint, stream events to a dashboard, and visualize where the execution went.

Raw Python gives you none of that. A graph primitive gives you all of it.

The engineer who reaches for a StateGraph is not the one who cannot write if/else statements. They are the one who has debugged enough production workflows to know that the execution infrastructure matters more than the business logic. The business logic is 15 lines. The observability, checkpointing, streaming, and error handling around it is 150 lines. A framework graph primitive absorbs those 150 lines so you write the 15.

SynapseKit and LangChain both understand this. LlamaIndex, for now, does not.

Week 4 continues: cost tracking, guardrails, MCP support, and the final scorecard. The graph benchmark gives both SynapseKit and LangChain a point. The cumulative race holds steady.


Three Things Worth Doing This Week

  1. Audit your pipeline for hidden conditional logic. Search for if/else branches that route between different processing paths. Each one is a candidate for a graph node with a conditional edge. Declare the routing, do not embed it in procedural code.

  2. Add checkpointing to any workflow that takes more than 30 seconds. If a 5-node pipeline fails at node 4, you should resume from node 3, not restart from node 1. Both SynapseKit and LangGraph ship checkpointers. Use them.

  3. Visualize your graph before deploying it. Both SynapseKit (app.get_mermaid()) and LangGraph (app.get_graph().draw_mermaid()) export Mermaid diagrams. Generate the diagram, review the edges, confirm the routing logic matches your intent. A graph you can see is a graph you can debug.

The best workflow architecture is the one where adding a new branch takes one line, not a refactor. Graph primitives make that possible. Raw Python makes it a project.


Engineers of AI

Read more: www.engineersofai.com

If this was useful, forward it to one engineer who should be reading it.

Want to Think Like an AI Architect?

Join engineers receiving weekly breakdowns of AI systems, production failures, and architectural decisions.