Full 3-Week Cumulative Rankings

21 benchmarks across Developer Experience, RAG Pipelines, and Agents & Tools

#1 Overall: SynapseKit · 45 / 63 possible pts (W1: 15 · W2: 14 · W3: 16)
#2 Overall: LangChain · 31 / 63 possible pts (W1: 8 · W2: 10 · W3: 13)
#3 Overall: LlamaIndex · 26 / 63 possible pts (W1: 7 · W2: 12 · W3: 7)
[Chart] Week-by-Week Points — Cumulative Stacked: one color shade per week, darker = earlier.
[Chart] Week 3 Radar — Agent Dimensions: one axis per benchmark (#15–#20), max score per axis = 3.
Week 3 Detail — Agents & Tools (Notebooks #15–#20)
| Notebook | Dimension | SynapseKit | LangChain | LlamaIndex | Winner |
| --- | --- | --- | --- | --- | --- |
| #15 ReAct Agents | LoC + built-in tools | 3 | 2 | 1 | SynapseKit |
| #16 Function Calling | Schema LoC + multi-format | 3 | 2 | 1 | SynapseKit |
| #17 Built-in Tools | Tool count + zero-config | 3 | 2 | 1 | SynapseKit |
| #18 Multi-Agent | LoC + patterns supported | 3 | 2 | 1 | SynapseKit |
| #19 Observability | LoC to enable + depth | 2 | 2 | 2 | 3-Way Tie |
| #20 Error Handling | LoC + error primitives | 2 | 3 | 1 | LangChain |
| **Week 3 Total** | | **16** | **13** | **7** | SynapseKit |
Insight 1
SK dominates agent ergonomics
Wins ReAct, Function Calling, Built-in Tools, and Multi-Agent outright. The Crew + Task(context_from=[...]) pattern is the most concise multi-agent API across all three.
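To show what that context-passing pattern looks like in practice, here is a minimal plain-Python mock. Only the `Crew` and `Task(context_from=[...])` names come from the comparison above; the constructor fields, `run` callables, and `kickoff` method are invented for this sketch and are not SynapseKit's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    """One unit of work; context_from wires in upstream task outputs."""
    name: str
    run: Callable[[dict], str]              # receives {task_name: output}
    context_from: list["Task"] = field(default_factory=list)

class Crew:
    """Runs tasks in order, feeding each one its declared context."""
    def __init__(self, tasks: list[Task]):
        self.tasks = tasks                  # assumed listed in dependency order

    def kickoff(self) -> dict[str, str]:
        results: dict[str, str] = {}
        for task in self.tasks:
            ctx = {t.name: results[t.name] for t in task.context_from}
            results[task.name] = task.run(ctx)
        return results

research = Task("research", run=lambda ctx: "notes on agents")
write = Task("write",
             run=lambda ctx: f"draft based on: {ctx['research']}",
             context_from=[research])      # declarative dependency, no glue code

out = Crew([research, write]).kickoff()
print(out["write"])  # draft based on: notes on agents
```

The appeal of the pattern is that the dependency graph lives in the task declarations themselves, so no orchestration glue is needed between agents.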
Insight 2
LC wins the one that matters most
Error handling. ToolException + handle_tool_error is genuinely well-designed. SK and LI make you write boilerplate try/except for every tool.
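To illustrate why an error primitive beats per-tool boilerplate, here is a plain-Python analogue of the `ToolException` + `handle_tool_error` design (a simplified sketch, not LangChain's actual code): a single wrapper converts tool failures into strings the agent can read, so each tool body stays free of try/except.

```python
class ToolException(Exception):
    """Raised inside a tool to signal a recoverable, agent-visible failure."""

def with_error_handling(fn, handle_tool_error=True):
    """Wrap a tool so ToolException surfaces as a string, not a crash."""
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except ToolException as exc:
            if handle_tool_error:
                return f"Tool error: {exc}"   # agent sees this and can retry
            raise
    return wrapper

def get_coords(city: str) -> str:
    # Tool body raises instead of catching; the wrapper handles it once.
    if city != "paris":
        raise ToolException(f"unknown city {city!r}")
    return "48.9, 2.4"

get_coords = with_error_handling(get_coords)

print(get_coords("paris"))   # 48.9, 2.4
print(get_coords("tokyo"))   # Tool error: unknown city 'tokyo'
```

Centralizing the handling is the point: with N tools, the try/except lives in one place instead of N.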
Insight 3
LlamaIndex is not an agent framework
No built-in error primitives, 3/6 orchestration patterns, 3 core tool types. It's a retrieval framework with agents bolted on. Strong in Week 2 (RAG), weak in Week 3 (Agents).
Insight 4
The weekly gap is narrowing
SK-to-LC delta by week: Week 1 = 7 pts, Week 2 = 4 pts, Week 3 = 3 pts. LangChain closes ground each week, even as SK's cumulative lead keeps growing. Week 4 (production: async, graph, eval, cost, MCP) may narrow it further.
www.engineersofai.com · AI Letters #29 · LLM Showdown #21 · Frameworks: SynapseKit 1.4, LangChain 1.2, LlamaIndex Core 0.14