Projects

What we're building and shipping.

Apr 15, 2026 · Engineers of AI

SynapseKit v1.0: Lightweight LLM Framework

A production-grade LLM framework built because LangChain frustrated us, and we decided to measure whether a simpler approach could actually work better. SynapseKit has 2 dependencies (vs LangChain's 67), a 30× faster cold start (12 ms vs 360 ms), and built-in cost guardrails that prevent a single agent run from blowing your API budget. It covers chains, agents, RAG pipelines, and tool use, all with zero magic and full debuggability. When something breaks at 3am, you can read the source in 20 minutes. MIT-licensed, fully documented, and battle-tested across 18 objective benchmarks.
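To make the idea of a cost guardrail concrete, here is a minimal, framework-independent sketch. It is not SynapseKit's actual API: the names `CostGuard`, `BudgetExceeded`, and `charge`, and the per-1k-token pricing parameters, are all assumptions made for illustration. The guard records the cost of each model call and aborts the run once cumulative spend passes a cap.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run's cumulative spend passes its cap."""


@dataclass
class CostGuard:
    # Hypothetical class for illustration; not SynapseKit's real interface.
    max_usd: float
    spent_usd: float = 0.0

    def charge(
        self,
        prompt_tokens: int,
        completion_tokens: int,
        usd_per_1k_prompt: float,
        usd_per_1k_completion: float,
    ) -> float:
        """Record one completed model call and return its cost in USD.

        The cost is always added to the running total (the call has already
        happened); if the total now exceeds the cap, BudgetExceeded is raised
        so the agent loop stops making further calls.
        """
        cost = (prompt_tokens / 1000) * usd_per_1k_prompt
        cost += (completion_tokens / 1000) * usd_per_1k_completion
        self.spent_usd += cost
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f}, cap is ${self.max_usd:.4f}"
            )
        return cost
```

An agent loop would call `charge` after every model response and let `BudgetExceeded` propagate, which bounds the overshoot to at most one call's cost.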

2 dependencies · 30× faster cold start · Built-in cost guardrails
Read more →
SynapseKit (httpx, pydantic) vs LangChain (67 dependencies) · 30× faster · 2 vs 67 · MIT

Apr 2026 · Engineers of AI

LLM Framework Showdown: 30 Notebooks on Kaggle

A reproducible benchmark comparing SynapseKit, LangChain, and LlamaIndex across developer experience, RAG pipelines, agent capabilities, and production concerns. Every notebook includes methodology, raw results, statistical significance tests, and our interpretation. No hidden preprocessing, no curated datasets. If you disagree with a result, fork the notebook and prove us wrong; that's the point. We measure cold start, memory footprint, streaming latency at P99, and 15 more dimensions that matter in production.
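As a rough illustration of two of the measurements named above, the sketch below times a cold import in a fresh interpreter and computes a P99 from latency samples. The notebooks' actual methodology is not reproduced here; the function names and the choice of the nearest-rank percentile method are assumptions.

```python
import math
import subprocess
import sys
import time


def cold_start_seconds(module: str) -> float:
    """Time `import <module>` in a fresh Python process.

    The result includes interpreter start-up, so compare frameworks against
    each other (or against a baseline module) rather than reading it as an
    absolute import time.
    """
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


def percentile_nearest_rank(samples: list[float], p: float) -> float:
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

Calling `percentile_nearest_rank(latencies_ms, 99)` on a list of per-chunk streaming latencies gives a P99 figure; repeating `cold_start_seconds` several times and reporting the spread is more informative than a single run.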

30 notebooks · 3 frameworks · Fully reproducible
View project ↗
30 notebooks · Reproducible on Kaggle

Coming Soon · Engineers of AI

Agent Failure Mode Dataset

A structured dataset of production agent failures: not theoretical edge cases, but real traces from deployed systems that broke in ways nobody predicted. Each entry includes the agent architecture, the full execution trace, the exact point of failure, root cause analysis, and a tested mitigation strategy. Categories span infinite loops, tool call malformation, context window exhaustion, cost overruns, and cascading timeout failures. We're releasing this as an open dataset so teams can stress-test their agents against known failure patterns before deploying.
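The dataset has not been released, so its real schema is unknown. As a sketch of how an entry with the fields described above might be represented, the following uses hypothetical type and field names (`FailureEntry`, `TraceStep`, `failure_step`, and so on); only the list of fields and categories comes from the description.

```python
from dataclasses import dataclass
from enum import Enum


class FailureCategory(Enum):
    # Categories as listed in the project description.
    INFINITE_LOOP = "infinite_loop"
    TOOL_CALL_MALFORMATION = "tool_call_malformation"
    CONTEXT_WINDOW_EXHAUSTION = "context_window_exhaustion"
    COST_OVERRUN = "cost_overrun"
    CASCADING_TIMEOUT = "cascading_timeout"


@dataclass
class TraceStep:
    # One step of the execution trace (hypothetical shape).
    action: str
    output: str


@dataclass
class FailureEntry:
    agent_architecture: str
    category: FailureCategory
    trace: list[TraceStep]
    failure_step: int  # index into `trace` marking the exact point of failure
    root_cause: str
    mitigation: str

    def __post_init__(self) -> None:
        # Reject entries whose failure point does not refer to a real step.
        if not 0 <= self.failure_step < len(self.trace):
            raise ValueError("failure_step must index into trace")

    def failing_step(self) -> TraceStep:
        """Return the trace step at which the failure occurred."""
        return self.trace[self.failure_step]
```

Validating the failure index at construction time means a stress-test harness can replay `trace[: failure_step + 1]` without bounds checks of its own.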

40+ failure patterns · Real production traces · Mitigation strategies
Coming soon
Failure Mode        Category    Severity
Infinite loop       Loop
Tool timeout        Tool
Cost overrun        Cost
Context overflow    Memory
Hallucinated tool   Tool
Wrong schema        Schema

Structured traces + mitigations