Click a tab to see the full multi-turn memory pipeline for each framework. Each tab runs the same task: a three-turn conversation, with memory injected into every RAG query.
from synapsekit import RAG

# memory_window=5: the last five turns are prepended to every query.
rag = RAG(model="gpt-4o-mini", api_key=KEY, memory_window=5)

# (run inside an async context)
await rag.add_documents(DOCS)

r1 = await rag.ask("What is RAG?")
r2 = await rag.ask("How does it improve accuracy?")
r3 = await rag.ask("Which retrieval method is fastest?")
memory_window=5 is a single constructor argument. Every subsequent .ask() automatically prepends the last five turns to the retrieved context, with zero additional setup. Limitation: memory lives only in process, so nothing persists across sessions or restarts. Best for single-user, single-session chatbots.
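If you need the window to survive a restart, one workaround is to journal the transcript to disk yourself and replay it on startup. This is only a sketch: the history= constructor kwarg and the message format below are hypothetical, so check what synapsekit actually exposes before relying on it.

import json
from synapsekit import RAG

HISTORY_PATH = "history.json"

def load_history():
    # Restore prior turns from disk, or start fresh.
    try:
        with open(HISTORY_PATH) as f:
            return json.load(f)
    except FileNotFoundError:
        return []

history = load_history()
# "history=" is an assumed kwarg, not a documented synapsekit parameter.
rag = RAG(model="gpt-4o-mini", api_key=KEY, memory_window=5, history=history)

async def ask_and_journal(question):
    # Ask as usual, then append the turn to the on-disk journal.
    answer = await rag.ask(question)
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": str(answer)})
    with open(HISTORY_PATH, "w") as f:
        json.dump(history, f)
    return answer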
from llama_index.core import Document, VectorStoreIndex, Settings
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini")

index = VectorStoreIndex.from_documents([Document(text=d) for d in DOCS])

# Token-budget memory: the oldest messages are evicted past 1500 tokens.
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

# "context" mode retrieves for every message and keeps history in the prompt.
engine = index.as_chat_engine(memory=memory, chat_mode="context")

r1 = engine.chat("What is RAG?")
r2 = engine.chat("How does it improve accuracy?")
r3 = engine.chat("Which retrieval method is fastest?")
ChatMemoryBuffer.from_defaults(token_limit=1500) creates a token-budget buffer: the engine drops the oldest messages once the limit is exceeded, which keeps prompt sizes more predictable than a turn-count window. Persistence: the buffer can serialize through SimpleChatStore to JSON; Redis and Postgres chat stores ship as separate integration packages rather than in core.
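A minimal sketch of the SimpleChatStore round trip, reusing the index from the snippet above; the file name and the "user1" key are arbitrary choices for this example.

from llama_index.core.storage.chat_store import SimpleChatStore

# Back the buffer with a serializable store, keyed per user.
chat_store = SimpleChatStore()
memory = ChatMemoryBuffer.from_defaults(
    token_limit=1500, chat_store=chat_store, chat_store_key="user1"
)
engine = index.as_chat_engine(memory=memory, chat_mode="context")
engine.chat("What is RAG?")

# Write the conversation to disk as JSON...
chat_store.persist(persist_path="chat_store.json")

# ...and restore it in a later session.
restored = SimpleChatStore.from_persist_path("chat_store.json")
memory = ChatMemoryBuffer.from_defaults(
    token_limit=1500, chat_store=restored, chat_store_key="user1"
)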
from operator import itemgetter

from langchain_community.retrievers import BM25Retriever
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import InMemoryChatMessageHistory

store = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

retriever = BM25Retriever.from_texts(DOCS, k=3)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Context: {ctx}"),
    MessagesPlaceholder("history"),
    ("human", "{question}"),
])

# RunnablePassthrough.assign keeps the "history" key (injected by
# RunnableWithMessageHistory) flowing through to the prompt, and
# itemgetter hands the retriever a plain string, not the input dict.
chain = (
    RunnablePassthrough.assign(ctx=itemgetter("question") | retriever)
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
)

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)

cfg = {"configurable": {"session_id": "s1"}}
r1 = chain_with_history.invoke({"question": "What is RAG?"}, config=cfg)
RunnableWithMessageHistory wraps any LCEL chain with pluggable session storage. The payoff: swap InMemoryChatMessageHistory for RedisChatMessageHistory and you get multi-user persistence that survives restarts, with only the session-history factory changing. That is worth the extra setup if you're building for real users.
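Concretely, the in-memory factory above becomes a Redis-backed one. This sketch assumes a Redis instance at the default local URL and the langchain-community package; adjust the URL for your deployment.

from langchain_community.chat_message_histories import RedisChatMessageHistory

def get_session_history(session_id: str) -> RedisChatMessageHistory:
    # Messages are written to Redis under the session key, so every
    # worker process sees the same history and it survives restarts.
    return RedisChatMessageHistory(session_id, url="redis://localhost:6379/0")

Nothing else in the chain changes: chain_with_history still takes the same session_id config, so existing callers are unaffected.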