
Module 06: Agent Memory

What Separates an Agent From a Chatbot

A chatbot resets. Every conversation starts from zero. It does not know your name, your preferences, what you worked on last week, or what mistakes it made last time.

An agent remembers. It builds a model of you, of the project, of its own capabilities - and that model grows more accurate over time. Memory is the mechanism that turns a stateless language model into a persistent, improving system.

This module covers the full memory stack for production agents.


The Four Memory Types

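Cognitive science commonly distinguishes four types of human memory: working, episodic, semantic, and procedural. AI agents need all four, each implemented differently:

  • Working memory - the context window (Lesson 02)
  • Episodic memory - past experiences in a vector store (Lesson 03)
  • Semantic memory - facts and relations in a knowledge graph (Lesson 04)
  • Procedural memory - learned skills in a skill library (Lesson 05)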

Each type has different latency, cost, and use cases. Production agents compose all four simultaneously.
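To make the idea of composition concrete, here is a minimal sketch using only the Python standard library. The class name `AgentMemory`, its method names, and the keyword-matching `recall` are all hypothetical stand-ins chosen for illustration; the lessons replace each store with a real backend (a vector store, a graph, a skill library).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Toy container holding four memory stores side by side (illustrative only)."""

    # Working memory: only the most recent turns, bounded like a context window.
    working: deque = field(default_factory=lambda: deque(maxlen=4))
    # Episodic memory: a log of past experiences (stand-in for a vector store).
    episodic: list = field(default_factory=list)
    # Semantic memory: (subject, relation, object) facts (stand-in for a graph).
    semantic: set = field(default_factory=set)
    # Procedural memory: named action sequences (stand-in for a skill library).
    procedural: dict = field(default_factory=dict)

    def observe(self, turn: str) -> None:
        """Record a turn in working memory and in the episodic log."""
        self.working.append(turn)
        self.episodic.append(turn)

    def learn_fact(self, subject: str, relation: str, obj: str) -> None:
        self.semantic.add((subject, relation, obj))

    def learn_skill(self, name: str, steps: list) -> None:
        self.procedural[name] = list(steps)

    def recall(self, keyword: str) -> dict:
        """Compose all four stores into one view for a query keyword.

        Keyword matching is an assumption made to keep the sketch short;
        a real system would use embedding similarity or graph queries.
        """
        kw = keyword.lower()
        return {
            "working": list(self.working),
            "episodic": [e for e in self.episodic if kw in e.lower()],
            "semantic": sorted(
                f for f in self.semantic if kw in " ".join(f).lower()
            ),
            "procedural": {
                n: s for n, s in self.procedural.items() if kw in n.lower()
            },
        }
```

Working memory is always returned in full because it is already in the prompt, while the other three stores are filtered by the query, which mirrors their different cost profiles.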


Module Map

Lesson | Title | What You Learn
01 | Four Types of Agent Memory | Cognitive model → technical implementation; trade-offs; composition patterns
02 | In-Context Working Memory | Context window management; sliding window; summarization; token budgeting
03 | Episodic Memory with Vector Store | ChromaDB; memory formation and retrieval; consolidation; forgetting
04 | Semantic Memory and Knowledge Graphs | Entity/relation extraction; NetworkX graph; multi-hop queries; KG + RAG hybrid
05 | Procedural Memory and Learned Skills | Skill library; trajectory-to-skill extraction; skill retrieval and composition
06 | Memory Compression and Summarization | Progressive summarization; hierarchical compression; importance scoring; LLM rewriting
07 | Cross-Session Persistence | SQLite + ChromaDB hybrid; serialization; state versioning; multi-agent shared memory

What You Will Build

By the end of this module you will have implemented:

  1. A four-memory demonstration system - shows all memory types operating on the same agent
  2. A context manager - tracks tokens, prunes history, summarizes when near limit
  3. An episodic memory store - ChromaDB-backed experience storage with importance scoring and consolidation
  4. A knowledge graph memory - NetworkX-based entity-relation graph with text extraction pipeline
  5. A procedural skill library - stores successful action sequences, retrieves by task similarity
  6. A memory compressor - LLM-based summarization with importance-weighted retention
  7. A persistence layer - SQLite + ChromaDB hybrid that survives restarts and supports multi-agent access
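As a rough preview of item 2 in the list above, the following sketch shows one way a context manager could track tokens, prune the oldest messages, and fold them into a running summary. Everything here is an assumption made for illustration: `count_tokens` approximates tokens by whitespace-separated words instead of using a real tokenizer, and `_summarize` is a placeholder where an LLM call would normally go.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


@dataclass
class ContextManager:
    max_tokens: int
    summary: str = ""
    history: list = field(default_factory=list)

    def total_tokens(self) -> int:
        """Tokens used by the running summary plus all retained messages."""
        return count_tokens(self.summary) + sum(
            count_tokens(m) for m in self.history
        )

    def add(self, message: str) -> None:
        """Append a message, then prune oldest messages while over budget."""
        self.history.append(message)
        # Always keep the newest message, even if it alone exceeds the budget.
        while self.total_tokens() > self.max_tokens and len(self.history) > 1:
            oldest = self.history.pop(0)
            self.summary = self._summarize(self.summary, oldest)

    @staticmethod
    def _summarize(summary: str, message: str) -> str:
        # Placeholder for an LLM summarization call (an assumption):
        # keep only the first two words of each pruned message.
        head = " ".join(message.split()[:2])
        return f"{summary}; {head}" if summary else head
```

The design choice worth noting is that pruned messages are not simply discarded: each one leaves a (much shorter) trace in the summary, so the agent trades detail for room instead of losing the information entirely.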

These components fit together into a complete memory architecture that you can attach to any agent.
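The "survives restarts" requirement of the persistence layer can be sketched with the standard-library `sqlite3` module alone. This is a simplified illustration, not the module's actual implementation: the `MemoryStore` class, the `memories` table schema, and the `demo_restart` helper are hypothetical, and the ChromaDB half of the hybrid is omitted.

```python
import json
import os
import sqlite3
import tempfile


class MemoryStore:
    """Minimal SQLite-backed store keyed by agent and memory kind (hypothetical schema)."""

    def __init__(self, path: str):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "agent_id TEXT NOT NULL, "
            "kind TEXT NOT NULL, "
            "payload TEXT NOT NULL)"
        )
        self.conn.commit()

    def save(self, agent_id: str, kind: str, payload: dict) -> int:
        """Serialize the payload as JSON and return the new row id."""
        cur = self.conn.execute(
            "INSERT INTO memories (agent_id, kind, payload) VALUES (?, ?, ?)",
            (agent_id, kind, json.dumps(payload)),
        )
        self.conn.commit()
        return cur.lastrowid

    def load(self, agent_id: str, kind: str) -> list:
        """Return all payloads for one agent and kind, oldest first."""
        rows = self.conn.execute(
            "SELECT payload FROM memories "
            "WHERE agent_id = ? AND kind = ? ORDER BY id",
            (agent_id, kind),
        ).fetchall()
        return [json.loads(row[0]) for row in rows]

    def close(self) -> None:
        self.conn.close()


def demo_restart() -> list:
    """Write a memory, close the store, reopen it, and read the memory back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "agent_memory.db")
        store = MemoryStore(path)
        store.save("agent-1", "episodic", {"text": "deployed v2", "importance": 0.9})
        store.close()  # simulates the agent process shutting down

        reopened = MemoryStore(path)  # simulates a fresh process
        result = reopened.load("agent-1", "episodic")
        reopened.close()
        return result
```

Keying rows by `agent_id` is one simple way to let several agents share a database file while keeping their memories separable.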


The Core Trade-off

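Every memory decision involves a fundamental trade-off: whatever stays in the context window costs tokens on every call, whatever is compressed loses detail, and whatever is stored externally costs retrieval latency.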

There is no free lunch. Good memory architecture is about making informed trade-offs - knowing which memories matter most, which can be compressed, and which must be retrieved on demand.
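One way to picture this decision is a greedy planner that walks memories from most to least important and assigns each to one of three tiers under a token budget. This is only a sketch under stated assumptions: the `Memory` class, the `importance` score, the `compress_ratio` parameter, and the word-count token estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float  # hypothetical score in [0, 1]; higher matters more


def plan_context(memories, token_budget, compress_ratio=0.25):
    """Greedily assign memories to tiers, most important first.

    A memory is kept verbatim if its full text fits the remaining budget,
    marked for compression if only its estimated compressed size fits,
    and otherwise left in external storage to be retrieved on demand.
    Token counts are approximated by whitespace-separated words.
    """
    plan = {"verbatim": [], "compressed": [], "retrieve_on_demand": []}
    remaining = token_budget
    for mem in sorted(memories, key=lambda m: m.importance, reverse=True):
        full_cost = len(mem.text.split())
        short_cost = max(1, int(full_cost * compress_ratio))
        if full_cost <= remaining:
            plan["verbatim"].append(mem.text)
            remaining -= full_cost
        elif short_cost <= remaining:
            plan["compressed"].append(mem.text)
            remaining -= short_cost
        else:
            plan["retrieve_on_demand"].append(mem.text)
    return plan
```

A greedy pass is not optimal in general, but it makes the trade-off explicit: every token spent on one memory is a token unavailable to the next.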


Prerequisites

You should be comfortable with:

  • Vector embeddings and similarity search (touched on in Module 03)
  • The ReAct loop and tool calling (Modules 01 and 04)
  • Basic Python async patterns

The code examples use openai, chromadb, and networkx. All are pip-installable.


The Core Principle

An agent with good memory does not just answer questions - it understands context, accumulates expertise, and improves with every interaction. Memory is the foundation of agency.

Let's build that memory system.

© 2026 EngineersOfAI. All rights reserved.