Module 06: Agent Memory
What Separates an Agent From a Chatbot
A chatbot resets. Every conversation starts from zero. It does not know your name, your preferences, what you worked on last week, or what mistakes it made last time.
An agent remembers. It builds a model of you, of the project, of its own capabilities - and that model grows more accurate over time. Memory is the mechanism that turns a stateless language model into a persistent, improving system.
This module covers the full memory stack for production agents.
The Four Memory Types
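Cognitive science commonly distinguishes four types of human memory: working, episodic, semantic, and procedural. AI agents need all four, each implemented differently. In this module, working memory lives in the context window, episodic memory in a vector store, semantic memory in a knowledge graph, and procedural memory in a skill library.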
Each type has different latency, cost, and use cases. Production agents compose all four simultaneously.
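To make the idea of composition concrete, here is a minimal sketch of one agent holding all four memory types and merging them into a single prompt context. It is an illustration under stated assumptions, not the implementation used in the lessons: the `AgentMemory` class and its field names are hypothetical, plain Python containers stand in for the vector store and knowledge graph, and keyword overlap stands in for embedding similarity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Toy container holding all four memory types for one agent (hypothetical)."""
    # Working memory: the last few conversation turns kept in context.
    working: deque = field(default_factory=lambda: deque(maxlen=4))
    # Episodic memory: past experiences (a vector store in a real system).
    episodic: list = field(default_factory=list)
    # Semantic memory: facts as (subject, relation, object) triples.
    semantic: set = field(default_factory=set)
    # Procedural memory: named skills mapped to action sequences.
    procedural: dict = field(default_factory=dict)

    def _overlap(self, query: str, text: str) -> int:
        # Assumption: keyword overlap stands in for embedding similarity.
        return len(set(query.lower().split()) & set(text.lower().split()))

    def build_context(self, query: str) -> str:
        """Compose one prompt context from all four memory types."""
        episodes = [e for e in self.episodic if self._overlap(query, e) > 0]
        facts = [f"{s} {r} {o}" for s, r, o in sorted(self.semantic)
                 if self._overlap(query, f"{s} {o}") > 0]
        skills = [f"{name}: {' -> '.join(steps)}"
                  for name, steps in self.procedural.items()
                  if self._overlap(query, name) > 0]
        sections = [
            ("Recent turns", list(self.working)),
            ("Relevant episodes", episodes),
            ("Known facts", facts),
            ("Applicable skills", skills),
        ]
        lines = []
        for title, items in sections:
            if items:  # skip memory types with nothing relevant
                lines.append(f"## {title}")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


memory = AgentMemory()
memory.working.append("user: deploy the billing service")
memory.episodic.append("last deploy of billing failed on a missing migration")
memory.semantic.add(("billing", "depends_on", "postgres"))
memory.procedural["deploy service"] = ["run tests", "apply migrations", "roll out"]
print(memory.build_context("deploy billing"))
```

Each section of the resulting context comes from a different memory type, which is the sense in which a production agent uses all four at once.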
Module Map
| Lesson | Title | What You Learn |
|---|---|---|
| 01 | Four Types of Agent Memory | Cognitive model → technical implementation; trade-offs; composition patterns |
| 02 | In-Context Working Memory | Context window management; sliding window; summarization; token budgeting |
| 03 | Episodic Memory with Vector Store | ChromaDB; memory formation and retrieval; consolidation; forgetting |
| 04 | Semantic Memory and Knowledge Graphs | Entity/relation extraction; NetworkX graph; multi-hop queries; KG + RAG hybrid |
| 05 | Procedural Memory and Learned Skills | Skill library; trajectory-to-skill extraction; skill retrieval and composition |
| 06 | Memory Compression and Summarization | Progressive summarization; hierarchical compression; importance scoring; LLM rewriting |
| 07 | Cross-Session Persistence | SQLite + ChromaDB hybrid; serialization; state versioning; multi-agent shared memory |
What You Will Build
By the end of this module you will have implemented:
- A four-memory demonstration system - shows all memory types operating on the same agent
- A context manager - tracks tokens, prunes history, summarizes when near the limit
- An episodic memory store - ChromaDB-backed experience storage with importance scoring and consolidation
- A knowledge graph memory - NetworkX-based entity-relation graph with text extraction pipeline
- A procedural skill library - stores successful action sequences, retrieves by task similarity
- A memory compressor - LLM-based summarization with importance-weighted retention
- A persistence layer - SQLite + ChromaDB hybrid that survives restarts and supports multi-agent access
These components fit together into a complete memory architecture that you can attach to any agent.
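As a preview of the smallest of these components, the context manager can be sketched as a token budget with a sliding window. This is a rough sketch under assumptions: `ContextManager` and `count_tokens` are hypothetical names, token counts are approximated by whitespace-separated words rather than a real tokenizer, and the summarizer is a stub where an LLM call would go.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: word count as a rough token estimate.
    # A real system would use the model's tokenizer.
    return len(text.split())


class ContextManager:
    """Keeps recent messages under a token budget (hypothetical sketch)."""

    def __init__(self, max_tokens: int, summarize: Callable[[List[str]], str]):
        self.max_tokens = max_tokens
        self.summarize = summarize  # stand-in for an LLM-backed summarizer
        self.summary = ""
        self.messages: List[str] = []

    def total_tokens(self) -> int:
        return count_tokens(self.summary) + sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Slide the window: evict the oldest messages while over budget,
        # always keeping the newest message verbatim.
        evicted = []
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            evicted.append(self.messages.pop(0))
        if evicted:
            # Fold the evicted messages (and any prior summary) into a new summary.
            prior = [self.summary] if self.summary else []
            self.summary = self.summarize(prior + evicted)


# Stub summarizer for the demo; a real one would call an LLM.
manager = ContextManager(max_tokens=12, summarize=lambda parts: f"[summary of {len(parts)} items]")
manager.add("user: my name is Ada")
manager.add("agent: hello Ada how can I help")
manager.add("user: summarize our last project meeting")
print(manager.summary)
print(manager.messages)
```

The sketch does not bound the size of the summary itself; a fuller version would check the budget again after summarizing.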
The Core Trade-off
Every memory decision involves a fundamental trade-off, and there is no free lunch. Good memory architecture is about making informed trade-offs - knowing which memories matter most, which can be compressed, and which must be retrieved on demand.
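One way to picture this decision is a triage pass over candidate memories under a fixed context budget. The sketch below is an illustrative assumption rather than the module's method: `MemoryItem`, `triage`, and the `compress_ratio` parameter are hypothetical, importance scores are assumed to come from an earlier scoring step, and the greedy highest-importance-first policy is just one simple choice.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MemoryItem:
    text: str
    importance: float  # 0.0 to 1.0, assumed to come from a scoring step
    tokens: int        # cost of keeping the item verbatim in context


def triage(items: List[MemoryItem], budget: int,
           compress_ratio: float = 0.25) -> Dict[str, List[MemoryItem]]:
    """Greedily assign each memory to keep, compress, or retrieve-on-demand."""
    plan: Dict[str, List[MemoryItem]] = {"keep": [], "compress": [], "retrieve": []}
    remaining = budget
    # Most important memories get first claim on the budget.
    for item in sorted(items, key=lambda m: m.importance, reverse=True):
        compressed_cost = max(1, int(item.tokens * compress_ratio))
        if item.tokens <= remaining:
            plan["keep"].append(item)       # fits verbatim
            remaining -= item.tokens
        elif compressed_cost <= remaining:
            plan["compress"].append(item)   # fits only as a summary
            remaining -= compressed_cost
        else:
            plan["retrieve"].append(item)   # stays out of context until needed
    return plan


items = [
    MemoryItem("user's deployment preferences", importance=0.9, tokens=60),
    MemoryItem("transcript of last debugging session", importance=0.7, tokens=80),
    MemoryItem("full project changelog", importance=0.3, tokens=200),
    MemoryItem("user's timezone", importance=0.2, tokens=15),
]
plan = triage(items, budget=100)
for decision, chosen in plan.items():
    print(decision, [m.text for m in chosen])
```

With a budget of 100 tokens, the two items that fit are kept verbatim, the debugging transcript is compressed, and the changelog is left for on-demand retrieval.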
Prerequisites
You should be comfortable with:
- Vector embeddings and similarity search (touched on in Module 03)
- The ReAct loop and tool calling (Modules 01 and 04)
- Basic Python async patterns
The code examples use the openai, chromadb, and networkx packages. All three can be installed with pip.
The Core Principle
An agent with good memory does not just answer questions - it understands context, accumulates expertise, and improves with every interaction. Memory is the foundation of agency.
Let's build that memory system.
