One doc tagged with "memory-systems"

Module 5: Memory Systems for AI

HBM, DRAM, cache hierarchies, KV cache management, PagedAttention, and quantization as memory compression. Understanding memory is understanding why LLM inference costs what it costs.