Module 06: Case Studies

Theory without production context is incomplete. This module takes the architectural patterns from previous modules and applies them to six real-world system design problems. Each case study is the kind of question you will face in a system design interview at a top tech company - and the kind of system you will need to design and defend in a real engineering role.

What You Will Learn

Case Study Map

#	System	Scale	Key Challenges
01	Recommendation	Billions of items, millions of users	Two-stage architecture, cold start, freshness
02	Search Ranking	Millions of queries/day	Semantic retrieval, learning to rank (LTR), A/B testing
03	Fraud Detection	Under 100 ms latency, 0.001% fraud rate	Class imbalance, delayed labels, concept drift
04	Content Moderation	500 hours of video/minute	Multi-modal, human + AI, adversarial
05	Ad Click Prediction	8.5B impressions/day	Online learning, calibration, exploration
06	LLM Products	Trillions of tokens/month	Cost, latency, hallucination, observability
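To make the fraud row concrete, the arithmetic behind a 0.001% positive rate can be sketched as follows. This is a minimal illustration, not a recommended training recipe: the function name `imbalance_summary` and its output keys are hypothetical, and inverse-frequency weighting is only one of several possible weighting schemes.

```python
def imbalance_summary(positive_rate: float) -> dict:
    """Summarize what a given positive-class rate implies for training.

    positive_rate is a fraction (0.00001 for 0.001%), not a percentage.
    """
    if not 0.0 < positive_rate < 1.0:
        raise ValueError("positive_rate must be strictly between 0 and 1")
    negative_rate = 1.0 - positive_rate
    return {
        # Accuracy of a trivial model that always predicts the negative class.
        "always_negative_accuracy": negative_rate,
        # Inverse-frequency weight for positives relative to negatives
        # (an assumed scheme, chosen here for simplicity).
        "positive_class_weight": negative_rate / positive_rate,
        # Expected number of positives in one million examples.
        "positives_per_million": positive_rate * 1_000_000,
    }


# 0.001% expressed as a fraction.
summary = imbalance_summary(0.001 / 100)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

At this rate a model that never flags fraud is still about 99.999% accurate, which is why accuracy alone says little here and why each positive example has to carry a weight of roughly 100,000 negatives under inverse-frequency weighting.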

Key Patterns Across All Case Studies

Two-stage architecture: fast candidate retrieval followed by slow, expensive ranking - appears in recommendation, search, fraud, and ads
Real-time plus batch hybrid: embeddings precomputed and refreshed offline, real-time features computed online - appears in every case study
Human-in-the-loop: ML scales the decision volume; humans handle edge cases, appeals, and label generation - prominent in moderation and fraud
Feedback loops: user behavior drives model training, which in turn shapes user behavior - must be managed in recommendation, ads, and search
Extreme class imbalance: positive rates on the order of 0.001% to 0.01% for clicks, fraud, and policy violations - sampling and loss-weighting strategies are critical
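The two-stage pattern from the list above can be sketched in a few lines. This is a toy illustration under stated assumptions: the item embeddings, the dot-product retrieval score, and the `expensive_score` function are all hypothetical stand-ins, and a production system would typically use an approximate nearest-neighbor index and a learned ranking model instead.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: Vector, item_embeddings: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1: cheap similarity score over every item, keep the top k."""
    by_similarity = sorted(
        item_embeddings,
        key=lambda item_id: dot(query, item_embeddings[item_id]),
        reverse=True,
    )
    return by_similarity[:k]


def rank(candidates: List[str], score: Callable[[str], float], n: int) -> List[str]:
    """Stage 2: run the costly scorer only on the retrieved candidates."""
    return sorted(candidates, key=score, reverse=True)[:n]


# Hypothetical toy data: four items with 2-d embeddings.
items = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.8, 0.3]}
ranker_scores = {"a": 0.2, "b": 0.7, "c": 0.99, "d": 0.5}
calls = 0


def expensive_score(item_id: str) -> float:
    """Stand-in for a slow ranking model; counts how often it is invoked."""
    global calls
    calls += 1
    return ranker_scores[item_id]


candidates = retrieve([1.0, 0.0], items, k=3)
top = rank(candidates, expensive_score, n=2)
print(candidates, top, calls)
```

In this toy run the expensive scorer is called three times rather than four, and item `c` never reaches it despite having the highest ranker score. The second point is the design trade-off worth remembering: the ranking stage can only reorder what retrieval returns, so retrieval recall bounds final quality.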

How to Use This Module

Each case study is structured as a complete system design walkthrough, from requirements through full architecture. Read each case study as if you are in an interview: start by identifying the functional and non-functional requirements, then work through the architecture layer by layer. The Interview Q&A at the end of each lesson is calibrated to the questions actually asked at Meta, Google, Stripe, Airbnb, and similar companies.

For each system, you should be able to:

State the problem in one sentence and the key constraints (latency, scale, accuracy)
Draw the two-level or three-level architecture from memory
Explain the key modeling choice and why alternatives would fail
Describe the serving path in terms of latency budget
Explain how the system handles the characteristic failure mode (cold start, concept drift, adversarial inputs)
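One way to practice the latency-budget item above is to write the serving path down as named stages and check the sum against the target. The sketch below assumes sequential stages whose latencies simply add; the stage names and millisecond figures are hypothetical, and only the 100 ms target is taken from the fraud detection row of the case study map.

```python
from typing import Dict


def check_latency_budget(stage_ms: Dict[str, float], budget_ms: float) -> Dict[str, object]:
    """Sum sequential stage latencies and report headroom against a budget."""
    total = sum(stage_ms.values())
    slowest = max(stage_ms, key=stage_ms.get)
    return {
        "total_ms": total,
        "headroom_ms": budget_ms - total,  # negative means over budget
        "slowest_stage": slowest,
    }


# Hypothetical per-stage latencies for a single request, in milliseconds.
stages = {
    "feature_lookup": 20.0,
    "candidate_retrieval": 15.0,
    "ranking_model": 40.0,
    "business_rules": 5.0,
    "network_overhead": 10.0,
}

report = check_latency_budget(stages, budget_ms=100.0)
print(report)
```

Naming the slowest stage is the useful part in an interview setting: it points at where caching, a smaller model, or precomputation would buy the most headroom.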
