Module 10 - ML System Design
The gap between an ML model that works in a notebook and one that works in production is enormous. This module closes that gap.
Most ML courses teach you how to train models. Very few teach you how to design ML systems - how to decompose a business problem into a tractable ML objective, collect and label data at scale, engineer features that don't leak, choose evaluation metrics that actually track business value, deploy reliably, and build feedback loops that make the system smarter over time.
This module is structured exactly like a senior ML system design interview at Google, Meta, or Amazon - because that's the highest-fidelity test of whether you can actually build ML systems that matter.
The ML System Design Lifecycle
Module Lessons
| # | Lesson | Core Concept |
|---|---|---|
| 01 | Framing ML Problems | Business goal → proxy metric → ML objective → label construction |
| 02 | Data Collection Strategy | Data flywheel, labeling strategies, weak supervision, distribution shift |
| 03 | Feature Engineering at Scale | Feature stores, training-serving skew, embeddings, point-in-time joins |
| 04 | Model Selection Strategy | Choosing between model families, bias-variance, compute vs accuracy tradeoffs |
| 05 | Offline vs Online Evaluation | Precision/recall vs business metrics, A/B testing, interleaving experiments |
| 06 | Deployment Patterns | Batch vs real-time serving, shadow deployment, canary releases, rollback |
| 07 | Feedback Loops and Data Flywheel | Logging, delayed labels, retraining triggers, virtuous vs vicious cycles |
| 08 | Responsible AI and Ethics | Fairness metrics, model cards, bias auditing, regulatory constraints |
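
To make one of the concepts in the table concrete, here is a minimal sketch of the point-in-time join from Lesson 03: for each label event, use only the feature value that was known at that moment, never a later one. The function name `point_in_time_lookup`, the integer timestamps, and the purchase-count data are all hypothetical and chosen only for illustration. Treating a feature recorded at exactly the label time as "known" is also an assumption; a stricter pipeline might require it to be strictly earlier.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_history, label_time):
    """Return the latest feature value observed at or before label_time.

    feature_history: list of (timestamp, value) tuples, sorted by timestamp.
    Returns None when no value had been recorded yet at label_time.
    """
    timestamps = [ts for ts, _ in feature_history]
    # Index of the first entry strictly after label_time.
    idx = bisect_right(timestamps, label_time)
    if idx == 0:
        return None  # Nothing was known yet; using a later value would leak.
    return feature_history[idx - 1][1]


# Hypothetical feature: a user's running purchase count, keyed by event time.
purchase_count_history = [(10, 1), (20, 2), (35, 3)]

# Hypothetical label events (for example, ad impressions) and their timestamps.
label_times = [5, 20, 30, 40]

# Each training row pairs a label time with the feature value known at that time.
training_rows = [
    (t, point_in_time_lookup(purchase_count_history, t)) for t in label_times
]
```

A naive join on the latest value would attach a purchase count of 3 to every row, including the events that happened before the third purchase. That is the kind of leakage that makes offline metrics look better than the model will perform once deployed. In pandas, `merge_asof` performs a comparable backward-looking join on sorted keys.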
The ML System Design Interview
This section covers what interviewers at Google, Meta, and Amazon are actually evaluating, and it is not your knowledge of transformer architectures.
What they look for:
- Problem framing before solution - Can you resist jumping to "let's use a neural network" and instead ask what we're actually optimizing? The single biggest differentiator between junior and senior ML engineers.
- Scale awareness - Do you reason about 1M users differently from 1B users? Do you know when batch inference is better than real-time? When a heuristic beats a model?
- Data intuition - Do you think about where labels come from? Implicit feedback bias? Training-serving skew? These are the things that kill models in production, not model architecture.
- Evaluation discipline - Do you know that offline AUC improvement doesn't always mean online revenue improvement? Can you design an A/B test that isolates the model's contribution?
- End-to-end thinking - Can you trace a single user action from raw event log all the way through feature pipeline, model inference, and business impact?
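
As a rough illustration of the evaluation-discipline point, the sketch below checks whether a difference in conversion rate between a control arm and a treatment arm of an A/B test is larger than chance would explain. It uses a standard two-proportion z-test with only the Python standard library. The function name `two_proportion_z_test` and the traffic numbers are hypothetical. A real experiment would also need a pre-registered sample size, proper randomization units, and guardrail metrics.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, n_a: conversions and users in the control arm.
    conv_b, n_b: conversions and users in the treatment arm.
    Returns (z, p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 users per arm.
z, p_value = two_proportion_z_test(conv_a=1000, n_a=10_000, conv_b=1100, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The test only isolates the model's contribution if the two arms differ in nothing but the model. That is why the experiment design (random assignment, same time window, same upstream features) matters as much as the statistics.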
:::tip What "good" looks like in 45 minutes
A strong candidate spends 5 minutes on problem framing, 8 minutes on data strategy, 8 minutes on features, 8 minutes on the model, 8 minutes on evaluation and serving, and 8 minutes on monitoring. A weak candidate spends 30 minutes on the model and 5 minutes on everything else.
:::
Prerequisites
- Module 01 - ML Foundations - bias-variance tradeoff, overfitting, generalization
- Module 04 - Neural Networks - backprop, embeddings, attention
- Module 08 - Recommender Systems - collaborative filtering, matrix factorization (helps with examples in this module)
