ML System Design - The Most Differentiated Round

Reading time: ~15 min | Interview relevance: Critical | Roles: MLE, AI Eng, MLOps (Senior+)

The Real Interview Moment

You're 10 minutes into a system design round. The interviewer opened with: "Design a recommendation system for an e-commerce marketplace." You started drawing boxes - "data pipeline," "model," "serving layer." Now the interviewer interrupts: "You've been describing components. I want to hear about trade-offs. Why this model over that one? What happens when a new user has no history? How do you measure success?"

This is the system design round. It's not about drawing the "correct" architecture diagram - it's about demonstrating that you can reason through ambiguity, make justified trade-offs, and think about ML systems as living, evolving products. This section gives you a framework, a rubric, and 13 complete design problems to practice with.

What You Will Master

A repeatable framework for any ML system design question
The exact rubric interviewers use to score your answer
13 complete design problems covering the full range of ML systems
How AI/LLM system design differs from traditional ML system design
Time management strategies for the 45-minute round

Section Roadmap

Page	What It Covers	Read If
Design Framework	The 6-step RPFMSE framework in detail	Everyone - this is your foundation
Evaluation Rubric	How interviewers score each component	Everyone - know what gets you "Strong Hire"
Recommendation System	Collaborative filtering, content-based, hybrid, cold start	MLE, AI Eng
Search Ranking	Query understanding, retrieval, ranking, personalization	MLE, AI Eng
Fraud Detection	Real-time scoring, class imbalance, adversarial evolution	MLE
News Feed Ranking	Multi-objective optimization, real-time features, diversity	MLE
Ad Click Prediction	Feature stores, real-time bidding, calibration at scale	MLE
Content Moderation	Multi-modal classification, human-in-the-loop, appeals	MLE, AI Eng
Autonomous Driving	Perception, prediction, planning, safety	MLE (specialized)
AI Chatbot	RAG, guardrails, conversation management, evaluation	AI Eng
Visual Search	Embedding models, ANN indexing, cross-modal search	MLE
Anomaly Detection	Unsupervised methods, streaming, alerting	MLE, MLOps
Machine Translation	Encoder-decoder, quality estimation, low-resource	MLE
Speech Recognition	Acoustic models, language models, streaming ASR	MLE (specialized)
A/B Testing Platform	Experiment platform, statistical rigor, automation	MLOps, DS

Priority Order for Practice

Priority Study Order - Framework first, then Rubric, then the design problems in order of relevance to your target role (see the "Read If" column above).

Quick Reference: The Framework in 60 Seconds

60-Second Answer

"For any ML system design question, I follow a 6-step framework: (1) Requirements - clarify functional and non-functional constraints. (2) Problem formulation - translate the business goal into an ML objective with metrics. (3) Features and data - identify data sources, engineer features, handle labels. (4) Model - start with a baseline, iterate toward complexity with justification. (5) Serving - real-time vs. batch, latency optimization, failure handling. (6) Evaluation - offline metrics, online A/B testing, monitoring for drift, and a plan for iteration. The key is to cover all six steps in 45 minutes, spending roughly 5-8 minutes on each, rather than going deep on modeling and ignoring serving and evaluation."
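The time split in the answer above can be checked with a little arithmetic. The sketch below is illustrative only: the `time_budget` helper, its parameters, and the weighting scheme are assumptions made for this example, not part of the framework itself. It divides a 45-minute round across the six steps and shows that an even split gives 7.5 minutes per step, which sits inside the suggested 5-8 minute range.

```python
# Hypothetical helper: names and defaults are illustrative assumptions.
STEPS = [
    "Requirements",
    "Problem formulation",
    "Features and data",
    "Model",
    "Serving",
    "Evaluation",
]


def time_budget(total_minutes=45.0, weights=None):
    """Return {step: (start_minute, end_minute)} for a round of total_minutes.

    Each step gets a share of the round proportional to its weight.
    """
    if weights is None:
        weights = [1.0] * len(STEPS)  # even split by default
    if len(weights) != len(STEPS):
        raise ValueError("expected one weight per step")
    total_weight = sum(weights)
    schedule = {}
    start = 0.0
    for step, weight in zip(STEPS, weights):
        minutes = total_minutes * weight / total_weight
        schedule[step] = (start, start + minutes)
        start += minutes
    return schedule


if __name__ == "__main__":
    # Print a checkpoint table to glance at during practice rounds.
    for step, (start, end) in time_budget().items():
        print(f"{start:4.1f}-{end:4.1f} min  {step}")
```

Passing custom weights, for example `time_budget(weights=[1, 1, 1, 2, 1, 1])`, shows how quickly a double share for modeling squeezes every other step below 7 minutes, which is the failure mode the answer warns against.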

Spaced Repetition Checkpoints

Day 0: Read the Framework and Rubric pages. Memorize the 6 steps.
Day 3: Design a Recommendation System in 45 minutes. Compare against the model answer.
Day 7: Design Fraud Detection. Focus on real-time serving and class imbalance.
Day 14: Do a mock system design with a friend. Get feedback on structure and trade-off discussion.
Day 21: Design 2 more problems from the list. By now, the framework should feel natural.

What's Next

Start with The Design Framework - it's the foundation for every problem in this section.
