ML System Design - The Most Differentiated Round

Reading time: ~15 min | Interview relevance: Critical | Roles: MLE, AI Eng, MLOps (Senior+)

The Real Interview Moment

You're 5 minutes into a system design round. The interviewer opened with: "Design a recommendation system for an e-commerce marketplace." You started drawing boxes - "data pipeline," "model," "serving layer." At the 10-minute mark, the interviewer interrupts: "You've been describing components. I want to hear about trade-offs. Why this model over that one? What happens when a new user has no history? How do you measure success?"

This is the system design round. It's not about drawing the "correct" architecture diagram - it's about demonstrating that you can reason through ambiguity, make justified trade-offs, and think about ML systems as living, evolving products. This section gives you a framework, a rubric, and 13 complete design problems to practice with.

What You Will Master

  • A repeatable framework for any ML system design question
  • The exact rubric interviewers use to score your answer
  • 13 complete design problems covering the full range of ML systems
  • How AI/LLM system design differs from traditional ML system design
  • Time management strategies for the 45-minute round

Section Roadmap

Page | What It Covers | Read If
Design Framework | The 6-step RPFMSE framework in detail | Everyone - this is your foundation
Evaluation Rubric | How interviewers score each component | Everyone - know what gets you "Strong Hire"
Recommendation System | Collaborative filtering, content-based, hybrid, cold start | MLE, AI Eng
Search Ranking | Query understanding, retrieval, ranking, personalization | MLE, AI Eng
Fraud Detection | Real-time scoring, class imbalance, adversarial evolution | MLE
News Feed Ranking | Multi-objective optimization, real-time features, diversity | MLE
Ad Click Prediction | Feature stores, real-time bidding, calibration at scale | MLE
Content Moderation | Multi-modal classification, human-in-the-loop, appeals | MLE, AI Eng
Autonomous Driving | Perception, prediction, planning, safety | MLE (specialized)
AI Chatbot | RAG, guardrails, conversation management, evaluation | AI Engineer
Visual Search | Embedding models, ANN indexing, cross-modal search | MLE
Anomaly Detection | Unsupervised methods, streaming, alerting | MLE, MLOps
Machine Translation | Encoder-decoder, quality estimation, low-resource | MLE
Speech Recognition | Acoustic models, language models, streaming ASR | MLE (specialized)
A/B Testing Platform | Experiment platform, statistical rigor, automation | MLOps, DS

Priority Order for Practice

Study the Design Framework first, then the Evaluation Rubric, then the design problems in order of relevance to your target role.

Quick Reference: The Framework in 60 Seconds

60-Second Answer

"For any ML system design question, I follow a 6-step framework: (1) Requirements - clarify functional and non-functional constraints. (2) Problem formulation - translate the business goal into an ML objective with metrics. (3) Features and data - identify data sources, engineer features, handle labels. (4) Model - start with a baseline, iterate toward complexity with justification. (5) Serving - real-time vs. batch, latency optimization, failure handling. (6) Evaluation - offline metrics, online A/B testing, monitoring for drift, and a plan for iteration. The key is to cover all six steps in 45 minutes, spending roughly 5-8 minutes on each, rather than going deep on modeling and ignoring serving and evaluation."

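The time guidance in the 60-second answer can be checked with a little arithmetic. The sketch below is only an illustration: the per-step minutes are hypothetical values picked from the 5-8 minute range, not an allocation prescribed by the framework, and the names `STEP_BUDGET_MIN`, `checkpoints`, and `slack` are invented for this example.

```python
# Hypothetical per-step budget in minutes. The values are illustrative,
# chosen from the 5-8 minute range; adjust them to the question at hand.
STEP_BUDGET_MIN = {
    "Requirements": 6,
    "Problem formulation": 6,
    "Features and data": 8,
    "Model": 8,
    "Serving": 7,
    "Evaluation": 7,
}
ROUND_LENGTH_MIN = 45


def checkpoints(budget):
    """Return (step, start_minute, end_minute) tuples in framework order."""
    schedule = []
    elapsed = 0
    for step, minutes in budget.items():
        schedule.append((step, elapsed, elapsed + minutes))
        elapsed += minutes
    return schedule


def slack(budget, round_length=ROUND_LENGTH_MIN):
    """Minutes left over for interviewer questions; negative means over budget."""
    return round_length - sum(budget.values())


if __name__ == "__main__":
    for step, start, end in checkpoints(STEP_BUDGET_MIN):
        print(f"{start:>2}-{end:>2} min  {step}")
    print(f"Slack: {slack(STEP_BUDGET_MIN)} min")
```

With these assumed numbers the six steps take 42 minutes, which leaves 3 minutes of slack for interruptions and follow-up questions. A budget at the top of the range (8 minutes for every step) would overrun a 45-minute round, which is one reason to decide in advance where you will go shallow.
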
Spaced Repetition Checkpoints

  • Day 0: Read the Framework and Rubric pages. Memorize the 6 steps.
  • Day 3: Design a Recommendation System in 45 minutes. Compare against the model answer.
  • Day 7: Design Fraud Detection. Focus on real-time serving and class imbalance.
  • Day 14: Do a mock system design with a friend. Get feedback on structure and trade-off discussion.
  • Day 21: Design 2 more problems from the list. By now, the framework should feel natural.
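
If it helps to put the checkpoints above on a calendar, the day offsets can be turned into dates. This is a minimal sketch using only the Python standard library; the `CHECKPOINTS` list and `build_schedule` function are names invented for this example, and the start date is an arbitrary placeholder.

```python
from datetime import date, timedelta

# Day offsets and tasks copied from the checkpoint list above.
CHECKPOINTS = [
    (0, "Read the Framework and Rubric pages; memorize the 6 steps"),
    (3, "Design a Recommendation System in 45 minutes"),
    (7, "Design Fraud Detection"),
    (14, "Mock system design with a friend"),
    (21, "Design 2 more problems from the list"),
]


def build_schedule(start):
    """Map each day offset to a calendar date, counting from `start`."""
    return [(start + timedelta(days=offset), task) for offset, task in CHECKPOINTS]


if __name__ == "__main__":
    # Arbitrary example start date; replace with the day you begin.
    for day, task in build_schedule(date(2025, 1, 6)):
        print(f"{day.isoformat()}  {task}")
```
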

What's Next

Start with The Design Framework - it's the foundation for every problem in this section.

© 2026 EngineersOfAI. All rights reserved.