Ambiguity and Prioritization - Making Progress When the Path Is Unclear
Reading time: ~35 min | Interview relevance: Critical | Roles: MLE, Applied Scientist, Research Scientist, AI Engineer, ML Tech Lead, ML Manager
The Real Interview Moment
You are interviewing for an applied scientist role at a company that is building its first ML-powered product. The director of engineering, who has been friendly and conversational for the first twenty minutes, suddenly shifts gears: "Imagine you join our team. You are the first ML hire. The CEO has a vague idea that 'we should use AI to improve user retention.' There is no ML infrastructure, no labeled data, and no one on the team has shipped an ML model before. What do you do in your first 90 days?"
You freeze - not because you lack ML knowledge, but because the question has no single right answer. You could talk about data collection, infrastructure setup, hiring, a quick win to demonstrate value, or a long-term strategy. Everything feels relevant and nothing feels like the "answer."
You start talking about building a feature store and the interviewer's eyes glaze over. You pivot to talking about hiring and she interrupts: "Assume you can't hire for the first six months." You try again with data labeling and she asks: "How do you know what to label if you haven't defined the problem yet?"
Here is what went wrong: you were answering a technical question when she was asking a prioritization question. She does not want to know everything you could do - she wants to know how you decide what to do first, and second, and third, and what to explicitly not do. She wants to see you navigate ambiguity in real time.
This chapter teaches you to thrive when the path is unclear - to break ambiguous problems into concrete steps, to prioritize ruthlessly, and to communicate your reasoning so that everyone from your manager to the CEO understands why you are doing what you are doing and not something else.
What You Will Master
- Why ambiguity questions dominate AI/ML interviews
- The structured approach to navigating ML-specific ambiguity
- How to prioritize experiments, features, and technical investments
- Research vs production tradeoffs - the eternal tension
- The "first 90 days" framework that works for any ML role
- Scoping ML projects when requirements are vague
- Saying no - how to deprioritize gracefully
- Communicating your prioritization reasoning to different audiences
Self-Assessment: Where Are You Now?
| Level | Description | Target |
|---|---|---|
| Overwhelmed | "When everything is ambiguous, I don't know where to start" | Read everything - the structured frameworks will give you a starting point for any situation |
| Reactive | "I can handle ambiguity once I get going, but I struggle to structure my thinking upfront" | Focus on Parts 2-3 - build your prioritization frameworks |
| Strategic | "I navigate ambiguity well in practice but struggle to explain my reasoning in interviews" | Focus on Parts 5-7 - practice articulating your approach |
Part 1 - Why Ambiguity Questions Dominate ML Interviews
ML Is Inherently Ambiguous
Unlike traditional software engineering, where requirements can be specified precisely ("the button should be blue, 200px wide, and trigger a POST request"), ML projects are ambiguous by nature:
What the Question Actually Evaluates
"Companies ask ambiguity questions because ML is inherently uncertain at every stage - problem definition, data availability, model feasibility, and business impact. They want to see that you can break a vague goal into concrete questions, prioritize those questions based on information value and reversibility, make progress with imperfect information, and communicate your reasoning clearly. The worst answer is listing everything you could possibly do. The best answer is explaining what you would do first and why that specific sequence reduces the most uncertainty the fastest."
What Interviewers Actually Write Down
| Interviewer Notes | Signal |
|---|---|
| "Immediately asked clarifying questions before proposing solutions" | Strong positive - shows structured thinking |
| "Explicitly stated what they would NOT do and why" | Strong positive - shows prioritization maturity |
| "Proposed time-boxed explorations with clear decision points" | Strong positive - shows comfort with uncertainty |
| "Adapted their plan when I changed the constraints" | Positive - shows flexibility |
| "Listed every possible thing they could do without prioritizing" | Negative - information dump, not strategy |
| "Jumped straight to building infrastructure without understanding the problem" | Negative - solution before problem |
| "Could not explain why they would do X before Y" | Negative - no prioritization reasoning |
| "Became visibly uncomfortable when I said 'you don't know that yet'" | Negative - cannot tolerate ambiguity |
Part 2 - The Ambiguity Navigation Framework (ANF)
When you get an ambiguous question, use this five-step framework to structure your thinking in real time.
Step 1: Clarify the Objective
Before doing anything, make sure you understand what success looks like.
BAD: "The CEO wants to use AI for retention."
BETTER: "Let me make sure I understand the objective. We want to reduce
churn. Is that churn as in users leaving the platform entirely,
or churn as in reduced engagement? And what is the current churn
rate? Is there a target?"
You are not being difficult. You are demonstrating that the first step in navigating ambiguity is converting a vague goal into a measurable objective. If the interviewer says "assume you don't know yet," that is fine - acknowledge the ambiguity and explain how you would find out.
Step 2: Map What You Know and What You Do Not Know
Divide the problem space into:
| Category | Example |
|---|---|
| Known knowns | "We have 2 years of user behavior data in a PostgreSQL database" |
| Known unknowns | "We do not know if churn is predictable from behavior data" |
| Unknown unknowns | "There may be external factors driving churn we have not considered" |
Your plan should focus on converting known unknowns into known knowns as fast as possible. The unknown unknowns will reveal themselves as you make progress.
Step 3: Identify the Highest-Uncertainty Questions
Not all unknowns are equal. Rank them by:
- Information value: If we answer this question, how much does it change our approach?
- Reversibility: If we make a wrong assumption here, how costly is it to fix later?
- Speed to answer: How quickly can we resolve this uncertainty?
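The ranking above can be made concrete with a rough score. This is a minimal sketch, not a prescribed formula: the `Unknown` class, the 1-3 scales, the example questions, and the rule "information value times cost of being wrong, divided by days to answer" are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Unknown:
    question: str
    information_value: int  # 1 (low) to 3 (high): how much the answer changes the plan
    cost_if_wrong: int      # 1 (easy to reverse) to 3 (hard to reverse)
    days_to_answer: float   # rough estimate of the time needed to resolve it


def priority(u: Unknown) -> float:
    # Hypothetical scoring rule: favor questions that matter a lot,
    # are costly to get wrong, and are quick to answer.
    return u.information_value * u.cost_if_wrong / u.days_to_answer


# Hypothetical unknowns for the retention scenario.
unknowns = [
    Unknown("Is churn predictable from behavior data?", 3, 2, 5),
    Unknown("Which churn definition does the business care about?", 3, 3, 1),
    Unknown("Can the serving stack handle real-time scoring?", 1, 2, 10),
]

ranked = sorted(unknowns, key=priority, reverse=True)
for u in ranked:
    print(f"{priority(u):5.2f}  {u.question}")
```

The numbers matter less than the habit: writing the three dimensions down forces you to say out loud why one question comes before another.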
Step 4: Define Time-Boxed Explorations
Never commit to a full solution when you are still in the ambiguity phase. Instead, define time-boxed explorations:
"I would spend the first two weeks on three focused explorations:
Week 1:
- Exploration A: Pull 6 months of user behavior data and run a simple
churn prediction baseline (logistic regression on basic features).
Success criteria: AUC > 0.65 means there is a signal worth pursuing.
- Exploration B: Interview the customer success team to understand their
qualitative understanding of why users leave. Map their insights to
available data fields.
Week 2:
- Exploration C: Based on results of A and B, prototype the simplest
possible intervention - even if it's just a targeted email to
at-risk users identified by the baseline model.
Decision point at end of Week 2: Is there enough signal to justify a
full ML investment? If AUC < 0.55 and qualitative insights don't map
to measurable signals, we need to rethink whether ML is the right tool."
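The decision point in the plan above can be sketched in a few lines. The example below computes AUC directly from labels and scores (the rank-based definition) and applies the 0.65 and 0.55 thresholds from the plan. The function names, the toy data, and the "inconclusive" middle band are assumptions; the plan itself does not say what to do between the two thresholds.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen churner
    is scored above a randomly chosen non-churner (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def decision(auc_value, go=0.65, stop=0.55):
    """Map a baseline AUC onto the plan's decision point."""
    if auc_value > go:
        return "signal worth pursuing"
    if auc_value < stop:
        # The plan also requires that qualitative insights fail to map
        # to measurable signals before rethinking; not modeled here.
        return "rethink whether ML is the right tool"
    return "inconclusive"  # assumption: the plan leaves this band undefined


# Toy data: 1 = churned, 0 = retained; scores come from a baseline model.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
baseline_auc = auc(labels, scores)
print(round(baseline_auc, 3), "->", decision(baseline_auc))
```

In practice you would compute AUC on a held-out set with a library routine; the point of the sketch is that the go/no-go rule is written down before the experiment runs.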
Step 5: Communicate the Plan and Decision Points
The plan is not just for you - it is for your stakeholders. Frame it as:
"Here is what I plan to do, in what order, and why:
1. [Action] - because [reasoning] - by [date]
2. [Action] - because [reasoning] - by [date]
3. [Decision point] - at this point, we will know [X] and can decide
whether to [continue/pivot/stop]
What I am NOT doing yet, and why:
- [Deferred action] - because [we don't have enough information yet /
this is irreversible and we should wait / the ROI is unclear]
What I need from you:
- [Specific ask] - [reason]
"
Do not present a linear plan that assumes everything will work. "First I will collect data, then I will train a model, then I will deploy it" ignores the reality that each step might fail or reveal that the next step is wrong. Interviewers want to see decision points - moments where you assess what you have learned and potentially change direction. A plan without decision points is not a plan; it is a fantasy.
Part 3 - Prioritizing ML Experiments
One of the most common ambiguity questions is about prioritization: "You have ten ideas for improving the model. How do you decide which to try first?"
The ML Experiment Prioritization Matrix
Evaluate each idea on four dimensions:
| Dimension | Question | Scale |
|---|---|---|
| Expected Impact | If this works, how much does the metric improve? | Low / Medium / High |
| Probability of Success | Based on prior work and intuition, how likely is this to work? | Low / Medium / High |
| Effort | How long does this experiment take to run? | Days / Weeks / Months |
| Learning Value | Even if it fails, do we learn something important? | Low / Medium / High |
Prioritization in Practice
SCENARIO: You are improving a search ranking model. Ten ideas on the board:
| # | Idea | Impact | P(Success) | Effort | Learning |
|---|------|--------|-----------|--------|----------|
| 1 | Add user click history features | High | High | 1 week | Medium |
| 2 | Switch from XGBoost to transformer | High | Medium | 4 weeks | Medium |
| 3 | Fix known data pipeline bug | Medium | High | 2 days | Low |
| 4 | Add image embeddings | Medium | Low | 3 weeks | High |
| 5 | Tune hyperparameters | Low | High | 3 days | Low |
| 6 | Collect human relevance judgments | High | High | 6 weeks | High |
| 7 | A/B test current model vs random | Low | High | 1 week | High |
| 8 | Implement real-time features | High | Medium | 8 weeks | Medium |
| 9 | Add query expansion | Medium | Medium | 2 weeks | Medium |
| 10 | Redesign evaluation metrics | Medium | High | 1 week | High |
PRIORITIZED ORDER:
1. #3 (Fix data bug) - high certainty, low effort, removes noise
from all future experiments
2. #10 (Redesign eval metrics) - if our metrics are wrong, every
experiment result is unreliable
3. #7 (A/B test vs random) - establishes the value of ML at all;
high learning value for 1 week of effort
4. #1 (Click history features) - high impact, high probability,
one week - best expected value
5. #6 (Human judgments) - START this now because it takes 6 weeks,
and run it in parallel with #1 and the experiments that follow
6. #9 (Query expansion) - medium everything, good next step after
quick wins are captured
7. #2 (Transformer) - high potential but 4 weeks of effort; defer
until simpler approaches plateau
8. #4 (Image embeddings) - interesting but low probability; run
only if data shows visual queries are underserved
9. #5 (Hyperparameter tuning) - diminishing returns; do last
10. #8 (Real-time features) - high investment, defer until
architecture supports it
KEY PRINCIPLE: Fix the foundation first (data, metrics), then
capture quick wins, then invest in larger bets.
"I prioritize ML experiments using four dimensions: expected impact, probability of success, effort, and learning value. I always start by fixing the foundation - data quality issues and evaluation metrics - because every subsequent experiment depends on these being right. Then I pursue quick wins with high expected value (high impact times high probability divided by effort). I start long-running activities (data collection, labeling) in parallel. I defer large architectural changes until simpler approaches plateau. And I explicitly deprioritize low-impact experiments regardless of how easy they are - effort is never zero, and we have finite attention."
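The expected-value heuristic (impact times probability divided by effort) can be applied to the ten ideas as a first pass. This is a sketch under stated assumptions: the Low/Medium/High levels are mapped to 1/2/3, effort is converted to working days at 5 days per week, and learning value is left out of the score.

```python
# Levels from the matrix mapped to numbers; the 1/2/3 scale is an assumption.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# (id, idea, impact, probability of success, effort in working days)
# Effort assumes 5 working days per week.
IDEAS = [
    (1, "Add user click history features", "High", "High", 5),
    (2, "Switch from XGBoost to transformer", "High", "Medium", 20),
    (3, "Fix known data pipeline bug", "Medium", "High", 2),
    (4, "Add image embeddings", "Medium", "Low", 15),
    (5, "Tune hyperparameters", "Low", "High", 3),
    (6, "Collect human relevance judgments", "High", "High", 30),
    (7, "A/B test current model vs random", "Low", "High", 5),
    (8, "Implement real-time features", "High", "Medium", 40),
    (9, "Add query expansion", "Medium", "Medium", 10),
    (10, "Redesign evaluation metrics", "Medium", "High", 5),
]


def expected_value_score(impact, p_success, effort_days):
    """Impact times probability of success, divided by effort."""
    return LEVEL[impact] * LEVEL[p_success] / effort_days


ranked = sorted(IDEAS, key=lambda idea: expected_value_score(*idea[2:]), reverse=True)
for idea_id, name, impact, p_success, days in ranked:
    print(f"#{idea_id:<2} {expected_value_score(impact, p_success, days):.2f}  {name}")
```

The raw score puts #3, #1, and #10 at the top, which broadly matches the prioritized order. It also ranks hyperparameter tuning (#5) fourth simply because it is cheap, which is the effort bias described in this chapter, and it ignores learning value and the fact that every experiment depends on sound data and metrics. Treat the score as a starting point and apply judgment on top of it.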
The Anti-Patterns in Prioritization
| Anti-Pattern | Why It Is Wrong | What to Do Instead |
|---|---|---|
| Shiny object syndrome | Jumping to transformers before trying logistic regression | Always start with the simplest baseline |
| Completionism | Trying to do everything in parallel | Explicitly say what you are NOT doing and why |
| Local optimization | Tuning hyperparameters when the features are wrong | Fix the highest-leverage component first |
| Effort bias | Prioritizing easy things regardless of impact | Easy + low impact is still a waste of time |
| Research bias | Pursuing interesting ideas over impactful ones | Save research for 20% time; production experiments should maximize business value |
| Ignoring the foundation | Training models on buggy data with wrong metrics | Always fix data and evaluation before modeling |
Part 4 - Research vs Production Tradeoffs
The research-versus-production question appears in nearly every applied scientist and senior MLE interview. The tension between research exploration and production delivery is fundamental to ML organizations.
The Spectrum
The Interview Question
"How do you balance research exploration with production delivery? Your team has a quarterly OKR to improve CTR by 5%, but you believe investing in a new model architecture could yield 20% improvement - though it would take 6 months and might not work."
The Strong Answer Framework
1. Acknowledge the tension honestly.
"This is the central tension of applied ML. If you only ship incremental improvements, you miss breakthroughs. If you only pursue moonshots, you never deliver value. The right balance depends on the organization's maturity, the business urgency, and the team's capacity."
2. Propose a portfolio approach.
I split the team's effort into three categories:
70% - Production improvements
Incremental wins with high confidence.
This is how we hit quarterly OKRs.
Examples: feature engineering, threshold tuning, data quality fixes.
20% - Applied research
Promising approaches with medium confidence.
Time-boxed to 4-6 weeks. Clear success criteria.
Example: "We will test the new architecture on a subset of traffic.
If it beats the baseline by >10% offline, we invest in production."
10% - Exploration
Long-shot ideas with high upside.
No OKR pressure. Learning is the deliverable.
Example: "Explore whether multimodal embeddings improve search for
visual queries." One person, one month, writeup at the end.
3. Define the decision criteria for promoting research to production.
An applied research project graduates to production when:
1. Offline metrics show >X% improvement over the current system
2. The latency and cost are within acceptable bounds (or have a
clear path to being acceptable)
3. The approach is reproducible (not a one-time result)
4. There is a viable path to production that does not require
rewriting the entire serving stack
If these criteria are not met after the time box, we write up
what we learned and move on. No sunk cost reasoning.
4. Address the specific scenario.
"For the 6-month architecture bet: I would not commit the whole team for 6 months. I would allocate one senior engineer for 6 weeks to validate the hypothesis on a subset of data. If the offline results are promising, I would propose a phased plan: prototype on 1% of traffic in month 3, evaluate in month 4, and make a full investment decision based on real-world results. Meanwhile, the rest of the team delivers the quarterly OKR through incremental improvements."
Never say "research is more important than production" or "we should always ship the safe option." Both extremes signal poor judgment. The interviewer wants to see that you can hold both priorities simultaneously and have a principled way to allocate effort between them.
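The graduation criteria from step 3 can be written down as an explicit checklist so the end-of-time-box decision is mechanical rather than emotional. This is an illustrative sketch: `ResearchResult`, `unmet_criteria`, `decide`, and the example values are hypothetical, and the lift threshold stays a parameter because the criteria leave it as X.

```python
from dataclasses import dataclass


@dataclass
class ResearchResult:
    offline_lift_pct: float       # improvement over the current system, in percent
    latency_cost_ok: bool         # within bounds, or a clear path to get there
    reproducible: bool            # held up on a rerun, not a one-time result
    has_path_to_production: bool  # no rewrite of the serving stack required


def unmet_criteria(result, min_lift_pct):
    """Return the names of the graduation criteria that are not satisfied."""
    unmet = []
    if result.offline_lift_pct <= min_lift_pct:
        unmet.append("offline lift")
    if not result.latency_cost_ok:
        unmet.append("latency/cost")
    if not result.reproducible:
        unmet.append("reproducibility")
    if not result.has_path_to_production:
        unmet.append("path to production")
    return unmet


def decide(result, min_lift_pct):
    unmet = unmet_criteria(result, min_lift_pct)
    if not unmet:
        return "graduate to production"
    # No sunk cost reasoning: record what was learned and stop.
    return "write up and move on (unmet: " + ", ".join(unmet) + ")"


# Hypothetical result at the end of a time box.
candidate = ResearchResult(
    offline_lift_pct=12.0,
    latency_cost_ok=False,
    reproducible=True,
    has_path_to_production=True,
)
print(decide(candidate, min_lift_pct=10.0))
```

Listing the unmet criteria, rather than returning a bare yes or no, also gives you the content of the writeup.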
The Research-to-Production Pipeline
Example STAR Story: Balancing Research and Delivery
Situation: "I was on an applied ML team responsible for a recommendation system driving $15M/quarter in revenue. We had quarterly OKRs for engagement improvement. At the same time, a new retrieval architecture - dense retrieval with learned embeddings - was showing promise in academic papers and could potentially double our recall."
Task: "I needed to figure out whether to invest in the new architecture or continue with incremental improvements to our existing TF-IDF + collaborative filtering system. The team had six engineers, and the quarterly OKR required at least 3% CTR improvement."
Action: "I proposed a portfolio approach. Four engineers focused on production improvements - feature engineering, candidate generation tuning, and fixing a known cold-start issue. These were high-confidence improvements that I estimated would yield the 3% target.
Simultaneously, I allocated myself and one other senior engineer to a 4-week time-boxed exploration of dense retrieval. We set explicit success criteria: if offline recall@100 improved by more than 20% on our eval set, we would invest in a production prototype. If not, we would write up the findings and return to production work.
By week 3, we had results: dense retrieval improved recall@100 by 34%, but inference latency was 12x higher than our production SLA. I did not abandon the project - instead, I proposed a hybrid approach: use dense retrieval for offline candidate generation (where latency did not matter) and keep the existing system for real-time re-ranking.
I presented this to my manager with a clear ask: dedicate 2 additional engineer-weeks to build the hybrid system, with the explicit understanding that if the A/B test did not show improvement, we would revert."
Result: "The production team hit the 3% OKR through incremental improvements. The hybrid retrieval system launched in a small A/B test and showed a 7% additional CTR improvement - more than doubling our quarterly gains. The approach became the foundation for our next-generation recommendation architecture. The key learning was that the portfolio approach let us pursue a high-upside bet without risking the quarterly commitment."
Part 5 - The First 90 Days Framework
This is one of the most common senior-level interview questions. It appears in almost every Staff+ MLE, Tech Lead, and ML Manager interview.
The Question
"You join our team as the [role]. What do you do in your first 90 days?"
The Framework
Days 1-30: Learn
Goal: Build context and credibility before proposing changes.
WEEK 1: Understand the landscape
- Meet every team member 1:1. Ask: "What is working well? What is
frustrating? What would you change if you could?"
- Read the existing codebase, documentation, and past design docs
- Understand the data: what do we have, where does it live, how
fresh is it, what are the known quality issues?
- Map the ML system architecture end-to-end (data -> training ->
serving -> monitoring)
WEEK 2: Understand the business
- Meet the product manager. Understand the product roadmap, key
metrics, and user needs
- Meet the data analyst. Understand the current performance metrics
and trends
- Understand the business model: how does ML impact revenue, cost,
or user experience?
- Identify the gap between what the business wants and what ML
currently delivers
WEEK 3: Understand the gaps
- Run the existing ML pipeline end-to-end. Note every friction
point, undocumented step, and failure mode
- Review the evaluation methodology: are we measuring the right
things? Are the metrics trustworthy?
- Identify the biggest bottleneck: is it data quality, model
quality, serving infrastructure, or something else?
WEEK 4: Synthesize and share
- Write a brief document: "What I've learned in 30 days"
- Current state of ML systems
- Top 3 opportunities (with rough effort/impact estimates)
- Top 3 risks
- Proposed plan for Days 31-60
- Share it with your manager and key stakeholders. Get feedback.
Do not skip the learning phase and jump straight to proposing changes. The most common mistake new hires make is suggesting solutions before understanding context. Even if you see obvious improvements, spend the first two weeks listening. You will discover that the "obvious" fix was already tried and failed for reasons you do not yet understand, or that there is a political reason why something has not been changed.
Days 31-60: Deliver
Goal: Ship a quick win that builds credibility and demonstrates value.
CRITERIA FOR THE FIRST WIN:
- Achievable in 2-3 weeks by you alone (no dependencies)
- Has visible, measurable impact
- Is low risk (not touching critical production systems)
- Addresses a pain point the team already recognizes
GOOD FIRST WINS:
- Fix a known data quality issue that everyone complains about
- Add monitoring/alerting to a production model that currently
runs without observability
- Improve the evaluation suite so the team can iterate faster
- Automate a manual step in the ML pipeline
- Improve documentation for a critical but undocumented system
- Build a simple baseline model for a problem the team has been
overthinking
BAD FIRST WINS:
- Rewrite the serving infrastructure in a new framework
- Propose a new model architecture that requires months of work
- Reorganize the codebase according to your preferences
- Introduce a new tool that requires everyone to change their
workflow
Example STAR Story: Choosing the Right First Win
Situation: "I joined a team of four ML engineers at a mid-stage startup as the senior ML engineer. The team had a recommendation model in production serving 2M users, but they had no monitoring - they only knew the model was performing poorly when users complained on social media or the product manager noticed engagement drops in their weekly report."
Task: "In my first 30 days, I identified three major opportunities: model monitoring (medium effort, high impact), evaluation framework improvement (medium effort, medium impact), and a model architecture upgrade (high effort, potentially high impact). I needed to choose my first deliverable."
Action: "I chose monitoring as my first win for three specific reasons. First, it was the highest-risk gap - without monitoring, we could not detect degradation, which meant any future model improvements could regress without us knowing. Second, it was something the entire team recognized as painful - every engineer I talked to in my first week mentioned it. Third, it was achievable by me alone in 2-3 weeks without requiring changes to anyone else's workflow.
I built a monitoring dashboard that tracked three things: prediction distribution drift (comparing today's predictions against the training distribution), feature drift (detecting when input features changed significantly), and business metric correlation (tracking the relationship between model confidence and downstream engagement). I set up Slack alerts for distribution shifts exceeding 2 standard deviations.
Within the first week of monitoring being live, it caught a feature drift caused by a data pipeline change that no one had noticed. The model's precision had degraded by 8% over two weeks. I fixed the pipeline issue and restored performance."
Result: "The monitoring system became the team's most-valued infrastructure. It caught three more issues in the following quarter before they affected users. More importantly, it gave me credibility - the team saw me as someone who fixed real problems rather than someone who arrived with opinions. This credibility made it much easier to propose larger changes in my second and third months."
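The drift alert in the story above can be sketched as a simple check. The story does not say exactly how "2 standard deviations" was measured, so this example assumes one plausible reading: alert when the mean of today's predictions sits more than two training-set standard deviations from the training-set mean. The function names and sample values are hypothetical.

```python
import statistics


def drift_z_score(reference, current):
    """How many reference standard deviations the current mean
    sits from the reference mean."""
    mu = statistics.mean(reference)
    sigma = statistics.stdev(reference)
    if sigma == 0:
        raise ValueError("reference values have zero variance")
    return abs(statistics.mean(current) - mu) / sigma


def should_alert(reference, current, threshold=2.0):
    """Assumed alert rule: fire when the shift exceeds the threshold."""
    return drift_z_score(reference, current) > threshold


# Hypothetical model scores: training-time reference vs. two production days.
reference = [0.20, 0.30, 0.25, 0.35, 0.30, 0.28, 0.22, 0.31]
stable_day = [0.27, 0.29, 0.26]
shifted_day = [0.60, 0.65, 0.70]

print("stable day alert:", should_alert(reference, stable_day))
print("shifted day alert:", should_alert(reference, shifted_day))
```

A mean shift is the crudest possible drift signal. It misses changes in shape that leave the mean untouched, which is one reason the story tracks feature drift and business metric correlation alongside it.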
Days 61-90: Multiply
Goal: Start multiplying your impact beyond your individual work.
ACTIVITIES:
- Propose and begin a medium-term project (1-2 quarters) based
on what you learned in the first 60 days
- Establish or improve one team process: code review, design
review, experiment tracking, or on-call rotation
- Start mentoring one team member on a skill you are strong in
- Build a relationship with a key partner team (data engineering,
platform, product)
- Present your roadmap proposal to the team and get buy-in
DELIVERABLE:
A 1-2 page document: "ML Roadmap - Next 2 Quarters"
- Where we are today (baseline metrics)
- Where we should be in 6 months (target metrics)
- The 3-5 initiatives that will get us there, prioritized
- What we need (resources, data, infrastructure)
- What we are NOT doing and why
Adapting the Framework by Role
| Role | Days 1-30 Emphasis | Days 31-60 Emphasis | Days 61-90 Emphasis |
|---|---|---|---|
| First ML hire | Understand the data landscape and business problem | Build a baseline model and validate that ML adds value | Propose the ML roadmap and hiring plan |
| Senior MLE joining an existing team | Understand existing systems and team dynamics | Ship a technical improvement and establish trust | Propose architectural improvements and mentor |
| ML Tech Lead | Map stakeholders, understand cross-team dynamics | Fix the highest-impact bottleneck | Establish technical processes and a team roadmap |
| ML Manager | 1:1s with every report, understand career goals | Remove the biggest blocker for the team | Set team OKRs and development plans |
| Applied Scientist | Understand the modeling landscape and evaluation gaps | Improve a key model or establish rigorous evaluation | Propose a research agenda aligned with business goals |
| Research Scientist | Survey the team's current approaches and industry SOTA | Identify the highest-leverage research direction | Deliver initial results and present a research roadmap |
Startups (first ML hire): The "learn" phase is shorter - you might need to deliver value in the first 2-3 weeks. Emphasize speed and scrappiness. Your first win might be proving that ML is even worth investing in.
Big tech (joining an established team): The "learn" phase is longer - there is more organizational complexity to understand. Your first win should demonstrate that you integrate well, not that you are smarter than the existing team.
Research labs (Google DeepMind, FAIR, Anthropic): The framework shifts toward understanding the research landscape, identifying open problems, and proposing a research agenda rather than shipping a production improvement.
Part 6 - Scoping ML Projects When Requirements Are Vague
The Problem
Product stakeholders often come to ML teams with requests like:
- "Can we use AI to improve this?"
- "Our competitor has a recommendation engine. We need one too."
- "Can you build a model that predicts churn?"
These requests are not bad - they are just underspecified. Your job is to convert them into well-scoped ML projects.
The Scoping Conversation
Use this question framework when a stakeholder brings you a vague ML request:
1. PROBLEM DEFINITION
"What specific business outcome are you trying to improve?"
"How would you measure success? What metric moves?"
"What is the current baseline without ML?"
2. USER IMPACT
"Who is the end user of this model's output?"
"How will they interact with it? (ranked list, yes/no, score, text)"
"What happens when the model is wrong? What is the cost of errors?"
3. DATA AVAILABILITY
"What data do we have today that might be relevant?"
"Do we have labeled examples of the outcome we are trying to predict?"
"How much historical data is available?"
4. CONSTRAINTS
"What are the latency requirements? (real-time vs batch)"
"Are there privacy or regulatory constraints on what data we can use?"
"What is the timeline? When does this need to be in production?"
5. ALTERNATIVES
"Have we considered non-ML solutions? (rules, heuristics, manual process)"
"What would a human do today to solve this problem?"
"Is ML the right tool, or is there a simpler approach?"
The One-Pager
After the scoping conversation, write a one-page project brief:
PROJECT: [Name]
OWNER: [You]
DATE: [Date]
STATUS: Scoping
PROBLEM STATEMENT
[2-3 sentences describing the business problem in non-ML terms]
SUCCESS METRIC
[The specific metric that must improve, with a target]
Current baseline: [X]
Target: [Y]
Measurement method: [A/B test / offline evaluation / etc.]
APPROACH
[1 paragraph describing the proposed ML approach at a high level]
DATA REQUIREMENTS
[What data is needed, what is available, what gaps exist]
KEY ASSUMPTIONS
[List the 3-5 biggest assumptions. If any of these are wrong,
the project may not work.]
MILESTONES
1. [Milestone 1] - [Date] - [Deliverable]
2. [Milestone 2] - [Date] - [Deliverable]
3. [Decision point] - [Date] - [Go/no-go criteria]
RISKS
1. [Risk] - [Mitigation]
2. [Risk] - [Mitigation]
NOT IN SCOPE
[What this project explicitly does NOT include]
The format varies by company, but the content is universal. At Amazon, this would be a "one-pager" or "PR/FAQ." At Google, it would be a design doc. At a startup, it might be a Notion page or a Slack message. The point is to create a shared artifact that aligns everyone on the what, why, and how before writing any code.
Example STAR Story: Scoping an Ambiguous ML Request
Situation: "Our head of customer success came to the ML team and said: 'We are losing enterprise customers. Can AI help?' That was the entire brief. No metrics, no data specification, no success criteria."
Task: "I needed to convert this vague request into a scoped project that would either demonstrate clear value or prove that ML was not the right tool - within 4 weeks."
Action: "I started with the scoping conversation. Through three meetings with the customer success team, I learned that 'losing enterprise customers' meant contract non-renewals, which had increased from 8% to 14% over 6 months. The team had theories but no data-driven understanding of why.
I pulled contract renewal data and found we had 18 months of history covering 340 enterprise customers - small for ML, but potentially enough for a simple model. I identified four candidate approaches, ranging from simple to complex:
1. Descriptive analytics - just analyze which features correlate with churn (2 days)
2. Rules-based early warning - threshold alerts on usage metrics (1 week)
3. Logistic regression churn prediction - predict renewal probability (2 weeks)
4. Full ML pipeline with automated interventions - dynamic scoring and triggered actions (2-3 months)
I wrote a one-pager proposing to start with approach 1 as a learning exercise and approach 2 as a quick win, with a decision point at week 3 on whether approach 3 was feasible given the data size."
Result: "The descriptive analysis revealed that the top churn predictor was a 30% decline in feature usage over any 4-week period - a signal the customer success team had suspected but never quantified. The rules-based early warning system based on this insight identified 12 at-risk accounts, and the customer success team saved 8 of them through proactive outreach. We never needed the full ML pipeline - the simple approach captured 80% of the value. The head of customer success later said this was the most impactful project the ML team had ever done for her organization."
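The rules-based early warning from the story above fits in a few lines, which is part of why it beat the full pipeline on effort. This sketch assumes that "a 30% decline over any 4-week period" means comparing each week's usage with usage four weeks later; the function name, account names, and usage numbers are hypothetical.

```python
def at_risk(weekly_usage, window=4, drop_threshold=0.30):
    """True if usage fell by at least drop_threshold between any week
    and the week `window` weeks later."""
    for start in range(len(weekly_usage) - window):
        before = weekly_usage[start]
        after = weekly_usage[start + window]
        # Skip weeks with zero usage to avoid dividing by zero.
        if before > 0 and (before - after) / before >= drop_threshold:
            return True
    return False


# Hypothetical weekly feature-usage counts per account, oldest week first.
accounts = {
    "acme": [100, 98, 95, 80, 65, 60],
    "globex": [50, 52, 49, 51, 50, 53],
}

flagged = [name for name, usage in accounts.items() if at_risk(usage)]
print(flagged)
```

A rule like this is transparent to the customer success team: they can see exactly why an account was flagged, which makes it easier to act on than a model score.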
Part 7 - Saying No and Deprioritizing Gracefully
One of the hardest skills in ML - and one of the most tested in senior interviews - is the ability to say no.
Why Saying No Is Critical in ML
ML teams are perpetually overloaded with requests. Every team in the organization thinks ML could improve their product. If you say yes to everything, you will:
- Spread the team too thin and deliver nothing well
- Accumulate technical debt that slows future work
- Burn out your team
- Destroy the team's credibility when projects fail due to under-resourcing
The Framework for Saying No
1. Validate the request. "This is a great idea. I can see how ML could add value here."
2. Explain the tradeoff. "If we take this on, we would need to deprioritize [X]. Here is the impact of that tradeoff."
3. Offer an alternative. "Instead of building a full ML solution, could we start with a rules-based approach that captures 80% of the value? If it works, we can invest in ML later."
4. Provide a timeline. "This is on our radar for Q3. If the business need is urgent, let us discuss reprioritizing."
The Deprioritization Communication Template
Hi [stakeholder],
Thank you for the proposal to [project]. I agree this could be
valuable - [specific acknowledgment of the value].
After evaluating our current commitments, here is where we stand:
CURRENT PRIORITIES (Q1):
1. [Project A] - [business impact] - ships [date]
2. [Project B] - [business impact] - ships [date]
3. [Project C] - [business impact] - ships [date]
YOUR REQUEST:
- Estimated effort: [X weeks/months]
- Estimated impact: [Y]
- Dependencies: [Z]
If we take this on now, we would need to delay [Project B or C].
The impact of that delay would be [specific consequence].
MY RECOMMENDATION:
Option 1: Add this to Q2 roadmap (no tradeoff, but 3-month delay)
Option 2: Start with a simpler approach now (rules-based, [X] weeks)
          and invest in full ML if the simple approach validates demand
Option 3: Deprioritize [Project C] and start this immediately
          (trade [C's impact] for [this project's impact])
I am happy to discuss which option works best for the business.
- [You]
Example STAR Story: Saying No to a VP
Situation: "The VP of Sales asked our ML team to build a lead scoring model. She had seen it done at her previous company and believed it could increase conversion rates by 30%. She framed it as 'urgent' because Q4 pipeline was looking weak."
Task: "Our team was already fully committed to two projects - a fraud detection model (P0, regulatory deadline) and a recommendation system upgrade (P1, tied to the company's top revenue OKR). Taking on lead scoring would require deprioritizing one of these."
Action: "I did not say no outright. Instead, I did three things.
First, I estimated the effort honestly. After a 2-hour deep dive into the sales data, I concluded that a useful lead scoring model would take 6-8 weeks - not the '2 weeks, how hard can it be?' the VP had assumed. The data was messy, the definition of a 'qualified lead' was inconsistent across sales reps, and we had no ground truth labels for lead quality.
Second, I presented the tradeoff explicitly. I met with the VP and my engineering director together and showed them: 'If we take on lead scoring, we delay the fraud model by 6 weeks, which puts at risk both a regulatory deadline and a potential $2M/quarter in projected revenue impact. Neither is trivial.'
Third, I offered a creative alternative. The sales team already had a CRM with basic scoring rules. I proposed spending 3 days - not 6 weeks - to analyze their existing CRM data and identify the 3 features most predictive of conversion. We could improve their manual scoring rules immediately and build a proper ML model in Q1 when we had capacity.
The VP initially pushed back - 'Can't you just do all three?' I held firm: 'I would love to, and I understand the urgency. But I want to be honest with you about what we can deliver well versus what we would deliver poorly. A poorly built lead scoring model is worse than no model - it will erode your team's trust in ML predictions.'"
Result: "The VP accepted the alternative. The 3-day analysis identified that leads with specific product page visit patterns converted at 4x the rate of other leads. The sales team updated their manual scoring to weight these behaviors, and conversion improved by 11% - not the 30% target, but achieved in 3 days instead of 8 weeks. We built the full ML model in Q1, achieving 24% improvement. The VP later told me she appreciated the honesty and the creative alternative."
"Saying no in ML is about presenting tradeoffs, not just refusing. When a stakeholder brings a request that does not fit the current priorities, I validate the idea, show what we would have to deprioritize to accommodate it, offer a simpler alternative that might capture most of the value, and provide a timeline for when we could take it on fully. This way, the stakeholder makes an informed decision about the tradeoff rather than feeling dismissed. The key is being transparent about capacity constraints and letting the business decide which priority wins."
Part 8 - ML-Specific Ambiguity Scenarios with Model Answers
Scenario 1: The Unclear Success Metric
"The PM says 'make the recommendations better.' How do you proceed?"
Strong Answer:
"'Better' is too vague to optimize for. I would start by understanding what 'better' means to different stakeholders:
- To the PM: Higher engagement (CTR, session duration)?
- To the business: Higher revenue (conversion, basket size)?
- To users: More relevant (satisfaction surveys, reduced bounces)?
- To the content team: Better coverage (long-tail exposure, diversity)?
These can conflict. Higher CTR might mean more clickbait. Higher revenue might mean more expensive items. I would propose a metric hierarchy:
- Primary metric: The one we optimize for (e.g., user satisfaction measured by post-interaction surveys)
- Guardrail metrics: Metrics that must not degrade (e.g., diversity of recommendations, revenue per session)
- Diagnostic metrics: Metrics we track to understand behavior but do not optimize for (e.g., CTR, coverage)
I would get alignment from the PM, the business team, and the content team on this hierarchy before running any experiments."
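One way to make the metric hierarchy concrete is to write the launch rule down before any experiment runs. The sketch below is illustrative only: the `MetricResult` class, the `ship_decision` function, the metric names, and both thresholds are hypothetical, and it deliberately ignores statistical significance, which a real experiment readout would also need to check.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MetricResult:
    """Control vs. treatment value of one metric in an experiment."""
    name: str
    control: float
    treatment: float
    higher_is_better: bool = True

    def relative_change(self) -> float:
        """Signed relative change; positive means the metric improved."""
        change = (self.treatment - self.control) / self.control
        return change if self.higher_is_better else -change


def ship_decision(primary: MetricResult,
                  guardrails: Sequence[MetricResult],
                  min_primary_lift: float = 0.01,
                  max_guardrail_drop: float = 0.02) -> str:
    """Ship only if the primary metric improves and no guardrail degrades.

    Both thresholds are assumptions that stakeholders would agree on up front.
    Diagnostic metrics are intentionally absent: they are tracked, not gated on.
    """
    if primary.relative_change() < min_primary_lift:
        return "no-ship: primary metric did not improve enough"
    for metric in guardrails:
        if metric.relative_change() < -max_guardrail_drop:
            return f"no-ship: guardrail '{metric.name}' degraded"
    return "ship"


primary = MetricResult("survey_satisfaction", control=4.0, treatment=4.2)
guardrails = [
    MetricResult("recommendation_diversity", control=0.50, treatment=0.495),
    MetricResult("revenue_per_session", control=2.00, treatment=1.90),
]
decision = ship_decision(primary, guardrails)
print(decision)
```

Here the primary metric improves, but revenue per session drops by more than the agreed tolerance, so the rule blocks the launch. Agreeing on that outcome in advance is the point of the hierarchy.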
Scenario 2: The Impossible Timeline
"Leadership wants the ML feature launched in 4 weeks. Your estimate is 12 weeks for a proper solution. What do you do?"
Strong Answer:
"I would not say 'it is impossible' and I would not agree to an unrealistic timeline. Instead, I would present a phased plan:
Phase 1 (4 weeks) - MVP: Ship the simplest version that adds value. This might be a rules-based system, a simple model (logistic regression on handcrafted features), or even a curated/manual approach. It will not be the final solution, but it will demonstrate value and buy time.
Phase 2 (weeks 5-8) - Iterate: Improve the model with better features and a more sophisticated architecture. Deploy A/B testing infrastructure so we can measure improvements rigorously.
Phase 3 (weeks 9-12) - Production quality: Add monitoring, alerting, automated retraining, edge case handling, and documentation. This is the difference between a demo and a production system.
The key conversation with leadership is: 'We can have something in production in 4 weeks that adds value. But it will not be production-grade until week 12. If we cut corners on phase 3, we will pay for it in incidents, manual work, and maintenance burden for the next year.'"
Scenario 3: The Data That Does Not Exist
"You need labeled data for your model, but the labeling budget is zero and no labels exist. What do you do?"
Strong Answer:
"I would explore these approaches in order of effort:
- Implicit labels from user behavior: Can clicks, conversions, dwell time, or other behavioral signals serve as noisy labels? This is free but may be biased.
- Heuristic labels from existing systems: Can rules, thresholds, or existing business logic generate labels? These will not be perfect, but they can bootstrap a model.
- Weak supervision: Use labeling function frameworks to combine multiple noisy label sources into higher-quality labels programmatically.
- Active learning: Build a simple model with whatever labels you can get, then use uncertainty sampling to identify the most informative examples for human labeling - maximizing the value of a small labeling budget.
- Transfer learning: Use a model pre-trained on a related task and fine-tune with minimal labeled data.
- Zero-shot / few-shot with LLMs: For text classification tasks, use a large language model to generate initial labels, then validate a sample to estimate quality.
If none of these work, the honest answer is: 'We cannot build a reliable ML model without labels. Here is what it would cost to create a minimal labeled dataset, and here is the expected return on that investment.' Sometimes the right answer is to make the case for the labeling budget rather than trying to work around its absence."
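To show what "combining multiple noisy label sources" can look like, here is a minimal sketch of the weak supervision idea using a plain majority vote. Labeling function frameworks typically learn how much to trust each function; majority vote is a simpler stand-in used here for clarity. The labeling functions, label names, and example texts are all hypothetical.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

ABSTAIN = None
LabelingFunction = Callable[[str], Optional[str]]


def lf_refund(text: str) -> Optional[str]:
    """Heuristic: refund requests are usually complaints."""
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_broken(text: str) -> Optional[str]:
    """Heuristic: mentions of breakage are usually complaints."""
    return "complaint" if "broken" in text.lower() else ABSTAIN


def lf_thanks(text: str) -> Optional[str]:
    """Heuristic: thank-you messages are usually praise."""
    return "praise" if "thank" in text.lower() else ABSTAIN


def majority_vote(text: str,
                  labeling_functions: Sequence[LabelingFunction]) -> Optional[str]:
    """Combine noisy votes; abstain when nothing fires or the top vote ties."""
    votes = []
    for labeling_function in labeling_functions:
        vote = labeling_function(text)
        if vote is not ABSTAIN:
            votes.append(vote)
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence: leave the example unlabeled
    return ranked[0][0]


LFS = [lf_refund, lf_broken, lf_thanks]
texts = [
    "The device arrived broken and I want a refund",
    "Thank you, it works great",
    "Thanks, but it is broken",
    "Where is my order?",
]
labels = [majority_vote(text, LFS) for text in texts]
print(labels)
```

Examples where the functions abstain or disagree stay unlabeled, which makes them natural candidates for the small amount of human labeling that active learning would request.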
Scenario 4: Too Many Stakeholders, Conflicting Priorities
"Three different teams want you to build three different models, all with 'high priority.' You can only do one this quarter. How do you choose?"
Strong Answer:
"I would evaluate each project on five dimensions and make the decision transparent:
| Dimension | Project A | Project B | Project C |
|---|---|---|---|
| Business impact (if successful) | $X revenue | Y% retention | Z% cost reduction |
| Feasibility (data, time, risk) | High | Medium | Low |
| Strategic alignment | Core product | Growth initiative | Operational efficiency |
| Dependencies on us (can they do without?) | No alternative | Rules-based alternative exists | Manual process works, slow but functional |
| Learning value (does this build capability?) | Low (similar to past work) | High (new domain) | Medium |
Then I would present this to my manager and the three stakeholders together - not separately. Separate conversations create a prisoner's dilemma where each stakeholder lobbies individually. A joint conversation forces a collective prioritization.
My recommendation would be based on the analysis, but I would let the business leaders make the final call on which impact matters most this quarter. My job is to make the tradeoffs visible, not to decide business priorities."
Scenario 5: The Model That Works Offline but Fails Online
"Your model shows a 15% improvement in offline evaluation, but the A/B test shows no improvement - maybe even a slight decline. What do you do?"
Strong Answer:
"This is one of the most common and frustrating problems in applied ML. I would investigate systematically:
1. Evaluation methodology check: Is the offline evaluation genuinely representative of production? Common issues: the test set does not match production distribution, the offline metric does not capture the online metric, or there is data leakage in the offline eval.
2. Feature pipeline check: Are the features computed identically in offline evaluation and online serving? Common issues: training-serving skew, different data sources, different aggregation windows, stale feature caches.
3. Traffic analysis: Is the A/B test properly randomized? Are there confounders? Is the sample size sufficient for statistical significance at the expected effect size?
4. User behavior analysis: Even if the model serves better results, users might not behave differently. Is the improvement in a dimension users actually notice? Are we measuring the right proxy for user satisfaction?
5. Latency and user experience: Did the model increase latency? Even small latency increases can offset quality improvements due to user abandonment.
I would systematically rule out each of these causes. If the offline eval is genuinely representative and the online system is serving the model correctly, the most likely explanation is that the offline metric does not capture what actually matters to users - which means we need to revisit our evaluation framework."
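The sample-size question in the traffic analysis step can be checked with a short calculation. This is a sketch using the standard normal-approximation formula for comparing two proportions; the function name, the 10% baseline conversion rate, and the default significance and power levels are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_arm(baseline_rate: float,
                    relative_lift: float,
                    alpha: float = 0.05,
                    power: float = 0.80) -> int:
    """Approximate users needed per arm to detect a relative lift.

    Uses the two-sided, two-proportion normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical scenario: 10% baseline conversion.
n_optimistic = samples_per_arm(baseline_rate=0.10, relative_lift=0.15)
n_realistic = samples_per_arm(baseline_rate=0.10, relative_lift=0.05)
print(n_optimistic, n_realistic)
```

If the test was sized for the offline improvement but the true online effect is much smaller, the required sample grows sharply, and a "no improvement" result may simply mean the test was underpowered.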
Scenario 6: Starting from Zero
"You are the first ML hire at a company with 50 engineers. There is no ML infrastructure, no data pipelines for ML, and the engineering team is skeptical about AI. What do you do?"
Strong Answer:
"Being the first ML hire is as much a change management challenge as a technical challenge. I would follow a specific playbook:
Weeks 1-2 - Understand and listen. I would not propose any ML solutions yet. I would meet with product managers, engineers, and the leadership team to understand the business problems, the data landscape, and the sources of skepticism. The skepticism is usually well-founded - they have seen AI hype and want to know if it is real for their specific problems.
Weeks 3-4 - Identify the lowest-hanging fruit. I would look for a problem where: the data already exists, a simple model (even logistic regression) could add clear value, the impact is measurable, and the engineering effort is minimal. This might be as simple as adding a prediction to an existing workflow - not building a new product.
Weeks 5-8 - Deliver the first win. Build the simplest possible ML solution end-to-end. Use existing infrastructure where possible - do not build an ML platform for a single model. The goal is to show value, not to build infrastructure. If I can show a measurable improvement with a Jupyter notebook and a cron job, that is better than spending 2 months building a perfect pipeline with no model.
Months 3-4 - Build credibility and infrastructure incrementally. Once the first win is visible, I would propose a small infrastructure investment - experiment tracking, model monitoring, a simple deployment pipeline. But only enough infrastructure for the next 2-3 models, not a grand platform vision.
Months 5-6 - Make the hiring case. By now, I should have 1-2 production models, clear business impact, and a backlog of opportunities. This is when I propose hiring the second ML engineer and present a 12-month ML roadmap.
The key insight is that the first ML hire's job is not to build ML systems - it is to prove that ML is worth investing in. You do that with quick, visible wins, not with grand infrastructure projects."
Part 9 - Scope Reduction Strategies
When the Project Is Too Big
A critical prioritization skill is scope reduction - delivering 80% of the value with 20% of the effort.
The Scope Reduction Conversation
When a stakeholder resists scope reduction:
STAKEHOLDER: "I need the full solution - 95% accuracy across all users."
YOU: "I understand. Let me show you two paths:
Path A (Full solution):
- 95% accuracy, all users
- Timeline: 4 months
- Risk: Medium (we may not hit 95%)
- Value delivered in month 1: $0
Path B (Phased approach):
- Phase 1: 80% accuracy, top 20% of users (by revenue)
  Timeline: 3 weeks
  Captures 60% of the total value
- Phase 2: Improve to 90%, expand to top 50% of users
  Timeline: 6 weeks after Phase 1
  Captures 85% of the total value
- Phase 3: Full solution, all users, 95% accuracy
  Timeline: 4 months total (same as Path A)
  Captures 100% of value
Path B delivers the same end result on the same timeline, but
starts delivering value in 3 weeks instead of 4 months. The
only cost is two additional small deployments ahead of the
full one.
Which path would you prefer?"
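The difference between the two paths can be quantified with simple arithmetic. The sketch below is a rough illustration under stated assumptions: 4 months is treated as 16 weeks, each phase delivers its share of value uniformly every week once it is live, and the function and constant names are hypothetical.

```python
from typing import List, Tuple

# (launch_week, fraction of the full solution's weekly value captured)
Phase = Tuple[int, float]


def value_weeks(phases: List[Phase], horizon_weeks: int) -> float:
    """Total value delivered by the horizon, in full-solution weeks."""
    total = 0.0
    for week in range(horizon_weeks):
        # The most recently launched phase determines the value that week.
        live = [fraction for launch, fraction in sorted(phases) if launch <= week]
        total += live[-1] if live else 0.0
    return total


PATH_A = [(16, 1.00)]                          # full solution after ~4 months
PATH_B = [(3, 0.60), (9, 0.85), (16, 1.00)]    # phased launches

value_a = value_weeks(PATH_A, horizon_weeks=16)
value_b = value_weeks(PATH_B, horizon_weeks=16)
print(value_a, round(value_b, 2))
```

Under these assumptions, Path B has delivered roughly 9.5 weeks' worth of full-solution value by the time Path A ships anything, and that head start never goes away.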
Practice Exercises
Exercise 1: The First 90 Days (45 minutes)
Choose a company you are interviewing with. Write a detailed first-90-days plan for the specific role:
- Days 1-30: What would you learn and how?
- Days 31-60: What would you deliver?
- Days 61-90: What would you propose?
Be specific to the company - reference their products, their tech stack (if public), and the role description.
Exercise 2: Experiment Prioritization (30 minutes)
You are improving an ML-powered product of your choice (search, recommendations, fraud detection, etc.). List ten possible improvements. Score each on Impact, Probability, Effort, and Learning Value. Prioritize them and write a one-paragraph justification for your top 3 and your bottom 3.
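If it helps to make the scoring mechanical, the sketch below ranks candidate improvements by Impact x Probability / Effort + Learning Value. The scales (impact 1-5, probability 0-1, effort in person-weeks, learning value 0-1), the reading of the formula as (impact x probability / effort) + learning value, and the example experiments are all assumptions; adjust them to whatever your team agrees on.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Experiment:
    name: str
    impact: float          # 1 (minor) to 5 (major), assumed scale
    probability: float     # chance of success, 0 to 1
    effort: float          # person-weeks, must be > 0
    learning_value: float  # 0 to 1, value of what we learn even on failure

    @property
    def score(self) -> float:
        """Expected impact per unit of effort, plus a bonus for learning."""
        return self.impact * self.probability / self.effort + self.learning_value


def prioritize(experiments: List[Experiment]) -> List[Experiment]:
    """Highest score first."""
    return sorted(experiments, key=lambda e: e.score, reverse=True)


# Hypothetical backlog for a recommendation system.
backlog = [
    Experiment("new model architecture", impact=5, probability=0.3,
               effort=8, learning_value=1.0),
    Experiment("fix label noise", impact=4, probability=0.8,
               effort=2, learning_value=0.5),
    Experiment("add recency feature", impact=2, probability=0.9,
               effort=1, learning_value=0.0),
]
ranked_names = [e.name for e in prioritize(backlog)]
print(ranked_names)
```

The numbers matter less than the conversation they force: writing down a probability and an effort estimate for each idea tends to expose which "high-impact" bets are really long shots.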
Exercise 3: The Scoping Conversation (20 minutes)
A product manager says: "Our customer support team is overwhelmed. Can we use AI to help?" Write out:
- The ten questions you would ask before writing any code
- The one-page project brief based on reasonable assumptions
- The "NOT IN SCOPE" section
Exercise 4: The Deprioritization Email (20 minutes)
Your team is committed to two projects this quarter. A VP asks you to take on a third. Write the deprioritization email using the template from Part 7. Be specific about the tradeoffs.
Exercise 5: Real-Time Ambiguity Navigation (15 minutes)
Have a friend give you a vague ML prompt (e.g., "Use AI to reduce fraud"). Set a 5-minute timer and talk through the ANF out loud:
- Clarify the objective
- Map known vs unknown
- Identify highest-uncertainty questions
- Propose time-boxed explorations
- Communicate the plan
Record yourself and evaluate: did you structure your thinking, or did you ramble?
Exercise 6: Scope Reduction Practice (20 minutes)
Take an ML project you have worked on (or a hypothetical one). Write three versions:
- The full solution (what you would build with unlimited time)
- The 80/20 version (80% of the value in 20% of the time)
- The 1-week MVP (the absolute minimum that proves value)
For each version, note: what you kept, what you cut, and why.
Interview Cheat Sheet
| Concept | Key Point |
|---|---|
| Why ambiguity questions matter | ML is uncertain at every stage - problem, data, model, business. Navigating uncertainty is the job |
| The ANF (5 steps) | Clarify objective, map knowns/unknowns, identify high-uncertainty questions, time-box explorations, communicate plan with decision points |
| Experiment prioritization | Fix the foundation (data, metrics) first, then quick wins, then larger bets. Use Impact x Probability / Effort + Learning Value |
| Research vs production | 70/20/10 portfolio: production improvements, applied research, exploration. Clear graduation criteria |
| First 90 days | Learn (days 1-30), Deliver a quick win (31-60), Multiply your impact (61-90) |
| Scoping vague requests | Ask the 5 question categories, write a one-pager, define NOT IN SCOPE |
| Saying no | Validate, show the tradeoff, offer an alternative, provide a timeline |
| Scope reduction | Reduce population, simplify model, reduce automation, accept lower precision |
| The killer phrase | "Here is what I would explicitly NOT do, and why" |
| Decision points | Every plan needs explicit moments where you assess learning and potentially change direction |
The Phrases That Signal Strong Prioritization
- "The first thing I would do is make sure we are measuring the right thing."
- "Before building any model, I would validate that ML is the right tool for this problem."
- "Here is what I would explicitly NOT do in the first phase, and why."
- "I would time-box this exploration to two weeks. At the end, we decide whether to invest further."
- "The biggest risk is not that we build the wrong model - it is that we solve the wrong problem."
- "Let me present the tradeoffs so we can make this decision together."
The Phrases That Signal Weak Prioritization
- "I would try everything and see what works."
- "The first step is to build a feature store." (jumping to solutions)
- "I would need at least six months before delivering anything." (no quick wins)
- "It depends." (without then explaining what it depends on)
- "I would just ask my manager what to prioritize." (no ownership)
Spaced Repetition Checkpoints
After Reading (Day 0)
- Can you name the five steps of the Ambiguity Navigation Framework?
- Can you describe the four dimensions of the experiment prioritization matrix?
- Can you outline the first-90-days framework (learn/deliver/multiply)?
After 3 Days
- Deliver your first-90-days answer for a specific company in under 5 minutes. Does it sound structured or scattered?
- Take a real ML project you worked on. Apply the experiment prioritization matrix to ten possible improvements. Does the priority order match what actually happened?
- Practice the scoping conversation questions with a non-ML friend playing the PM role.
After 1 Week
- Have someone give you a vague ML prompt you have not seen before. Navigate the ambiguity in real time using the ANF. Time yourself - can you reach a structured plan in under 5 minutes?
- Write a deprioritization email for a real scenario from your career. Does it present tradeoffs clearly?
- Explain the research vs production balance to a non-technical person. Can they follow your reasoning?
After 2 Weeks
- Run through all six ambiguity scenarios from Part 8. Can you handle each one smoothly?
- Have a peer interview you with ambiguity questions and push back on your answers. Can you adapt in real time?
- Reflect: in your career, when have you handled ambiguity well? When have you handled it poorly? What was the difference?
What Comes Next
You now have the frameworks to navigate ambiguity, prioritize ruthlessly, and communicate your reasoning in any AI interview. In the final chapter of this section, Common Behavioral Questions, you will find 30+ of the most frequently asked behavioral questions with model answers - organized by theme, with clear explanations of what the interviewer is evaluating for each. Use it as both a study guide and a quick reference before your interview.
