Google-Style Problems
Reading time: ~45 min | Interview relevance: Critical (Google/Alphabet) | Roles: Software Engineer (ML), Research Engineer, Data Scientist, ML Infrastructure Engineer at Google
Google interviews are different. Not harder, not easier -- different. The emphasis is on clean code, multiple approaches, scalability thinking, and what Google internally calls "Googleyness" (a combination of intellectual humility, collaborative problem-solving, and comfort with ambiguity).
Google interviewers are trained to evaluate you on four signals: coding ability, algorithmic thinking, system design, and leadership/Googleyness. They are explicitly told to look for candidates who think about scale from the start, who discuss tradeoffs rather than jumping to a single solution, and who write code that other engineers would want to maintain.
This list of 30 problems is calibrated to what Google actually asks in ML-related roles, based on publicly reported interview experiences from 2023-2025. Each problem includes the specific Google evaluation lens you should optimize for.
Google Interview Structure (ML Roles)
| Round | Duration | Evaluation Focus | Google-Specific Notes |
|---|---|---|---|
| Coding 1 | 45 min | DSA + clean code | Expect 1 medium + 1 hard, or 2 mediums with follow-ups |
| Coding 2 | 45 min | ML-flavored coding | Implementation of ML algorithms or data processing |
| ML Design | 45 min | End-to-end ML system | Focus on Google-scale problems (billions of users) |
| System Design | 45 min | Infrastructure design | Distributed systems, data pipelines |
| Googleyness & Leadership | 45 min | Behavioral + values | Collaboration, handling ambiguity, navigating disagreement |
:::tip What Makes Google Interviews Unique
- Multiple approaches expected. Google interviewers want you to discuss 2-3 approaches before coding. Even if you know the optimal solution, start with brute force.
- Scale is assumed. Every problem should be reasoned about at Google scale (billions of data points, millions of QPS).
- Code quality matters. Google cares about readable, maintainable code. Use descriptive variable names, handle edge cases, and add comments for non-obvious logic.
- Follow-ups are the real test. The initial problem is the warm-up. The 2-3 follow-up questions reveal your depth.
:::
Section 1: Coding Problems (12 Problems)
Google coding interviews emphasize algorithmic thinking, clean implementation, and the ability to handle follow-up questions that increase complexity.
Core DSA (Google Style)
| # | Problem | Difficulty | Time | Key Pattern | Google Lens | Follow-Up to Expect |
|---|---|---|---|---|---|---|
| 1 | Merge K Sorted Lists | Hard | 30 min | Min-heap merge | Scale: what if K = 10,000? | Distributed merge; merge from K different machines |
| 2 | Word Search II | Hard | 35 min | Trie + backtracking | Code quality: clean trie implementation | How would you handle a dictionary of 1M words? |
| 3 | Course Schedule II (All Orderings) | Medium | 25 min | Topological sort + DFS | Multiple approaches: Kahn's vs. DFS-based | Detect which specific courses form cycles |
| 4 | Sliding Window Maximum | Hard | 30 min | Monotonic deque | Optimization: O(1) max query | Extend to sliding window median |
| 5 | Serialize and Deserialize a Binary Tree | Hard | 30 min | BFS/DFS encoding | Design: format choice (JSON vs. custom) | Handle N-ary tree; handle very deep trees |
| 6 | Longest Substring with At Most K Distinct Characters | Medium | 20 min | Sliding window + hash map | Clean code: readable window logic | What if characters are Unicode? Memory implications? |
ML-Flavored Coding (Google Style)
| # | Problem | Difficulty | Time | Key Pattern | Google Lens | Follow-Up to Expect |
|---|---|---|---|---|---|---|
| 7 | Implement Reservoir Sampling | Medium | 20 min | Probabilistic sampling | Proof: can you prove uniform probability? | Weighted reservoir sampling; distributed sampling |
| 8 | Implement a Streaming Top-K with Bounded Memory | Medium | 25 min | Min-heap with size K | Scale: stream of 1B elements | Approximate top-K (Count-Min Sketch); distributed top-K |
| 9 | Implement K-Means Clustering from Scratch | Medium | 25 min | Iterative refinement | Initialization: random vs. K-Means++ | How to choose K? Silhouette score, elbow method |
| 10 | Implement AUC-ROC Computation Without Libraries | Medium | 20 min | Sort + trapezoidal rule | Edge cases: all positive, all negative | Partial AUC; AUC-PR comparison; when to prefer which |
| 11 | Implement a Consistent Hash Ring | Medium | 25 min | Hash function + sorted array | Scale: rebalancing when nodes join/leave | Virtual nodes for load balancing; weighted nodes |
| 12 | Implement MapReduce Word Count | Medium | 20 min | Map, shuffle, reduce | Scale: what if input is 1TB? | Combiner optimization; handling data skew |
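Problem 7 asks for a proof of uniformity, and that proof is easier to state against concrete code. The sketch below is one common single-pass formulation (often called Algorithm R). The function name `reservoir_sample` and the optional `rng` parameter are illustrative choices, not something the problem statement prescribes.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream."""
    if k <= 0:
        return []
    rng = rng or random.Random()
    reservoir: List[T] = []
    for index, item in enumerate(stream):
        if index < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item number index + 1 is kept with probability k / (index + 1).
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                reservoir[slot] = item
    return reservoir
```

The uniformity argument is an induction: after seeing n items, each is in the reservoir with probability k/n. The new item enters with probability k/(n+1), and each existing item survives with probability 1 - (k/(n+1))(1/k) = n/(n+1), which gives k/(n+1) for all of them. Memory stays at O(k) regardless of stream length.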
:::warning Google Coding Red Flags
- Starting to code without discussing approach (Google wants approach discussion first)
- Only presenting one solution (they want to see you consider alternatives)
- Ignoring the follow-up question (the follow-up is where the signal lives)
- Writing clever but unreadable code (Google values readability over brevity)
- Not testing your code (walk through an example after coding)
:::
Section 2: ML Design Problems (10 Problems)
Google ML Design interviews focus on end-to-end systems at Google scale. The interviewer expects you to think about billions of users, petabytes of data, and systems that must be reliable, fair, and efficient.
Core ML Systems
| # | Problem | Difficulty | Time | Key Concept | Google Scale Factor | What Google Evaluates |
|---|---|---|---|---|---|---|
| 13 | Design YouTube Video Recommendation | Hard | 45 min | Multi-stage ranking + user modeling | 2B+ users, 800M videos | Engagement optimization vs. responsible AI tradeoffs |
| 14 | Design Google Search Ranking | Hard | 45 min | Query understanding + retrieval + L2R | Billions of documents, 8.5B searches/day | Multi-stage pipeline, freshness, diversity, quality |
| 15 | Design Google Ads Click Prediction | Hard | 45 min | Real-time prediction + calibration | Trillions of predictions/day | Revenue optimization, advertiser quality, user experience |
| 16 | Design Gmail Spam Detection | Medium | 40 min | Text classification + adversarial robustness | 1.8B users, 15B messages/day | Adversarial robustness, feedback loops, privacy |
| 17 | Design Google Maps ETA Prediction | Medium | 40 min | Spatial-temporal modeling | Billions of routes/day | Real-time traffic, graph neural networks, uncertainty |
ML Infrastructure (Google Scale)
| # | Problem | Difficulty | Time | Key Concept | Google Scale Factor | What Google Evaluates |
|---|---|---|---|---|---|---|
| 18 | Design a Distributed Model Training System | Hard | 45 min | Data/model parallelism | Models with 100B+ parameters | AllReduce vs. parameter server, checkpointing, fault tolerance |
| 19 | Design a Feature Store for Online Serving | Hard | 45 min | Feature consistency + low latency | Millions of features, sub-10ms serving | Point-in-time correctness, caching, freshness guarantees |
| 20 | Design an ML Experiment Platform | Medium | 40 min | Experiment tracking + reproducibility | Thousands of experiments/day | Multi-tenant, resource scheduling, result comparison |
| 21 | Design a Model Monitoring and Alerting System | Medium | 35 min | Data drift + prediction drift | Hundreds of models in production | Alert fatigue reduction, root cause attribution |
| 22 | Design an Embedding Retrieval System (ANN at Scale) | Hard | 45 min | Approximate nearest neighbor | Billions of embeddings | HNSW vs. ScaNN, index building, serving latency |
:::note Google ML Design Framework
Google interviewers expect this structure:
- Problem Framing (3 min): What are we optimizing? What are the constraints?
- Metrics (3 min): Offline metrics (NDCG, AUC) + Online metrics (CTR, engagement) + Guardrail metrics (diversity, fairness)
- Data (5 min): What data is available? What labels do we have? Data freshness?
- Feature Engineering (5 min): User features, item features, context features, cross-features
- Model Architecture (8 min): Why this model? Multi-stage if needed.
- Training (5 min): How to train at scale? Data pipelines, distributed training.
- Serving (5 min): Latency, throughput, caching, fallback.
- Evaluation (5 min): Offline eval -> Online A/B test -> Launch decision
- Iteration (3 min): What would you try next? What would you monitor?
:::
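Of the offline metrics named in the framework, AUC is the one most often requested without library support (it is also Problem 10). The sketch below uses the pairwise definition, the probability that a random positive outscores a random negative, rather than the sort-plus-trapezoid approach listed for Problem 10. The two give the same value, but this version is O(P * N) and intended only to make the definition concrete. The function name `roc_auc` is an illustrative choice.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of random positive > score of random negative)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        # The "all positive" and "all negative" edge cases have no defined AUC.
        raise ValueError("AUC is undefined without both classes")
    wins = 0.0
    for positive_score in positives:
        for negative_score in negatives:
            if positive_score > negative_score:
                wins += 1.0
            elif positive_score == negative_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

In an interview, stating this definition first and then moving to the O(n log n) sorted version is a natural way to present two approaches.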
Section 3: System Design & Infrastructure (5 Problems)
For ML Infrastructure and Platform roles, Google asks pure infrastructure design questions alongside ML design.
| # | Problem | Difficulty | Time | Key Concept | Google Scale Factor | What Google Evaluates |
|---|---|---|---|---|---|---|
| 23 | Design a Distributed Key-Value Store | Hard | 45 min | Consistency, partitioning, replication | Billions of keys, millions of QPS | CAP tradeoffs, consistent hashing, conflict resolution |
| 24 | Design a Real-Time Stream Processing System | Hard | 45 min | Exactly-once semantics, windowing | Millions of events/sec | Watermarks, late data, state management, checkpointing |
| 25 | Design a Job Scheduler for ML Training Workloads | Medium | 40 min | Priority queues, resource allocation | Thousands of GPU jobs | Preemption, gang scheduling, fair share, quota management |
| 26 | Design a Data Pipeline for Continuous Model Retraining | Medium | 35 min | Orchestration, validation, deployment | Daily retraining of hundreds of models | Data freshness, validation gates, canary deployment |
| 27 | Design a Multi-Tenant Data Lake | Medium | 35 min | Isolation, access control, cost allocation | Petabytes of data, hundreds of teams | Metadata management, lineage, cost attribution |
Section 4: Googleyness & Behavioral (3 Discussion Topics)
Google explicitly evaluates Googleyness. These are not standard behavioral questions -- they test how you think about engineering culture.
| # | Topic | Time | What Google Looks For |
|---|---|---|---|
| 28 | Describe a time you disagreed with a technical decision and what you did | 15 min | Respectful disagreement, data-driven advocacy, willingness to commit after decision |
| 29 | How would you handle a situation where your model improves metrics but has fairness concerns? | 15 min | Ethical reasoning, proactive identification of harm, balancing business and user impact |
| 30 | Tell me about a time you simplified a complex system | 15 min | Preference for simplicity, ability to identify unnecessary complexity, impact measurement |
:::tip Googleyness Signals
Things that signal Googleyness positively:
- "I wasn't sure, so I ran an experiment to find out"
- "I disagreed, but I committed to the team's decision and helped make it succeed"
- "I noticed this could disproportionately affect certain users, so I..."
- "The simpler approach turned out to be better because..."
Things that signal Googleyness negatively:
- "I knew I was right, so I pushed until they agreed"
- "I just used the approach from my previous company"
- "Fairness wasn't really in scope for this project"
:::
Google-Specific Preparation Strategies
1. The Multiple Approaches Framework
For every coding problem, prepare three approaches:
| Level | What to Present | Example (Merge K Sorted Lists) |
|---|---|---|
| Brute Force | Concatenate and sort | Merge all into one list, sort: O(N log N) |
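| Better | Merge lists one at a time | Merge each list into a running result: O(N * K) |
| Optimal | Min-heap | Push all heads, pop min: O(N log K) |
Here N is the total number of elements and K is the number of lists. Divide-and-conquer pairwise merging also reaches O(N log K). Always present all three, explain tradeoffs, then code the optimal.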
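As a reference point for the optimal row, here is a minimal sketch of the min-heap merge. It assumes the inputs are plain Python lists rather than linked lists, and the function name `merge_k_sorted` is illustrative.

```python
import heapq
from typing import List


def merge_k_sorted(lists: List[List[int]]) -> List[int]:
    """Merge K sorted lists in O(N log K) using a heap of current heads."""
    # Each heap entry is (value, list index, position inside that list).
    heap = [(values[0], i, 0) for i, values in enumerate(lists) if values]
    heapq.heapify(heap)
    merged: List[int] = []
    while heap:
        value, list_index, position = heapq.heappop(heap)
        merged.append(value)
        next_position = position + 1
        if next_position < len(lists[list_index]):
            # Replace the popped head with the next element of the same list.
            next_value = lists[list_index][next_position]
            heapq.heappush(heap, (next_value, list_index, next_position))
    return merged
```

The heap never holds more than K entries, which is what keeps each push and pop at O(log K). Including the list index in the tuple also breaks ties between equal values without comparing anything else.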
2. Scale Reasoning Template
For every design problem, explicitly discuss scale:
"Let me think about the scale here.
- Users: [X] daily active users
- Data: [Y] events per day, [Z] total stored
- Latency: p50 = [A]ms, p99 = [B]ms
- Throughput: [C] QPS at peak
Given this scale, [approach X] would [reason], so I'd choose [approach Y] instead."
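The bracketed values in the template usually come from quick arithmetic. The helper below is a hypothetical sketch of that arithmetic: `estimate_scale`, its parameters, and the default `peak_factor` of 2.0 are all assumptions for illustration, and real numbers should come from the interviewer.

```python
SECONDS_PER_DAY = 86_400


def estimate_scale(
    daily_events: int,
    bytes_per_event: int,
    retention_days: int,
    peak_factor: float = 2.0,  # assumed ratio of peak to average traffic
) -> dict:
    """Back-of-envelope throughput and storage figures for a design discussion."""
    average_qps = daily_events / SECONDS_PER_DAY
    daily_bytes = daily_events * bytes_per_event
    return {
        "average_qps": average_qps,
        "peak_qps": average_qps * peak_factor,
        "daily_bytes": daily_bytes,
        "stored_bytes": daily_bytes * retention_days,
    }
```

For example, 86.4 million events per day at 100 bytes each works out to 1,000 QPS on average and 8.64 GB of new data per day. Saying these figures out loud is what justifies the choice between approaches in the last line of the template.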
3. Code Quality Checklist (Google Style)
Before submitting your code, verify:
[ ] Descriptive variable names (not single letters, except loop vars)
[ ] Edge cases handled (empty input, null, overflow)
[ ] Helper functions extracted (not one monolithic function)
[ ] Complexity stated (time and space)
[ ] Walked through at least one example
[ ] Discussed testing approach
4-Week Google Prep Plan
| Week | Focus | Problems | Daily Load |
|---|---|---|---|
| Week 1 | Coding + multiple approaches | #1-12 | 2 problems/day |
| Week 2 | ML design | #13-17 | 1 design/day |
| Week 3 | Infra design + ML infra | #18-27 | 1-2 designs/day |
| Week 4 | Behavioral + mocks | #28-30 + full mocks | 1 topic + 1 mock/day |
Week 1: Coding (Multiple Approaches Focus)
Day 1: #1 (Merge K Sorted Lists - present 3 approaches)
Day 2: #2 (Word Search II - trie vs. brute force comparison)
Day 3: #3, #4 (Topological sort variants, sliding window max)
Day 4: #5, #6 (Serialize tree, sliding window)
Day 5: #7, #8 (Reservoir sampling, streaming top-K)
Day 6: #9, #10 (K-Means, AUC-ROC)
Day 7: #11, #12 (Consistent hashing, MapReduce)
Week 2: ML Design (Google Scale)
Day 1: #13 (YouTube recommendations - the flagship problem)
Day 2: #14 (Google Search ranking)
Day 3: #15 (Google Ads click prediction)
Day 4: #16 (Gmail spam detection)
Day 5: #17 (Google Maps ETA)
Day 6-7: Re-do #13 and #14 from scratch under time pressure
Week 3: Infrastructure Design
Day 1: #18 (Distributed training - AllReduce vs. parameter server)
Day 2: #19 (Feature store - consistency problems)
Day 3: #20, #21 (Experiment platform, model monitoring)
Day 4: #22 (Embedding retrieval at scale - ScaNN)
Day 5: #23 (Distributed KV store - consistency models)
Day 6: #24, #25 (Stream processing, job scheduler)
Day 7: #26, #27 (Continuous retraining, data lake)
Week 4: Behavioral + Full Mocks
Day 1: Prepare 5 STAR stories aligned to Googleyness
Day 2: #28 (Disagreement scenario - practice out loud)
Day 3: #29 (Fairness scenario - practice out loud)
Day 4: #30 (Simplification story - practice out loud)
Day 5: Full mock: 2 coding + 1 ML design + 1 behavioral
Day 6: Full mock: 1 coding + 1 system design + 1 ML design
Day 7: Review weak areas; practice explaining under time pressure
Problem Deep Dives
Problem 13: Design YouTube Video Recommendation
Why Google asks this: YouTube recommendation is one of the largest ML systems at Google. Recommending videos to 2 billion users from 800 million videos requires multi-stage ranking, real-time personalization, and careful balancing of engagement with responsible AI.
Architecture (Google's Published Approach):
Candidate Generation -> Ranking -> Reranking -> Serving
1. Candidate Generation (~100-1000 candidates)
- Collaborative filtering (user-item matrix factorization)
- Content-based (video embeddings via deep neural network)
- Context-based (watch history, search history, demographics)
- Multiple candidate generators run in parallel
2. Ranking (score each candidate)
- Features: user history, video features, context (time, device)
- Model: Deep neural network with watch time prediction
- Objective: Expected watch time (not just CTR)
- Training: Weighted logistic regression with watch time as weight
3. Reranking (business rules + diversity)
- Diversity injection (avoid showing 5 similar videos)
- Freshness boost for new content
- Creator fairness (distribution across creators)
- Responsible AI filters (misinformation, harmful content)
4. Serving
- Candidate generation: batch + real-time hybrid
- Ranking: real-time inference, <50ms latency budget
- Caching: user-level recommendation cache with TTL
Key Discussion Points Google Wants to Hear:
- Why watch time > CTR (clickbait optimization problem)
- Position bias correction in training data
- Cold-start handling for new users and new videos
- Responsible AI: filter bubbles, radicalization prevention
- How to evaluate: offline (recall@K, NDCG) vs. online (engagement metrics)
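The ranking stage above lists "weighted logistic regression with watch time as weight" as the training setup. A minimal sketch of such a loss follows, assuming that clicked impressions are weighted by their watch time in seconds and unclicked impressions get unit weight. The function and argument names are hypothetical, and this is not a claim about how any production system implements it.

```python
import math
from typing import Sequence


def watch_time_weighted_log_loss(
    logits: Sequence[float],
    clicked: Sequence[int],
    watch_seconds: Sequence[float],
) -> float:
    """Weighted binary cross-entropy: positives weighted by watch time."""
    total_loss = 0.0
    total_weight = 0.0
    for z, y, seconds in zip(logits, clicked, watch_seconds):
        weight = seconds if y == 1 else 1.0  # assumption: negatives weigh 1
        # Numerically stable form of log(1 + exp(z)) - y * z.
        loss = max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
        total_loss += weight * loss
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("all example weights are zero")
    return total_loss / total_weight
```

Under this weighting, a long watch that the model scored low costs far more than a brief click scored low, which is the mechanism behind the "watch time > CTR" discussion point.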
Problem 22: Design an Embedding Retrieval System
Why Google asks this: Many Google systems rely on embedding-based retrieval (search, ads, YouTube, Photos). Building a system that can search billions of embeddings in sub-10ms is a core infrastructure challenge.
Architecture:
1. Embedding Generation
- Model: Two-tower architecture (query encoder + item encoder)
- Training: Contrastive learning with in-batch negatives
- Dimension: 128-512 depending on quality/speed tradeoff
2. Index Building (Offline)
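- Algorithm: ScaNN (Google's open-source ANN library)
- Partitioning: IVF-style (inverted file) clustering for coarse search
- Asymmetric hashing for fast approximate scoring inside the probed partitions
- Anisotropic vector quantization as the quantization objective, tuned for inner-product search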
- Build time: hours for billion-scale index
3. Online Serving
- Query encoding: real-time inference (~5ms)
- ANN search: ~2-5ms for top-100 from billion-scale
- Post-retrieval scoring: exact dot product on top-K
- Total latency budget: <15ms
4. Index Updates
- Full rebuild: daily/weekly for major model updates
- Incremental: streaming new items into existing index
- Staleness monitoring: alert if index is too old
Tradeoffs to Discuss:
- Recall vs. latency (more probes = higher recall, more latency)
- Quantization precision vs. index size
- Flat index (exact, slow) vs. approximate (fast, lossy)
- CPU vs. GPU serving for ANN
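The recall-versus-latency tradeoff is easiest to see in a toy version of IVF-style partitioned search with exact rescoring. The sketch below is not ScaNN: it uses no quantization, assumes the centroids are given rather than learned, and every name in it (`build_partitions`, `search`, `n_probe`) is illustrative.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def build_partitions(items: List[Vector], centroids: List[Vector]) -> Dict[int, List[int]]:
    """Offline step: assign each item index to its best-matching centroid."""
    partitions: Dict[int, List[int]] = {c: [] for c in range(len(centroids))}
    for item_index, item in enumerate(items):
        best = max(range(len(centroids)), key=lambda c: dot(item, centroids[c]))
        partitions[best].append(item_index)
    return partitions


def search(
    query: Vector,
    items: List[Vector],
    centroids: List[Vector],
    partitions: Dict[int, List[int]],
    top_k: int,
    n_probe: int,
) -> List[int]:
    """Online step: probe n_probe partitions, then rescore exactly."""
    # Coarse search: rank partitions by how well their centroid matches.
    ranked = sorted(
        range(len(centroids)), key=lambda c: dot(query, centroids[c]), reverse=True
    )
    candidates = [i for c in ranked[:n_probe] for i in partitions[c]]
    # Post-retrieval scoring: exact dot product on the shortlist only.
    candidates.sort(key=lambda i: dot(query, items[i]), reverse=True)
    return candidates[:top_k]
```

With `n_probe` equal to the number of partitions the result matches brute-force search. Lowering it scans fewer items but can miss neighbors that landed in an unprobed partition, which is exactly the recall loss the first tradeoff describes.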
Google-Specific Patterns
| Pattern | Where It Appears | Google's Emphasis |
|---|---|---|
| Multi-stage pipeline | #13, #14, #15 | Candidate generation -> ranking -> reranking |
| Distributed computation | #1, #12, #18 | MapReduce thinking; what if data doesn't fit on one machine? |
| Approximate algorithms | #8, #22 | Exact is too slow at Google scale; approximate with guarantees |
| Consistent hashing | #11, #23 | Distributed data placement and load balancing |
| Streaming computation | #7, #8, #24 | Bounded memory, single-pass algorithms |
| Fairness-aware design | #13, #16, #29 | Responsible AI is a first-class design requirement |
| Experimentation | #20, #26 | Every change must be measurable and reversible |
Google Level Expectations
| Level | Coding | ML Design | System Design | Googleyness |
|---|---|---|---|---|
| L3 (SWE II) | Solve 2 mediums cleanly | Not expected | Not expected | Basic collaboration |
| L4 (SWE III) | 1 medium + 1 hard | Basic ML system | Not expected | Navigating ambiguity |
| L5 (Senior) | 1 hard with follow-ups | Full ML system with depth | Moderate complexity | Technical leadership |
| L6 (Staff) | 1 hard optimally + clean code | Google-scale ML system | Complex distributed system | Cross-team impact |
| L7 (Senior Staff) | Discuss tradeoffs at expert level | Novel system architecture | Novel infrastructure | Org-wide influence |
:::danger Common Reasons Google Rejects Candidates
- Only one approach. Google explicitly trains interviewers to look for multiple approaches.
- Ignoring scale. Designing a system that works for 1000 users when Google operates at 1B+.
- Messy code. Google's codebase is shared across thousands of engineers. Readability is non-negotiable.
- Not asking clarifying questions. Google problems are intentionally ambiguous. Asking questions shows maturity.
- Lack of humility. Googleyness explicitly includes "intellectual humility."
- Ignoring fairness/responsibility. At Google, responsible AI is a launch requirement, not an afterthought.
:::
Difficulty Distribution
| Difficulty | Problems | Count |
|---|---|---|
| Medium | #3, #6, #7, #8, #9, #10, #11, #12, #16, #17, #20, #21, #25, #26, #27 | 15 |
| Hard | #1, #2, #4, #5, #13, #14, #15, #18, #19, #22, #23, #24 | 12 |
| Behavioral | #28, #29, #30 | 3 |
Problem Deep Dives (Continued)
Problem 4: Sliding Window Maximum
Why Google asks this: This problem tests knowledge of the monotonic deque pattern, which is non-obvious and rarely encountered outside of deliberate practice. The follow-up (sliding window median) is even harder and tests whether you can extend your solution.
Approach:
from collections import deque

def max_sliding_window(nums, k):
    dq = deque()  # stores indices; values are decreasing
    result = []
    for i, num in enumerate(nums):
        # Remove indices outside the window
        while dq and dq[0] < i - k + 1:
            dq.popleft()
        # Remove smaller elements (they can never be the max)
        while dq and nums[dq[-1]] < num:
            dq.pop()
        dq.append(i)
        # Window is full, record the max
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result
The Monotonic Deque Invariant:
- The deque stores indices in decreasing order of their values
- The front of the deque is always the maximum of the current window
- When a new element enters, all smaller elements are removed (they are useless)
- When the window slides, expired indices are removed from the front
Complexity: O(n) time, O(k) space -- each element is pushed and popped at most once.
Follow-Up: Sliding Window Median
- Replace the deque with two heaps (max-heap for lower half, min-heap for upper half)
- Lazy deletion for elements leaving the window
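- Time: O(n log n) worst case with lazy deletion, since stale entries can keep the heaps larger than k; O(n log k) time and O(k) space require a structure with true deletion (indexed heap or balanced BST)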
Google-Specific Discussion Points:
- How would you distribute this across multiple machines? (Partition the array, each machine handles a chunk with k-1 overlap)
- What if the array does not fit in memory? (Streaming version with bounded memory)
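The k-1 overlap idea can be checked on a single machine by processing the array in chunks. In the sketch below each chunk owns `chunk_size` window start positions and reads k-1 extra elements past its boundary. `chunked_max_sliding_window` and `chunk_size` are hypothetical names, and the per-chunk loop stands in for separate workers.

```python
from collections import deque
from typing import List


def max_sliding_window(nums: List[int], k: int) -> List[int]:
    """Monotonic-deque solution, repeated here so the block is self-contained."""
    window = deque()  # indices whose values are in decreasing order
    result = []
    for i, value in enumerate(nums):
        while window and window[0] < i - k + 1:
            window.popleft()
        while window and nums[window[-1]] < value:
            window.pop()
        window.append(i)
        if i >= k - 1:
            result.append(nums[window[0]])
    return result


def chunked_max_sliding_window(nums: List[int], k: int, chunk_size: int) -> List[int]:
    """Split the work by window start position, with k-1 elements of overlap."""
    result: List[int] = []
    last_start = len(nums) - k  # last index at which a full window can begin
    for start in range(0, last_start + 1, chunk_size):
        # chunk_size windows need chunk_size + k - 1 elements.
        end = min(start + chunk_size + k - 1, len(nums))
        result.extend(max_sliding_window(nums[start:end], k))
    return result
```

Because each chunk depends only on its own slice, the chunks can run independently and their outputs are concatenated in order. The cost is that each boundary re-reads k-1 elements, which matters when k is close to the chunk size.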
Problem 9: Implement K-Means Clustering from Scratch
Why Google asks this: K-Means is simple enough to implement in 25 minutes, but the follow-up questions (initialization, convergence, choosing K) reveal deep understanding.
Implementation:
import numpy as np

def kmeans(X, k, max_iters=100, tol=1e-4):
    n, d = X.shape
    # K-Means++ initialization
    centroids = [X[np.random.randint(n)]]
    for _ in range(1, k):
        distances = np.min([
            np.sum((X - c) ** 2, axis=1) for c in centroids
        ], axis=0)
        probs = distances / distances.sum()
        idx = np.random.choice(n, p=probs)
        centroids.append(X[idx])
    centroids = np.array(centroids)
    for iteration in range(max_iters):
        # Assignment step
        distances = np.array([
            np.sum((X - c) ** 2, axis=1) for c in centroids
        ])  # (k, n)
        labels = np.argmin(distances, axis=0)  # (n,)
        # Update step
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i)
            else centroids[i]  # keep old centroid if cluster is empty
            for i in range(k)
        ])
        # Check convergence
        shift = np.sum((new_centroids - centroids) ** 2)
        centroids = new_centroids
        if shift < tol:
            break
    return labels, centroids
Follow-Up Questions Google Asks:
| Question | Expected Answer |
|---|---|
| Why K-Means++ instead of random init? | Random init can lead to poor local optima; K-Means++ spreads initial centroids, leading to better convergence |
| How do you choose K? | Elbow method (plot inertia vs. K), silhouette score, domain knowledge |
| What if clusters are non-spherical? | K-Means assumes spherical clusters; use DBSCAN or Gaussian Mixture Models |
| What if data has outliers? | Use K-Medoids (more robust) or remove outliers first |
| How do you scale to 1B data points? | Mini-batch K-Means: sample a batch, update centroids incrementally |
| How do you handle high-dimensional data? | Dimensionality reduction first (PCA/UMAP), then cluster |
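The mini-batch answer in the table can be sketched in a few lines. This version uses only the standard library and takes the initial centroids as an argument, which is an assumption made to keep the example short (in practice they could come from K-Means++). The names `mini_batch_kmeans`, `batch_size`, and `n_batches` are illustrative.

```python
import random
from typing import List, Optional, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mini_batch_kmeans(
    points: List[Point],
    initial_centroids: List[Point],
    batch_size: int,
    n_batches: int,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Update centroids from small random batches instead of full passes."""
    rng = rng or random.Random()
    centroids = [list(c) for c in initial_centroids]
    counts = [0] * len(centroids)  # points absorbed by each centroid so far
    for _ in range(n_batches):
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Assign the whole batch against the centroids as they stand now.
        assignments = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in batch
        ]
        for point, c in zip(batch, assignments):
            counts[c] += 1
            step = 1.0 / counts[c]  # per-centroid learning rate shrinks over time
            centroids[c] = [
                (1.0 - step) * old + step * new
                for old, new in zip(centroids[c], point)
            ]
    return centroids
```

Each batch costs O(batch_size * k * d) regardless of dataset size, so the total work is controlled by the number of batches rather than by the 1B points. The tradeoff is a noisier solution than full-batch K-Means.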
Problem 11: Implement a Consistent Hash Ring
Why Google asks this: Consistent hashing is the backbone of distributed data placement at Google scale. It determines which server stores which data, with minimal reshuffling when servers join or leave.
Implementation:
import hashlib
import bisect

class ConsistentHashRing:
    def __init__(self, nodes=None, virtual_nodes=150):
        self.virtual_nodes = virtual_nodes
        self.ring = []  # sorted list of hash positions
        self.node_map = {}  # hash_position -> physical node
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_val = self._hash(virtual_key)
            bisect.insort(self.ring, hash_val)
            self.node_map[hash_val] = node

    def remove_node(self, node):
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_val = self._hash(virtual_key)
            self.ring.remove(hash_val)
            del self.node_map[hash_val]

    def get_node(self, key):
        if not self.ring:
            return None
        hash_val = self._hash(key)
        idx = bisect.bisect_right(self.ring, hash_val)
        if idx == len(self.ring):
            idx = 0  # wrap around
        return self.node_map[self.ring[idx]]
Key Design Decisions:
| Decision | Choice | Reasoning |
|---|---|---|
| Hash function | MD5 | Good distribution; not crypto-secure, but not needed here |
| Virtual nodes | 150 per physical node | Balances load evenly; fewer = more variance |
| Data structure | Sorted array + bisect | O(log n) lookup; balanced BST or skip list also work |
| Wrap-around | Modular ring | Key maps to the next node clockwise on the ring |
Google Follow-Ups:
- How many keys need to be remapped when a node is added? (Only about K/N keys on average, where K = total keys and N = number of nodes after the addition)
- How do you handle weighted nodes? (More virtual nodes for higher-capacity servers)
- How do you replicate data? (Map to the next R nodes on the ring, skipping same physical node)
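The replication follow-up can be sketched as a clockwise walk that skips virtual nodes belonging to a physical node already chosen. The compact `Ring` class below is a hypothetical stand-in written only so the example is self-contained; `get_replicas` and `replica_count` are illustrative names.

```python
import bisect
import hashlib
from typing import Dict, List


class Ring:
    """Minimal consistent hash ring used only to illustrate the follow-ups."""

    def __init__(self, nodes: List[str], virtual_nodes: int = 50):
        self.virtual_nodes = virtual_nodes
        self.positions: List[int] = []  # sorted hash positions
        self.owner: Dict[int, str] = {}  # hash position -> physical node
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.virtual_nodes):
            position = self._hash(f"{node}:{i}")
            bisect.insort(self.positions, position)
            self.owner[position] = node

    def get_replicas(self, key: str, replica_count: int) -> List[str]:
        """Walk clockwise from the key, collecting distinct physical nodes."""
        replicas: List[str] = []
        if replica_count <= 0 or not self.positions:
            return replicas
        start = bisect.bisect_right(self.positions, self._hash(key))
        for offset in range(len(self.positions)):
            position = self.positions[(start + offset) % len(self.positions)]
            node = self.owner[position]
            if node not in replicas:  # skip repeats of the same physical node
                replicas.append(node)
                if len(replicas) == replica_count:
                    break
        return replicas
```

The first entry of `get_replicas` is the primary owner, so the same walk answers the remapping question too: after `add_node`, any key whose primary changed must now belong to the new node, because adding positions to the ring can only insert a closer successor.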
Google ML Role Taxonomy
Understanding which Google role you are interviewing for helps prioritize preparation. The weights below are rough and cover only these three round types, so rows need not sum to 100%:
| Role | Coding Weight | ML Design Weight | System Design Weight | Key Differentiator |
|---|---|---|---|---|
| SWE (ML) | 40% | 30% | 30% | Strong coding + ML integration |
| Research Scientist | 20% | 50% | 10% | Paper knowledge + novel methods |
| Research Engineer | 30% | 30% | 30% | Implements research ideas at scale |
| ML Infrastructure | 30% | 10% | 50% | Distributed systems + ML serving |
| Data Scientist | 20% | 30% | 10% | Statistical rigor + business insight |
| Applied Scientist | 25% | 40% | 25% | Domain expertise + ML application |
Google Technical Screen vs. Onsite
The technical (phone) screen differs from the onsite in scope and bar:
| Dimension | Technical Screen | Onsite |
|---|---|---|
| Duration | 45 min (1 round) | 4-5 rounds (full day) |
| Problems | 1 medium or 1 medium + easy | Mix of medium and hard |
| Evaluation | Pass/fail threshold | Calibrated scoring across dimensions |
| Follow-ups | 0-1 follow-up | 2-3 follow-ups per problem |
| System design | Sometimes (for senior) | Always (for L5+) |
| Code quality bar | Functional code | Clean, maintainable code |
:::tip Phone Screen Strategy
For the Google phone screen:
- Solve the problem correctly and completely (this is the bar)
- Use descriptive names and handle edge cases (differentiate yourself)
- Talk through your approach before coding (shows engineering maturity)
- If you finish early, suggest improvements and discuss complexity
- Do NOT overthink -- a correct medium solution beats a partial hard solution
:::
Google Recruiter FAQ
Common questions candidates ask, with honest answers:
| Question | Answer |
|---|---|
| Can I use Python? | Yes, Python is fully supported. Most ML candidates use Python. |
| Do I need to know Google-specific tech (Spanner, Bigtable)? | No, but knowing the concepts behind them helps in system design. |
| How long do I have to prepare? | Google often gives 2-4 weeks between phone screen and onsite. |
| Can I redo a bad round? | No, each round is evaluated independently. One bad round does not necessarily mean rejection. |
| What happens if I am borderline? | Google uses a hiring committee. Borderline cases may receive additional interviews (SVC = supplemental virtual call). |
| Should I study Google's published papers? | For Research Scientist roles, yes. For SWE/MLE, it helps but is not required. |
Next Steps
After completing Google-Style preparation:
- Meta-Style Problems if also interviewing at Meta (different emphasis)
- Hard Tier for additional hard-tier practice
- MLE Problems for comprehensive MLE preparation
- Research Engineer Problems if targeting Google Research or DeepMind
