Easy Tier Problems
Reading time: ~40 min | Interview relevance: High | Roles: All AI/ML Roles (especially entry-level, career switchers, and warm-up for experienced engineers)
You open your laptop to start interview prep and immediately jump to a Hard-tier dynamic programming problem. Twenty minutes later, you are staring at a blank screen, confidence shattered. This is how most people fail at interview preparation before they even begin.
Easy-tier problems exist for a reason. They build the muscle memory for patterns that appear repeatedly in harder problems. They confirm that your fundamentals are solid. And they give you the psychological momentum to tackle medium and hard problems with clarity instead of panic.
This list of 35 problems spans coding, ML implementation, data processing, and system design discussion. Every problem can be completed in 15-25 minutes. If any problem takes you longer than 30 minutes, it signals a gap worth investigating before moving up.
How to Use This List
| Goal | Approach |
|---|---|
| Total beginner | Work through all 35 problems sequentially over 2 weeks |
| Warm-up before harder prep | Complete in 3-5 days, spending no more than 20 min per problem |
| Confidence building | Cherry-pick problems from your weakest category |
| Interview day warm-up | Solve 2-3 problems the morning of your interview |
:::tip The 20-Minute Rule For easy problems, set a strict 20-minute timer. If you cannot solve it in 20 minutes, read the solution, understand the pattern, then re-solve from scratch the next day. Speed matters at this level. :::
Category 1: Core Data Structures & Algorithms (10 Problems)
These are the bread-and-butter DSA problems. You should be able to solve each one without hesitation.
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 1 | Two Sum | 10 min | Hash map lookup | Arrays | Hash map for O(n) lookups; the most famous interview problem | FAANG, All |
| 2 | Valid Parentheses | 10 min | Stack matching | Stacks | LIFO structure for balanced matching; expression parsing | All |
| 3 | Merge Two Sorted Lists | 10 min | Two-pointer merge | Linked Lists | Merge operation foundational to merge sort and stream merging | All |
| 4 | Maximum Depth of Binary Tree | 10 min | DFS recursion | Trees | Basic tree traversal and recursive thinking | All |
| 5 | Best Time to Buy and Sell Stock | 10 min | Running minimum | Arrays | Track running min/max; single-pass optimization | FAANG, All |
| 6 | Invert Binary Tree | 10 min | Tree recursion | Trees | Recursive tree transformation; understanding pointer swaps | All |
| 7 | Contains Duplicate | 5 min | Hash set | Arrays | Set operations for uniqueness; O(n) vs O(n log n) tradeoff | All |
| 8 | Linked List Cycle Detection | 10 min | Floyd's slow/fast pointer | Linked Lists | Two-pointer technique; cycle detection appears in graph problems | All |
| 9 | Binary Search | 10 min | Divide and conquer | Search | The most fundamental O(log n) algorithm; boundary conditions matter | All |
| 10 | Implement a Min Stack | 15 min | Auxiliary stack | Stacks | Maintaining O(1) access to additional properties | All |
:::note Why These 10 Problems Matter Every pattern in this section reappears in medium and hard problems:
- Hash map -> LRU Cache, Group Anagrams, Two Sum variants
- Two pointers -> 3Sum, Container With Most Water, merge operations
- DFS -> Path Sum, Serialize Tree, connected components
- Stack -> Valid Parentheses -> Largest Rectangle in Histogram
- Binary Search -> Search in Rotated Array, median finding
Master the easy version first. The hard version is just the easy version with more constraints. :::
Category 2: ML Implementation Basics (8 Problems)
These problems test whether you can implement fundamental ML operations without relying on library calls.
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 11 | Implement K-Nearest Neighbors | 20 min | Distance computation + sorting | Classification | NumPy fluency, distance metrics, top-K selection | Google, Meta, Startups |
| 12 | Implement Linear Regression (Closed-Form) | 15 min | Matrix operations | Regression | Normal equation, NumPy matrix math, inverse computation | All |
| 13 | Compute Precision, Recall, and F1 Score | 10 min | Confusion matrix math | Evaluation | Metric computation from confusion matrix; when to use which metric | All |
| 14 | Implement Sigmoid and Softmax Functions | 10 min | Numerical stability | Activation Functions | Overflow prevention with max subtraction; probability normalization | All |
| 15 | Implement Min-Max and Z-Score Normalization | 10 min | Feature scaling | Preprocessing | Data normalization; handling edge cases (zero variance) | All |
| 16 | Implement One-Hot Encoding from Scratch | 10 min | Categorical encoding | Preprocessing | Mapping categories to binary vectors; handling unknown categories | All |
| 17 | Implement Train-Test Split with Shuffling | 10 min | Random sampling | Evaluation | Random seed management, stratification awareness | All |
| 18 | Compute Euclidean, Manhattan, and Cosine Distance | 15 min | Vector operations | Similarity | Distance metric selection; when cosine beats Euclidean | All |
:::warning Common Easy ML Mistakes
- Sigmoid overflow: Always implement as
1 / (1 + exp(-x))with clipping, or use themax(0, x)trick for numerical stability - Softmax overflow: Subtract the max value before exponentiating:
exp(x - max(x)) - Division by zero: Z-score normalization with zero variance; F1 with zero precision or recall
- Forgetting to shuffle: Train-test split on time-ordered data leaks future information :::
Category 3: Data Processing & SQL (10 Problems)
Practical data manipulation problems that every AI/ML role requires.
Python Data Processing
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 19 | Parse and Aggregate Large Log Files | 15 min | Hash map aggregation | Data Processing | File I/O, string parsing, aggregation | All |
| 20 | Compute Intersection of Two Large Sorted Arrays | 10 min | Two-pointer merge | Arrays | Efficient set operations; memory-conscious processing | Google, Meta |
| 21 | Read CSV and Compute Column Statistics | 10 min | Streaming computation | Data Processing | Pandas fluency; handling missing values, type coercion | All |
| 22 | Implement a Word Frequency Counter | 10 min | Hash map counting | Text Processing | Tokenization, normalization, counting | All |
| 23 | Build a Simple Data Validation Function | 15 min | Schema checking | Data Quality | Type checks, range validation, null detection | All |
SQL Fundamentals
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 24 | Find Employees Earning More Than Their Manager | 10 min | Self-join | SQL | Basic join logic with table self-referencing | All |
| 25 | Find Duplicate Emails in a Table | 5 min | GROUP BY + HAVING | SQL | Aggregation fundamentals; identifying duplicates | All |
| 26 | Second Highest Salary | 10 min | Subquery or window function | SQL | Ranking with edge cases (ties, NULLs) | FAANG, All |
| 27 | Combine Two Tables (Left Join) | 5 min | LEFT JOIN | SQL | Understanding NULL-preserving joins | All |
| 28 | Customers Who Never Ordered | 10 min | LEFT JOIN + IS NULL / NOT EXISTS | SQL | Anti-join pattern; filtering for absence | All |
:::tip SQL Practice Strategy These SQL problems are intentionally simple. They test whether you can translate English requirements into correct SQL without syntax errors. If you can solve all five in under 30 minutes total, your SQL fundamentals are solid. If not, spend a day on SQL basics before attempting the Data Engineer or Data Scientist problem lists. :::
Category 4: System Design Discussion (4 Problems)
These are not full system design problems. They are discussion-level questions where you explain concepts clearly and reason about tradeoffs. Each should take 10-15 minutes of verbal explanation.
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 29 | Explain the Bias-Variance Tradeoff | 10 min | Conceptual | ML Theory | Can you explain underfitting vs. overfitting with examples? | All |
| 30 | Compare REST vs. gRPC for Model Serving | 10 min | API design | System Design | Tradeoff analysis: latency, compatibility, streaming support | All |
| 31 | Explain Why Batch Normalization Helps Training | 10 min | Conceptual | Deep Learning | Internal covariate shift, gradient flow, regularization effect | All |
| 32 | Describe How You Would Monitor a Model in Production | 15 min | MLOps | System Design | Data drift, prediction drift, latency, error rates | All |
Category 5: Probability & Statistics Basics (3 Problems)
Many ML interviews include at least one probability question. These are the easy versions.
| # | Problem | Time | Key Pattern | Category | What It Tests | Company Tags |
|---|---|---|---|---|---|---|
| 33 | Compute the Expected Value of a Dice Game | 10 min | Expected value | Probability | Basic expectation calculation; linearity of expectation | All |
| 34 | Explain P-Value to a Non-Technical Person | 10 min | Statistical inference | Statistics | Communication skills; simplifying complex concepts | All |
| 35 | Bayes' Theorem: Disease Testing Problem | 15 min | Conditional probability | Probability | Base rate neglect, prior/posterior reasoning | FAANG, All |
:::danger Do Not Skip Easy Problems Three reasons engineers skip easy problems and regret it:
-
Speed matters. In a 45-minute interview, solving an easy warm-up in 5 minutes leaves 40 minutes for the hard follow-up. Solving it in 15 minutes leaves only 30.
-
Edge cases hide in easy problems. Two Sum with duplicate values. Binary search on an empty array. Train-test split when n=1. Easy problems teach defensive coding.
-
Interviewers calibrate on easy problems. If you struggle with an easy problem, the interviewer downgrades the rest of the interview. If you crush it quickly with clean code, they give you harder follow-ups (which is what you want). :::
1-Week Easy Tier Study Plan
For candidates who need a quick warm-up before harder preparation:
| Day | Focus | Problems | Time |
|---|---|---|---|
| Day 1 | DSA fundamentals | #1-5 (hash map, stack, merge, DFS, running min) | 60 min |
| Day 2 | More DSA + basics | #6-10 (tree recursion, set, cycle, binary search, min stack) | 60 min |
| Day 3 | ML implementation | #11-14 (KNN, linear regression, metrics, activations) | 60 min |
| Day 4 | ML + data processing | #15-19 (normalization, encoding, split, distances, log parsing) | 60 min |
| Day 5 | Data processing + SQL | #20-25 (arrays, CSV, word count, validation, SQL basics) | 60 min |
| Day 6 | SQL + concepts | #26-32 (SQL, system design discussions) | 60 min |
| Day 7 | Probability + review | #33-35 + re-solve any problems that took >20 min | 60 min |
2-Week Easy Tier Study Plan (Thorough)
For career switchers or candidates returning after a long break:
| Day | Focus | Problems | Notes |
|---|---|---|---|
| Day 1 | Arrays & Hashing | #1, #5, #7 | Focus on hash map pattern |
| Day 2 | Stacks & Lists | #2, #3, #8, #10 | LIFO and pointer-based structures |
| Day 3 | Trees & Search | #4, #6, #9 | Recursion and binary search |
| Day 4 | Review DSA | Re-solve #1-10 | Target <10 min per problem |
| Day 5 | ML Basics 1 | #11, #12 | KNN, linear regression |
| Day 6 | ML Basics 2 | #13, #14, #15 | Metrics, activations, normalization |
| Day 7 | ML Basics 3 | #16, #17, #18 | Encoding, splitting, distances |
| Day 8 | Data Processing | #19, #20, #21, #22 | Python file and data handling |
| Day 9 | Data Quality | #23 + review ML problems | Validation and edge cases |
| Day 10 | SQL Day 1 | #24, #25, #26 | Joins, GROUP BY, ranking |
| Day 11 | SQL Day 2 | #27, #28 | LEFT JOIN, anti-join |
| Day 12 | Concepts | #29, #30, #31, #32 | Discussion practice (speak out loud) |
| Day 13 | Probability | #33, #34, #35 | Expected value, Bayes |
| Day 14 | Full review | Re-solve all problems that felt shaky | Speed run the entire list |
Problem Deep Dives
Problem 1: Two Sum
Why this problem matters: Two Sum is the most commonly asked interview problem across all companies. It tests whether you reach for the O(n) hash map solution or the O(n^2) brute force. More importantly, it tests how you handle edge cases.
Brute Force:
def two_sum(nums, target):
for i in range(len(nums)):
for j in range(i + 1, len(nums)):
if nums[i] + nums[j] == target:
return [i, j]
return []
Optimal (Hash Map):
def two_sum(nums, target):
seen = {}
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
return []
Edge Cases to Handle:
- Array with duplicate values:
[3, 3], target = 6 - Negative numbers:
[-1, 2, -3, 4], target = 1 - Single element: should return empty
- No solution exists
Follow-Up Questions Interviewers Ask:
- What if the array is sorted? (Two-pointer approach, O(1) space)
- What if you need all pairs? (Modify to collect all results)
- What if there are multiple valid answers? (Return any one, or modify contract)
Problem 11: Implement K-Nearest Neighbors
Why this problem matters: KNN is the simplest ML algorithm, but implementing it correctly tests NumPy fluency, understanding of distance metrics, and handling of edge cases.
Implementation:
import numpy as np
from collections import Counter
def knn_predict(X_train, y_train, x_query, k=5):
# Compute distances (Euclidean)
distances = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))
# Get k nearest indices
k_nearest_idx = np.argsort(distances)[:k]
# Get labels of k nearest
k_nearest_labels = y_train[k_nearest_idx]
# Majority vote
most_common = Counter(k_nearest_labels).most_common(1)
return most_common[0][0]
Key Points Interviewers Check:
- Vectorized distance computation (not a for loop)
- Handling ties in voting (what if 2 classes have equal votes?)
- Choice of k (odd k avoids ties for binary classification)
- Distance metric choice (Euclidean vs. Manhattan vs. Cosine)
- Time complexity: O(n * d) for distance computation + O(n log n) for sorting
Problem 35: Bayes' Theorem Disease Testing
Why this problem matters: This problem exposes base rate neglect, one of the most common reasoning errors. It tests whether you can apply Bayes' theorem correctly under pressure.
Problem Statement: A disease affects 1 in 1000 people. A test has 99% sensitivity (true positive rate) and 99% specificity (true negative rate). If a person tests positive, what is the probability they actually have the disease?
Solution:
P(Disease) = 0.001 (prevalence)
P(Positive | Disease) = 0.99 (sensitivity)
P(Positive | No Disease) = 0.01 (false positive rate)
P(Positive) = P(Pos|D) * P(D) + P(Pos|~D) * P(~D)
= 0.99 * 0.001 + 0.01 * 0.999
= 0.00099 + 0.00999
= 0.01098
P(Disease | Positive) = P(Pos|D) * P(D) / P(Positive)
= 0.00099 / 0.01098
= 0.0901 (about 9%)
Key Insight: Despite the test being 99% accurate, a positive result only means a 9% chance of having the disease. The low base rate dominates. This is why interviewers love this problem -- most candidates intuitively guess 99%.
Patterns You Must Internalize
| Pattern | Easy Problem | Medium/Hard Extension |
|---|---|---|
| Hash map for O(1) lookup | Two Sum (#1) | LRU Cache, Group Anagrams |
| Stack for matching | Valid Parentheses (#2) | Largest Rectangle, Calculator |
| Two-pointer merge | Merge Sorted Lists (#3) | Merge K Sorted Lists, Intersection |
| DFS recursion | Max Depth (#4) | Path Sum, Serialize Tree |
| Running min/max | Buy/Sell Stock (#5) | Trapping Rain Water, Sliding Window Max |
| Floyd's cycle detection | Cycle Detection (#8) | Find Duplicate Number, Linked List Intersection |
| Binary search | Binary Search (#9) | Search Rotated Array, Median of Two Arrays |
| Vectorized computation | KNN (#11) | Batch operations, feature engineering |
| Conditional probability | Bayes' Theorem (#35) | Naive Bayes classifier, A/B test analysis |
Confidence Checkpoints
Use these checkpoints to assess readiness for harder tiers:
| Checkpoint | Target | Ready for |
|---|---|---|
| All 10 DSA problems in <15 min each | 2.5 hours total | Medium Tier DSA |
| All 8 ML problems with correct edge cases | 2 hours total | Medium Tier ML |
| All 5 SQL problems first try, no syntax errors | 45 min total | Data Engineer/Scientist SQL rounds |
| Can explain all 4 system design concepts clearly | 45 min total | Medium Tier System Design |
| All 3 probability problems correct | 30 min total | Medium Tier probability |
| Full list completed | <8 hours total | Medium Tier (all categories) |
Additional Problem Deep Dives
Problem 5: Best Time to Buy and Sell Stock
Why this problem matters: This is the simplest example of the "running minimum" pattern. It appears trivial, but the pattern extends to harder problems like Trapping Rain Water and Best Time to Buy and Sell Stock III (with multiple transactions).
Brute Force (O(n^2)):
def max_profit_brute(prices):
max_profit = 0
for i in range(len(prices)):
for j in range(i + 1, len(prices)):
max_profit = max(max_profit, prices[j] - prices[i])
return max_profit
Optimal (O(n)):
def max_profit(prices):
min_price = float('inf')
max_profit = 0
for price in prices:
min_price = min(min_price, price)
max_profit = max(max_profit, price - min_price)
return max_profit
The Insight: At each position, the best profit we can make by selling today is today's price - minimum price seen so far. We track the running minimum and the running best profit in a single pass.
Edge Cases:
- Prices always decreasing: profit = 0 (never buy)
- Single price: profit = 0
- All same price: profit = 0
- Two prices: simple comparison
Follow-Up Extensions:
| Variant | Key Change | Difficulty |
|---|---|---|
| Buy and Sell Stock II (unlimited transactions) | Greedy: sum all upward moves | Easy |
| Buy and Sell Stock III (at most 2 transactions) | Track 2 buy/sell states | Hard |
| Buy and Sell Stock IV (at most K transactions) | DP with K states | Hard |
| Buy and Sell Stock with Cooldown | DP with 3 states (hold, sold, rest) | Medium |
Problem 9: Binary Search
Why this problem matters: Binary search is the most fundamental O(log n) algorithm, but getting the boundary conditions right is notoriously tricky. More interview bugs come from binary search off-by-one errors than from any other algorithm.
Standard Implementation:
def binary_search(nums, target):
lo, hi = 0, len(nums) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2 # Avoid integer overflow
if nums[mid] == target:
return mid
elif nums[mid] < target:
lo = mid + 1
else:
hi = mid - 1
return -1
Common Variants:
| Variant | Change | When to Use |
|---|---|---|
| Find leftmost occurrence | When nums[mid] == target, set hi = mid - 1 | Sorted array with duplicates |
| Find rightmost occurrence | When nums[mid] == target, set lo = mid + 1 | Sorted array with duplicates |
| Find insertion point | Return lo when loop ends | bisect_left equivalent |
| Search rotated array | Compare mid with lo to determine sorted half | LeetCode 33 |
| Find peak element | Compare mid with mid+1 | LeetCode 162 |
The Three Binary Search Templates:
# Template 1: Standard (find exact match)
while lo <= hi:
mid = lo + (hi - lo) // 2
if condition: return mid
elif ...: lo = mid + 1
else: hi = mid - 1
# Template 2: Find boundary (leftmost True)
while lo < hi:
mid = lo + (hi - lo) // 2
if condition(mid): hi = mid
else: lo = mid + 1
return lo
# Template 3: Find boundary (rightmost True)
while lo < hi:
mid = lo + (hi - lo + 1) // 2 # Round up to avoid infinite loop
if condition(mid): lo = mid
else: hi = mid - 1
return lo
Key Points:
lo + (hi - lo) // 2prevents integer overflow (matters in languages like Java/C++)- Template 2 and 3 converge when
lo == hi, soreturn logives the boundary - Template 3 rounds up (
+ 1before// 2) to prevent infinite loop whenlo = mid
Problem 14: Implement Sigmoid and Softmax Functions
Why this problem matters: These are the two most common activation/output functions in ML. Getting them numerically stable is critical and something many candidates get wrong.
Sigmoid (Naive vs. Stable):
import numpy as np
# Naive (BROKEN for large negative x)
def sigmoid_naive(x):
return 1 / (1 + np.exp(-x)) # exp(1000) = overflow
# Stable
def sigmoid(x):
# For x >= 0: 1 / (1 + exp(-x))
# For x < 0: exp(x) / (1 + exp(x)) (avoids exp of large positive)
return np.where(
x >= 0,
1 / (1 + np.exp(-x)),
np.exp(x) / (1 + np.exp(x))
)
Softmax (Naive vs. Stable):
# Naive (BROKEN for large values)
def softmax_naive(x):
return np.exp(x) / np.sum(np.exp(x)) # exp(1000) = overflow
# Stable (subtract max before exp)
def softmax(x):
x_shifted = x - np.max(x) # Shift so max is 0
exp_x = np.exp(x_shifted)
return exp_x / np.sum(exp_x)
# For 2D input (batch of vectors)
def softmax_batch(x):
x_shifted = x - np.max(x, axis=1, keepdims=True)
exp_x = np.exp(x_shifted)
return exp_x / np.sum(exp_x, axis=1, keepdims=True)
Why the Max Subtraction Trick Works:
softmax(x) = exp(x_i) / sum(exp(x_j))
= exp(x_i - c) / sum(exp(x_j - c)) for any constant c
Choosing c = max(x) ensures no exponent exceeds 0, preventing overflow. The mathematical result is identical.
Key Points Interviewers Check:
- Numerical stability (the max subtraction trick)
- Correct axis handling for batched inputs
- Knowledge of temperature scaling:
softmax(x / T)for controlling sharpness - When to use sigmoid vs. softmax (binary vs. multi-class; can use sigmoid for multi-label)
Problem 18: Compute Euclidean, Manhattan, and Cosine Distance
Why this problem matters: Distance metrics are the foundation of KNN, clustering, retrieval systems, and embedding similarity. Choosing the wrong metric can make your model useless.
Implementations:
import numpy as np
def euclidean_distance(a, b):
"""L2 distance: sqrt(sum((a_i - b_i)^2))"""
return np.sqrt(np.sum((a - b) ** 2))
def manhattan_distance(a, b):
"""L1 distance: sum(|a_i - b_i|)"""
return np.sum(np.abs(a - b))
def cosine_similarity(a, b):
"""Cosine: dot(a, b) / (||a|| * ||b||)"""
dot = np.dot(a, b)
norm_a = np.linalg.norm(a)
norm_b = np.linalg.norm(b)
if norm_a == 0 or norm_b == 0:
return 0.0 # Handle zero vector
return dot / (norm_a * norm_b)
def cosine_distance(a, b):
"""1 - cosine_similarity"""
return 1 - cosine_similarity(a, b)
When to Use Which:
| Metric | Best For | Key Property | Example Use Case |
|---|---|---|---|
| Euclidean | Dense, normalized features | Sensitive to magnitude | Image features, physical coordinates |
| Manhattan | Sparse features, grid-based | More robust to outliers | City block distance, sparse text features |
| Cosine | Text embeddings, directions | Invariant to magnitude | Document similarity, word embeddings |
Common Pitfall: Using Euclidean distance on unnormalized features. If feature A ranges [0, 1] and feature B ranges [0, 10000], Euclidean distance is dominated by feature B. Always normalize first, or use cosine similarity.
Self-Assessment Worksheet
Rate yourself on each category before starting. This helps prioritize your study time.
| Category | Problem Count | Confidence (1-5) | Priority | Action |
|---|---|---|---|---|
| DSA (hash map, stack, two pointers) | 10 | ___ | ___ | ___ |
| ML Implementation (KNN, regression, metrics) | 8 | ___ | ___ | ___ |
| Data Processing (log parsing, CSV, validation) | 5 | ___ | ___ | ___ |
| SQL (joins, GROUP BY, ranking) | 5 | ___ | ___ | ___ |
| System Design Discussion | 4 | ___ | ___ | ___ |
| Probability & Statistics | 3 | ___ | ___ | ___ |
Priority Scoring:
- Confidence 1-2 AND high interview relevance for your role = High Priority
- Confidence 3 = Medium Priority (review, do not deep-dive)
- Confidence 4-5 = Low Priority (quick review on day of)
Speed Drill Templates
Once you have solved each problem at least once, use these speed drills to build interview-day readiness.
15-Minute DSA Sprint
Pick 3 random problems from #1-10. Solve each in exactly 5 minutes. If you cannot finish in 5 minutes, move to the next one. After the 15 minutes, review what slowed you down.
20-Minute ML Sprint
Pick 2 random problems from #11-18. Solve each in exactly 10 minutes. Focus on getting the core algorithm correct. Edge cases and optimization can wait.
10-Minute SQL Sprint
Pick 2 random problems from #24-28. Solve each in exactly 5 minutes. If you have syntax errors, that is the gap to focus on.
Full Easy-Tier Speed Run
Try to complete all 35 problems in under 5 hours. This is your final readiness check before moving to Medium Tier. If any problem takes more than 15 minutes, mark it for review.
| Target | Time | Status |
|---|---|---|
| First attempt | 8-10 hours | Learning |
| Second attempt | 5-7 hours | Progressing |
| Third attempt | <5 hours | Ready for Medium Tier |
Common Misconceptions
| Misconception | Reality |
|---|---|
| "Easy problems are beneath me" | Easy problems are the building blocks of every hard problem |
| "I should spend most time on hard problems" | 60-70% of interview questions are medium; medium requires easy mastery |
| "Speed doesn't matter for easy problems" | Speed on easy problems determines how much time you have for the hard follow-up |
| "I can skip easy SQL since I use SQL daily" | Interview SQL has specific patterns (window functions, self-joins) that daily queries may not exercise |
| "I don't need to implement ML basics" | Implementation questions test understanding beyond API calls; they appear at every level |
| "Probability questions are rare" | Bayes' theorem and expected value appear in ~30% of DS/MLE interviews |
Next Steps
After completing the Easy Tier:
- Medium Tier is the natural next step -- it covers the bulk of interview problems
- Core 50 for a focused cross-category curriculum
- Role-specific lists for targeted preparation:
- MLE Problems for Machine Learning Engineer roles
- AI Engineer Problems for AI/GenAI Engineer roles
- Data Scientist Problems for Data Scientist roles
- Data Engineer Problems for Data Engineer roles
