Dynamic Programming
Memoization vs tabulation, common DP patterns (knapsack, LCS, edit distance), and DP in ML contexts (Viterbi, CTC, beam search) - with practice problems and full Python solutions.
Reading time: ~40 min | Interview relevance: High | Roles: MLE, Research Engineer, AI Engineer
The Real Interview Moment
You are in the final round at a top AI research lab. The interviewer, a research scientist who published three NeurIPS papers last year, leans back in her chair and says: "Let us do something different from the usual LeetCode fare. I want you to implement the Viterbi algorithm for a simple Hidden Markov Model. Given transition probabilities, emission probabilities, and a sequence of observations, find the most likely sequence of hidden states."
You feel a jolt of recognition. This is dynamic programming - the same principle behind edit distance and longest common subsequence, but applied to a core ML algorithm. The state is (position in sequence, current hidden state). The recurrence is: the best way to be in state s at position t is the best way to be in some state s' at position t-1, multiplied by the transition probability from s' to s and the emission probability of the observation at t given state s.
You write it cleanly. The interviewer nods: "Good. Now, what is the relationship between Viterbi and beam search in sequence-to-sequence models?" You explain that beam search is essentially a width-limited version of the same principle - instead of tracking the best path to every state, you track only the top-K paths.
Dynamic programming is not just a coding interview pattern. It is the algorithmic backbone of sequence modeling, reinforcement learning, and optimization. This chapter teaches you both the interview patterns and the ML connections.
The Core Idea
Dynamic programming solves problems by breaking them into overlapping subproblems and storing the results to avoid recomputation.
Two Properties Required for DP
- Optimal substructure - The optimal solution to the problem can be constructed from optimal solutions to its subproblems
- Overlapping subproblems - The same subproblems are solved multiple times
Memoization vs Tabulation
# ============ TOP-DOWN (MEMOIZATION) ============
# Start from the big problem, recurse, cache results
def fib_memo(n, memo={}):
"""Fibonacci with memoization.
Time: O(n), Space: O(n)
"""
if n in memo:
return memo[n]
if n <= 1:
return n
memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
return memo[n]
# Better approach using functools
from functools import lru_cache
@lru_cache(maxsize=None)
def fib_lru(n):
"""Fibonacci with lru_cache - cleaner memoization."""
if n <= 1:
return n
return fib_lru(n - 1) + fib_lru(n - 2)
# ============ BOTTOM-UP (TABULATION) ============
# Start from smallest subproblems, build up to the answer
def fib_tab(n):
"""Fibonacci with tabulation.
Time: O(n), Space: O(n)
"""
if n <= 1:
return n
dp = [0] * (n + 1)
dp[1] = 1
for i in range(2, n + 1):
dp[i] = dp[i - 1] + dp[i - 2]
return dp[n]
# Space-optimized - only need last two values
def fib_optimal(n):
"""Fibonacci with O(1) space.
Time: O(n), Space: O(1)
"""
if n <= 1:
return n
prev2, prev1 = 0, 1
for _ in range(2, n + 1):
curr = prev1 + prev2
prev2 = prev1
prev1 = curr
return prev1
Memoization vs Tabulation Comparison
| Aspect | Memoization (Top-Down) | Tabulation (Bottom-Up) |
|---|---|---|
| Direction | Start from target, recurse down | Start from base cases, build up |
| Implementation | Recursive + cache | Iterative + array |
| What gets computed | Only needed subproblems | All subproblems |
| Stack overflow risk | Yes (deep recursion) | No |
| Easier to write | Often yes (more intuitive) | Sometimes - depends on the problem |
| Space optimization | Harder | Easier (can often reduce to O(1)) |
When stuck on a DP problem, use this framework: (1) Define the state - what variables uniquely identify a subproblem? (2) Write the recurrence - how does the answer for state (i) relate to smaller states? (3) Identify base cases. (4) Determine the computation order (which states depend on which). (5) Code it up - start with memoization if the recurrence is clear, switch to tabulation for optimization.
Pattern 1: 1D Dynamic Programming
Climbing Stairs / Fibonacci Family
def climb_stairs(n):
"""Number of ways to climb n stairs (1 or 2 steps at a time).
Recurrence: dp[i] = dp[i-1] + dp[i-2]
This IS Fibonacci.
Time: O(n), Space: O(1)
"""
if n <= 2:
return n
prev2, prev1 = 1, 2
for _ in range(3, n + 1):
curr = prev1 + prev2
prev2 = prev1
prev1 = curr
return prev1
def min_cost_climbing(cost):
"""Minimum cost to reach the top. Can start from step 0 or 1.
ML context: Minimum-cost path through a pipeline where
each stage has a processing cost.
Recurrence: dp[i] = cost[i] + min(dp[i-1], dp[i-2])
Time: O(n), Space: O(1)
"""
n = len(cost)
if n <= 1:
return 0
prev2, prev1 = cost[0], cost[1]
for i in range(2, n):
curr = cost[i] + min(prev1, prev2)
prev2 = prev1
prev1 = curr
return min(prev1, prev2)
def house_robber(nums):
"""Maximum amount you can rob without robbing adjacent houses.
Recurrence: dp[i] = max(dp[i-1], dp[i-2] + nums[i])
Either skip this house (dp[i-1]) or rob it (dp[i-2] + nums[i]).
Time: O(n), Space: O(1)
"""
if not nums:
return 0
if len(nums) == 1:
return nums[0]
prev2, prev1 = nums[0], max(nums[0], nums[1])
for i in range(2, len(nums)):
curr = max(prev1, prev2 + nums[i])
prev2 = prev1
prev1 = curr
return prev1
Longest Increasing Subsequence (LIS)
def lis(nums):
"""Length of the longest increasing subsequence.
ML context: Finding the longest monotonically improving
sequence of model performance metrics.
Approach 1: DP - O(n^2)
dp[i] = length of LIS ending at index i
Recurrence: dp[i] = max(dp[j] + 1) for all j < i where nums[j] < nums[i]
Time: O(n^2), Space: O(n)
"""
if not nums:
return 0
n = len(nums)
dp = [1] * n # Each element is a subsequence of length 1
for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
def lis_optimal(nums):
"""LIS using patience sorting - O(n log n).
Maintain a list 'tails' where tails[i] is the smallest tail
of all increasing subsequences of length i+1.
Time: O(n log n), Space: O(n)
"""
import bisect
tails = []
for num in nums:
pos = bisect.bisect_left(tails, num)
if pos == len(tails):
tails.append(num)
else:
tails[pos] = num
return len(tails)
A very common DP mistake: confusing "subsequence" with "subarray." A subsequence can skip elements (e.g., [1, 3, 5] from [1, 2, 3, 4, 5]). A subarray must be contiguous (e.g., [2, 3, 4]). Longest Increasing Subsequence is DP. Maximum Subarray sum is Kadane's algorithm (greedy, not DP).
Pattern 2: 2D Dynamic Programming
Edit Distance (Levenshtein Distance)
def edit_distance(word1, word2):
"""Minimum number of operations (insert, delete, replace)
to convert word1 into word2.
ML context: String similarity for NLP - spell checking,
fuzzy matching, comparing model outputs to ground truth.
Also: sequence alignment in bioinformatics.
State: dp[i][j] = edit distance between word1[:i] and word2[:j]
Recurrence:
If word1[i-1] == word2[j-1]:
dp[i][j] = dp[i-1][j-1] # Characters match, no operation
Else:
dp[i][j] = 1 + min(
dp[i-1][j], # Delete from word1
dp[i][j-1], # Insert into word1
dp[i-1][j-1] # Replace
)
Time: O(m * n), Space: O(m * n), can be optimized to O(min(m, n))
"""
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
# Base cases: converting empty string to/from word
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j
for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
dp[i][j] = 1 + min(
dp[i - 1][j], # Delete
dp[i][j - 1], # Insert
dp[i - 1][j - 1] # Replace
)
return dp[m][n]
# Space-optimized to O(n)
def edit_distance_optimized(word1, word2):
"""Same algorithm, O(min(m,n)) space."""
if len(word1) < len(word2):
word1, word2 = word2, word1
m, n = len(word1), len(word2)
prev = list(range(n + 1))
curr = [0] * (n + 1)
for i in range(1, m + 1):
curr[0] = i
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
curr[j] = prev[j - 1]
else:
curr[j] = 1 + min(prev[j], curr[j - 1], prev[j - 1])
prev, curr = curr, prev
return prev[n]
Longest Common Subsequence (LCS)
def lcs(text1, text2):
"""Length of the longest common subsequence.
ML context: Measuring similarity between two sequences
(e.g., comparing predicted vs actual token sequences,
BLEU score computation involves LCS-like operations).
State: dp[i][j] = LCS length of text1[:i] and text2[:j]
Recurrence:
If text1[i-1] == text2[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
Else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Time: O(m * n), Space: O(m * n)
"""
m, n = len(text1), len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
return dp[m][n]
def lcs_with_backtrack(text1, text2):
"""Return the actual LCS string, not just its length."""
m, n = len(text1), len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
# Backtrack to find the actual subsequence
result = []
i, j = m, n
while i > 0 and j > 0:
if text1[i - 1] == text2[j - 1]:
result.append(text1[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1
return ''.join(reversed(result))
Pattern 3: Knapsack Problems
0/1 Knapsack
def knapsack_01(weights, values, capacity):
"""0/1 Knapsack - each item can be taken at most once.
ML context: Feature selection with a budget constraint -
each feature has a "cost" (computation time) and "value"
(predictive power). Select features to maximize value
within a compute budget.
State: dp[i][w] = max value using first i items with capacity w
Recurrence:
dp[i][w] = max(
dp[i-1][w], # Skip item i
dp[i-1][w - weights[i]] + values[i] # Take item i (if it fits)
)
Time: O(n * capacity), Space: O(n * capacity)
"""
n = len(weights)
dp = [[0] * (capacity + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
for w in range(capacity + 1):
dp[i][w] = dp[i - 1][w] # Skip item i
if weights[i - 1] <= w:
dp[i][w] = max(
dp[i][w],
dp[i - 1][w - weights[i - 1]] + values[i - 1]
)
return dp[n][capacity]
# Space-optimized to O(capacity)
def knapsack_01_optimized(weights, values, capacity):
"""O(capacity) space - iterate weights in reverse."""
dp = [0] * (capacity + 1)
for i in range(len(weights)):
for w in range(capacity, weights[i] - 1, -1): # Reverse!
dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
return dp[capacity]
Unbounded Knapsack (Coin Change)
def coin_change(coins, amount):
"""Minimum number of coins to make the amount.
Unbounded knapsack - each coin can be used unlimited times.
State: dp[a] = minimum coins to make amount a
Recurrence: dp[a] = min(dp[a - coin] + 1) for each coin
Time: O(amount * len(coins)), Space: O(amount)
"""
dp = [float('inf')] * (amount + 1)
dp[0] = 0
for a in range(1, amount + 1):
for coin in coins:
if coin <= a and dp[a - coin] != float('inf'):
dp[a] = min(dp[a], dp[a - coin] + 1)
return dp[amount] if dp[amount] != float('inf') else -1
def coin_change_count(coins, amount):
"""Count the number of ways to make the amount.
State: dp[a] = number of ways to make amount a
Recurrence: dp[a] += dp[a - coin] for each coin
(iterate coins in outer loop to avoid counting permutations)
Time: O(amount * len(coins)), Space: O(amount)
"""
dp = [0] * (amount + 1)
dp[0] = 1
for coin in coins: # Outer loop over coins - counts combinations
for a in range(coin, amount + 1):
dp[a] += dp[a - coin]
return dp[amount]
At Google and Meta, you will likely see knapsack-style problems phrased in ML terms: "Given N features with prediction gains and computation costs, select the subset that maximizes prediction gain within a latency budget." Recognizing this as 0/1 knapsack is the key insight.
Pattern 4: String DP
Palindrome Problems
def longest_palindrome_substring(s):
"""Find the longest palindromic substring.
Approach: Expand around center (not pure DP but often categorized here).
For each position, try both odd-length and even-length palindromes.
Time: O(n^2), Space: O(1)
"""
if not s:
return ""
start, max_len = 0, 1
def expand(left, right):
nonlocal start, max_len
while left >= 0 and right < len(s) and s[left] == s[right]:
if right - left + 1 > max_len:
start = left
max_len = right - left + 1
left -= 1
right += 1
for i in range(len(s)):
expand(i, i) # Odd-length palindromes
expand(i, i + 1) # Even-length palindromes
return s[start:start + max_len]
def count_palindromic_substrings(s):
"""Count the number of palindromic substrings.
Time: O(n^2), Space: O(1)
"""
count = 0
def expand(left, right):
nonlocal count
while left >= 0 and right < len(s) and s[left] == s[right]:
count += 1
left -= 1
right += 1
for i in range(len(s)):
expand(i, i)
expand(i, i + 1)
return count
Word Break
def word_break(s, word_dict):
"""Can string s be segmented into words from the dictionary?
ML context: Tokenization - can a string be segmented
into valid tokens from a vocabulary?
State: dp[i] = True if s[:i] can be segmented
Recurrence: dp[i] = any(dp[j] and s[j:i] in word_dict) for j in [0, i)
Time: O(n^2 * m) where m = max word length, Space: O(n)
"""
n = len(s)
word_set = set(word_dict)
dp = [False] * (n + 1)
dp[0] = True
for i in range(1, n + 1):
for j in range(i):
if dp[j] and s[j:i] in word_set:
dp[i] = True
break
return dp[n]
Pattern 5: DP in ML Algorithms
Viterbi Algorithm
def viterbi(observations, states, start_prob, trans_prob, emit_prob):
"""Viterbi algorithm - find the most likely sequence of hidden states.
This is THE example of DP in ML. Used in:
- Hidden Markov Models (HMMs) for POS tagging, speech recognition
- CRF decoding in NER (named entity recognition)
State: dp[t][s] = probability of the most likely path
ending in state s at time t
Recurrence:
dp[t][s] = max over all previous states s' of:
dp[t-1][s'] * trans_prob[s'][s] * emit_prob[s][observations[t]]
Time: O(T * S^2) where T = sequence length, S = number of states
Space: O(T * S)
"""
T = len(observations)
S = len(states)
# dp[t][s] = (probability, previous_state)
dp = [[None] * S for _ in range(T)]
# Initialize: first observation
for s in range(S):
dp[0][s] = (
start_prob[states[s]] * emit_prob[states[s]][observations[0]],
None
)
# Fill: subsequent observations
for t in range(1, T):
for s in range(S):
best_prob = -1
best_prev = -1
for s_prev in range(S):
prob = (dp[t - 1][s_prev][0] *
trans_prob[states[s_prev]][states[s]] *
emit_prob[states[s]][observations[t]])
if prob > best_prob:
best_prob = prob
best_prev = s_prev
dp[t][s] = (best_prob, best_prev)
# Backtrack: find the most likely path
# Find best final state
best_final = max(range(S), key=lambda s: dp[T - 1][s][0])
path = [states[best_final]]
current = best_final
for t in range(T - 1, 0, -1):
current = dp[t][current][1]
path.append(states[current])
return list(reversed(path)), dp[T - 1][best_final][0]
# Example usage:
# states = ['sunny', 'rainy']
# observations = ['walk', 'shop', 'clean']
# start_prob = {'sunny': 0.6, 'rainy': 0.4}
# trans_prob = {'sunny': {'sunny': 0.7, 'rainy': 0.3},
# 'rainy': {'sunny': 0.4, 'rainy': 0.6}}
# emit_prob = {'sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
# 'rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5}}
Beam Search as Width-Limited DP
def beam_search(initial_state, expand_fn, score_fn, beam_width, max_steps):
"""Beam search - a width-limited form of DP.
Used in sequence-to-sequence models (machine translation,
text generation) as an alternative to greedy decoding.
Instead of tracking ALL states (like Viterbi), track only
the top-K (beam_width) most promising partial sequences.
Time: O(max_steps * beam_width * branching_factor)
Space: O(beam_width * max_steps)
"""
import heapq
# Each beam entry: (negative_score, sequence)
# Negative because heapq is a min-heap
beams = [(-score_fn(initial_state), [initial_state])]
for step in range(max_steps):
candidates = []
for neg_score, sequence in beams:
current_state = sequence[-1]
# Expand current state to get successors
for next_state in expand_fn(current_state):
new_sequence = sequence + [next_state]
new_score = score_fn(new_sequence)
candidates.append((-new_score, new_sequence))
# Keep only top beam_width candidates
beams = heapq.nsmallest(beam_width, candidates)
# Check for termination (all beams ended)
if not beams:
break
# Return best sequence
best_neg_score, best_sequence = min(beams)
return best_sequence, -best_neg_score
Sequence Alignment (Needleman-Wunsch)
def needleman_wunsch(seq1, seq2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
"""Global sequence alignment using Needleman-Wunsch algorithm.
ML context: Protein/DNA sequence alignment in bioinformatics ML.
This is a generalization of edit distance with customizable scores.
Time: O(m * n), Space: O(m * n)
"""
m, n = len(seq1), len(seq2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
# Initialize gap penalties
for i in range(m + 1):
dp[i][0] = i * gap_penalty
for j in range(n + 1):
dp[0][j] = j * gap_penalty
# Fill table
for i in range(1, m + 1):
for j in range(1, n + 1):
score = match_score if seq1[i - 1] == seq2[j - 1] else mismatch_penalty
dp[i][j] = max(
dp[i - 1][j - 1] + score, # Match/mismatch
dp[i - 1][j] + gap_penalty, # Gap in seq2
dp[i][j - 1] + gap_penalty # Gap in seq1
)
# Backtrack to find alignment
alignment1, alignment2 = [], []
i, j = m, n
while i > 0 or j > 0:
if i > 0 and j > 0:
score = match_score if seq1[i - 1] == seq2[j - 1] else mismatch_penalty
if dp[i][j] == dp[i - 1][j - 1] + score:
alignment1.append(seq1[i - 1])
alignment2.append(seq2[j - 1])
i -= 1
j -= 1
continue
if i > 0 and dp[i][j] == dp[i - 1][j] + gap_penalty:
alignment1.append(seq1[i - 1])
alignment2.append('-')
i -= 1
else:
alignment1.append('-')
alignment2.append(seq2[j - 1])
j -= 1
return (''.join(reversed(alignment1)),
''.join(reversed(alignment2)),
dp[m][n])
If you are interviewing for a research engineer or bioinformatics ML role and cannot explain the connection between edit distance and sequence alignment, it is a major red flag. They are the same algorithm with different scoring schemes. Edit distance minimizes operations; Needleman-Wunsch maximizes alignment score.
The DP Problem-Solving Framework
Quick Decision Table
| Problem Pattern | State Definition | Recurrence Pattern | Time |
|---|---|---|---|
| Linear sequence | dp[i] = answer for first i elements | dp[i] depends on dp[i-1], dp[i-2], etc. | O(n) or O(n^2) |
| Two sequences | dp[i][j] = answer for seq1[:i], seq2[:j] | dp[i][j] depends on dp[i-1][j], dp[i][j-1], dp[i-1][j-1] | O(m*n) |
| Knapsack | dp[i][w] = answer using first i items with capacity w | dp[i][w] = max(skip, take) | O(n*W) |
| Interval | dp[i][j] = answer for subarray arr[i:j+1] | dp[i][j] depends on dp[i+1][j], dp[i][j-1], dp[i+1][j-1] | O(n^2) |
| Grid | dp[i][j] = answer at position (i, j) | dp[i][j] depends on dp[i-1][j], dp[i][j-1] | O(m*n) |
Practice Problems
Problem 1: Maximum Profit from Model Training
You are training a model and can choose to retrain or keep the current checkpoint at each epoch. Retraining costs c[i] compute and gives v[i] accuracy improvement. You have a total compute budget B. Maximize total accuracy improvement.
# Input: costs = [3, 4, 2, 5, 1], values = [2, 3, 1, 5, 2], budget = 7
# Output: 7 (select epochs with costs [3, 4] giving values [2, 5])
# Wait - that's costs [3,4] = 7, values [2,3] = 5
# Better: costs [2, 5] = 7, values [1, 5] = 6
# Even better: costs [3, 2, 1] = 6, values [2, 1, 2] = 5
# Best: costs [4, 2, 1] = 7, values [3, 1, 2] = 6
# Actually: costs [3, 2, 1] = 6 <= 7, values [2, 1, 2] = 5
# costs [4, 1] = 5, values [3, 2] = 5
# costs [5, 1] = 6, values [5, 2] = 7 <-- this is 7!
# Output: 7
Hint 1
Hint 2
Hint 3
Solution
def max_training_profit(costs, values, budget):
"""0/1 Knapsack - maximize accuracy improvement within compute budget.
Time: O(n * budget), Space: O(budget)
"""
dp = [0] * (budget + 1)
for i in range(len(costs)):
for b in range(budget, costs[i] - 1, -1):
dp[b] = max(dp[b], dp[b - costs[i]] + values[i])
return dp[budget]
Problem 2: Token Sequence Similarity (Edit Distance)
Given two token sequences (e.g., model output and ground truth), compute the minimum number of token insertions, deletions, and substitutions to convert one into the other.
# Input: seq1 = ["the", "cat", "sat", "on", "mat"]
# seq2 = ["the", "dog", "sat", "on", "the", "mat"]
# Output: 2 (substitute "cat"->"dog", insert "the")
Hint 1
Hint 2
Hint 3
Solution
def token_edit_distance(seq1, seq2):
"""Edit distance on token sequences.
Time: O(m * n), Space: O(n)
"""
m, n = len(seq1), len(seq2)
prev = list(range(n + 1))
curr = [0] * (n + 1)
for i in range(1, m + 1):
curr[0] = i
for j in range(1, n + 1):
if seq1[i - 1] == seq2[j - 1]:
curr[j] = prev[j - 1]
else:
curr[j] = 1 + min(prev[j], curr[j - 1], prev[j - 1])
prev, curr = curr, prev
return prev[n]
Problem 3: Decode Ways
A message containing letters A-Z is encoded as numbers (A=1, B=2, ..., Z=26). Given a string of digits, count the number of ways to decode it.
# Input: "226"
# Output: 3 ("BZ", "VF", "BBF")
# ML context: Analogous to CTC decoding where multiple
# alignments can produce the same output.
Hint 1
Hint 2
Hint 3
Solution
def num_decodings(s):
"""Count decoding ways for a digit string.
Time: O(n), Space: O(1)
"""
if not s or s[0] == '0':
return 0
n = len(s)
prev2 = 1 # dp[i-2]: empty string has 1 way
prev1 = 1 # dp[i-1]: single valid digit has 1 way
for i in range(1, n):
curr = 0
# Single digit decode
if s[i] != '0':
curr += prev1
# Two digit decode
two_digit = int(s[i - 1:i + 1])
if 10 <= two_digit <= 26:
curr += prev2
prev2 = prev1
prev1 = curr
return prev1
Problem 4: Unique Paths in a Grid
A robot starts at (0,0) and wants to reach (m-1, n-1). It can only move right or down. Count the number of unique paths.
# Input: m = 3, n = 7
# Output: 28
Hint 1
Hint 2
Hint 3
Solution
def unique_paths(m, n):
"""Count unique paths in a grid.
Time: O(m * n), Space: O(n)
"""
dp = [1] * n
for i in range(1, m):
for j in range(1, n):
dp[j] += dp[j - 1]
return dp[n - 1]
Problem 5: Longest Common Subsequence of Token Sequences
Given two tokenized text sequences, find the length of their longest common subsequence and return the actual subsequence.
# Input: tokens1 = ["the", "quick", "brown", "fox", "jumps"]
# tokens2 = ["the", "slow", "brown", "fox", "runs"]
# Output: 3, ["the", "brown", "fox"]
Hint 1
Hint 2
Hint 3
Solution
def lcs_tokens(tokens1, tokens2):
"""LCS for token sequences with backtracking.
Time: O(m * n), Space: O(m * n)
"""
m, n = len(tokens1), len(tokens2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if tokens1[i - 1] == tokens2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
# Backtrack
result = []
i, j = m, n
while i > 0 and j > 0:
if tokens1[i - 1] == tokens2[j - 1]:
result.append(tokens1[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1
result.reverse()
return dp[m][n], result
Problem 6: Partition Equal Subset Sum
Given an array of positive integers, determine if the array can be partitioned into two subsets with equal sum.
# Input: [1, 5, 11, 5]
# Output: True (subsets [1, 5, 5] and [11])
# ML context: Balanced data splitting - can you partition a dataset
# into two groups with equal total weight?
Hint 1
Hint 2
Hint 3
Solution
def can_partition(nums):
"""Check if array can be partitioned into two equal-sum subsets.
Time: O(n * sum/2), Space: O(sum/2)
"""
total = sum(nums)
if total % 2 != 0:
return False
target = total // 2
dp = [False] * (target + 1)
dp[0] = True
for num in nums:
for j in range(target, num - 1, -1):
dp[j] = dp[j] or dp[j - num]
return dp[target]
Interview Cheat Sheet
| Pattern | When to Use | Key Recurrence |
|---|---|---|
| 1D DP (Fibonacci) | Linear sequence, each step depends on prev 1-2 | dp[i] = f(dp[i-1], dp[i-2]) |
| LIS | Longest increasing subsequence | dp[i] = max(dp[j]+1) for j < i, arr[j] < arr[i] |
| Edit Distance | String/sequence transformation | dp[i][j] = min(delete, insert, replace) |
| LCS | Longest common subsequence | dp[i][j] = dp[i-1][j-1]+1 if match, else max(skip) |
| 0/1 Knapsack | Select items with weight/value | dp[i][w] = max(skip, take) |
| Coin Change | Minimum coins for target | dp[a] = min(dp[a-coin]+1) for each coin |
| Grid Paths | Count/optimize paths in 2D grid | dp[i][j] = dp[i-1][j] + dp[i][j-1] |
| Viterbi | Most likely hidden state sequence | dp[t][s] = max(dp[t-1][s'] * transition * emission) |
| Word Break | Can string be segmented? | dp[i] = any(dp[j] and s[j:i] in dict) |
Spaced Repetition Checkpoints
| Day | Review Activity | Time |
|---|---|---|
| Day 0 | Read all patterns. Implement edit distance and coin change from scratch. | 40 min |
| Day 3 | Implement LCS with backtracking and 0/1 knapsack without looking at solutions. | 25 min |
| Day 7 | Solve 3 new DP problems from LeetCode (one 1D, one 2D, one knapsack variant). | 45 min |
| Day 14 | Implement the Viterbi algorithm from scratch with a small example. Verify manually. | 30 min |
| Day 21 | Without reference, classify 5 DP problems by pattern (1D, 2D, knapsack, string). Solve one from each category. | 60 min |
Next Steps
Dynamic programming handles optimization over sequences. Next, learn Sorting and Searching to master binary search variations, top-K selection, and sorting algorithms critical for data-heavy ML pipelines.
