Skip to main content

Dynamic Programming

Memoization vs tabulation, common DP patterns (knapsack, LCS, edit distance), and DP in ML contexts (Viterbi, CTC, beam search) - with practice problems and full Python solutions.

Reading time: ~40 min | Interview relevance: High | Roles: MLE, Research Engineer, AI Engineer

The Real Interview Moment

You are in the final round at a top AI research lab. The interviewer, a research scientist who published three NeurIPS papers last year, leans back in her chair and says: "Let us do something different from the usual LeetCode fare. I want you to implement the Viterbi algorithm for a simple Hidden Markov Model. Given transition probabilities, emission probabilities, and a sequence of observations, find the most likely sequence of hidden states."

You feel a jolt of recognition. This is dynamic programming - the same principle behind edit distance and longest common subsequence, but applied to a core ML algorithm. The state is (position in sequence, current hidden state). The recurrence is: the best way to be in state s at position t is the best way to be in some state s' at position t-1, multiplied by the transition probability from s' to s and the emission probability of the observation at t given state s.

You write it cleanly. The interviewer nods: "Good. Now, what is the relationship between Viterbi and beam search in sequence-to-sequence models?" You explain that beam search is essentially a width-limited version of the same principle - instead of tracking the best path to every state, you track only the top-K paths.

Dynamic programming is not just a coding interview pattern. It is the algorithmic backbone of sequence modeling, reinforcement learning, and optimization. This chapter teaches you both the interview patterns and the ML connections.

The Core Idea

Dynamic programming solves problems by breaking them into overlapping subproblems and storing the results to avoid recomputation.

When to Use Dynamic Programming

Two Properties Required for DP

  1. Optimal substructure - The optimal solution to the problem can be constructed from optimal solutions to its subproblems
  2. Overlapping subproblems - The same subproblems are solved multiple times

Memoization vs Tabulation

# ============ TOP-DOWN (MEMOIZATION) ============
# Start from the big problem, recurse, cache results

def fib_memo(n, memo={}):
"""Fibonacci with memoization.
Time: O(n), Space: O(n)
"""
if n in memo:
return memo[n]
if n <= 1:
return n
memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
return memo[n]

# Better approach using functools
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_lru(n):
"""Fibonacci with lru_cache - cleaner memoization."""
if n <= 1:
return n
return fib_lru(n - 1) + fib_lru(n - 2)

# ============ BOTTOM-UP (TABULATION) ============
# Start from smallest subproblems, build up to the answer

def fib_tab(n):
"""Fibonacci with tabulation.
Time: O(n), Space: O(n)
"""
if n <= 1:
return n
dp = [0] * (n + 1)
dp[1] = 1
for i in range(2, n + 1):
dp[i] = dp[i - 1] + dp[i - 2]
return dp[n]

# Space-optimized - only need last two values
def fib_optimal(n):
"""Fibonacci with O(1) space.
Time: O(n), Space: O(1)
"""
if n <= 1:
return n
prev2, prev1 = 0, 1
for _ in range(2, n + 1):
curr = prev1 + prev2
prev2 = prev1
prev1 = curr
return prev1

Memoization vs Tabulation Comparison

AspectMemoization (Top-Down)Tabulation (Bottom-Up)
DirectionStart from target, recurse downStart from base cases, build up
ImplementationRecursive + cacheIterative + array
What gets computedOnly needed subproblemsAll subproblems
Stack overflow riskYes (deep recursion)No
Easier to writeOften yes (more intuitive)Sometimes - depends on the problem
Space optimizationHarderEasier (can often reduce to O(1))
60-Second Answer

When stuck on a DP problem, use this framework: (1) Define the state - what variables uniquely identify a subproblem? (2) Write the recurrence - how does the answer for state (i) relate to smaller states? (3) Identify base cases. (4) Determine the computation order (which states depend on which). (5) Code it up - start with memoization if the recurrence is clear, switch to tabulation for optimization.

Pattern 1: 1D Dynamic Programming

Climbing Stairs / Fibonacci Family

def climb_stairs(n):
"""Number of ways to climb n stairs (1 or 2 steps at a time).

Recurrence: dp[i] = dp[i-1] + dp[i-2]
This IS Fibonacci.

Time: O(n), Space: O(1)
"""
if n <= 2:
return n
prev2, prev1 = 1, 2
for _ in range(3, n + 1):
curr = prev1 + prev2
prev2 = prev1
prev1 = curr
return prev1

def min_cost_climbing(cost):
"""Minimum cost to reach the top. Can start from step 0 or 1.

ML context: Minimum-cost path through a pipeline where
each stage has a processing cost.

Recurrence: dp[i] = cost[i] + min(dp[i-1], dp[i-2])

Time: O(n), Space: O(1)
"""
n = len(cost)
if n <= 1:
return 0

prev2, prev1 = cost[0], cost[1]
for i in range(2, n):
curr = cost[i] + min(prev1, prev2)
prev2 = prev1
prev1 = curr
return min(prev1, prev2)

def house_robber(nums):
"""Maximum amount you can rob without robbing adjacent houses.

Recurrence: dp[i] = max(dp[i-1], dp[i-2] + nums[i])
Either skip this house (dp[i-1]) or rob it (dp[i-2] + nums[i]).

Time: O(n), Space: O(1)
"""
if not nums:
return 0
if len(nums) == 1:
return nums[0]

prev2, prev1 = nums[0], max(nums[0], nums[1])
for i in range(2, len(nums)):
curr = max(prev1, prev2 + nums[i])
prev2 = prev1
prev1 = curr
return prev1

Longest Increasing Subsequence (LIS)

def lis(nums):
"""Length of the longest increasing subsequence.

ML context: Finding the longest monotonically improving
sequence of model performance metrics.

Approach 1: DP - O(n^2)
dp[i] = length of LIS ending at index i
Recurrence: dp[i] = max(dp[j] + 1) for all j < i where nums[j] < nums[i]

Time: O(n^2), Space: O(n)
"""
if not nums:
return 0
n = len(nums)
dp = [1] * n # Each element is a subsequence of length 1

for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
dp[i] = max(dp[i], dp[j] + 1)

return max(dp)

def lis_optimal(nums):
"""LIS using patience sorting - O(n log n).

Maintain a list 'tails' where tails[i] is the smallest tail
of all increasing subsequences of length i+1.

Time: O(n log n), Space: O(n)
"""
import bisect
tails = []
for num in nums:
pos = bisect.bisect_left(tails, num)
if pos == len(tails):
tails.append(num)
else:
tails[pos] = num
return len(tails)
Common Trap

A very common DP mistake: confusing "subsequence" with "subarray." A subsequence can skip elements (e.g., [1, 3, 5] from [1, 2, 3, 4, 5]). A subarray must be contiguous (e.g., [2, 3, 4]). Longest Increasing Subsequence is DP. Maximum Subarray sum is Kadane's algorithm (greedy, not DP).

Pattern 2: 2D Dynamic Programming

Edit Distance (Levenshtein Distance)

def edit_distance(word1, word2):
"""Minimum number of operations (insert, delete, replace)
to convert word1 into word2.

ML context: String similarity for NLP - spell checking,
fuzzy matching, comparing model outputs to ground truth.
Also: sequence alignment in bioinformatics.

State: dp[i][j] = edit distance between word1[:i] and word2[:j]

Recurrence:
If word1[i-1] == word2[j-1]:
dp[i][j] = dp[i-1][j-1] # Characters match, no operation
Else:
dp[i][j] = 1 + min(
dp[i-1][j], # Delete from word1
dp[i][j-1], # Insert into word1
dp[i-1][j-1] # Replace
)

Time: O(m * n), Space: O(m * n), can be optimized to O(min(m, n))
"""
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]

# Base cases: converting empty string to/from word
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j

for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
dp[i][j] = 1 + min(
dp[i - 1][j], # Delete
dp[i][j - 1], # Insert
dp[i - 1][j - 1] # Replace
)

return dp[m][n]

# Space-optimized to O(n)
def edit_distance_optimized(word1, word2):
"""Same algorithm, O(min(m,n)) space."""
if len(word1) < len(word2):
word1, word2 = word2, word1
m, n = len(word1), len(word2)

prev = list(range(n + 1))
curr = [0] * (n + 1)

for i in range(1, m + 1):
curr[0] = i
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
curr[j] = prev[j - 1]
else:
curr[j] = 1 + min(prev[j], curr[j - 1], prev[j - 1])
prev, curr = curr, prev

return prev[n]

Longest Common Subsequence (LCS)

def lcs(text1, text2):
"""Length of the longest common subsequence.

ML context: Measuring similarity between two sequences
(e.g., comparing predicted vs actual token sequences,
BLEU score computation involves LCS-like operations).

State: dp[i][j] = LCS length of text1[:i] and text2[:j]

Recurrence:
If text1[i-1] == text2[j-1]:
dp[i][j] = dp[i-1][j-1] + 1
Else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])

Time: O(m * n), Space: O(m * n)
"""
m, n = len(text1), len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]

for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

return dp[m][n]

def lcs_with_backtrack(text1, text2):
"""Return the actual LCS string, not just its length."""
m, n = len(text1), len(text2)
dp = [[0] * (n + 1) for _ in range(m + 1)]

for i in range(1, m + 1):
for j in range(1, n + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

# Backtrack to find the actual subsequence
result = []
i, j = m, n
while i > 0 and j > 0:
if text1[i - 1] == text2[j - 1]:
result.append(text1[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1

return ''.join(reversed(result))

Pattern 3: Knapsack Problems

0/1 Knapsack

def knapsack_01(weights, values, capacity):
"""0/1 Knapsack - each item can be taken at most once.

ML context: Feature selection with a budget constraint -
each feature has a "cost" (computation time) and "value"
(predictive power). Select features to maximize value
within a compute budget.

State: dp[i][w] = max value using first i items with capacity w

Recurrence:
dp[i][w] = max(
dp[i-1][w], # Skip item i
dp[i-1][w - weights[i]] + values[i] # Take item i (if it fits)
)

Time: O(n * capacity), Space: O(n * capacity)
"""
n = len(weights)
dp = [[0] * (capacity + 1) for _ in range(n + 1)]

for i in range(1, n + 1):
for w in range(capacity + 1):
dp[i][w] = dp[i - 1][w] # Skip item i
if weights[i - 1] <= w:
dp[i][w] = max(
dp[i][w],
dp[i - 1][w - weights[i - 1]] + values[i - 1]
)

return dp[n][capacity]

# Space-optimized to O(capacity)
def knapsack_01_optimized(weights, values, capacity):
"""O(capacity) space - iterate weights in reverse."""
dp = [0] * (capacity + 1)
for i in range(len(weights)):
for w in range(capacity, weights[i] - 1, -1): # Reverse!
dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
return dp[capacity]

Unbounded Knapsack (Coin Change)

def coin_change(coins, amount):
"""Minimum number of coins to make the amount.
Unbounded knapsack - each coin can be used unlimited times.

State: dp[a] = minimum coins to make amount a

Recurrence: dp[a] = min(dp[a - coin] + 1) for each coin

Time: O(amount * len(coins)), Space: O(amount)
"""
dp = [float('inf')] * (amount + 1)
dp[0] = 0

for a in range(1, amount + 1):
for coin in coins:
if coin <= a and dp[a - coin] != float('inf'):
dp[a] = min(dp[a], dp[a - coin] + 1)

return dp[amount] if dp[amount] != float('inf') else -1

def coin_change_count(coins, amount):
"""Count the number of ways to make the amount.

State: dp[a] = number of ways to make amount a

Recurrence: dp[a] += dp[a - coin] for each coin
(iterate coins in outer loop to avoid counting permutations)

Time: O(amount * len(coins)), Space: O(amount)
"""
dp = [0] * (amount + 1)
dp[0] = 1

for coin in coins: # Outer loop over coins - counts combinations
for a in range(coin, amount + 1):
dp[a] += dp[a - coin]

return dp[amount]
Company Variation

At Google and Meta, you will likely see knapsack-style problems phrased in ML terms: "Given N features with prediction gains and computation costs, select the subset that maximizes prediction gain within a latency budget." Recognizing this as 0/1 knapsack is the key insight.

Pattern 4: String DP

Palindrome Problems

def longest_palindrome_substring(s):
"""Find the longest palindromic substring.

Approach: Expand around center (not pure DP but often categorized here).
For each position, try both odd-length and even-length palindromes.

Time: O(n^2), Space: O(1)
"""
if not s:
return ""

start, max_len = 0, 1

def expand(left, right):
nonlocal start, max_len
while left >= 0 and right < len(s) and s[left] == s[right]:
if right - left + 1 > max_len:
start = left
max_len = right - left + 1
left -= 1
right += 1

for i in range(len(s)):
expand(i, i) # Odd-length palindromes
expand(i, i + 1) # Even-length palindromes

return s[start:start + max_len]

def count_palindromic_substrings(s):
"""Count the number of palindromic substrings.
Time: O(n^2), Space: O(1)
"""
count = 0

def expand(left, right):
nonlocal count
while left >= 0 and right < len(s) and s[left] == s[right]:
count += 1
left -= 1
right += 1

for i in range(len(s)):
expand(i, i)
expand(i, i + 1)

return count

Word Break

def word_break(s, word_dict):
"""Can string s be segmented into words from the dictionary?

ML context: Tokenization - can a string be segmented
into valid tokens from a vocabulary?

State: dp[i] = True if s[:i] can be segmented

Recurrence: dp[i] = any(dp[j] and s[j:i] in word_dict) for j in [0, i)

Time: O(n^2 * m) where m = max word length, Space: O(n)
"""
n = len(s)
word_set = set(word_dict)
dp = [False] * (n + 1)
dp[0] = True

for i in range(1, n + 1):
for j in range(i):
if dp[j] and s[j:i] in word_set:
dp[i] = True
break

return dp[n]

Pattern 5: DP in ML Algorithms

Viterbi Algorithm

def viterbi(observations, states, start_prob, trans_prob, emit_prob):
"""Viterbi algorithm - find the most likely sequence of hidden states.

This is THE example of DP in ML. Used in:
- Hidden Markov Models (HMMs) for POS tagging, speech recognition
- CRF decoding in NER (named entity recognition)

State: dp[t][s] = probability of the most likely path
ending in state s at time t

Recurrence:
dp[t][s] = max over all previous states s' of:
dp[t-1][s'] * trans_prob[s'][s] * emit_prob[s][observations[t]]

Time: O(T * S^2) where T = sequence length, S = number of states
Space: O(T * S)
"""
T = len(observations)
S = len(states)

# dp[t][s] = (probability, previous_state)
dp = [[None] * S for _ in range(T)]

# Initialize: first observation
for s in range(S):
dp[0][s] = (
start_prob[states[s]] * emit_prob[states[s]][observations[0]],
None
)

# Fill: subsequent observations
for t in range(1, T):
for s in range(S):
best_prob = -1
best_prev = -1
for s_prev in range(S):
prob = (dp[t - 1][s_prev][0] *
trans_prob[states[s_prev]][states[s]] *
emit_prob[states[s]][observations[t]])
if prob > best_prob:
best_prob = prob
best_prev = s_prev
dp[t][s] = (best_prob, best_prev)

# Backtrack: find the most likely path
# Find best final state
best_final = max(range(S), key=lambda s: dp[T - 1][s][0])
path = [states[best_final]]

current = best_final
for t in range(T - 1, 0, -1):
current = dp[t][current][1]
path.append(states[current])

return list(reversed(path)), dp[T - 1][best_final][0]

# Example usage:
# states = ['sunny', 'rainy']
# observations = ['walk', 'shop', 'clean']
# start_prob = {'sunny': 0.6, 'rainy': 0.4}
# trans_prob = {'sunny': {'sunny': 0.7, 'rainy': 0.3},
# 'rainy': {'sunny': 0.4, 'rainy': 0.6}}
# emit_prob = {'sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
# 'rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5}}

Beam Search as Width-Limited DP

def beam_search(initial_state, expand_fn, score_fn, beam_width, max_steps):
"""Beam search - a width-limited form of DP.

Used in sequence-to-sequence models (machine translation,
text generation) as an alternative to greedy decoding.

Instead of tracking ALL states (like Viterbi), track only
the top-K (beam_width) most promising partial sequences.

Time: O(max_steps * beam_width * branching_factor)
Space: O(beam_width * max_steps)
"""
import heapq

# Each beam entry: (negative_score, sequence)
# Negative because heapq is a min-heap
beams = [(-score_fn(initial_state), [initial_state])]

for step in range(max_steps):
candidates = []
for neg_score, sequence in beams:
current_state = sequence[-1]
# Expand current state to get successors
for next_state in expand_fn(current_state):
new_sequence = sequence + [next_state]
new_score = score_fn(new_sequence)
candidates.append((-new_score, new_sequence))

# Keep only top beam_width candidates
beams = heapq.nsmallest(beam_width, candidates)

# Check for termination (all beams ended)
if not beams:
break

# Return best sequence
best_neg_score, best_sequence = min(beams)
return best_sequence, -best_neg_score

Sequence Alignment (Needleman-Wunsch)

def needleman_wunsch(seq1, seq2, match_score=1, mismatch_penalty=-1, gap_penalty=-2):
"""Global sequence alignment using Needleman-Wunsch algorithm.

ML context: Protein/DNA sequence alignment in bioinformatics ML.
This is a generalization of edit distance with customizable scores.

Time: O(m * n), Space: O(m * n)
"""
m, n = len(seq1), len(seq2)
dp = [[0] * (n + 1) for _ in range(m + 1)]

# Initialize gap penalties
for i in range(m + 1):
dp[i][0] = i * gap_penalty
for j in range(n + 1):
dp[0][j] = j * gap_penalty

# Fill table
for i in range(1, m + 1):
for j in range(1, n + 1):
score = match_score if seq1[i - 1] == seq2[j - 1] else mismatch_penalty
dp[i][j] = max(
dp[i - 1][j - 1] + score, # Match/mismatch
dp[i - 1][j] + gap_penalty, # Gap in seq2
dp[i][j - 1] + gap_penalty # Gap in seq1
)

# Backtrack to find alignment
alignment1, alignment2 = [], []
i, j = m, n
while i > 0 or j > 0:
if i > 0 and j > 0:
score = match_score if seq1[i - 1] == seq2[j - 1] else mismatch_penalty
if dp[i][j] == dp[i - 1][j - 1] + score:
alignment1.append(seq1[i - 1])
alignment2.append(seq2[j - 1])
i -= 1
j -= 1
continue
if i > 0 and dp[i][j] == dp[i - 1][j] + gap_penalty:
alignment1.append(seq1[i - 1])
alignment2.append('-')
i -= 1
else:
alignment1.append('-')
alignment2.append(seq2[j - 1])
j -= 1

return (''.join(reversed(alignment1)),
''.join(reversed(alignment2)),
dp[m][n])
Instant Rejection

If you are interviewing for a research engineer or bioinformatics ML role and cannot explain the connection between edit distance and sequence alignment, it is a major red flag. They are the same algorithm with different scoring schemes. Edit distance minimizes operations; Needleman-Wunsch maximizes alignment score.

The DP Problem-Solving Framework

The DP Problem-Solving Framework

Quick Decision Table

Problem PatternState DefinitionRecurrence PatternTime
Linear sequencedp[i] = answer for first i elementsdp[i] depends on dp[i-1], dp[i-2], etc.O(n) or O(n^2)
Two sequencesdp[i][j] = answer for seq1[:i], seq2[:j]dp[i][j] depends on dp[i-1][j], dp[i][j-1], dp[i-1][j-1]O(m*n)
Knapsackdp[i][w] = answer using first i items with capacity wdp[i][w] = max(skip, take)O(n*W)
Intervaldp[i][j] = answer for subarray arr[i:j+1]dp[i][j] depends on dp[i+1][j], dp[i][j-1], dp[i+1][j-1]O(n^2)
Griddp[i][j] = answer at position (i, j)dp[i][j] depends on dp[i-1][j], dp[i][j-1]O(m*n)

Practice Problems

Problem 1: Maximum Profit from Model Training

You are training a model and can choose to retrain or keep the current checkpoint at each epoch. Retraining costs c[i] compute and gives v[i] accuracy improvement. You have a total compute budget B. Maximize total accuracy improvement.

# Input: costs = [3, 4, 2, 5, 1], values = [2, 3, 1, 5, 2], budget = 7
# Output: 7 (select epochs with costs [3, 4] giving values [2, 5])
# Wait - that's costs [3,4] = 7, values [2,3] = 5
# Better: costs [2, 5] = 7, values [1, 5] = 6
# Even better: costs [3, 2, 1] = 6, values [2, 1, 2] = 5
# Best: costs [4, 2, 1] = 7, values [3, 1, 2] = 6
# Actually: costs [3, 2, 1] = 6 <= 7, values [2, 1, 2] = 5
# costs [4, 1] = 5, values [3, 2] = 5
# costs [5, 1] = 6, values [5, 2] = 7 <-- this is 7!
# Output: 7
Hint 1
This is a 0/1 knapsack problem. Each epoch is an "item" with a weight (cost) and value (accuracy improvement).
Hint 2
State: dp[i][b] = max accuracy improvement using first i epochs with budget b.
Hint 3
Recurrence: dp[i][b] = max(dp[i-1][b], dp[i-1][b-costs[i]] + values[i]).
Solution
def max_training_profit(costs, values, budget):
"""0/1 Knapsack - maximize accuracy improvement within compute budget.
Time: O(n * budget), Space: O(budget)
"""
dp = [0] * (budget + 1)
for i in range(len(costs)):
for b in range(budget, costs[i] - 1, -1):
dp[b] = max(dp[b], dp[b - costs[i]] + values[i])
return dp[budget]

Problem 2: Token Sequence Similarity (Edit Distance)

Given two token sequences (e.g., model output and ground truth), compute the minimum number of token insertions, deletions, and substitutions to convert one into the other.

# Input: seq1 = ["the", "cat", "sat", "on", "mat"]
# seq2 = ["the", "dog", "sat", "on", "the", "mat"]
# Output: 2 (substitute "cat"->"dog", insert "the")
Hint 1
This is edit distance but on tokens instead of characters.
Hint 2
The algorithm is identical. State: dp[i][j] = edit distance between seq1[:i] and seq2[:j].
Hint 3
Compare tokens for equality instead of characters. Everything else stays the same.
Solution
def token_edit_distance(seq1, seq2):
"""Edit distance on token sequences.
Time: O(m * n), Space: O(n)
"""
m, n = len(seq1), len(seq2)
prev = list(range(n + 1))
curr = [0] * (n + 1)

for i in range(1, m + 1):
curr[0] = i
for j in range(1, n + 1):
if seq1[i - 1] == seq2[j - 1]:
curr[j] = prev[j - 1]
else:
curr[j] = 1 + min(prev[j], curr[j - 1], prev[j - 1])
prev, curr = curr, prev

return prev[n]

Problem 3: Decode Ways

A message containing letters A-Z is encoded as numbers (A=1, B=2, ..., Z=26). Given a string of digits, count the number of ways to decode it.

# Input: "226"
# Output: 3 ("BZ", "VF", "BBF")

# ML context: Analogous to CTC decoding where multiple
# alignments can produce the same output.
Hint 1
At each position, you can decode one digit (1-9) or two digits (10-26).
Hint 2
dp[i] = number of ways to decode s[:i]. dp[i] = dp[i-1] (if single digit valid) + dp[i-2] (if two digits valid).
Hint 3
Be careful with '0' - it cannot be decoded alone, only as part of "10" or "20".
Solution
def num_decodings(s):
"""Count decoding ways for a digit string.
Time: O(n), Space: O(1)
"""
if not s or s[0] == '0':
return 0

n = len(s)
prev2 = 1 # dp[i-2]: empty string has 1 way
prev1 = 1 # dp[i-1]: single valid digit has 1 way

for i in range(1, n):
curr = 0
# Single digit decode
if s[i] != '0':
curr += prev1
# Two digit decode
two_digit = int(s[i - 1:i + 1])
if 10 <= two_digit <= 26:
curr += prev2
prev2 = prev1
prev1 = curr

return prev1

Problem 4: Unique Paths in a Grid

A robot starts at (0,0) and wants to reach (m-1, n-1). It can only move right or down. Count the number of unique paths.

# Input: m = 3, n = 7
# Output: 28
Hint 1
dp[i][j] = number of unique paths to reach (i, j). The robot can come from (i-1, j) or (i, j-1).
Hint 2
Recurrence: dp[i][j] = dp[i-1][j] + dp[i][j-1]. Base case: first row and first column are all 1s.
Hint 3
Space optimization: you only need the previous row, so use a 1D array.
Solution
def unique_paths(m, n):
"""Count unique paths in a grid.
Time: O(m * n), Space: O(n)
"""
dp = [1] * n
for i in range(1, m):
for j in range(1, n):
dp[j] += dp[j - 1]
return dp[n - 1]

Problem 5: Longest Common Subsequence of Token Sequences

Given two tokenized text sequences, find the length of their longest common subsequence and return the actual subsequence.

# Input: tokens1 = ["the", "quick", "brown", "fox", "jumps"]
# tokens2 = ["the", "slow", "brown", "fox", "runs"]
# Output: 3, ["the", "brown", "fox"]
Hint 1
Standard LCS algorithm, applied to lists of tokens instead of strings.
Hint 2
Build the DP table, then backtrack to find the actual subsequence.
Hint 3
When text1[i-1] == text2[j-1], add to LCS and move diagonally. Otherwise, move in the direction of the larger DP value.
Solution
def lcs_tokens(tokens1, tokens2):
"""LCS for token sequences with backtracking.
Time: O(m * n), Space: O(m * n)
"""
m, n = len(tokens1), len(tokens2)
dp = [[0] * (n + 1) for _ in range(m + 1)]

for i in range(1, m + 1):
for j in range(1, n + 1):
if tokens1[i - 1] == tokens2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

# Backtrack
result = []
i, j = m, n
while i > 0 and j > 0:
if tokens1[i - 1] == tokens2[j - 1]:
result.append(tokens1[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
i -= 1
else:
j -= 1

result.reverse()
return dp[m][n], result

Problem 6: Partition Equal Subset Sum

Given an array of positive integers, determine if the array can be partitioned into two subsets with equal sum.

# Input: [1, 5, 11, 5]
# Output: True (subsets [1, 5, 5] and [11])

# ML context: Balanced data splitting - can you partition a dataset
# into two groups with equal total weight?
Hint 1
If the total sum is odd, return False immediately. Otherwise, the problem reduces to: can you find a subset that sums to total/2?
Hint 2
This is a 0/1 knapsack problem where the "capacity" is total/2.
Hint 3
dp[j] = True if a subset sum of j is achievable. Iterate items, update dp in reverse order.
Solution
def can_partition(nums):
"""Check if array can be partitioned into two equal-sum subsets.
Time: O(n * sum/2), Space: O(sum/2)
"""
total = sum(nums)
if total % 2 != 0:
return False
target = total // 2

dp = [False] * (target + 1)
dp[0] = True

for num in nums:
for j in range(target, num - 1, -1):
dp[j] = dp[j] or dp[j - num]

return dp[target]

Interview Cheat Sheet

PatternWhen to UseKey Recurrence
1D DP (Fibonacci)Linear sequence, each step depends on prev 1-2dp[i] = f(dp[i-1], dp[i-2])
LISLongest increasing subsequencedp[i] = max(dp[j]+1) for j < i, arr[j] < arr[i]
Edit DistanceString/sequence transformationdp[i][j] = min(delete, insert, replace)
LCSLongest common subsequencedp[i][j] = dp[i-1][j-1]+1 if match, else max(skip)
0/1 KnapsackSelect items with weight/valuedp[i][w] = max(skip, take)
Coin ChangeMinimum coins for targetdp[a] = min(dp[a-coin]+1) for each coin
Grid PathsCount/optimize paths in 2D griddp[i][j] = dp[i-1][j] + dp[i][j-1]
ViterbiMost likely hidden state sequencedp[t][s] = max(dp[t-1][s'] * transition * emission)
Word BreakCan string be segmented?dp[i] = any(dp[j] and s[j:i] in dict)

Spaced Repetition Checkpoints

DayReview ActivityTime
Day 0Read all patterns. Implement edit distance and coin change from scratch.40 min
Day 3Implement LCS with backtracking and 0/1 knapsack without looking at solutions.25 min
Day 7Solve 3 new DP problems from LeetCode (one 1D, one 2D, one knapsack variant).45 min
Day 14Implement the Viterbi algorithm from scratch with a small example. Verify manually.30 min
Day 21Without reference, classify 5 DP problems by pattern (1D, 2D, knapsack, string). Solve one from each category.60 min

Next Steps

Dynamic programming handles optimization over sequences. Next, learn Sorting and Searching to master binary search variations, top-K selection, and sorting algorithms critical for data-heavy ML pipelines.

© 2026 EngineersOfAI. All rights reserved.