Skip to main content

Arrays and Strings

Two pointers, sliding window, prefix sums, string manipulation, and matrix operations - with ML-relevant practice problems and full Python solutions.

Reading time: ~40 min | Interview relevance: Very High | Roles: All AI/ML roles

The Real Interview Moment

You are in the second round at a Series B AI startup. The interviewer, a staff engineer who built their recommendation system, pulls up a shared editor and says: "Here is the scenario. We have a stream of model confidence scores from our production inference pipeline. We need to compute the moving average over a window of size K, and also flag any window where the max score exceeds a threshold. Can you write a function to do both in a single pass?"

You recognize this immediately. It is a sliding window problem - one of the most common patterns in coding interviews, and one that maps directly to real ML monitoring tasks. You know that a naive approach recalculates the sum and max for each window from scratch (O(nK)), but there are O(n) approaches using running sums and a deque.

You start writing. The interviewer nods as you explain your approach. Within fifteen minutes, you have a clean solution with proper edge case handling. The follow-up - "What if the window size changes dynamically?" - does not faze you because you understand the underlying pattern, not just the specific problem.

This chapter teaches you the patterns that make array and string problems feel automatic.

Pattern 1: Two Pointers

The two-pointer technique uses two indices that move through an array, typically one from each end or both from the start at different speeds.

When to Use Two Pointers

  • The array is sorted (or can be sorted without violating the problem constraints)
  • You need to find pairs or subarrays with some property
  • You need to partition or rearrange elements in place

Two Pointer Variants

Core Implementation

# Pattern 1a: Opposite-direction two pointers
# Use when: sorted array, find pair with target sum
def two_sum_sorted(arr, target):
"""Find two numbers in sorted array that sum to target.
Time: O(n), Space: O(1)
"""
left, right = 0, len(arr) - 1
while left < right:
current_sum = arr[left] + arr[right]
if current_sum == target:
return [left, right]
elif current_sum < target:
left += 1 # Need a bigger sum, move left pointer right
else:
right -= 1 # Need a smaller sum, move right pointer left
return [-1, -1]

# Pattern 1b: Same-direction two pointers (fast/slow)
# Use when: removing elements in place, detecting cycles
def remove_duplicates(arr):
"""Remove duplicates from sorted array in place.
Returns new length. Time: O(n), Space: O(1)
"""
if not arr:
return 0
slow = 0 # Points to last unique element
for fast in range(1, len(arr)):
if arr[fast] != arr[slow]:
slow += 1
arr[slow] = arr[fast]
return slow + 1

# Pattern 1c: Two-array merge
# Use when: merging two sorted sequences
def merge_sorted(arr1, arr2):
"""Merge two sorted arrays into one sorted array.
Time: O(n + m), Space: O(n + m)
"""
result = []
i, j = 0, 0
while i < len(arr1) and j < len(arr2):
if arr1[i] <= arr2[j]:
result.append(arr1[i])
i += 1
else:
result.append(arr2[j])
j += 1
result.extend(arr1[i:])
result.extend(arr2[j:])
return result
60-Second Answer

Two-pointer problems follow a simple decision tree: Is the array sorted? If yes, try opposite-direction pointers. Do you need to modify the array in place? Use fast/slow pointers. Are there two arrays to combine? Use one pointer per array.

ML-Relevant Two-Pointer Problem

def merge_prediction_streams(stream1, stream2):
"""Merge two sorted streams of (timestamp, prediction) tuples.

ML context: Combining predictions from two model replicas
that process data in parallel, merging by timestamp.

Time: O(n + m), Space: O(n + m)
"""
merged = []
i, j = 0, 0
while i < len(stream1) and j < len(stream2):
if stream1[i][0] <= stream2[j][0]:
merged.append(stream1[i])
i += 1
else:
merged.append(stream2[j])
j += 1
merged.extend(stream1[i:])
merged.extend(stream2[j:])
return merged

# Example usage:
# stream1 = [(1, 0.92), (3, 0.87), (5, 0.95)]
# stream2 = [(2, 0.89), (4, 0.91), (6, 0.88)]
# Result: [(1, 0.92), (2, 0.89), (3, 0.87), (4, 0.91), (5, 0.95), (6, 0.88)]

Pattern 2: Sliding Window

Sliding window is arguably the most important array pattern for ML interviews because it directly maps to time-series processing, rolling statistics, and streaming data analysis.

Fixed-Size Window

def moving_average(values, k):
"""Compute moving average with window size k.

ML context: Smoothing training loss or metric curves.

Time: O(n), Space: O(n) for output, O(1) working space
"""
if len(values) < k:
return []

result = []
window_sum = sum(values[:k]) # Initialize first window
result.append(window_sum / k)

for i in range(k, len(values)):
window_sum += values[i] # Add new element
window_sum -= values[i - k] # Remove old element
result.append(window_sum / k)

return result

# Example: moving_average([1, 3, 2, 6, 4, 8], 3)
# Windows: [1,3,2] [3,2,6] [2,6,4] [6,4,8]
# Averages: [2.0, 3.67, 4.0, 6.0]

Variable-Size Window

def smallest_window_above_threshold(scores, threshold):
"""Find the smallest contiguous window where sum >= threshold.

ML context: Find the minimum number of consecutive features
whose combined importance exceeds a threshold.

Time: O(n), Space: O(1)
"""
n = len(scores)
min_length = float('inf')
window_sum = 0
left = 0

for right in range(n):
window_sum += scores[right]

# Shrink window from left while constraint is met
while window_sum >= threshold:
min_length = min(min_length, right - left + 1)
window_sum -= scores[left]
left += 1

return min_length if min_length != float('inf') else 0

Sliding Window Maximum (Deque Approach)

from collections import deque

def sliding_window_max(arr, k):
"""Return the maximum value in each window of size k.

ML context: Finding peak prediction confidence in rolling windows
for anomaly detection in model monitoring.

Time: O(n), Space: O(k)
"""
if not arr or k == 0:
return []

dq = deque() # Stores indices in decreasing order of arr values
result = []

for i in range(len(arr)):
# Remove elements outside the window from front
while dq and dq[0] < i - k + 1:
dq.popleft()

# Remove elements smaller than current from back
# (they can never be the max while current element is in the window)
while dq and arr[dq[-1]] <= arr[i]:
dq.pop()

dq.append(i)

# Window is fully formed starting at index k-1
if i >= k - 1:
result.append(arr[dq[0]])

return result

# Example: sliding_window_max([1, 3, -1, -3, 5, 3, 6, 7], 3)
# Windows: [1,3,-1] [3,-1,-3] [-1,-3,5] [-3,5,3] [5,3,6] [3,6,7]
# Max: [3, 3, 5, 5, 6, 7]
Instant Rejection

If you solve sliding window maximum with a nested loop (O(nk)), the interviewer will ask you to optimize. If you cannot produce the O(n) deque solution, it is a strong negative signal for ML roles. This problem tests whether you understand amortized analysis - each element is pushed and popped at most once, so the total work is O(n).

Sliding Window with Hash Map

def longest_substring_no_repeat(s):
"""Find the length of the longest substring without repeating characters.

ML context: Finding the longest unique token sequence in text data.

Time: O(n), Space: O(min(n, alphabet_size))
"""
char_index = {} # char -> most recent index
max_length = 0
left = 0

for right, char in enumerate(s):
if char in char_index and char_index[char] >= left:
left = char_index[char] + 1 # Jump past the duplicate
char_index[char] = right
max_length = max(max_length, right - left + 1)

return max_length

# Example: longest_substring_no_repeat("abcabcbb") -> 3 ("abc")

Pattern 3: Prefix Sums

Prefix sums precompute cumulative sums so that any range sum can be answered in O(1).

def build_prefix_sum(arr):
"""Build prefix sum array for O(1) range sum queries.

prefix[i] = sum of arr[0..i-1]
Range sum arr[i..j] = prefix[j+1] - prefix[i]

Time: O(n) to build, O(1) per query
Space: O(n)
"""
n = len(arr)
prefix = [0] * (n + 1)
for i in range(n):
prefix[i + 1] = prefix[i] + arr[i]
return prefix

def range_sum(prefix, i, j):
"""Sum of arr[i..j] inclusive."""
return prefix[j + 1] - prefix[i]

Subarray Sum Equals K

def subarray_sum_equals_k(arr, k):
"""Count subarrays whose sum equals k.

ML context: Finding contiguous time windows where cumulative
prediction error equals a specific value.

Key insight: If prefix[j] - prefix[i] == k, then subarray (i, j] sums to k.
So we need to count how many times (current_prefix - k) has appeared before.

Time: O(n), Space: O(n)
"""
count = 0
prefix_sum = 0
prefix_counts = {0: 1} # prefix_sum -> frequency

for num in arr:
prefix_sum += num
# How many times have we seen a prefix sum equal to (prefix_sum - k)?
count += prefix_counts.get(prefix_sum - k, 0)
prefix_counts[prefix_sum] = prefix_counts.get(prefix_sum, 0) + 1

return count

# Example: subarray_sum_equals_k([1, 1, 1], 2) -> 2
# Subarrays: [1,1] (indices 0-1), [1,1] (indices 1-2)

2D Prefix Sum (Matrix Region Sum)

def build_2d_prefix(matrix):
"""Build 2D prefix sum for O(1) rectangular region sum queries.

ML context: Fast computation of feature sums in image patches,
integral images for object detection (Viola-Jones).

Time: O(rows * cols) to build, O(1) per query
"""
if not matrix or not matrix[0]:
return [[]]
rows, cols = len(matrix), len(matrix[0])
prefix = [[0] * (cols + 1) for _ in range(rows + 1)]

for r in range(rows):
for c in range(cols):
prefix[r + 1][c + 1] = (
matrix[r][c]
+ prefix[r][c + 1]
+ prefix[r + 1][c]
- prefix[r][c]
)
return prefix

def region_sum(prefix, r1, c1, r2, c2):
"""Sum of rectangle from (r1,c1) to (r2,c2) inclusive."""
return (
prefix[r2 + 1][c2 + 1]
- prefix[r1][c2 + 1]
- prefix[r2 + 1][c1]
+ prefix[r1][c1]
)
Company Variation

2D prefix sums (integral images) are particularly common at companies working on computer vision - Apple, Tesla, and Meta Reality Labs. For NLP-focused roles, 1D prefix sums are sufficient.

Pattern 4: String Manipulation

String problems in ML interviews often relate to tokenization, text preprocessing, or pattern matching.

Common String Operations

# String reversal - in-place on a list of characters
def reverse_string(chars):
"""Reverse a list of characters in place.
Time: O(n), Space: O(1)
"""
left, right = 0, len(chars) - 1
while left < right:
chars[left], chars[right] = chars[right], chars[left]
left += 1
right -= 1

# Check if two strings are anagrams
def is_anagram(s1, s2):
"""Check if s1 and s2 are anagrams.

ML context: Detecting near-duplicate text entries in datasets.

Time: O(n), Space: O(1) - bounded by alphabet size
"""
if len(s1) != len(s2):
return False
from collections import Counter
return Counter(s1) == Counter(s2)

# Tokenization - simple word-level tokenizer
def tokenize(text):
"""Basic tokenizer that handles punctuation.

ML context: Pre-processing text for NLP models.

Time: O(n), Space: O(n)
"""
tokens = []
current_token = []
for char in text.lower():
if char.isalnum():
current_token.append(char)
else:
if current_token:
tokens.append(''.join(current_token))
current_token = []
if current_token:
tokens.append(''.join(current_token))
return tokens

String Matching - KMP Algorithm

def kmp_search(text, pattern):
"""Find all occurrences of pattern in text using KMP algorithm.

ML context: Searching for specific patterns in log files,
finding subsequences in genomic data.

Time: O(n + m), Space: O(m)
"""
# Build failure function (longest proper prefix that is also a suffix)
def build_failure(pattern):
m = len(pattern)
failure = [0] * m
j = 0
for i in range(1, m):
while j > 0 and pattern[i] != pattern[j]:
j = failure[j - 1]
if pattern[i] == pattern[j]:
j += 1
failure[i] = j
return failure

n, m = len(text), len(pattern)
if m == 0:
return []

failure = build_failure(pattern)
matches = []
j = 0 # Position in pattern

for i in range(n):
while j > 0 and text[i] != pattern[j]:
j = failure[j - 1]
if text[i] == pattern[j]:
j += 1
if j == m:
matches.append(i - m + 1)
j = failure[j - 1]

return matches
Common Trap

String concatenation in a loop (result += char) creates a new string object each time because Python strings are immutable. This makes a loop of n concatenations O(n^2). Always use ''.join(list_of_chars) instead, which is O(n).

Pattern 5: Matrix Operations

Matrix problems test 2D array traversal, which is fundamental for image processing and feature engineering.

def rotate_matrix_90(matrix):
"""Rotate an NxN matrix 90 degrees clockwise in place.

ML context: Image augmentation - rotating images during training.

Approach: Transpose, then reverse each row.
Time: O(n^2), Space: O(1) - in place
"""
n = len(matrix)
# Step 1: Transpose (swap matrix[i][j] with matrix[j][i])
for i in range(n):
for j in range(i + 1, n):
matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
# Step 2: Reverse each row
for i in range(n):
matrix[i].reverse()

def spiral_order(matrix):
"""Return elements of matrix in spiral order.

ML context: Unrolling feature maps in CNNs, serializing 2D data.

Time: O(m * n), Space: O(1) extra (excluding output)
"""
if not matrix:
return []
result = []
top, bottom = 0, len(matrix) - 1
left, right = 0, len(matrix[0]) - 1

while top <= bottom and left <= right:
# Traverse right
for col in range(left, right + 1):
result.append(matrix[top][col])
top += 1
# Traverse down
for row in range(top, bottom + 1):
result.append(matrix[row][right])
right -= 1
# Traverse left
if top <= bottom:
for col in range(right, left - 1, -1):
result.append(matrix[bottom][col])
bottom -= 1
# Traverse up
if left <= right:
for row in range(bottom, top - 1, -1):
result.append(matrix[row][left])
left += 1

return result

def set_matrix_zeroes(matrix):
"""If an element is 0, set its entire row and column to 0.

ML context: Masking invalid entries in a feature matrix.
If a feature is missing (0), invalidate all related entries.

Time: O(m * n), Space: O(1) - use first row/col as markers
"""
m, n = len(matrix), len(matrix[0])
first_row_zero = any(matrix[0][j] == 0 for j in range(n))
first_col_zero = any(matrix[i][0] == 0 for i in range(m))

# Use first row and first column as markers
for i in range(1, m):
for j in range(1, n):
if matrix[i][j] == 0:
matrix[i][0] = 0
matrix[0][j] = 0

# Zero out marked rows and columns
for i in range(1, m):
for j in range(1, n):
if matrix[i][0] == 0 or matrix[0][j] == 0:
matrix[i][j] = 0

# Handle first row and column
if first_row_zero:
for j in range(n):
matrix[0][j] = 0
if first_col_zero:
for i in range(m):
matrix[i][0] = 0

Pattern Comparison Table

PatternKey Signal in ProblemTimeSpaceFrequency in ML Interviews
Two PointersSorted array, find pairs, in-place modifyO(n)O(1)Very High
Sliding Window (fixed)"Window of size K", rolling statisticO(n)O(1)-O(k)Very High
Sliding Window (variable)"Smallest/largest subarray with property X"O(n)O(1)-O(k)High
Prefix SumRange sum queries, subarray sumsO(n) build, O(1) queryO(n)High
String MatchingFind pattern in text, tokenizationO(n + m)O(m)Medium
Matrix Traversal2D grid, image operations, feature matrixO(m*n)O(1)-O(m*n)Medium-High

Practice Problems

Problem 1: Rolling Model Accuracy (Sliding Window)

Given an array of model accuracy scores from successive evaluations and a window size K, return the maximum accuracy drop within any window of size K. The "drop" in a window is defined as max(window) - min(window).

# Input: scores = [0.92, 0.89, 0.95, 0.88, 0.91, 0.87, 0.93], k = 3
# Output: 0.07
# Explanation: Window [0.95, 0.88, 0.91] has drop 0.95 - 0.88 = 0.07
# Window [0.88, 0.91, 0.87] has drop 0.91 - 0.87 = 0.04
# Maximum drop across all windows is 0.08 from [0.95, 0.88, 0.91]
# Actually [0.91, 0.87, 0.93] -> 0.93 - 0.87 = 0.06
# And [0.88, 0.91, 0.87] -> 0.91 - 0.87 = 0.04
# Recheck: [0.92,0.89,0.95]->0.06, [0.89,0.95,0.88]->0.07,
# [0.95,0.88,0.91]->0.07, [0.88,0.91,0.87]->0.04, [0.91,0.87,0.93]->0.06
# Maximum drop = 0.07
Hint 1
The brute force approach checks every window and finds min/max in O(k) each - total O(nk). Can you maintain both the max and min of the window efficiently?
Hint 2
Use two monotonic deques: one for the running maximum and one for the running minimum.
Hint 3
Each element is pushed/popped from each deque at most once, so the total time is O(n).
Solution
from collections import deque

def max_accuracy_drop(scores, k):
"""Find the maximum (max - min) within any window of size k.

Uses two monotonic deques: one decreasing (for max), one increasing (for min).

Time: O(n), Space: O(k)
"""
if len(scores) < k or k == 0:
return 0.0

max_dq = deque() # Decreasing - front is max of window
min_dq = deque() # Increasing - front is min of window
max_drop = 0.0

for i in range(len(scores)):
# Maintain max deque (decreasing)
while max_dq and scores[max_dq[-1]] <= scores[i]:
max_dq.pop()
max_dq.append(i)

# Maintain min deque (increasing)
while min_dq and scores[min_dq[-1]] >= scores[i]:
min_dq.pop()
min_dq.append(i)

# Remove elements outside window
if max_dq[0] < i - k + 1:
max_dq.popleft()
if min_dq[0] < i - k + 1:
min_dq.popleft()

# Window is complete starting at index k-1
if i >= k - 1:
drop = scores[max_dq[0]] - scores[min_dq[0]]
max_drop = max(max_drop, drop)

return round(max_drop, 4)

Problem 2: Feature Importance Threshold (Prefix Sum)

Given an array of feature importance scores, find the minimum number of consecutive features (starting from the most important) that account for at least T% of the total importance.

# Input: importances = [0.05, 0.15, 0.25, 0.10, 0.20, 0.12, 0.13], threshold = 0.60
# Output: 3
# Explanation: Sort descending: [0.25, 0.20, 0.15, 0.13, 0.12, 0.10, 0.05]
# Cumulative: [0.25, 0.45, 0.60, ...]
# After 3 features, cumulative = 0.60 >= 0.60 * total(1.0)
Hint 1
Sort the importance scores in descending order first. Then compute a running sum.
Hint 2
Compute the total sum, then iterate through the sorted array, accumulating until you reach `threshold * total_sum`.
Hint 3
This is essentially a prefix sum on the sorted array with an early termination condition.
Solution
def min_features_for_threshold(importances, threshold):
"""Find minimum consecutive features covering threshold of total importance.

Time: O(n log n) for sorting, Space: O(n) for sorted copy
"""
if not importances:
return 0

total = sum(importances)
target = threshold * total
sorted_imp = sorted(importances, reverse=True)

cumulative = 0.0
for i, imp in enumerate(sorted_imp):
cumulative += imp
if cumulative >= target - 1e-9: # Float comparison tolerance
return i + 1

return len(importances)

Problem 3: Longest Stable Training Window (Variable Sliding Window)

Given a sequence of training loss values, find the length of the longest contiguous window where the loss never increases by more than delta between consecutive steps.

# Input: losses = [2.5, 2.3, 2.4, 2.1, 1.9, 2.8, 2.7, 2.6, 2.5, 2.4], delta = 0.2
# Output: 5
# Explanation: Window [2.1, 1.9, 2.8, ...] breaks because 2.8 - 1.9 = 0.9 > 0.2
# But the consecutive increases: 2.3->2.4 is +0.1 (ok), 2.4->2.1 is decrease (ok),
# Longest window where each step increase <= delta:
# [2.8, 2.7, 2.6, 2.5, 2.4] -> each step is a decrease, length 5
Hint 1
This is not a traditional sliding window problem. You cannot shrink from the left to fix a violation - you need to restart the window.
Hint 2
Use a simple scan: maintain a `start` pointer. If `losses[i] - losses[i-1] > delta`, reset `start = i`.
Hint 3
Track `max_length = max(max_length, i - start + 1)` at each step.
Solution
def longest_stable_window(losses, delta):
"""Find the longest contiguous window where consecutive
loss increases never exceed delta.

Time: O(n), Space: O(1)
"""
if len(losses) <= 1:
return len(losses)

max_length = 1
start = 0

for i in range(1, len(losses)):
# Check if consecutive increase exceeds delta
if losses[i] - losses[i - 1] > delta:
start = i # Reset window
max_length = max(max_length, i - start + 1)

return max_length

Problem 4: Merge Sorted Prediction Intervals (Two Pointers + Intervals)

Given a list of non-overlapping prediction intervals sorted by start time, and a new interval representing a model's active prediction window, merge the new interval and return the resulting non-overlapping intervals.

# Input: intervals = [[1,3], [6,9], [12,15]], new_interval = [5,10]
# Output: [[1,3], [5,10], [12,15]]
# Explanation: [5,10] overlaps with [6,9], merging gives [5,10]
Hint 1
Since intervals are sorted, you can split them into three groups: those entirely before the new interval, those that overlap, and those entirely after.
Hint 2
Two intervals [a,b] and [c,d] overlap if and only if a <= d and c <= b.
Hint 3
For overlapping intervals, merge by taking min of starts and max of ends.
Solution
def insert_interval(intervals, new_interval):
"""Insert and merge a new interval into sorted non-overlapping intervals.

Time: O(n), Space: O(n) for output
"""
result = []
i = 0
n = len(intervals)

# Add all intervals that end before new_interval starts
while i < n and intervals[i][1] < new_interval[0]:
result.append(intervals[i])
i += 1

# Merge all overlapping intervals
while i < n and intervals[i][0] <= new_interval[1]:
new_interval = [
min(new_interval[0], intervals[i][0]),
max(new_interval[1], intervals[i][1])
]
i += 1
result.append(new_interval)

# Add remaining intervals
while i < n:
result.append(intervals[i])
i += 1

return result

Problem 5: Product of Array Except Self (Array Technique)

Given an array of prediction confidence scores, return an array where each element is the product of all scores except the one at that index. Do not use division.

# Input: [0.8, 0.9, 0.7, 0.95]
# Output: [0.5985, 0.532, 0.684, 0.504]
# Explanation: output[0] = 0.9 * 0.7 * 0.95 = 0.5985
Hint 1
You cannot use division (what if a score is 0?). Think about prefix and suffix products.
Hint 2
For each index i, the answer is (product of all elements to the left) * (product of all elements to the right).
Hint 3
Compute left products in one pass (left to right), then multiply by right products in another pass (right to left). Use the output array for left products to achieve O(1) extra space.
Solution
def product_except_self(nums):
"""Product of array except self without division.

Time: O(n), Space: O(1) extra (output array not counted)
"""
n = len(nums)
result = [1] * n

# Left pass: result[i] = product of all elements to the left of i
left_product = 1
for i in range(n):
result[i] = left_product
left_product *= nums[i]

# Right pass: multiply by product of all elements to the right of i
right_product = 1
for i in range(n - 1, -1, -1):
result[i] *= right_product
right_product *= nums[i]

return result

Problem 6: Longest Consecutive Feature Indices (Array + Set)

Given an unsorted array of feature indices used in a model, find the length of the longest consecutive sequence.

# Input: [100, 4, 200, 1, 3, 2]
# Output: 4
# Explanation: Longest consecutive sequence is [1, 2, 3, 4]
Hint 1
Sorting gives O(n log n). Can you do O(n)?
Hint 2
Use a set for O(1) lookups. For each number, check if it could be the *start* of a sequence (i.e., num - 1 is not in the set).
Hint 3
Only start counting from sequence beginnings. This ensures each element is visited at most twice.
Solution
def longest_consecutive(nums):
"""Find the longest consecutive sequence in an unsorted array.

Time: O(n), Space: O(n)
"""
if not nums:
return 0

num_set = set(nums)
max_length = 0

for num in num_set:
# Only start counting from the beginning of a sequence
if num - 1 not in num_set:
current = num
length = 1
while current + 1 in num_set:
current += 1
length += 1
max_length = max(max_length, length)

return max_length

Problem 7: Encode/Decode Model Configuration Strings

Design encode and decode functions for a list of model configuration strings.

# Input: ["learning_rate:0.001", "batch_size:32", "optimizer:adam"]
# Encode: "20#learning_rate:0.001 12#batch_size:32 14#optimizer:adam"
# Decode back to original list
Hint 1
The challenge is that strings can contain any character (including delimiters). You need an unambiguous encoding.
Hint 2
Use length-prefix encoding: store the length of each string followed by a delimiter, then the string itself.
Hint 3
Format: `"{length}#{string}"` for each string. When decoding, read the number before `#`, then read exactly that many characters.
Solution
def encode(strs):
"""Encode a list of strings into a single string.
Time: O(total characters), Space: O(total characters)
"""
encoded = []
for s in strs:
encoded.append(f"{len(s)}#{s}")
return ''.join(encoded)

def decode(s):
"""Decode a string back into a list of strings.
Time: O(n), Space: O(n)
"""
result = []
i = 0
while i < len(s):
# Find the '#' delimiter
j = i
while s[j] != '#':
j += 1
length = int(s[i:j])
result.append(s[j + 1: j + 1 + length])
i = j + 1 + length
return result

Problem 8: Maximum Subarray (Kadane's Algorithm)

Find the contiguous subarray with the largest sum. This is one of the most classic interview problems and directly maps to finding the most impactful consecutive features.

# Input: [-2, 1, -3, 4, -1, 2, 1, -5, 4]
# Output: 6 (subarray [4, -1, 2, 1])
Hint 1
At each position, you have two choices: start a new subarray here, or extend the previous subarray.
Hint 2
If the running sum becomes negative, it is better to start fresh.
Hint 3
Track `current_sum = max(nums[i], current_sum + nums[i])` and `max_sum = max(max_sum, current_sum)`.
Solution
def max_subarray(nums):
"""Find the contiguous subarray with the largest sum (Kadane's algorithm).

ML context: Finding the consecutive epochs with the greatest
cumulative improvement in model performance.

Time: O(n), Space: O(1)
"""
if not nums:
return 0

max_sum = nums[0]
current_sum = nums[0]

for i in range(1, len(nums)):
# Either extend previous subarray or start new one
current_sum = max(nums[i], current_sum + nums[i])
max_sum = max(max_sum, current_sum)

return max_sum

# To also return the subarray itself:
def max_subarray_with_indices(nums):
"""Return (max_sum, start_index, end_index)."""
max_sum = nums[0]
current_sum = nums[0]
start = end = 0
temp_start = 0

for i in range(1, len(nums)):
if nums[i] > current_sum + nums[i]:
current_sum = nums[i]
temp_start = i
else:
current_sum += nums[i]

if current_sum > max_sum:
max_sum = current_sum
start = temp_start
end = i

return max_sum, start, end

Problem 9: Trapping Rain Water (Two Pointers Advanced)

Given an array representing elevation heights, compute how much water can be trapped between the bars.

# Input: [0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]
# Output: 6
Hint 1
The water above each position depends on the minimum of (max height to its left, max height to its right) minus its own height.
Hint 2
You could precompute left_max and right_max arrays (O(n) space), but two pointers can do it in O(1) space.
Hint 3
Use two pointers from both ends. The side with the smaller max determines the water level. Move the pointer on the smaller side inward.
Solution
def trap_rain_water(height):
"""Compute total trapped water using two pointers.

Time: O(n), Space: O(1)
"""
if not height:
return 0

left, right = 0, len(height) - 1
left_max, right_max = 0, 0
total_water = 0

while left < right:
if height[left] < height[right]:
if height[left] >= left_max:
left_max = height[left]
else:
total_water += left_max - height[left]
left += 1
else:
if height[right] >= right_max:
right_max = height[right]
else:
total_water += right_max - height[right]
right -= 1

return total_water

Problem 10: Batch Normalization Statistics (Array + Streaming)

Implement an online algorithm that computes the mean and variance of a stream of values using Welford's method (single pass, numerically stable).

# This is directly relevant to batch normalization in neural networks
# Input: stream of values arriving one at a time
# Output: running mean and variance after each value
Hint 1
The naive approach (store all values, compute mean and variance) uses O(n) space. Welford's method uses O(1) space.
Hint 2
Welford's update: when a new value x arrives, compute `delta = x - mean`, update `mean += delta / n`, then `delta2 = x - mean` (using new mean), and `M2 += delta * delta2`.
Hint 3
Variance = M2 / n (population) or M2 / (n - 1) (sample).
Solution
class OnlineStats:
"""Welford's online algorithm for mean and variance.

ML context: Computing batch normalization statistics in a
streaming fashion, or tracking running statistics during training.

Time: O(1) per update, Space: O(1)
"""
def __init__(self):
self.n = 0
self.mean = 0.0
self.m2 = 0.0 # Sum of squared differences from mean

def update(self, x):
"""Add a new value to the running statistics."""
self.n += 1
delta = x - self.mean
self.mean += delta / self.n
delta2 = x - self.mean # Using updated mean
self.m2 += delta * delta2

def get_mean(self):
return self.mean

def get_variance(self):
"""Population variance."""
if self.n < 2:
return 0.0
return self.m2 / self.n

def get_sample_variance(self):
"""Sample variance (Bessel's correction)."""
if self.n < 2:
return 0.0
return self.m2 / (self.n - 1)

def get_std(self):
import math
return math.sqrt(self.get_variance())

# Usage:
# stats = OnlineStats()
# for loss in training_losses:
# stats.update(loss)
# print(f"Running mean: {stats.get_mean():.4f}, "
# f"Running std: {stats.get_std():.4f}")

Interview Cheat Sheet

PatternWhen to UseKey Implementation Detail
Two Pointers (opposite)Sorted array, pair findingleft, right = 0, n-1; move based on comparison
Two Pointers (same direction)In-place modification, dedupslow marks valid elements, fast scans
Fixed Sliding WindowRolling stats, window of size KAdd right element, subtract left element
Variable Sliding WindowMin/max subarray with propertyExpand right, shrink left while valid
Sliding Window + DequeWindow max/minMonotonic deque, front = answer
Prefix SumRange queries, subarray sumsprefix[i] = sum(arr[0..i-1])
Kadane's AlgorithmMaximum subarray sumcurr = max(nums[i], curr + nums[i])
Product Except SelfAvoid divisionLeft pass then right pass
Matrix In-PlaceRotate, zero rows/colsUse first row/col as markers
Welford's MethodOnline mean/variancedelta = x - mean; mean += delta/n

Spaced Repetition Checkpoints

DayReview ActivityTime
Day 0Read all patterns. Solve Problems 1, 5, and 8 without looking at solutions.40 min
Day 3Re-solve sliding window maximum (Problem 1) and Kadane's (Problem 8) from scratch.20 min
Day 7Solve 3 new sliding window or two-pointer problems from LeetCode.30 min
Day 14Without looking, write the template for: fixed sliding window, variable sliding window, prefix sum, and two-pointer merge.20 min
Day 21Time yourself: solve a Medium array/string problem in under 20 minutes. Repeat with a different problem.40 min

Next Steps

Arrays and strings are the foundation. Next, learn Hash Maps and Sets - the data structure that turns O(n^2) solutions into O(n) solutions and appears in virtually every coding interview.

© 2026 EngineersOfAI. All rights reserved.