Trees and Graphs
BFS, DFS, binary trees, BSTs, tries, graph representations, shortest paths, and topological sort - with ML-relevant practice problems and full Python solutions.
Reading time: ~40 min | Interview relevance: High | Roles: All AI/ML roles
The Real Interview Moment
You are in a virtual on-site at a major AI company. The interviewer shares a screen with a diagram: "Here is the dependency graph for our ML training pipeline. Node A is data ingestion, B is data validation, C is feature engineering (depends on A and B), D is model training (depends on C), E is evaluation (depends on D), and F is model registration (depends on D and E). We need to determine a valid execution order."
You immediately recognize this: topological sort on a DAG. You have seen this exact pattern in production ML systems - Airflow DAGs, Kubeflow pipelines, MLflow projects. You explain the approach: build an adjacency list, compute in-degrees, use a queue to process nodes with zero in-degree. Within ten minutes, you have clean, working code.
The follow-up comes: "What if there is a cycle? How do you detect it?" You explain that if you process fewer nodes than exist in the graph, there must be a cycle. The interviewer probes further: "Can you identify which nodes are in the cycle?" Now you are in territory where DFS with node coloring (white/gray/black) shines.
Trees and graphs are not abstract textbook concepts - they are the backbone of ML systems. This chapter prepares you for both the algorithmic questions and the practical applications.
Binary Trees
Core Traversals
Every tree problem builds on four traversals. Know them cold.
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
# ============ RECURSIVE TRAVERSALS ============
def inorder(root):
"""Left -> Root -> Right. Gives sorted order for BST.
Time: O(n), Space: O(h) call stack where h = height
"""
if not root:
return []
return inorder(root.left) + [root.val] + inorder(root.right)
def preorder(root):
"""Root -> Left -> Right. Useful for tree serialization.
Time: O(n), Space: O(h)
"""
if not root:
return []
return [root.val] + preorder(root.left) + preorder(root.right)
def postorder(root):
"""Left -> Right -> Root. Useful for deletion, expression evaluation.
Time: O(n), Space: O(h)
"""
if not root:
return []
return postorder(root.left) + postorder(root.right) + [root.val]
# ============ ITERATIVE TRAVERSALS ============
def inorder_iterative(root):
"""Iterative inorder using an explicit stack."""
result = []
stack = []
current = root
while current or stack:
while current:
stack.append(current)
current = current.left
current = stack.pop()
result.append(current.val)
current = current.right
return result
def preorder_iterative(root):
"""Iterative preorder using a stack."""
if not root:
return []
result = []
stack = [root]
while stack:
node = stack.pop()
result.append(node.val)
# Push right first so left is processed first (LIFO)
if node.right:
stack.append(node.right)
if node.left:
stack.append(node.left)
return result
# ============ LEVEL ORDER (BFS) ============
from collections import deque
def level_order(root):
"""BFS traversal - returns list of lists, one per level.
Time: O(n), Space: O(w) where w = max width of tree
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level = []
for _ in range(len(queue)):
node = queue.popleft()
level.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(level)
return result
Traversal Comparison
| Traversal | Order | Use Cases | ML Application |
|---|---|---|---|
| Inorder | L -> Root -> R | Get sorted order from BST | Sorted feature extraction from decision tree |
| Preorder | Root -> L -> R | Serialize tree, copy tree | Serialize model tree for storage |
| Postorder | L -> R -> Root | Delete tree, evaluate expressions | Compute subtree metrics bottom-up |
| Level Order (BFS) | Level by level | Find min depth, level averages | Layer-by-layer analysis of decision tree |
If an interviewer asks "When would you use BFS vs DFS on a tree?", the answer is: Use BFS when you need level-by-level processing or finding the shortest path (min depth, level averages, right-side view). Use DFS when you need to process leaves before parents (path sums, tree diameter, subtree computations) or when the tree is very wide (BFS queue would be too large).
Common Binary Tree Problems
def max_depth(root):
"""Maximum depth of a binary tree.
Time: O(n), Space: O(h)
"""
if not root:
return 0
return 1 + max(max_depth(root.left), max_depth(root.right))
def is_balanced(root):
"""Check if tree is height-balanced (every subtree's height differs by at most 1).
Time: O(n), Space: O(h)
"""
def check(node):
if not node:
return 0
left_h = check(node.left)
right_h = check(node.right)
if left_h == -1 or right_h == -1 or abs(left_h - right_h) > 1:
return -1
return 1 + max(left_h, right_h)
return check(root) != -1
def diameter(root):
"""Diameter of binary tree (longest path between any two nodes).
Time: O(n), Space: O(h)
"""
max_diameter = [0]
def depth(node):
if not node:
return 0
left = depth(node.left)
right = depth(node.right)
max_diameter[0] = max(max_diameter[0], left + right)
return 1 + max(left, right)
depth(root)
return max_diameter[0]
def lowest_common_ancestor(root, p, q):
"""Find the lowest common ancestor of two nodes.
Time: O(n), Space: O(h)
"""
if not root or root == p or root == q:
return root
left = lowest_common_ancestor(root.left, p, q)
right = lowest_common_ancestor(root.right, p, q)
if left and right:
return root
return left or right
def has_path_sum(root, target):
"""Check if any root-to-leaf path sums to target.
ML context: Check if any branch of a decision tree
produces a cumulative score above a threshold.
Time: O(n), Space: O(h)
"""
if not root:
return False
if not root.left and not root.right:
return root.val == target
return (has_path_sum(root.left, target - root.val) or
has_path_sum(root.right, target - root.val))
Binary Search Trees (BSTs)
BSTs maintain the property: left.val < node.val < right.val. This enables O(log n) search, insert, and delete in balanced trees.
def search_bst(root, target):
"""Search for a value in a BST.
Time: O(h), Space: O(h) recursive / O(1) iterative
"""
if not root or root.val == target:
return root
if target < root.val:
return search_bst(root.left, target)
return search_bst(root.right, target)
def insert_bst(root, val):
"""Insert a value into a BST.
Time: O(h), Space: O(h)
"""
if not root:
return TreeNode(val)
if val < root.val:
root.left = insert_bst(root.left, val)
elif val > root.val:
root.right = insert_bst(root.right, val)
return root
def validate_bst(root, low=float('-inf'), high=float('inf')):
"""Validate that a tree is a valid BST.
Time: O(n), Space: O(h)
"""
if not root:
return True
if root.val <= low or root.val >= high:
return False
return (validate_bst(root.left, low, root.val) and
validate_bst(root.right, root.val, high))
def kth_smallest_bst(root, k):
"""Find the kth smallest element in a BST.
Approach: Inorder traversal gives sorted order.
Time: O(h + k), Space: O(h)
"""
stack = []
current = root
count = 0
while current or stack:
while current:
stack.append(current)
current = current.left
current = stack.pop()
count += 1
if count == k:
return current.val
current = current.right
return -1
When validating a BST, do not just check that left.val < node.val < right.val. You must check that all values in the left subtree are less than the node, and all values in the right subtree are greater. The correct approach passes valid ranges (low, high) down the recursion.
Tries (Prefix Trees)
Tries are essential for NLP-related problems: autocomplete, spell checking, prefix matching.
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
self.count = 0 # Number of words with this prefix
class Trie:
"""Prefix tree for efficient string operations.
ML context: Building prefix-based search for token vocabularies,
autocomplete for text input, prefix matching in NLP pipelines.
Insert: O(m), Search: O(m), Prefix search: O(m)
where m = length of the word
Space: O(total characters across all words)
"""
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.count += 1
node.is_end = True
def search(self, word):
"""Return True if word exists in trie."""
node = self._find_node(word)
return node is not None and node.is_end
def starts_with(self, prefix):
"""Return True if any word starts with prefix."""
return self._find_node(prefix) is not None
def count_with_prefix(self, prefix):
"""Count words with given prefix."""
node = self._find_node(prefix)
return node.count if node else 0
def _find_node(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return None
node = node.children[char]
return node
def autocomplete(self, prefix, limit=5):
"""Return up to `limit` words starting with prefix.
ML context: Autocomplete suggestions based on prefix.
"""
node = self._find_node(prefix)
if not node:
return []
results = []
self._dfs_collect(node, prefix, results, limit)
return results
def _dfs_collect(self, node, current, results, limit):
if len(results) >= limit:
return
if node.is_end:
results.append(current)
for char in sorted(node.children):
self._dfs_collect(node.children[char], current + char, results, limit)
Graph Representations
# Adjacency list - most common for interviews
def build_adjacency_list(edges, directed=True):
"""Build adjacency list from edge list.
Time: O(E), Space: O(V + E)
"""
from collections import defaultdict
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
if not directed:
graph[v].append(u)
return graph
# Adjacency matrix - for dense graphs
def build_adjacency_matrix(n, edges, directed=True):
"""Build adjacency matrix from edge list.
Time: O(V^2 + E), Space: O(V^2)
"""
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
matrix[u][v] = 1
if not directed:
matrix[v][u] = 1
return matrix
Graph Traversals: BFS and DFS
Breadth-First Search (BFS)
from collections import deque
def bfs(graph, start):
"""BFS traversal of a graph.
Time: O(V + E), Space: O(V)
"""
visited = set([start])
queue = deque([start])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return order
def shortest_path_bfs(graph, start, end):
"""Find shortest path in unweighted graph using BFS.
ML context: Finding the shortest dependency chain
between two stages in an ML pipeline.
Time: O(V + E), Space: O(V)
"""
if start == end:
return [start]
visited = set([start])
queue = deque([(start, [start])])
while queue:
node, path = queue.popleft()
for neighbor in graph.get(node, []):
if neighbor == end:
return path + [neighbor]
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, path + [neighbor]))
return [] # No path exists
def bfs_level_by_level(graph, start):
"""BFS that processes level by level (useful for distance calculations).
Time: O(V + E), Space: O(V)
"""
visited = set([start])
queue = deque([start])
levels = []
while queue:
level = []
for _ in range(len(queue)):
node = queue.popleft()
level.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
levels.append(level)
return levels
Depth-First Search (DFS)
def dfs_recursive(graph, start, visited=None):
"""DFS traversal (recursive).
Time: O(V + E), Space: O(V) for visited set + O(V) call stack
"""
if visited is None:
visited = set()
visited.add(start)
result = [start]
for neighbor in graph.get(start, []):
if neighbor not in visited:
result.extend(dfs_recursive(graph, neighbor, visited))
return result
def dfs_iterative(graph, start):
"""DFS traversal (iterative with explicit stack).
Time: O(V + E), Space: O(V)
"""
visited = set()
stack = [start]
result = []
while stack:
node = stack.pop()
if node in visited:
continue
visited.add(node)
result.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
stack.append(neighbor)
return result
def has_cycle_directed(graph, num_nodes):
"""Detect cycle in a directed graph using DFS with 3-color marking.
WHITE (0) = unvisited
GRAY (1) = in current DFS path (being processed)
BLACK (2) = fully processed
If we encounter a GRAY node, we have found a back edge (cycle).
ML context: Detecting circular dependencies in ML pipeline DAGs.
Time: O(V + E), Space: O(V)
"""
WHITE, GRAY, BLACK = 0, 1, 2
color = {i: WHITE for i in range(num_nodes)}
def dfs(node):
color[node] = GRAY
for neighbor in graph.get(node, []):
if color[neighbor] == GRAY:
return True # Back edge found - cycle exists
if color[neighbor] == WHITE and dfs(neighbor):
return True
color[node] = BLACK
return False
for node in range(num_nodes):
if color[node] == WHITE:
if dfs(node):
return True
return False
BFS vs DFS Comparison
| Aspect | BFS | DFS |
|---|---|---|
| Data structure | Queue | Stack (or recursion) |
| Exploration pattern | Level by level | As deep as possible |
| Shortest path (unweighted) | Yes | No |
| Space complexity | O(width) | O(depth) |
| Best for | Shortest path, level-order, wide graphs | Cycle detection, path finding, deep graphs |
| ML use cases | Pipeline stage distance, layer analysis | Dependency detection, tree serialization |
Topological Sort
Topological sort orders nodes in a DAG such that for every edge (u, v), u comes before v. This is critical for ML pipeline execution ordering.
from collections import deque, defaultdict
def topological_sort_kahn(graph, num_nodes):
"""Topological sort using Kahn's algorithm (BFS-based).
ML context: Determining execution order of ML pipeline stages.
data_ingestion -> preprocessing -> feature_engineering -> training -> evaluation
Time: O(V + E), Space: O(V)
"""
# Compute in-degrees
in_degree = [0] * num_nodes
for node in graph:
for neighbor in graph[node]:
in_degree[neighbor] += 1
# Start with nodes that have no dependencies
queue = deque([i for i in range(num_nodes) if in_degree[i] == 0])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph.get(node, []):
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# If we processed fewer nodes than exist, there is a cycle
if len(order) != num_nodes:
return [] # Cycle detected - no valid topological order
return order
def topological_sort_dfs(graph, num_nodes):
"""Topological sort using DFS (adds nodes in reverse postorder).
Time: O(V + E), Space: O(V)
"""
visited = set()
order = []
def dfs(node):
visited.add(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
dfs(neighbor)
order.append(node)
for node in range(num_nodes):
if node not in visited:
dfs(node)
return order[::-1] # Reverse postorder
# Example: ML Pipeline
def pipeline_execution_order():
"""Determine valid execution order for an ML pipeline."""
stages = {
0: 'data_ingestion',
1: 'data_validation',
2: 'feature_engineering',
3: 'model_training',
4: 'evaluation',
5: 'model_registration'
}
# Dependencies (directed edges)
graph = {
0: [2], # data_ingestion -> feature_engineering
1: [2], # data_validation -> feature_engineering
2: [3], # feature_engineering -> model_training
3: [4, 5], # model_training -> evaluation, model_registration
4: [5], # evaluation -> model_registration
5: []
}
order = topological_sort_kahn(graph, 6)
return [stages[i] for i in order]
# Possible output: ['data_ingestion', 'data_validation',
# 'feature_engineering', 'model_training',
# 'evaluation', 'model_registration']
If you cannot implement topological sort in an ML interview, it is a significant gap. ML engineers work with DAG-based systems (Airflow, Kubeflow, Dagster) daily. Interviewers expect you to know Kahn's algorithm (BFS-based) or DFS-based topological sort from memory.
Shortest Path Algorithms
Dijkstra's Algorithm
import heapq
def dijkstra(graph, start):
"""Find shortest path from start to all other nodes in a weighted graph.
graph[node] = [(neighbor, weight), ...]
ML context: Finding the lowest-cost path through a model
ensemble or pipeline with weighted processing costs.
Time: O((V + E) log V), Space: O(V)
"""
distances = {start: 0}
heap = [(0, start)] # (distance, node)
visited = set()
while heap:
dist, node = heapq.heappop(heap)
if node in visited:
continue
visited.add(node)
for neighbor, weight in graph.get(node, []):
new_dist = dist + weight
if neighbor not in distances or new_dist < distances[neighbor]:
distances[neighbor] = new_dist
heapq.heappush(heap, (new_dist, neighbor))
return distances
# Example: Finding cheapest compute path
# graph = {
# 'cpu': [('gpu_small', 5), ('tpu', 20)],
# 'gpu_small': [('gpu_large', 3), ('cpu', 5)],
# 'gpu_large': [('tpu', 2)],
# 'tpu': []
# }
# dijkstra(graph, 'cpu') -> {'cpu': 0, 'gpu_small': 5, 'gpu_large': 8, 'tpu': 10}
Union-Find (Disjoint Set)
class UnionFind:
"""Union-Find with path compression and union by rank.
ML context: Clustering - grouping similar items,
connected components in similarity graphs.
Find: O(alpha(n)) ~ O(1) amortized
Union: O(alpha(n)) ~ O(1) amortized
"""
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
self.count = n # Number of connected components
def find(self, x):
"""Find root with path compression."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x, y):
"""Union by rank. Returns True if x and y were in different components."""
rx, ry = self.find(x), self.find(y)
if rx == ry:
return False
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx
self.parent[ry] = rx
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1
self.count -= 1
return True
def connected(self, x, y):
return self.find(x) == self.find(y)
def count_connected_components(n, edges):
"""Count connected components using Union-Find.
ML context: Finding clusters in a similarity graph
where edges connect similar data points.
Time: O(E * alpha(V)) ~ O(E), Space: O(V)
"""
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.count
Number of Islands (Classic Graph Problem)
def num_islands(grid):
"""Count the number of islands in a 2D grid.
ML context: Connected component analysis in image segmentation.
Each '1' is land, '0' is water. Count distinct land masses.
Time: O(rows * cols), Space: O(rows * cols) for visited
"""
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
visited = set()
count = 0
def dfs(r, c):
if (r < 0 or r >= rows or c < 0 or c >= cols or
grid[r][c] == '0' or (r, c) in visited):
return
visited.add((r, c))
dfs(r + 1, c)
dfs(r - 1, c)
dfs(r, c + 1)
dfs(r, c - 1)
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1' and (r, c) not in visited:
dfs(r, c)
count += 1
return count
Practice Problems
Problem 1: Validate ML Pipeline DAG
Given an ML pipeline as a list of (source, destination) dependencies, determine if the pipeline is valid (no circular dependencies) and return a valid execution order.
# Input: num_stages = 5
# dependencies = [(0,1), (0,2), (1,3), (2,3), (3,4)]
# Output: [0, 1, 2, 3, 4] or [0, 2, 1, 3, 4] (any valid order)
# Input: num_stages = 3
# dependencies = [(0,1), (1,2), (2,0)]
# Output: [] (cycle detected - invalid pipeline)
Hint 1
Hint 2
Hint 3
Solution
from collections import defaultdict, deque
def validate_pipeline(num_stages, dependencies):
"""Return valid execution order or empty list if cycle exists.
Time: O(V + E), Space: O(V + E)
"""
graph = defaultdict(list)
in_degree = [0] * num_stages
for src, dst in dependencies:
graph[src].append(dst)
in_degree[dst] += 1
queue = deque([i for i in range(num_stages) if in_degree[i] == 0])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return order if len(order) == num_stages else []
Problem 2: Binary Tree Right Side View
Given a binary tree, return the values visible from the right side (the last node at each level).
# Input: 1
# / \
# 2 3
# \ \
# 5 4
# Output: [1, 3, 4]
Hint 1
Hint 2
Hint 3
Solution
from collections import deque
def right_side_view(root):
"""Return right-side view of binary tree.
Time: O(n), Space: O(w) for BFS / O(h) for DFS
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue)
for i in range(level_size):
node = queue.popleft()
if i == level_size - 1:
result.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
return result
Problem 3: Clone a Graph
Given a reference to a node in a connected undirected graph, return a deep copy of the graph.
# ML context: Cloning a computation graph for gradient checkpointing
# or creating a copy of a model architecture graph.
Hint 1
Hint 2
Hint 3
Solution
class GraphNode:
def __init__(self, val=0, neighbors=None):
self.val = val
self.neighbors = neighbors if neighbors else []
def clone_graph(node):
"""Deep copy a graph using DFS + hash map.
Time: O(V + E), Space: O(V)
"""
if not node:
return None
cloned = {} # original -> clone mapping
def dfs(original):
if original in cloned:
return cloned[original]
copy = GraphNode(original.val)
cloned[original] = copy
for neighbor in original.neighbors:
copy.neighbors.append(dfs(neighbor))
return copy
return dfs(node)
Problem 4: Word Ladder (BFS Shortest Transformation)
Given a begin word, an end word, and a dictionary, find the shortest transformation sequence (each step changes one letter).
# Input: beginWord = "hit", endWord = "cog"
# wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
# Output: 5 (hit -> hot -> dot -> dog -> cog)
# ML context: Analogous to finding the shortest path in
# a word embedding neighborhood graph.
Hint 1
Hint 2
Hint 3
Solution
from collections import defaultdict, deque
def ladder_length(begin_word, end_word, word_list):
"""Shortest word transformation sequence using BFS.
Time: O(n * m^2) where n = words, m = word length
Space: O(n * m)
"""
if end_word not in word_list:
return 0
word_set = set(word_list)
m = len(begin_word)
# Build pattern map: "h*t" -> ["hit", "hot"]
pattern_map = defaultdict(list)
for word in word_set:
for i in range(m):
pattern = word[:i] + '*' + word[i + 1:]
pattern_map[pattern].append(word)
visited = {begin_word}
queue = deque([(begin_word, 1)])
while queue:
word, depth = queue.popleft()
for i in range(m):
pattern = word[:i] + '*' + word[i + 1:]
for neighbor in pattern_map[pattern]:
if neighbor == end_word:
return depth + 1
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, depth + 1))
return 0
Problem 5: Course Schedule (Cycle Detection)
Given the number of courses and prerequisite pairs, determine if it is possible to finish all courses (no circular dependencies).
# Input: numCourses = 4, prerequisites = [[1,0], [2,1], [3,2]]
# Output: True (take 0 -> 1 -> 2 -> 3)
# Input: numCourses = 2, prerequisites = [[0,1], [1,0]]
# Output: False (circular dependency)
Hint 1
Hint 2
Hint 3
Solution
from collections import defaultdict, deque
def can_finish(num_courses, prerequisites):
"""Determine if all courses can be completed.
Time: O(V + E), Space: O(V + E)
"""
graph = defaultdict(list)
in_degree = [0] * num_courses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque([i for i in range(num_courses) if in_degree[i] == 0])
count = 0
while queue:
node = queue.popleft()
count += 1
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return count == num_courses
Problem 6: Serialize and Deserialize Binary Tree
Design functions to serialize a binary tree to a string and deserialize it back.
# ML context: Saving and loading decision tree models,
# serializing computation graphs for distributed training.
Hint 1
Hint 2
Hint 3
Solution
class Codec:
"""Serialize and deserialize a binary tree.
Time: O(n) for both operations, Space: O(n)
"""
def serialize(self, root):
"""Preorder traversal with null markers."""
tokens = []
def preorder(node):
if not node:
tokens.append('null')
return
tokens.append(str(node.val))
preorder(node.left)
preorder(node.right)
preorder(root)
return ','.join(tokens)
def deserialize(self, data):
"""Reconstruct tree from serialized string."""
tokens = iter(data.split(','))
def build():
val = next(tokens)
if val == 'null':
return None
node = TreeNode(int(val))
node.left = build()
node.right = build()
return node
return build()
Algorithm Selection Guide
Interview Cheat Sheet
| Topic | Key Takeaway |
|---|---|
| Tree traversals | Know all 4 (inorder, preorder, postorder, level-order) both recursive and iterative |
| BST property | left < root < right for entire subtrees, not just immediate children |
| BFS | Queue-based, level-by-level, finds shortest path in unweighted graphs |
| DFS | Stack/recursion, goes deep first, used for cycle detection and path problems |
| Topological sort | Kahn's (BFS + in-degree) or DFS postorder reverse. Only works on DAGs |
| Cycle detection | DFS with white/gray/black coloring for directed; Union-Find for undirected |
| Dijkstra | Heap-based, for weighted shortest path. O((V+E) log V) |
| Union-Find | Near O(1) per operation with path compression + union by rank |
| Trie | O(m) insert/search for strings of length m. Used for prefix matching |
| Number of Islands | DFS/BFS flood fill on 2D grid. Classic connected components |
Spaced Repetition Checkpoints
| Day | Review Activity | Time |
|---|---|---|
| Day 0 | Implement all four tree traversals from memory. Solve Number of Islands. | 40 min |
| Day 3 | Implement topological sort (Kahn's) from memory. Test with ML pipeline example. | 20 min |
| Day 7 | Solve 2 new tree problems and 1 graph problem from LeetCode. | 30 min |
| Day 14 | Implement Union-Find from scratch. Solve connected components problem. | 20 min |
| Day 21 | Without reference, write BFS shortest path, DFS cycle detection, and topological sort. Time yourself. | 40 min |
Next Steps
Trees and graphs handle hierarchical and network-structured data. Next, learn Dynamic Programming to tackle optimization problems that appear in sequence alignment, Viterbi decoding, and beam search.
