Skip to main content

Trees and Graphs

BFS, DFS, binary trees, BSTs, tries, graph representations, shortest paths, and topological sort - with ML-relevant practice problems and full Python solutions.

Reading time: ~40 min | Interview relevance: High | Roles: All AI/ML roles

The Real Interview Moment

You are in a virtual on-site at a major AI company. The interviewer shares a screen with a diagram: "Here is the dependency graph for our ML training pipeline. Node A is data ingestion, B is data validation, C is feature engineering (depends on A and B), D is model training (depends on C), E is evaluation (depends on D), and F is model registration (depends on D and E). We need to determine a valid execution order."

You immediately recognize this: topological sort on a DAG. You have seen this exact pattern in production ML systems - Airflow DAGs, Kubeflow pipelines, MLflow projects. You explain the approach: build an adjacency list, compute in-degrees, use a queue to process nodes with zero in-degree. Within ten minutes, you have clean, working code.

The follow-up comes: "What if there is a cycle? How do you detect it?" You explain that if you process fewer nodes than exist in the graph, there must be a cycle. The interviewer probes further: "Can you identify which nodes are in the cycle?" Now you are in territory where DFS with node coloring (white/gray/black) shines.

Trees and graphs are not abstract textbook concepts - they are the backbone of ML systems. This chapter prepares you for both the algorithmic questions and the practical applications.

Binary Trees

Core Traversals

Every tree problem builds on four traversals. Know them cold.

class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right

# ============ RECURSIVE TRAVERSALS ============

def inorder(root):
"""Left -> Root -> Right. Gives sorted order for BST.
Time: O(n), Space: O(h) call stack where h = height
"""
if not root:
return []
return inorder(root.left) + [root.val] + inorder(root.right)

def preorder(root):
"""Root -> Left -> Right. Useful for tree serialization.
Time: O(n), Space: O(h)
"""
if not root:
return []
return [root.val] + preorder(root.left) + preorder(root.right)

def postorder(root):
"""Left -> Right -> Root. Useful for deletion, expression evaluation.
Time: O(n), Space: O(h)
"""
if not root:
return []
return postorder(root.left) + postorder(root.right) + [root.val]

# ============ ITERATIVE TRAVERSALS ============

def inorder_iterative(root):
"""Iterative inorder using an explicit stack."""
result = []
stack = []
current = root
while current or stack:
while current:
stack.append(current)
current = current.left
current = stack.pop()
result.append(current.val)
current = current.right
return result

def preorder_iterative(root):
"""Iterative preorder using a stack."""
if not root:
return []
result = []
stack = [root]
while stack:
node = stack.pop()
result.append(node.val)
# Push right first so left is processed first (LIFO)
if node.right:
stack.append(node.right)
if node.left:
stack.append(node.left)
return result

# ============ LEVEL ORDER (BFS) ============

from collections import deque

def level_order(root):
"""BFS traversal - returns list of lists, one per level.
Time: O(n), Space: O(w) where w = max width of tree
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level = []
for _ in range(len(queue)):
node = queue.popleft()
level.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(level)
return result

Traversal Comparison

TraversalOrderUse CasesML Application
InorderL -> Root -> RGet sorted order from BSTSorted feature extraction from decision tree
PreorderRoot -> L -> RSerialize tree, copy treeSerialize model tree for storage
PostorderL -> R -> RootDelete tree, evaluate expressionsCompute subtree metrics bottom-up
Level Order (BFS)Level by levelFind min depth, level averagesLayer-by-layer analysis of decision tree
60-Second Answer

If an interviewer asks "When would you use BFS vs DFS on a tree?", the answer is: Use BFS when you need level-by-level processing or finding the shortest path (min depth, level averages, right-side view). Use DFS when you need to process leaves before parents (path sums, tree diameter, subtree computations) or when the tree is very wide (BFS queue would be too large).

Common Binary Tree Problems

def max_depth(root):
"""Maximum depth of a binary tree.
Time: O(n), Space: O(h)
"""
if not root:
return 0
return 1 + max(max_depth(root.left), max_depth(root.right))

def is_balanced(root):
"""Check if tree is height-balanced (every subtree's height differs by at most 1).
Time: O(n), Space: O(h)
"""
def check(node):
if not node:
return 0
left_h = check(node.left)
right_h = check(node.right)
if left_h == -1 or right_h == -1 or abs(left_h - right_h) > 1:
return -1
return 1 + max(left_h, right_h)
return check(root) != -1

def diameter(root):
"""Diameter of binary tree (longest path between any two nodes).
Time: O(n), Space: O(h)
"""
max_diameter = [0]
def depth(node):
if not node:
return 0
left = depth(node.left)
right = depth(node.right)
max_diameter[0] = max(max_diameter[0], left + right)
return 1 + max(left, right)
depth(root)
return max_diameter[0]

def lowest_common_ancestor(root, p, q):
"""Find the lowest common ancestor of two nodes.
Time: O(n), Space: O(h)
"""
if not root or root == p or root == q:
return root
left = lowest_common_ancestor(root.left, p, q)
right = lowest_common_ancestor(root.right, p, q)
if left and right:
return root
return left or right

def has_path_sum(root, target):
"""Check if any root-to-leaf path sums to target.
ML context: Check if any branch of a decision tree
produces a cumulative score above a threshold.
Time: O(n), Space: O(h)
"""
if not root:
return False
if not root.left and not root.right:
return root.val == target
return (has_path_sum(root.left, target - root.val) or
has_path_sum(root.right, target - root.val))

Binary Search Trees (BSTs)

BSTs maintain the property: left.val < node.val < right.val. This enables O(log n) search, insert, and delete in balanced trees.

def search_bst(root, target):
"""Search for a value in a BST.
Time: O(h), Space: O(h) recursive / O(1) iterative
"""
if not root or root.val == target:
return root
if target < root.val:
return search_bst(root.left, target)
return search_bst(root.right, target)

def insert_bst(root, val):
"""Insert a value into a BST.
Time: O(h), Space: O(h)
"""
if not root:
return TreeNode(val)
if val < root.val:
root.left = insert_bst(root.left, val)
elif val > root.val:
root.right = insert_bst(root.right, val)
return root

def validate_bst(root, low=float('-inf'), high=float('inf')):
"""Validate that a tree is a valid BST.
Time: O(n), Space: O(h)
"""
if not root:
return True
if root.val <= low or root.val >= high:
return False
return (validate_bst(root.left, low, root.val) and
validate_bst(root.right, root.val, high))

def kth_smallest_bst(root, k):
"""Find the kth smallest element in a BST.
Approach: Inorder traversal gives sorted order.
Time: O(h + k), Space: O(h)
"""
stack = []
current = root
count = 0
while current or stack:
while current:
stack.append(current)
current = current.left
current = stack.pop()
count += 1
if count == k:
return current.val
current = current.right
return -1
Common Trap

When validating a BST, do not just check that left.val < node.val < right.val. You must check that all values in the left subtree are less than the node, and all values in the right subtree are greater. The correct approach passes valid ranges (low, high) down the recursion.

Tries (Prefix Trees)

Tries are essential for NLP-related problems: autocomplete, spell checking, prefix matching.

class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
self.count = 0 # Number of words with this prefix

class Trie:
"""Prefix tree for efficient string operations.

ML context: Building prefix-based search for token vocabularies,
autocomplete for text input, prefix matching in NLP pipelines.

Insert: O(m), Search: O(m), Prefix search: O(m)
where m = length of the word
Space: O(total characters across all words)
"""
def __init__(self):
self.root = TrieNode()

def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.count += 1
node.is_end = True

def search(self, word):
"""Return True if word exists in trie."""
node = self._find_node(word)
return node is not None and node.is_end

def starts_with(self, prefix):
"""Return True if any word starts with prefix."""
return self._find_node(prefix) is not None

def count_with_prefix(self, prefix):
"""Count words with given prefix."""
node = self._find_node(prefix)
return node.count if node else 0

def _find_node(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return None
node = node.children[char]
return node

def autocomplete(self, prefix, limit=5):
"""Return up to `limit` words starting with prefix.

ML context: Autocomplete suggestions based on prefix.
"""
node = self._find_node(prefix)
if not node:
return []
results = []
self._dfs_collect(node, prefix, results, limit)
return results

def _dfs_collect(self, node, current, results, limit):
if len(results) >= limit:
return
if node.is_end:
results.append(current)
for char in sorted(node.children):
self._dfs_collect(node.children[char], current + char, results, limit)

Graph Representations

Graph Representations - Choosing the Right One

# Adjacency list - most common for interviews
def build_adjacency_list(edges, directed=True):
"""Build adjacency list from edge list.
Time: O(E), Space: O(V + E)
"""
from collections import defaultdict
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
if not directed:
graph[v].append(u)
return graph

# Adjacency matrix - for dense graphs
def build_adjacency_matrix(n, edges, directed=True):
"""Build adjacency matrix from edge list.
Time: O(V^2 + E), Space: O(V^2)
"""
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
matrix[u][v] = 1
if not directed:
matrix[v][u] = 1
return matrix

Graph Traversals: BFS and DFS

Breadth-First Search (BFS)

from collections import deque

def bfs(graph, start):
"""BFS traversal of a graph.
Time: O(V + E), Space: O(V)
"""
visited = set([start])
queue = deque([start])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return order

def shortest_path_bfs(graph, start, end):
"""Find shortest path in unweighted graph using BFS.

ML context: Finding the shortest dependency chain
between two stages in an ML pipeline.

Time: O(V + E), Space: O(V)
"""
if start == end:
return [start]
visited = set([start])
queue = deque([(start, [start])])
while queue:
node, path = queue.popleft()
for neighbor in graph.get(node, []):
if neighbor == end:
return path + [neighbor]
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, path + [neighbor]))
return [] # No path exists

def bfs_level_by_level(graph, start):
"""BFS that processes level by level (useful for distance calculations).
Time: O(V + E), Space: O(V)
"""
visited = set([start])
queue = deque([start])
levels = []
while queue:
level = []
for _ in range(len(queue)):
node = queue.popleft()
level.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
levels.append(level)
return levels

Depth-First Search (DFS)

def dfs_recursive(graph, start, visited=None):
"""DFS traversal (recursive).
Time: O(V + E), Space: O(V) for visited set + O(V) call stack
"""
if visited is None:
visited = set()
visited.add(start)
result = [start]
for neighbor in graph.get(start, []):
if neighbor not in visited:
result.extend(dfs_recursive(graph, neighbor, visited))
return result

def dfs_iterative(graph, start):
"""DFS traversal (iterative with explicit stack).
Time: O(V + E), Space: O(V)
"""
visited = set()
stack = [start]
result = []
while stack:
node = stack.pop()
if node in visited:
continue
visited.add(node)
result.append(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
stack.append(neighbor)
return result

def has_cycle_directed(graph, num_nodes):
"""Detect cycle in a directed graph using DFS with 3-color marking.

WHITE (0) = unvisited
GRAY (1) = in current DFS path (being processed)
BLACK (2) = fully processed

If we encounter a GRAY node, we have found a back edge (cycle).

ML context: Detecting circular dependencies in ML pipeline DAGs.

Time: O(V + E), Space: O(V)
"""
WHITE, GRAY, BLACK = 0, 1, 2
color = {i: WHITE for i in range(num_nodes)}

def dfs(node):
color[node] = GRAY
for neighbor in graph.get(node, []):
if color[neighbor] == GRAY:
return True # Back edge found - cycle exists
if color[neighbor] == WHITE and dfs(neighbor):
return True
color[node] = BLACK
return False

for node in range(num_nodes):
if color[node] == WHITE:
if dfs(node):
return True
return False

BFS vs DFS Comparison

AspectBFSDFS
Data structureQueueStack (or recursion)
Exploration patternLevel by levelAs deep as possible
Shortest path (unweighted)YesNo
Space complexityO(width)O(depth)
Best forShortest path, level-order, wide graphsCycle detection, path finding, deep graphs
ML use casesPipeline stage distance, layer analysisDependency detection, tree serialization

Topological Sort

Topological sort orders nodes in a DAG such that for every edge (u, v), u comes before v. This is critical for ML pipeline execution ordering.

from collections import deque, defaultdict

def topological_sort_kahn(graph, num_nodes):
"""Topological sort using Kahn's algorithm (BFS-based).

ML context: Determining execution order of ML pipeline stages.
data_ingestion -> preprocessing -> feature_engineering -> training -> evaluation

Time: O(V + E), Space: O(V)
"""
# Compute in-degrees
in_degree = [0] * num_nodes
for node in graph:
for neighbor in graph[node]:
in_degree[neighbor] += 1

# Start with nodes that have no dependencies
queue = deque([i for i in range(num_nodes) if in_degree[i] == 0])
order = []

while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph.get(node, []):
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)

# If we processed fewer nodes than exist, there is a cycle
if len(order) != num_nodes:
return [] # Cycle detected - no valid topological order
return order

def topological_sort_dfs(graph, num_nodes):
"""Topological sort using DFS (adds nodes in reverse postorder).
Time: O(V + E), Space: O(V)
"""
visited = set()
order = []

def dfs(node):
visited.add(node)
for neighbor in graph.get(node, []):
if neighbor not in visited:
dfs(neighbor)
order.append(node)

for node in range(num_nodes):
if node not in visited:
dfs(node)

return order[::-1] # Reverse postorder

# Example: ML Pipeline
def pipeline_execution_order():
"""Determine valid execution order for an ML pipeline."""
stages = {
0: 'data_ingestion',
1: 'data_validation',
2: 'feature_engineering',
3: 'model_training',
4: 'evaluation',
5: 'model_registration'
}

# Dependencies (directed edges)
graph = {
0: [2], # data_ingestion -> feature_engineering
1: [2], # data_validation -> feature_engineering
2: [3], # feature_engineering -> model_training
3: [4, 5], # model_training -> evaluation, model_registration
4: [5], # evaluation -> model_registration
5: []
}

order = topological_sort_kahn(graph, 6)
return [stages[i] for i in order]
# Possible output: ['data_ingestion', 'data_validation',
# 'feature_engineering', 'model_training',
# 'evaluation', 'model_registration']
Instant Rejection

If you cannot implement topological sort in an ML interview, it is a significant gap. ML engineers work with DAG-based systems (Airflow, Kubeflow, Dagster) daily. Interviewers expect you to know Kahn's algorithm (BFS-based) or DFS-based topological sort from memory.

Shortest Path Algorithms

Dijkstra's Algorithm

import heapq

def dijkstra(graph, start):
"""Find shortest path from start to all other nodes in a weighted graph.

graph[node] = [(neighbor, weight), ...]

ML context: Finding the lowest-cost path through a model
ensemble or pipeline with weighted processing costs.

Time: O((V + E) log V), Space: O(V)
"""
distances = {start: 0}
heap = [(0, start)] # (distance, node)
visited = set()

while heap:
dist, node = heapq.heappop(heap)
if node in visited:
continue
visited.add(node)

for neighbor, weight in graph.get(node, []):
new_dist = dist + weight
if neighbor not in distances or new_dist < distances[neighbor]:
distances[neighbor] = new_dist
heapq.heappush(heap, (new_dist, neighbor))

return distances

# Example: Finding cheapest compute path
# graph = {
# 'cpu': [('gpu_small', 5), ('tpu', 20)],
# 'gpu_small': [('gpu_large', 3), ('cpu', 5)],
# 'gpu_large': [('tpu', 2)],
# 'tpu': []
# }
# dijkstra(graph, 'cpu') -> {'cpu': 0, 'gpu_small': 5, 'gpu_large': 8, 'tpu': 10}

Union-Find (Disjoint Set)

class UnionFind:
"""Union-Find with path compression and union by rank.

ML context: Clustering - grouping similar items,
connected components in similarity graphs.

Find: O(alpha(n)) ~ O(1) amortized
Union: O(alpha(n)) ~ O(1) amortized
"""
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
self.count = n # Number of connected components

def find(self, x):
"""Find root with path compression."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]

def union(self, x, y):
"""Union by rank. Returns True if x and y were in different components."""
rx, ry = self.find(x), self.find(y)
if rx == ry:
return False
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx
self.parent[ry] = rx
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1
self.count -= 1
return True

def connected(self, x, y):
return self.find(x) == self.find(y)

def count_connected_components(n, edges):
"""Count connected components using Union-Find.

ML context: Finding clusters in a similarity graph
where edges connect similar data points.

Time: O(E * alpha(V)) ~ O(E), Space: O(V)
"""
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.count

Number of Islands (Classic Graph Problem)

def num_islands(grid):
"""Count the number of islands in a 2D grid.

ML context: Connected component analysis in image segmentation.
Each '1' is land, '0' is water. Count distinct land masses.

Time: O(rows * cols), Space: O(rows * cols) for visited
"""
if not grid:
return 0

rows, cols = len(grid), len(grid[0])
visited = set()
count = 0

def dfs(r, c):
if (r < 0 or r >= rows or c < 0 or c >= cols or
grid[r][c] == '0' or (r, c) in visited):
return
visited.add((r, c))
dfs(r + 1, c)
dfs(r - 1, c)
dfs(r, c + 1)
dfs(r, c - 1)

for r in range(rows):
for c in range(cols):
if grid[r][c] == '1' and (r, c) not in visited:
dfs(r, c)
count += 1

return count

Practice Problems

Problem 1: Validate ML Pipeline DAG

Given an ML pipeline as a list of (source, destination) dependencies, determine if the pipeline is valid (no circular dependencies) and return a valid execution order.

# Input: num_stages = 5
# dependencies = [(0,1), (0,2), (1,3), (2,3), (3,4)]
# Output: [0, 1, 2, 3, 4] or [0, 2, 1, 3, 4] (any valid order)

# Input: num_stages = 3
# dependencies = [(0,1), (1,2), (2,0)]
# Output: [] (cycle detected - invalid pipeline)
Hint 1
This is a direct application of topological sort with cycle detection.
Hint 2
Use Kahn's algorithm. Build in-degree array. If the result has fewer nodes than num_stages, there is a cycle.
Hint 3
The graph is directed: dependency (a, b) means a must run before b, so add edge a -> b.
Solution
from collections import defaultdict, deque

def validate_pipeline(num_stages, dependencies):
"""Return valid execution order or empty list if cycle exists.
Time: O(V + E), Space: O(V + E)
"""
graph = defaultdict(list)
in_degree = [0] * num_stages

for src, dst in dependencies:
graph[src].append(dst)
in_degree[dst] += 1

queue = deque([i for i in range(num_stages) if in_degree[i] == 0])
order = []

while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)

return order if len(order) == num_stages else []

Problem 2: Binary Tree Right Side View

Given a binary tree, return the values visible from the right side (the last node at each level).

# Input: 1
# / \
# 2 3
# \ \
# 5 4
# Output: [1, 3, 4]
Hint 1
Use level-order traversal (BFS). The right-side view is the last element of each level.
Hint 2
Alternatively, use DFS visiting right child first, and track the first node seen at each depth.
Hint 3
For BFS: process each level, and append only the last element. For DFS: if `depth == len(result)`, it is the first node at this depth from the right.
Solution
from collections import deque

def right_side_view(root):
"""Return right-side view of binary tree.
Time: O(n), Space: O(w) for BFS / O(h) for DFS
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue)
for i in range(level_size):
node = queue.popleft()
if i == level_size - 1:
result.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
return result

Problem 3: Clone a Graph

Given a reference to a node in a connected undirected graph, return a deep copy of the graph.

# ML context: Cloning a computation graph for gradient checkpointing
# or creating a copy of a model architecture graph.
Hint 1
Use BFS or DFS. The key challenge is avoiding infinite loops with cycles - use a hash map to track which nodes have already been cloned.
Hint 2
Map: `original node -> cloned node`. Before processing neighbors, check if they have already been cloned.
Hint 3
Process: create clone, add to map, iterate through neighbors, clone each uncloned neighbor and recurse.
Solution
class GraphNode:
def __init__(self, val=0, neighbors=None):
self.val = val
self.neighbors = neighbors if neighbors else []

def clone_graph(node):
"""Deep copy a graph using DFS + hash map.
Time: O(V + E), Space: O(V)
"""
if not node:
return None

cloned = {} # original -> clone mapping

def dfs(original):
if original in cloned:
return cloned[original]
copy = GraphNode(original.val)
cloned[original] = copy
for neighbor in original.neighbors:
copy.neighbors.append(dfs(neighbor))
return copy

return dfs(node)

Problem 4: Word Ladder (BFS Shortest Transformation)

Given a begin word, an end word, and a dictionary, find the shortest transformation sequence (each step changes one letter).

# Input: beginWord = "hit", endWord = "cog"
# wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
# Output: 5 (hit -> hot -> dot -> dog -> cog)

# ML context: Analogous to finding the shortest path in
# a word embedding neighborhood graph.
Hint 1
Model as a graph problem: each word is a node, and two words are connected if they differ by exactly one character. Use BFS for shortest path.
Hint 2
Instead of checking all pairs of words (O(n^2 * m)), create a pattern map: "h*t" -> ["hit", "hot"]. This builds the graph in O(n * m).
Hint 3
BFS from beginWord. At each step, generate all wildcard patterns for the current word and find matching words from the pattern map.
Solution
from collections import defaultdict, deque

def ladder_length(begin_word, end_word, word_list):
"""Shortest word transformation sequence using BFS.
Time: O(n * m^2) where n = words, m = word length
Space: O(n * m)
"""
if end_word not in word_list:
return 0

word_set = set(word_list)
m = len(begin_word)

# Build pattern map: "h*t" -> ["hit", "hot"]
pattern_map = defaultdict(list)
for word in word_set:
for i in range(m):
pattern = word[:i] + '*' + word[i + 1:]
pattern_map[pattern].append(word)

visited = {begin_word}
queue = deque([(begin_word, 1)])

while queue:
word, depth = queue.popleft()
for i in range(m):
pattern = word[:i] + '*' + word[i + 1:]
for neighbor in pattern_map[pattern]:
if neighbor == end_word:
return depth + 1
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, depth + 1))

return 0

Problem 5: Course Schedule (Cycle Detection)

Given the number of courses and prerequisite pairs, determine if it is possible to finish all courses (no circular dependencies).

# Input: numCourses = 4, prerequisites = [[1,0], [2,1], [3,2]]
# Output: True (take 0 -> 1 -> 2 -> 3)

# Input: numCourses = 2, prerequisites = [[0,1], [1,0]]
# Output: False (circular dependency)
Hint 1
This is cycle detection in a directed graph. A valid course schedule exists if and only if the prerequisite graph is a DAG.
Hint 2
Use topological sort (Kahn's algorithm). If you can process all nodes, there is no cycle.
Hint 3
Edge [a, b] means "to take course a, you must first take course b" - so the edge is b -> a.
Solution
from collections import defaultdict, deque

def can_finish(num_courses, prerequisites):
"""Determine if all courses can be completed.
Time: O(V + E), Space: O(V + E)
"""
graph = defaultdict(list)
in_degree = [0] * num_courses

for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1

queue = deque([i for i in range(num_courses) if in_degree[i] == 0])
count = 0

while queue:
node = queue.popleft()
count += 1
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)

return count == num_courses

Problem 6: Serialize and Deserialize Binary Tree

Design functions to serialize a binary tree to a string and deserialize it back.

# ML context: Saving and loading decision tree models,
# serializing computation graphs for distributed training.
Hint 1
Use preorder traversal with a special marker for null nodes.
Hint 2
Serialize: preorder, writing "null" for empty nodes. Deserialize: read values in order, using "null" to know when a subtree is empty.
Hint 3
Use an iterator/index to track position during deserialization.
Solution
class Codec:
"""Serialize and deserialize a binary tree.
Time: O(n) for both operations, Space: O(n)
"""
def serialize(self, root):
"""Preorder traversal with null markers."""
tokens = []
def preorder(node):
if not node:
tokens.append('null')
return
tokens.append(str(node.val))
preorder(node.left)
preorder(node.right)
preorder(root)
return ','.join(tokens)

def deserialize(self, data):
"""Reconstruct tree from serialized string."""
tokens = iter(data.split(','))

def build():
val = next(tokens)
if val == 'null':
return None
node = TreeNode(int(val))
node.left = build()
node.right = build()
return node

return build()

Algorithm Selection Guide

Graph and Tree Algorithm Selection Guide

Interview Cheat Sheet

TopicKey Takeaway
Tree traversalsKnow all 4 (inorder, preorder, postorder, level-order) both recursive and iterative
BST propertyleft < root < right for entire subtrees, not just immediate children
BFSQueue-based, level-by-level, finds shortest path in unweighted graphs
DFSStack/recursion, goes deep first, used for cycle detection and path problems
Topological sortKahn's (BFS + in-degree) or DFS postorder reverse. Only works on DAGs
Cycle detectionDFS with white/gray/black coloring for directed; Union-Find for undirected
DijkstraHeap-based, for weighted shortest path. O((V+E) log V)
Union-FindNear O(1) per operation with path compression + union by rank
TrieO(m) insert/search for strings of length m. Used for prefix matching
Number of IslandsDFS/BFS flood fill on 2D grid. Classic connected components

Spaced Repetition Checkpoints

DayReview ActivityTime
Day 0Implement all four tree traversals from memory. Solve Number of Islands.40 min
Day 3Implement topological sort (Kahn's) from memory. Test with ML pipeline example.20 min
Day 7Solve 2 new tree problems and 1 graph problem from LeetCode.30 min
Day 14Implement Union-Find from scratch. Solve connected components problem.20 min
Day 21Without reference, write BFS shortest path, DFS cycle detection, and topological sort. Time yourself.40 min

Next Steps

Trees and graphs handle hierarchical and network-structured data. Next, learn Dynamic Programming to tackle optimization problems that appear in sequence alignment, Viterbi decoding, and beam search.

© 2026 EngineersOfAI. All rights reserved.