Research Engineer Problem List

Reading time: ~40 min | Interview relevance: Critical | Roles: Research Engineer, Research Scientist, Applied Research Scientist, ML Research Engineer

A Research Engineer at a top AI lab opens your interview with: "Here is a paper we published last month. Read the abstract and Section 3. You have 45 minutes to implement the core algorithm." No LeetCode. No system design templates. Just you, a whiteboard (or laptop), and the mathematical heart of a new method. If that sounds exciting rather than terrifying, this role is for you.

Research Engineer interviews are the most technically demanding in AI/ML. They test deep mathematical understanding, algorithm implementation from papers, and the ability to critically evaluate research. This list of 45 problems prepares you for all four dimensions: paper implementation, mathematical reasoning, algorithm coding, and research taste.

Research Engineer Interview Structure

Round	Duration	What They Test	Weight
Algorithm Implementation	60-90 min	Code an algorithm from a paper or description	30-35%
Math & Theory	45-60 min	Probability, linear algebra, optimization, information theory	20-25%
Paper Discussion	45-60 min	Critically analyze a paper, propose improvements	20-25%
Coding	45-60 min	CS fundamentals, data structures and algorithms	15-20%
Research Taste	30-45 min	What problems matter? Where is the field going?	5-10%

:::tip The Research Engineer Bar

Research Engineers are expected to bridge the gap between mathematical ideas and working code. You must be comfortable reading equations, understanding the intuition behind them, and implementing them efficiently. This is a rare combination.
:::

Section 1: Paper Implementation (12 Problems)

These problems simulate the most distinctive part of Research Engineer interviews: implementing algorithms from paper descriptions.

Core Algorithm Implementations

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
1	Implement Multi-Head Self-Attention	Hard	40 min	Scaled dot-product attention, head splitting, concatenation	The foundation of Transformers; must be second nature	DeepMind, Google, OpenAI, Anthropic
2	Implement Byte-Pair Encoding (BPE) Tokenizer	Medium	30 min	Iterative merge of frequent pairs	Core NLP preprocessing; appears in every LLM paper	OpenAI, Google, Meta
3	Implement Beam Search Decoding	Medium	30 min	Breadth-limited search, log probability accumulation	Standard decoding strategy for sequence models	Google, Meta, AI Labs
4	Implement a Simple GAN Training Loop	Hard	40 min	Alternating optimization, generator/discriminator interplay	Tests understanding of adversarial training dynamics	DeepMind, OpenAI, Meta
5	Implement Contrastive Learning (SimCLR-style)	Hard	40 min	Data augmentation, projection head, NT-Xent loss	Self-supervised learning is a major research direction	Google, Meta, DeepMind
6	Implement REINFORCE Policy Gradient	Hard	35 min	Log-derivative (score function) trick, baseline subtraction, variance reduction	Foundation of RLHF and RL research	DeepMind, OpenAI, Anthropic
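
As an illustration of problem #2, the "iterative merge of frequent pairs" can be sketched in a few functions. This is a minimal character-level sketch, not a production tokenizer: the function names (`pair_counts`, `merge_pair`, `learn_bpe`), the `{word: count}` corpus format, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter


def pair_counts(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a {word: count} corpus."""
    words = {tuple(word): freq for word, freq in corpus.items()}
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(words)
        if not counts:
            break
        # Assumption: ties go to the pair that was counted first.
        best = max(counts, key=counts.get)
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words
```

On the toy corpus `{"low": 5, "lower": 2, "newest": 6, "widest": 3}`, the first two learned merges are `('e', 's')` followed by `('es', 't')`. In an interview, be ready to discuss the cost of recounting all pairs on every iteration and how incremental count updates avoid it.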

Advanced Implementations

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
7	Implement Flash Attention (Simplified)	Hard	45 min	Tiled computation, memory-efficient attention	Critical optimization for large-scale Transformers	AI Labs
8	Implement LoRA (Low-Rank Adaptation)	Medium	30 min	Low-rank decomposition of weight updates	Standard parameter-efficient fine-tuning	AI Labs, Big Tech
9	Implement Rotary Position Embeddings (RoPE)	Hard	35 min	Rotation matrices, relative position encoding	Used in most modern LLMs	AI Labs
10	Implement a VAE (Variational Autoencoder)	Hard	40 min	Reparameterization trick, ELBO, KL divergence	Core generative model; tests probabilistic ML depth	DeepMind, OpenAI, Meta
11	Implement Grouped-Query Attention (GQA)	Medium	25 min	Key-value head sharing, memory reduction	Efficiency technique in modern Transformers	Google, Meta
12	Implement DDPM Noise Schedule and Forward Process	Hard	35 min	Gaussian noise addition, variance (noise) schedule	Diffusion models are a major research area	Google, OpenAI, Stability AI
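
To make problem #8 concrete, here is a minimal NumPy sketch of a LoRA-style linear layer (forward pass only). The class name `LoRALinear`, the default rank and `alpha`, and the initialization scale are assumptions chosen for illustration. The structure follows the usual description: a frozen weight `W` plus a scaled low-rank update `B @ A`, with `B` initialized to zero so the layer starts out identical to the base layer.

```python
import numpy as np


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r=4, alpha=8.0, seed=0):
        d_out, d_in = W.shape
        rng = np.random.default_rng(seed)
        self.W = W                                       # frozen, (d_out, d_in)
        self.A = rng.normal(0.0, 0.01, size=(r, d_in))   # trainable, small random init
        self.B = np.zeros((d_out, r))                    # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        # x: (..., d_in). The low-rank path never materializes a d_out x d_in matrix.
        base = x @ self.W.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def merged_weight(self):
        """Fold the update into W so inference needs a single matmul."""
        return self.W + self.scale * (self.B @ self.A)
```

The trainable parameter count is `r * (d_in + d_out)` instead of `d_in * d_out`, which is the point interviewers usually want you to state explicitly.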

:::warning Paper Implementation Tips

Read the math first, then the code. Do not jump to implementation.
Identify the core computation (usually 3-5 lines of math).
Implement a naive version first, then optimize.
Always verify dimensions with small examples.
Comment your code with equation references.
:::

Section 2: Mathematical Reasoning (12 Problems)

These problems test your ability to reason mathematically about ML concepts -- a core requirement for research roles.

Linear Algebra & Optimization

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
13	Derive the Gradient of Softmax Cross-Entropy Loss	Medium	25 min	Chain rule, Jacobian of softmax	Every neural network uses this; derivation tests understanding	All AI Labs
14	Prove That SVD Gives the Best Low-Rank Approximation	Hard	30 min	Eckart-Young theorem, Frobenius norm minimization	Foundation of dimensionality reduction and compression	DeepMind, Google
15	Derive the Update Rules for Adam Optimizer	Medium	25 min	Exponential moving averages, bias correction	Most common optimizer; understanding internals matters	All
16	Explain and Derive the Reparameterization Trick	Medium	25 min	Pathwise gradient estimation, sampling from distributions	Critical for VAEs and stochastic computation graphs	DeepMind, OpenAI, Meta
17	Derive the Gradient of Attention with Respect to Queries	Hard	30 min	Matrix calculus, softmax Jacobian, chain rule	Deep understanding of Transformer training	AI Labs
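
For problem #13, the derivation ends at a compact result: for logits `z` and a target `y` that sums to 1, the gradient of the cross-entropy loss with respect to `z` is `softmax(z) - y`. A useful habit is to check any derived gradient numerically. The sketch below does that with central finite differences. The helper names and the step size `eps` are assumptions for this example.

```python
import numpy as np


def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def cross_entropy(z, y):
    """Loss for a 1-D logit vector z and a target distribution y."""
    return -np.sum(y * np.log(softmax(z)))


def analytic_grad(z, y):
    """Derived result: dL/dz = softmax(z) - y (requires sum(y) == 1)."""
    return softmax(z) - y


def numeric_grad(z, y, eps=1e-6):
    """Central finite-difference estimate of dL/dz, one coordinate at a time."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        grad[i] = (cross_entropy(z + step, y) - cross_entropy(z - step, y)) / (2 * eps)
    return grad
```

If the two gradients disagree beyond roughly `1e-6`, the derivation (or the code) has a mistake. The same check works for the attention gradient in problem #17.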

Probability & Information Theory

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
18	Derive the ELBO (Evidence Lower Bound) for VAEs	Hard	30 min	Jensen's inequality, KL divergence decomposition	Foundation of variational inference	DeepMind, OpenAI, Meta
19	Prove That KL Divergence Is Non-Negative	Medium	20 min	Jensen's inequality, convexity of -log	Basic information theory; tests mathematical rigor	All AI Labs
20	Calculate the Entropy of a Mixture of Gaussians	Medium	25 min	Mixture model entropy bounds, Monte Carlo estimation	Mixture models appear throughout ML	DeepMind, Google
21	Derive the Bias-Variance Decomposition for MSE	Medium	20 min	Expectation algebra, law of total expectation	Foundational ML theory	All
22	Explain Why Dropout Works as Approximate Bayesian Inference	Hard	30 min	Monte Carlo dropout, model uncertainty	Connects practical technique to theory	DeepMind, Google
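
As a worked example for problem #19, the Jensen's-inequality argument fits in one line. This is a sketch that assumes p and q are densities and takes the expectation over the support of p. It uses the concavity of log, which is equivalent to the convexity of -log named in the table.

```latex
-D_{\mathrm{KL}}(p \,\|\, q)
  = \mathbb{E}_{x \sim p}\!\left[\log \frac{q(x)}{p(x)}\right]
  \le \log \mathbb{E}_{x \sim p}\!\left[\frac{q(x)}{p(x)}\right]
  = \log \int_{\{p > 0\}} q(x)\,dx
  \le \log 1 = 0
```

Hence D_KL(p || q) >= 0. Because log is strictly concave, equality in the Jensen step requires q(x)/p(x) to be constant on the support of p, which together with normalization gives p = q almost everywhere.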

Analysis & Convergence

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
23	Prove Convergence of SGD Under Convexity Assumptions	Hard	35 min	Learning rate schedule, expected loss decrease	Optimization theory for deep learning	DeepMind, Google
24	Analyze the Computational Complexity of Transformer Self-Attention	Medium	20 min	O(n^2*d) time, O(n^2) attention-matrix memory, alternatives	Understanding scaling is critical for LLM research	All AI Labs

Section 3: Research-Flavored Coding (10 Problems)

These problems test strong CS fundamentals with a research twist -- the kind of algorithmic thinking needed to make research ideas work in practice.

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
25	Implement Efficient Top-K Selection Without Full Sort	Medium	20 min	Quickselect, partial sort	Used in beam search, top-K sampling, retrieval	All
26	Implement a Bloom Filter for Deduplication	Medium	25 min	Probabilistic data structure, false positive analysis	Used in training data deduplication, web crawling	Google, Meta
27	Implement A* Search for Shortest Path	Medium	30 min	Heuristic search, priority queue	Planning in agents, graph-based reasoning	DeepMind, Google
28	Implement Sparse Matrix Multiplication	Hard	35 min	CSR/CSC format, efficient iteration	Sparse operations are critical for large-scale models	Google, DeepMind
29	Implement the Hungarian Algorithm for Optimal Assignment	Hard	40 min	Bipartite matching, augmenting paths	Used in DETR (object detection), evaluation metrics	DeepMind, Meta
30	Implement Dijkstra's Algorithm with Decrease-Key	Medium	25 min	Priority queue with updates	Graph reasoning, shortest path problems	All
31	Implement a KD-Tree for Nearest Neighbor Search	Hard	35 min	Space partitioning, recursive construction	Efficient search in embedding spaces	Google, DeepMind
32	Implement Online Learning (Perceptron with Mistake Bound)	Medium	25 min	Online update, mistake-driven learning	Foundation of online/streaming ML	DeepMind, Google
33	Implement Parallel Prefix Sum (Scan)	Medium	25 min	Work-efficient parallel algorithm	Foundation of GPU programming, parallel reductions	AI Labs
34	Implement Consistent Hashing for Distributed Data	Medium	25 min	Hash ring, virtual nodes	Distributed training data partitioning	Google, Meta
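
Problem #25 asks for top-K selection without a full sort. An interviewer will usually want a hand-written quickselect. The sketch below instead swaps in NumPy's `np.argpartition`, which performs the same kind of partial selection, to show the overall shape of the solution: select the k largest in linear expected time, then sort only those k. The function name `top_k` and the 1-D input are assumptions for this example.

```python
import numpy as np


def top_k(scores, k):
    """Return indices of the k largest scores, ordered from largest to smallest."""
    scores = np.asarray(scores)
    if scores.ndim != 1 or not 0 < k <= scores.size:
        raise ValueError("scores must be 1-D and k must be in [1, len(scores)]")
    # Partition so the k largest values occupy the last k slots (in arbitrary order).
    candidates = np.argpartition(scores, scores.size - k)[-k:]
    # Sort only those k candidates: O(k log k) instead of O(n log n).
    order = np.argsort(scores[candidates])[::-1]
    return candidates[order]
```

For example, `top_k([0.1, 0.7, 0.3, 0.9, 0.5], 3)` returns the indices `[3, 1, 4]`. The same pattern applies to top-K sampling over a vocabulary-sized logit vector.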

Section 4: Paper Discussion & Research Taste (11 Problems)

These problems test your ability to read, critique, and extend research -- the hallmark of a strong research engineer.

Paper Analysis

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
35	Critique the Experimental Setup of "Attention Is All You Need"	Medium	25 min	Ablation design, baseline selection, evaluation metrics	Introduced the Transformer, the architecture behind modern LLMs	All AI Labs
36	Compare BERT vs. GPT Pre-Training Approaches: Strengths and Weaknesses	Medium	20 min	Masked LM vs. autoregressive, bidirectional vs. unidirectional	Foundation of modern NLP	All
37	Explain Why Scaling Laws Matter for LLM Development	Medium	25 min	Chinchilla scaling, compute-optimal training	Drives billion-dollar resource allocation decisions	OpenAI, Anthropic, DeepMind
38	Analyze the RLHF Pipeline: What Can Go Wrong?	Hard	30 min	Reward hacking, distribution shift, reward model limitations	Core technique for aligning LLMs	Anthropic, OpenAI, DeepMind
39	Explain Chain-of-Thought Prompting: Why Does It Work?	Medium	20 min	Implicit computation, reasoning traces	Emergent ability with practical implications	All

Research Direction Questions

#	Problem	Difficulty	Time	Key Concept	Why It Matters	Company Tags
40	What Are the Most Important Open Problems in AI Safety?	Medium	25 min	Alignment, interpretability, robustness	Safety research is a top priority at AI labs	Anthropic, OpenAI, DeepMind
41	Propose an Approach to Make Transformers Handle Long Contexts Efficiently	Hard	30 min	Efficient attention, memory mechanisms, retrieval augmentation	Active area of research with direct practical impact	All AI Labs
42	How Would You Evaluate Whether an LLM Truly "Understands" Language?	Hard	30 min	Behavioral tests, probing, mechanistic interpretability	Philosophical but practically relevant	Anthropic, DeepMind
43	Design an Experiment to Test Whether Larger Models Are More Calibrated	Medium	25 min	Experimental design, calibration metrics, controlled comparisons	Tests ability to design rigorous experiments	All AI Labs
44	What Research Would You Pursue Given Unlimited Compute for 6 Months?	Medium	20 min	Research vision, feasibility assessment, impact estimation	Tests research taste and ambition	All AI Labs
45	Critique a Recent Paper of Your Choice and Propose an Extension	Hard	30 min	Critical reading, identifying limitations, creative extension	The ultimate research engineer test	All AI Labs

:::note Research Taste Questions

There are no "right" answers to research taste questions. Interviewers are looking for:

Awareness of the current research landscape
Critical thinking about what matters and what doesn't
Originality in proposing new directions
Feasibility assessment -- wild ideas are fine if you acknowledge the challenges
Depth on at least one area you care about deeply
:::

6-Week Research Engineer Study Plan

Research Engineer preparation takes longer than typical interview prep because of the mathematical depth required, so this plan spans six weeks.

Week	Focus	Problems	Daily Load
Week 1	Core implementations	#1-6	1 implementation/day
Week 2	Advanced implementations	#7-12	1 implementation/day
Week 3	Mathematics	#13-24	2 proofs/derivations per day
Week 4	Research coding	#25-34	2 problems/day
Week 5	Paper discussion	#35-45	2 problems/day + read papers
Week 6	Integration + mock	Mixed	1 deep problem + 1 mock/day

Daily Practice Format for Research Engineers

[Figure: Research Engineer daily practice format, broken down into morning, afternoon, and evening]

:::tip Building Research Taste

Research taste is built over months, not days. Start reading papers now, even if you are months from interviews:

Subscribe to arXiv daily digests for your subfield
Follow AI researchers on Twitter/X for commentary
Attend online reading groups
Write brief summaries of papers you read
:::

Essential Math Reference

Linear Algebra Core

Concept	Where It Appears	Must Know
Matrix multiplication	Attention, linear layers	Dimensions, complexity
Eigendecomposition	PCA, spectral methods	Eigenvalues, eigenvectors
SVD	Compression, LoRA	Truncated SVD, rank
Matrix calculus	Backpropagation	Jacobian, chain rule
Positive definiteness	Kernel methods, covariance	Cholesky, eigenvalue test

Probability & Statistics Core

Concept	Where It Appears	Must Know
Bayes' theorem	Bayesian inference, posteriors	Prior, likelihood, posterior
KL divergence	VAEs, RLHF, distillation	Properties, computation
Entropy	Information theory, cross-entropy loss	Bits, nats, relationship to loss
Gaussian distribution	Everything	PDF, MLE, conjugate prior
Law of large numbers	SGD convergence	Weak vs. strong

Optimization Core

Concept	Where It Appears	Must Know
Gradient descent	All training	Learning rate, convergence
Convexity	Loss landscape analysis	Convex functions, local vs. global
Lagrange multipliers	Constrained optimization, SVM	KKT conditions
Stochastic optimization	SGD, Adam	Variance, bias, convergence
Second-order methods	L-BFGS, natural gradient	Hessian, Fisher information
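
The stochastic optimization row names Adam, and problem #15 asks for its update rules. As a reference point, here is a minimal sketch of a single Adam step with exponential moving averages and bias correction. The function name `adam_step` and its signature are assumptions for this example, and the default hyperparameters are the commonly used ones.

```python
import numpy as np


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. `t` is the 1-based step count; m and v start at zero."""
    m = beta1 * m + (1 - beta1) * grad          # EMA of gradients (first moment)
    v = beta2 * v + (1 - beta2) * grad ** 2     # EMA of squared gradients (second moment)
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

A quick sanity check on the bias correction: at `t = 1`, `m_hat` equals the gradient and `v_hat` equals its square, so the first update moves each parameter by approximately `lr` in the direction opposite the gradient's sign, regardless of the gradient's magnitude.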

Problem Deep Dive: Implement Multi-Head Self-Attention

This is the single most important implementation problem for research roles. Here is how to approach it:

The Math

Attention(Q, K, V) = softmax(Q @ K^T / sqrt(d_k)) @ V

Multi-Head:
  For each head i:
    Q_i = X @ W_Q_i    (project to head dimension)
    K_i = X @ W_K_i
    V_i = X @ W_V_i
    head_i = Attention(Q_i, K_i, V_i)

  Output = Concat(head_1, ..., head_h) @ W_O

Implementation Skeleton (NumPy)

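import numpy as np


def softmax(x, axis=-1):
    # Subtract the max along `axis` before exponentiating for numerical stability
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def multi_head_attention(X, W_Q, W_K, W_V, W_O, n_heads):
    # X: (batch, seq_len, d_model); each W_*: (d_model, d_model).
    # The per-head matrices W_Q_i from the math are stacked column-wise into W_Q.
    batch, seq_len, d_model = X.shape
    assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
    d_k = d_model // n_heads

    # Project to Q, K, V
    Q = X @ W_Q  # (batch, seq_len, d_model)
    K = X @ W_K
    V = X @ W_V

    # Reshape for multi-head: (batch, n_heads, seq_len, d_k)
    Q = Q.reshape(batch, seq_len, n_heads, d_k).transpose(0, 2, 1, 3)
    K = K.reshape(batch, seq_len, n_heads, d_k).transpose(0, 2, 1, 3)
    V = V.reshape(batch, seq_len, n_heads, d_k).transpose(0, 2, 1, 3)

    # Scaled dot-product attention
    scores = Q @ K.transpose(0, 1, 3, 2) / np.sqrt(d_k)  # (batch, heads, seq, seq)
    weights = softmax(scores, axis=-1)
    context = weights @ V  # (batch, heads, seq, d_k)

    # Concatenate heads: back to (batch, seq_len, d_model)
    context = context.transpose(0, 2, 1, 3).reshape(batch, seq_len, d_model)

    # Output projection
    output = context @ W_O
    return output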

What Interviewers Check

Correct scaling by sqrt(d_k) -- prevents softmax saturation
Correct reshape and transpose for multi-head splitting
Correct concatenation order after attention
Numerically stable softmax (subtract max before exp)
Discussion of masking for decoder (causal mask)
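Complexity analysis: O(n^2 * d) time for attention plus O(n * d^2) for the projections; O(h * n^2 + n * d) memory, dominated by the attention matrices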
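
Two items on the checklist above, the numerically stable softmax and the causal mask, are easy to state and easy to get wrong. The following is a minimal single-head sketch, assuming NumPy arrays shaped `(..., seq_len, d_k)`. The names `stable_softmax` and `causal_attention` are chosen for this example.

```python
import numpy as np


def stable_softmax(x, axis=-1):
    # Subtracting the max leaves the result unchanged but keeps exp() from overflowing.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def causal_attention(Q, K, V):
    """Single-head attention in which position i only attends to positions <= i."""
    seq_len, d_k = Q.shape[-2], Q.shape[-1]
    scores = Q @ np.swapaxes(K, -1, -2) / np.sqrt(d_k)   # (..., seq, seq)
    # True above the diagonal marks future positions that must be hidden.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)            # exp(-inf) == 0 after softmax
    weights = stable_softmax(scores, axis=-1)
    return weights @ V, weights
```

Because the diagonal is never masked, every row has at least one finite score, so the softmax never divides by zero. A quick correctness check is that the output at position 0 equals `V` at position 0, since the first token can only attend to itself.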

Difficulty Distribution

Difficulty	Problems	Count
Easy	(none)	0
Medium	#2, #3, #8, #11, #13, #15, #16, #19, #20, #21, #24, #25, #26, #27, #30, #32, #33, #34, #35, #36, #37, #39, #40, #43, #44	25
Hard	#1, #4, #5, #6, #7, #9, #10, #12, #14, #17, #18, #22, #23, #28, #29, #31, #38, #41, #42, #45	20

:::danger Research Engineer Problems Are Hard

Notice: there are zero Easy problems. Research Engineer interviews are the hardest in AI/ML. If you are not comfortable with Medium-difficulty problems, build a stronger foundation with the Core 50 and Medium Tier first.
:::

Progress Tracker

#	Problem	Status
1	Multi-Head Self-Attention	[ ]
2	BPE Tokenizer	[ ]
3	Beam Search	[ ]
4	GAN Training Loop	[ ]
5	SimCLR Contrastive Learning	[ ]
6	REINFORCE Policy Gradient	[ ]
7	Flash Attention (Simplified)	[ ]
8	LoRA Implementation	[ ]
9	Rotary Position Embeddings	[ ]
10	VAE with Reparameterization	[ ]
11	Grouped-Query Attention	[ ]
12	DDPM Noise Schedule	[ ]
13	Softmax CE Gradient	[ ]
14	SVD Low-Rank Proof	[ ]
15	Adam Optimizer Derivation	[ ]
16	Reparameterization Trick	[ ]
17	Attention Gradient	[ ]
18	ELBO Derivation	[ ]
19	KL Non-Negativity Proof	[ ]
20	Mixture of Gaussians Entropy	[ ]
21	Bias-Variance Decomposition	[ ]
22	Dropout as Bayesian Inference	[ ]
23	SGD Convergence Proof	[ ]
24	Transformer Complexity Analysis	[ ]
25	Efficient Top-K Selection	[ ]
26	Bloom Filter	[ ]
27	A* Search	[ ]
28	Sparse Matrix Multiplication	[ ]
29	Hungarian Algorithm	[ ]
30	Dijkstra with Decrease-Key	[ ]
31	KD-Tree	[ ]
32	Online Perceptron	[ ]
33	Parallel Prefix Sum	[ ]
34	Consistent Hashing	[ ]
35	Critique "Attention Is All You Need"	[ ]
36	BERT vs GPT Comparison	[ ]
37	Scaling Laws	[ ]
38	RLHF Pipeline Analysis	[ ]
39	Chain-of-Thought Analysis	[ ]
40	AI Safety Open Problems	[ ]
41	Long-Context Transformers	[ ]
42	LLM Understanding Evaluation	[ ]
43	Calibration Experiment Design	[ ]
44	Research Vision Question	[ ]
45	Paper Critique & Extension	[ ]

Next Steps

After completing the Research Engineer problem list:

Hard Tier for more challenging algorithmic problems
Google-Style Problems since Google DeepMind is a top research destination
Section 9: Paper Discussion for deeper paper analysis practice
Section 15: Role-Specific Prep for the full Research Engineer preparation path
