LLM Interviews - The Complete 2026 Preparation Roadmap

Reading time: ~30 min | Interview relevance: Critical | Roles: MLE, AI Eng, LLM Eng, Research Eng, Applied Scientist

The Real Interview Moment

You are sitting in the first round of a Series B AI startup interview. The hiring manager leans forward and says: "We get 500 applications a week from people who say they have LLM experience. Most of them have called an API and written some prompts. Walk me through how you would build our fine-tuned model from scratch - from data collection to production deployment with guardrails."

This is the reality of LLM interviews in 2026. Every software engineer's resume now lists "LLM experience." The bar has shifted dramatically. Interviewers no longer ask "What is a Transformer?" - they ask "Why does LLaMA 3 use GQA instead of MHA, and what are the memory savings at 128K context length?" They do not want API callers. They want engineers who understand the full stack from pretraining data curation to inference optimization.

This section is your complete preparation guide. It covers 11 interconnected topics, each with the depth expected at top AI labs and LLM-focused startups.

What You Will Master

Map the complete LLM interview landscape across 11 core topics
Assess your current level and identify the highest-ROI study areas
Choose the right study path for your target role and timeline
Understand what separates a "strong hire" from "everyone else" in 2026
Track your preparation progress with spaced repetition checkpoints

Self-Assessment: Where Are You Now?

Rate yourself honestly from 1 to 5 on each topic (maximum total: 55). This is your starting point - you will reassess after studying each chapter.

#	Topic	Your Score
1	Transformer Internals for LLMs	___
2	LLM Pretraining	___
3	Fine-Tuning (LoRA, QLoRA, Adapters)	___
4	RLHF and Alignment	___
5	RAG Systems	___
6	Prompt Engineering	___
7	LLM Evaluation	___
8	Inference Optimization	___
9	Agent Architectures	___
10	Safety and Guardrails	___
11	LLM Interview Questions (Capstone)	___

Scoring guide:

40+ total: You are well-prepared. Focus on weak spots and practice under time pressure.
25-39: Solid foundation. Work through the chapters in dependency order.
Under 25: Start from Chapter 1 and work sequentially. Allow 4-6 weeks.

Why LLM Interviews Are Different in 2026

The "Everyone Claims LLM Experience" Problem

In 2023, listing "LLM experience" on your resume was a differentiator. By 2026, it is table stakes - and the signal-to-noise ratio has collapsed. Here is what changed:

Figure: LLM Experience Evolution

Instant Rejection

Saying "I built an LLM application" when you mean "I called the OpenAI API with a system prompt" will end your interview. Interviewers in 2026 probe immediately: "What model did you use? Why? What was your evaluation framework? How did you handle hallucinations?"

What Top Companies Actually Test

The interview landscape has stratified into distinct tiers:

Tier	Companies	What They Test	Depth Expected
Tier 1 - Frontier Labs	Anthropic, OpenAI, Google DeepMind, Meta FAIR	Pretraining, architecture research, alignment theory	Can derive from first principles, propose novel approaches
Tier 2 - LLM Infrastructure	Databricks, Anyscale, Modal, Together AI, Fireworks	Training infrastructure, inference optimization, serving	Can build training pipelines, optimize serving stacks
Tier 3 - AI-Native Products	Cursor, Replit, Notion AI, Harvey, Glean	RAG, agents, evaluation, fine-tuning for domain	Can build end-to-end LLM features, measure quality
Tier 4 - Enterprise AI	Big tech AI teams, consulting, finance	RAG, prompt engineering, safety, cost optimization	Can deploy reliably at scale with guardrails

Company Variation

Anthropic and OpenAI will ask you to derive attention complexity from scratch. A Series A startup building an AI code editor will ask you to design a RAG pipeline that handles 50K-file codebases. Both are "LLM interviews" but they test completely different depths.

The Five Competencies Interviewers Probe

Every LLM interview question maps to one or more of these competencies:

Figure: LLM Engineering Competencies

Part 1 - The 11-Topic Roadmap

Topic Dependency Diagram

The topics in this section are not independent. Study them in dependency order to build understanding layer by layer:

Figure: Topic Dependency Map

Legend: Red = foundational (start here) | Yellow = core training topics | Blue = application layer | Green = advanced integration

Topic-by-Topic Summary

Chapter 1: Transformer Internals for LLMs

Why it matters: Every other topic builds on this. You cannot discuss pretraining, fine-tuning, or inference optimization without understanding the architecture.

Key concepts: Decoder-only architecture, causal masking, RoPE positional encoding, Grouped Query Attention (GQA), SwiGLU FFN, RMSNorm, KV cache mechanics, parameter counting, FLOP estimation.

Interview frequency: Asked in 95% of LLM interviews. Frontier labs expect derivation-level depth.
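The GQA question from the opening scenario comes down to arithmetic you should be able to do on a whiteboard. The sketch below estimates KV cache size for one sequence at 128K tokens. The shape used (32 layers, 32 query heads, 8 KV heads, head dimension 128, 16-bit cache) is an assumption modeled on an 8B-class LLaMA-style model, and `kv_cache_bytes` is a hypothetical helper name, not a library function.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for every layer and position."""
    # Factor of 2: one cached tensor for keys and one for values.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


GIB = 1024 ** 3
SEQ_LEN = 128 * 1024  # 128K tokens

# Assumed shape: MHA caches one K/V head per query head, GQA shares 8 KV heads.
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=SEQ_LEN)
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=SEQ_LEN)

print(f"MHA: {mha / GIB:.0f} GiB, GQA: {gqa / GIB:.0f} GiB, ratio: {mha / gqa:.0f}x")
# prints "MHA: 64 GiB, GQA: 16 GiB, ratio: 4x"
```

Under these assumptions the saving equals the ratio of query heads to KV heads, which is the point interviewers usually want you to state explicitly.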

Chapter 2: LLM Pretraining

Why it matters: Understanding pretraining separates engineers who can build foundation models from those who only consume them.

Key concepts: Data collection and filtering pipelines, tokenization (BPE and tokenizer libraries such as SentencePiece and tiktoken), training objectives (causal LM, prefix LM, fill-in-the-middle), scaling laws (Chinchilla, inference-aware), compute budgets, 3D parallelism, checkpointing, fault tolerance.

Interview frequency: Asked at Tier 1 and Tier 2 companies. Tier 3 asks lighter versions focused on data quality.
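Compute-budget questions usually start from two rules of thumb: training costs roughly 6 FLOPs per parameter per token, and the Chinchilla result suggests on the order of 20 training tokens per parameter for compute-optimal training. The sketch below turns those into a back-of-envelope estimate. The accelerator peak throughput, utilization (MFU), and cluster size are illustrative assumptions, and all function names are hypothetical.

```python
def chinchilla_tokens(n_params, tokens_per_param=20):
    """Rule-of-thumb compute-optimal token count (about 20 tokens per parameter)."""
    return tokens_per_param * n_params


def training_flops(n_params, n_tokens):
    """Approximate training compute: about 6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens


def training_days(flops, peak_flops_per_gpu, mfu, n_gpus):
    """Wall-clock days given peak throughput, utilization, and GPU count."""
    seconds = flops / (peak_flops_per_gpu * mfu * n_gpus)
    return seconds / 86_400


n_params = 7e9
n_tokens = chinchilla_tokens(n_params)        # 1.4e11 tokens
flops = training_flops(n_params, n_tokens)    # about 5.88e21 FLOPs

# Assumed hardware: 256 accelerators at 1e15 peak FLOP/s each, 40% utilization.
days = training_days(flops, peak_flops_per_gpu=1e15, mfu=0.4, n_gpus=256)
print(f"{n_tokens:.2e} tokens, {flops:.2e} FLOPs, {days:.2f} days")
```

Modern open models are often trained well past the 20-tokens-per-parameter point to reduce inference cost, so treat the token count as a baseline for discussion rather than a target.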

Chapter 3: Fine-Tuning LLMs

Why it matters: The most practical interview topic. Every company that uses LLMs has fine-tuning decisions to make.

Key concepts: Full fine-tuning, LoRA/QLoRA math, prefix tuning, adapter layers, instruction tuning, data formatting, quality vs quantity tradeoffs, when to fine-tune vs prompt engineer vs RAG, catastrophic forgetting, cost analysis.

Interview frequency: Asked in 85% of LLM interviews. Expect to compare approaches with concrete cost numbers.

Chapter 4: RLHF and Alignment

Why it matters: This is what makes raw language models into useful assistants. Alignment is the hottest research area in AI.

Key concepts: Reward model training, PPO for LLMs, DPO and its variants, Constitutional AI, RLAIF, preference data collection, reward hacking, alignment tax.

Interview frequency: Critical at frontier labs. Tier 3-4 companies ask conceptual questions.
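DPO is a common derivation request, and it helps to have the loss in executable form. The sketch below computes the DPO loss for a single preference pair from summed sequence log-probabilities under the policy and the frozen reference model. The function name, argument names, and the default `beta` are assumptions for illustration; a real implementation would operate on batched tensors.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, so loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy raises the chosen response and lowers the rejected one: loss drops.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline, improved)
```

The loss depends only on how much the policy has shifted relative to the reference, which is why no separate reward model is needed.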

Chapter 5: RAG Systems

Why it matters: RAG is the most deployed LLM pattern in production. If you are interviewing at any company building LLM products, expect RAG questions.

Key concepts: Chunking strategies, embedding models, vector databases, hybrid search, reranking, query transformation, multi-hop RAG, evaluation (faithfulness, relevance, recall).

Interview frequency: Asked in 90% of applied AI interviews. System design rounds often center on RAG.
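When a system design round asks how to combine keyword and vector retrieval, reciprocal rank fusion (RRF) is one simple answer worth knowing. The sketch below fuses two ranked lists using only ranks, not raw scores. The document IDs and both input rankings are made-up examples, and `k=60` is the commonly used default constant rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs; each list contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword retriever and a dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that both retrievers rank highly rise to the top, and the fused list is what you would typically pass to a reranker.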

Chapter 6: Prompt Engineering

Why it matters: The gap between amateur and expert prompting is enormous. Companies need engineers who can systematically optimize prompts.

Key concepts: Chain-of-thought, few-shot design, system prompt architecture, structured outputs, prompt injection defense, A/B testing prompts, prompt versioning.

Interview frequency: Asked everywhere, but depth varies. Frontier labs test understanding of why techniques work.

Chapter 7: LLM Evaluation

Why it matters: "How do you know it works?" is the question that separates production engineers from demo builders.

Key concepts: Perplexity and its limitations, benchmark suites (MMLU, HumanEval, MT-Bench), human evaluation design, LLM-as-judge, contamination detection, domain-specific eval, A/B testing in production.

Interview frequency: Increasingly common. Every serious company asks about evaluation strategy.
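Perplexity is the easiest of these metrics to define precisely: it is the exponential of the average negative log-likelihood per token. A minimal sketch, assuming you already have per-token natural-log probabilities from a model (the `perplexity` helper and the sample values are illustrative):

```python
import math


def perplexity(token_logprobs):
    """exp(mean negative log-likelihood) over a sequence of token log-probs."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every token has perplexity 4:
# it is as uncertain as a uniform choice among 4 options.
uniform_ppl = perplexity([math.log(0.25)] * 10)
print(uniform_ppl)
```

The limitation to mention in an interview is that perplexity measures how well the model predicts text, not whether its answers are correct or useful, and values are not comparable across different tokenizers.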

Chapter 8: Inference Optimization

Why it matters: Serving LLMs at scale is expensive. Companies need engineers who can reduce latency and cost by 10x.

Key concepts: KV cache optimization, continuous batching, speculative decoding, quantization methods and formats (GPTQ, AWQ, GGUF), PagedAttention/vLLM, tensor parallelism for serving, prefill vs decode optimization.

Interview frequency: Critical at Tier 2 infrastructure companies. Asked at all tiers for senior roles.
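A useful back-of-envelope model for the decode phase: at batch size 1, every generated token has to read all the weights from memory, so memory bandwidth divided by weight size gives a rough upper bound on tokens per second. The sketch below shows why quantization helps under that model. The 8B parameter count and the 2 TB/s bandwidth figure are assumptions for illustration, and the estimate ignores KV cache reads, activation traffic, and kernel overhead.

```python
def weight_bytes(n_params, bits_per_param):
    """Memory footprint of the weights at a given precision."""
    return n_params * bits_per_param / 8


def decode_tokens_per_sec_upper_bound(n_params, bits_per_param, mem_bandwidth):
    """Bandwidth-bound ceiling for single-sequence decoding (weights only)."""
    return mem_bandwidth / weight_bytes(n_params, bits_per_param)


N_PARAMS = 8e9          # assumed 8B-parameter model
BANDWIDTH = 2e12        # assumed 2 TB/s of accelerator memory bandwidth

fp16_bound = decode_tokens_per_sec_upper_bound(N_PARAMS, 16, BANDWIDTH)
int4_bound = decode_tokens_per_sec_upper_bound(N_PARAMS, 4, BANDWIDTH)
print(f"16-bit: {fp16_bound:.0f} tok/s, 4-bit: {int4_bound:.0f} tok/s")
# prints "16-bit: 125 tok/s, 4-bit: 500 tok/s"
```

Batching changes the picture because the same weight read is shared across many sequences, which is the motivation for continuous batching.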

Chapter 9: Agent Architectures

Why it matters: Agents are the frontier of LLM applications. Companies are racing to build reliable autonomous systems.

Key concepts: ReAct pattern, tool use, planning and decomposition, memory systems, multi-agent coordination, error recovery, evaluation of agent systems.

Interview frequency: Growing rapidly. Most common at AI-native product companies.
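The ReAct pattern is easier to discuss once you have seen the control loop. The sketch below is a minimal version with a step limit and basic error recovery. Everything model-related is a stand-in: `scripted_model` replaces a real LLM call, the step dictionary format is a hypothetical convention, and the single `add` tool exists only for the demo.

```python
def run_agent(question, model, tools, max_steps=5):
    """Alternate model decisions and tool calls until a final answer or step limit."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model(transcript)
        if step["type"] == "final":
            return step["content"], transcript
        tool = tools.get(step["tool"])
        if tool is None:
            observation = f"error: unknown tool {step['tool']!r}"
        else:
            try:
                observation = tool(step["input"])
            except Exception as exc:  # feed failures back instead of crashing
                observation = f"error: {exc}"
        transcript.append(f"Action: {step['tool']}[{step['input']}]")
        transcript.append(f"Observation: {observation}")
    return None, transcript  # step budget exhausted without an answer


def scripted_model(transcript):
    """Stand-in for an LLM: call the tool once, then answer from the observation."""
    if not any(line.startswith("Observation:") for line in transcript):
        return {"type": "action", "tool": "add", "input": "2,3"}
    observation = transcript[-1].split(": ", 1)[1]
    return {"type": "final", "content": f"The sum is {observation}"}


tools = {"add": lambda text: str(sum(int(x) for x in text.split(",")))}
answer, transcript = run_agent("What is 2 + 3?", scripted_model, tools)
print(answer)  # The sum is 5
```

The step limit and the error-as-observation path are the parts interviewers tend to probe, since they determine how the agent behaves when a tool fails or the model loops.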

Chapter 10: Safety and Guardrails

Why it matters: No production LLM ships without safety. Regulatory pressure is increasing globally.

Key concepts: Prompt injection and jailbreaks, output filtering, content classification, constitutional approaches, red teaming, safety benchmarks, regulatory compliance (EU AI Act).

Interview frequency: Asked at every company deploying LLMs to users. Frontier labs go deep on alignment theory.

Chapter 11: LLM Interview Questions (Capstone)

Why it matters: Integrative questions that span multiple topics, simulating real interview pressure.

Key concepts: Cross-topic system design, rapid-fire concept questions, debugging scenarios, paper discussion, whiteboard architecture.

Interview frequency: This IS the interview.

Part 2 - Study Paths by Role and Timeline

Path Selection Guide

Figure: Study Path Selection

Detailed Study Paths

The Deep Path (Research Engineer / Scientist) - 6 weeks

Target companies: Anthropic, OpenAI, Google DeepMind, Meta FAIR

Week	Topics	Focus
1	Ch 1: Transformer Internals	Derive attention, implement from scratch, parameter counting
2	Ch 2: Pretraining	Scaling laws derivation, data pipeline design, training infrastructure
3	Ch 4: RLHF and Alignment	Reward modeling math, DPO derivation, alignment research landscape
4	Ch 3: Fine-Tuning + Ch 7: Evaluation	LoRA theory, benchmark design, contamination
5	Ch 10: Safety + Ch 8: Inference	Safety-alignment connection, efficient inference theory
6	Ch 11: Capstone + Mock Interviews	Timed practice, paper discussions

Interviewer's Perspective

At frontier labs, we expect candidates to go beyond reciting facts. We want to hear you reason about tradeoffs, propose experiments, and identify limitations in existing approaches. Practice explaining your reasoning out loud.

The Full Stack Path (MLE / LLM Engineer) - 5 weeks

Target companies: Databricks, Scale AI, Cohere, AI startups with training pipelines

Week	Topics	Focus
1	Ch 1: Transformer Internals	Architecture comparison, KV cache math, memory estimation
2	Ch 2: Pretraining + Ch 3: Fine-Tuning	End-to-end training, LoRA implementation, data pipelines
3	Ch 4: RLHF + Ch 8: Inference	Post-training pipeline, serving optimization
4	Ch 5: RAG + Ch 7: Evaluation	Production RAG, evaluation frameworks
5	Ch 11: Capstone + Mock Interviews	System design, cross-topic questions

The Applied Path (AI / Applied Engineer) - 4 weeks

Target companies: Cursor, Notion AI, Harvey, Glean, enterprise AI teams

Week	Topics	Focus
1	Ch 1: Transformer Internals (lighter) + Ch 3: Fine-Tuning	Practical architecture knowledge, when/how to fine-tune
2	Ch 5: RAG + Ch 6: Prompt Engineering	Production RAG design, systematic prompting
3	Ch 9: Agents + Ch 7: Evaluation	Agent architectures, measuring quality
4	Ch 10: Safety + Ch 11: Capstone	Guardrails, end-to-end system design

The Infra Path (ML Platform / Infra Engineer) - 4 weeks

Target companies: Together AI, Fireworks, Modal, Anyscale, cloud AI teams

Week	Topics	Focus
1	Ch 1: Transformer Internals + Ch 8: Inference	Memory math, KV cache, quantization, serving frameworks
2	Ch 2: Pretraining	3D parallelism, FSDP, checkpointing, fault tolerance
3	Ch 5: RAG + Ch 9: Agents	Infrastructure for retrieval and agent systems
4	Ch 7: Evaluation + Ch 11: Capstone	Eval infrastructure, system design

Part 3 - How to Use Each Chapter

Chapter Structure

Every chapter in this section follows a consistent structure designed for interview preparation:

Section	Purpose	How to Use
The Real Interview Moment	Sets the stakes with a realistic scenario	Read once to understand what you are preparing for
What You Will Master	Learning objectives checklist	Use as a progress tracker
Self-Assessment	Honest skill evaluation	Take before and after studying
Core Content (Parts 1-3+)	Deep technical material with diagrams	Study actively - draw diagrams, derive equations
Practice Problems	Graduated difficulty with hints	Attempt before looking at hints; time yourself
Interview Cheat Sheet	Quick-reference table	Review before interviews
Spaced Repetition Checkpoints	Retention schedule	Follow the Day 0/3/7/14/21 schedule strictly

Study Techniques That Work

60-Second Answer

For every concept, practice giving a 60-second explanation. Time yourself. Interviewers judge clarity and conciseness as much as correctness. If you cannot explain KV cache in 60 seconds, you will ramble for 5 minutes and lose the interviewer.

Active recall beats passive reading. After reading a section:

Close the page
Write down everything you remember on a blank sheet
Reopen and check what you missed
Focus your review on the gaps

Teach it to someone. Explain each concept to a friend, a rubber duck, or a voice recorder. If you stumble, you do not know it well enough.

Solve problems under time pressure. Real interviews give you 5-10 minutes per question. Practice with a timer.

Part 4 - The 2026 Interview Landscape

What Changed from 2024

Aspect	2024	2026
Baseline expectation	"Have you used an LLM?"	"Have you trained or fine-tuned a model?"
Architecture depth	"Explain attention"	"Compare GQA vs MQA memory savings at 128K context"
RAG questions	"What is RAG?"	"Design a RAG system with hybrid search, reranking, and evaluation"
Evaluation	Rarely asked	Standard question: "How would you evaluate this?"
Agents	Cutting-edge topic	Expected knowledge for senior roles
Safety	Nice to know	Required - regulatory pressure (EU AI Act)
Cost awareness	Optional	Required - "What does this cost to train/serve?"
Open-source knowledge	Bonus	Expected - LLaMA, Mistral, Qwen ecosystem

Common Interview Formats for LLM Roles

Figure: Interview Formats

Common Trap

Many candidates over-prepare for coding and under-prepare for system design. LLM system design rounds are where most candidates fail because they cannot reason about tradeoffs between RAG, fine-tuning, prompt engineering, and agents for a given use case.

The Questions That Separate Candidates

These cross-cutting questions appear in almost every LLM interview. If you can answer all of them confidently, you are well-prepared:

"Walk me through the full LLM stack from pretraining to production." Tests breadth. Can you connect all 11 topics?
"When would you fine-tune vs use RAG vs prompt engineer?" Tests judgment. The answer is always "it depends" - but you need to say on WHAT.
"How would you evaluate whether your LLM feature is working?" Tests evaluation maturity. Most candidates have no answer beyond "vibes."
"What are the failure modes of this system?" Tests safety and reliability thinking. Can you enumerate what goes wrong?
"What would this cost to train/serve at our scale?" Tests cost awareness. Interviewers want back-of-envelope numbers, not "it depends."

Part 5 - Building Your LLM Portfolio

What Makes a Strong LLM Portfolio in 2026

Calling APIs is not a portfolio. Here is what actually impresses:

Project Type	Impact Level	Example
Fine-tuned a model on custom data	High	Fine-tuned LLaMA 3 8B on legal documents, measured 23% improvement on domain QA
Built a production RAG system	High	RAG pipeline with hybrid search, reranking, and automated eval suite
Reproduced a paper	Very High	Implemented DPO from scratch, reproduced key results on TL;DR summarization
Built evaluation infrastructure	High	Automated eval framework comparing 5 models across 3 domain-specific benchmarks
Open-source contribution	Very High	Contributed to vLLM, LangChain, or similar projects
Called an API with a prompt	None	This is not a portfolio project

Interviewer's Perspective

When I review LLM portfolios, I look for three things: (1) Did they measure something? (2) Did they make a tradeoff decision and explain why? (3) Did they encounter a real problem and solve it? A fine-tuning project that reports "the model got better" is worthless. One that reports "LoRA rank 16 with $\alpha = 32$ on attention layers gave 12% improvement on our held-out set, while rank 64 caused overfitting after 2 epochs" - that is a hire signal.

Practice Problems

Problem 1: Study Plan Design

You have 3 weeks before an interview at an AI-native startup building a coding assistant (similar to Cursor). They told you the interview includes: LLM system design, coding (Python), and a technical deep-dive. Design your study plan.

Hint 1 - Direction

Think about what a coding assistant company cares about most. Which of the 11 topics are most relevant? Which can you skip or cover lightly?

Hint 2 - Insight

A coding assistant company cares deeply about: RAG (searching codebases), inference speed (real-time suggestions), evaluation (code correctness), and prompt engineering (structured outputs). They care less about pretraining from scratch or RLHF theory.

Hint 3 - Full Solution + Rubric

Optimal 3-week plan:

Week 1: Transformer Internals (2 days, focus on KV cache and inference) + Fine-Tuning (2 days, focus on LoRA and when to fine-tune) + RAG Systems (1 day, start the chapter)

Week 2: RAG Systems (3 days, deep focus - this is their core product) + Prompt Engineering (1 day, structured outputs and code prompting) + Inference Optimization (1 day, speculative decoding and batching)

Week 3: Agent Architectures (1 day - coding assistants are agents) + Evaluation (1 day - code eval is specific) + Capstone Questions (2 days) + Mock Interviews (1 day)

Scoring Rubric:

Criterion	Strong Hire	Lean Hire	No Hire
Prioritized RAG and inference	Correctly identified as top priorities	Mentioned but did not prioritize	Focused on pretraining or RLHF
Included evaluation	Specific to code quality metrics	Generic "test it"	Not mentioned
Realistic time allocation	Matches 3-week constraint	Slightly overloaded	Tried to cover everything equally
Included practice/mocks	Dedicated time for timed practice	Mentioned briefly	All reading, no practice

Problem 2: Role Classification

For each scenario, identify the most likely interview focus areas (top 3 chapters):

(a) Anthropic - Research Engineer
(b) Databricks - ML Engineer on Model Serving
(c) Harvey (legal AI) - Applied AI Engineer
(d) A bank - Senior ML Engineer for internal tools

Hint 1 - Direction

Think about what each company builds and what problems they solve at their core. Map those problems to our 11 chapters.

Hint 2 - Insight

Anthropic builds frontier models and studies alignment. Databricks serves models at scale. Harvey applies LLMs to legal workflows. A bank needs reliable, safe, cost-effective internal tools.

Hint 3 - Full Solution + Rubric

(a) Anthropic - Research Engineer:

Ch 1: Transformer Internals (derivation-level)
Ch 4: RLHF and Alignment (core mission)
Ch 2: Pretraining (scaling laws, training dynamics)

(b) Databricks - ML Engineer on Model Serving:

Ch 8: Inference Optimization (core job)
Ch 1: Transformer Internals (memory math)
Ch 2: Pretraining (training infrastructure, parallelism)

(c) Harvey - Applied AI Engineer:

Ch 5: RAG Systems (legal document retrieval)
Ch 3: Fine-Tuning (domain adaptation)
Ch 7: Evaluation (legal accuracy measurement)

(d) Bank - Senior ML Engineer:

Ch 5: RAG Systems (internal document search)
Ch 10: Safety and Guardrails (regulatory compliance)
Ch 6: Prompt Engineering (reliable outputs)

Scoring Rubric:

Criterion	Strong Hire	Lean Hire	No Hire
Matched company to correct chapters	4/4 correct or close	2-3/4 correct	Generic answers for all
Justified choices	Explained reasoning tied to company mission	Gave answers without reasoning	Could not connect topics to roles
Recognized company-specific needs	Mentioned specific products/challenges	Generic role mapping	No company awareness

Problem 3: Evaluate a Candidate

You are the interviewer. A candidate for an LLM Engineer role gives this answer to "Explain how LoRA works":

"LoRA is a technique where you freeze the base model and add small trainable matrices. It reduces the number of parameters you need to train. You can use it with QLoRA which also quantizes the model. It is more efficient than full fine-tuning."

Rate this answer. What is missing? What would make it a Strong Hire answer?

Hint 1 - Direction

The answer is factually correct but shallow. What specific technical details would an interviewer expect?

Hint 2 - Insight

A strong answer would include: the low-rank decomposition math, rank and alpha parameters, which modules to target, memory savings calculation, and when NOT to use LoRA.

Hint 3 - Full Solution + Rubric

Assessment: Lean No-Hire. The answer is correct but could come from reading a blog post summary. It demonstrates recognition, not understanding.

What is missing:

Math: LoRA decomposes the weight update $\Delta W$ into $BA$, where $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times d}$, with rank $r \ll d$
Parameters: No mention of rank $r$, scaling factor $\alpha$, or the relationship $\frac{\alpha}{r}$
Target modules: Which layers get LoRA adapters (typically Q, K, V projections; sometimes all linear layers)
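Memory math: For a 7B model, full fine-tuning with Adam needs ~56 GB for the fp32 optimizer states alone (two 4-byte values per parameter), before counting weights and gradients; LoRA at rank 16 trains only ~20M params (~80 MB in fp32)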
Tradeoffs: When LoRA is insufficient (significant domain shift), when to increase rank
Merging: LoRA weights can be merged back into base model at inference time with zero overhead

Strong Hire answer would cover all 6 points in about 2 minutes, with specific numbers.
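The parameter and memory figures above are easy to reproduce, and doing so in the interview is a strong signal. The sketch below counts LoRA parameters for an assumed 7B-class shape (32 layers, hidden size 4096) with rank-16 adapters on four square attention projections per layer. The choice of four target projections and the helper name `lora_params` are assumptions; targeting fewer or more modules shifts the total accordingly.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters for one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


N_LAYERS, HIDDEN, RANK = 32, 4096, 16
TARGETS_PER_LAYER = 4  # assumed: Q, K, V, and output projections, all HIDDEN x HIDDEN

per_adapter = lora_params(HIDDEN, HIDDEN, RANK)
total = per_adapter * TARGETS_PER_LAYER * N_LAYERS
adapter_mb = total * 4 / 1e6                 # fp32 adapter weights

# Adam keeps two fp32 values per trainable parameter.
full_ft_adam_gb = 7e9 * 2 * 4 / 1e9
lora_adam_mb = total * 2 * 4 / 1e6

print(f"LoRA params: {total / 1e6:.1f}M ({adapter_mb:.0f} MB in fp32)")
print(f"Adam states: full fine-tuning {full_ft_adam_gb:.0f} GB vs LoRA {lora_adam_mb:.0f} MB")
```

With these assumptions the count comes out to about 16.8M trainable parameters, the same order of magnitude as the ~20M quoted above, against 7B for full fine-tuning.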

Criterion	Strong Hire	Lean Hire	No Hire
Includes math	Writes decomposition, explains rank	Mentions "low-rank" without math	No math at all
Concrete numbers	Memory savings, parameter counts	Vague "more efficient"	No numbers
Tradeoffs	When to use, when not to	Only benefits	"Always use LoRA"
Implementation details	Target modules, alpha/rank tuning	Generic description	Sounds like API docs

Interview Cheat Sheet

Topic	Core Question	60-Second Answer Must Include
Transformer Internals	"How does attention work in modern LLMs?"	Scaled dot-product, causal masking, GQA, RoPE, KV cache
Pretraining	"How are LLMs trained?"	Data pipeline, causal LM objective, scaling laws, 3D parallelism
Fine-Tuning	"When and how do you fine-tune?"	LoRA math, rank/alpha, full vs parameter-efficient, cost comparison
RLHF	"How do you align an LLM?"	SFT, then either a reward model plus PPO or DPO directly on preference pairs; preference data, reward hacking risks
RAG	"How do you add knowledge to an LLM?"	Chunk, embed, retrieve, rerank, generate, evaluate faithfulness
Prompt Engineering	"How do you optimize prompts?"	CoT, few-shot, structured output, systematic testing, version control
Evaluation	"How do you know your LLM works?"	Task-specific metrics, human eval, LLM-as-judge, benchmark contamination
Inference	"How do you serve LLMs efficiently?"	KV cache, continuous batching, quantization, speculative decoding
Agents	"How do you build LLM agents?"	ReAct loop, tool use, planning, memory, error recovery, evaluation
Safety	"How do you make LLMs safe?"	Input/output filtering, prompt injection defense, red teaming, monitoring

Spaced Repetition Checkpoints

Use this schedule to retain what you learn. Each checkpoint should take 15-20 minutes.
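If you want the checkpoints on your calendar, the Day 0/3/7/14/21 offsets translate directly into dates. A small sketch, where the start date is an arbitrary example and `checkpoint_dates` is a hypothetical helper:

```python
from datetime import date, timedelta

CHECKPOINT_OFFSETS = (0, 3, 7, 14, 21)  # days after first reading


def checkpoint_dates(start):
    """Return the calendar date of each review checkpoint."""
    return [start + timedelta(days=offset) for offset in CHECKPOINT_OFFSETS]


schedule = checkpoint_dates(date(2026, 1, 5))  # example start date
for offset, day in zip(CHECKPOINT_OFFSETS, schedule):
    print(f"Day {offset}: {day.isoformat()}")
```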

Day 0 (After reading this overview)

Draw the topic dependency diagram from memory
Write down the 5 competencies interviewers probe
Identify your study path and target timeline
Complete the self-assessment table honestly

Day 3

Without looking, list all 11 topics in order
For each topic, write one sentence about what it covers
Recite the 5 cross-cutting questions that separate candidates
Review your study plan - are you on track?

Day 7

Explain to someone (or a recorder) why LLM interviews are different in 2026
For your target company tier, list the top 5 topics to prioritize
Quiz yourself: for each of the 10 cheat sheet topics, give a 60-second answer
Adjust your study plan based on which topics felt weakest

Day 14

Redo the self-assessment. Compare scores to Day 0
Do a mock interview: have someone ask you 5 random cheat sheet questions
Time yourself: can you explain each topic in under 60 seconds?
Identify your top 3 weak areas and schedule extra review

Day 21

Final self-assessment. All scores should be 4+
Full mock interview simulation (30 min, mixed topics)
Review the practice problems - can you solve them without hints?
Prepare your "LLM story" - the 2-minute narrative of your LLM experience

What Comes Next

Start with Chapter 1: Transformer Internals for LLMs. This is the foundation everything else builds on. Even if you have studied Transformers before, the LLM-specific details (GQA, RoPE, SwiGLU, KV cache math) are what interviewers test in 2026.

If you scored 4+ on Transformer Internals in your self-assessment, you can move quickly through Chapter 1 and spend more time on your weaker areas. But do not skip it - the practice problems will reveal gaps you did not know you had.
