Module 3: Compilers and Runtimes for ML

torch.compile(model) can make your model 2-3x faster with one line of code. But when it does not work - when it falls back to eager mode, when it produces incorrect results, when the compilation takes longer than your training run - you have no idea why. That is the cost of treating the compiler as a black box.

This module opens the box. Not to make you write compilers, but to give you enough understanding to use torch.compile correctly, debug compilation failures, understand what XLA and TensorRT are actually doing, and make informed decisions about the compilation-performance tradeoff.
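
The one-line usage, plus the two knobs that matter most when things misbehave, looks roughly like this (a sketch; `mode` and `fullgraph` are real `torch.compile` parameters, and the model here is a stand-in):

```python
import torch

# A stand-in model; any torch.nn.Module works the same way.
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())

# fullgraph=True turns silent fallbacks to eager mode into hard errors,
# which is usually what you want while debugging a slow compiled model.
# mode selects the compile-time vs runtime-performance tradeoff.
compiled = torch.compile(model, mode="reduce-overhead", fullgraph=True)

# Compilation is lazy: it happens on the first call with real inputs,
# not here. Running with the environment variable TORCH_LOGS="graph_breaks"
# shows where Dynamo had to break the graph and fall back.
```

Lessons 5 and 8 cover what actually happens on that first call.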

The Compilation Stack

Lessons in This Module

  1. How Compilers Work: Parsing, IR, optimization passes, code generation
  2. JIT Compilation in Python: CPython bytecode, numba, how JIT differs from AOT
  3. MLIR - Multi-Level IR: Dialect system, lowering passes, why MLIR exists
  4. XLA and JAX Compilation: XLA IR, fusion in XLA, how JAX uses XLA
  5. torch.compile Internals: Dynamo, Inductor, compilation modes, guard failures
  6. TensorRT and Inference Optimization: Graph optimization, layer fusion, precision calibration
  7. ONNX and Cross-Framework Portability: ONNX format, opsets, ONNX Runtime optimization
  8. Ahead-of-Time vs JIT for ML: Tradeoffs, when each approach is appropriate
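
Operator fusion shows up in several of these lessons (XLA, Inductor, TensorRT). A minimal pure-Python sketch of the idea - illustrative code, not real compiler output - is three separate passes over the data versus one:

```python
# Unfused: y = relu(x * 2 + 1) as three ops, each a full pass over the
# data that materializes an intermediate result in memory.
def unfused(xs):
    t1 = [x * 2 for x in xs]           # pass 1: multiply, write intermediate
    t2 = [t + 1 for t in t1]           # pass 2: add, write intermediate
    return [max(t, 0.0) for t in t2]   # pass 3: relu, write output

# Fused: one pass, no intermediates. On a GPU this is the difference
# between three kernel launches with memory round-trips and one kernel
# that keeps each element in registers.
def fused(xs):
    return [max(x * 2 + 1, 0.0) for x in xs]

print(fused([-1.0, 0.5]))  # [0.0, 2.0]
```

Both functions compute the same result; fusion changes only how many times the data moves through memory, which is exactly why it dominates on memory-bound ML workloads.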

Key Concepts You Will Master

  • Intermediate representations (IR) - how compilers represent computation before generating hardware code
  • Operator fusion - the compiler technique that eliminates intermediate memory reads/writes
  • Graph-mode vs eager-mode - the fundamental difference in how PyTorch and JAX execute operations
  • torch.compile compilation modes - default, reduce-overhead, max-autotune - when to use each
  • TensorRT calibration - using a calibration dataset to choose quantization scales for FP16/INT8 inference
  • Guard failures - why torch.compile recompiles and how to avoid it
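
The last bullet, guard failures, can be previewed with a toy cache. This is a deliberate simplification, not Dynamo's actual implementation: compiled code is specialized on properties of the inputs (here, length stands in for tensor shape), and when a guard check fails, the compiler recompiles:

```python
# Hypothetical sketch of guard-based caching: compiled artifacts are
# keyed by a guard (input length here); a new key means the guard
# failed and a fresh compilation is needed.
compiled_cache = {}
recompiles = 0

def run_compiled(xs):
    global recompiles
    guard_key = len(xs)                # guard: specialize on input "shape"
    if guard_key not in compiled_cache:
        recompiles += 1                # guard failure -> recompile
        compiled_cache[guard_key] = lambda v: [x * 2 for x in v]
    return compiled_cache[guard_key](xs)

run_compiled([1, 2, 3])   # first call: compiles for length 3
run_compiled([4, 5, 6])   # same length: guard passes, cached code reused
run_compiled([1, 2])      # new length: guard fails, recompiles
print(recompiles)  # 2
```

This is why feeding torch.compile inputs with constantly varying shapes can make it recompile repeatedly; Lesson 5 covers the real guard mechanism and how dynamic shapes mitigate it.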

Prerequisites

  • GPU Architecture (helpful, not required)
  • PyTorch proficiency
  • No prior compiler experience required
© 2026 EngineersOfAI. All rights reserved.