Classifier-Free Guidance - Steering Diffusion with Text
Complete derivation of CFG from classifier guidance through the Ho-Salimans implicit classifier insight - the guidance scale trade-off, negative prompting mechanics, dynamic thresholding, CFG++ variants, and production sampling implementations.
DDIM and Accelerated Diffusion Sampling
How DDIM reduces 1000-step DDPM sampling to 10-50 steps via a non-Markovian process, the eta parameter, DDIM inversion for image editing, and DPM-Solver as the current production standard.
DDPMs - The Mathematical Foundation of Diffusion Models
The complete mathematical derivation of Denoising Diffusion Probabilistic Models - forward process, reverse process, ELBO objective, noise schedule comparison, U-Net architecture, and why predicting noise works better than predicting clean images.
Diffusion Models Beyond Images - Audio, Video, 3D, Molecules, Text
How the diffusion framework generalizes across modalities - from waveform audio synthesis to protein structure prediction, video generation, 3D scene creation, time series, and text - with the architectural changes each domain requires.
Evaluating Generative Models - FID, IS, Precision/Recall, Human Evaluation
A complete guide to evaluating generative models - from the mathematics of FID and Inception Score to Precision/Recall manifolds, CLIP-based metrics, DINO similarity, human preference studies, metric gaming, and building production evaluation pipelines.
Fine-Tuning Diffusion Models - DreamBooth, LoRA, Textual Inversion, ControlNet
How to teach Stable Diffusion new concepts from as few as 5 to 20 images - covering Textual Inversion, DreamBooth, LoRA, ControlNet, and IP-Adapter with full training code, hyperparameter guidance, and evaluation strategies.
Generative Models Overview - VAEs, GANs, Flow Models, and Diffusion
A unified view of generative modeling approaches - how VAEs, GANs, normalizing flows, energy-based models, and diffusion models each define a different way to learn a distribution, with trade-offs in quality, diversity, training stability, and likelihood.
Latent Diffusion Models - The Architecture Behind Stable Diffusion
How Rombach et al. moved diffusion from pixel space to a compressed latent space using a KL-regularized VAE trained with perceptual and adversarial losses, plus cross-attention conditioning and the complete Stable Diffusion pipeline - enabling high-resolution generation on consumer GPUs.
Module 15 - Diffusion Models
Master diffusion models from first principles - DDPM, score matching, DDIM acceleration, latent diffusion, classifier-free guidance, fine-tuning, and evaluation across image, audio, and molecular domains.
Score-Based Generative Models - Diffusion Through the Lens of Score Matching
How Song and Ermon's score-matching framework connects to DDPM and extends to stochastic differential equations for continuous-time diffusion - the mathematical theory behind modern diffusion models, from score functions and Langevin dynamics through denoising score matching and the SDE unification.