10 docs tagged with "diffusion-models"

Classifier-Free Guidance - Steering Diffusion with Text

Complete derivation of CFG from classifier guidance through the Ho-Salimans implicit classifier insight - the guidance scale trade-off, negative prompting mechanics, dynamic thresholding, CFG++ variants, and production sampling implementations.

DDIM and Accelerated Diffusion Sampling

How DDIM reduces 1000-step DDPM sampling to 10-50 steps via a non-Markovian process, the eta parameter, DDIM inversion for image editing, and DPM-Solver as a widely used production sampler.

DDPMs - The Mathematical Foundation of Diffusion Models

The complete mathematical derivation of Denoising Diffusion Probabilistic Models - forward process, reverse process, ELBO objective, noise schedule comparison, U-Net architecture, and why predicting noise works better than predicting clean images.

Diffusion Models Beyond Images - Audio, Video, 3D, Molecules, Text

How the diffusion framework generalizes across modalities - from waveform audio synthesis to protein structure prediction, video generation, 3D scene creation, time series, and text - with the architectural changes each domain requires.

Evaluating Generative Models - FID, IS, Precision/Recall, Human Evaluation

A complete guide to evaluating generative models - from the mathematics of FID and Inception Score to Precision/Recall manifolds, CLIP-based metrics, DINO similarity, human preference studies, metric gaming, and building production evaluation pipelines.

Fine-Tuning Diffusion Models - DreamBooth, LoRA, Textual Inversion, ControlNet

How to teach Stable Diffusion new concepts from only 5-20 images - covering Textual Inversion, DreamBooth, LoRA, ControlNet, and IP-Adapter with full training code, hyperparameter guidance, and evaluation strategies.

Generative Models Overview - VAEs, GANs, Flow Models, and Diffusion

A unified view of generative modeling approaches - how VAEs, GANs, normalizing flows, energy-based models, and diffusion models each define a different way to learn a distribution, with trade-offs in quality, diversity, training stability, and likelihood.

Latent Diffusion Models - The Architecture Behind Stable Diffusion

How Rombach et al. moved diffusion from pixel space to a compressed latent space using a KL-regularized VAE trained with perceptual and adversarial losses - plus cross-attention conditioning and the complete Stable Diffusion pipeline that enables high-resolution generation on consumer GPUs.

Module 15 - Diffusion Models

Master diffusion models from first principles - DDPM, score matching, DDIM acceleration, latent diffusion, classifier-free guidance, fine-tuning, and evaluation across image, audio, and molecular domains.

Score-Based Generative Models - Diffusion Through the Lens of Score Matching

How Song and Ermon's score matching framework connects to DDPM, and how stochastic differential equations unify both as continuous-time diffusion - the mathematical theory behind modern diffusion models, from score functions and Langevin dynamics through denoising score matching to the SDE formulation.