Module 5: Fine-Tuning Pipelines
QLoRA makes it possible to fine-tune a 70B model on a single high-memory GPU. Axolotl gives you a production-grade pipeline for doing it reliably, reproducibly, and at scale. This module is about the engineering around fine-tuning - not the math, but the workflow: dataset preparation, configuration, multi-GPU training, DPO, and managing the adapters you produce after training is done.
Most fine-tuning projects fail not because of bad architecture choices but because of bad data. This module gives equal weight to data engineering and training configuration.
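To make the data-engineering emphasis concrete, here is a minimal sketch of wrapping a raw question/answer pair in the ShareGPT conversation format covered in Lesson 3. The field names (`conversations`, `from`, `value`) follow the common ShareGPT convention; verify the exact schema against the tool you use.

```python
def to_sharegpt(question: str, answer: str) -> dict:
    """Wrap one question/answer pair as a ShareGPT-style record.

    ShareGPT stores each example as a list of turns, where each turn
    records who spoke ("human" or "gpt") and what they said.
    """
    return {
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": answer},
        ]
    }

# One record per training example; a dataset is a JSONL file of these.
record = to_sharegpt("What is LoRA?", "A parameter-efficient fine-tuning method.")
```

A real pipeline would add validation (non-empty turns, alternating roles) before writing records out, since silent formatting errors are a common cause of poor fine-tuning results.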
The Fine-Tuning Pipeline
Lessons in This Module
| # | Lesson | Key Concept |
|---|---|---|
| 1 | Fine-Tuning Pipeline Overview | End-to-end workflow, tools, decision points |
| 2 | Axolotl Configuration Deep Dive | YAML config, dataset formats, key hyperparameters |
| 3 | Data Formatting for Fine-Tuning | ShareGPT, Alpaca, raw completion formats |
| 4 | Hyperparameter Selection | Learning rate, batch size, warmup, LoRA rank |
| 5 | Multi-GPU Fine-Tuning | DeepSpeed ZeRO stages, FSDP with LoRA |
| 6 | Continued Pretraining vs Instruction Tuning | Different data, different objectives, different results |
| 7 | DPO and Preference Fine-Tuning | DPO math, preference datasets, vs RLHF |
| 8 | Storing and Versioning Adapters | Hugging Face Hub, adapter naming, model cards |
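As a preview of Lessons 2-4, the sketch below shows the shape of an Axolotl YAML config for a QLoRA run. Key names follow Axolotl's documented config schema, but check them against the version you install; the base model and dataset path are placeholders.

```yaml
# Sketch of an Axolotl QLoRA config - values are illustrative, not tuned.
base_model: NousResearch/Llama-2-7b-hf   # placeholder base model
load_in_4bit: true                        # quantize base weights (QLoRA)
adapter: qlora

# LoRA hyperparameters (Lesson 4)
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true                  # attach adapters to all linear layers

# Dataset (Lesson 3) - path is a placeholder
datasets:
  - path: ./data/train.jsonl
    type: sharegpt

# Training hyperparameters (Lesson 4)
sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
lr_scheduler: cosine
warmup_steps: 10

output_dir: ./outputs
```

The config file is the single source of truth for a run, which is what makes Axolotl training reproducible: commit it alongside your dataset version and you can re-create the adapter.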
Key Concepts You Will Master
- Dataset formatting standards - ShareGPT, Alpaca, and chat template formats and when to use each
- Axolotl YAML configuration - the key configuration parameters that determine training quality
- DeepSpeed ZeRO - stages 1/2/3 and when each is appropriate for LoRA fine-tuning
- DPO (Direct Preference Optimization) - training on preference pairs without a reward model
- Adapter versioning - managing adapters on the Hugging Face Hub as model families release new base models
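The DPO concept above reduces to a short formula: given the policy's and a frozen reference model's log-probabilities for a chosen and a rejected response, the per-pair loss is the negative log-sigmoid of a scaled margin. A minimal sketch (per-pair, pure Python; real trainers batch this over tensors):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    margin = (log pi(chosen) - log pi_ref(chosen))
           - (log pi(rejected) - log pi_ref(rejected))
    loss   = -log sigmoid(beta * margin)

    No reward model is needed: the frozen reference model anchors the
    implicit reward, which is DPO's advantage over classic RLHF.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy has not moved from the reference, the margin is zero and the loss is log 2; as the policy raises the chosen response relative to the rejected one, the loss falls toward zero.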
Prerequisites
- LoRA and QLoRA Fine-Tuning
- At least one GPU with 16GB+ VRAM for most lessons
- Python and basic familiarity with training loops
© 2026 EngineersOfAI. All rights reserved.
