Module 5: Fine-Tuning Pipelines

LoRA lets you fine-tune a 70B model on a single GPU. Axolotl gives you a production-grade pipeline for doing it reliably, reproducibly, and at scale. This module is about the engineering around fine-tuning: not the math, but the workflow of dataset preparation, configuration, multi-GPU training, DPO, and managing the adapters you produce afterward.
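To make the workflow concrete, here is a minimal sketch of an Axolotl-style QLoRA config. The key names follow Axolotl's commonly documented YAML schema, but the model, paths, and hyperparameter values are illustrative assumptions, not a tuned recipe:

```yaml
# Illustrative Axolotl QLoRA config sketch -- values are placeholders
base_model: meta-llama/Llama-3.1-8B   # assumed base model for illustration
load_in_4bit: true                    # QLoRA: quantize frozen base weights
adapter: qlora

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

datasets:
  - path: ./data/train.jsonl          # hypothetical dataset path
    type: alpaca                      # or sharegpt / completion, per Lesson 3

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 3
learning_rate: 0.0002
lr_scheduler: cosine
warmup_steps: 100

output_dir: ./outputs/llama3-qlora
```

Lessons 2 and 4 walk through what each of these knobs does and how to choose values for your hardware and dataset.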

Most fine-tuning projects fail not because of bad architecture choices but because of bad data. This module gives equal weight to data engineering and training configuration.

The Fine-Tuning Pipeline

Lessons in This Module

| # | Lesson | Key Concept |
|---|--------|-------------|
| 1 | Fine-Tuning Pipeline Overview | End-to-end workflow, tools, decision points |
| 2 | Axolotl Configuration Deep Dive | YAML config, dataset formats, key hyperparameters |
| 3 | Data Formatting for Fine-Tuning | ShareGPT, Alpaca, raw completion formats |
| 4 | Hyperparameter Selection | Learning rate, batch size, warmup, LoRA rank |
| 5 | Multi-GPU Fine-Tuning | DeepSpeed ZeRO stages, FSDP with LoRA |
| 6 | Continued Pretraining vs Instruction Tuning | Different data, different objectives, different results |
| 7 | DPO and Preference Fine-Tuning | DPO math, preference datasets, vs RLHF |
| 8 | Storing and Versioning Adapters | HuggingFace Hub, adapter naming, model cards |
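Lesson 3's dataset formats are just different JSON shapes for the same training example. A small sketch, with hypothetical sample data, showing one record in Alpaca form and a conversion to the ShareGPT conversation schema (field names follow the common community conventions):

```python
import json

# One training example in the Alpaca schema (hypothetical sample data)
alpaca_record = {
    "instruction": "Summarize the text.",
    "input": "LoRA adds low-rank update matrices to frozen weights.",
    "output": "LoRA trains small low-rank matrices instead of full weights.",
}

def alpaca_to_sharegpt(rec):
    """Convert an Alpaca record to a ShareGPT-style conversation."""
    human = rec["instruction"]
    if rec.get("input"):  # optional context is appended to the user turn
        human += "\n\n" + rec["input"]
    return {
        "conversations": [
            {"from": "human", "value": human},
            {"from": "gpt", "value": rec["output"]},
        ]
    }

sharegpt_record = alpaca_to_sharegpt(alpaca_record)
print(json.dumps(sharegpt_record, indent=2))
```

The raw completion format, by contrast, is just a single `text` field with no turn structure at all, which is why it suits continued pretraining (Lesson 6) rather than instruction tuning.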

Key Concepts You Will Master

  • Dataset formatting standards - ShareGPT, Alpaca, and chat template formats and when to use each
  • Axolotl YAML configuration - the key configuration parameters that determine training quality
  • DeepSpeed ZeRO - stages 1/2/3 and when each is appropriate for LoRA fine-tuning
  • DPO (Direct Preference Optimization) - training on preference pairs without a reward model
  • Adapter versioning - managing adapters as model families release new base models
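The DPO objective from Lesson 7 is compact enough to sketch directly. For a prompt with a chosen response y_w and a rejected response y_l, the loss is -log sigmoid(beta * [(log pi(y_w) - log pi_ref(y_w)) - (log pi(y_l) - log pi_ref(y_l))]), where pi is the policy being trained and pi_ref is the frozen reference model. A pure-Python sketch for a single preference pair (variable names and the example log-probs are illustrative):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are the summed token log-probs of the chosen/rejected
    responses under the trained policy and the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written stably as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Before training, the policy equals the reference, the margin is 0,
# and the loss is log(2) ~= 0.693; it falls as the policy learns to
# prefer the chosen response.
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
```

Note there is no reward model and no sampling loop: both terms are ordinary log-likelihoods, which is why DPO fits into the same supervised pipeline as the rest of this module.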

Prerequisites

© 2026 EngineersOfAI. All rights reserved.