8 docs tagged with "lora"

Advanced PEFT Methods

Beyond LoRA - Prefix Tuning, Prompt Tuning, IA3, AdaLoRA, VeRA, and LoftQ. When to reach for each method, how they compare on parameter count and quality, and practical implementation with the PEFT library.

Axolotl and TRL Training Frameworks

Using Axolotl and HuggingFace TRL for LoRA and QLoRA fine-tuning - configuration files, SFTTrainer, DPO training, and distributed multi-GPU fine-tuning setups.

Evaluating Fine-Tuned Models

Evaluation strategies for fine-tuned LLMs - held-out test sets, LLM-as-judge evaluation, perplexity measurement, task-specific benchmarks, and avoiding evaluation pitfalls.

LoRA Mathematics and Implementation

Learn how LoRA (Low-Rank Adaptation) decomposes weight updates into low-rank matrices, why this works mathematically, and how to implement it from scratch in PyTorch and with HuggingFace PEFT.

Merging and Model Soup Techniques

Combining multiple fine-tuned models without retraining - LoRA adapter merging, SLERP, TIES-merging, DARE, and MergeKit for production model merging that can combine capabilities from separately trained models.

QLoRA: 4-Bit Fine-Tuning

Learn how QLoRA combines 4-bit NF4 quantization, double quantization, and paged optimizers to fine-tune 65B-parameter models on a single 48 GB GPU - covering the math, implementation, and production engineering.

Selecting Target Modules and Rank

Which layers to apply LoRA to and what rank to use - two of the most impactful fine-tuning decisions. Covers attention vs FFN targeting, rank selection from r=4 to r=64, rsLoRA, DoRA, LoRA+, and ablation strategies.

Training Data Preparation for Fine-Tuning

Building high-quality data pipelines for LoRA fine-tuning - chat templates, instruction masking, deduplication, quality filtering, synthetic data generation, and choosing dataset formats.