8 docs tagged with "training"

Continual Learning and Domain Adaptation

Learn how to adapt open-source language models to specialized domains through continual pre-training, manage catastrophic forgetting with elastic weight consolidation (EWC) and data mixing, and evaluate domain knowledge gain versus general capability loss.

Fine-Tuning Cost and ROI Analysis

Making the business case for LLM fine-tuning: calculating GPU compute costs, estimating break-even against API pricing, and deciding when fine-tuning beats prompt engineering on ROI.

Fine-Tuning Hyperparameter Search

Systematic hyperparameter optimization for LLM fine-tuning: learning rate, batch size, epochs, LoRA rank, warmup schedules, and efficient search strategies with Optuna and WandB sweeps.

Full Fine-Tuning vs PEFT

A decision framework for choosing between full fine-tuning and parameter-efficient methods like LoRA and QLoRA, covering compute requirements, quality ceilings, catastrophic forgetting, and when each approach wins.

Instruction Tuning at Scale

How to instruction-tune open-source models at production scale, covering the FLAN insight, dataset construction principles, scaling laws for instruction data, multi-node training setup, and a complete pipeline for fine-tuning Llama 3 8B on a 2-node A100 cluster.

Monitoring and Debugging Fine-Tuning

How to monitor LLM fine-tuning runs and debug failures: tracking loss curves, gradient norms, GPU utilization, and model FLOPs utilization (MFU), and diagnosing NaN loss, overfitting, and OOM errors in LoRA and full fine-tuning.

RLHF and DPO for Open Models

Learn how to align open-source language models with human preferences using RLHF and the simpler, more stable Direct Preference Optimization (DPO) approach with TRL.

Synthetic Data and Self-Improvement

Generating high-quality synthetic training data with LLMs using Evol-Instruct, Self-Instruct, Constitutional AI, rejection sampling, and self-play techniques to build data flywheels without expensive human annotation.