8 docs tagged with "model-merging"

DARE - Delta Weight Sparsification

How DARE randomly drops delta weights and rescales the remainder to dramatically reduce interference when merging multiple fine-tuned models.

Linear Interpolation and Model Soup

How averaging the weights of fine-tuned models can produce a more accurate, more robust model than any individual fine-tune, and how the task arithmetic framework composes capabilities.

MergeKit - The Practical Toolkit

How to use arcee-ai/mergekit to merge language models with YAML configuration, CPU-compatible layer-by-layer processing, and automated Hugging Face Hub upload.

SLERP - Spherical Linear Interpolation

How spherical linear interpolation blends the weights of two models more smoothly, and with better respect for their geometry, than simple linear averaging.

Why Model Merging Exists

The catastrophic forgetting problem, why naive ensembles are too expensive, and the surprising geometric insight that makes model merging possible.