
Assign and Add: A Mechanistic Study of Compositional Arithmetic

:::info Stub: Full Engineering Breakdown Coming
This paper was auto-fetched from arXiv on 2026-06-01. A full breakdown with production viability rating, implementation notes, and honest limitations is being written.
:::

Authors: Brady Exoo et al.
Year: 2026
Field: Machine Learning
arXiv: 2605.31497
Categories: cs.LG, stat.ML

Abstract

Large language models are able to compose skills in order to perform complex tasks, many of which might not have been seen during training. The details of how exactly this composition occurs remain elusive. In this paper, we study a mechanism for compositional generalization in transformers by considering a simple controlled setting involving variable assignment and modular addition. By partitioning our training data into disjoint sets, we observe that small transformers are able to generalize to previously unseen combinations of variables and numbers. Our mechanistic analysis shows that the same "modular addition" MLP module is used whether the inputs are given directly or indirectly through a separate variable assignment mechanism. We also analyze the training dynamics from an empirical lens, which reveals three phases of learning: first, modular addition is learned, then the structure required for variable assignment, and finally a refinement phase where the model generalizes to some hard sequences not seen in training. Finally, we provide a theoretical framework to explain how compositionality emerges from training dynamics. These results suggest that compositional generalization can be a natural consequence of the compositionality of internal mechanisms in transformers.


Engineering Breakdown

The Problem

Large language models can compose skills to perform complex tasks, many of which they may never have seen during training. How exactly this composition happens inside the model remains elusive.

The Approach

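The authors study a mechanism for compositional generalization in transformers using a simple controlled setting that combines variable assignment with modular addition. The training data is partitioned into disjoint sets, so that small transformers can be tested on previously unseen combinations of variables and numbers.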
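To make the setting concrete, here is a minimal sketch of how such a dataset might be generated. The prompt format, the modulus `P`, the variable names in `VARS`, and the choice to hold out individual (variable, number) bindings are all illustrative assumptions; the abstract does not specify the paper's actual tokenization or split.

```python
import itertools
import random

P = 7                         # assumed modulus, not taken from the paper
VARS = ["a", "b", "c", "d"]   # hypothetical variable names


def make_direct(x, y, p=P):
    """Direct prompt: both operands are literal numbers."""
    return f"{x} + {y} =", (x + y) % p


def make_indirect(var1, x, var2, y, p=P):
    """Indirect prompt: operands are first bound to variables, then added."""
    return f"{var1} = {x} ; {var2} = {y} ; {var1} + {var2} =", (x + y) % p


def split_bindings(variables, p=P, held_out_frac=0.25, seed=0):
    """Partition all (variable, number) bindings into disjoint train/test sets.

    This is one possible reading of "disjoint sets"; the paper may split differently.
    """
    bindings = list(itertools.product(variables, range(p)))
    rng = random.Random(seed)
    rng.shuffle(bindings)
    n_test = int(len(bindings) * held_out_frac)
    return set(bindings[n_test:]), set(bindings[:n_test])


def build_examples(bindings, p=P):
    """Build indirect prompts from every ordered pair of bindings on distinct variables."""
    examples = []
    for (v1, x), (v2, y) in itertools.permutations(sorted(bindings), 2):
        if v1 != v2:
            examples.append(make_indirect(v1, x, v2, y, p))
    return examples
```

Under this split, a test prompt only uses bindings the model never saw during training, even though each variable and each number may have appeared separately.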

Key Results

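Small transformers generalize to previously unseen combinations of variables and numbers. Mechanistic analysis shows that the same "modular addition" MLP module is used whether the inputs are given directly or indirectly through a separate variable assignment mechanism. Training proceeds in three phases: modular addition is learned first, then the structure required for variable assignment, and finally a refinement phase in which the model generalizes to some hard sequences not seen in training. The authors also provide a theoretical framework explaining how compositionality emerges from training dynamics. These results suggest that compositional generalization can be a natural consequence of the compositionality of internal mechanisms in transformers.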
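The module-reuse finding reported in the abstract can be pictured with a toy program in which one addition routine serves both input paths. This is only an analogy for the claimed structure, not the model's actual circuit; `mod_add`, `resolve`, `evaluate`, and the modulus `P` are hypothetical names and values chosen for illustration.

```python
P = 7  # assumed modulus, not taken from the paper


def mod_add(x, y, p=P):
    """Stand-in for the shared "modular addition" module."""
    return (x + y) % p


def resolve(token, env):
    """Stand-in for variable assignment: look up a variable, pass literals through."""
    return env[token] if token in env else int(token)


def evaluate(lhs, rhs, env=None):
    """Resolve both operands, then route them through the same adder."""
    env = env or {}
    return mod_add(resolve(lhs, env), resolve(rhs, env))
```

Calling `evaluate("3", "5")` and `evaluate("a", "b", {"a": 3, "b": 5})` exercises the direct and indirect paths respectively, and both end in the same call to `mod_add`.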

Research Areas

This paper contributes to the following areas of AI/ML engineering:

  • Model training
  • Generalization
  • Optimization
  • Supervised learning
  • Deep learning
  • Mechanistic interpretability




© 2026 EngineersOfAI. All rights reserved.