Dot Products and Projections - The Math Behind Attention
A deep engineering dive into dot products, orthogonality, vector projection, Gram-Schmidt orthogonalization, and least squares - the mathematical heart of the transformer attention mechanism.