Information Geometry
Statistical manifolds, the Fisher information matrix, natural gradient descent, and why second-order optimization methods such as K-FAC and Shampoo are geometrically principled.
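
As a concrete illustration of natural gradient descent, here is a minimal sketch that fits a univariate Gaussian by preconditioning the gradient of the average negative log-likelihood with the inverse Fisher information matrix. The model, the (mu, log sigma) parametrization, the step size, and the function names `nll_grad`, `fisher`, and `natural_gradient_descent` are all assumptions chosen for the example, not something specified above. In this parametrization the Fisher matrix has the closed form diag(1/sigma^2, 2), so no approximation such as K-FAC or Shampoo is needed.

```python
import numpy as np

def nll_grad(theta, x):
    """Gradient of the average Gaussian negative log-likelihood
    with respect to theta = (mu, s), where s = log(sigma)."""
    mu, s = theta
    var = np.exp(2.0 * s)
    d_mu = -(x.mean() - mu) / var
    d_s = 1.0 - np.mean((x - mu) ** 2) / var
    return np.array([d_mu, d_s])

def fisher(theta):
    """Fisher information matrix of N(mu, sigma^2) in (mu, log sigma)
    coordinates: diag(1 / sigma^2, 2)."""
    _, s = theta
    return np.diag([np.exp(-2.0 * s), 2.0])

def natural_gradient_descent(x, theta0, lr=0.2, steps=300):
    """Iterate theta <- theta - lr * F(theta)^{-1} grad(theta)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        g = nll_grad(theta, x)
        # Solve F d = g rather than forming the inverse explicitly.
        theta = theta - lr * np.linalg.solve(fisher(theta), g)
    return theta

# Synthetic data; the true parameters are illustrative only.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=5000)

theta = natural_gradient_descent(x, theta0=[0.0, 0.0])
mu_hat, sigma_hat = theta[0], np.exp(theta[1])
```

With these settings the iterates settle at the maximum-likelihood estimates, so `mu_hat` and `sigma_hat` agree with the sample mean and the (biased) sample standard deviation of `x`. The preconditioned step for mu reduces to `lr * (x.mean() - mu)`, independent of sigma, which is the kind of parametrization-aware rescaling the Fisher metric provides.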