Bytecode and the Compiler
How Python source becomes bytecode, the dis module, .pyc files, peephole optimisation, and writing a bytecode-level function.
A deep engineering dive into convex functions, convex sets, loss landscape geometry, saddle points, local vs global minima, and why deep learning works despite non-convexity.
A deep engineering dive into gradient descent derivation, learning rate theory, convergence conditions, batch vs mini-batch vs SGD, momentum, and learning rate schedules with complete Python implementations.
Statistical manifolds, Fisher information matrix, natural gradient descent, and why second-order optimization methods like K-FAC and Shampoo are geometrically principled.
A deep engineering dive into constrained optimization, the Lagrangian function, KKT conditions, and their ML applications in SVMs, L1/L2 regularization, and trust region methods.
Profiling, Cython, Numba, memory optimisation, async performance, and Python at scale - turning Python code from slow to production-fast.
A deep engineering dive into the math behind SGD with momentum, AdaGrad, RMSProp, Adam, AdamW, learning rate schedules, gradient clipping, and when to use each optimizer for ML training.
Object memory overhead, __slots__, generators, memory-mapped files, and GC tuning - reducing Python's memory footprint in production.
A deep engineering dive into Taylor expansions, why gradient descent uses first-order approximations, how Newton's method uses curvature, quasi-Newton methods, and their practical implications for ML optimization.