01 Calculus and Optimization for Machine Learning - Module Overview
A complete module map showing how derivatives, gradients, backpropagation, gradient descent, and optimization algorithms connect to training every major ML model.

02 Derivatives and Gradients - The Compass of Training
A deep engineering dive into single-variable derivatives, partial derivatives, gradient vectors, and Jacobians - the mathematical foundation behind every gradient-based ML training algorithm.

03 Chain Rule and Backpropagation - How Neural Networks Learn
A deep engineering dive into the chain rule, computational graphs, forward and backward passes, and how PyTorch autograd implements backpropagation to train networks of any depth.

04 Gradient Descent Mechanics - The Engine of Every Training Loop
A deep engineering dive into gradient descent derivation, learning rate theory, convergence conditions, batch vs mini-batch vs SGD, momentum, and learning rate schedules with complete Python implementations.

05 Convex Functions and Optimization - Why Some Problems Are Easy and Others Are Not
A deep engineering dive into convex functions, convex sets, loss landscape geometry, saddle points, local vs global minima, and why deep learning works despite non-convexity.

06 Lagrange Multipliers - Constrained Optimization and the Math Behind SVMs
A deep engineering dive into constrained optimization, the Lagrangian function, KKT conditions, and their ML applications in SVMs, L1/L2 regularization, and trust region methods.

07 Taylor Series and Approximations - The Mathematics Behind Gradient Descent and Newton's Method
A deep engineering dive into Taylor expansions, why gradient descent uses first-order approximations, how Newton's method uses curvature, quasi-Newton methods, and their practical implications for ML optimization.

08 Automatic Differentiation - How PyTorch Really Computes Gradients
A deep engineering dive into forward mode and reverse mode automatic differentiation, computational graphs, PyTorch autograd internals, custom gradient functions, and when to use torch.no_grad().

09 Optimization Algorithms Deep Dive - SGD, Adam, AdamW, and Beyond
A deep engineering dive into the math behind SGD with momentum, AdaGrad, RMSProp, Adam, AdamW, learning rate schedules, gradient clipping, and when to use each optimizer for ML training.