Bias-Variance Tradeoff
Mathematical decomposition of generalization error into bias, variance, and noise - with formal derivations, practical examples, and the modern double-descent perspective in deep learning.
Why classical theory fails for deep learning - double descent, benign overfitting, implicit regularisation of SGD, neural tangent kernel, and modern PAC-Bayes bounds.
The online learning model, regret bounds, Perceptron algorithm, Follow-The-Leader and Follow-The-Regularised-Leader, Hedge algorithm, and connections to streaming ML and online ad auctions.
Probably Approximately Correct framework - sample complexity, consistent learners, finite hypothesis classes, and the formal foundation of why data size matters in ML.
Rademacher complexity as a data-dependent measure of hypothesis class richness - definition, connection to VC dimension, generalization bounds, and why it gives tighter guarantees for ML.
Regularisation as Occam's razor - Tikhonov regularisation, structural risk minimisation, the connection between dropout and Bayesian inference, and early stopping as regularisation.
The mathematical theory of generalization - why ML models work, when they fail, and how to bound their error. Module map and learning objectives for PAC learning, VC dimension, and modern generalization theory.
Vapnik-Chervonenkis dimension - shattering, VC dimension of common classifiers, the Fundamental Theorem of Statistical Learning, and why model capacity determines generalization.