13 docs tagged with "ml-foundations"

Bias-Variance Tradeoff

The formal decomposition of prediction error into bias, variance, and noise - with production diagnostics, learning curves, double descent, and ensemble strategies.

Cross-Validation

A comprehensive guide to cross-validation - k-fold, stratified, repeated, LOOCV, group CV, time-series CV, nested CV, and common pitfalls including data leakage.

Data Representation and Feature Spaces

How raw data is encoded as vectors in feature spaces - tabular, text, image, time-series, and graph data - including the curse of dimensionality and practical feature engineering with sklearn.

Evaluation Metrics for Classification

Precision, recall, F1, AUC-ROC, AUC-PR, log loss, MCC - the complete guide to classification evaluation with business context, code, and when each metric matters.

Evaluation Metrics for Regression

A comprehensive guide to regression evaluation - MAE, MSE, RMSE, R², MAPE, Huber loss, residual diagnostics, business-aligned metrics, and production monitoring patterns.

Generalization, Overfitting, and Underfitting

Why models fail to generalize - the formal definition of generalization gap, diagnosing overfitting and underfitting, regularization strategies, and distribution shift in production.

Module 01: ML Foundations - Overview

Complete overview of the ML Foundations module - 12 lessons covering the core concepts every ML engineer must know before building production systems.

Probabilistic View of Machine Learning

Framing machine learning through probability - MLE, MAP estimation, prior-posterior reasoning, cross-entropy as negative log-likelihood, calibration, Bayesian deep learning, and uncertainty quantification.

Statistical Learning Theory

The mathematical foundations of machine learning - PAC learning, VC dimension, Rademacher complexity, sample complexity, generalization bounds, and the theory behind why regularization works.

The ML Workflow - End to End

The complete ML engineering workflow from problem framing through data, features, model training, evaluation, deployment, and monitoring - and where projects actually fail.

Train / Validation / Test Split Strategy

A deep dive into data splitting - why the split matters, how to partition data correctly, data leakage patterns, temporal splits, group splits, and production-grade evaluation design.

What is Machine Learning?

Three precise ways to think about ML - optimization, compression, and function approximation - with production context, taxonomy, and when ML is the wrong tool.