Decision Tree Internals
Deep dive into decision tree internals - recursive binary splitting, CART, Gini and entropy impurity, pruning, and a full from-scratch NumPy implementation for classification and regression.
Master all three feature importance types, TreeSHAP for exact Shapley values, SHAP interaction values, feature selection with SHAP, data leakage detection, fairness analysis, and production importance drift monitoring.
Understand gradient boosting from first principles - additive models, functional gradient descent, pseudo-residuals for any loss function, shrinkage, stochastic boosting, and bias-variance tradeoffs versus Random Forest.
A deep dive into how decision trees choose splits - Shannon entropy, information gain, Gini impurity, gain ratio, regression variance reduction, and the multi-valued feature bias every practitioner must understand.
Master LightGBM's GOSS and EFB algorithms, CatBoost's ordered target statistics, and learn when to choose each framework for large-scale tabular machine learning.
Master decision trees and ensemble methods from first principles - the model family that dominates tabular ML competitions and powers production fraud, pricing, and ranking systems worldwide.
How to prevent decision tree overfitting through pre-pruning parameters, cost-complexity post-pruning, weakest-link pruning, the MDL principle, and production-grade tuning strategies.
Master Random Forests from first principles - bagging variance reduction math, feature randomization, OOB error estimation, Extra-Trees, bias-variance decomposition, MDI vs permutation importance, and production deployment patterns.
Master stacking and blending ensemble techniques - out-of-fold meta-learning, data leakage prevention, model diversity, snapshot ensembling, temporal ensembling, Kaggle competition patterns, and production deployment tradeoffs.
Master XGBoost internals - the 7 innovations over vanilla gradient boosting, optimal leaf weights, gain calculation, hyperparameter tuning, and production deployment with ONNX and GPU training.