10 docs tagged with "tree-models"

Decision Trees Internals

Deep dive into decision tree internals - recursive binary splitting, CART, Gini and entropy impurity, pruning, and a full from-scratch NumPy implementation for classification and regression.

Feature Importance and SHAP

Master all three feature importance types, TreeSHAP for exact Shapley values, SHAP interaction values, feature selection with SHAP, data leakage detection, fairness analysis, and production importance drift monitoring.

Gradient Boosting From Scratch

Understand gradient boosting from first principles - additive models, functional gradient descent, pseudo-residuals for any loss function, shrinkage, stochastic boosting, and bias-variance tradeoffs versus Random Forest.

Information Gain, Gini Impurity, and Entropy

A deep dive into how decision trees choose splits - Shannon entropy, information gain, Gini impurity, gain ratio, regression variance reduction, and the multi-valued feature bias every practitioner must understand.

LightGBM and CatBoost

Master LightGBM's GOSS and EFB algorithms, CatBoost's ordered target statistics, and learn when to choose each framework for large-scale tabular machine learning.

Module 03 - Tree Models and Ensembles

Master decision trees and ensemble methods from first principles - the model family that dominates tabular ML competitions and powers production fraud, pricing, and ranking systems worldwide.

Pruning and Depth Control

How to prevent decision tree overfitting through pre-pruning parameters, cost-complexity post-pruning, weakest-link pruning, the MDL principle, and production-grade tuning strategies.

Random Forests

Master Random Forests from first principles - bagging variance reduction math, feature randomization, OOB error estimation, Extra-Trees, bias-variance decomposition, MDI vs permutation importance, and production deployment patterns.

Stacking and Blending

Master stacking and blending ensemble techniques - out-of-fold meta-learning, data leakage prevention, model diversity, snapshot ensembling, temporal ensembling, Kaggle competition patterns, and production deployment tradeoffs.

XGBoost Deep Dive

Master XGBoost internals - the 7 innovations over vanilla gradient boosting, optimal leaf weights, gain calculation, hyperparameter tuning, and production deployment with ONNX and GPU training.