10 docs tagged with "explainability"

Attention as Explanation - What Transformers Are (and Aren't) Looking At

When attention weights help explain transformer decisions, when they mislead, and the debate between attention-as-explanation and attention-is-not-explanation.

Counterfactual Explanations - What Would Have to Change for a Different Decision?

Counterfactual explanations answer 'what would need to change?' - often the most actionable form of ML explanation, and one proposed way to meet GDPR expectations for automated decision-making.

Evaluating the Quality of ML Explanations - Faithfulness, Robustness, and Human Studies

How to measure whether an ML explanation is actually good - faithfulness metrics, the ROAR benchmark, sanity checks, human evaluation studies, and a complete quantitative evaluation pipeline.

Explainability in Production ML Systems - Monitoring, Latency, and Compliance

How to operationalize ML explainability at scale - latency budgets, caching strategies, drift monitoring, compliance audit trails, and production architecture patterns for regulated industries.

Feature Importance Methods - Beyond SHAP

Permutation importance, impurity-based importance, partial dependence plots, ALE, H-statistics, Sobol indices, and production monitoring - the complete toolkit for understanding which features drive your model's decisions, and when each method lies to you.

Interpretability vs Explainability - Clearing Up the Confusion

The difference between understanding how a model works (interpretability) and explaining a specific prediction (explainability) - and why that distinction shapes regulation, trust, and system design.

LIME - Local Interpretable Model-Agnostic Explanations

LIME explains any black-box classifier by fitting a local linear approximation around a specific prediction - the algorithm, variants, limitations, and when to use it vs SHAP.

Module 12 - Explainability and Interpretability

From Shapley values to saliency maps - the complete toolkit for understanding, auditing, and explaining ML models in production.

Saliency Maps for Vision - What Your CNN Is Actually Seeing

Gradient-based saliency, Grad-CAM, SmoothGrad, Guided Backpropagation, and Integrated Gradients for explaining computer vision models - with practical code and honest limitations.

SHAP Values - The Unified Theory of Feature Importance

Shapley values from cooperative game theory are the unique attribution of feature contributions that satisfies the efficiency, symmetry, dummy, and additivity axioms - and SHAP makes them practical to compute through approximations and model-specific algorithms.