Custom Data Monitoring
Building custom monitoring with Great Expectations and statistical tests.
Apache Atlas, DataHub, Amundsen - cataloguing data for ML teams.
Runbooks, on-call rotations, and root cause analysis for data incidents.
Column-level lineage, impact analysis, and tools like OpenLineage and DataHub.
What freshness, distribution, volume, schema, and lineage tracking do for AI systems, how silent data drift and pipeline failures corrupt model inputs and degrade predictions, and how to instrument these five pillars in production AI data pipelines.
Monte Carlo, Bigeye, and Soda - managed data observability.