The Provenance Gap - AI Letters #33

EU AI Act Enforcement Deadlines - Countdown

High-risk AI systems (credit scoring, HR decisions, educational AI, critical infrastructure) must maintain full audit trails.

August 2026

Article 26 logging obligations for new high-risk AI systems

2 December 2027

Backstop enforcement for all deployed high-risk AI systems

Already in force

GDPR Article 17: right to erasure of personal training data, verifiable

The gap

No existing single tool answers audit questions end-to-end

2018

MLflow Released

Databricks open-sources MLflow. Artifact-level experiment tracking becomes standard. Records which dataset file and code commit produced which model. Does not record which rows were processed or how.

2020

DVC + DSLog

DVC adds Git-like versioning for datasets. DSLog tracks operator-level provenance for NumPy workloads. Both excellent at their stage - neither crosses the boundary into training or attribution.

2023

TRAK: Attribution at Scale

Park et al. (MIT/ICML 2023) introduce TRAK - training data attribution via sparse JL projection. LDS 0.0290 on CIFAR-2 in 691 seconds on GPU. Excellent attribution quality. Operates entirely on post-preprocessing tensors with no source file knowledge.

2024

EU AI Act Published

EU Regulation 2024/1689 published. Article 26 requires audit trails for high-risk AI. Article 13 requires transparency. GDPR Article 17 erasure obligations already in force. The compliance clock starts.

May 2026

Traceprop Released (Zenodo Preprint)

First unified system connecting source files through preprocessing through training to predictions. Three layers: ProvenanceTensor lineage, JL-projected GradientStore attribution, provenance-guided approximate unlearning. Sub-1% overhead at 1M elements. pip install traceprop.

August 2026

EU AI Act Article 26 - New Systems

Logging obligations enter force for new high-risk AI systems. Credit scoring, HR decisions, educational AI, critical infrastructure must maintain audit trails. Systems deployed after this date need audit trail infrastructure from day one.

2 Dec 2027

EU AI Act Backstop Enforcement

All deployed high-risk AI systems must be compliant. Systems deployed before August 2026 have until this date to retrofit audit trail infrastructure. The 18-month window is shorter than most ML infrastructure projects.