Airflow for ML Pipelines
Orchestrate ML training pipelines with Airflow - data quality gates, KubernetesPodOperator training, champion/challenger evaluation, and conditional deployment.
Storing training datasets, experiment artifacts, and model outputs in a lakehouse.
NumPy, Pandas, SciPy, Matplotlib, scikit-learn, PyTorch, and JAX - the complete Python stack for AI/ML engineering.
Pandas for machine learning - efficient data loading, feature engineering, pipelines, memory optimisation, and common ML preprocessing patterns.
Scikit-learn Pipeline, ColumnTransformer, custom transformers, feature unions, and production-ready ML workflows.
SciPy for machine learning - optimisation, sparse matrices, statistical distributions, signal processing, and distance metrics.