19 docs tagged with "feature-engineering"

Embedding Stores

Storing and serving dense embeddings at scale for real-time recommendation and search.

Feature Consistency

Ensuring features are computed identically in training (offline) and serving (online).

Feature Monitoring in Production

Monitoring features after deployment - population stability index (PSI), Kolmogorov-Smirnov (KS) tests, freshness monitoring, completeness tracking, and proving to a regulator that no feature's PSI exceeded 0.10.

Feature Selection and Importance

Reducing 500 features to 50 without losing model performance - filter, wrapper, and embedded methods, SHAP-based selection, and leakage detection.

Feature Stores in Production

Architecture and operations of feature stores - offline and online layers, point-in-time joins, and avoiding the training-serving skew that costs you accuracy.

Feature Validation and Testing

Ensuring feature quality through schema validation, unit tests, integration tests, and monitoring - catching the NaN bug before it degrades your model for 3 weeks.

Numerical and Categorical Features

Systematic feature engineering for tabular data - transformations, encoding, imputation, and selection that lifted AUC from 0.71 to 0.84.

Overview

Overview of real-time feature engineering for low-latency ML systems.

Pandas for ML

Pandas for machine learning - efficient data loading, feature engineering, pipelines, memory optimisation, and common ML preprocessing patterns.

Production Patterns

Case studies in real-time feature engineering from Uber, Twitter, and LinkedIn.

Scikit-Learn Pipelines

Scikit-learn Pipeline, ColumnTransformer, custom transformers, feature unions, and production-ready ML workflows.

Text Features for ML

Turning text into ML features - from TF-IDF baselines to embedding-based representations that improved e-commerce search NDCG by 18%.

Time-Series Features

Feature engineering for temporal data - lag features, rolling statistics, Fourier seasonality, and preventing temporal leakage that destroys production forecasts.