Module 10 - Time Series Mathematics for ML Engineering
Nearly every production ML system operates on data with temporal structure - yet most ML courses treat time series as an afterthought.
Your recommendation system uses users' recent interactions (temporal order matters). Your financial model predicts stock prices (non-stationary time series that most models get wrong). Your IoT anomaly detector monitors sensor readings at 100 Hz (spectral analysis, not just statistics). Your language model is fundamentally a time series model - token prediction given history.
Understanding the mathematics of time series is not optional for a production ML engineer. It is the difference between models that correctly account for temporal dependencies and models that silently fail on data that violates their independence assumptions.
Why Time Series Math Matters in ML
Sequential data is everywhere
| Domain | Time Series | What Goes Wrong Without This Math |
|---|---|---|
| Finance | Stock prices, order book | Non-stationarity violates model assumptions |
| IoT/Sensors | Temperature, vibration, power | Aliasing, noise, seasonality missed |
| NLP/LLMs | Token sequences | Sinusoidal positional encodings are opaque without Fourier analysis |
| Recommendation | User interaction history | Temporal autocorrelation causes data leakage |
| Robotics | Control signals | Noisy sensor readings fused poorly without a Kalman filter |
| Audio/Speech | Waveforms | Weak features without Fourier and wavelet transforms |
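To make the aliasing entry in the table concrete, here is a minimal NumPy sketch. It assumes a sensor sampled at 100 Hz, as in the IoT example above; the 30 Hz and 70 Hz test tones and the helper name `dominant_frequency` are illustrative choices, not part of the course material.

```python
import numpy as np

fs = 100.0                      # sampling rate in Hz (assumed, matching the 100 Hz sensor)
n = 1000                        # 10 seconds of samples
t = np.arange(n) / fs

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest non-DC peak in the power spectrum."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[1:][np.argmax(spectrum[1:])]

# A 30 Hz tone is below the Nyquist limit (fs / 2 = 50 Hz) and is recovered correctly.
f_ok = dominant_frequency(np.sin(2 * np.pi * 30.0 * t), fs)

# A 70 Hz tone is above Nyquist; once sampled it is indistinguishable from a 30 Hz tone.
f_aliased = dominant_frequency(np.sin(2 * np.pi * 70.0 * t), fs)

print(f_ok, f_aliased)          # both peaks land at roughly 30 Hz
```

Summary statistics such as mean and variance are identical for the two tones, which is why the table lists spectral analysis rather than plain statistics for sensor data.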
The non-stationarity problem
A neural network trained on 2019–2022 stock data may work perfectly in backtesting but fail spectacularly in deployment. Why? The data distribution shifts over time (non-stationarity), so patterns learned in one period need not hold in the next. Without testing for stationarity (for example with the Augmented Dickey-Fuller test), you cannot know whether your model learned genuine signal or just memorized a trend that no longer holds.
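The idea behind the test can be sketched with NumPy alone. What follows is a simplified Dickey-Fuller regression without the lag augmentation that the full ADF test adds; the function name, the simulated series, and the seed are assumptions for illustration. The value -2.86 is the approximate large-sample 5% critical value for a regression with a constant and no trend.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic of rho in the regression  dy_t = c + rho * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])    # constant + lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)              # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=2000))   # unit root: non-stationary
differenced = np.diff(random_walk)               # first difference: stationary

CRIT_5PCT = -2.86   # approximate 5% critical value (constant, no trend)

stats = {name: dickey_fuller_stat(series)
         for name, series in [("random walk", random_walk),
                              ("differenced", differenced)]}
for name, stat in stats.items():
    # A statistic below the critical value rejects the unit-root hypothesis.
    print(f"{name}: stat={stat:.2f}, reject unit root: {stat < CRIT_5PCT}")
```

A strongly negative statistic, as the differenced series produces, is evidence against a unit root. In practice `statsmodels.tsa.stattools.adfuller` implements the full augmented test with proper p-values and lag selection.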
Temporal dependencies = data leakage risk
A standard train/test split shuffles the data, which is fine for i.i.d. data and catastrophic for time series: shuffling destroys temporal order and lets the model train on observations that come after the ones it is tested on. Time-based cross-validation, an understanding of autocorrelation, and familiarity with ARIMA are prerequisites for building evaluation pipelines that give honest performance estimates.
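One way to respect temporal order is a walk-forward (expanding window) split, sketched below. The function name `walk_forward_splits` and its parameters are hypothetical, chosen for this illustration.

```python
import numpy as np

def walk_forward_splits(n_samples, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs where every test index follows every train index."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = np.arange(0, test_start)                      # expanding training window
        test_idx = np.arange(test_start, test_start + test_size)  # strictly later block
        yield train_idx, test_idx

splits = list(walk_forward_splits(n_samples=12, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print("train up to", train_idx.max(), "| test:", test_idx.tolist())
```

Each fold trains only on the past and evaluates on the block that immediately follows it. scikit-learn's `TimeSeriesSplit` offers a ready-made version of the same idea.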
Module Map
Module 10: Time Series Mathematics
│
├── 01 - Stationarity and Ergodicity
│ Strict vs weak stationarity, ADF test,
│ differencing, unit roots, ML implications
│
├── 02 - Autocorrelation and PACF
│ ACF and PACF functions, lag plots,
│ interpreting correlograms to identify ARIMA orders
│
├── 03 - Fourier Analysis
│ DFT, FFT algorithm, power spectrum,
│ Fourier features in neural networks
│
├── 04 - ARIMA Models
│ AR, MA, ARIMA, SARIMA,
│ Box-Jenkins methodology, statsmodels
│
├── 05 - State Space Models
│ State space representation, Kalman filter,
│ connection to RNNs and LSTMs
│
├── 06 - Cointegration and Granger Causality
│ Cointegration, Johansen test,
│ Granger causality, causal feature selection
│
└── 07 - Wavelets and Multiscale Analysis
  Wavelet transform, mother wavelets,
  multiresolution analysis, WaveNet
Key Concepts at a Glance
| Concept | Why It Matters in ML |
|---|---|
| Stationarity | Assumed by ARMA-type models (ARIMA reaches it by differencing); violated by most raw financial price series |
| ADF test | Diagnostic: is your series stationary? |
| Autocorrelation | Quantifies temporal dependence; guides model lag selection |
| PACF | Identifies the AR order in ARIMA |
| FFT | O(n log n) spectral decomposition - fast feature extraction |
| Power spectrum | Which frequencies carry the most signal? |
| ARIMA(p,d,q) | Classical forecasting baseline you must understand |
| Kalman filter | Optimal state estimation for linear-Gaussian systems - backbone of sensor fusion |
| Cointegration | Long-run equilibrium between non-stationary series |
| Granger causality | Does series X help predict Y? |
| Wavelet transform | Multi-resolution analysis: time + frequency simultaneously |
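As a small illustration of the autocorrelation row, the sketch below compares the sample ACF of white noise with that of a simulated AR(1) process. The coefficient 0.8, the seed, and the helper name `sample_acf` are assumptions made for the demo.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, normalised by the lag-0 value."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([1.0 if k == 0 else (x[:-k] @ x[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(42)
n, phi = 5000, 0.8                      # hypothetical AR(1) coefficient for the demo
eps = rng.normal(size=n)                # white noise: no temporal dependence

ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi * ar1[t - 1] + eps[t]  # each value depends on the previous one

acf_noise = sample_acf(eps, max_lag=5)
acf_ar1 = sample_acf(ar1, max_lag=5)
band = 1.96 / np.sqrt(n)                # approximate 95% band under white noise

print("white noise:", np.round(acf_noise, 2))
print("AR(1):      ", np.round(acf_ar1, 2))
print("band: +/-", round(band, 3))
```

For an AR(1) process the theoretical autocorrelation at lag k is phi^k, so the second row should decay geometrically from about 0.8, while the white-noise values should mostly stay inside the band. Lags whose autocorrelation sits well outside the band are candidates for lag features.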
Connections to Other Modules
Start with Lesson 01: Stationarity and Ergodicity →
