Autocorrelation and Partial Autocorrelation

Reading time: ~35 minutes
Interview relevance: High - ACF/PACF is a standard ARIMA interview topic; data leakage detection requires understanding temporal autocorrelation
Target roles: ML Engineer, Research Engineer, Data Scientist, Financial ML

The Real Interview Moment

You're in a technical screen at a hedge fund. The interviewer shares a plot: a time series of daily stock returns. "We're thinking of using a regression model to predict tomorrow's return from today's. Before we run anything - what's the first diagnostic you'd run?"

Most candidates say "check for normality" or "plot the data." The senior quant nods politely and waits.

The correct answer: "I'd plot the autocorrelation function to see whether the series has temporal dependence. If the ACF at lag 1 is significant, the returns are serially correlated, so ordinary least squares inference on them is unreliable: the errors violate the independence assumption, the standard errors are typically understated, and the confidence intervals come out too narrow. I'd also check the PACF to decide whether AR terms belong in the model."

That's the difference autocorrelation makes. It's not just a diagnostic - it's the foundation of every classical time series model and a prerequisite for honest evaluation of any sequential model.

What Autocorrelation Measures

The core idea

Autocorrelation is the correlation of a time series with a lagged version of itself. Instead of asking "how correlated are X and Y?", we ask: "how correlated is $X_t$ with $X_{t-k}$?"

If yesterday's value tells you something about today's value, the series is autocorrelated.

Examples in ML:

Stock returns: low autocorrelation (markets are semi-efficient)
Temperature readings: high autocorrelation (tomorrow is similar to today)
Network traffic: strong daily seasonality (autocorrelation at lag 24 hours)
Language model hidden states: autocorrelated across time steps

The Autocovariance Function

For a stationary time series $\{X_t\}$ with mean $\mu$, the autocovariance at lag $k$ is:

$\gamma(k) = \text{Cov}(X_t, X_{t-k}) = \mathbb{E}[(X_t - \mu)(X_{t-k} - \mu)]$

Key properties of autocovariance:

$\gamma(0) = \text{Var}(X_t)$ - variance of the series
$\gamma(k) = \gamma(-k)$ - symmetric in lag (for stationary series)
$|\gamma(k)| \leq \gamma(0)$ - bounded by variance

The Autocorrelation Function (ACF)

The autocorrelation function normalizes autocovariance to $[-1, 1]$:

$\rho(k) = \frac{\gamma(k)}{\gamma(0)} = \frac{\text{Cov}(X_t, X_{t-k})}{\text{Var}(X_t)}$

This is the theoretical ACF. In practice, we estimate it from data.

Sample ACF Estimator

Given observations $x_1, x_2, \ldots, x_T$ with sample mean $\bar{x}$:

$\hat{\rho}(k) = \frac{\sum_{t=k+1}^{T}(x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{T}(x_t - \bar{x})^2}$

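:::note

The denominator is always $\sum_{t=1}^{T}(x_t - \bar{x})^2$ (the full series), not the sum over the $T-k$ overlapping terms. The estimator is biased toward zero at large lags, but it guarantees that the estimated autocovariance matrix is positive semi-definite, which the Yule-Walker equations and other downstream computations rely on.

:::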
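The positive semi-definite claim can be checked numerically. The sketch below is illustrative only: the helper names `sample_autocov` and `toeplitz_from` are made up for this example, and the input is plain Gaussian noise. It builds the full autocovariance matrix under both divisors and prints the smallest eigenvalue of each.

```python
import numpy as np

def sample_autocov(x, biased=True):
    """Sample autocovariance at lags 0..T-1, dividing by T (biased) or T-k."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    xc = x - x.mean()
    sums = np.array([np.dot(xc[k:], xc[:T - k]) for k in range(T)])
    divisors = np.full(T, T) if biased else (T - np.arange(T))
    return sums / divisors

def toeplitz_from(gamma):
    """Matrix with entries gamma[|i - j|] (hypothetical helper for this sketch)."""
    n = len(gamma)
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return gamma[idx]

rng = np.random.default_rng(0)
x = rng.normal(size=40)

min_eig_biased = np.linalg.eigvalsh(toeplitz_from(sample_autocov(x, biased=True))).min()
min_eig_unbiased = np.linalg.eigvalsh(toeplitz_from(sample_autocov(x, biased=False))).min()

print(f"min eigenvalue, divisor T:     {min_eig_biased:.4f}")
print(f"min eigenvalue, divisor T - k: {min_eig_unbiased:.4f}")
```

With the divisor $T$ the smallest eigenvalue is non-negative up to floating-point error. With the divisor $T-k$ there is no such guarantee, because the high-lag estimates are averages over very few terms.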

The Partial Autocorrelation Function (PACF)

What PACF measures

The ACF at lag $k$ is contaminated by intermediate lags. If $X_t$ is correlated with $X_{t-1}$ and $X_{t-1}$ is correlated with $X_{t-2}$ , then $X_t$ appears correlated with $X_{t-2}$ even if there's no direct relationship.

The PACF strips out this indirect correlation - it measures the correlation between $X_t$ and $X_{t-k}$ after removing the linear influence of $X_{t-1}, X_{t-2}, \ldots, X_{t-k+1}$ .

Formally, the PACF at lag $k$ is the partial correlation:

$\phi_{kk} = \text{Corr}(X_t - \hat{X}_t^{(k)}, \; X_{t-k} - \hat{X}_{t-k}^{(k)})$

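where $\hat{X}_t^{(k)}$ and $\hat{X}_{t-k}^{(k)}$ are the linear projections of $X_t$ and $X_{t-k}$ onto the intermediate values $\{X_{t-1}, \ldots, X_{t-k+1}\}$. For $k = 1$ there are no intermediate values, so $\phi_{11} = \rho(1)$.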
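The definition translates almost literally into code: project both $X_t$ and $X_{t-k}$ onto the intermediate lags with least squares, then correlate the two residual series. The following is a minimal sketch under that reading; `pacf_by_regression` is a hypothetical helper written for this example, not a statsmodels function, and the AR(1) test series is simulated.

```python
import numpy as np

def pacf_by_regression(x, k):
    """Partial autocorrelation at lag k >= 1 via two least-squares projections."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    current = x[k:]        # X_t for t = k, ..., T-1
    lagged = x[:T - k]     # X_{t-k} for the same t
    if k == 1:
        # No intermediate lags to remove: the PACF equals the ACF at lag 1
        return np.corrcoef(current, lagged)[0, 1]
    # Columns are the intermediate lags X_{t-1}, ..., X_{t-k+1}
    Z = np.column_stack([x[k - j:T - j] for j in range(1, k)])
    res_current = current - Z @ np.linalg.lstsq(Z, current, rcond=None)[0]
    res_lagged = lagged - Z @ np.linalg.lstsq(Z, lagged, rcond=None)[0]
    return np.corrcoef(res_current, res_lagged)[0, 1]

# Simulated AR(1) with phi = 0.6: only lag 1 should have a non-zero PACF
rng = np.random.default_rng(1)
n, phi = 5000, 0.6
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

pacf_1 = pacf_by_regression(x, 1)
pacf_2 = pacf_by_regression(x, 2)
pacf_3 = pacf_by_regression(x, 3)
print(f"PACF lag 1: {pacf_1:.3f}, lag 2: {pacf_2:.3f}, lag 3: {pacf_3:.3f}")
```

The ACF of this series is roughly $0.6^k$ at every lag, yet the PACF beyond lag 1 is close to zero, because all of the lag-2 and lag-3 correlation is routed through lag 1.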

Yule-Walker Equations

The PACF can be computed via the Yule-Walker equations. For an AR(p) process:

$\begin{pmatrix} \rho(1) \\ \rho(2) \\ \vdots \\ \rho(p) \end{pmatrix} = \begin{pmatrix} 1 & \rho(1) & \cdots & \rho(p-1) \\ \rho(1) & 1 & \cdots & \rho(p-2) \\ \vdots & & \ddots & \vdots \\ \rho(p-1) & \rho(p-2) & \cdots & 1 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{pmatrix}$

The PACF at lag $k$ is the last coefficient $\phi_{kk}$ obtained by solving this system with $p = k$, that is, by fitting an AR(k) model to the series.
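As a worked check, the sketch below solves the Yule-Walker system for $k = 1, 2, \ldots$ using the theoretical ACF of an AR(2) process with $\phi_1 = 0.7$ and $\phi_2 = -0.3$. The function names are illustrative, and the starting value $\rho(1) = \phi_1 / (1 - \phi_2)$ follows from the first Yule-Walker equation for AR(2).

```python
import numpy as np

def ar2_theoretical_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2) via the Yule-Walker recursion (max_lag >= 1)."""
    rho = np.zeros(max_lag + 1)
    rho[0] = 1.0
    rho[1] = phi1 / (1.0 - phi2)
    for k in range(2, max_lag + 1):
        rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]
    return rho

def pacf_from_acf(rho, max_lag):
    """PACF at lags 0..max_lag: last coefficient of each order-k Yule-Walker solution."""
    pacf_vals = np.zeros(max_lag + 1)
    pacf_vals[0] = 1.0
    for k in range(1, max_lag + 1):
        idx = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))
        R = rho[idx]                               # k x k matrix with entries rho(|i - j|)
        phi_k = np.linalg.solve(R, rho[1:k + 1])   # AR(k) coefficients
        pacf_vals[k] = phi_k[-1]                   # phi_kk
    return pacf_vals

rho = ar2_theoretical_acf(0.7, -0.3, max_lag=5)
pacf_vals = pacf_from_acf(rho, max_lag=5)
print("ACF: ", np.round(rho, 4))
print("PACF:", np.round(pacf_vals, 4))
```

The PACF equals $\rho(1)$ at lag 1, $\phi_2 = -0.3$ at lag 2, and zero from lag 3 onward, while the ACF keeps decaying without a cutoff.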

Python Implementation: Computing ACF and PACF

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf, pacf
from scipy import stats

# ─── Generate example time series ───────────────────────────────────────────
np.random.seed(42)
n = 300

# AR(2) process: X_t = 0.7*X_{t-1} - 0.3*X_{t-2} + eps
def generate_ar2(n, phi1=0.7, phi2=-0.3, sigma=1.0):
    x = np.zeros(n)
    eps = np.random.normal(0, sigma, n)
    x[0] = eps[0]
    x[1] = phi1 * x[0] + eps[1]
    for t in range(2, n):
        x[t] = phi1 * x[t-1] + phi2 * x[t-2] + eps[t]
    return x

# MA(2) process: X_t = eps_t + 0.6*eps_{t-1} + 0.3*eps_{t-2}
def generate_ma2(n, theta1=0.6, theta2=0.3, sigma=1.0):
    eps = np.random.normal(0, sigma, n)
    x = np.zeros(n)
    x[0] = eps[0]
    x[1] = eps[1] + theta1 * eps[0]
    for t in range(2, n):
        x[t] = eps[t] + theta1 * eps[t-1] + theta2 * eps[t-2]
    return x

ar2_series = generate_ar2(n)
ma2_series = generate_ma2(n)

# ─── Compute ACF manually ────────────────────────────────────────────────────
def compute_acf_manual(x: np.ndarray, max_lag: int = 20) -> np.ndarray:
    """Manual ACF computation to understand the formula."""
    T = len(x)
    x_centered = x - x.mean()
    variance = np.sum(x_centered**2)  # denominator: always full series

    acf_values = np.zeros(max_lag + 1)
    acf_values[0] = 1.0  # lag 0 = perfect correlation with itself

    for k in range(1, max_lag + 1):
        # numerator: sum over t = k+1 to T
        numerator = np.sum(x_centered[k:] * x_centered[:-k])
        acf_values[k] = numerator / variance

    return acf_values

# Compare manual vs statsmodels
acf_manual = compute_acf_manual(ar2_series, max_lag=15)
acf_sm = acf(ar2_series, nlags=15, fft=True)
print("Manual vs statsmodels ACF (first 5 lags):")
for k in range(5):
    print(f"  lag {k}: manual={acf_manual[k]:.4f}, statsmodels={acf_sm[k]:.4f}")

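Output (illustrative; exact values depend on the random draw, but the manual and statsmodels columns should match):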

Manual vs statsmodels ACF (first 5 lags):
  lag 0: manual=1.0000, statsmodels=1.0000
  lag 1: manual=0.6843, statsmodels=0.6843
  lag 2: manual=0.3641, statsmodels=0.3641
  lag 3: manual=0.1793, statsmodels=0.1793
  lag 4: manual=0.0571, statsmodels=0.0571

# ─── Confidence Intervals for ACF ───────────────────────────────────────────
# Under null hypothesis of white noise, ACF(k) ~ N(0, 1/T) for large T
# 95% confidence band: ±1.96/sqrt(T)
T = len(ar2_series)
ci_band = 1.96 / np.sqrt(T)
print(f"\n95% confidence band for white noise: ±{ci_band:.4f}")
print(f"Series length T={T}")

# Bartlett's formula for confidence intervals (more accurate)
# For MA(q) process, Var(rho_hat(k)) ≈ (1/T) * sum_{j=-q}^{q} rho(j)^2 for k > q
acf_with_ci = acf(ar2_series, nlags=20, alpha=0.05)
acf_vals, acf_ci = acf_with_ci[0], acf_with_ci[1]
print(f"\nACF at lag 5: {acf_vals[5]:.4f}")
print(f"95% CI: [{acf_ci[5,0]:.4f}, {acf_ci[5,1]:.4f}]")
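The comment above mentions Bartlett's formula without spelling it out. As a rough sketch (the helper name `bartlett_se` and the example ACF values are made up for illustration), the standard error at lag $k$ under the hypothesis that the process is MA(k-1) uses only the lower-order sample autocorrelations:

```python
import numpy as np

def bartlett_se(acf_vals, T):
    """
    Bartlett standard errors for lags 1..K, given acf_vals[0..K].

    se at lag k = sqrt((1 + 2 * sum_{j=1}^{k-1} rho_j^2) / T),
    so lag 1 reduces to the white-noise value 1/sqrt(T).
    """
    acf_vals = np.asarray(acf_vals, dtype=float)
    K = len(acf_vals) - 1
    # cumulative sums of rho_j^2 for j = 1..k-1, with 0 for k = 1
    cum = np.concatenate([[0.0], np.cumsum(acf_vals[1:K] ** 2)])
    return np.sqrt((1.0 + 2.0 * cum) / T)

acf_example = np.array([1.0, 0.5, 0.2, 0.05, 0.0])  # made-up sample ACF, lags 0..4
se = bartlett_se(acf_example, T=100)
for lag, s in enumerate(se, start=1):
    print(f"lag {lag}: band = ±{1.96 * s:.4f}")
```

The band widens with the lag whenever earlier autocorrelations are non-zero, which is why a flat $\pm 1.96/\sqrt{T}$ line is only exact for white noise.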

Lag Plots: Visual Autocorrelation Diagnostics

Before plotting ACF/PACF, lag plots are an intuitive first look.

def lag_plot(x: np.ndarray, lags: tuple = (1, 5, 10), title: str = "Lag Plot") -> None:
    """Plot X_{t+k} against X_t for each lag k to visually assess autocorrelation."""
    # squeeze=False keeps `axes` two-dimensional even when a single lag is requested
    fig, axes = plt.subplots(1, len(lags), figsize=(14, 4), squeeze=False)

    for ax, k in zip(axes[0], lags):
        ax.scatter(x[:-k], x[k:], alpha=0.3, s=10)

        # Fit line to visualize correlation
        slope, intercept, r, p, _ = stats.linregress(x[:-k], x[k:])
        x_line = np.linspace(x.min(), x.max(), 100)
        ax.plot(x_line, slope * x_line + intercept, 'r-', linewidth=2)

        ax.set_xlabel(f'$X_t$')
        ax.set_ylabel(f'$X_{{t+{k}}}$')
        ax.set_title(f'Lag {k}: r = {r:.3f}')
        ax.grid(True, alpha=0.3)

    plt.suptitle(title, fontsize=13, fontweight='bold')
    plt.tight_layout()
    plt.savefig(f'lag_plot.png', dpi=120, bbox_inches='tight')

lag_plot(ar2_series, title="AR(2) Series - Lag Plots")

What to look for in lag plots:

Pattern	Interpretation
Linear trend, positive slope	Positive autocorrelation at this lag
Linear trend, negative slope	Negative autocorrelation (oscillating series)
Random cloud	No autocorrelation - white noise at this lag
Non-linear pattern	Non-linear temporal dependence
Clusters	Regime changes, structural breaks

The Correlogram: ACF + PACF Side-by-Side

The correlogram is the standard diagnostic plot for time series model identification.

def plot_correlogram(
    series: np.ndarray | pd.Series,
    title: str = "Correlogram",
    lags: int = 30,
    figsize: tuple = (14, 6)
) -> None:
    """Plot ACF and PACF side by side - the standard time series diagnostic."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=figsize)

    plot_acf(series, lags=lags, ax=ax1, alpha=0.05)
    ax1.set_title(f'ACF - {title}')
    ax1.axhline(y=0, color='black', linewidth=0.5)

    plot_pacf(series, lags=lags, ax=ax2, alpha=0.05, method='ywmle')
    ax2.set_title(f'PACF - {title}')
    ax2.axhline(y=0, color='black', linewidth=0.5)

    plt.tight_layout()
    plt.savefig(f'correlogram_{title.lower().replace(" ", "_")}.png',
                dpi=120, bbox_inches='tight')

# Compare AR(2) vs MA(2) correlograms
plot_correlogram(ar2_series, title="AR(2) Process")
plot_correlogram(ma2_series, title="MA(2) Process")

The Key Pattern: Reading ACF and PACF

This is the most important table in time series model identification:

Process	ACF Pattern	PACF Pattern
AR(p)	Decays exponentially (or with damped oscillations)	Cuts off after lag p
MA(q)	Cuts off after lag q	Decays exponentially
ARMA(p,q)	Decays after lag q	Decays after lag p
Random walk (non-stationary)	Decays very slowly (nearly constant)	Spike only at lag 1
White noise	All lags near zero	All lags near zero
Seasonal	Spikes at multiples of season s	Spikes at multiples of s

AR(p) process

An AR(p) process: $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \epsilon_t$

ACF: Decays to zero geometrically, or as a damped oscillation when the characteristic roots are negative or complex
PACF: Sharp cutoff after lag $p$ - this is the key diagnostic

Intuition: In AR(p), only the last $p$ values directly influence $X_t$ . The PACF strips out indirect correlations and reveals exactly $p$ non-zero lags.

MA(q) process

An MA(q) process: $X_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$

ACF: Sharp cutoff after lag $q$ - this is the key diagnostic
PACF: Decays to zero geometrically

Intuition: In MA(q), $X_t$ depends only on the shocks $\epsilon_t, \ldots, \epsilon_{t-q}$. Two observations more than $q$ steps apart share no shocks, so the theoretical ACF is exactly zero for lags $> q$.
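The cutoff can be computed directly from the coefficients. Writing $\theta_0 = 1$, the autocovariance is $\gamma(k) = \sigma^2 \sum_{j=0}^{q-k} \theta_j \theta_{j+k}$ for $k \leq q$ and zero afterwards. A small sketch, with `ma_theoretical_acf` as an illustrative helper name:

```python
import numpy as np

def ma_theoretical_acf(thetas, max_lag):
    """Theoretical ACF of X_t = eps_t + theta_1 eps_{t-1} + ... + theta_q eps_{t-q}."""
    psi = np.concatenate([[1.0], np.asarray(thetas, dtype=float)])  # theta_0 = 1
    q = len(psi) - 1
    gamma = np.zeros(max_lag + 1)
    for k in range(min(q, max_lag) + 1):
        # overlap of the coefficient sequence with itself shifted by k
        gamma[k] = np.dot(psi[k:], psi[:len(psi) - k])
    return gamma / gamma[0]   # sigma^2 cancels in the ratio

# Same coefficients as the MA(2) example series: theta_1 = 0.6, theta_2 = 0.3
rho = ma_theoretical_acf([0.6, 0.3], max_lag=5)
print(np.round(rho, 4))
```

For these coefficients $\rho(1) = 0.78 / 1.45 \approx 0.538$ and $\rho(2) = 0.3 / 1.45 \approx 0.207$, and every later lag is exactly zero. A sample ACF will only approximate this: lags beyond $q$ fall inside the confidence band rather than hitting zero.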

def identify_arima_order(series: np.ndarray, max_lags: int = 20) -> dict:
    """
    Heuristic ARIMA order identification from ACF/PACF.

    Rules:
    - ACF cuts off at q, PACF decays → MA(q)
    - PACF cuts off at p, ACF decays → AR(p)
    - Both decay → ARMA(p, q) - use AIC/BIC to find exact order
    """
    T = len(series)
    ci = 1.96 / np.sqrt(T)  # approximate 95% band

    acf_vals = acf(series, nlags=max_lags, fft=True)[1:]  # skip lag 0
    pacf_vals = pacf(series, nlags=max_lags, method='ywmle')[1:]

    # Find where ACF/PACF become insignificant
    acf_significant = np.abs(acf_vals) > ci
    pacf_significant = np.abs(pacf_vals) > ci

    # Number of leading lags before the values become consistently insignificant
    def find_cutoff(sig_array, consecutive=3):
        # +1 so the final window of length `consecutive` is also checked
        for i in range(len(sig_array) - consecutive + 1):
            if not any(sig_array[i:i+consecutive]):
                return i
        return max_lags  # no clear cutoff

    acf_cutoff = find_cutoff(acf_significant)
    pacf_cutoff = find_cutoff(pacf_significant)

    # Decay score: if significant lags are spread out, it's decaying not cutting off
    acf_lags_sig = np.sum(acf_significant[:max_lags//2])
    pacf_lags_sig = np.sum(pacf_significant[:max_lags//2])

    suggestion = {}

    if acf_cutoff <= 3 and pacf_lags_sig > 3:
        suggestion['type'] = 'MA'
        suggestion['q'] = acf_cutoff
        suggestion['p'] = 0
        suggestion['reasoning'] = f'ACF cuts off at lag {acf_cutoff}, PACF decays'
    elif pacf_cutoff <= 3 and acf_lags_sig > 3:
        suggestion['type'] = 'AR'
        suggestion['p'] = pacf_cutoff
        suggestion['q'] = 0
        suggestion['reasoning'] = f'PACF cuts off at lag {pacf_cutoff}, ACF decays'
    else:
        suggestion['type'] = 'ARMA'
        suggestion['p'] = min(pacf_cutoff, 3)
        suggestion['q'] = min(acf_cutoff, 3)
        suggestion['reasoning'] = 'Both ACF and PACF decay - use AIC/BIC grid search'

    return suggestion

# Test identification
print("AR(2) identification:")
result = identify_arima_order(ar2_series)
print(f"  Suggested model: {result['type']}(p={result.get('p',0)}, q={result.get('q',0)})")
print(f"  Reasoning: {result['reasoning']}")

print("\nMA(2) identification:")
result = identify_arima_order(ma2_series)
print(f"  Suggested model: {result['type']}(p={result.get('p',0)}, q={result.get('q',0)})")
print(f"  Reasoning: {result['reasoning']}")

The Ljung-Box Test: Is the Series White Noise?

The Ljung-Box test is a portmanteau test for autocorrelation. It tests the null hypothesis that the first $m$ autocorrelations are all zero (i.e., the series is white noise).

$Q_{LB} = T(T+2) \sum_{k=1}^{m} \frac{\hat{\rho}(k)^2}{T-k} \sim \chi^2(m)$

Under $H_0$ : series is white noise, $Q_{LB}$ follows a chi-squared distribution with $m$ degrees of freedom.
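To see what the statistic is made of, here is a NumPy-only sketch of the formula. The helper names are illustrative, and the critical value 18.307 is the tabulated 95th percentile of $\chi^2(10)$, hard-coded so the example needs no extra dependency.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF at lags 0..max_lag with the full-series denominator."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[k:], xc[:len(xc) - k]) / denom
                     for k in range(max_lag + 1)])

def ljung_box_q(x, m):
    """Q_LB = T (T + 2) * sum_{k=1}^{m} rho_hat(k)^2 / (T - k)."""
    T = len(x)
    rho = sample_acf(x, m)[1:]
    lags = np.arange(1, m + 1)
    return T * (T + 2) * np.sum(rho ** 2 / (T - lags))

CHI2_95_DF10 = 18.307  # tabulated 95th percentile of chi-squared with 10 d.o.f.

rng = np.random.default_rng(7)
T = 500
noise = rng.normal(size=T)
ar1 = np.zeros(T)
for t in range(1, T):
    ar1[t] = 0.8 * ar1[t - 1] + noise[t]   # AR(1) driven by the same shocks

q_noise = ljung_box_q(noise, m=10)
q_ar1 = ljung_box_q(ar1, m=10)
print(f"Q (white noise): {q_noise:.2f}  vs critical value {CHI2_95_DF10}")
print(f"Q (AR(1), 0.8):  {q_ar1:.2f}  vs critical value {CHI2_95_DF10}")
```

The AR(1) statistic exceeds the critical value by a wide margin because each squared autocorrelation is scaled by roughly $T$. For white noise the statistic is of the same order as $m$, though about 5% of white-noise samples will still exceed the critical value by chance.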

from statsmodels.stats.diagnostic import acorr_ljungbox

def ljung_box_test(series: np.ndarray, lags: list = None, verbose: bool = True) -> pd.DataFrame:
    """
    Ljung-Box test for white noise.

    H0: series is white noise (no autocorrelation up to lag m)
    H1: series has significant autocorrelation

    Reject H0 (p < 0.05) → series is NOT white noise.
    """
    if lags is None:
        lags = [5, 10, 15, 20]

    results = acorr_ljungbox(series, lags=lags, return_df=True)

    if verbose:
        print("Ljung-Box Test for White Noise")
        print("H0: series is white noise | H1: autocorrelation present")
        print("-" * 55)
        print(f"{'Lag':>5} | {'LB Statistic':>14} | {'p-value':>10} | {'Decision':>12}")
        print("-" * 55)
        for lag, row in results.iterrows():
            decision = "REJECT H0" if row['lb_pvalue'] < 0.05 else "Fail to reject"
            print(f"{lag:>5} | {row['lb_stat']:>14.4f} | {row['lb_pvalue']:>10.4f} | {decision:>12}")

    return results

print("=== AR(2) Series ===")
lb_ar2 = ljung_box_test(ar2_series)

print("\n=== Random White Noise ===")
white_noise = np.random.normal(0, 1, 300)
lb_wn = ljung_box_test(white_noise)

Output (illustrative; exact statistics depend on the random draw):

=== AR(2) Series ===
Ljung-Box Test for White Noise
H0: series is white noise | H1: autocorrelation present
-------------------------------------------------------
  Lag | LB Statistic |    p-value |     Decision
-------------------------------------------------------
    5 |      259.483 |      0.000 |    REJECT H0
   10 |      271.942 |      0.000 |    REJECT H0
   15 |      274.853 |      0.000 |    REJECT H0
   20 |      276.130 |      0.000 |    REJECT H0

=== Random White Noise ===
Ljung-Box Test for White Noise
H0: series is white noise | H1: autocorrelation present
-------------------------------------------------------
  Lag | LB Statistic |    p-value |     Decision
-------------------------------------------------------
    5 |        3.201 |      0.669 | Fail to reject
   10 |        8.433 |      0.588 | Fail to reject
   15 |       15.122 |      0.443 | Fail to reject
   20 |       21.841 |      0.351 | Fail to reject

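:::tip When to use Ljung-Box

After fitting an ARIMA model, apply Ljung-Box to the residuals. If the residuals are white noise (fail to reject H0), the model has captured the linear temporal dependence. If not, the model is misspecified: add AR/MA terms. For residuals of a fitted ARMA(p,q), the reference distribution has $m - p - q$ degrees of freedom rather than $m$; acorr_ljungbox accepts this adjustment through its model_df argument.

:::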

Autocorrelation in ML: Data Leakage Detection

The train/test split problem

Autocorrelation is what makes random train/test splits invalid for time series:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def demonstrate_leakage_via_autocorrelation(n: int = 1000) -> None:
    """
    Show how autocorrelated data causes optimistic scores with random splits.

    The model is a k-nearest-neighbors regressor on the time index, chosen
    because it can copy temporally adjacent training points. A one-lag linear
    regression would not show the effect: its RMSE stays near the innovation
    standard deviation (1.0 here) under either split.
    """
    # Generate highly autocorrelated series (AR(1) with phi=0.95)
    x = np.zeros(n)
    eps = np.random.normal(0, 1, n)
    for t in range(1, n):
        x[t] = 0.95 * x[t-1] + eps[t]

    # Feature: time index; target: series value at that time
    X = np.arange(n, dtype=float).reshape(-1, 1)
    y = x

    # Method 1: Random split (WRONG for time series)
    X_train_r, X_test_r, y_train_r, y_test_r = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model_r = KNeighborsRegressor(n_neighbors=5).fit(X_train_r, y_train_r)
    rmse_random = np.sqrt(mean_squared_error(y_test_r, model_r.predict(X_test_r)))

    # Method 2: Temporal split (CORRECT)
    split = int(0.8 * len(X))
    X_train_t, X_test_t = X[:split], X[split:]
    y_train_t, y_test_t = y[:split], y[split:]
    model_t = KNeighborsRegressor(n_neighbors=5).fit(X_train_t, y_train_t)
    rmse_temporal = np.sqrt(mean_squared_error(y_test_t, model_t.predict(X_test_t)))

    print(f"AR(1) series with phi=0.95:")
    print(f"  ACF at lag 1: {np.corrcoef(x[:-1], x[1:])[0,1]:.3f}")
    print(f"\n  RMSE (random split):   {rmse_random:.4f}  ← OPTIMISTIC (leakage!)")
    print(f"  RMSE (temporal split): {rmse_temporal:.4f}  ← HONEST estimate")
    print(f"\n  Bias from random split: {(rmse_temporal - rmse_random)/rmse_temporal:.1%} underestimate")

demonstrate_leakage_via_autocorrelation()

Expected output: the lag-1 ACF prints near 0.95, and the random-split RMSE comes out well below the temporal-split RMSE. Under the random split, every test point sits a step or two away from training points with nearly the same value. Under the temporal split, the model has to extrapolate past its last training point.

Using ACF to detect feature leakage

def check_feature_autocorrelation(
    features: pd.DataFrame,
    target: pd.Series,
    lag: int = 1
) -> pd.DataFrame:
    """
    Check autocorrelation in features and target.

    High autocorrelation in features means:
    1. Random CV will be too optimistic
    2. Features from future can bleed into training if not careful
    3. Need time-series cross-validation
    """
    results = []

    for col in features.columns:
        series = features[col].values
        acf_val = np.corrcoef(series[:-lag], series[lag:])[0, 1]
        results.append({'feature': col, f'acf_lag_{lag}': acf_val})

    # Also check target
    target_acf = np.corrcoef(target.values[:-lag], target.values[lag:])[0, 1]
    results.append({'feature': 'TARGET', f'acf_lag_{lag}': target_acf})

    df = pd.DataFrame(results).sort_values(f'acf_lag_{lag}', key=abs, ascending=False)

    print(f"Autocorrelation at lag {lag}:")
    print(f"{'Feature':<25} | {'ACF':>8} | {'Risk':>10}")
    print("-" * 50)
    for _, row in df.iterrows():
        acf_v = row[f'acf_lag_{lag}']
        risk = "HIGH" if abs(acf_v) > 0.5 else ("MEDIUM" if abs(acf_v) > 0.2 else "LOW")
        print(f"{row['feature']:<25} | {acf_v:>8.3f} | {risk:>10}")

    return df

Seasonal Autocorrelation

Seasonal series show spikes in ACF at multiples of the season length.

def generate_seasonal_series(n: int = 500, period: int = 12) -> np.ndarray:
    """Generate a series with linear trend, sinusoidal seasonality, and AR(1) noise."""
    t = np.arange(n)
    trend = 0.02 * t
    seasonal = 2.0 * np.sin(2 * np.pi * t / period)
    noise_ar = generate_ar2(n, phi1=0.5, phi2=0.0)
    return trend + seasonal + noise_ar

seasonal_series = generate_seasonal_series(n=500, period=12)

# The linear trend makes the ACF decay slowly; the seasonal peaks at lags 12, 24, 36... ride on top of that decay
acf_seasonal = acf(seasonal_series, nlags=48, fft=True)

print("ACF at seasonal lags (period=12):")
for lag in [1, 6, 12, 24, 36, 48]:
    if lag < len(acf_seasonal):
        print(f"  lag {lag:>3}: {acf_seasonal[lag]:>7.4f}",
              end=" ← seasonal spike\n" if lag % 12 == 0 else "\n")

Cross-Correlation: ACF Between Two Series

When you have two time series, cross-correlation measures how $X_t$ is correlated with $Y_{t-k}$ :

$\rho_{XY}(k) = \frac{\text{Cov}(X_t, Y_{t-k})}{\sqrt{\text{Var}(X_t)\text{Var}(Y_t)}}$

from statsmodels.tsa.stattools import ccf

def cross_correlation_analysis(x: np.ndarray, y: np.ndarray, max_lag: int = 20) -> None:
    """
    Cross-correlation between two time series.

    Reports Corr(Y_{t+k}, X_t) for k = 0, 1, ..., max_lag.
    A significant value at lag k > 0 means X leads Y by k periods.
    To test whether Y leads X, call the function with the arguments swapped.

    Use case: feature selection - does feature X predict target Y?
    """
    # statsmodels' ccf(a, b)[k] correlates a[t+k] with b[t], so y goes first.
    # `adjusted` replaced the older `unbiased` keyword.
    ccf_vals = ccf(y, x, adjusted=False)
    T = len(x)
    ci = 1.96 / np.sqrt(T)

    print(f"Cross-Correlation Analysis (95% CI: ±{ci:.3f})")
    print(f"{'Lag':>5} | {'CCF':>8} | {'Significant':>12}")
    print("-" * 35)
    for k in range(min(max_lag + 1, len(ccf_vals))):
        sig = "YES" if abs(ccf_vals[k]) > ci else "no"
        print(f"{k:>5} | {ccf_vals[k]:>8.4f} | {sig:>12}")

# Example: lagged relationship - y follows x with 3-period lag
np.random.seed(0)
x_signal = np.random.normal(0, 1, 200)
y_lagged = np.zeros(200)
for t in range(3, 200):
    y_lagged[t] = 0.8 * x_signal[t-3] + np.random.normal(0, 0.3)

cross_correlation_analysis(x_signal, y_lagged, max_lag=10)
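Lag sign conventions differ between libraries, so it can help to confirm a lead-lag direction with a few lines that do not depend on any of them. The sketch below is NumPy-only; `lagged_corr` is a hypothetical helper, and the data mimic the example above (y follows x with a 3-period delay).

```python
import numpy as np

def lagged_corr(leader, follower, max_lag):
    """Corr(follower[t + k], leader[t]) for k = 0..max_lag."""
    leader = np.asarray(leader, dtype=float)
    follower = np.asarray(follower, dtype=float)
    n = len(leader)
    return np.array([
        np.corrcoef(follower[k:], leader[:n - k])[0, 1]
        for k in range(max_lag + 1)
    ])

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.zeros(200)
y[3:] = 0.8 * x[:-3] + rng.normal(0, 0.3, size=197)   # y_t = 0.8 * x_{t-3} + noise

corrs = lagged_corr(x, y, max_lag=10)
best_lag = int(np.argmax(np.abs(corrs)))
print(f"strongest correlation at lag {best_lag}: {corrs[best_lag]:.3f}")
```

The peak lands at lag 3, the delay built into the simulation. Swapping the two arguments asks the opposite question (does y lead x?) and should show no comparable peak here.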

Full Diagnostic Pipeline

def full_acf_diagnostic(
    series: pd.Series | np.ndarray,
    name: str = "Series",
    lags: int = 30,
    significance_level: float = 0.05
) -> dict:
    """
    Complete ACF/PACF diagnostic pipeline for time series model selection.

    Returns:
        dict with ACF analysis, model suggestions, and white noise test
    """
    from statsmodels.tsa.stattools import adfuller

    if isinstance(series, pd.Series):
        x = series.values
    else:
        x = series.copy()

    T = len(x)
    ci = stats.norm.ppf(1 - significance_level/2) / np.sqrt(T)

    # 1. Stationarity check first
    adf_result = adfuller(x)
    is_stationary = adf_result[1] < significance_level

    # 2. ACF and PACF
    acf_vals = acf(x, nlags=lags, fft=True, alpha=significance_level)
    pacf_vals = pacf(x, nlags=lags, method='ywmle', alpha=significance_level)

    if isinstance(acf_vals, tuple):
        acf_arr, acf_confint = acf_vals[0], acf_vals[1]
    else:
        acf_arr = acf_vals
        acf_confint = None

    if isinstance(pacf_vals, tuple):
        pacf_arr, pacf_confint = pacf_vals[0], pacf_vals[1]
    else:
        pacf_arr = pacf_vals
        pacf_confint = None

    # 3. Ljung-Box test
    lb_test = acorr_ljungbox(x, lags=[10, 20], return_df=True)

    # 4. Find significant lags
    sig_acf_lags = [k for k in range(1, len(acf_arr)) if abs(acf_arr[k]) > ci]
    sig_pacf_lags = [k for k in range(1, len(pacf_arr)) if abs(pacf_arr[k]) > ci]

    # 5. Seasonal detection: check for periodic spikes
    seasonal_candidates = []
    for s in [4, 7, 12, 24, 52]:
        if s <= lags and abs(acf_arr[s]) > ci:
            seasonal_candidates.append(s)

    # 6. Model suggestion
    if not is_stationary:
        model_suggestion = "Series is non-stationary - apply differencing before ARIMA identification"
    elif len(sig_acf_lags) == 0:
        model_suggestion = "White noise - no model needed"
    else:
        model_suggestion = identify_arima_order(x, max_lags=min(lags, 20))

    results = {
        'name': name,
        'n': T,
        'is_stationary': is_stationary,
        'adf_pvalue': adf_result[1],
        'acf': acf_arr,
        'pacf': pacf_arr,
        'ci_band': ci,
        'significant_acf_lags': sig_acf_lags[:10],
        'significant_pacf_lags': sig_pacf_lags[:5],
        'seasonal_periods_detected': seasonal_candidates,
        'ljung_box_p10': lb_test['lb_pvalue'].iloc[0],
        'model_suggestion': model_suggestion
    }

    # Print summary
    print(f"\n{'='*60}")
    print(f"ACF Diagnostic: {name} (n={T})")
    print(f"{'='*60}")
    print(f"Stationary (ADF): {'YES' if is_stationary else 'NO'} (p={adf_result[1]:.4f})")
    print(f"95% CI band: ±{ci:.4f}")
    print(f"Significant ACF lags: {sig_acf_lags[:10]}")
    print(f"Significant PACF lags: {sig_pacf_lags[:5]}")
    print(f"Seasonal periods detected: {seasonal_candidates}")
    print(f"Ljung-Box p (lag 10): {results['ljung_box_p10']:.4f} "
          f"({'NOT white noise' if results['ljung_box_p10'] < 0.05 else 'white noise'})")
    if isinstance(model_suggestion, dict):
        print(f"Model suggestion: {model_suggestion['type']}({model_suggestion['p']}, {model_suggestion['q']})")
        print(f"Reasoning: {model_suggestion['reasoning']}")
    else:
        print(f"Model suggestion: {model_suggestion}")

    return results

# Run on all series
_ = full_acf_diagnostic(ar2_series, name="AR(2) Process")
_ = full_acf_diagnostic(ma2_series, name="MA(2) Process")
_ = full_acf_diagnostic(np.random.normal(0, 1, 300), name="White Noise")

Real-World Application: S&P 500 Returns

def analyze_financial_returns(prices: np.ndarray) -> None:
    """
    Classic financial time series analysis.
    Returns are approximately white noise (efficient market hypothesis).
    Squared returns are autocorrelated (volatility clustering).
    """
    returns = np.diff(np.log(prices))  # log returns
    squared_returns = returns**2
    abs_returns = np.abs(returns)

    print("Financial Time Series Stylized Facts:")
    print("=" * 55)

    # ACF of returns vs squared returns
    acf_returns = acf(returns, nlags=10, fft=True)
    acf_sq_returns = acf(squared_returns, nlags=10, fft=True)

    T = len(returns)
    ci = 1.96 / np.sqrt(T)

    print(f"\nACF of Returns (should be ~0 for efficient markets):")
    sig_count = sum(abs(acf_returns[1:11]) > ci)
    print(f"  Significant lags (1-10): {sig_count}/10")

    print(f"\nACF of Squared Returns (volatility clustering):")
    sig_count_sq = sum(abs(acf_sq_returns[1:11]) > ci)
    print(f"  Significant lags (1-10): {sig_count_sq}/10")

    lb_returns = acorr_ljungbox(returns, lags=[10], return_df=True)
    lb_sq = acorr_ljungbox(squared_returns, lags=[10], return_df=True)

    print(f"\nLjung-Box p-value (lag=10):")
    print(f"  Returns:         {lb_returns['lb_pvalue'].iloc[0]:.4f} "
          f"({'white noise' if lb_returns['lb_pvalue'].iloc[0] > 0.05 else 'autocorrelated'})")
    print(f"  Squared returns: {lb_sq['lb_pvalue'].iloc[0]:.4f} "
          f"({'white noise' if lb_sq['lb_pvalue'].iloc[0] > 0.05 else 'autocorrelated → GARCH needed'})")

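# Simulate GBM prices as a stand-in for real index data.
# GBM log returns are i.i.d., so this simulation has no volatility clustering:
# both Ljung-Box tests should usually fail to reject. Real S&P 500 returns
# typically show autocorrelated squared returns.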
np.random.seed(42)
dt = 1/252  # daily
mu, sigma = 0.10, 0.20
W = np.random.normal(0, np.sqrt(dt), 1000)
log_returns_sim = (mu - 0.5*sigma**2)*dt + sigma*W
prices_sim = 100 * np.exp(np.cumsum(log_returns_sim))
prices_sim = np.concatenate([[100], prices_sim])

analyze_financial_returns(prices_sim)
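Because GBM log returns are i.i.d., the simulation above cannot exhibit the volatility clustering described in the docstring. A minimal sketch of a process that does is an ARCH(1) model, in which today's variance depends on yesterday's squared return. The parameters `omega` and `alpha` below are arbitrary illustrative choices, not estimates from market data.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF at lags 0..max_lag with the full-series denominator."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[k:], xc[:len(xc) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(21)
n = 20_000
omega, alpha = 0.8, 0.2   # hypothetical ARCH(1) parameters; unconditional variance = 1
z = rng.normal(size=n)
r = np.zeros(n)
for t in range(1, n):
    sigma2_t = omega + alpha * r[t - 1] ** 2   # conditional variance
    r[t] = np.sqrt(sigma2_t) * z[t]

acf_r = sample_acf(r, 10)
acf_r2 = sample_acf(r ** 2, 10)
print(f"ACF of returns, lag 1:         {acf_r[1]:.3f}")
print(f"ACF of squared returns, lag 1: {acf_r2[1]:.3f}")
```

The returns themselves are uncorrelated, while the squared returns have a lag-1 autocorrelation near `alpha`. That combination is the pattern `analyze_financial_returns` is designed to flag.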

Interview Questions

Q1: What is the difference between ACF and PACF, and when do you use each?

ACF (Autocorrelation Function) measures the total correlation between $X_t$ and $X_{t-k}$ , including indirect relationships through intermediate lags.

PACF (Partial Autocorrelation Function) measures the direct correlation between $X_t$ and $X_{t-k}$ , after removing the linear influence of all intervening lags $X_{t-1}, \ldots, X_{t-k+1}$ .

Use ACF to identify the MA order: in a pure MA(q) process, ACF cuts off sharply after lag $q$ .

Use PACF to identify the AR order: in a pure AR(p) process, PACF cuts off sharply after lag $p$ .

Reading the ACF and PACF together is the identification step of the Box-Jenkins methodology for choosing ARIMA orders. The key patterns:

ACF decays + PACF cuts off at p → AR(p)
ACF cuts off at q + PACF decays → MA(q)
Both decay → ARMA(p,q), use information criteria to determine orders

Q2: Why is random train/test split invalid for autocorrelated time series? What should you use instead?

Random splitting violates the temporal order assumption in two ways:

Future information leaks into training: If we shuffle and split randomly, a training sample at time $t+10$ can be used to predict a test sample at time $t$. For a time series with high autocorrelation, $X_{t+10}$ contains information about $X_t$, so we're effectively using future values to inform the model.
Training and test correlation inflates scores: Points that are temporally close to training points appear in the test set. Since autocorrelated series have similar values at close times, the model appears to generalize well - but it's just correlating to neighboring training points.

Correct alternatives:

Temporal split: 80% of earliest data for training, 20% of latest for test
Time series cross-validation (walk-forward): Train on [0,t], predict (t,t+h], advance t, repeat
Purging: Remove test samples that are too close in time to training samples (to prevent autocorrelation leakage)
Embargoing: Add a buffer zone between train and test with no samples from either
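A walk-forward scheme with a buffer between train and test can be sketched in a few lines. This is one possible layout, not a reference implementation: `walk_forward_splits` and its parameters are names invented for this example, the training window expands over time, and `gap` is the number of samples dropped just before each test block.

```python
import numpy as np

def walk_forward_splits(n_samples, n_folds, test_size, gap=0):
    """
    Yield (train_idx, test_idx) pairs in temporal order.

    Each test block follows its training window, with `gap` samples
    dropped in between to limit leakage through autocorrelation.
    """
    first_test_start = n_samples - n_folds * test_size
    if first_test_start - gap <= 0:
        raise ValueError("not enough samples for the requested folds, test_size, and gap")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        train_idx = np.arange(0, test_start - gap)
        test_idx = np.arange(test_start, test_start + test_size)
        yield train_idx, test_idx

splits = list(walk_forward_splits(n_samples=100, n_folds=3, test_size=10, gap=5))
for train_idx, test_idx in splits:
    print(f"train: 0..{train_idx[-1]}  test: {test_idx[0]}..{test_idx[-1]}")
```

A reasonable starting point for `gap` is the lag at which the ACF of the target drops inside the confidence band, so that training and test samples are no longer noticeably correlated.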

Q3: A colleague says "I tested for autocorrelation and my series is white noise, so I don't need to worry about temporal dependence." Is this correct?

Partially correct, but with an important caveat: absence of linear autocorrelation does not imply absence of all temporal dependence.

The ACF only measures linear dependence. A series can have zero ACF at all lags but still exhibit:

Volatility clustering (ARCH effects): Returns are uncorrelated but squared returns are autocorrelated. This is the signature of financial data - returns look like white noise in ACF, but $|r_t|$ and $r_t^2$ are autocorrelated. Need GARCH models.
Non-linear dependencies: A process like $X_t = \epsilon_t \cdot \epsilon_{t-1}$ has zero mean, zero ACF, but is temporally dependent (multiplied by previous noise).
Tail dependence: Extreme values cluster together even when the linear ACF is zero.

Better tests:

Ljung-Box on $X_t^2$ (tests for ARCH effects)
McLeod-Li test (same idea)
BDS test (general non-linear dependence)
Mutual information between $X_t$ and $X_{t-k}$ (non-linear correlation)

The correct statement: "Linear autocorrelation is absent; my series could still require GARCH-type models for variance."
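The product process mentioned above makes a compact demonstration. In this sketch (simulated Gaussian noise, illustrative helper name `sample_acf`), $X_t = \epsilon_t \cdot \epsilon_{t-1}$ has a flat ACF, while its squares are clearly autocorrelated at lag 1 because consecutive values share one noise term.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF at lags 0..max_lag with the full-series denominator."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[k:], xc[:len(xc) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(11)
n = 50_000
eps = rng.normal(size=n + 1)
x = eps[1:] * eps[:-1]          # X_t = eps_t * eps_{t-1}

acf_x = sample_acf(x, 5)
acf_x_squared = sample_acf(x ** 2, 5)
print("ACF of X:  ", np.round(acf_x[1:], 3))
print("ACF of X^2:", np.round(acf_x_squared[1:], 3))
```

For standard normal noise the theoretical lag-1 autocorrelation of $X_t^2$ is $2/8 = 0.25$, and it is zero at higher lags. A Ljung-Box test on $X_t$ would typically pass, while the same test on $X_t^2$ would reject.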

Q4: How do you use ACF and PACF after fitting a model?

After fitting any time series model (ARIMA, LSTM, etc.), compute ACF and PACF on the residuals.

Goal: Residuals should be white noise - no remaining autocorrelation.

Interpretation of residual ACF/PACF:

Residuals are white noise (no significant lags, Ljung-Box fails to reject): Model is adequate - it has captured all temporal structure.
Residual ACF has significant spike at lag q: MA term of order q is missing. Add MA(q) component.
Residual PACF has significant spike at lag p: AR term of order p is missing. Add AR(p) component.
Both decay slowly: Model is fundamentally misspecified. Consider differencing, structural breaks, or a different model family.

For neural network forecasters: even deep learning models should have white-noise residuals. Persistent autocorrelation in LSTM residuals often means the sequence length (lookback window) is too short.
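The residual check can be illustrated without any modeling library. The sketch below fits AR(1) and AR(2) models by least squares to a simulated AR(2) series and compares the residual autocorrelations. The helper names are hypothetical, and least squares stands in for whatever estimator a real ARIMA fit would use.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF at lags 0..max_lag with the full-series denominator."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[k:], xc[:len(xc) - k]) / denom
                     for k in range(max_lag + 1)])

def fit_ar_least_squares(x, p):
    """Fit AR(p) to the demeaned series; return (coefficients, residuals)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    design = np.column_stack([x[p - j:T - j] for j in range(1, p + 1)])  # lags 1..p
    target = x[p:]
    coef = np.linalg.lstsq(design, target, rcond=None)[0]
    return coef, target - design @ coef

# Simulated AR(2): X_t = 0.7 X_{t-1} - 0.3 X_{t-2} + eps_t
rng = np.random.default_rng(5)
n = 5000
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.7 * x[t - 1] - 0.3 * x[t - 2] + eps[t]

band = 1.96 / np.sqrt(n)
max_resid_acf = {}
for p in (1, 2):
    coef, resid = fit_ar_least_squares(x, p)
    max_resid_acf[p] = np.max(np.abs(sample_acf(resid, 5)[1:]))
    print(f"AR({p}) fit: coef={np.round(coef, 3)}, "
          f"max |residual ACF| (lags 1-5) = {max_resid_acf[p]:.3f}, band = ±{band:.3f}")
```

The underfitted AR(1) model leaves residual autocorrelation well outside the band at lags 1 and 2, which is the signal to add a term. The AR(2) residuals stay close to the band, as expected for an adequate model.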

Q5: What does a spike in ACF at lag 12 tell you, and how does it affect your ARIMA model?

A significant spike in ACF at lag 12 (and possibly at 24, 36) indicates seasonal autocorrelation with period 12. This is extremely common in monthly data (monthly sales, monthly temperatures, monthly energy consumption).

Implications:

A standard ARIMA(p,d,q) model cannot capture this parsimoniously - it would need AR or MA terms all the way out to lag 12.
You need a SARIMA(p,d,q)(P,D,Q)[12] model, which includes seasonal AR/MA terms at lags 12, 24, 36.
Seasonal differencing $(1-B^{12})X_t = X_t - X_{t-12}$ may be needed to remove the seasonality.
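To make the effect of seasonal differencing concrete, here is a small NumPy sketch on simulated data: a period-12 sinusoid plus white noise, chosen for illustration only. It compares the sample ACF before and after applying $X_t - X_{t-12}$.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF at lags 0..max_lag with the full-series denominator."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[k:], xc[:len(xc) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(3)
n, period = 600, 12
t = np.arange(n)
series = 2.0 * np.sin(2 * np.pi * t / period) + rng.normal(size=n)

# Seasonal difference: (1 - B^12) X_t = X_t - X_{t-12}
seasonal_diff = series[period:] - series[:-period]

acf_before = sample_acf(series, 24)
acf_after = sample_acf(seasonal_diff, 24)
for lag in (6, 12, 24):
    print(f"lag {lag:>2}: before = {acf_before[lag]:+.3f}, after = {acf_after[lag]:+.3f}")
```

Before differencing, the ACF swings between strongly negative at lag 6 and strongly positive at lags 12 and 24. After differencing, the periodic pattern is gone, but the lag-12 value turns negative (near -0.5 in this setup), because differencing white noise at lag 12 introduces a seasonal MA footprint. That leftover spike is a cue for a seasonal MA term.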

Reading the seasonal correlogram:

Significant ACF at lag 12, 24, 36 with a slow decay → seasonal AR component (P > 0)
Significant ACF spike only at lag 12 → seasonal MA component (Q = 1)
Both decay → seasonal ARMA; use AIC to select P, Q

The extended model: SARIMA $(p,d,q)(P,D,Q)_s$ where $s=12$ for monthly data. The seasonal and non-seasonal parts interact through the backshift operator: $\Phi_P(B^s)\phi_p(B)(1-B)^d(1-B^s)^D X_t = \Theta_Q(B^s)\theta_q(B)\epsilon_t$

Key Takeaways

ACF measures total (direct + indirect) autocorrelation at each lag; PACF measures only direct correlation after removing intermediate lags
AR(p): PACF cuts off at lag p (key identifier); ACF decays
MA(q): ACF cuts off at lag q (key identifier); PACF decays
ARMA(p,q): Both decay - use AIC/BIC grid search
The Ljung-Box test statistically tests whether residuals are white noise
Random CV is wrong for autocorrelated data - always use temporal splits or walk-forward validation
ACF captures only linear dependence; check ACF of squared residuals for ARCH effects
Seasonal spikes at lag $s, 2s, 3s$ indicate SARIMA $(P,D,Q)_s$ components
