Design: Ad Click Prediction - Where ML Meets Revenue

Reading time: ~25 min | Interview relevance: High | Roles: MLE

The Real Interview Moment

"Design the ad click prediction system for a search engine or social media platform." You describe a logistic regression model that predicts clicks. The interviewer asks: "Your model predicts a 5% click probability for an ad, but the actual click rate is 3%. What happens?" You're not sure. The interviewer explains: "In a cost-per-click auction, we charge advertisers based on predicted click rates. If our predictions are about 67% too high (5% predicted vs. 3% actual), we overcharge advertisers by the same margin. They leave the platform. Calibration isn't a nice-to-have - it's a revenue requirement."

Ad click prediction is unique because model accuracy directly translates to revenue. An uncalibrated model doesn't just give bad recommendations - it breaks the ad auction economics.

What You Will Master

Ad auction mechanics (second-price, VCG) and why calibration matters
Feature engineering for ads (query-ad, user-ad, contextual features)
Calibration techniques: Platt scaling, isotonic regression
Real-time bidding architecture
Training on delayed and partial feedback
Multi-stage ranking for ad selection

The Complete Design

Step 1: Requirements (5 min)

Functional requirements:

Predict P(click | user, query, ad) for ad selection and pricing
Select top ads from 1M+ eligible ads per query
Support multiple ad formats: search ads, display ads, video ads

Non-functional requirements:

Latency: <20ms for ad scoring (ads compete with organic results)
Calibration: Predicted CTR within 5% (relative) of actual CTR in every segment
Throughput: 500K queries per second
Freshness: New ads eligible within minutes of creation

Interviewer's Perspective

The candidate who understands WHY calibration matters in ad systems - that predicted CTR feeds directly into the auction pricing equation - demonstrates real-world experience. If your model predicts 5% CTR but reality is 3%, you charge advertisers for 5% clickthrough rates they don't get. This is the #1 thing I test for in ad ML interviews.

Step 2: Problem Formulation (5 min)

The Ad Auction

For each query/impression:

Each eligible ad gets an effective bid (expected value per impression): bid = advertiser_max_bid × P(click)
Ads ranked by: rank_score = bid × quality_score
Winner pays (second-price auction): cost per click = next_rank_score / (winner_P(click) × winner_quality_score) + $0.01
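The ranking and pricing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production auction: the `Candidate` class, the `run_auction` function, the bids, and the CTRs are all made up for the example. It assumes a single ad slot, no reserve price, and that the runner-up's rank score is divided by the winner's P(click) × quality_score to get a cost per click.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    ad_id: str
    max_bid: float        # advertiser's maximum cost per click, in dollars
    p_click: float        # predicted CTR for this impression
    quality: float = 1.0  # quality_score multiplier (assumed > 0)

    @property
    def rank_score(self) -> float:
        # rank_score = max_bid x P(click) x quality_score
        return self.max_bid * self.p_click * self.quality


def run_auction(candidates, increment=0.01):
    """Single-slot second-price auction; returns (winner, cost_per_click)."""
    ranked = sorted(candidates, key=lambda c: c.rank_score, reverse=True)
    winner = ranked[0]
    if len(ranked) == 1:
        # Assumption: no reserve price, so a lone bidder pays only the increment.
        return winner, increment
    runner_up = ranked[1]
    # Lowest cost per click that still keeps the winner ahead of the runner-up.
    cpc = runner_up.rank_score / (winner.p_click * winner.quality) + increment
    return winner, min(cpc, winner.max_bid)  # never charge above the max bid


# Ad A's CTR is over-predicted at 5%: it wins the slot.
over = [Candidate("A", 2.00, 0.05), Candidate("B", 3.00, 0.03)]
# Same auction with A's CTR at its actual 3%: B should have won.
calibrated = [Candidate("A", 2.00, 0.03), Candidate("B", 3.00, 0.03)]

for label, ads in (("over-predicted", over), ("calibrated", calibrated)):
    winner, cpc = run_auction(ads)
    print(f"{label}: winner={winner.ad_id} cpc=${cpc:.2f}")
```

With these made-up numbers, the over-predicted CTR changes both the winner (A instead of B) and the price per click ($1.81 instead of $2.01), which is why the prediction has to be right in absolute terms and not only in rank order.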

Critical insight: The predicted CTR (P(click)) directly determines both ranking and pricing. Poor calibration means:

Over-predicted CTR → overcharge advertisers → they leave
Under-predicted CTR → undercharge advertisers → lost revenue

Business Goal	ML Objective	Primary Metric	Guardrails
Maximize ad revenue while maintaining advertiser ROI	Predict P(click | user, query, ad)	Log-loss, calibration error	Revenue per query, advertiser churn rate

Step 3: Features & Data (8 min)

Feature Categories

Category	Features	Example
Query-Ad	Text match score, keyword match type (exact/phrase/broad), semantic similarity	Query "running shoes" + Ad "Nike Air Max"
Ad	Historical CTR, ad quality score, landing page quality, ad age, creative type	Ad with 2.5% historical CTR, image creative
User	Demographics, search history, past ad interactions, purchase intent signals	User who searched for "marathon training" yesterday
Context	Device, time of day, geographic location, search session depth	Mobile, 8pm, New York, 5th search in session
Advertiser	Account quality, bid amount, budget remaining, campaign objective	Advertiser with $10K daily budget, 60% spent

Training Data

Positive label: User clicked the ad
Negative label: Ad was shown but not clicked
CTR range: 1-5% for search ads, 0.1-0.5% for display ads
Volume: Billions of impressions/day
Label delay: Click happens within seconds, conversion (purchase) takes days

Common Trap

Ad click data has massive selection bias - you only observe clicks on ads that were shown, and they were shown because the old model ranked them highly. If you train naively on this data, you reinforce the old model's biases. Use exploration traffic (random ad selection on 1-5% of queries) to get unbiased training data, or use counterfactual learning.
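One way to make exploration traffic usable for counterfactual learning is to log the probability with which each ad was shown. The sketch below assumes a simple epsilon-greedy policy over a single slot. The function names (`propensity`, `choose_ad`, `ips_weight`), the 2% exploration rate, and the weight cap of 50 are illustrative choices, not values from a real system.

```python
import random


def propensity(ad, ranked_ads, epsilon):
    """Probability that `ad` is shown under the epsilon-greedy policy."""
    uniform = epsilon / len(ranked_ads)  # share from random exploration
    return uniform + (1.0 - epsilon if ad == ranked_ads[0] else 0.0)


def choose_ad(ranked_ads, epsilon=0.02, rng=random):
    """Show the top-ranked ad, except on an epsilon fraction of queries."""
    if rng.random() < epsilon:
        ad = rng.choice(ranked_ads)  # exploration: uniform over eligible ads
    else:
        ad = ranked_ads[0]           # exploitation: the model's top pick
    return ad, propensity(ad, ranked_ads, epsilon)


def ips_weight(logged_propensity, max_weight=50.0):
    """Inverse-propensity training weight, clipped to limit variance."""
    return min(1.0 / logged_propensity, max_weight)


ranked = ["ad_17", "ad_04", "ad_92", "ad_33"]  # hypothetical ad ids, best first
ad, p = choose_ad(ranked, rng=random.Random(0))
print(ad, p, ips_weight(p))
```

Each logged impression then enters the training loss with weight `ips_weight(p)`, so rarely shown ads count for more. Clipping trades a little bias for much lower variance.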

Step 4: Model (8 min)

The Progression

Ad Click Prediction Model Progression - LR → Feature-Rich LR → GBDT+LR → Deep & Cross Network

Why Logistic Regression Is Still Used

In ad prediction, LR has unique advantages:

Well calibrated by default: A sigmoid output trained with log-loss tends to track observed click rates, though this still has to be verified per segment
Fast inference: O(n) for n features - critical at 500K QPS
Online learning: Easy to update with streaming data (FTRL optimizer)
Interpretable: Feature weights explain predictions
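The fast-inference and online-learning points can be made concrete with a small sketch of logistic regression trained by the per-coordinate FTRL-Proximal update over hashed binary features. This is an illustration under assumptions: the class name `FTRLProximal`, the use of `zlib.crc32` as the hash, and the hyperparameter values are choices made for the example, and only features that are present (value 1) are handled.

```python
import math
import zlib


class FTRLProximal:
    """Sparse logistic regression over hashed binary features."""

    def __init__(self, dim=2**20, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.dim = dim
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-coordinate accumulated (adjusted) gradients
        self.n = {}  # per-coordinate sum of squared gradients

    def _index(self, feature):
        # Feature hashing: map any string such as "ad=123" to a fixed-size index.
        return zlib.crc32(feature.encode("utf-8")) % self.dim

    def _weight(self, i):
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 keeps weakly supported features at exactly zero
        sign = 1.0 if z > 0 else -1.0
        n = self.n.get(i, 0.0)
        return -(z - sign * self.l1) / ((self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def _weights(self, features):
        return {i: self._weight(i) for i in {self._index(f) for f in features}}

    def predict(self, features):
        logit = sum(self._weights(features).values())
        logit = max(-35.0, min(35.0, logit))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-logit))

    def update(self, features, clicked):
        """One online step on a single impression; returns the pre-update prediction."""
        weights = self._weights(features)
        logit = max(-35.0, min(35.0, sum(weights.values())))
        p = 1.0 / (1.0 + math.exp(-logit))
        g = p - clicked  # log-loss gradient for each active feature (x_i = 1)
        for i, w in weights.items():
            n = self.n.get(i, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * w
            self.n[i] = n + g * g
        return p


model = FTRLProximal(l1=0.1)
for _ in range(200):  # toy stream: ad A is always clicked, ad B never
    model.update(["ad=A", "device=mobile"], 1)
    model.update(["ad=B", "device=mobile"], 0)
print(model.predict(["ad=A", "device=mobile"]), model.predict(["ad=B", "device=mobile"]))
```

Prediction cost is proportional to the number of active features, and the weights are derived lazily from `z` and `n`, so only coordinates that have actually been seen take up memory.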

Facebook's approach (still widely used): Use GBDT to create feature transformations, then feed leaf indices into LR. Combines GBDT's feature engineering power with LR's calibration.

Calibration

Technique	How It Works	When to Use
Platt scaling	Fit a logistic regression on model scores	Simple, works for well-behaved models
Isotonic regression	Fit a monotonic step function	More flexible, handles non-linear miscalibration
Temperature scaling	Divide logits by temperature T	Neural networks
Segment-wise calibration	Calibrate separately by segment (device, country)	When miscalibration varies by segment
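To illustrate the isotonic regression row, here is a minimal sketch of the pool-adjacent-violators algorithm that fits a monotonic step function from raw scores to observed click rates. The function names `fit_isotonic` and `calibrate` and the toy data are assumptions for the example. Production libraries typically interpolate between steps, which this sketch does not.

```python
from bisect import bisect_left
from itertools import groupby


def fit_isotonic(scores, labels):
    """Pool adjacent violators; returns (upper_bounds, calibrated_values)."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block: [sum_of_labels, count, highest_score_in_block]
    for score, group in groupby(pairs, key=lambda pair: pair[0]):
        ys = [y for _, y in group]  # tied scores form one weighted point
        blocks.append([float(sum(ys)), len(ys), score])
        # Merge backwards while the step function would otherwise decrease.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [block[2] for block in blocks]
    values = [block[0] / block[1] for block in blocks]
    return uppers, values


def calibrate(score, uppers, values):
    """Look up the step that covers `score`; clamp above the last step."""
    i = bisect_left(uppers, score)
    return values[min(i, len(values) - 1)]


raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
clicks = [0, 1, 0, 0, 1, 1]
uppers, values = fit_isotonic(raw_scores, clicks)
print(uppers, values, calibrate(0.25, uppers, values))
```

Platt scaling would instead fit just two parameters (slope and intercept) on the scores, so it needs far less data but cannot correct miscalibration that changes shape across the score range.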

How to measure: Expected Calibration Error (ECE) - bin predictions, compare mean predicted vs. actual CTR in each bin.
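The binning procedure can be written out directly. The sketch below uses equal-width bins over [0, 1]. The function name and the default of 10 bins are assumptions, and with CTRs in the low single digits a real evaluation would likely use finer bins concentrated near zero.

```python
def expected_calibration_error(preds, labels, n_bins=10):
    """Weighted mean of |mean predicted CTR - observed CTR| over score bins."""
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum_pred, sum_label, count]
    for p, y in zip(preds, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b][0] += p
        bins[b][1] += y
        bins[b][2] += 1
    total = len(preds)
    ece = 0.0
    for sum_pred, sum_label, count in bins:
        if count:
            ece += (count / total) * abs(sum_pred / count - sum_label / count)
    return ece


# The opening example: the model says 5% everywhere, but 3 of 100 impressions click.
preds = [0.05] * 100
labels = [1] * 3 + [0] * 97
print(expected_calibration_error(preds, labels))
```

Here the result is an absolute gap of about 0.02. Dividing each bin's gap by its observed CTR gives a relative error instead.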

Step 5: Serving (8 min)

Ad Serving Pipeline - Query+Context → Candidate Selection → ML Scoring+Calibration → Auction → Winning Ads

Architecture Decisions

Component	Decision	Rationale
Candidate selection	Inverted index on keywords + targeting criteria	Sub-5ms retrieval
Model serving	Feature-hashed LR or quantized model	<10ms scoring for 100+ ads
Feature store	In-memory cache (Memcached) for user/ad features	Ultra-low latency
Calibration	Post-scoring calibration layer	Can update calibration without retraining
Online learning	FTRL with hourly mini-batch updates	Adapt to CTR changes quickly

Real-Time Bidding (RTB) Variant

For programmatic display ads, the flow is different:

Publisher sends ad request to ad exchange
Ad exchange sends bid requests to demand-side platforms (DSPs)
Each DSP has <100ms to respond with a bid
Highest bidder wins, ad is shown

This means: The exchange's 100ms deadline covers the full round trip, so aim to complete your entire scoring pipeline (feature lookup + model inference + bid calculation) in <50ms and leave the rest as headroom for network latency.
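A latency budget like this is usually enforced explicitly in the bidder. The sketch below is one hypothetical way to do it: the `Deadline` class, the `handle_bid_request` function, the 5ms inference reserve, and the choice to return `None` as a no-bid are all assumptions for illustration, not part of any exchange's API.

```python
import time


class Deadline:
    """Tracks how much of a per-request time budget is left."""

    def __init__(self, budget_ms, clock=time.monotonic):
        self._clock = clock
        self._expires = clock() + budget_ms / 1000.0

    def remaining_ms(self):
        return max(0.0, (self._expires - self._clock()) * 1000.0)


def handle_bid_request(deadline, lookup_features, score, value_per_click,
                       min_inference_ms=5.0):
    """Return a per-impression bid in dollars, or None to skip this auction."""
    features = lookup_features()
    if deadline.remaining_ms() < min_inference_ms:
        return None  # feature lookup ran long: a late bid is worthless, so do not bid
    p_click = score(features)
    if deadline.remaining_ms() <= 0.0:
        return None  # budget exhausted during inference
    return p_click * value_per_click  # expected value of showing the ad once


bid = handle_bid_request(
    Deadline(budget_ms=50.0),
    lookup_features=lambda: {"device": "mobile"},  # stand-in for a feature store call
    score=lambda features: 0.02,                   # stand-in for the CTR model
    value_per_click=1.50,
)
print(bid)
```

Checking the budget between stages lets the bidder drop out early instead of spending compute on a response the exchange would discard.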

Step 6: Evaluation & Iteration (8 min)

Offline Metrics

Metric	What It Measures	Target
Log-loss	Prediction quality	Lower is better
AUC-ROC	Ranking quality	> 0.75
Calibration error (ECE)	Predicted vs. actual CTR	< 5% relative error
Revenue impact (offline simulation)	Estimated revenue change	Positive
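The first two metrics in the table are easy to compute by hand, and doing so shows why AUC alone is not enough for ads. The sketch below is a plain-Python illustration with made-up predictions, where `log_loss` and `auc_roc` are names chosen for the example. AUC is computed from average ranks, so ties count as half.

```python
import math


def log_loss(preds, labels, eps=1e-15):
    """Mean negative log-likelihood of the observed clicks."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(preds)


def auc_roc(preds, labels):
    """Probability that a random click is scored above a random non-click."""
    pairs = sorted(zip(preds, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1  # pairs[i:j] share one score and get the same average rank
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(y for _, y in pairs[i:j])
        i = j
    n_pos = sum(labels)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


preds = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
halved = [p / 2 for p in preds]  # same ordering, different probabilities

print(auc_roc(preds, labels), auc_roc(halved, labels))
print(log_loss(preds, labels), log_loss(halved, labels))
```

Halving every prediction leaves AUC unchanged because the ordering is the same, while log-loss moves. Ranking metrics cannot detect a calibration shift, which is why the table tracks calibration error separately.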

Online Evaluation

A/B test: Split traffic, measure revenue per query, advertiser satisfaction, user experience
Metric: Revenue is the primary metric, but monitor advertiser ROI (if advertisers lose money, they leave)
Duration: 1-2 weeks, with daily monitoring for regressions

Practice Problems

Problem 1: Conversion Prediction

Direction

Beyond clicks, advertisers want to optimize for conversions (purchases). How do you design a conversion prediction model?

Key Insight

Conversions are much sparser than clicks (10-100x) and have long label delays (days to weeks). Solutions: (1) Use click prediction as an intermediate signal - P(conversion) = P(click) × P(conversion|click). (2) Handle label delay with delayed feedback models - initially train on clicks, update labels as conversions arrive. (3) Multi-task learning: predict click and conversion jointly. (4) Use value prediction (predicted revenue) not just binary conversion.
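Two of these ideas fit in a short sketch: the chained probability from solution (1) and a labeling rule for delayed feedback from solution (2). The function names and the 7-day attribution window are assumptions for illustration. The rule shown here keeps a click "pending" until its window closes, which is one simple option among several delayed-feedback schemes.

```python
from datetime import datetime, timedelta


def p_conversion(p_click, p_conv_given_click):
    """P(conversion) = P(click) x P(conversion | click)."""
    return p_click * p_conv_given_click


def conversion_label(click_time, conversion_time, now, window=timedelta(days=7)):
    """1 = converted in window, 0 = window closed without conversion, None = pending."""
    if conversion_time is not None and conversion_time - click_time <= window:
        return 1
    if now - click_time >= window:
        return 0   # safe to treat as a negative
    return None    # too early to tell; do not train on this click as a negative yet


click = datetime(2024, 1, 1, 12, 0)
print(p_conversion(0.03, 0.05))
print(conversion_label(click, None, now=datetime(2024, 1, 3)))
print(conversion_label(click, datetime(2024, 1, 4), now=datetime(2024, 1, 4)))
print(conversion_label(click, None, now=datetime(2024, 1, 9)))
```

Treating pending clicks as negatives would bias the model toward under-predicting conversions for recent traffic, which is the core difficulty delayed feedback models address.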

Problem 2: New Ad Cold Start

Direction

A new advertiser creates their first ad. You have no historical performance data. How do you estimate CTR?

Key Insight

Cold start for ads: (1) Use content features (ad text, landing page quality) to estimate initial CTR. (2) Use similar-ad CTR as a prior (find ads with similar keywords/creative). (3) Exploration: show the ad to a small random sample, collect data quickly. (4) Thompson sampling: maintain uncertainty estimates, explore more when uncertain. Key trade-off: too much exploration wastes impressions on bad ads, too little means good new ads never get a chance.
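Points (2) and (4) combine naturally: the similar-ad CTR becomes a Beta prior, and Thompson sampling draws from the posterior to decide which ad to show. The sketch below is a toy version under assumptions. The `AdArm` class, the `pick_ad` function, and the prior strength of 100 pseudo-impressions are illustrative, and it ignores bids, so it selects on CTR alone.

```python
import random


class AdArm:
    """Beta posterior over one ad's CTR, seeded from similar ads."""

    def __init__(self, ad_id, prior_ctr, prior_strength=100.0):
        self.ad_id = ad_id
        # The prior acts like `prior_strength` imaginary impressions at prior_ctr.
        self.alpha = prior_ctr * prior_strength            # pseudo-clicks
        self.beta = (1.0 - prior_ctr) * prior_strength     # pseudo-non-clicks

    def record(self, clicked):
        if clicked:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean_ctr(self):
        return self.alpha / (self.alpha + self.beta)

    def sample_ctr(self, rng):
        return rng.betavariate(self.alpha, self.beta)


def pick_ad(arms, rng=random):
    """Thompson sampling: show the ad whose sampled CTR is highest."""
    return max(arms, key=lambda arm: arm.sample_ctr(rng))


new_ad = AdArm("new_ad", prior_ctr=0.02)                              # similar ads average 2%
veteran = AdArm("veteran", prior_ctr=0.025, prior_strength=50_000.0)  # long history, tight posterior

for clicked in [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]:  # first ten real impressions
    new_ad.record(clicked)

print(new_ad.mean_ctr, pick_ad([new_ad, veteran], rng=random.Random(0)).ad_id)
```

Because the new ad's posterior is wide, its samples occasionally exceed the veteran's, so it earns impressions without a fixed exploration quota. As evidence accumulates the posterior narrows and exploration fades on its own.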

Interview Cheat Sheet

Question Pattern	Framework	Key Phrases
"Design ad click prediction"	Scoring + calibration + auction	"Calibrated CTR feeds into the auction - miscalibration directly impacts revenue"
"Why is calibration important?"	Auction economics	"Predicted CTR × bid = rank score. Over-prediction → overcharging → advertiser churn"
"How do you handle billions of features?"	Feature hashing + sparse LR	"Feature hashing to fixed dimension, FTRL for online learning"
"How do you handle label delay?"	Delayed feedback models	"Train on clicks (immediate), update with conversions (delayed)"

Spaced Repetition Checkpoints

Day 0: Explain the ad auction formula. Why does calibration matter for pricing?
Day 3: Compare LR vs. GBDT+LR vs. deep models for CTR prediction. Trade-offs?
Day 7: Design ad ranking for a video platform in 45 minutes.
Day 14: Explain Platt scaling and isotonic regression. When would you use each?
Day 21: Mock interview with follow-ups on real-time bidding, online learning, and conversion prediction.

What's Next

Content Moderation - Another classification problem with multi-modal inputs
Fraud Detection - Similar real-time scoring with high business impact
