Design: Ad Click Prediction - Where ML Meets Revenue

Reading time: ~25 min | Interview relevance: High | Roles: MLE

The Real Interview Moment

"Design the ad click prediction system for a search engine or social media platform." You describe a logistic regression model that predicts clicks. The interviewer asks: "Your model predicts a 5% click probability for an ad, but the actual click rate is 3%. What happens?" You're not sure. The interviewer explains: "In a cost-per-click auction, we charge advertisers based on predicted click rates. If our predictions are about 67% too high (5% predicted against 3% actual), we overcharge advertisers by the same margin. They leave the platform. Calibration isn't a nice-to-have - it's a revenue requirement."

Ad click prediction is unique because model accuracy directly translates to revenue. An uncalibrated model doesn't just give bad recommendations - it breaks the ad auction economics.

What You Will Master

  • Ad auction mechanics (second-price, VCG) and why calibration matters
  • Feature engineering for ads (query-ad, user-ad, contextual features)
  • Calibration techniques: Platt scaling, isotonic regression
  • Real-time bidding architecture
  • Training on delayed and partial feedback
  • Multi-stage ranking for ad selection

The Complete Design

Step 1: Requirements (5 min)

Functional requirements:

  • Predict P(click | user, query, ad) for ad selection and pricing
  • Select top ads from 1M+ eligible ads per query
  • Support multiple ad formats: search ads, display ads, video ads

Non-functional requirements:

  • Latency: <20ms for ad scoring (ads compete with organic results)
  • Calibration: Predicted CTR within 5% (relative) of actual CTR across all segments
  • Throughput: 500K queries per second
  • Freshness: New ads eligible within minutes of creation

Interviewer's Perspective

The candidate who understands WHY calibration matters in ad systems - that predicted CTR feeds directly into the auction pricing equation - demonstrates real-world experience. If your model predicts 5% CTR but reality is 3%, you charge advertisers for 5% clickthrough rates they don't get. This is the #1 thing I test for in ad ML interviews.

Step 2: Problem Formulation (5 min)

The Ad Auction

For each query/impression:

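  1. Eligible ads bid: bid = advertiser_max_bid × P(click), the expected value of one impression
  2. Ads ranked by: rank_score = bid × quality_score
  3. Winner pays per click: second-price auction → cost = next_rank_score / (winner_P(click) × winner_quality_score) + $0.01, one cent more than the lowest max bid that would still match the runner-up's rank_score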
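The three auction steps above can be sketched in a few lines. This is a minimal single-slot illustration, not a production auction: the `Ad` class, `run_auction`, the one-cent fallback price for an uncontested auction, and the cap at the winner's max bid are all assumptions made for the example. The price divides the runner-up's rank score by the winner's P(click) × quality_score, which reduces to next_bid / winner_CTR when quality scores are 1.0.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    max_bid: float        # advertiser's maximum cost per click, in dollars
    p_click: float        # predicted CTR from the model
    quality_score: float  # hypothetical quality multiplier

    @property
    def rank_score(self) -> float:
        # Steps 1 and 2: bid = max_bid * P(click), rank_score = bid * quality_score
        return self.max_bid * self.p_click * self.quality_score


def run_auction(ads):
    """Return (winner, cost_per_click) for a single-slot second-price auction."""
    ranked = sorted(ads, key=lambda ad: ad.rank_score, reverse=True)
    winner = ranked[0]
    if len(ranked) == 1:
        return winner, 0.01  # assumed fallback price when nobody else competes
    runner_up = ranked[1]
    # Step 3: lowest max bid that still matches the runner-up, plus one cent
    price = runner_up.rank_score / (winner.p_click * winner.quality_score) + 0.01
    return winner, min(price, winner.max_bid)  # never charge above the max bid


ads = [
    Ad("A", max_bid=2.00, p_click=0.05, quality_score=1.0),
    Ad("B", max_bid=3.00, p_click=0.03, quality_score=1.0),
    Ad("C", max_bid=1.00, p_click=0.04, quality_score=1.0),
]
winner, cpc = run_auction(ads)
print(winner.name, round(cpc, 2))  # A 1.81
```

Ad A wins with a rank score of 0.10 against B's 0.09. If A's true CTR were 3% rather than the predicted 5%, its rank score would be 0.06 and B should have won the slot, which is how a calibration error turns into a wrong auction outcome.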

Critical insight: The predicted CTR (P(click)) directly determines both ranking and pricing. Poor calibration means:

  • Over-predicted CTR → overcharge advertisers → they leave
  • Under-predicted CTR → undercharge advertisers → lost revenue

| Business Goal | ML Objective | Primary Metric | Guardrails |
| --- | --- | --- | --- |
| Maximize ad revenue while maintaining advertiser ROI | Predict P(click \| user, query, ad) | Log-loss, calibration error | Revenue per query, advertiser churn rate |

Step 3: Features & Data (8 min)

Feature Categories

| Category | Features | Example |
| --- | --- | --- |
| Query-Ad | Text match score, keyword match type (exact/phrase/broad), semantic similarity | Query "running shoes" + Ad "Nike Air Max" |
| Ad | Historical CTR, ad quality score, landing page quality, ad age, creative type | Ad with 2.5% historical CTR, image creative |
| User | Demographics, search history, past ad interactions, purchase intent signals | User who searched for "marathon training" yesterday |
| Context | Device, time of day, geographic location, search session depth | Mobile, 8pm, New York, 5th search in session |
| Advertiser | Account quality, bid amount, budget remaining, campaign objective | Advertiser with $10K daily budget, 60% spent |

Training Data

  • Positive label: User clicked the ad
  • Negative label: Ad was shown but not clicked
  • CTR range: 1-5% for search ads, 0.1-0.5% for display ads
  • Volume: Billions of impressions/day
  • Label delay: Click happens within seconds, conversion (purchase) takes days

Common Trap

Ad click data has massive selection bias - you only observe clicks on ads that were shown, and they were shown because the old model ranked them highly. If you train naively on this data, you reinforce the old model's biases. Use exploration traffic (random ad selection on 1-5% of queries) to get unbiased training data, or use counterfactual learning.
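One common way to combine exploration traffic with counterfactual learning is to log the probability (propensity) with which each ad was shown and reweight training examples by its inverse. The sketch below assumes an epsilon-greedy serving policy and a self-normalized, clipped inverse-propensity-weighted log-loss; `choose_ad`, `ips_weighted_log_loss`, and the clipping threshold of 50 are hypothetical names and values chosen for illustration.

```python
import math
import random


def choose_ad(scores, epsilon, rng):
    """Epsilon-greedy selection; returns (chosen_index, propensity)."""
    n = len(scores)
    best = max(range(n), key=lambda i: scores[i])
    if rng.random() < epsilon:
        chosen = rng.randrange(n)  # exploration: uniform over eligible ads
    else:
        chosen = best              # exploitation: current model's top ad
    # Probability that this policy shows `chosen`; log it with the impression
    propensity = epsilon / n + ((1.0 - epsilon) if chosen == best else 0.0)
    return chosen, propensity


def ips_weighted_log_loss(examples, max_weight=50.0):
    """examples: iterable of (predicted_ctr, clicked, propensity) tuples."""
    total, weight_sum = 0.0, 0.0
    for p, clicked, propensity in examples:
        w = min(1.0 / propensity, max_weight)  # clip weights to limit variance
        p = min(max(p, 1e-7), 1.0 - 1e-7)      # avoid log(0)
        loss = -math.log(p) if clicked else -math.log(1.0 - p)
        total += w * loss
        weight_sum += w
    return total / weight_sum


rng = random.Random(0)
chosen, propensity = choose_ad([0.05, 0.03, 0.01], epsilon=0.05, rng=rng)
loss = ips_weighted_log_loss([(0.04, 1, propensity), (0.02, 0, 0.5)])
```

Impressions that the old model rarely showed get large weights, so they count for more in the loss than their raw frequency in the logs would suggest.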

Step 4: Model (8 min)

The Progression

Ad Click Prediction Model Progression - LR → Feature-Rich LR → GBDT+LR → Deep & Cross Network

Why Logistic Regression Is Still Used

In ad prediction, LR has unique advantages:

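  • Well calibrated by default: Trained with log-loss, the sigmoid outputs track observed click rates on the training distribution (negative downsampling or distribution shift still calls for recalibration)
  • Fast inference: O(k) for k active (non-zero) features per example - critical at 500K QPS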
  • Online learning: Easy to update with streaming data (FTRL optimizer)
  • Interpretable: Feature weights explain predictions
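To make the online-learning point concrete, here is a minimal sketch of per-coordinate FTRL-Proximal for logistic regression over hashed binary features. The class name, hyperparameter defaults, the 2^20 bucket count, and the use of `zlib.crc32` as the hash are assumptions for illustration; a production system would tune these and store weights in dense arrays.

```python
import math
import zlib


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on hashed binary features."""

    def __init__(self, num_buckets=2**20, alpha=0.1, beta=1.0, l1=0.0, l2=1.0):
        self.num_buckets = num_buckets
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-bucket accumulated (adjusted) gradients
        self.n = {}  # per-bucket accumulated squared gradients

    def _buckets(self, features):
        # Hash "name=value" strings into a fixed index space; crc32 is stable across runs
        return sorted({zlib.crc32(f"{k}={v}".encode()) % self.num_buckets
                       for k, v in features.items()})

    def _weight(self, i):
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 threshold keeps weak buckets at exactly zero
        sign = 1.0 if z > 0 else -1.0
        step = (self.beta + math.sqrt(self.n.get(i, 0.0))) / self.alpha + self.l2
        return -(z - sign * self.l1) / step

    def predict(self, features):
        logit = sum(self._weight(i) for i in self._buckets(features))
        return 1.0 / (1.0 + math.exp(-max(min(logit, 35.0), -35.0)))

    def update(self, features, clicked):
        buckets = self._buckets(features)
        weights = {i: self._weight(i) for i in buckets}
        logit = max(min(sum(weights.values()), 35.0), -35.0)
        p = 1.0 / (1.0 + math.exp(-logit))
        g = p - (1.0 if clicked else 0.0)  # log-loss gradient for a binary feature
        for i in buckets:
            n = self.n.get(i, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * weights[i]
            self.n[i] = n + g * g
        return p  # prediction made before the update


model = FTRLProximal()
for step in range(2000):
    model.update({"device": "mobile"}, clicked=(step % 2 == 0))
    model.update({"device": "desktop"}, clicked=False)
```

Each update touches only the buckets active in that example, so the cost per impression is independent of the total number of features.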

Facebook's approach (still widely used): Use GBDT to create feature transformations, then feed leaf indices into LR. Combines GBDT's feature engineering power with LR's calibration.

Calibration

| Technique | How It Works | When to Use |
| --- | --- | --- |
| Platt scaling | Fit a logistic regression on model scores | Simple, works for well-behaved models |
| Isotonic regression | Fit a monotonic step function | More flexible, handles non-linear miscalibration |
| Temperature scaling | Divide logits by temperature T | Neural networks |
| Segment-wise calibration | Calibrate separately by segment (device, country) | When miscalibration varies by segment |
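As an illustration of the isotonic option, the monotonic step function can be fit with the pool-adjacent-violators algorithm. The sketch below is a bare-bones version that assumes distinct raw scores and a held-out calibration set; `fit_isotonic` and `calibrate` are hypothetical helper names, and a library implementation would normally be preferred.

```python
import bisect


def fit_isotonic(scores, clicks):
    """Pool-adjacent-violators; returns [(max_score_in_block, calibrated_ctr), ...]."""
    blocks = []  # each block: [click_sum, count, max_score]
    for score, click in sorted(zip(scores, clicks)):
        blocks.append([float(click), 1, score])
        # Merge backwards while the previous block's mean is not strictly lower
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            click_sum, count, max_score = blocks.pop()
            blocks[-1][0] += click_sum
            blocks[-1][1] += count
            blocks[-1][2] = max_score
    return [(max_score, click_sum / count) for click_sum, count, max_score in blocks]


def calibrate(model, score):
    """Map a raw score to the calibrated CTR of the step it falls into."""
    thresholds = [max_score for max_score, _ in model]
    i = min(bisect.bisect_left(thresholds, score), len(model) - 1)
    return model[i][1]


model = fit_isotonic(
    scores=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    clicks=[0, 0, 1, 0, 1, 1],
)
print(model)                   # [(0.2, 0.0), (0.4, 0.5), (0.6, 1.0)]
print(calibrate(model, 0.35))  # 0.5
```

The out-of-order pair at scores 0.3 and 0.4 (a click followed by a non-click) is pooled into one step with a calibrated CTR of 0.5, which is what keeps the mapping monotonic.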

How to measure: Expected Calibration Error (ECE) - bin predictions, compare mean predicted vs. actual CTR in each bin.
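The binning procedure can be sketched as follows, assuming equal-width bins over [0, 1]; the function name and the default of 10 bins are choices made for this example.

```python
def expected_calibration_error(predictions, clicks, num_bins=10):
    """Weighted average gap between mean predicted CTR and observed CTR per bin."""
    bins = [[0.0, 0.0, 0] for _ in range(num_bins)]  # pred_sum, click_sum, count
    for p, y in zip(predictions, clicks):
        b = min(int(p * num_bins), num_bins - 1)  # p == 1.0 goes in the last bin
        bins[b][0] += p
        bins[b][1] += y
        bins[b][2] += 1
    total = len(predictions)
    ece = 0.0
    for pred_sum, click_sum, count in bins:
        if count:
            ece += (count / total) * abs(pred_sum / count - click_sum / count)
    return ece


# 100 impressions predicted at 5% CTR, of which 3 were clicked
predictions = [0.05] * 100
clicks = [1] * 3 + [0] * 97
ece = expected_calibration_error(predictions, clicks)  # about 0.02
```

Because ad CTRs cluster in the low single digits, equal-width bins put most traffic into the first bin; equal-frequency (quantile) bins are a common alternative when a finer view is needed.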

Step 5: Serving (8 min)

Ad Serving Pipeline - Query+Context → Candidate Selection → ML Scoring+Calibration → Auction → Winning Ads

Architecture Decisions

| Component | Decision | Rationale |
| --- | --- | --- |
| Candidate selection | Inverted index on keywords + targeting criteria | Sub-5ms retrieval |
| Model serving | Feature-hashed LR or quantized model | <10ms scoring for 100+ ads |
| Feature store | In-memory cache (Memcached) for user/ad features | Ultra-low latency |
| Calibration | Post-scoring calibration layer | Can update calibration without retraining |
| Online learning | FTRL with hourly mini-batch updates | Adapt to CTR changes quickly |

Real-Time Bidding (RTB) Variant

For programmatic display ads, the flow is different:

  1. Publisher sends ad request to ad exchange
  2. Ad exchange sends bid requests to demand-side platforms (DSPs)
  3. Each DSP has <100ms to respond with a bid
  4. Highest bidder wins, ad is shown

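This means: The 100ms window covers network round trips as well as compute, so your entire scoring pipeline (feature lookup + model inference + bid calculation) should complete in roughly 50ms, leaving the rest of the budget for network latency.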
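A deadline like this is usually enforced in code rather than left to chance. The sketch below shows one way a DSP could do it with `asyncio.wait_for`, returning a no-bid when scoring overruns the roughly 50ms compute budget. The scorer coroutines, the `value_per_click` field, and the no-bid-as-`None` convention are hypothetical stand-ins for a real feature lookup and model call.

```python
import asyncio


async def score_and_bid(request, scorer, budget_seconds=0.05):
    """Return a bid in dollars, or None (no-bid) if scoring misses the budget."""
    try:
        p_click = await asyncio.wait_for(scorer(request), timeout=budget_seconds)
    except asyncio.TimeoutError:
        return None  # a missed deadline is a lost auction anyway
    return request["value_per_click"] * p_click


async def fast_scorer(request):
    await asyncio.sleep(0.001)  # simulated feature lookup + inference
    return 0.02


async def slow_scorer(request):
    await asyncio.sleep(0.5)    # simulated cache miss or overloaded model server
    return 0.02


async def main():
    request = {"value_per_click": 1.50}
    fast = await score_and_bid(request, fast_scorer)
    slow = await score_and_bid(request, slow_scorer)
    return fast, slow


fast_bid, slow_bid = asyncio.run(main())
```

The fast path bids 1.50 × 0.02 = $0.03 per impression, while the slow path is cancelled at the deadline and returns `None`.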

Step 6: Evaluation & Iteration (8 min)

Offline Metrics

| Metric | What It Measures | Target |
| --- | --- | --- |
| Log-loss | Prediction quality | Lower is better |
| AUC-ROC | Ranking quality | > 0.75 |
| Calibration error (ECE) | Predicted vs. actual CTR | < 5% relative error |
| Revenue impact (offline simulation) | Estimated revenue change | Positive |

Online Evaluation

  • A/B test: Split traffic, measure revenue per query, advertiser satisfaction, user experience
  • Metric: Revenue is the primary metric, but monitor advertiser ROI (if advertisers lose money, they leave)
  • Duration: 1-2 weeks, with daily monitoring for regressions

Practice Problems

Problem 1: Conversion Prediction

Direction

Beyond clicks, advertisers want to optimize for conversions (purchases). How do you design a conversion prediction model?

Key Insight

Conversions are much sparser than clicks (10-100x) and have long label delays (days to weeks). Solutions: (1) Use click prediction as an intermediate signal - P(conversion) = P(click) × P(conversion|click). (2) Handle label delay with delayed feedback models - initially train on clicks, update labels as conversions arrive. (3) Multi-task learning: predict click and conversion jointly. (4) Use value prediction (predicted revenue) not just binary conversion.
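Two of these ideas fit in a short sketch: the factorization in (1) and a simple label-maturation rule for the delay in (2). The 7-day attribution window and both function names are assumptions for illustration; real delayed-feedback models are more elaborate than a fixed cutoff.

```python
from datetime import datetime, timedelta


def p_conversion(p_click, p_conv_given_click):
    """Solution (1): P(conversion) = P(click) * P(conversion | click)."""
    return p_click * p_conv_given_click


def conversion_label(click_time, conversion_time, now, window=timedelta(days=7)):
    """Solution (2): return 1, 0, or None while the label is still immature."""
    if conversion_time is not None and conversion_time - click_time <= window:
        return 1     # converted inside the attribution window
    if now - click_time >= window:
        return 0     # window has closed without a conversion
    return None      # too early to call it a negative


click = datetime(2026, 1, 1)
print(conversion_label(click, datetime(2026, 1, 3), now=datetime(2026, 1, 4)))  # 1
print(conversion_label(click, None, now=datetime(2026, 1, 4)))                  # None
print(conversion_label(click, None, now=datetime(2026, 1, 9)))                  # 0
```

Treating immature examples as unknown rather than negative avoids teaching the model that recent clicks never convert.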

Problem 2: New Ad Cold Start

Direction

A new advertiser creates their first ad. You have no historical performance data. How do you estimate CTR?

Key Insight

Cold start for ads: (1) Use content features (ad text, landing page quality) to estimate initial CTR. (2) Use similar-ad CTR as a prior (find ads with similar keywords/creative). (3) Exploration: show the ad to a small random sample, collect data quickly. (4) Thompson sampling: maintain uncertainty estimates, explore more when uncertain. Key trade-off: too much exploration wastes impressions on bad ads, too little means good new ads never get a chance.
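The Thompson sampling idea in (4) can be sketched with a Beta posterior per ad. The ad names, the simulated true CTRs, and the uniform Beta(1, 1) prior are assumptions for this example; the similar-ad prior from (2) could be encoded by choosing `prior_clicks` and `prior_misses` to match the CTR of comparable ads.

```python
import random


def thompson_select(stats, rng, prior_clicks=1.0, prior_misses=1.0):
    """stats: ad_id -> [clicks, impressions]; pick the ad with the highest sampled CTR."""
    best_ad, best_sample = None, -1.0
    for ad_id, (clicks, impressions) in stats.items():
        # Posterior over CTR is Beta(prior_clicks + clicks, prior_misses + non-clicks)
        sample = rng.betavariate(prior_clicks + clicks,
                                 prior_misses + impressions - clicks)
        if sample > best_sample:
            best_ad, best_sample = ad_id, sample
    return best_ad


rng = random.Random(7)
true_ctr = {"new_ad": 0.10, "old_ad": 0.02}  # simulated, unknown to the policy
stats = {ad_id: [0, 0] for ad_id in true_ctr}
for _ in range(5000):
    ad_id = thompson_select(stats, rng)
    stats[ad_id][1] += 1                               # record the impression
    stats[ad_id][0] += rng.random() < true_ctr[ad_id]  # record a simulated click
```

Early on both ads are shown often because their posteriors are wide; as evidence accumulates, impressions shift toward the ad with the higher observed CTR, which is the exploration trade-off described above.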

Interview Cheat Sheet

| Question Pattern | Framework | Key Phrases |
| --- | --- | --- |
| "Design ad click prediction" | Scoring + calibration + auction | "Calibrated CTR feeds into the auction - miscalibration directly impacts revenue" |
| "Why is calibration important?" | Auction economics | "Predicted CTR × bid = rank score. Over-prediction → overcharging → advertiser churn" |
| "How do you handle billions of features?" | Feature hashing + sparse LR | "Feature hashing to fixed dimension, FTRL for online learning" |
| "How do you handle label delay?" | Delayed feedback models | "Train on clicks (immediate), update with conversions (delayed)" |

Spaced Repetition Checkpoints

  • Day 0: Explain the ad auction formula. Why does calibration matter for pricing?
  • Day 3: Compare LR vs. GBDT+LR vs. deep models for CTR prediction. Trade-offs?
  • Day 7: Design ad ranking for a video platform in 45 minutes.
  • Day 14: Explain Platt scaling and isotonic regression. When would you use each?
  • Day 21: Mock interview with follow-ups on real-time bidding, online learning, and conversion prediction.

© 2026 EngineersOfAI. All rights reserved.