Python Caching Strategies Practice Problems & Exercises
Practice: Caching Strategies
← Back to lessonEasy
Apply @functools.lru_cache to the recursive Fibonacci function and inspect the cache hit statistics.
import functools
@functools.lru_cache(maxsize=None)
def fib(n):
if n <= 1:
return n
return fib(n - 1) + fib(n - 2)
result = fib(35)
info = fib.cache_info()
print(f"fib(35) = {result}")
print(f"cache_info: hits={info.hits} misses={info.misses}")
Solution
import functools
@functools.lru_cache(maxsize=None)
def fib(n):
if n <= 1:
return n
return fib(n - 1) + fib(n - 2)
result = fib(35)
info = fib.cache_info()
print(f"fib(35) = {result}")
print(f"cache_info: hits={info.hits} misses={info.misses}")
lru_cache mechanics:
maxsize=Nonedisables eviction — the cache grows unboundedly.fib(35)makes 36 unique calls (misses) and 33 repeated calls (hits).- Without caching: 29.8M recursive calls for fib(35). With caching: 36 unique calls.
cache_info()returnsCacheInfo(hits, misses, maxsize, currsize).- Call
fn.cache_clear()between independent test cases to reset state.
Expected Output
fib(35) = 9227465\ncache_info: hits=33 misses=36Hints
Hint 1: Apply @functools.lru_cache(maxsize=None) to a recursive function to memoize results.
Hint 2: Call fn.cache_info() after the computation to see hits, misses, maxsize, and currsize.
Use @functools.cache to memoize a Collatz sequence length function, then verify the cache size.
import functools
@functools.cache
def collatz_length(n):
"""Count steps to reach 1 in the Collatz sequence."""
if n == 1:
return 1
if n % 2 == 0:
return 1 + collatz_length(n // 2)
return 1 + collatz_length(3 * n + 1)
results = [collatz_length(i) for i in range(1, 11)]
info = collatz_length.cache_info()
print(f"result cached correctly")
print(f"cache size: {info.currsize}")
Solution
import functools
@functools.cache
def collatz_length(n):
if n == 1:
return 1
if n % 2 == 0:
return 1 + collatz_length(n // 2)
return 1 + collatz_length(3 * n + 1)
results = [collatz_length(i) for i in range(1, 11)]
info = collatz_length.cache_info()
assert results == [1, 2, 8, 3, 6, 9, 17, 4, 20, 7], f"unexpected: {results}"
print(f"result cached correctly")
print(f"cache size: {info.currsize}")
functools.cache vs lru_cache:
@functools.cache(Python 3.9+) islru_cache(maxsize=None)without LRU tracking overhead.- Slightly faster for unbounded caches — no doubly-linked list is maintained.
- Use
lru_cache(maxsize=N)when you need eviction to bound memory usage. - Use
cachewhen the input space is small and bounded.
Expected Output
result cached correctly\ncache size: 11Hints
Hint 1: functools.cache is equivalent to lru_cache(maxsize=None) but slightly faster due to no eviction tracking.
Hint 2: Use cache_info().currsize to check how many entries are stored.
Implement manual memoization using a dictionary and a call counter to quantify how many redundant computations are avoided.
def make_memoized_fib():
cache = {}
calls_saved = [0]
def fib(n):
if n in cache:
calls_saved[0] += 1
return cache[n]
if n <= 1:
result = n
else:
result = fib(n - 1) + fib(n - 2)
cache[n] = result
return result
return fib, calls_saved, cache
fib, calls_saved, cache = make_memoized_fib()
result = fib(20)
print(f"computed: {result}")
print(f"cache hits saved {calls_saved[0]} calls")
Solution
def make_memoized_fib():
cache = {}
calls_saved = [0]
def fib(n):
if n in cache:
calls_saved[0] += 1
return cache[n]
if n <= 1:
result = n
else:
result = fib(n - 1) + fib(n - 2)
cache[n] = result
return result
return fib, calls_saved, cache
fib, calls_saved, cache = make_memoized_fib()
result = fib(20)
assert result == 6765
print(f"computed: {result}")
print(f"cache hits saved {calls_saved[0]} calls")
Manual memoization advantages:
- Full control: inspect the cache, serialize it, warm it with precomputed values.
- Can cache unhashable arguments by converting to a cache key (e.g.,
tuple(sorted(d.items()))). - Use a closure to keep the cache private — prevents external mutation.
- Limitation: must implement invalidation manually;
lru_cachehandles this viacache_clear().
Expected Output
computed: 6765\ncache hits saved 17 callsHints
Hint 1: Check if the argument is in the cache dict before computing. Store and return the result.
Hint 2: Track a calls_saved counter: increment it every time a cached value is returned.
Compute the cache hit rate of an lru_cache-wrapped function over 1000 lookups with keys drawn from a small pool.
import functools
import random
@functools.lru_cache(maxsize=128)
def expensive_lookup(key):
return sum(range(key * 100))
random.seed(42)
keys = [random.randint(0, 19) for _ in range(1000)]
for k in keys:
expensive_lookup(k)
info = expensive_lookup.cache_info()
# Compute hit rate and print:
# "hit rate: X%"
# "cache is effective: True" if hit_rate > 80% else "cache is effective: False"
Solution
import functools
import random
@functools.lru_cache(maxsize=128)
def expensive_lookup(key):
return sum(range(key * 100))
random.seed(42)
keys = [random.randint(0, 19) for _ in range(1000)]
for k in keys:
expensive_lookup(k)
info = expensive_lookup.cache_info()
total = info.hits + info.misses
hit_rate = info.hits / total * 100 if total > 0 else 0.0
print(f"hits={info.hits} misses={info.misses} currsize={info.currsize}")
print(f"hit rate: {hit_rate:.1f}%")
print(f"cache is effective: {hit_rate > 80}")
Cache effectiveness criteria:
- Hit rate above 80% means the cache provides significant benefit.
- Hit rate below 50% means most calls are misses — the cache overhead may not be worth it.
- This workload uses 20 unique keys with
maxsize=128— all fit, so after 20 calls the hit rate approaches 100%. - For real workloads, measure actual hit rates before assuming a cache helps.
Expected Output
hit rate: X%\ncache is effective: TrueHints
Hint 1: hit_rate = hits / (hits + misses) * 100
Hint 2: A hit rate above 80% generally means the cache is providing good value.
Medium
Implement a TTLCache class that expires entries after a configurable time-to-live.
import time
class TTLCache:
def __init__(self, ttl_seconds):
self.ttl = ttl_seconds
self._cache = {}
def get(self, key):
"""Return cached value or None if missing/expired."""
pass
def set(self, key, value):
"""Store value with an expiry timestamp."""
pass
def __contains__(self, key):
"""True if key exists and has not expired."""
pass
cache = TTLCache(ttl_seconds=0.1)
def fetch(key):
if key in cache:
print(f"{key}: cached")
return cache.get(key)
print(f"{key}: computed")
cache.set(key, key * 100)
return key * 100
fetch("x") # computed
fetch("x") # cached
time.sleep(0.15)
fetch("x") # computed again (TTL expired)
Solution
import time
class TTLCache:
def __init__(self, ttl_seconds):
self.ttl = ttl_seconds
self._cache = {}
def get(self, key):
if key in self._cache:
value, expiry = self._cache[key]
if time.time() <= expiry:
return value
del self._cache[key]
return None
def set(self, key, value):
self._cache[key] = (value, time.time() + self.ttl)
def __contains__(self, key):
if key in self._cache:
_, expiry = self._cache[key]
if time.time() <= expiry:
return True
del self._cache[key]
return False
cache = TTLCache(ttl_seconds=0.1)
def fetch(key):
if key in cache:
print(f"{key}: cached")
return cache.get(key)
print(f"{key}: computed")
cache.set(key, key * 100)
return key * 100
fetch("x")
fetch("x")
time.sleep(0.15)
fetch("x")
TTL cache design:
- Store
(value, expiry_timestamp)tuples. - Lazy expiration: check on access only — avoids background thread complexity.
- Production alternative:
cachetools.TTLCache(maxsize=1000, ttl=300)with thread safety. - Background cleanup needed for memory-sensitive applications that cannot tolerate lazy expiration.
Expected Output
first call: computed\nsecond call (within TTL): cached\nthird call (TTL expired): computedHints
Hint 1: Store (value, expiry_time) in the cache. On get, check if time.time() > expiry_time and recompute.
Hint 2: time.time() returns seconds since epoch — add the TTL in seconds to get the expiry timestamp.
Implement an LRU cache using collections.OrderedDict and verify that the least-recently-used entry is evicted when capacity is exceeded.
from collections import OrderedDict
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = OrderedDict()
def get(self, key):
"""Return value or -1 if not found. Promote on hit."""
pass
def put(self, key, value):
"""Insert or update. Evict LRU when at capacity."""
pass
lru = LRUCache(3)
lru.put("a", 1)
lru.put("b", 2)
lru.put("c", 3)
lru.get("a") # access "a" — now most recently used
lru.put("d", 4) # capacity exceeded: "b" is LRU, should be evicted
assert lru.get("b") == -1
assert lru.get("a") == 1
assert lru.get("c") == 3
assert lru.get("d") == 4
print("LRU eviction verified: oldest entry evicted first")
Solution
from collections import OrderedDict
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = OrderedDict()
def get(self, key):
if key not in self._cache:
return -1
self._cache.move_to_end(key)
return self._cache[key]
def put(self, key, value):
if key in self._cache:
self._cache.move_to_end(key)
self._cache[key] = value
if len(self._cache) > self.capacity:
self._cache.popitem(last=False)
lru = LRUCache(3)
lru.put("a", 1)
lru.put("b", 2)
lru.put("c", 3)
lru.get("a")
lru.put("d", 4)
assert lru.get("b") == -1
assert lru.get("a") == 1
assert lru.get("c") == 3
assert lru.get("d") == 4
print("LRU eviction verified: oldest entry evicted first")
LRU implementation:
move_to_end(key)moves the key to positionlast(most recently used).popitem(last=False)removes the first (oldest/least recently used) item.getmust callmove_to_endon a hit — that is what makes it LRU.- O(1) for both get and put due to OrderedDict's O(1) operations.
Expected Output
LRU eviction verified: oldest entry evicted firstHints
Hint 1: Use collections.OrderedDict — move_to_end() promotes a key to most-recently-used position.
Hint 2: When cache exceeds maxsize, call popitem(last=False) to remove the LRU (first) entry.
Implement a cache with explicit dependency tracking so that changing source data invalidates all dependent cached results.
class DependencyCache:
def __init__(self):
self._cache = {}
self._deps = {} # source_key -> set of cache_keys
def put(self, key, value, depends_on=None):
self._cache[key] = value
for src in (depends_on or []):
self._deps.setdefault(src, set()).add(key)
def get(self, key):
return self._cache.get(key)
def invalidate_source(self, source_key):
"""Remove all cache entries that depend on source_key."""
pass
cache = DependencyCache()
cache.put("report_2024", "Revenue: $1M", depends_on=["sales_data"])
cache.put("chart_2024", "Chart HTML", depends_on=["sales_data", "config"])
cache.put("static_help", "Help text", depends_on=[])
cache.invalidate_source("sales_data")
assert cache.get("report_2024") is None, "should be invalidated"
assert cache.get("chart_2024") is None, "should be invalidated"
assert cache.get("static_help") == "Help text", "should NOT be invalidated"
print("stale value detected and cache invalidated correctly")
Solution
class DependencyCache:
def __init__(self):
self._cache = {}
self._deps = {}
def put(self, key, value, depends_on=None):
self._cache[key] = value
for src in (depends_on or []):
self._deps.setdefault(src, set()).add(key)
def get(self, key):
return self._cache.get(key)
def invalidate_source(self, source_key):
for cache_key in self._deps.pop(source_key, set()):
self._cache.pop(cache_key, None)
cache = DependencyCache()
cache.put("report_2024", "Revenue: $1M", depends_on=["sales_data"])
cache.put("chart_2024", "Chart HTML", depends_on=["sales_data", "config"])
cache.put("static_help", "Help text", depends_on=[])
cache.invalidate_source("sales_data")
assert cache.get("report_2024") is None
assert cache.get("chart_2024") is None
assert cache.get("static_help") == "Help text"
print("stale value detected and cache invalidated correctly")
Cache invalidation patterns:
- Tag-based invalidation is scalable: one source change invalidates N dependents in O(N).
- Alternative: version-based invalidation — store a version number per source; cache entries include the version at computation time.
- Redis pattern: tag keys with
SADD tag:sales_data report_2024 chart_2024, thenSUNIONSTORE+DELon invalidation.
Expected Output
stale value detected and cache invalidated correctlyHints
Hint 1: Track which cache entries depend on which source data using a dict: source_key -> set of cache_keys.
Hint 2: When source data changes, call invalidate_source() to remove all dependent entries.
Implement a two-level cache where L1 is a 3-entry LRU and L2 is a 20-entry dict, with automatic promotion on L2 hits.
from collections import OrderedDict
class TwoLevelCache:
def __init__(self, l1_size=3, l2_size=20):
self.l1 = OrderedDict()
self.l2 = {}
self.l1_size = l1_size
self.l2_size = l2_size
def _l1_put(self, key, value):
if key in self.l1:
self.l1.move_to_end(key)
self.l1[key] = value
if len(self.l1) > self.l1_size:
self.l1.popitem(last=False)
def get(self, key):
"""L1 hit -> return. L2 hit -> promote to L1, return. Miss -> None."""
pass
def put(self, key, value):
"""Store in both L1 and L2."""
pass
cache = TwoLevelCache(l1_size=3, l2_size=20)
for i in range(10):
cache.l2[f"k{i}"] = i * 10
cache._l1_put("k0", 0)
assert cache.get("k0") == 0
print("L1 hit")
assert cache.get("k5") == 50
assert "k5" in cache.l1
print("L2 hit (promoted to L1)")
assert cache.get("missing") is None
print("L1 miss L2 miss")
Solution
from collections import OrderedDict
class TwoLevelCache:
def __init__(self, l1_size=3, l2_size=20):
self.l1 = OrderedDict()
self.l2 = {}
self.l1_size = l1_size
self.l2_size = l2_size
def _l1_put(self, key, value):
if key in self.l1:
self.l1.move_to_end(key)
self.l1[key] = value
if len(self.l1) > self.l1_size:
self.l1.popitem(last=False)
def get(self, key):
if key in self.l1:
self.l1.move_to_end(key)
return self.l1[key]
if key in self.l2:
value = self.l2[key]
self._l1_put(key, value) # promote
return value
return None
def put(self, key, value):
self._l1_put(key, value)
if len(self.l2) < self.l2_size:
self.l2[key] = value
cache = TwoLevelCache(l1_size=3, l2_size=20)
for i in range(10):
cache.l2[f"k{i}"] = i * 10
cache._l1_put("k0", 0)
assert cache.get("k0") == 0
print("L1 hit")
assert cache.get("k5") == 50
assert "k5" in cache.l1
print("L2 hit (promoted to L1)")
assert cache.get("missing") is None
print("L1 miss L2 miss")
Two-level cache design:
- L1 = in-process dict/LRU — nanosecond latency. L2 = Redis/memcached — microsecond latency.
- Promotion: on L2 hit, copy to L1 so subsequent accesses are L1 hits.
- Cache warming: pre-populate L1 with the most frequently accessed keys at startup.
- In production: L1 is per-process (not shared), L2 is shared across all service instances.
Expected Output
L1 hit\nL2 hit (promoted to L1)\nL1 miss L2 missHints
Hint 1: L1 is a small fast LRU (OrderedDict). L2 is a larger dict simulating Redis/memcached.
Hint 2: On L1 miss, check L2. On L2 hit, promote to L1. On both miss, return None.
Hard
Implement a cache using weakref.WeakValueDictionary that allows the GC to collect objects even while cached.
import weakref
import gc
class ExpensiveObject:
def __init__(self, key):
self.key = key
self.data = list(range(1000))
class WeakCache:
def __init__(self):
self._cache = weakref.WeakValueDictionary()
def get_or_create(self, key):
obj = self._cache.get(key)
if obj is None:
obj = ExpensiveObject(key)
self._cache[key] = obj
return obj
cache = WeakCache()
obj1 = cache.get_or_create("widget_1")
assert "widget_1" in cache._cache
print("object in cache while alive")
del obj1
gc.collect()
assert "widget_1" not in cache._cache
print("object gone from cache after GC")
Solution
import weakref
import gc
class ExpensiveObject:
def __init__(self, key):
self.key = key
self.data = list(range(1000))
class WeakCache:
def __init__(self):
self._cache = weakref.WeakValueDictionary()
def get_or_create(self, key):
obj = self._cache.get(key)
if obj is None:
obj = ExpensiveObject(key)
self._cache[key] = obj
return obj
cache = WeakCache()
obj1 = cache.get_or_create("widget_1")
assert "widget_1" in cache._cache
print("object in cache while alive")
del obj1
gc.collect()
assert "widget_1" not in cache._cache
print("object gone from cache after GC")
Weak reference cache:
WeakValueDictionaryholds weak references — GC can collect values when no strong references exist.- Use case: object registries and flyweight patterns — avoid re-creating live objects, but allow freeing when unused.
weakref.ref(obj)creates a single weak reference; call it to get the object orNoneif collected.- Limitation: values must be weakly referenceable — most custom class instances are, but
int,str,tupleare not.
Expected Output
object in cache while alive\nobject gone from cache after GCHints
Hint 1: weakref.WeakValueDictionary holds weak references — the GC can collect values even while the cache holds them.
Hint 2: After del obj and gc.collect(), the entry disappears from WeakValueDictionary automatically.
Compare cold-start vs pre-warmed cache hit rates on a Zipfian workload (a few keys dominate access).
import functools
import random
from collections import Counter
@functools.lru_cache(maxsize=50)
def slow_compute(key):
return key * key + key
def zipfian_workload(n_requests, n_keys, seed=42):
rng = random.Random(seed)
weights = [1.0 / (i + 1) for i in range(n_keys)]
total = sum(weights)
probs = [w / total for w in weights]
return rng.choices(list(range(n_keys)), weights=probs, k=n_requests)
def hit_rate(fn):
info = fn.cache_info()
total = info.hits + info.misses
return info.hits / total * 100 if total > 0 else 0.0
workload = zipfian_workload(1000, 200, seed=42)
# 1. Measure cold start hit rate on workload
# 2. Clear cache, warm with top 20 keys, re-run same workload, measure hit rate
# 3. Print improvement
Solution
import functools
import random
from collections import Counter
@functools.lru_cache(maxsize=50)
def slow_compute(key):
return key * key + key
def zipfian_workload(n_requests, n_keys, seed=42):
rng = random.Random(seed)
weights = [1.0 / (i + 1) for i in range(n_keys)]
total = sum(weights)
probs = [w / total for w in weights]
return rng.choices(list(range(n_keys)), weights=probs, k=n_requests)
def hit_rate(fn):
info = fn.cache_info()
total = info.hits + info.misses
return info.hits / total * 100 if total > 0 else 0.0
workload = zipfian_workload(1000, 200, seed=42)
# Cold start
slow_compute.cache_clear()
for k in workload:
slow_compute(k)
cold_hr = hit_rate(slow_compute)
# Warm with top 20 keys from workload analysis
top_keys = [k for k, _ in Counter(workload).most_common(20)]
slow_compute.cache_clear()
for k in top_keys:
slow_compute(k)
for k in workload:
slow_compute(k)
warm_hr = hit_rate(slow_compute)
improvement = warm_hr - cold_hr
print(f"cold start hit rate: {cold_hr:.1f}%")
print(f"warmed hit rate: {warm_hr:.1f}%")
print(f"warming improved hit rate by {improvement:.1f}%")
Cache warming strategy:
- Identify hot keys from access logs or Zipf analysis.
- Pre-populate at startup before serving live traffic.
- For Zipfian workloads, warming the top 10% of keys by frequency captures ~80% of accesses.
- Production: load top-N keys from a snapshot at service startup, before the health check passes.
Expected Output
cold start hit rate: X%\nwarmed hit rate: Y%\nwarming improved hit rate by Z%Hints
Hint 1: Pre-populate the cache with the most frequently accessed keys before running the workload.
Hint 2: Measure hit rate on the same workload with and without warming to quantify the benefit.
Implement LRU and LFU caches and compare their hit rates on a recency-biased workload.
from collections import OrderedDict, defaultdict
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = OrderedDict()
def get(self, key):
if key not in self._cache:
return None
self._cache.move_to_end(key)
return self._cache[key]
def put(self, key, value):
if key in self._cache:
self._cache.move_to_end(key)
self._cache[key] = value
if len(self._cache) > self.capacity:
self._cache.popitem(last=False)
class LFUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = {}
self._freq = defaultdict(int)
self._min_freq = 0
self._freq_map = defaultdict(OrderedDict)
def _inc(self, key):
f = self._freq[key]
self._freq[key] = f + 1
self._freq_map[f].pop(key, None)
if self._min_freq == f and not self._freq_map[f]:
self._min_freq = f + 1
self._freq_map[f + 1][key] = None
def get(self, key):
if key not in self._cache:
return None
self._inc(key)
return self._cache[key]
def put(self, key, value):
if self.capacity <= 0:
return
if key in self._cache:
self._cache[key] = value
self._inc(key)
return
if len(self._cache) >= self.capacity:
evict, _ = self._freq_map[self._min_freq].popitem(last=False)
del self._cache[evict]
del self._freq[evict]
self._cache[key] = value
self._freq[key] = 1
self._freq_map[1][key] = None
self._min_freq = 1
def simulate(cache_cls, capacity, workload):
cache = cache_cls(capacity)
hits = 0
for key in workload:
if cache.get(key) is not None:
hits += 1
else:
cache.put(key, key)
return hits / len(workload) * 100
import random
rng = random.Random(0)
workload = []
for window in range(10):
base = window * 5
workload += [base + rng.randint(0, 4) for _ in range(100)]
cap = 10
lru_hr = simulate(LRUCache, cap, workload)
lfu_hr = simulate(LFUCache, cap, workload)
better = "LRU" if lru_hr >= lfu_hr else "LFU"
print(f"LRU hit rate: {lru_hr:.1f}%")
print(f"LFU hit rate: {lfu_hr:.1f}%")
print(f"better policy for this workload: {better}")
Solution
from collections import OrderedDict, defaultdict
import random
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = OrderedDict()
def get(self, key):
if key not in self._cache:
return None
self._cache.move_to_end(key)
return self._cache[key]
def put(self, key, value):
if key in self._cache:
self._cache.move_to_end(key)
self._cache[key] = value
if len(self._cache) > self.capacity:
self._cache.popitem(last=False)
class LFUCache:
def __init__(self, capacity):
self.capacity = capacity
self._cache = {}
self._freq = defaultdict(int)
self._min_freq = 0
self._freq_map = defaultdict(OrderedDict)
def _inc(self, key):
f = self._freq[key]
self._freq[key] = f + 1
self._freq_map[f].pop(key, None)
if self._min_freq == f and not self._freq_map[f]:
self._min_freq = f + 1
self._freq_map[f + 1][key] = None
def get(self, key):
if key not in self._cache:
return None
self._inc(key)
return self._cache[key]
def put(self, key, value):
if self.capacity <= 0:
return
if key in self._cache:
self._cache[key] = value
self._inc(key)
return
if len(self._cache) >= self.capacity:
evict, _ = self._freq_map[self._min_freq].popitem(last=False)
del self._cache[evict]
del self._freq[evict]
self._cache[key] = value
self._freq[key] = 1
self._freq_map[1][key] = None
self._min_freq = 1
def simulate(cache_cls, capacity, workload):
cache = cache_cls(capacity)
hits = 0
for key in workload:
if cache.get(key) is not None:
hits += 1
else:
cache.put(key, key)
return hits / len(workload) * 100
rng = random.Random(0)
workload = []
for window in range(10):
base = window * 5
workload += [base + rng.randint(0, 4) for _ in range(100)]
cap = 10
lru_hr = simulate(LRUCache, cap, workload)
lfu_hr = simulate(LFUCache, cap, workload)
better = "LRU" if lru_hr >= lfu_hr else "LFU"
print(f"LRU hit rate: {lru_hr:.1f}%")
print(f"LFU hit rate: {lfu_hr:.1f}%")
print(f"better policy for this workload: {better}")
LRU vs LFU tradeoffs:
- LRU: best for temporal locality — recent accesses predict future ones.
- LFU: best for stable hot-set workloads — popular items stay popular indefinitely.
- Recency-biased workload (window shifts): LRU wins because it adapts to new hot keys; LFU clings to old popular keys.
- LFU drawback: frequency counter pollution — old high-frequency keys take long to evict after becoming cold.
- Production: Redis uses probabilistic LRU approximation (samples 5–10 random keys). ARC and LIRS algorithms combine LRU + LFU benefits.
Expected Output
LRU hit rate: X%\nLFU hit rate: Y%\nbetter policy for this workload: LRU/LFUHints
Hint 1: LRU evicts the least recently used entry. LFU evicts the least frequently used entry.
Hint 2: LFU excels on stable hot-set workloads. LRU excels on recency-based (scanning) workloads.
