Python Caching Strategies Practice Problems & Exercises

Practice: Caching Strategies

11 problems4 Easy4 Medium3 Hard⏱ 70–100 min

Easy

#1functools.lru_cache BasicsEasy

lru_cachefunctoolsmemoizationfibonacci

Apply @functools.lru_cache to the recursive Fibonacci function and inspect the cache hit statistics.

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(35)
info = fib.cache_info()
print(f"fib(35) = {result}")
print(f"cache_info: hits={info.hits} misses={info.misses}")

Solution

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(35)
info = fib.cache_info()
print(f"fib(35) = {result}")
print(f"cache_info: hits={info.hits} misses={info.misses}")

lru_cache mechanics:

maxsize=None disables eviction — the cache grows unboundedly.
fib(35) makes 36 unique calls (misses) and 33 repeated calls (hits).
Without caching: 29.8M recursive calls for fib(35). With caching: 36 unique calls.
cache_info() returns CacheInfo(hits, misses, maxsize, currsize).
Call fn.cache_clear() between independent test cases to reset state.

Expected Output

fib(35) = 9227465\ncache_info: hits=33 misses=36

Hints

Hint 1: Apply @functools.lru_cache(maxsize=None) to a recursive function to memoize results.

Hint 2: Call fn.cache_info() after the computation to see hits, misses, maxsize, and currsize.

#2functools.cache — Python 3.9+ Unbounded CacheEasy

functools.cachememoizationpython39unbounded

Use @functools.cache to memoize a Collatz sequence length function, then verify the cache size.

import functools

@functools.cache
def collatz_length(n):
    """Count steps to reach 1 in the Collatz sequence."""
    if n == 1:
        return 1
    if n % 2 == 0:
        return 1 + collatz_length(n // 2)
    return 1 + collatz_length(3 * n + 1)

results = [collatz_length(i) for i in range(1, 11)]
info = collatz_length.cache_info()
print(f"result cached correctly")
print(f"cache size: {info.currsize}")

Solution

import functools

@functools.cache
def collatz_length(n):
    if n == 1:
        return 1
    if n % 2 == 0:
        return 1 + collatz_length(n // 2)
    return 1 + collatz_length(3 * n + 1)

results = [collatz_length(i) for i in range(1, 11)]
info = collatz_length.cache_info()

assert results == [1, 2, 8, 3, 6, 9, 17, 4, 20, 7], f"unexpected: {results}"
print(f"result cached correctly")
print(f"cache size: {info.currsize}")

functools.cache vs lru_cache:

@functools.cache (Python 3.9+) is lru_cache(maxsize=None) without LRU tracking overhead.
Slightly faster for unbounded caches — no doubly-linked list is maintained.
Use lru_cache(maxsize=N) when you need eviction to bound memory usage.
Use cache when the input space is small and bounded.

Expected Output

result cached correctly\ncache size: 11

Hints

Hint 1: functools.cache is equivalent to lru_cache(maxsize=None) but slightly faster due to no eviction tracking.

Hint 2: Use cache_info().currsize to check how many entries are stored.

#3Manual Memoization with a DictionaryEasy

memoizationdictmanualclosure

Implement manual memoization using a dictionary and a call counter to quantify how many redundant computations are avoided.

def make_memoized_fib():
    cache = {}
    calls_saved = [0]

    def fib(n):
        if n in cache:
            calls_saved[0] += 1
            return cache[n]
        if n <= 1:
            result = n
        else:
            result = fib(n - 1) + fib(n - 2)
        cache[n] = result
        return result

    return fib, calls_saved, cache

fib, calls_saved, cache = make_memoized_fib()
result = fib(20)
print(f"computed: {result}")
print(f"cache hits saved {calls_saved[0]} calls")

Solution

def make_memoized_fib():
    cache = {}
    calls_saved = [0]

    def fib(n):
        if n in cache:
            calls_saved[0] += 1
            return cache[n]
        if n <= 1:
            result = n
        else:
            result = fib(n - 1) + fib(n - 2)
        cache[n] = result
        return result

    return fib, calls_saved, cache

fib, calls_saved, cache = make_memoized_fib()
result = fib(20)
assert result == 6765
print(f"computed: {result}")
print(f"cache hits saved {calls_saved[0]} calls")

Manual memoization advantages:

Full control: inspect the cache, serialize it, warm it with precomputed values.
Can cache unhashable arguments by converting to a cache key (e.g., tuple(sorted(d.items()))).
Use a closure to keep the cache private — prevents external mutation.
Limitation: must implement invalidation manually; lru_cache handles this via cache_clear().

Expected Output

computed: 6765\ncache hits saved 17 calls

Hints

Hint 1: Check if the argument is in the cache dict before computing. Store and return the result.

Hint 2: Track a calls_saved counter: increment it every time a cached value is returned.

#4Cache Hit Rate AnalysisEasy

cache-hit-ratelru_cachecache_infoanalysis

Compute the cache hit rate of an lru_cache-wrapped function over 1000 lookups with keys drawn from a small pool.

import functools
import random

@functools.lru_cache(maxsize=128)
def expensive_lookup(key):
    return sum(range(key * 100))

random.seed(42)
keys = [random.randint(0, 19) for _ in range(1000)]
for k in keys:
    expensive_lookup(k)

info = expensive_lookup.cache_info()
# Compute hit rate and print:
# "hit rate: X%"
# "cache is effective: True" if hit_rate > 80% else "cache is effective: False"

Solution

import functools
import random

@functools.lru_cache(maxsize=128)
def expensive_lookup(key):
    return sum(range(key * 100))

random.seed(42)
keys = [random.randint(0, 19) for _ in range(1000)]
for k in keys:
    expensive_lookup(k)

info = expensive_lookup.cache_info()
total = info.hits + info.misses
hit_rate = info.hits / total * 100 if total > 0 else 0.0

print(f"hits={info.hits}  misses={info.misses}  currsize={info.currsize}")
print(f"hit rate: {hit_rate:.1f}%")
print(f"cache is effective: {hit_rate > 80}")

Cache effectiveness criteria:

Hit rate above 80% means the cache provides significant benefit.
Hit rate below 50% means most calls are misses — the cache overhead may not be worth it.
This workload uses 20 unique keys with maxsize=128 — all fit, so after 20 calls the hit rate approaches 100%.
For real workloads, measure actual hit rates before assuming a cache helps.

Expected Output

hit rate: X%\ncache is effective: True

Hints

Hint 1: hit_rate = hits / (hits + misses) * 100

Hint 2: A hit rate above 80% generally means the cache is providing good value.

Medium

#5TTL-Based CacheMedium

TTLcachetime-to-liveexpiry

Implement a TTLCache class that expires entries after a configurable time-to-live.

import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._cache = {}

    def get(self, key):
        """Return cached value or None if missing/expired."""
        pass

    def set(self, key, value):
        """Store value with an expiry timestamp."""
        pass

    def __contains__(self, key):
        """True if key exists and has not expired."""
        pass

cache = TTLCache(ttl_seconds=0.1)

def fetch(key):
    if key in cache:
        print(f"{key}: cached")
        return cache.get(key)
    print(f"{key}: computed")
    cache.set(key, key * 100)
    return key * 100

fetch("x")           # computed
fetch("x")           # cached
time.sleep(0.15)
fetch("x")           # computed again (TTL expired)

Solution

import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            value, expiry = self._cache[key]
            if time.time() <= expiry:
                return value
            del self._cache[key]
        return None

    def set(self, key, value):
        self._cache[key] = (value, time.time() + self.ttl)

    def __contains__(self, key):
        if key in self._cache:
            _, expiry = self._cache[key]
            if time.time() <= expiry:
                return True
            del self._cache[key]
        return False

cache = TTLCache(ttl_seconds=0.1)

def fetch(key):
    if key in cache:
        print(f"{key}: cached")
        return cache.get(key)
    print(f"{key}: computed")
    cache.set(key, key * 100)
    return key * 100

fetch("x")
fetch("x")
time.sleep(0.15)
fetch("x")

TTL cache design:

Store (value, expiry_timestamp) tuples.
Lazy expiration: check on access only — avoids background thread complexity.
Production alternative: cachetools.TTLCache(maxsize=1000, ttl=300) with thread safety.
Background cleanup needed for memory-sensitive applications that cannot tolerate lazy expiration.

Expected Output

first call: computed\nsecond call (within TTL): cached\nthird call (TTL expired): computed

Hints

Hint 1: Store (value, expiry_time) in the cache. On get, check if time.time() > expiry_time and recompute.

Hint 2: time.time() returns seconds since epoch — add the TTL in seconds to get the expiry timestamp.

#6LRU Cache from Scratch with OrderedDictMedium

LRUOrderedDictevictionfrom-scratch

Implement an LRU cache using collections.OrderedDict and verify that the least-recently-used entry is evicted when capacity is exceeded.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        """Return value or -1 if not found. Promote on hit."""
        pass

    def put(self, key, value):
        """Insert or update. Evict LRU when at capacity."""
        pass

lru = LRUCache(3)
lru.put("a", 1)
lru.put("b", 2)
lru.put("c", 3)
lru.get("a")       # access "a" — now most recently used
lru.put("d", 4)    # capacity exceeded: "b" is LRU, should be evicted

assert lru.get("b") == -1
assert lru.get("a") == 1
assert lru.get("c") == 3
assert lru.get("d") == 4
print("LRU eviction verified: oldest entry evicted first")

Solution

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        if key not in self._cache:
            return -1
        self._cache.move_to_end(key)
        return self._cache[key]

    def put(self, key, value):
        if key in self._cache:
            self._cache.move_to_end(key)
        self._cache[key] = value
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)

lru = LRUCache(3)
lru.put("a", 1)
lru.put("b", 2)
lru.put("c", 3)
lru.get("a")
lru.put("d", 4)

assert lru.get("b") == -1
assert lru.get("a") == 1
assert lru.get("c") == 3
assert lru.get("d") == 4
print("LRU eviction verified: oldest entry evicted first")

LRU implementation:

move_to_end(key) moves the key to position last (most recently used).
popitem(last=False) removes the first (oldest/least recently used) item.
get must call move_to_end on a hit — that is what makes it LRU.
O(1) for both get and put due to OrderedDict's O(1) operations.

Expected Output

LRU eviction verified: oldest entry evicted first

Hints

Hint 1: Use collections.OrderedDict — move_to_end() promotes a key to most-recently-used position.

Hint 2: When cache exceeds maxsize, call popitem(last=False) to remove the LRU (first) entry.

#7Cache Invalidation with Dependency TrackingMedium

cache-invalidationdependencytag-based

Implement a cache with explicit dependency tracking so that changing source data invalidates all dependent cached results.

class DependencyCache:
    def __init__(self):
        self._cache = {}
        self._deps  = {}   # source_key -> set of cache_keys

    def put(self, key, value, depends_on=None):
        self._cache[key] = value
        for src in (depends_on or []):
            self._deps.setdefault(src, set()).add(key)

    def get(self, key):
        return self._cache.get(key)

    def invalidate_source(self, source_key):
        """Remove all cache entries that depend on source_key."""
        pass

cache = DependencyCache()
cache.put("report_2024", "Revenue: $1M", depends_on=["sales_data"])
cache.put("chart_2024",  "Chart HTML",   depends_on=["sales_data", "config"])
cache.put("static_help", "Help text",    depends_on=[])

cache.invalidate_source("sales_data")

assert cache.get("report_2024") is None, "should be invalidated"
assert cache.get("chart_2024")  is None, "should be invalidated"
assert cache.get("static_help") == "Help text", "should NOT be invalidated"
print("stale value detected and cache invalidated correctly")

Solution

class DependencyCache:
    def __init__(self):
        self._cache = {}
        self._deps  = {}

    def put(self, key, value, depends_on=None):
        self._cache[key] = value
        for src in (depends_on or []):
            self._deps.setdefault(src, set()).add(key)

    def get(self, key):
        return self._cache.get(key)

    def invalidate_source(self, source_key):
        for cache_key in self._deps.pop(source_key, set()):
            self._cache.pop(cache_key, None)

cache = DependencyCache()
cache.put("report_2024", "Revenue: $1M", depends_on=["sales_data"])
cache.put("chart_2024",  "Chart HTML",   depends_on=["sales_data", "config"])
cache.put("static_help", "Help text",    depends_on=[])

cache.invalidate_source("sales_data")

assert cache.get("report_2024") is None
assert cache.get("chart_2024")  is None
assert cache.get("static_help") == "Help text"
print("stale value detected and cache invalidated correctly")

Cache invalidation patterns:

Tag-based invalidation is scalable: one source change invalidates N dependents in O(N).
Alternative: version-based invalidation — store a version number per source; cache entries include the version at computation time.
Redis pattern: tag keys with SADD tag:sales_data report_2024 chart_2024, then SUNIONSTORE + DEL on invalidation.

Expected Output

stale value detected and cache invalidated correctly

Hints

Hint 1: Track which cache entries depend on which source data using a dict: source_key -> set of cache_keys.

Hint 2: When source data changes, call invalidate_source() to remove all dependent entries.

#8Two-Level Cache — L1 In-Process, L2 SharedMedium

two-level-cacheL1L2promotioncache-hierarchy

Implement a two-level cache where L1 is a 3-entry LRU and L2 is a 20-entry dict, with automatic promotion on L2 hits.

from collections import OrderedDict

class TwoLevelCache:
    def __init__(self, l1_size=3, l2_size=20):
        self.l1 = OrderedDict()
        self.l2 = {}
        self.l1_size = l1_size
        self.l2_size = l2_size

    def _l1_put(self, key, value):
        if key in self.l1:
            self.l1.move_to_end(key)
        self.l1[key] = value
        if len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)

    def get(self, key):
        """L1 hit -> return. L2 hit -> promote to L1, return. Miss -> None."""
        pass

    def put(self, key, value):
        """Store in both L1 and L2."""
        pass

cache = TwoLevelCache(l1_size=3, l2_size=20)
for i in range(10):
    cache.l2[f"k{i}"] = i * 10

cache._l1_put("k0", 0)
assert cache.get("k0") == 0
print("L1 hit")

assert cache.get("k5") == 50
assert "k5" in cache.l1
print("L2 hit (promoted to L1)")

assert cache.get("missing") is None
print("L1 miss L2 miss")

Solution

from collections import OrderedDict

class TwoLevelCache:
    def __init__(self, l1_size=3, l2_size=20):
        self.l1 = OrderedDict()
        self.l2 = {}
        self.l1_size = l1_size
        self.l2_size = l2_size

    def _l1_put(self, key, value):
        if key in self.l1:
            self.l1.move_to_end(key)
        self.l1[key] = value
        if len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)

    def get(self, key):
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        if key in self.l2:
            value = self.l2[key]
            self._l1_put(key, value)  # promote
            return value
        return None

    def put(self, key, value):
        self._l1_put(key, value)
        if len(self.l2) < self.l2_size:
            self.l2[key] = value

cache = TwoLevelCache(l1_size=3, l2_size=20)
for i in range(10):
    cache.l2[f"k{i}"] = i * 10

cache._l1_put("k0", 0)
assert cache.get("k0") == 0
print("L1 hit")

assert cache.get("k5") == 50
assert "k5" in cache.l1
print("L2 hit (promoted to L1)")

assert cache.get("missing") is None
print("L1 miss L2 miss")

Two-level cache design:

L1 = in-process dict/LRU — nanosecond latency. L2 = Redis/memcached — microsecond latency.
Promotion: on L2 hit, copy to L1 so subsequent accesses are L1 hits.
Cache warming: pre-populate L1 with the most frequently accessed keys at startup.
In production: L1 is per-process (not shared), L2 is shared across all service instances.

Expected Output

L1 hit\nL2 hit (promoted to L1)\nL1 miss L2 miss

Hints

Hint 1: L1 is a small fast LRU (OrderedDict). L2 is a larger dict simulating Redis/memcached.

Hint 2: On L1 miss, check L2. On L2 hit, promote to L1. On both miss, return None.

Hard

#9Weak-Reference Cache — Avoid Preventing GCHard

weakrefWeakValueDictionaryGCmemory

Implement a cache using weakref.WeakValueDictionary that allows the GC to collect objects even while cached.

import weakref
import gc

class ExpensiveObject:
    def __init__(self, key):
        self.key  = key
        self.data = list(range(1000))

class WeakCache:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()

    def get_or_create(self, key):
        obj = self._cache.get(key)
        if obj is None:
            obj = ExpensiveObject(key)
            self._cache[key] = obj
        return obj

cache = WeakCache()
obj1 = cache.get_or_create("widget_1")
assert "widget_1" in cache._cache
print("object in cache while alive")

del obj1
gc.collect()

assert "widget_1" not in cache._cache
print("object gone from cache after GC")

Solution

import weakref
import gc

class ExpensiveObject:
    def __init__(self, key):
        self.key  = key
        self.data = list(range(1000))

class WeakCache:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()

    def get_or_create(self, key):
        obj = self._cache.get(key)
        if obj is None:
            obj = ExpensiveObject(key)
            self._cache[key] = obj
        return obj

cache = WeakCache()
obj1 = cache.get_or_create("widget_1")
assert "widget_1" in cache._cache
print("object in cache while alive")

del obj1
gc.collect()

assert "widget_1" not in cache._cache
print("object gone from cache after GC")

Weak reference cache:

WeakValueDictionary holds weak references — GC can collect values when no strong references exist.
Use case: object registries and flyweight patterns — avoid re-creating live objects, but allow freeing when unused.
weakref.ref(obj) creates a single weak reference; call it to get the object or None if collected.
Limitation: values must be weakly referenceable — most custom class instances are, but int, str, tuple are not.

Expected Output

object in cache while alive\nobject gone from cache after GC

Hints

Hint 1: weakref.WeakValueDictionary holds weak references — the GC can collect values even while the cache holds them.

Hint 2: After del obj and gc.collect(), the entry disappears from WeakValueDictionary automatically.

#10Cache Warming for Cold StartHard

cache-warminglru_cachecold-startZipf

Compare cold-start vs pre-warmed cache hit rates on a Zipfian workload (a few keys dominate access).

import functools
import random
from collections import Counter

@functools.lru_cache(maxsize=50)
def slow_compute(key):
    return key * key + key

def zipfian_workload(n_requests, n_keys, seed=42):
    rng = random.Random(seed)
    weights = [1.0 / (i + 1) for i in range(n_keys)]
    total   = sum(weights)
    probs   = [w / total for w in weights]
    return rng.choices(list(range(n_keys)), weights=probs, k=n_requests)

def hit_rate(fn):
    info = fn.cache_info()
    total = info.hits + info.misses
    return info.hits / total * 100 if total > 0 else 0.0

workload = zipfian_workload(1000, 200, seed=42)

# 1. Measure cold start hit rate on workload
# 2. Clear cache, warm with top 20 keys, re-run same workload, measure hit rate
# 3. Print improvement

Solution

import functools
import random
from collections import Counter

@functools.lru_cache(maxsize=50)
def slow_compute(key):
    return key * key + key

def zipfian_workload(n_requests, n_keys, seed=42):
    rng = random.Random(seed)
    weights = [1.0 / (i + 1) for i in range(n_keys)]
    total   = sum(weights)
    probs   = [w / total for w in weights]
    return rng.choices(list(range(n_keys)), weights=probs, k=n_requests)

def hit_rate(fn):
    info = fn.cache_info()
    total = info.hits + info.misses
    return info.hits / total * 100 if total > 0 else 0.0

workload = zipfian_workload(1000, 200, seed=42)

# Cold start
slow_compute.cache_clear()
for k in workload:
    slow_compute(k)
cold_hr = hit_rate(slow_compute)

# Warm with top 20 keys from workload analysis
top_keys = [k for k, _ in Counter(workload).most_common(20)]
slow_compute.cache_clear()
for k in top_keys:
    slow_compute(k)

for k in workload:
    slow_compute(k)
warm_hr = hit_rate(slow_compute)

improvement = warm_hr - cold_hr
print(f"cold start hit rate: {cold_hr:.1f}%")
print(f"warmed hit rate: {warm_hr:.1f}%")
print(f"warming improved hit rate by {improvement:.1f}%")

Cache warming strategy:

Identify hot keys from access logs or Zipf analysis.
Pre-populate at startup before serving live traffic.
For Zipfian workloads, warming the top 10% of keys by frequency captures ~80% of accesses.
Production: load top-N keys from a snapshot at service startup, before the health check passes.

Expected Output

cold start hit rate: X%\nwarmed hit rate: Y%\nwarming improved hit rate by Z%

Hints

Hint 1: Pre-populate the cache with the most frequently accessed keys before running the workload.

Hint 2: Measure hit rate on the same workload with and without warming to quantify the benefit.

#11LRU vs LFU Eviction Policy ComparisonHard

eviction-policyLRULFUcomparisonhit-rate

Implement LRU and LFU caches and compare their hit rates on a recency-biased workload.

from collections import OrderedDict, defaultdict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        if key not in self._cache:
            return None
        self._cache.move_to_end(key)
        return self._cache[key]

    def put(self, key, value):
        if key in self._cache:
            self._cache.move_to_end(key)
        self._cache[key] = value
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)

class LFUCache:
    def __init__(self, capacity):
        self.capacity  = capacity
        self._cache    = {}
        self._freq     = defaultdict(int)
        self._min_freq = 0
        self._freq_map = defaultdict(OrderedDict)

    def _inc(self, key):
        f = self._freq[key]
        self._freq[key] = f + 1
        self._freq_map[f].pop(key, None)
        if self._min_freq == f and not self._freq_map[f]:
            self._min_freq = f + 1
        self._freq_map[f + 1][key] = None

    def get(self, key):
        if key not in self._cache:
            return None
        self._inc(key)
        return self._cache[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self._cache:
            self._cache[key] = value
            self._inc(key)
            return
        if len(self._cache) >= self.capacity:
            evict, _ = self._freq_map[self._min_freq].popitem(last=False)
            del self._cache[evict]
            del self._freq[evict]
        self._cache[key] = value
        self._freq[key]  = 1
        self._freq_map[1][key] = None
        self._min_freq = 1

def simulate(cache_cls, capacity, workload):
    cache = cache_cls(capacity)
    hits = 0
    for key in workload:
        if cache.get(key) is not None:
            hits += 1
        else:
            cache.put(key, key)
    return hits / len(workload) * 100

import random
rng = random.Random(0)
workload = []
for window in range(10):
    base = window * 5
    workload += [base + rng.randint(0, 4) for _ in range(100)]

cap = 10
lru_hr = simulate(LRUCache, cap, workload)
lfu_hr = simulate(LFUCache, cap, workload)
better = "LRU" if lru_hr >= lfu_hr else "LFU"

print(f"LRU hit rate: {lru_hr:.1f}%")
print(f"LFU hit rate: {lfu_hr:.1f}%")
print(f"better policy for this workload: {better}")

Solution

from collections import OrderedDict, defaultdict
import random

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        if key not in self._cache:
            return None
        self._cache.move_to_end(key)
        return self._cache[key]

    def put(self, key, value):
        if key in self._cache:
            self._cache.move_to_end(key)
        self._cache[key] = value
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)

class LFUCache:
    def __init__(self, capacity):
        self.capacity  = capacity
        self._cache    = {}
        self._freq     = defaultdict(int)
        self._min_freq = 0
        self._freq_map = defaultdict(OrderedDict)

    def _inc(self, key):
        f = self._freq[key]
        self._freq[key] = f + 1
        self._freq_map[f].pop(key, None)
        if self._min_freq == f and not self._freq_map[f]:
            self._min_freq = f + 1
        self._freq_map[f + 1][key] = None

    def get(self, key):
        if key not in self._cache:
            return None
        self._inc(key)
        return self._cache[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self._cache:
            self._cache[key] = value
            self._inc(key)
            return
        if len(self._cache) >= self.capacity:
            evict, _ = self._freq_map[self._min_freq].popitem(last=False)
            del self._cache[evict]
            del self._freq[evict]
        self._cache[key] = value
        self._freq[key]  = 1
        self._freq_map[1][key] = None
        self._min_freq = 1

def simulate(cache_cls, capacity, workload):
    cache = cache_cls(capacity)
    hits = 0
    for key in workload:
        if cache.get(key) is not None:
            hits += 1
        else:
            cache.put(key, key)
    return hits / len(workload) * 100

rng = random.Random(0)
workload = []
for window in range(10):
    base = window * 5
    workload += [base + rng.randint(0, 4) for _ in range(100)]

cap = 10
lru_hr = simulate(LRUCache, cap, workload)
lfu_hr = simulate(LFUCache, cap, workload)
better = "LRU" if lru_hr >= lfu_hr else "LFU"

print(f"LRU hit rate: {lru_hr:.1f}%")
print(f"LFU hit rate: {lfu_hr:.1f}%")
print(f"better policy for this workload: {better}")

LRU vs LFU tradeoffs:

LRU: best for temporal locality — recent accesses predict future ones.
LFU: best for stable hot-set workloads — popular items stay popular indefinitely.
Recency-biased workload (window shifts): LRU wins because it adapts to new hot keys; LFU clings to old popular keys.
LFU drawback: frequency counter pollution — old high-frequency keys take long to evict after becoming cold.
Production: Redis uses probabilistic LRU approximation (samples 5–10 random keys). ARC and LIRS algorithms combine LRU + LFU benefits.

Expected Output

LRU hit rate: X%\nLFU hit rate: Y%\nbetter policy for this workload: LRU/LFU

Hints

Hint 1: LRU evicts the least recently used entry. LFU evicts the least frequently used entry.

Hint 2: LFU excels on stable hot-set workloads. LRU excels on recency-based (scanning) workloads.

Practice: Caching Strategies

Easy​

Medium​

Hard​

Easy

Medium

Hard