Cache Replacement Policies
This document summarizes common cache replacement (eviction) policies, their tradeoffs, and when to use (or avoid) each. It’s written as a practical companion to docs/design.md.
Implementation notes live in docs/policies/README.md.
Terminology used below:
- Admission: whether an item is allowed into cache at all (some “policies” combine admission + eviction).
- Eviction: which resident item to remove when making space.
- Scan pollution: one-time accesses (e.g., large scans) pushing out genuinely hot items.
- Metadata cost: extra per-entry memory and CPU needed to maintain the policy.
How To Choose (Quick Guidance)
Pick based on workload first:
- Strong temporal locality (hot keys repeat quickly): LRU / Clock usually works.
- One-hit-wonder dominated / scan-heavy: prefer scan-resistant policies like LRU-K, 2Q, or ARC.
- Frequency matters more than recency (hot keys repeat, but with long gaps): LFU (with aging) or hybrids.
- Need low overhead and simplicity: FIFO, Random, Clock.
- Need adaptivity across shifting workloads: ARC (or related adaptive policies).
If you can only implement one “general purpose” policy for mixed workloads, ARC-style adaptivity or LRU-K/2Q-style scan resistance usually beats plain LRU, at the cost of more metadata and complexity.
Policy Catalog (Summaries)
OPT / MIN (Belady’s Optimal)
Idea: evict the item whose next use is farthest in the future.
- Pros: best possible hit rate for a known future; gold-standard for evaluation.
- Cons: requires knowing the future; not implementable in real systems (except offline traces/simulators).
- Use when: benchmarking and comparing other policies on recorded traces.
- Avoid when: building a real cache.
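Because OPT needs the full future, it is only usable offline against a recorded trace. A minimal trace-driven sketch (function and variable names are illustrative, not from this repo) shows how it provides a hit-rate upper bound:

```python
def belady_hits(trace, capacity):
    """Belady's OPT over a recorded trace; returns the hit count.

    On a miss with a full cache, evicts the resident key whose next
    use lies farthest in the future (or never occurs again).
    Uses O(n^2) linear rescans for clarity -- fine for offline analysis.
    """
    cache, hits = set(), 0
    for i, key in enumerate(trace):
        if key in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            def next_use(k):
                try:
                    return trace.index(k, i + 1)  # next position of k
                except ValueError:
                    return float("inf")           # never referenced again
            cache.discard(max(cache, key=next_use))
        cache.add(key)
    return hits
```

Running a real policy over the same trace and comparing its hit count against this number quantifies how far that policy is from optimal.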
Random
Idea: evict a uniformly random resident item.
- Pros: trivial; very low overhead; surprisingly decent under some adversarial patterns.
- Cons: ignores locality; unstable hit rate; can evict hot items.
- Use when: you need the simplest possible eviction with constant overhead.
- Avoid when: locality exists and you can afford minimal tracking.
FIFO (First-In, First-Out)
Idea: evict the oldest inserted item.
- Pros: simple; O(1); predictable; low metadata.
- Cons: ignores reuse; can be very poor when early inserts stay hot.
- Use when: insert order correlates with staleness (e.g., streaming-ish workloads), or you want predictability.
- Avoid when: strong temporal locality; “old but hot” keys are common.
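FIFO's defining property is that a hit does not refresh an item's position. A small sketch using Python's `OrderedDict` (class name is illustrative) makes that visible:

```python
from collections import OrderedDict

class FIFOCache:
    """FIFO eviction: items leave in insertion order; reads never
    refresh an item's position."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        return self.data.get(key)  # no reordering on hit

    def put(self, key, value):
        if key in self.data:
            self.data[key] = value  # update in place, keep original slot
            return
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the oldest insert
        self.data[key] = value
```

Note the failure mode from the cons above: a key inserted early is evicted even if it was read moments ago.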
LIFO / FILO (Last-In, First-Out)
Idea: evict the most recently inserted item.
- Pros: can work for some cyclic/scan-like patterns where newest items are least likely to be reused.
- Cons: counterproductive under temporal locality; uncommon in general-purpose caches.
- Use when: you have evidence newest items are least reusable.
- Avoid when: typical request caches with recency locality.
LRU (Least Recently Used)
Idea: evict the item not accessed for the longest time.
- Pros: strong default for temporal locality; intuitive; stable.
- Cons: vulnerable to scan pollution; maintaining exact LRU can be costly under high concurrency.
- Use when: workloads have strong recency locality; you can tolerate metadata and updates on every access.
- Avoid when: large sequential scans are common; cache is highly contended and strict ordering is too expensive.
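An exact-LRU cache is a short sketch with Python's `OrderedDict`, which keeps keys in insertion/access order (names here are illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """Exact LRU: every hit moves the key to the most-recent end;
    eviction removes the least-recent end."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # refresh recency on every hit
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict least recently used
        self.data[key] = value
```

The per-hit `move_to_end` is exactly the metadata update cost mentioned above; under high concurrency that write-per-read is what makes strict LRU expensive.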
MRU (Most Recently Used)
Idea: evict the most recently accessed item.
- Pros: can outperform LRU for some “looping scan” patterns where the just-touched item won’t be reused soon.
- Cons: performs poorly for typical temporal locality.
- Use when: known cyclic access where the most-recently-used item is least likely to be reused next.
- Avoid when: you’re unsure; MRU is rarely a safe default.
Second-Chance / Clock
Idea: approximate LRU using a circular list and a referenced bit; give items a “second chance”.
- Pros: O(1) amortized; lower overhead than strict LRU; good concurrency properties in practice.
- Cons: approximation quality depends on implementation; still suffers from scan pollution in many forms.
- Use when: you want LRU-like behavior with cheaper metadata and fewer writes.
- Avoid when: you specifically need scan resistance or frequency awareness.
NRU (Not Recently Used)
Idea: evict an item whose “referenced” bit is not set; bits are periodically cleared (epochs).
- Pros: very low overhead; works well when you can batch/reset reference bits cheaply.
- Cons: coarse recency signal; behavior depends heavily on epoch length.
- Use when: you already have hardware/software reference bits or can cheaply track “touched this epoch”.
- Avoid when: you need tight recency ordering.
LFU (Least Frequently Used)
Idea: evict the item with the smallest access count.
- Pros: strong when popularity is stable and skewed; resists scan pollution better than LRU.
- Cons: “cache pollution by history” (once-hot items stick around); needs aging/decay to adapt; counters add overhead.
- Use when: hot items remain hot for long periods; frequency is the primary predictor.
- Avoid when: the hot set shifts quickly; you can’t implement decay/aging safely.
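A plain-LFU sketch (no aging; names illustrative) shows both the mechanics and the tie/overhead costs:

```python
import collections

class LFUCache:
    """Plain LFU: evict the key with the smallest access count.
    Ties are broken arbitrarily; no decay, so stale-hot items linger."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.count = collections.Counter()

    def get(self, key):
        if key not in self.data:
            return None
        self.count[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # O(n) min scan for clarity; real LFUs keep frequency buckets.
            victim = min(self.data, key=self.count.__getitem__)
            del self.data[victim]
            del self.count[victim]
        self.data[key] = value
        self.count[key] += 1
```

Note how a once-hot key keeps its high count forever here: that is exactly the "cache pollution by history" problem the aging variants below address.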
MFU (Most Frequently Used)
Idea: evict the most frequently used item.
- Pros: can work in specific “burst then never again” patterns (items that were heavily used are now “done”).
- Cons: usually the opposite of what you want; not a general-purpose choice.
- Use when: you have evidence “most frequent so far” implies “least likely to be reused now”.
- Avoid when: almost always.
Aging / Decayed LFU (LFU with time decay)
Idea: combine frequency with time so old counts lose influence (e.g., periodic halving, exponential decay).
- Pros: avoids LFU’s “stale hot” problem; adapts to changing popularity.
- Cons: more complexity; decay schedule can be tricky; still more metadata than LRU/Clock.
- Use when: you want frequency but with adaptivity to phase changes.
- Avoid when: extremely latency-sensitive hot paths where counter maintenance dominates.
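One simple decay schedule is periodic halving, counted in operations. This is a sketch under that assumption (the period, class, and names are illustrative, and real implementations usually amortize the halving):

```python
import collections

class DecayedLFUCache:
    """LFU whose counters are halved every `decay_every` operations,
    so old popularity loses influence over time."""
    def __init__(self, capacity, decay_every=1000):
        self.capacity = capacity
        self.decay_every = decay_every
        self.ops = 0
        self.data = {}
        self.count = collections.Counter()

    def _tick(self):
        self.ops += 1
        if self.ops % self.decay_every == 0:
            for k in self.count:
                self.count[k] //= 2  # periodic halving: old heat fades

    def get(self, key):
        self._tick()
        if key not in self.data:
            return None
        self.count[key] += 1
        return self.data[key]

    def put(self, key, value):
        self._tick()
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=self.count.__getitem__)
            del self.data[victim]
            del self.count[victim]
        self.data[key] = value
        self.count[key] += 1
```

In the usage below, a key that was hot early ("a") decays and loses to a currently hot key ("b"); plain LFU would have kept "a" instead.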
LRU-K
Idea: evict based on the K-th most recent access time (e.g., K=2 tracks the 2nd most recent touch).
- Pros: filters one-time accesses; much more scan-resistant than LRU.
- Cons: more metadata per entry; more expensive updates; needs careful implementation to stay O(1) in practice.
- Use when: mixed point-lookups + scans; DB buffer pools; workloads with many one-hit-wonders.
- Avoid when: you need the simplest possible policy or can’t afford per-entry history.
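An LRU-2 sketch (K=2; naive O(n) victim scan for clarity, names illustrative) shows the key idea: keys touched only once are ranked behind keys with a real second access, so one-hit-wonders lose first:

```python
import itertools

class LRU2Cache:
    """LRU-2 sketch: evict the key whose 2nd-most-recent access is
    oldest; keys seen only once lose before any twice-seen key."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.clock = itertools.count()  # logical timestamps
        self.data = {}
        self.hist = {}  # key -> (last access time, 2nd-to-last or None)

    def _touch(self, key):
        t = next(self.clock)
        last, _ = self.hist.get(key, (None, None))
        self.hist[key] = (t, last)

    def get(self, key):
        if key not in self.data:
            return None
        self._touch(key)
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            def rank(k):
                second = self.hist[k][1]
                return -1 if second is None else second  # one-timers first
            victim = min(self.data, key=rank)
            del self.data[victim]
            del self.hist[victim]
        self.data[key] = value
        self._touch(key)
```

Notice the scan resistance in the test: the one-time key "b" is evicted even though it is the most recent insert.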
2Q
Idea: use two queues: a short “probation” FIFO for new items and a main LRU for items that are accessed again.
- Pros: simple scan resistance; cheaper than LRU-K; widely used pattern.
- Cons: requires tuning queue sizes; still mainly recency-based once admitted to main queue.
- Use when: you want an easy scan-resistant upgrade over LRU.
- Avoid when: you can’t tolerate tuning knobs or workload changes dramatically.
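A simplified 2Q sketch (this omits the paper's A1out ghost queue; queue names follow the paper, everything else is illustrative):

```python
from collections import OrderedDict

class TwoQCache:
    """Simplified 2Q: new keys enter a FIFO probation queue (A1);
    a second access promotes them into the main LRU queue (Am)."""
    def __init__(self, a1_size, am_size):
        self.a1_size, self.am_size = a1_size, am_size
        self.a1 = OrderedDict()  # probation FIFO for first-timers
        self.am = OrderedDict()  # main LRU for proven-reused keys

    def get(self, key):
        if key in self.am:
            self.am.move_to_end(key)  # normal LRU refresh
            return self.am[key]
        if key in self.a1:
            value = self.a1.pop(key)  # reuse proves worth: promote
            if len(self.am) >= self.am_size:
                self.am.popitem(last=False)
            self.am[key] = value
            return value
        return None

    def put(self, key, value):
        if key in self.am:
            self.am[key] = value
            self.am.move_to_end(key)
            return
        if key in self.a1:
            self.a1[key] = value
            return
        if len(self.a1) >= self.a1_size:
            self.a1.popitem(last=False)  # scans age out of probation
        self.a1[key] = value
```

The scan resistance is visible in the test: a burst of one-time keys churns only the small probation queue and never touches the promoted hot key.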
SLRU (Segmented LRU)
Idea: split LRU into segments (e.g., probationary + protected); promotion requires reuse.
- Pros: reduces scan pollution; simple; common in practice.
- Cons: needs segment sizing; not as adaptive as ARC-style approaches.
- Use when: you want low-complexity scan resistance with LRU semantics.
- Avoid when: workload shifts require continual retuning.
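An SLRU sketch (two-segment variant; here promotion happens on reads and an overflowing protected segment demotes its LRU victim back to probation rather than dropping it; names illustrative):

```python
from collections import OrderedDict

class SLRUCache:
    """Segmented LRU: both segments are LRU-ordered. Reuse in probation
    promotes to protected; protected overflow demotes to probation."""
    def __init__(self, probation_size, protected_size):
        self.psize, self.ssize = probation_size, protected_size
        self.probation = OrderedDict()
        self.protected = OrderedDict()

    def _insert_probation(self, key, value):
        if len(self.probation) >= self.psize:
            self.probation.popitem(last=False)  # true eviction point
        self.probation[key] = value

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            if len(self.protected) >= self.ssize:
                # Demote the protected LRU victim instead of dropping it.
                old_key, old_val = self.protected.popitem(last=False)
                self._insert_probation(old_key, old_val)
            self.protected[key] = value
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        elif key in self.probation:
            self.probation[key] = value  # simplification: promote on get only
        else:
            self._insert_probation(key, value)
```

Scans can only displace probationary entries, so protected (proven-reused) keys survive them; the segment sizes are the tuning knob noted in the cons.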
ARC (Adaptive Replacement Cache)
Idea: adaptively balances recency vs frequency using two LRU lists plus “ghost” history lists to tune itself.
- Pros: strong across many workloads; self-tuning between scan resistance and frequency-ish behavior.
- Cons: more complex; more metadata (including ghost entries); harder to implement lock-efficiently.
- Use when: you need robust performance across shifting patterns and can afford complexity.
- Avoid when: memory overhead must be minimal or implementation complexity is a hard constraint.
CAR (Clock with Adaptive Replacement)
Idea: ARC-like adaptivity but with Clock structures to reduce overhead.
- Pros: retains ARC’s adaptivity with lower overhead in some implementations.
- Cons: still complex; behavior depends on details.
- Use when: you want ARC-like behavior but prefer Clock-style mechanics.
- Avoid when: you need simplicity or have no room for ghost/history metadata.
LIRS (Low Inter-reference Recency Set)
Idea: use inter-reference recency (distance between repeated touches) to classify and protect frequently reused items.
- Pros: excellent scan resistance in many workloads; strong theoretical grounding.
- Cons: implementation complexity; metadata overhead; harder to explain/debug than LRU variants.
- Use when: you can invest in a high-quality scan-resistant policy for DB-like workloads.
- Avoid when: you need a small, simple policy surface.
CLOCK-Pro
Idea: Clock-based policy that differentiates hot/cold pages and tracks recent history to handle scans better than Clock.
- Pros: good scan resistance with Clock mechanics; practical for OS/DB buffer caches.
- Cons: more complex than Clock; tuning/implementation details matter.
- Use when: you want better-than-Clock scan handling without full ARC machinery.
- Avoid when: you want the simplest possible eviction logic.
Size/Cost-Aware Policies (GDS / GDSF family)
Idea: evict based on a “value” score that accounts for retrieval cost and/or object size (common in web caches).
- Pros: optimizes for byte hit rate or cost-weighted hit rate; better than LRU when object sizes vary widely.
- Cons: more bookkeeping; needs cost/size signals; can be less intuitive.
- Use when: objects have large size variance; misses have heterogeneous cost (e.g., network fetch cost).
- Avoid when: costs are uniform and you only care about request hit rate.
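A GreedyDual-Size sketch: each object's priority is H = L + cost/size, and the global inflation value L rises to each evicted object's H, so surviving objects must keep re-earning their place. The lazy-deletion heap and all names here are illustrative implementation choices, not from the paper:

```python
import heapq

class GreedyDualSizeCache:
    """GreedyDual-Size sketch: priority H = L + cost/size. Eviction
    removes the lowest-H object and raises L to that H ("inflation"),
    so priorities of untouched objects effectively age."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0
        self.entries = {}  # key -> [priority, size, cost, value]
        self.heap = []     # (priority, key); stale pairs skipped on pop

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[0] = self.L + entry[2] / entry[1]  # refresh H on hit
        heapq.heappush(self.heap, (entry[0], key))
        return entry[3]

    def put(self, key, value, size, cost=1.0):
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        while self.used + size > self.capacity and self.entries:
            prio, victim = heapq.heappop(self.heap)
            entry = self.entries.get(victim)
            if entry is None or entry[0] != prio:
                continue                 # stale heap pair; skip
            self.L = prio                # inflation: raise the floor
            self.used -= entry[1]
            del self.entries[victim]
        priority = self.L + cost / size
        self.entries[key] = [priority, size, cost, value]
        self.used += size
        heapq.heappush(self.heap, (priority, key))
```

The test shows the size awareness: with equal cost, the large object has the lower cost/size score and is evicted first, which optimizes byte hit rate.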
TTL / Time-Based Expiration (Not a Replacement Policy)
Idea: entries expire after a time-to-live, regardless of recency/frequency.
- Pros: bounds staleness; essential for correctness in many domains.
- Cons: does not optimize hit rate by itself; still needs an eviction policy when full.
- Use when: correctness requires freshness bounds (configs, tokens, CDN-like caching).
- Avoid when: you treat TTL as a substitute for eviction optimization.
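A minimal TTL wrapper with lazy (read-time) expiration; the injectable clock is a testing convenience, and note that nothing here bounds memory, which is exactly why TTL still needs an eviction policy:

```python
import time

class TTLCache:
    """TTL-only cache: entries expire after ttl_seconds, checked lazily
    on read. Bounds staleness but NOT size -- pair with an eviction
    policy for capacity limits."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable for deterministic tests
        self.data = {}      # key -> (value, expires_at)

    def get(self, key):
        item = self.data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.data[key]  # lazy expiration on read
            return None
        return value

    def put(self, key, value):
        self.data[key] = (value, self.clock() + self.ttl)
```

Real systems usually combine lazy expiration with periodic background sweeps, since lazy-only expiration leaves never-read dead entries in memory.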
Practical Tradeoffs (What Changes In Real Systems)
- Scan resistance: LRU/Clock are vulnerable; LRU-K/2Q/ARC/LIRS handle scans better.
- Metadata & CPU: Random/FIFO < Clock < LRU < 2Q/SLRU < LRU-K/ARC/LIRS.
- Concurrency: strict global LRU lists can contend; Clock and sharded designs often scale better.
- Adaptivity: LFU needs decay to adapt; ARC-family adapts via history; static partitions (2Q/SLRU) need tuning.
- Predictability: simpler policies are easier to reason about under tail-latency constraints; complex policies can have more edge cases.
When To Use / Not Use (Rules Of Thumb)
- Use LRU when you have temporal locality and can tolerate per-hit metadata updates.
- Prefer Clock when you want LRU-like behavior with lower overhead.
- Avoid plain LRU for workloads with large scans unless you add scan resistance (e.g., 2Q, SLRU, LRU-K).
- Use LFU (with aging) when popularity is stable and you care about long-term hot items.
- Use ARC when workload is mixed or shifting and you can afford the complexity and memory overhead.
- Use cost/size-aware policies (GDS/GDSF) when optimizing byte hit rate or miss cost, not just request count.
Reference Material
- Wikipedia: Cache replacement policies: https://en.wikipedia.org/wiki/Cache_replacement_policies
- LRU-K: “The LRU-K page replacement algorithm for database disk buffering” (O’Neil, O’Neil, Weikum), 1993.
- 2Q: “2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm” (Johnson, Shasha), 1994.
- ARC: “ARC: A Self-Tuning, Low Overhead Replacement Cache” (Megiddo, Modha), 2003.
- LIRS: “LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance” (Jiang, Zhang), 2002.
- OPT (Belady): “A study of replacement algorithms for a virtual-storage computer” (Belady), 1966.
- GDS/GDSF: “GreedyDual-Size: An algorithm for web caching” (Cao, Irani), 1997.