LHD (Learning Hit Density)
Goal
Maximize value per byte by using a learned estimate of hit probability over
remaining lifetime, especially when object sizes vary substantially.
Core Idea
LHD estimates each object's future hit density (roughly, expected future hits per
byte of cache space over the object's remaining lifetime) from observed reuse
behavior. Eviction prefers entries with the lowest estimated hit density.
Compared to simple size-aware scoring, LHD adapts estimates from workload data
instead of relying on fixed formulas.
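The scoring rule can be sketched as follows. This is a minimal illustration, not CacheKit code; the function name and the tuple layout are assumptions.

```python
def hit_density(expected_hits: float, expected_lifetime: float, size_bytes: int) -> float:
    """Estimated hits per byte per unit time; higher means more worth keeping."""
    if size_bytes <= 0 or expected_lifetime <= 0:
        return 0.0
    return expected_hits / (size_bytes * expected_lifetime)

# Eviction prefers the entry with the lowest estimated hit density.
# (expected_hits, expected_lifetime, size_bytes) per key; values are illustrative.
entries = {
    "a": (0.9, 100.0, 4096),     # likely rehit, small
    "b": (0.2, 500.0, 1 << 20),  # unlikely rehit, large
}
victim = min(entries, key=lambda k: hit_density(*entries[k]))
```

Note how the large, rarely rehit object loses even though its raw hit probability is nonzero: size and lifetime both divide the score.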
Core Data Structures (Typical)
- Hash index: key -> EntryMeta
- Size-aware entry metadata (size, age/timestamp, access stats)
- Lightweight learned model/state (bucketized statistics tables)
- Victim selector keyed by estimated hit density
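A hedged sketch of the per-entry metadata and hash index above; the field names and tick-based age derivation are assumptions, not CacheKit's actual types.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class EntryMeta:
    size_bytes: int
    last_access_tick: int  # age is derived from this at lookup/eviction time
    hit_count: int = 0     # per-entry access stats feeding the estimator

# Hash index: key -> EntryMeta
index: Dict[str, EntryMeta] = {}

def age(meta: EntryMeta, now_tick: int) -> int:
    """Ticks since last access; used to index the bucketized statistics tables."""
    return now_tick - meta.last_access_tick
```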
Complexity & Overhead
- Higher implementation complexity than GDS/GDSF
- Additional memory for model tables and per-entry predictors
- Runtime cost depends on estimator complexity (can be O(1) with table lookups)
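The O(1)-with-table-lookups point can be made concrete: periodically recompute a density per age bucket from hit/eviction counts, so the hot path only indexes a table. A sketch under assumed bucket semantics (bucket layout, update cadence, and names are illustrative):

```python
def precompute_density(hits, evictions):
    """For each age bucket b, estimate hit density as
    P(future hit | entry reached bucket b) / E[remaining lifetime in buckets].
    Divide by object size at eviction time to get a per-byte score."""
    n = len(hits)
    densities = [0.0] * n
    future_hits = 0    # hits observed at age >= current bucket
    future_events = 0  # hits + evictions at age >= current bucket
    lifetime_acc = 0   # sum over i >= b of (i - b + 1) * events[i]
    for b in range(n - 1, -1, -1):
        future_hits += hits[b]
        future_events += hits[b] + evictions[b]
        lifetime_acc += future_events
        if future_events:
            expected_lifetime = lifetime_acc / future_events
            densities[b] = (future_hits / future_events) / expected_lifetime
    return densities
```

A single backward pass over the buckets amortizes the estimator cost; eviction-time work is then a table read plus one division by size.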
Notes For CacheKit
- High-value candidate for byte hit rate optimization and heterogeneous object sizes.
- Start with coarse bucketed estimators to keep hot paths predictable.
- Benchmark against GDS, GDSF, and Hyperbolic on mixed-size traces.
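One way to keep the eviction hot path predictable, used in the LHD paper's design, is to avoid a global priority structure and instead sample a handful of resident entries, evicting the lowest-density one. A minimal sketch; `density_of` and the sample size are assumptions:

```python
import random

def pick_victim(resident_keys, density_of, sample_size=64):
    """Approximate global min-density eviction by sampling candidates.
    `density_of` is assumed to return the estimated hit density for a key."""
    candidates = random.sample(resident_keys, min(sample_size, len(resident_keys)))
    return min(candidates, key=density_of)
```

With a sample of 64 the result is close to the true minimum in practice, and the cost per eviction stays constant regardless of cache population.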
References
- Beckmann et al. (2018): "LHD: Improving Cache Hit Rate by Maximizing Hit Density", USENIX NSDI 2018 paper with follow-up implementations.