CacheKit Docs

High-performance cache policies and supporting data structures.

View the Project on GitHub OxidizeLabs/cachekit

TTL / Time-Based Expiration — Design Notes

Status: Phase 1 implemented and shipped behind the ttl feature flag. Phase 2 (per-policy embedded expires_at, ConcurrentExpiring, timer-wheel swap-in, serde) is deferred. Companion to the implementation tracker at docs/policies/roadmap/ttl.md.

TTL is not a replacement policy; it is an expiration rule that coexists with an eviction policy. This document captures the rationale behind how TTL is wired into cachekit and preserves the project’s invariants:

Current State

Phase 1 is fully landed behind the ttl feature flag (Cargo.toml):

Phase 2 has not landed:

CacheInner still wires only 17 of the 18 policy modules under src/policy/; policy::car is gated by policy-car but absent from DynCache. TTL-via-builder for CAR remains gated on closing that gap.


1. Design Patterns

Five viable patterns, ordered roughly by invasiveness.

a) Decorator / wrapper cache (chosen)

A new struct Expiring<C, K, V, T> owns an inner C: Cache<K, V> plus a per-key expiry index, intercepting get / peek / insert / remove to consult the index. This is what shipped — see src/policy/expiring.rs.

pub struct Expiring<C, K, V, T = StdClock> {
    inner: C,
    index: ExpirationIndex<K>,
    clock: T,
    default_ttl_ticks: Option<u64>,
    #[cfg(feature = "metrics")]
    expirations: u64,
    _value: PhantomData<fn() -> V>,
}

impl<C, K, V, T> Cache<K, V> for Expiring<C, K, V, T>
where
    C: Cache<K, V>,
    K: Eq + Hash + Clone,
    T: Clock,
{ /* … */ }

K must appear as a type parameter on the wrapper itself because the index is keyed by K; threading it only through the Cache<K, V> impl is not enough. V is recorded as PhantomData<fn() -> V> so the inner cache’s value type is fixed at construction without dragging V into auto-trait bounds.

b) Capability trait + per-policy implementation

Add an extension trait alongside RecencyTracking / FrequencyTracking:

pub trait ExpiringCache<K, V>: Cache<K, V> {
    fn insert_with_ttl(&mut self, key: K, value: V, ttl: Duration) -> Option<V>;
    fn ttl_status(&self, key: &K) -> TtlStatus;
    fn set_ttl(&mut self, key: &K, ttl: Duration) -> bool;
    fn purge_expired(&mut self) -> usize;
}

pub enum TtlStatus {
    Missing,
    Immortal,
    Expired,
    Live { remaining: Duration, deadline: Tick },
}

See §4(a) for the Tick newtype rationale. Each policy embeds an expires_at: u64 field in its node struct (Node<K, V> in src/policy/lru.rs, src/policy/s3_fifo.rs, etc.) and shares an ExpirationIndex data structure. The embedded u64 is the in-memory representation; the public TtlStatus::Live surface still hands callers a Tick, not a raw u64.

c) Storage-level TTL

Move expiration into a new ExpiringStore<S> that wraps a StoreCore / StoreMut. Policies stay TTL-unaware; the store returns None (and increments evictions) on expired reads.

d) Observer / callback (entry-aware policy)

Combine (b) and (c) by giving policies an on_evict(key) hook the store can invoke when it lazily detects expiration.

A Box<dyn ExpirationIndex> injected into any policy. Conflicts with the .cursorrules guidance to “minimize Arc usage in hot paths” and “avoid heavy Rust ergonomics in hot loops (trait objects, …)”.

Recommendation (shipped as Phase 1)

(a) was shipped as the ttl feature, and the wrapper deliberately offers logical expiration over the current Cache trait rather than a zero-overhead embedded TTL.

For builder integration, the shipped approach is a hybrid of options (1) and (2) from §4(c):

DynExpiringCache does not implement Cache<K, V>. Because Expiring<C> requires C: Cache<K, V>, Expiring<DynExpiringCache> is structurally unrepresentable, which restores the “one TTL layer” guarantee that pure option (1) couldn’t give without a runtime check. The cost is one extra public type plus a thin mirror of DynCache’s inherent methods.

Phase 2: where profiling justifies it, embed expires_at into specific policies (b) — LRU, FastLRU and S3-FIFO are the high-value targets. The embed must be opt-in per-node so non-TTL users do not pay 8 bytes per entry (see §6, step 7).


2. Which Policies Can Use It

TTL is orthogonal to eviction, but the interaction differs by policy.

Policy Embedded TTL fit Interaction notes
FIFO, LIFO Trivial — single VecDeque/Vec; one extra u64 per entry Eviction order independent; expire-first then evict-by-policy
LRU, FastLRU Excellent — Node<K,V> already has prev/next/key/value, add u64 Expired LRU node short-circuits pop_lru
S3-FIFO Excellent — slot arena nodes; expiry can drop from Small/Main without changing admission history Ghost list should track capacity evictions, not TTL expiry
LRU-K, SLRU, 2Q Natural — multi-segment node already exists; expiry can drop probationary entries cheaply Promote-and-expire interaction matters: do not promote already-expired
Clock, Clock-PRO, NRU Natural — clock entry already has a reference bit; add u64; sweep can expire on the way past Clock-PRO’s adaptive logic must treat expiry-evictions separately from cold-list evictions
LFU, Heap-LFU, MFU Works, but TTL-evictions distort frequency stats. Either decay frequency on expire or accept skew Co-locate with the existing LazyMinHeap if desired
ARC, CAR Same as LFU — adaptive parameter p should not move on a TTL eviction Track TTL-evictions separately to avoid mis-tuning. CAR is not currently a DynCache variant; TTL-via-builder for CAR is gated on closing that gap first.
Random Trivial; no ordering to corrupt

The least useful combinations are MFU and pure ARC where TTL competes with the policy’s signal. Document the warning rather than disabling.


3. Data Structures & Algorithms

The codebase already owns most of the building blocks: SlotArena, LazyMinHeap, IntrusiveList, GhostList, ClockRing. Concrete options for the expiration index follow.

A) Lazy min-heap of (expires_at, key) (shipped)

ds::LazyMinHeap<K, S> already existed at src/ds/lazy_heap.rs and explicitly listed TTL in its use cases. Insertion is O(log n); pop_best is amortized O(log n); update is O(log n) with maybe_rebuild to bound staleness.

The shipped ExpirationIndex wrapper lives at src/ds/expiration_index.rs:

pub type Deadline = u64;

pub struct ExpirationIndex<K> { /* LazyMinHeap<K, Deadline> */ }

impl<K: Eq + Hash + Clone> ExpirationIndex<K> {
    pub fn new() -> Self;
    pub fn with_capacity(capacity: usize) -> Self;

    pub fn set_deadline(&mut self, key: K, expires_at: Deadline) -> Option<Deadline>;
    pub fn deadline_of<Q>(&self, key: &Q) -> Option<Deadline>
    where K: Borrow<Q>, Q: Hash + Eq + ?Sized;
    pub fn remove<Q>(&mut self, key: &Q) -> Option<Deadline>
    where K: Borrow<Q>, Q: Hash + Eq + ?Sized;

    pub fn next_deadline(&mut self) -> Option<(&K, Deadline)>;
    pub fn pop_expired(&mut self, now: Deadline) -> Option<(K, Deadline)>;
    pub fn drain_expired(&mut self, now: Deadline)
        -> impl Iterator<Item = (K, Deadline)> + '_;

    pub fn iter(&self) -> Iter<'_, K>;
    pub fn set_auto_rebuild(&mut self, factor: Option<usize>) -> &mut Self;
}

Auto-rebuild defaults to factor 2 — stale heap entries are bounded at 2 * live_len() — and callers that mutate every entry many times per epoch can tighten this via set_auto_rebuild.

LazyMinHeap had destructive pop_best but no non-destructive live-minimum peek. Phase 1 added a peek_best(&mut self) -> Option<(&K, &S)> primitive to LazyMinHeap that drains stale-tombstoned roots in place (mutating because lazy deletion may need to advance past tombstones, immutable observation otherwise). ExpirationIndex::next_deadline is a thin wrapper around it, so the wrapper does not couple to the heap’s internal staleness representation. The primitive is reusable outside TTL.

B) Hashed timer wheel (single wheel)

N slots, each a Vec<K> (or IntrusiveList<SlotId>); insert at slot = (expires_at / tick) % N; on advance, drain the current slot.

C) Hierarchical timer wheel (Linux-style)

Multiple wheels (e.g., 256 ms, 16 s, 1024 s, …) cascading on overflow.

D) Sorted index over expires_at

A BTreeMap<u64, SmallVec<[K; 4]>> or IntrusiveList per bucket time.

E) Intrusive expiry list per slot/segment

For LruCore-style policies, a second doubly-linked list through the same SlotArena, ordered by insertion-time TTL. If TTL is uniform per-cache (a single global default_ttl), this becomes O(1): a simple FIFO over expiration order, no priority queue needed.

F) Generation / epoch counter (invalidation, not TTL)

Tag every entry with the cache’s epoch; bump epoch to invalidate everything; lazy purge on access.

Time source

Do not hardcode std::time::Instant. Introduce a small Clock trait so tests/benches can use a mock and so users on no_std-adjacent targets can plug their own:

pub trait Clock {
    /// Monotonic ticks. Recommended unit: milliseconds.
    fn now(&self) -> u64;
}

Algorithm cheat sheet


4. Unified API & Integration

cachekit already has a strong pattern for optional capabilities: extension traits in src/traits.rs and a DynCache enum in src/builder.rs. TTL fits the same shape.

a) New capability trait alongside the others

pub trait ExpiringCache<K, V>: Cache<K, V> {
    fn insert_with_ttl(&mut self, key: K, value: V, ttl: Duration) -> Option<V>;
    fn ttl_status(&self, key: &K) -> TtlStatus;
    fn set_ttl(&mut self, key: &K, ttl: Duration) -> bool;
    fn purge_expired(&mut self) -> usize;
}

/// Monotonic deadline expressed in the cache's tick unit.
///
/// Wrapping the raw `u64` keeps the tick representation out of the
/// public API surface; users compare it via `Tick`'s methods or convert
/// to/from `Duration` via the cache.
#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Debug)]
pub struct Tick(u64);

pub enum TtlStatus {
    Missing,
    Immortal,
    Expired,
    Live { remaining: Duration, deadline: Tick },
}

This sits beside RecencyTracking / FrequencyTracking / HistoryTracking and stays object-safe. Prefer ttl_status over separate expires_at / ttl_remaining methods so users can distinguish missing keys, non-expiring keys, expired-but-not-yet-purged keys, and live TTL entries.

Exposing Tick rather than a bare u64 matters because the tick unit (ms vs. ns vs. wall-clock seconds) is a private implementation choice. The newtype lets the unit move without breaking downstream callers, and gives a natural place to hang Tick::saturating_add(Duration), Tick::as_duration_since(&self, Clock), and serde glue.

Precedence rule for default_ttl and per-entry TTL: per-entry TTL always wins. Calling insert_with_ttl(k, v, Duration::ZERO) on a cache with default_ttl(Some(60s)) configured must produce immediate expiry, not a 60-second TTL. Plain insert(k, v) uses the default; specifying any TTL — including zero — overrides it. State this explicitly in both the trait docs and the builder docs because the “default + explicit override” pattern is ambiguous in user code.

b) Clock abstraction for testability

A Clock parameter on Expiring<C, K, V, T> (default StdClock) and on any policy that embeds expiry. Mirrors how RandomCore keeps rng_state rather than calling rand::thread_rng() directly. The shipped Clock trait is object-safe and requires Send + Sync + Debug, with StdClock(Instant) and MockClock(AtomicU64) covering production and test cases respectively.

c) Builder integration (shipped)

The shipped surface:

use cachekit::builder::{CacheBuilder, CachePolicy};
use std::time::Duration;

let mut cache = CacheBuilder::new(1000)
    .with_default_ttl(Duration::from_secs(60))
    .build::<u64, String>(CachePolicy::Lru);

cache.insert(1, "v".to_string());                              // uses default TTL
cache.insert_with_ttl(2, "fast".into(), Duration::from_secs(5)); // overrides

CacheBuilder::with_default_ttl(Duration) returns an ExpiringBuilder whose build::<K, V>(policy) returns a DynExpiringCache<K, V> rather than a DynCache<K, V>. The two paths considered were:

  1. Add impl Cache<K, V> for DynCache<K, V> and wrap the result in Expiring<DynCache<K, V>, …> directly.
  2. Introduce a separate DynExpiringCache<K, V> returned by a TTL-specific builder path, avoiding a recursive enum at the cost of another public type.

The shipped code uses the mechanism of (1) plus the surface of (2). DynCache<K, V> now implements Cache<K, V> (see builder.rs), so Expiring<DynCache, …> is internally type-checkable. DynExpiringCache<K, V> wraps that Expiring<DynCache<K, V>, K, V, StdClock> with the inner field private, and the Cache trait is not implemented on the public wrapper — therefore Expiring<DynExpiringCache> is structurally unrepresentable. The “one new variant, not 18” goal holds: the new type mirrors DynCache’s inherent methods (one delegation per method, not per policy) and adds the TTL surface.

d) Feature gating (shipped)

A ttl feature lives in Cargo.toml. TTL depends only on std::time and the existing LazyMinHeap — no new runtime dependency. The following modules are gated behind #[cfg(feature = "ttl")]:

When ttl is off the ds module does not grow a time abstraction and DynCache reverts to its original surface.

Metrics integration is a single counter — Expiring::expirations: u64 behind #[cfg(feature = "metrics")], with an accessor that returns 0 when the feature is off so call sites compile unconditionally. Per-policy metrics structures (LruMetrics, StoreMetrics) were intentionally left alone — the decorator-owned counter is enough for Phase 1 because every TTL purge funnels through the wrapper.

e) Concurrent variants (Phase 2)

The existing Concurrent* wrappers (ConcurrentLruCache in src/policy/lru.rs, ConcurrentSlotArena in src/ds/slot_arena.rs, ConcurrentClockRing in src/ds/clock_ring.rs) wrap their single-threaded core in parking_lot::RwLock. ConcurrentExpiring<C> is not shipped yet; when it lands it must follow two non-negotiable rules:

  1. Return owned/Arc<V>, not &V. The Cache::get(&mut self) -> Option<&V> signature cannot be implemented safely on ConcurrentExpiring<C> without holding the write lock across the borrow, which serializes readers and defeats the point of RwLock. ConcurrentExpiring<C> should expose fn get(&self, key: &K) -> Option<Arc<V>> (and the sibling mutators), and should not implement Cache<K, V>. It is a concrete type with its own API, mirroring how ConcurrentLruCache already deviates from Cache<K, V>.
  2. Atomic expiry-and-removal. The expiry check, policy removal, and index removal must be one atomic write-locked operation. Splitting them allows a concurrent set_ttl or insert to race with a stale expiry decision and produce a “you remove an entry I just renewed” outcome. A read-locked fast path (check expires_at <= now under a read lock, escalate to write lock for the actual removal) is safe so long as the write-locked path re-checks the deadline before acting.

Today, callers needing thread-safety wrap a DynExpiringCache in Arc<RwLock<…>> exactly as they would for any other Cache.

f) DynCache touchpoint (shipped)

DynCache<K, V> gained an impl Cache<K, V> so that Expiring<DynCache, K, V, StdClock> type-checks — see src/builder.rs. All other policy structs already implemented Cache<K, V>; this fills the last gap so the decorator works through the runtime-selected enum without a per-policy match arm.

DynExpiringCache<K, V> (also in builder.rs) lives next to DynCache and mirrors DynCache’s inherent methods plus the TTL surface (insert_with_ttl, ttl_status, set_ttl, purge_expired, live_len, default_ttl, expirations). Its Debug impl reports default_ttl, len, and capacity via finish_non_exhaustive — no keys or deadlines leak.

g) prelude.rs (shipped)

src/prelude.rs re-exports Clock, StdClock, MockClock, Tick, TtlStatus, ExpiringCache, DynExpiringCache, and ExpiringBuilder under #[cfg(feature = "ttl")]. Expiring itself is not re-exported — callers reach it through cachekit::policy::expiring::Expiring when they need the raw decorator, but most code should consume the builder-vended DynExpiringCache.


5. Trade-offs Side-by-Side

Pattern trade-offs

Pattern Hot-path read cost Hot-path insert cost Code churn Fits .cursorrules When to pick
(a1) Expiring<C> + heap index (A) +1 hash probe, +1 cmp O(log n) heap insert + key clone Low, plus DynExpiringCache plumbing Mostly — preserves separation, not allocation-free First TTL iteration; benchmark the overhead
(a2) Expiring<C> + intrusive list (E) +1 cmp (no hash probe) O(1) list push Low–medium (intrusive list reuse) Yes for uniform TTL When TTL is uniform across all entries
(b) Per-policy embedded +1 cmp +1 cmp High (18 policies × edits) Yes — best layout Long-term, after profiling shows the decorator overhead matters
(c) Storage-level +1 cmp inside store +1 cmp inside store Medium (new store + callbacks) Mostly If also adding size-aware eviction or weight-aware stores
(d) Observer/callback medium medium High Mixed (adds vtable-ish hop) Multi-tenant caches with heterogeneous policies
(e) Trait-object mixin high (vcall) high (vcall) Low No — violates trait-object/Arc rule Don’t

Index data-structure trade-offs

Index Insert Expire batch Memory / entry Variable TTL? Fits ds module
Lazy min-heap (A) O(log n) + key clone O(k log n) amortized HashMap entry + heap entry + sequence; stale entries bounded by rebuild Yes Already exists (LazyMinHeap)
Single timer wheel (B) O(1) O(slot size) per tick ~8 B + slot vec Bounded New ds module (TimerWheel)
Hierarchical wheel (C) O(1) am. O(1) am. with cascade ~8 B + multi-wheel Yes New, large addition
BTreeMap (D) O(log n) O(log n) range drain ~32 B + alloc Yes Doesn’t fit (allocations)
Intrusive expiry list (E) O(1) O(k) 2 × usize per entry No (uniform TTL) Reuses SlotArena + IntrusiveList
Epoch tag (F) O(1) O(1) (lazy) 8 B per entry N/A Trivial

Eviction-policy interaction trade-offs

Time source trade-offs

API trade-offs


6. Phased Roadmap

Phase 1 — landed

All six items below are merged behind the ttl feature flag. The shipped sequence preserves policy/storage separation and keeps TTL opt-in, but the decorator does not preserve every hot-path invariant: inserts pay heap maintenance, the index clones keys, and expired entries may remain physically resident until a mutable operation purges them. The Phase 2 benchmark gate (step 7) is therefore part of the design, not optional cleanup.

  1. src/policy/expiring.rsExpiring<C, K, V, T = StdClock> decorator. peek / contains are logical reads; Cache::len reports physical occupancy; Expiring::live_len(&self) walks the index once for the live count (see §1(a) Decision). The shipped signature is &self, not &mut self, because ExpirationIndex::iter is borrow-only.
  2. src/ds/expiration_index.rsExpirationIndex<K> backed by LazyMinHeap<K, u64> with auto-rebuild enabled (factor 2 by default) to bound stale heap growth. The door is open to swap in a timer wheel later. Added a peek_best primitive to LazyMinHeap itself (see §3.A) so ExpirationIndex::next_deadline / pop_expired / drain_expired do not couple to heap internals.
  3. src/time.rsClock trait + StdClock / MockClock. Clock takes &self, is object-safe, and requires Send + Sync + Debug.
  4. src/traits.rsTick, TtlStatus, and the ExpiringCache<K, V> capability trait (object-safe; sits alongside RecencyTracking, FrequencyTracking, HistoryTracking).
  5. CacheBuilder::with_default_ttl(Duration) returns an ExpiringBuilder whose build produces a DynExpiringCache<K, V> (separate public type with a private inner field; cannot be wrapped in another Expiring). To make this work without per-policy plumbing, DynCache<K, V> gained an impl Cache<K, V> — see §1 Recommendation and §4(c).
  6. Feature flag ttl; counter Expiring::expirations (gated on metrics, accessor returns 0 unconditionally so call sites compile); doctests on every public item; tests/ttl_integration_test.rs exercises the decorator and the builder path under proptest; benches/ttl_overhead.rs compares plain LRU vs. Expiring<LRU> under Zipfian / scan / mixed workloads.

Phase 2 — deferred

  1. Embedded expires_at. Profile Phase 1 with the existing ttl_overhead group and, if the extra hash hit shows up in flamegraphs, embed expires_at: u64 into LruCore::Node and S3FifoCache::Node (the two highest-traffic policies in the existing benches at benches/) — but opt-in per node, not unconditionally. Two viable shapes:
    • A const generic Node<K, V, const TTL: bool> so non-TTL caches monomorphize to the slimmer layout.
    • A separate type LruWithTtl<K, V> (and S3FifoWithTtl<K, V>) that wraps the slot arena with a parallel Vec<u64> keyed by slot handle.

    Embedding expires_at unconditionally would add 8 bytes per node to every LRU and S3-FIFO instance — a 10–25% memory regression for the common case of fixed-size value caches — and would regress the very benchmarks Phase 1 uses as a gate. The .cursorrules “keep metadata tight” rule applies here.

  2. ConcurrentExpiring<C> following §4(e) (owned/Arc<V> return, atomic expiry-and-removal).
  3. Timer-wheel swap-in as an alternative ExpirationIndex backend when TTL is uniform and high-throughput (see §3.B/§3.C).
  4. serde support for Tick / TtlStatus, with the relative- duration serialization rule from §5 (API trade-offs).
  5. CAR in DynCache. TTL-via-builder for CAR is blocked until policy::car becomes a CacheInner variant.

7. Open Questions

Still open:

Resolved during the Phase 1 design pass (kept here for posterity):


References

Shipped source

Supporting docs

External