High-performance cache policies and supporting data structures.
Status: design rationale for the benchmark suite under
benches/ and shared benchmark support under bench-support/. Companion to design.md §10 and the benchmark reference docs.
cachekit benchmarks are designed to answer cache questions, not just produce fast-looking numbers. A cache policy can be excellent on uniform keys and weak under scans, or fast on micro-operations and poor at preserving hit rate. The benchmark suite therefore separates micro-operation cost, policy effectiveness, trace-shaped workloads, reporting, and machine-readable artifacts.
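As a rough illustration of why hit rate has to be measured per workload, the sketch below replays two key streams against a toy LRU. ToyLru, hit_rate, hot_loop, and cyclic_scan are hypothetical names invented for this example; none of them is part of cachekit.

```rust
use std::collections::VecDeque;

/// Toy LRU used only for illustration; not cachekit's implementation.
struct ToyLru {
    capacity: usize,
    // Front = most recently used, back = least recently used.
    order: VecDeque<u64>,
}

impl ToyLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::with_capacity(capacity) }
    }

    /// Returns true on a hit. On a miss, inserts the key and evicts the
    /// least recently used entry if the cache is full.
    fn access(&mut self, key: u64) -> bool {
        if let Some(pos) = self.order.iter().position(|&k| k == key) {
            self.order.remove(pos);
            self.order.push_front(key);
            true
        } else {
            if self.order.len() == self.capacity {
                self.order.pop_back();
            }
            self.order.push_front(key);
            false
        }
    }
}

/// Replays `keys` against a fresh cache and returns hits / accesses.
fn hit_rate(capacity: usize, keys: impl Iterator<Item = u64>) -> f64 {
    let mut cache = ToyLru::new(capacity);
    let (mut hits, mut total) = (0u64, 0u64);
    for key in keys {
        total += 1;
        if cache.access(key) {
            hits += 1;
        }
    }
    hits as f64 / total as f64
}

/// Hot set that fits in the cache: keys 0..capacity, repeated.
fn hot_loop(capacity: u64, accesses: u64) -> impl Iterator<Item = u64> {
    (0..accesses).map(move |i| i % capacity)
}

/// Cyclic scan over one more key than the cache can hold.
fn cyclic_scan(capacity: u64, accesses: u64) -> impl Iterator<Item = u64> {
    (0..accesses).map(move |i| i % (capacity + 1))
}
```

With a capacity of 64, hit_rate(64, hot_loop(64, 6_400)) stays close to 1.0, while hit_rate(64, cyclic_scan(64, 6_400)) is exactly 0.0, because every miss evicts the key the scan asks for next. The per-access cost is identical in both runs, so a latency-only benchmark would not reveal the difference.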
The benchmark suite has four layers:
| Layer | Files | Purpose |
|---|---|---|
| Criterion measurements | benches/workloads.rs, benches/ops.rs, benches/comparison.rs, benches/policy/*.rs | statistically sampled latency and throughput |
| Console reports | benches/reports.rs | fast, readable tables without Criterion overhead |
| JSON artifact runner | benches/runner.rs | structured output for docs, charts, CI, historical comparison |
| Shared support crate | bench-support/ | policy registry, workloads, metrics, JSON schema, doc renderer |
This split is deliberate. Criterion is good for micro-benchmark statistics; the artifact runner is good for automation; console reports are good while tuning a policy locally. No single binary is forced to serve every audience.
Benchmarks iterate policies through for_each_policy! in
bench-support/src/registry.rs:
```rust
for_each_policy! {
    with |policy_id, display_name, make_cache| {
        let mut cache = make_cache(CAPACITY);
        // measured workload...
    }
}
```
The macro expands to one block per concrete policy type. This avoids dynamic
dispatch in the measured loop while keeping policy iteration centralized.
POLICIES in the same module provides presentation metadata (stable id,
display name, chart color) for renderers and reports.
The trade-off is that adding a policy touches the macro and metadata table. A
test (policies_metadata_matches_macro) keeps the two from drifting. This is
the same explicit-boilerplate-over-magic choice as DynCache: more arms in
source, fewer surprises in hot code.
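The pattern can be sketched with a self-contained toy registry. Everything below is an assumption made for illustration (the ToyCache trait, the ToyFifo and ToyLifo policies, the PolicyMeta fields, and the macro body); the real macro in bench-support/src/registry.rs may be structured differently.

```rust
/// Hypothetical interface for this sketch; cachekit's real traits may differ.
trait ToyCache {
    fn with_capacity(capacity: usize) -> Self;
    fn insert(&mut self, key: u64);
    fn len(&self) -> usize;
}

/// Evicts the oldest key when full.
struct ToyFifo { capacity: usize, keys: Vec<u64> }
/// Evicts the newest key when full.
struct ToyLifo { capacity: usize, keys: Vec<u64> }

impl ToyCache for ToyFifo {
    fn with_capacity(capacity: usize) -> Self { Self { capacity, keys: Vec::new() } }
    fn insert(&mut self, key: u64) {
        if !self.keys.is_empty() && self.keys.len() >= self.capacity {
            self.keys.remove(0);
        }
        self.keys.push(key);
    }
    fn len(&self) -> usize { self.keys.len() }
}

impl ToyCache for ToyLifo {
    fn with_capacity(capacity: usize) -> Self { Self { capacity, keys: Vec::new() } }
    fn insert(&mut self, key: u64) {
        if !self.keys.is_empty() && self.keys.len() >= self.capacity {
            self.keys.pop();
        }
        self.keys.push(key);
    }
    fn len(&self) -> usize { self.keys.len() }
}

/// Presentation metadata, kept separate from the macro (fields are assumptions).
struct PolicyMeta { id: &'static str, display_name: &'static str }

const POLICIES: &[PolicyMeta] = &[
    PolicyMeta { id: "fifo", display_name: "FIFO" },
    PolicyMeta { id: "lifo", display_name: "LIFO" },
];

/// Expands the body once per concrete type, so calls inside the body are
/// statically dispatched on ToyFifo or ToyLifo rather than a trait object.
macro_rules! for_each_policy {
    (with |$id:ident, $name:ident, $make:ident| $body:block) => {{
        {
            let $id = "fifo";
            let $name = "FIFO";
            let $make = |capacity: usize| <ToyFifo as ToyCache>::with_capacity(capacity);
            $body
        }
        {
            let $id = "lifo";
            let $name = "LIFO";
            let $make = |capacity: usize| <ToyLifo as ToyCache>::with_capacity(capacity);
            $body
        }
    }};
}

/// Runs the same insert loop against every registered policy.
fn fill_all(capacity: usize, inserts: u64) -> Vec<(&'static str, usize)> {
    let mut out = Vec::new();
    for_each_policy! {
        with |policy_id, _display_name, make_cache| {
            let mut cache = make_cache(capacity);
            for key in 0..inserts {
                cache.insert(key);
            }
            out.push((policy_id, cache.len()));
        }
    }
    out
}

/// Drift check in the spirit of policies_metadata_matches_macro.
fn metadata_matches_macro() -> bool {
    let mut seen = Vec::new();
    for_each_policy! {
        with |policy_id, display_name, _make_cache| {
            seen.push((policy_id, display_name));
        }
    }
    seen == POLICIES.iter().map(|p| (p.id, p.display_name)).collect::<Vec<_>>()
}
```

Adding a third toy policy here means one more block in the macro and one more row in POLICIES, which is the duplication the drift check is meant to guard.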
Workload definitions live in bench-support/src/registry.rs; generators live in
bench-support/src/workload.rs. docs/benchmarks/workloads.md is the catalog of
the current standard workloads. It also contains a large roadmap of planned
workloads, which should not be confused with implemented cases. New workloads
should land first in the support crate, then in the docs, then in reports.
benches/runner.rs pre-allocates one Arc<u64> per key in the universe and
passes a closure that returns an Arc::clone of the pre-allocated value:
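```rust
use std::sync::Arc;

// One shared value per key. UNIVERSE is the size of the key universe.
fn preallocate_values() -> Vec<Arc<u64>> {
    (0..UNIVERSE).map(Arc::new).collect()
}
```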
The rule is: do not allocate values inside the measured operation loop.
Allocating on every miss makes the benchmark measure the allocator and value
constructor, not the policy. A cheap Arc::clone isolates hit/miss behaviour,
eviction order, and policy metadata overhead.
This is especially important because policies store values differently:
FastLru stores V directly, while LRU / LFU / Heap-LFU use Arc<V> in some
paths. Pre-allocation keeps those representation differences from dominating
the benchmark.
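A minimal sketch of the rule, assuming a hypothetical UNIVERSE of 1,024 keys and a stand-in ToyCache with a get_or_insert_with method (neither is taken from the real runner): values are built once during setup, and the closure passed on a miss only bumps a reference count.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical key-universe size for this sketch.
const UNIVERSE: u64 = 1_024;

/// Setup step: one heap allocation per key, done before measurement starts.
fn preallocate_values() -> Vec<Arc<u64>> {
    (0..UNIVERSE).map(Arc::new).collect()
}

/// Stand-in for a policy under test: an unbounded map. Illustration only.
struct ToyCache {
    map: HashMap<u64, Arc<u64>>,
}

impl ToyCache {
    /// Returns the cached value, calling `make` only on a miss.
    fn get_or_insert_with(&mut self, key: u64, make: impl FnOnce() -> Arc<u64>) -> Arc<u64> {
        let slot = self.map.entry(key).or_insert_with(make);
        Arc::clone(&*slot)
    }
}

/// Replays `keys` (each must be below UNIVERSE) and returns the miss count
/// plus whether every cached value is the pre-allocated one for its key.
fn run_measured_loop(keys: &[u64]) -> (u64, bool) {
    let values = preallocate_values(); // outside the measured loop
    let mut cache = ToyCache { map: HashMap::new() };
    let mut misses = 0u64;

    for &key in keys {
        // Measured operation: the miss path clones an Arc, it never calls Arc::new.
        let _value = cache.get_or_insert_with(key, || {
            misses += 1;
            Arc::clone(&values[key as usize])
        });
    }

    let shared = cache
        .map
        .iter()
        .all(|(k, v)| Arc::ptr_eq(v, &values[*k as usize]));
    (misses, shared)
}
```

The pointer-equality check at the end is only there to show that no value was constructed inside the loop. The map itself still allocates as it grows, which in a real benchmark is part of the policy cost being measured.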
bench-support/src/json_results.rs defines the stable JSON schema for results.
SCHEMA_VERSION follows semantic schema rules.

Each BenchmarkArtifact contains:

- metadata: timestamp, git commit, branch, dirty bit, rustc, host, CPU, benchmark config.
- results: rows keyed by policy, workload, and case_id.
- metrics: optional typed sections for hit rate, throughput, latency, eviction, scan resistance, adaptation speed.

The schema is presentation-neutral. Markdown tables and charts are rendered
later by bench-support/src/bin/render_docs.rs, so measurement and presentation
can evolve independently.
Use case_id::* constants from json_results.rs instead of string literals:

- hit_rate
- comprehensive
- scan_resistance
- adaptation

This catches typos at compile time and prevents a result section from silently disappearing from rendered docs. Adding a new case means adding a constant, teaching the runner to populate it, and teaching the renderer how to display it.
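The difference between a constant and a literal can be sketched as follows. The constant names (HIT_RATE and so on), the ResultRow shape, and the rows_for helper are assumptions made for this example; only the four string values come from the list above.

```rust
/// Hypothetical mirror of the case_id module; the constant names are assumed.
mod case_id {
    pub const HIT_RATE: &str = "hit_rate";
    pub const COMPREHENSIVE: &str = "comprehensive";
    pub const SCAN_RESISTANCE: &str = "scan_resistance";
    pub const ADAPTATION: &str = "adaptation";
}

/// Simplified result row for illustration; the real schema carries more fields.
struct ResultRow {
    policy: &'static str,
    case_id: &'static str,
}

/// Renderer-side lookup: selects the rows that belong to one result section.
fn rows_for<'a>(rows: &'a [ResultRow], case: &str) -> Vec<&'a ResultRow> {
    rows.iter().filter(|row| row.case_id == case).collect()
}

/// Rows as a runner might emit them, always tagged through the constants.
fn sample_rows() -> Vec<ResultRow> {
    vec![
        ResultRow { policy: "lru", case_id: case_id::HIT_RATE },
        ResultRow { policy: "lfu", case_id: case_id::HIT_RATE },
        ResultRow { policy: "lru", case_id: case_id::SCAN_RESISTANCE },
        ResultRow { policy: "lru", case_id: case_id::COMPREHENSIVE },
        ResultRow { policy: "lru", case_id: case_id::ADAPTATION },
    ]
}
```

A misspelled literal such as rows_for(&rows, "hitrate") compiles and returns an empty section, whereas a misspelled constant such as case_id::HIT_RAET is rejected by the compiler.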
| Benchmark | Question |
|---|---|
| ops.rs | What is the raw cost of get / insert / policy-specific operations? |
| workloads.rs | Which policies preserve hit rate under standard workloads? |
| comparison.rs | How does cachekit compare with external crates (lru, quick_cache)? |
| policy/*.rs | What is the cost of each policy’s unique operations? |
| reports.rs | What should a human inspect while tuning? |
| runner.rs | What should CI and docs consume? |
Do not overload one benchmark to answer all questions. If you need policy
micro-cost, use ops.rs; if you need hit rate under scans, use workloads.rs
or runner.rs.
Prefer ScrambledZipfian over raw Zipfian for cross-policy comparison when
hardware prefetch could bias hot-key locality.

The docs pipeline runs the benchmark suite, writes
target/benchmarks/<run-id>/results.json, and renders
docs/benchmarks/latest/ plus charts. Release-tag snapshots live under
docs/benchmarks/vX.Y.Z/.
Manual workflow:
```shell
cargo bench --bench runner
./scripts/update_benchmark_docs.sh
```
The script is the high-level path for refreshing published benchmark docs. Use
individual benches (cargo bench --bench ops, cargo bench --bench reports -- scan)
while developing a policy.
To add a policy:

1. Add an arm to for_each_policy! with a concrete constructor.
2. Add a PolicyMeta entry in POLICIES.
3. Run cargo bench --bench reports -- hit_rate for a quick sanity check.
4. Run cargo bench --bench runner before publishing docs.

Keep constructors comparable. If one policy needs Arc<u64> and another stores
u64, choose the value shape that preserves fairness and document the exception
in the registry comment.
To add a workload:

1. Add the generator in bench-support/src/workload.rs.
2. Add a WorkloadCase in the registry with stable id and display name.
3. Document it in docs/benchmarks/workloads.md.

Do not add a workload just because it is mathematically interesting. It should answer a policy-selection question.
- bench-support/src/registry.rs
- bench-support/src/json_results.rs
- benches/runner.rs