KohakuHub/scripts/dev/down_infra.sh at main

mirror of https://github.com/KohakuBlueleaf/KohakuHub.git synced 2026-05-10 15:44:42 -05:00

narugo1992 aff9fd47ef perf(cache): add Valkey-based L2 cache infrastructure

Introduces the prerequisite cache layer tracked in #73. No business code
yet consumes the helpers — this is plumbing only, gated by
KOHAKU_HUB_CACHE_ENABLED (default: false). Subsequent issues will adopt
specific cache patterns on top of this foundation.

Why now: every hot read endpoint currently makes 1–3 LakeFS REST calls
plus several Postgres queries per request, with the only existing cache
being a per-process cachetools.TTLCache in fallback/cache.py — useless
across the default 4-worker uvicorn deployment.

Design highlights (full design in docs/development/cache.md):

- Pure cache, no source-of-truth state. Silent-degradation contract:
  every cache call is wrapped in try/except and falls back to L3 when
  Valkey is unreachable. CI runs a dedicated cache-disabled job to
  regression-guard this.
- L1 (per-worker cachetools) restricted to immutable / content-addressed
  data only — multi-worker uvicorn has no portable cross-worker
  invalidation channel, and constraining L1 to "key contains its own
  version" sidesteps that entirely.
- L2 (Valkey) holds everything. Helpers ship with TTL jitter (±15%
  default), two-level singleflight (asyncio.Lock + Valkey SET NX EX),
  negative cache, generation counters, and per-namespace metrics.
- Persistence: RDB on, AOF off, persistent volume. Mode-A
  (lakefs:commit, lakefs:stat, lakefs:list — commit_id-keyed) survives
  restart safely. Mode-B (mutable) namespaces are flushed on every
  Valkey restart by a run_id-based bootstrap coordinator that
  serializes the flush across workers.
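
The jitter and silent-degradation ideas above can be sketched in a few lines. This is an illustration only, not the code in src/kohakuhub/cache.py: the names `jittered_ttl`, `cached_read`, `cache_get`, and `load_from_source` are hypothetical, and the cache is modelled as a plain callable rather than a Valkey client.

```python
import random
from typing import Any, Callable, Optional


def jittered_ttl(base_ttl: float, jitter: float = 0.15,
                 rng: Optional[random.Random] = None) -> float:
    """Scale base_ttl by a uniform factor in [1 - jitter, 1 + jitter].

    Spreading expiry times keeps keys written together from all
    expiring in the same instant.
    """
    source = rng if rng is not None else random
    return base_ttl * (1.0 + source.uniform(-jitter, jitter))


def cached_read(key: str,
                cache_get: Callable[[str], Any],
                load_from_source: Callable[[str], Any]) -> Any:
    """Try the cache first; on any cache error, fall back to the source.

    cache_get is a hypothetical stand-in for an L2 lookup. It returns
    None on a miss and may raise when the cache is unreachable.
    """
    try:
        hit = cache_get(key)
    except Exception:
        hit = None  # cache unreachable: degrade silently, treat as a miss
    if hit is not None:
        return hit
    return load_from_source(key)
```

With the default of 0.15, a 100-second TTL lands somewhere between roughly 85 and 115 seconds, and a `cache_get` that raises never surfaces an error to the caller.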

Includes:

- src/kohakuhub/cache.py — the helper module (319 stmts, 83% coverage
  via the new test module).
- src/kohakuhub/api/admin/routers/cache.py — admin endpoints exposing
  hit/miss/error counters, Valkey memory state, and bootstrap-flush
  metadata.
- test/kohakuhub/test_cache.py — 34 tests against a real Valkey,
  covering: round-trips, TTL jitter spread, SCAN-based prefix delete
  over >SCAN_BATCH_SIZE keys, two-level singleflight (100 concurrent
  calls fold to 1 fetch), bootstrap flush selectivity (Mode-A survives
  and exactly the Mode-B namespaces are wiped), two-worker bootstrap
  coordination, silent degradation when Valkey is disabled or
  unreachable, generation counters, negative cache, and the Mode-B
  prefix list shape contract.
- docker-compose.example.yml — adds the valkey service with RDB +
  LFU + bind-mounted hub-meta/valkey-data, mirroring the persistence
  pattern of the other stateful services.
- scripts/dev/up_infra.sh / down_infra.sh / reset_local_data.sh —
  Valkey container plumbing for local dev (host port 26379).
- .github/workflows/fullstack-tests.yml — adds valkey to the existing
  matrix services and adds a separate single-Python job
  (backend-tests-cache-disabled) running with KOHAKU_HUB_CACHE_ENABLED=false
  as the silent-degradation contract regression guard.
- docs/development/cache.md — the design doc referenced by the cache
  module's docstrings.
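
For readers unfamiliar with the "100 concurrent calls fold to 1 fetch" behaviour the tests check, here is a minimal sketch of the in-process half of singleflight using asyncio.Lock. It is an assumption-laden illustration, not the project's helper: the `SingleFlight` class and `demo` function are hypothetical, a plain dict stands in for the cache, and the cross-worker level (Valkey SET NX EX) is left out entirely.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class SingleFlight:
    """Collapse concurrent fetches for the same key into one call."""

    def __init__(self) -> None:
        self._locks: Dict[str, asyncio.Lock] = {}
        self._results: Dict[str, Any] = {}  # stand-in for the real cache

    async def do(self, key: str, fetch: Callable[[], Awaitable[Any]]) -> Any:
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            # Re-check after acquiring the lock: an earlier holder may
            # already have stored the value.
            if key in self._results:
                return self._results[key]
            value = await fetch()
            self._results[key] = value
            return value


async def demo() -> tuple:
    calls = 0

    async def fetch() -> str:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate a slow backend call
        return "payload"

    sf = SingleFlight()
    results = await asyncio.gather(
        *(sf.do("repo:main", fetch) for _ in range(100))
    )
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(demo())
    print(calls, len(results))
```

The first caller holds the lock while it fetches; the other 99 wait on the same lock and then read the stored value instead of fetching again.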

Refs: #73

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-30 19:21:04 +08:00

627 B

Executable File
