[AZ-308] c6 CacheBudgetEnforcer: 10 GB hard cap + LRU sweep

CacheBudgetEnforcer.reserve_headroom(needed_bytes) returns immediately
when total_disk_bytes() + needed_bytes <= budget, otherwise iterates
lru_candidates in eviction_batch_size batches, deletes via delete_tile,
emits one INFO log per evicted tile (c6.evicted) and one FDR record per
eviction batch (c6.eviction_batch, evicted_tile_ids capped to 5).
Raises CacheBudgetExhaustedError AFTER a full sweep if the budget
cannot be met. BudgetEnforcedTileStore decorates a TileStore so the
policy stays separable from PostgresFilesystemStore. Composition root
in storage_factory.build_tile_store wires the wrapper unconditionally.

PostgresFilesystemStore now accepts lru_clock: Clock | None = None;
when set, read_tile_pixels calls record_lru_access(tile_id, now) so
eviction picks the right LRU candidates. Production wiring injects
WallClock(); AZ-305 unit tests still construct without the clock and
keep their pass-through semantics. Contract tile_store.md bumped to
v1.1.0 to add CacheBudgetExhaustedError to the TileCacheError family;
shared FDR schema bumped to v1.3.0 for the new c6.eviction_batch kind.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 20:37:41 +03:00
parent 39ff47087f
commit d571ca25f9
13 changed files with 1588 additions and 29 deletions
@@ -1,207 +0,0 @@
# C6 Cache Budget Eviction — 10 GB Hard Cap with LRU Sweep
**Task**: AZ-308_c6_cache_budget_eviction
**Name**: C6 Cache Budget Eviction
**Description**: Implement the 10 GB cache-budget enforcer per RESTRICT-SAT-2 (cache budget across operational area). Wraps `PostgresFilesystemStore.write_tile` with a pre-write head-room check; on overflow, drives an LRU sweep using the store's `lru_candidates(max_count) -> list[TileMetadata]` and `delete_tile(tile_id) -> bool` primitives until enough head-room is freed. Emits an INFO log per eviction (`kind="c6.evicted"` with `tile_id`, `disk_bytes`, `accessed_at`) and an FDR record per eviction batch. Hard cap is config-driven (`config.tile_cache.cache_budget_bytes`, default 10 GB). Defends against the silent-overflow failure mode where a runaway F4 burst would push past the cap and either fill the disk or get arbitrarily evicted by the OS.
**Complexity**: 3 points
**Dependencies**: AZ-303_c6_storage_interfaces, AZ-305_c6_postgres_filesystem_store, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-273_fdr_client_ringbuf
**Component**: c6_tile_cache (epic AZ-250 / E-C6)
**Tracker**: AZ-308
**Epic**: AZ-250 (E-C6)
### Document Dependencies
- `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md` — defines `lru_candidates`, `record_lru_access`, `total_disk_bytes`, `delete_tile` primitives this task consumes; produced by AZ-303.
- `_docs/02_document/contracts/c6_tile_cache/tile_store.md` — `delete_tile` semantics (idempotent, returns `False` on missing).
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` — `kind="c6.eviction_batch"` envelope.
- `_docs/02_document/contracts/shared_logging/log_record_schema.md` — INFO log shape per evicted tile.
## Problem
Without budget enforcement:
- RESTRICT-SAT-2 (10 GB cache budget) collapses — the cache grows unboundedly under sustained F4 mid-flight ingest; eventually the disk fills.
- An adversarial mid-flight ingest path (compromised companion sending bogus tiles via the F4 boundary, even before the parent-suite trust layer applies) could DoS the disk.
- The OS would silently evict the OS page cache to make room, blowing the C6-PT-01 read-side latency budget for legitimate tiles.
- C6-IT-06 (synthetic 10 GB fill + 100 MB overrun → eviction count matches insert overrun) cannot pass because no eviction path exists yet.
- Operators cannot tell which tiles got dropped — there's no per-eviction log or FDR trace.
This task is the budget enforcer. The store's primitives are now ready; this is the policy layer that consumes them.
## Outcome
- A `CacheBudgetEnforcer` class at `src/gps_denied_onboard/components/c6_tile_cache/cache_budget_enforcer.py` with one public method `reserve_headroom(needed_bytes: int) -> EvictionResult` that, when called pre-write:
1. Reads `total_disk_bytes()` from the metadata store.
2. Computes `available_bytes = budget_bytes - current_bytes`.
3. If `available_bytes >= needed_bytes` → returns `EvictionResult(evicted=[], freed_bytes=0)` immediately (no eviction needed).
4. Else → runs an LRU sweep: in batches of `eviction_batch_size` (default 32), calls `lru_candidates(max_count=batch_size)`, iterates the candidates, calls `delete_tile(tile_id)` for each, accumulates `freed_bytes` (from each candidate's `disk_bytes`), and stops once `freed_bytes >= (needed_bytes - available_bytes)`, i.e. once the shortfall is covered. If the candidates run out before the shortfall is covered, raises `CacheBudgetExhaustedError` (NEW error type subclassing `TileCacheError`).
5. Returns `EvictionResult(evicted: list[TileMetadata], freed_bytes: int)` listing what was evicted.
- Constructor signature: `__init__(self, *, store: TileMetadataStore, fdr_client: FdrClient, logger: Logger, budget_bytes: int, eviction_batch_size: int = 32)`. The `store` argument is type-hinted as `TileMetadataStore` for clarity, but the enforcer also calls `delete_tile` on it: the runtime object is a `PostgresFilesystemStore`, which implements both the `TileMetadataStore` and `TileStore` Protocols on one instance.
- The `PostgresFilesystemStore.write_tile` method is wrapped at the composition root: a `BudgetEnforcedTileStore` decorator class wraps the store and calls `enforcer.reserve_headroom(len(tile_blob))` before delegating to the wrapped store's `write_tile`. This is the same decorator pattern AZ-307's freshness gate uses (composition-root wiring).
- Per-eviction INFO log: `kind="c6.evicted"`, payload `{tile_id, disk_bytes, accessed_at, evicted_at}` — one log entry per evicted tile.
- Per-batch FDR record: `kind="c6.eviction_batch"`, payload `{trigger_tile_id, freed_bytes, evicted_count, evicted_tile_ids[:5]}` (first 5 evicted ids — keeps the FDR record bounded; the full list is in the logs).
- The `record_lru_access` primitive is wired into the read path of `PostgresFilesystemStore.read_tile_pixels` (AZ-305 already declares it; this task's wiring change ensures every read updates the LRU clock so eviction picks the right candidates).
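
The `reserve_headroom` steps above can be sketched as follows. This is a minimal illustration, not the shipped implementation: `InMemoryStore` is a hypothetical fake of the AZ-303 primitives, `TileMetadata` is reduced to three fields, the error class does not subclass the real `TileCacheError`, and the INFO log and FDR emission are omitted. It also assumes every `delete_tile` call removes the row, so a later `lru_candidates` call never returns the same tile twice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileMetadata:
    # Hypothetical minimal shape; the real type comes from AZ-303.
    tile_id: str
    disk_bytes: int
    accessed_at: float


@dataclass(frozen=True)
class EvictionResult:
    evicted: list
    freed_bytes: int


class CacheBudgetExhaustedError(Exception):
    """Stand-in; the spec makes this a TileCacheError subclass."""


class InMemoryStore:
    """Fake store exposing only the primitives the enforcer consumes."""

    def __init__(self, tiles):
        self._tiles = {t.tile_id: t for t in tiles}

    def total_disk_bytes(self):
        return sum(t.disk_bytes for t in self._tiles.values())

    def lru_candidates(self, max_count):
        oldest_first = sorted(self._tiles.values(), key=lambda t: t.accessed_at)
        return oldest_first[:max_count]

    def delete_tile(self, tile_id):
        return self._tiles.pop(tile_id, None) is not None


def reserve_headroom(store, budget_bytes, needed_bytes, batch_size=32):
    available = budget_bytes - store.total_disk_bytes()
    if available >= needed_bytes:
        return EvictionResult(evicted=[], freed_bytes=0)  # fast path, no sweep
    shortfall = needed_bytes - available
    evicted, freed = [], 0
    while freed < shortfall:
        batch = store.lru_candidates(max_count=batch_size)
        if not batch:
            # Raised only after everything evictable is already gone.
            raise CacheBudgetExhaustedError(
                f"needed_bytes={needed_bytes} "
                f"available_bytes={available + freed} "
                f"evicted_count={len(evicted)}"
            )
        for tile in batch:
            store.delete_tile(tile.tile_id)
            evicted.append(tile)
            freed += tile.disk_bytes
            if freed >= shortfall:
                break
    return EvictionResult(evicted=evicted, freed_bytes=freed)


MB = 1024 * 1024
# Ten 5 MB tiles fill a 50 MB budget exactly; t0 is the least recently used.
store = InMemoryStore([TileMetadata(f"t{i}", 5 * MB, float(i)) for i in range(10)])
result = reserve_headroom(store, budget_bytes=50 * MB, needed_bytes=30 * MB)
```

In the usage at the bottom, the store sits at the cap and 30 MB is requested, so the sweep stops after the six oldest tiles, which mirrors the AC-3 scenario at a smaller budget.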
## Scope
### Included
- `CacheBudgetEnforcer` class with the `reserve_headroom` method.
- `EvictionResult` dataclass `@dataclass(frozen=True)`.
- `CacheBudgetExhaustedError` (subclass of `TileCacheError` — added to `c6_tile_cache.errors`).
- `BudgetEnforcedTileStore` decorator class wrapping a `TileStore` and calling the enforcer pre-write.
- Composition-root wiring: the factory `build_tile_store` returns a `BudgetEnforcedTileStore` wrapping `PostgresFilesystemStore` when `config.tile_cache.cache_budget_bytes > 0`; returns the bare store otherwise (for Tier-0 dev where the budget is irrelevant).
- INFO log per evicted tile.
- FDR record per eviction batch.
- A wiring change in `PostgresFilesystemStore.read_tile_pixels` to call `record_lru_access(tile_id, now)` on every read (so the LRU clock stays current). This is a small additive change to AZ-305's class, implemented as a constructor argument `lru_clock: Clock | None = None` so AZ-305 stays unit-testable; when `lru_clock` is provided, every read records an LRU access stamped with the clock's current time.
- A standalone CLI `python -m c6_tile_cache.cache_budget_enforcer dry-run --pretend-needed-bytes N` for operators to inspect what would be evicted without performing the eviction.
- A construction-time INFO log with `budget_bytes`, `current_disk_bytes`, `headroom_bytes`.
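
A minimal sketch of the decorator and the conditional wiring from the list above, under stated assumptions: `RecordingEnforcer` and `RecordingStore` are hypothetical test fakes, the error classes are stand-ins for the real `c6_tile_cache.errors` types, and `build_tile_store` here takes its collaborators as arguments rather than reading a config object. The `__getattr__` pass-through is one possible way to forward the rest of the `TileStore` surface, not necessarily the one the task uses.

```python
class TileCacheError(Exception):
    """Stand-in for the c6_tile_cache error family."""


class ContentHashMismatchError(TileCacheError):
    pass


class BudgetEnforcedTileStore:
    """Calls the enforcer before every write, then delegates unchanged."""

    def __init__(self, inner, enforcer):
        self._inner = inner
        self._enforcer = enforcer

    def write_tile(self, tile_blob, metadata):
        # Head-room is reserved first; any eviction completes before the write.
        self._enforcer.reserve_headroom(len(tile_blob))
        # No try/except: errors from the wrapped store propagate as raised.
        return self._inner.write_tile(tile_blob, metadata)

    def __getattr__(self, name):
        # Everything except write_tile is forwarded to the wrapped store.
        return getattr(self._inner, name)


def build_tile_store(inner, enforcer_factory, cache_budget_bytes):
    """Wrap only when a positive budget is configured (hypothetical signature)."""
    if cache_budget_bytes > 0:
        return BudgetEnforcedTileStore(inner, enforcer_factory(cache_budget_bytes))
    return inner


class RecordingEnforcer:
    def __init__(self, calls):
        self.calls = calls

    def reserve_headroom(self, needed_bytes):
        self.calls.append(("reserve", needed_bytes))


class RecordingStore:
    def __init__(self, calls, fail=False):
        self.calls = calls
        self.fail = fail

    def write_tile(self, tile_blob, metadata):
        if self.fail:
            raise ContentHashMismatchError("hash mismatch")
        self.calls.append(("write", len(tile_blob)))


calls = []
wrapped = build_tile_store(
    RecordingStore(calls), lambda budget: RecordingEnforcer(calls), cache_budget_bytes=10
)
wrapped.write_tile(b"abc", {"tile_id": "t1"})
```

After the final call, `calls` records the reservation ahead of the write, which is the ordering AC-6 asks for.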
### Excluded
- Voting-status-aware eviction (e.g., "evict PENDING before TRUSTED") — out of scope this cycle. The LRU-only policy is the simplest enforcement and matches RESTRICT-SAT-2 directly. A future task can add voting-tier-weighted eviction.
- Eviction-throttling under sustained-burst pressure (e.g., "stop accepting new writes if eviction-rate exceeds threshold") — out of scope; the budget is a hard cap, not a soft one.
- Per-zoom or per-source quota — out of scope; the budget is global.
- Background-sweep eviction (eager eviction on a timer) — out of scope; eviction runs only on write-side budget pressure.
- `delete_tile` failure handling beyond logging — if `delete_tile` returns `False` (already missing) or raises `TileFsError`, the enforcer logs and continues; the budget calculation still subtracts the row's `disk_bytes` because the row is gone.
- Cross-flight cache state — every flight starts with whatever the prior flight's persistent state was; eviction is per-flight bookkeeping.
## Acceptance Criteria
**AC-1: No-eviction fast path**
Given `budget_bytes = 10 GB`, `current_disk_bytes = 1 GB`
When `reserve_headroom(needed_bytes=10 MB)` is called
Then the result is `EvictionResult(evicted=[], freed_bytes=0)`; no `lru_candidates` call is made (verifiable via mock counter); no INFO log; no FDR record
**AC-2: Single-tile eviction frees enough**
Given `budget_bytes = 10 GB`, `current_disk_bytes = 9.99 GB` (10 MB headroom), and an LRU candidate with `disk_bytes = 50 MB`
When `reserve_headroom(needed_bytes = 30 MB)` is called
Then the candidate is deleted; result: `evicted=[that tile]`, `freed_bytes=50 MB`; one INFO log per evicted tile; one FDR `kind="c6.eviction_batch"` record
**AC-3: Multi-tile eviction iterates LRU candidates**
Given `current_disk_bytes` at the cap and 10 LRU candidates each `disk_bytes = 5 MB`
When `reserve_headroom(needed_bytes = 30 MB)` is called
Then exactly the 6 oldest candidates are evicted (6 × 5 MB = 30 MB matches the need); the 7th (and onwards) are left alone
**AC-4: Eviction batches respect `eviction_batch_size`**
Given 100 LRU candidates and an `eviction_batch_size` of 32
When the eviction needs to free 50 candidates' worth
Then `lru_candidates` is called with `max_count=32` first, then again with `max_count=32` for the remaining; total 2 SELECTs (verifiable via psycopg query log)
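
The two-SELECT figure in AC-4 is a ceiling division. The helper below is a hypothetical illustration, assuming equally sized candidates, that every `delete_tile` succeeds, and that the loop stops as soon as the shortfall is covered without issuing a trailing probe query.

```python
import math


def expected_lru_candidate_calls(tiles_to_evict: int, batch_size: int) -> int:
    """Number of lru_candidates(max_count=batch_size) calls for a sweep."""
    if tiles_to_evict <= 0:
        return 0  # fast path: no sweep, no candidate query
    return math.ceil(tiles_to_evict / batch_size)


# AC-4: 50 tiles to evict in batches of 32 -> one full batch plus one partial.
calls_for_ac4 = expected_lru_candidate_calls(50, 32)
```

Under the same assumptions, evicting all 100 candidates would take four calls, since three batches of 32 cover only 96 tiles.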
**AC-5: Insufficient candidates raise CacheBudgetExhaustedError**
Given a budget so small that even evicting every existing tile won't free `needed_bytes`
When `reserve_headroom(needed_bytes = 50 GB)` is called
Then `CacheBudgetExhaustedError` is raised AFTER all candidates have been evicted (the eviction loop runs to completion before raising; this is so the operator's recovery path has the maximum head-room possible); the error message names `needed_bytes`, `available_bytes`, `evicted_count`
**AC-6: BudgetEnforcedTileStore decorator integrates with write_tile**
Given a `BudgetEnforcedTileStore` wrapping a `PostgresFilesystemStore`, with `current_disk_bytes` near the cap
When `write_tile(tile_blob, metadata)` is called
Then the enforcer's `reserve_headroom(len(tile_blob))` runs first; if eviction was triggered, the evicted tiles are gone before the new write proceeds; the new tile lands successfully
**AC-7: BudgetEnforcedTileStore propagates TileCacheError unchanged**
Given the wrapped store raises a `ContentHashMismatchError`
When `write_tile` is called via the decorator
Then the same `ContentHashMismatchError` propagates unchanged; the decorator does NOT swallow or rewrap
**AC-8: read_tile_pixels updates the LRU clock**
Given a `PostgresFilesystemStore` constructed WITH the `lru_clock` injection
When `read_tile_pixels(tile_id)` is called
Then `record_lru_access(tile_id, now())` is invoked exactly once with `now() = clock.utcnow()`; the row's `accessed_at` is updated (verifiable via subsequent `lru_candidates` ordering)
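
The opt-in clock injection behind AC-8 might look like the sketch below. `SketchStore` is a hypothetical in-memory stand-in for `PostgresFilesystemStore`, and `FixedClock` is an assumed test double; only the `lru_clock` argument, `record_lru_access`, `read_tile_pixels`, and `utcnow()` names come from the spec.

```python
from datetime import datetime, timezone


class WallClock:
    def utcnow(self):
        return datetime.now(timezone.utc)


class FixedClock:
    """Test double that always returns the same instant."""

    def __init__(self, instant):
        self._instant = instant

    def utcnow(self):
        return self._instant


class SketchStore:
    def __init__(self, pixels_by_id, lru_clock=None):
        self._pixels = pixels_by_id
        self._lru_clock = lru_clock  # None keeps the AZ-305 pass-through behaviour
        self.accessed_at = {}

    def record_lru_access(self, tile_id, now):
        self.accessed_at[tile_id] = now

    def read_tile_pixels(self, tile_id):
        pixels = self._pixels[tile_id]
        if self._lru_clock is not None:
            # Exactly one LRU update per successful read.
            self.record_lru_access(tile_id, self._lru_clock.utcnow())
        return pixels


instant = datetime(2026, 5, 12, 17, 0, tzinfo=timezone.utc)
with_clock = SketchStore({"t1": b"\x00\x01"}, lru_clock=FixedClock(instant))
without_clock = SketchStore({"t1": b"\x00\x01"})
pixels_a = with_clock.read_tile_pixels("t1")
pixels_b = without_clock.read_tile_pixels("t1")
```

Both reads return the same pixels; only the store built with a clock records an access time.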
**AC-9: Single eviction is O(1) extra disk-bytes query**
Given a no-eviction-needed call and an eviction-needed call
When the test counts SELECT queries
Then the no-eviction path executes 1 SELECT (`total_disk_bytes()`); the eviction path executes 1 SELECT for `total_disk_bytes`, plus one SELECT per `lru_candidates` batch, plus one UPDATE per evicted tile for the `delete_tile` row deletes; the query count grows linearly with the number of evicted tiles, with no quadratic blowup
**AC-10: 10 GB budget enforcement under synthetic load**
Given `budget_bytes = 10 GB - 50 MB`, then 100 MB of new tiles inserted in 5 MB chunks
When the test runs
Then total disk usage stays ≤ 10 GB at all times (verifiable via `total_disk_bytes` between every write); the eviction count matches the insert overrun (≥ 50 MB - 50 MB = 0; depends on prior current_bytes — the test's exact pre-state is documented in C6-IT-06); every eviction is logged at INFO
**AC-11: FDR eviction-batch payload bounded**
Given a single `reserve_headroom` call that triggers 100 evictions
When the FDR record is captured
Then the record contains `evicted_count=100`, `evicted_tile_ids` of length AT MOST 5 (first 5 in eviction order); the record size stays bounded regardless of how many evictions occurred
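
A small sketch of how the bounded payload in AC-11 could be assembled. The function name and the `(tile_id, disk_bytes)` tuple input are assumptions for illustration; the four payload keys come from the Outcome section, and the surrounding FDR envelope defined in `fdr_record_schema.md` is not modelled here.

```python
def build_eviction_batch_payload(trigger_tile_id, evicted, max_ids=5):
    """Build the c6.eviction_batch payload from (tile_id, disk_bytes) pairs.

    `evicted` must be in eviction order so the id sample is the first few.
    """
    return {
        "trigger_tile_id": trigger_tile_id,
        "freed_bytes": sum(disk_bytes for _, disk_bytes in evicted),
        "evicted_count": len(evicted),  # full count, never truncated
        "evicted_tile_ids": [tile_id for tile_id, _ in evicted[:max_ids]],
    }


# 100 evictions of 1 KiB each; only the first five ids reach the record.
evictions = [(f"tile-{i:03d}", 1024) for i in range(100)]
payload = build_eviction_batch_payload("tile-new", evictions)
```

Because the id list is sliced while the count is taken from the full list, the record size stays constant however large the sweep was.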
**AC-12: Construction-time disk-bytes report**
Given a `CacheBudgetEnforcer` constructed against a non-empty store
When the construction completes
Then an INFO log `kind="c6.budget.loaded"` is emitted with `budget_bytes`, `current_disk_bytes`, `headroom_bytes`; if `current_disk_bytes > budget_bytes` (over-budget at startup), an additional WARN log is emitted naming the overage
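
The construction-time report in AC-12 can be sketched with the standard-library `logging` module standing in for the AZ-266 logger, whose real API is not shown in this document. The WARN `kind` value `c6.budget.over_budget`, the `overage_bytes` field, and reporting a negative `headroom_bytes` when over budget are all assumptions.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in memory so the sketch can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def report_budget_at_construction(logger, budget_bytes, current_disk_bytes):
    headroom = budget_bytes - current_disk_bytes  # negative when over budget
    logger.info(
        "c6.budget.loaded",
        extra={
            "kind": "c6.budget.loaded",
            "budget_bytes": budget_bytes,
            "current_disk_bytes": current_disk_bytes,
            "headroom_bytes": headroom,
        },
    )
    if current_disk_bytes > budget_bytes:
        # Report only; the spec says construction does not evict proactively.
        logger.warning(
            "c6.budget.over_budget",
            extra={
                "kind": "c6.budget.over_budget",  # hypothetical kind name
                "overage_bytes": current_disk_bytes - budget_bytes,
            },
        )


handler = ListHandler()
logger = logging.getLogger("c6.budget.sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
report_budget_at_construction(logger, budget_bytes=100, current_disk_bytes=130)
```

With the over-budget inputs above, the handler ends up holding one INFO record followed by one WARNING record.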
## Non-Functional Requirements
**Performance**
- No-eviction path p99 ≤ 5 ms (one `total_disk_bytes` query).
- Eviction path: per-evicted-tile cost is dominated by the `delete_tile` UPDATE + filesystem unlink (~5–10 ms each on Tier-2 SSD); a typical 5 Hz F4 burst evicting 1–2 tiles per write keeps the write-side latency under 30 ms.
- The eviction loop does NOT block the F3 hot path — eviction runs synchronously inside `write_tile`, which is on the F4 producer thread (not C2 / C2.5 / C3 reads).
**Reliability**
- Eviction is idempotent — `delete_tile` returning `False` is a no-op (the candidate was already evicted by a concurrent path); the enforcer logs and continues.
- Construction-time over-budget detection (AC-12 WARN log) catches the case where the prior flight ended over-budget; the enforcer does NOT proactively evict on construction (operator may want to inspect the over-budget state first).
- The enforcer is the SOLE eviction path during a flight — no other component evicts tiles. Code-review's Architecture phase treats unauthorised `delete_tile` callers as findings.
**Compatibility**
- Reuses the AZ-303 Protocols + AZ-273 FdrClient + AZ-266 logger.
- No new third-party dependencies.
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | reserve_headroom with ample budget | EvictionResult empty; no lru_candidates call; no logs |
| AC-2 | reserve_headroom with tight budget; one 50 MB candidate | One eviction; freed_bytes=50 MB; one INFO; one FDR |
| AC-3 | 10 candidates × 5 MB; need 30 MB | Exactly 6 evicted; 7th-10th untouched |
| AC-4 | 100 candidates available; 50 need evicting; batch_size=32 | Exactly 2 lru_candidates calls, each with max_count=32 |
| AC-5 | Need 50 GB; budget far smaller | All candidates evicted; CacheBudgetExhaustedError raised AFTER |
| AC-6 | BudgetEnforcedTileStore + write_tile near cap | Pre-write eviction; new tile lands |
| AC-7 | Wrapped store raises ContentHashMismatchError | Same exception propagates from decorator |
| AC-8 | read_tile_pixels with lru_clock injected | record_lru_access called exactly once with clock.utcnow() |
| AC-9 | Query-count tally for no-evict and evict paths | 1 SELECT for no-evict; 1 + one per batch + one per evicted tile for evict |
| AC-10 | C6-IT-06 synthetic 10 GB fill + 100 MB overrun | Disk usage ≤ 10 GB throughout; eviction count matches overrun; logs emitted |
| AC-11 | reserve_headroom triggering 100 evictions | FDR record has evicted_count=100; evicted_tile_ids length ≤ 5 |
| AC-12 | Construct enforcer over a non-empty store | INFO log on construction; WARN if over-budget |
| NFR-perf-no-evict | Microbench reserve_headroom × 10000 (no-evict path) | p99 ≤ 5 ms |
| NFR-reliability-delete-already-gone | reserve_headroom with a candidate that a concurrent caller deletes first | delete_tile returns False; enforcer logs at INFO and continues |
## Constraints
- LRU is the only eviction policy this cycle — voting-tier-aware is a future task.
- `eviction_batch_size` is config-driven; default 32 is a reasonable balance between query overhead and memory residence of the candidate list.
- `CacheBudgetExhaustedError` is raised AFTER the eviction loop completes — partial eviction is preferable to no eviction even when the budget cannot be met (frees up as much head-room as possible for whatever the operator decides to do next).
- The decorator pattern (`BudgetEnforcedTileStore`) is mandatory — modifying `PostgresFilesystemStore.write_tile` to do the budget check directly would couple the policy to the impl, breaking the single-responsibility design.
- The `record_lru_access` injection into `read_tile_pixels` is OPT-IN (constructor arg `lru_clock: Clock | None = None`) so AZ-305's tests can run the store WITHOUT the LRU update; production wiring always passes the clock.
- The FDR `evicted_tile_ids` cap (first 5) keeps the record bounded; the full list is in the INFO logs which can be replayed post-flight.
- This task does NOT introduce new third-party dependencies.
## Risks & Mitigation
**Risk 1: LRU thrashing under sustained F4 burst**
- *Risk*: Under a 5 Hz mid-flight ingest sustained near the cap, every write evicts an old tile; the cache becomes a sliding window and the operational area shrinks.
- *Mitigation*: This is the intended behaviour — the cap is hard, and freshness wins. Operator can bump the cap via `config.tile_cache.cache_budget_bytes`. The FDR's eviction-batch records show post-flight whether thrashing occurred.
**Risk 2: Eviction races a concurrent read**
- *Risk*: A reader (C2/C2.5/C3) holds a `TilePixelHandle` for tile T; the enforcer evicts T; the reader's mmap goes stale.
- *Mitigation*: Per `tile_store.md` Constraints, consumers MUST NOT cache `TilePixelHandle` across calls — use within a `with` block and release. The OS keeps the fd alive until the consumer's `__exit__`, so the mmap is read-correct even after the file is unlinked. Documented and tested in AZ-305's read-handle lifecycle test.
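
The unlink-while-mapped behaviour this mitigation relies on can be demonstrated in isolation. The sketch below is POSIX-only (on Windows, unlinking an open file typically fails) and uses a plain temporary file rather than the real `TilePixelHandle`, whose internals are not shown here.

```python
import mmap
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Map a file, unlink it, then read the mapping (payload must be non-empty)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        with mmap.mmap(fd, len(payload), access=mmap.ACCESS_READ) as view:
            os.unlink(path)  # simulates the enforcer evicting the tile file
            # The directory entry is gone, but the mapped data stays readable
            # until the mapping and descriptor are released.
            return view[:]
    finally:
        os.close(fd)
        if os.path.exists(path):
            os.unlink(path)


recovered = read_after_unlink(b"tile-pixels")
```

The bytes read back match what was written even though the file name no longer exists, which is why a reader inside its `with` block is unaffected by a concurrent eviction.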
**Risk 3: Construction-time over-budget triggers cascade eviction**
- *Risk*: The previous flight ended over-budget; on next start, the enforcer sees `current_disk_bytes > budget_bytes`; the first `reserve_headroom` evicts a lot to get back under cap.
- *Mitigation*: AC-12 WARN log surfaces the over-budget state at construction. The first F4 write triggers normal eviction; AC-5 covers the worst case (`CacheBudgetExhaustedError` if evicting every candidate still cannot make room for the new tile).
**Risk 4: `delete_tile` partially fails (filesystem unlink succeeds, row delete fails) leaving a dangling row**
- *Risk*: AZ-305's `delete_tile` is supposed to be a single transaction with the filesystem op; if the filesystem unlink succeeds but the row delete fails, the row claims `disk_bytes > 0` but the file is gone. `total_disk_bytes` is now wrong.
- *Mitigation*: This is the same partial-failure window AZ-305 § Risk 1 covers via the construction-time reconciliation scan. The enforcer doesn't add new risk here; the reconciliation is the fix-up point.
**Risk 5: Operator bumps `cache_budget_bytes` mid-flight**
- *Risk*: Operator edits config to raise the cap mid-flight; the enforcer's `budget_bytes` is fixed at construction; the change is silent.
- *Mitigation*: Documented constraint — config is per-flight; mid-flight changes require a process restart. Future task could add a SIGHUP-driven reload — out of scope this cycle.
## Runtime Completeness
- **Named capability**: 10 GB hard-cap eviction with LRU sweep enforcing RESTRICT-SAT-2 (description.md / E-C6 / RESTRICT-SAT-2 / C6-IT-06).
- **Production code that must exist**: real `CacheBudgetEnforcer` class with real `total_disk_bytes` query, real `lru_candidates` iteration, real `delete_tile` calls, real INFO logs per eviction, real FDR records per batch, real `BudgetEnforcedTileStore` decorator wrapping the production store, real `record_lru_access` wiring on every read.
- **Allowed external stubs**: tests MAY use a fake `TileMetadataStore` (in-memory implementation of the AZ-303 Protocol with simulated `lru_candidates` ordering) and fake `FdrClient` / `Logger`; production wiring uses real `PostgresFilesystemStore` + real AZ-273 `FdrClient` + real AZ-266 `Logger`.
- **Unacceptable substitutes**: a "soft cap" that logs but doesn't actually evict (would defeat RESTRICT-SAT-2 — the cap is hard); a background-sweep timer that evicts asynchronously (would race with `write_tile` and lose the head-room guarantee at the call site); skipping the LRU update on read (would make eviction pick wrong candidates); rewrapping/swallowing `TileCacheError` in the decorator (would hide insert-side errors from the F4 path).
## Contract
This task implements behaviour mandated by `_docs/02_document/contracts/c6_tile_cache/tile_metadata_store.md` § I-4 (LRU clock) + § I-5 (disk-budget invariant) + the methods `lru_candidates`, `record_lru_access`, `total_disk_bytes`. No new contract file — the enforcer is a policy implementation behind the existing Protocol surface. `CacheBudgetExhaustedError` is added to the documented `TileCacheError` family in `tile_store.md` § Error types via a minor contract version bump (1.0.0 → 1.1.0); the producer task (this one) updates the contract's Change Log when shipping.