[AZ-308] c6 CacheBudgetEnforcer: 10 GB hard cap + LRU sweep

CacheBudgetEnforcer.reserve_headroom(needed_bytes) returns immediately
when total_disk_bytes() + needed_bytes <= budget, otherwise iterates
lru_candidates in eviction_batch_size batches, deletes via delete_tile,
emits one INFO log per evicted tile (c6.evicted) and one FDR record per
eviction batch (c6.eviction_batch, evicted_tile_ids capped to 5).
Raises CacheBudgetExhaustedError AFTER a full sweep if the budget
cannot be met. BudgetEnforcedTileStore decorates a TileStore so the
policy stays separable from PostgresFilesystemStore. Composition root
in storage_factory.build_tile_store wires the wrapper unconditionally.

PostgresFilesystemStore now accepts lru_clock: Clock | None = None;
when set, read_tile_pixels calls record_lru_access(tile_id, now) so
eviction picks the right LRU candidates. Production wiring injects
WallClock(); AZ-305 unit tests still construct without the clock and
keep their pass-through semantics. Contract tile_store.md bumped to
v1.1.0 to add CacheBudgetExhaustedError to the TileCacheError family;
shared FDR schema bumped to v1.3.0 for the new c6.eviction_batch kind.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 20:37:41 +03:00
parent 39ff47087f
commit d571ca25f9
13 changed files with 1588 additions and 29 deletions
@@ -7,7 +7,7 @@
- AZ-TBD-c6-freshness-gate (insert hook collaborator)
- AZ-TBD-c6-cache-budget-eviction (uses `tile_exists` + `delete_tile`)
- TBD at decompose time: E-C2.5 (AZ-256), E-C3 (AZ-257), E-C11 (AZ-251 — both `TileDownloader` and `TileUploader`)
**Version**: 1.0.0
**Version**: 1.1.0
**Status**: draft
**Last Updated**: 2026-05-10
@@ -104,11 +104,12 @@ All under `c6_tile_cache.errors`:
```
TileCacheError (Exception subclass)
├── TileNotFoundError # tile_id not present on disk
├── TileFsError # I/O error on read/write/rename
├── TileMetadataError # row missing despite file present, or vice-versa (consistency violation)
├── ContentHashMismatchError # supplied JPEG bytes don't match declared content_sha256
└── FreshnessRejectionError # rejected by the C6 freshness gate (raised on insert in active_conflict)
├── TileNotFoundError # tile_id not present on disk
├── TileFsError # I/O error on read/write/rename
├── TileMetadataError # row missing despite file present, or vice-versa (consistency violation)
├── ContentHashMismatchError # supplied JPEG bytes don't match declared content_sha256
├── FreshnessRejectionError # rejected by the C6 freshness gate (raised on insert in active_conflict)
└── CacheBudgetExhaustedError # LRU sweep ran to completion but couldn't free `needed_bytes` (AZ-308)
```
`IndexUnavailableError` lives under the same package but is exclusively raised by `DescriptorIndex` — it is not part of `TileStore`'s envelope.
@@ -164,3 +165,4 @@ JPEG body lands at `<root>/tiles/{zoom_level}/{x}/{y}.jpg` where `(x, y)` is der
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-10 | Initial contract — Protocol + DTOs + 5-error family + filesystem byte-identity invariant. | autodev (decompose Step 2 of AZ-250 / E-C6) |
| 1.1.0 | 2026-05-12 | Additive: `CacheBudgetExhaustedError` joins the `TileCacheError` family for AZ-308 cache-budget enforcement. No existing-shape changes. | autodev (AZ-308) |
@@ -3,7 +3,7 @@
**Component**: shared_fdr_client (cross-cutting concern owned by E-CC-FDR-CLIENT / AZ-247)
**Producer task**: AZ-272 — `_docs/02_tasks/todo/AZ-272_fdr_record_schema.md`
**Consumer tasks**: every onboard component that emits FDR records (C1–C13), the C13 writer (AZ-248 / E-C13), post-flight tooling (E-C12 operator side), the FdrClient ring buffer (AZ-XX / E-CC-FDR-CLIENT next task), and `FakeFdrSink` (AZ-XX / E-CC-FDR-CLIENT fourth task)
**Version**: 1.2.0
**Version**: 1.3.0
**Status**: draft
**Last Updated**: 2026-05-12
@@ -57,6 +57,7 @@ class FdrRecord:
| `c6.write_failed` | C6 (`PostgresFilesystemStore`) | `{tile_id, source, reason, error_class, message}` | v1.1.0 (AZ-305). Emitted on every failed `write_tile` path. `reason` ∈ `{content_hash_mismatch, freshness_reject, metadata_error, fs_error}`; `error_class` is the exception class name; `message` is the rewrapped exception's `str` (truncated to 512 chars to keep the record inline). Envelope `producer_id="c6_tile_cache.store"`. |
| `c6.freshness.rejected` | C6 (`FreshnessGate`) | `{tile_id, age_seconds, classification, rule_action, rule_max_age_seconds}` | v1.2.0 (AZ-307). Emitted on every active-conflict-stale reject. `tile_id` is the canonical UUIDv5; `age_seconds` is the integer-rounded `(now - capture_timestamp).total_seconds()` at decision time; `classification` is the `SectorClassification` enum value (always `"active_conflict"` for this kind in practice); `rule_action` is always `"reject"`; `rule_max_age_seconds` is the rule's threshold (e.g. `15552000` for the 6-month default). Envelope `producer_id="c6_tile_cache.freshness"`. |
| `c6.freshness.downgraded` | C6 (`FreshnessGate`) | `{tile_id, age_seconds, classification, rule_action, rule_max_age_seconds}` | v1.2.0 (AZ-307). Emitted on every stable-rear-stale downgrade (including the implicit-default path for tiles outside every loaded sector). Same payload shape as `c6.freshness.rejected` so reject/downgrade FDR traces are line-for-line comparable; `rule_action` is always `"downgrade"` and `classification` is always `"stable_rear"` for this kind. Envelope `producer_id="c6_tile_cache.freshness"`. |
| `c6.eviction_batch` | C6 (`CacheBudgetEnforcer`) | `{trigger_tile_id, freed_bytes, evicted_count, evicted_tile_ids}` | v1.3.0 (AZ-308). Emitted once per `reserve_headroom` call that actually evicted at least one tile (RESTRICT-SAT-2 enforcement). `trigger_tile_id` is the canonical UUIDv5 of the tile whose write triggered the sweep; `freed_bytes` is the integer total reclaimed; `evicted_count` is the FULL count of evictions in the batch regardless of payload caps; `evicted_tile_ids` is bounded to the first **5** evicted ids (the full list lives in the per-tile `c6.evicted` INFO logs). Envelope `producer_id="c6_tile_cache.budget"`. |
### Wire bytes
@@ -111,3 +112,4 @@ class FdrRecord:
| 1.0.0 | 2026-05-10 | Initial contract derived from E-CC-FDR-CLIENT epic (AZ-247) | autodev decompose Step 2 |
| 1.1.0 | 2026-05-12 | Add `c6.write` and `c6.write_failed` kinds emitted by C6 `PostgresFilesystemStore` (AZ-305). Non-breaking; v1.0 parsers see the records as unknown kinds and route them through the forward-compat opaque path. | AZ-305 implement |
| 1.2.0 | 2026-05-12 | Add `c6.freshness.rejected` and `c6.freshness.downgraded` kinds emitted by the C6 `FreshnessGate` (AZ-307). Non-breaking; v1.1 parsers see the records as unknown kinds and route them through the forward-compat opaque path. | AZ-307 implement |
| 1.3.0 | 2026-05-12 | Add `c6.eviction_batch` kind emitted by the C6 `CacheBudgetEnforcer` (AZ-308). Non-breaking; v1.2 parsers see the record as an unknown kind and route it through the forward-compat opaque path. | AZ-308 implement |
@@ -0,0 +1,224 @@
# Batch 30 / Cycle 1 — Implementation Report
**Date**: 2026-05-12
**Tasks**: AZ-308 (C6 Cache Budget Eviction — 10 GB hard cap with LRU sweep)
**Story points landed**: 3
**Status**: complete (AZ-308 → In Testing)
## Scope summary
Single-task batch landing the production `CacheBudgetEnforcer` — the
policy layer that converts AZ-303's `total_disk_bytes` / `lru_candidates`
/ `delete_tile` / `record_lru_access` primitives into RESTRICT-SAT-2's
**10 GB hard cap**. The enforcer runs **synchronously inside
`write_tile`** via the new `BudgetEnforcedTileStore` decorator: every
write first asks `reserve_headroom(len(tile_blob))`; if head-room is
sufficient the call is a single `total_disk_bytes()` SELECT and
returns immediately, otherwise the enforcer iterates
`lru_candidates(max_count=eviction_batch_size)` in 32-row batches,
deletes the oldest tiles via `delete_tile`, and stops as soon as the
freed bytes meet the shortfall. If the candidate list is exhausted
without meeting the budget, `CacheBudgetExhaustedError` is raised
**after** the full sweep (per AC-5 — partial eviction beats no
eviction so the operator's recovery has maximum head-room).
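The head-room arithmetic above can be illustrated with a small, store-free sketch. It is not the enforcer itself: `Candidate` and `plan_sweep` are hypothetical names, the candidate list is assumed to arrive oldest-first, and nothing is actually deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one LRU row (not the real DTO)."""

    tile_id: str
    disk_bytes: int


def plan_sweep(budget, current, needed, candidates, batch_size):
    """Return (evicted_ids, freed_bytes, satisfied) for an oldest-first list."""
    available = budget - current
    if available >= needed:
        return [], 0, True  # fast path: nothing to evict

    shortfall = needed - available
    evicted, freed = [], 0
    remaining = list(candidates)
    while freed < shortfall and remaining:
        # Consume the oldest-first list one batch at a time.
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for cand in batch:
            if freed >= shortfall:
                break  # stop as soon as the shortfall is covered
            evicted.append(cand.tile_id)
            freed += cand.disk_bytes
    # satisfied is False when the list ran out first (the AC-5 case).
    return evicted, freed, freed >= shortfall


tiles = [Candidate("a", 8), Candidate("b", 9), Candidate("c", 50)]
# budget 100, current 90 -> available 10; needed 25 -> shortfall 15.
print(plan_sweep(100, 90, 25, tiles, batch_size=2))
```

With these numbers the sweep stops after tiles `a` and `b` (17 bytes freed against a shortfall of 15); asking for 200 bytes instead drains all three candidates and still reports the request as unsatisfied.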
Eviction is observable end-to-end: one INFO log per evicted tile
(`kind="c6.evicted"`, payload `{tile_id, disk_bytes, accessed_at,
evicted_at}`), one FDR record per eviction batch (`kind=
"c6.eviction_batch"`, payload `{trigger_tile_id, freed_bytes,
evicted_count, evicted_tile_ids[:5]}` — capped to 5 ids to keep the
record bounded), and one construction-time INFO log
(`kind="c6.budget.loaded"`) so the operator sees `(budget_bytes,
current_disk_bytes, headroom_bytes)` at process start (with a WARN if
the prior flight ended over-budget).
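As a rough illustration of the bounded FDR payload described above, the sketch below builds a `c6.eviction_batch`-shaped dict from plain strings. `build_eviction_batch_payload` is a hypothetical helper written for this example; only the payload keys and the cap of 5 come from the report.

```python
FDR_TILE_IDS_CAP = 5  # cap stated in the report (AC-11)


def build_eviction_batch_payload(trigger_tile_id, evicted_ids, freed_bytes):
    """Build a bounded payload: full count, but at most 5 ids."""
    return {
        "trigger_tile_id": trigger_tile_id or "",
        "freed_bytes": freed_bytes,
        # evicted_count reports every eviction, regardless of the id cap.
        "evicted_count": len(evicted_ids),
        "evicted_tile_ids": list(evicted_ids[:FDR_TILE_IDS_CAP]),
    }


ids = [f"tile-{n}" for n in range(100)]
payload = build_eviction_batch_payload("tile-new", ids, freed_bytes=4096)
print(payload["evicted_count"], payload["evicted_tile_ids"])
```

Keeping `evicted_count` uncapped means a reader of the FDR record can still tell how large the sweep was, while the per-tile detail stays in the INFO logs.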
The AZ-305 LRU-clock hook is now wired: `PostgresFilesystemStore`
accepts an optional `lru_clock: Clock | None = None` ctor argument, and
when set, every `read_tile_pixels` call invokes `record_lru_access(
tile_id, now)` after the row/file existence check. The unit-test path
(AZ-305's existing fixtures) can still construct the store with
`lru_clock=None`, preserving the AZ-305 contract. Production wiring
in `storage_factory.build_tile_store` always injects `WallClock()`
into the inner store and wraps the result in `BudgetEnforcedTileStore`.
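A minimal sketch of the optional-clock guard, assuming only that the injected clock exposes `time_ns()` (the report's file notes describe the conversion as `datetime.fromtimestamp(time_ns() / 1e9, tz=UTC)`). `FixedClock` and `lru_now` are hypothetical names for illustration, not the store's real helpers.

```python
from datetime import datetime, timezone


class FixedClock:
    """Hypothetical test clock exposing time_ns(), like the injected Clock."""

    def __init__(self, ns):
        self._ns = ns

    def time_ns(self):
        return self._ns


def lru_now(clock):
    """Return the access timestamp to record, or None when no clock is wired."""
    if clock is None:
        return None  # pass-through: no LRU update is attempted
    return datetime.fromtimestamp(clock.time_ns() / 1e9, tz=timezone.utc)


print(lru_now(None))
print(lru_now(FixedClock(1_700_000_000 * 10**9)))
```

Returning `None` for the no-clock case mirrors the pass-through behaviour kept for the existing unit tests; a caller would skip the LRU update entirely in that branch.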
The decorator pattern is mandatory per the spec § Constraints — making
budget enforcement a wrapper keeps the policy layer separable from the
store impl (single-responsibility), and a future voting-tier-aware
policy can replace the enforcer without changing
`PostgresFilesystemStore`.
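The separation argument can be seen in a stripped-down sketch of the wrapper idea. Everything here is simplified for illustration: `InMemoryStore`, `RecordingEnforcer`, and `BudgetChecked` are hypothetical, and the method signatures are reduced to a blob plus a string id rather than the real `TileStore` shapes.

```python
class InMemoryStore:
    """Hypothetical inner store: knows nothing about budgets."""

    def __init__(self):
        self.blobs = {}

    def write_tile(self, tile_blob, tile_id):
        self.blobs[tile_id] = tile_blob

    def tile_exists(self, tile_id):
        return tile_id in self.blobs


class RecordingEnforcer:
    """Hypothetical policy object: records each head-room request."""

    def __init__(self):
        self.calls = []

    def reserve_headroom(self, needed_bytes):
        self.calls.append(needed_bytes)


class BudgetChecked:
    """Wrapper: policy runs before the write, everything else passes through."""

    def __init__(self, wrapped, enforcer):
        self._wrapped = wrapped
        self._enforcer = enforcer

    def write_tile(self, tile_blob, tile_id):
        # Any exception from either call propagates unchanged.
        self._enforcer.reserve_headroom(len(tile_blob))
        self._wrapped.write_tile(tile_blob, tile_id)

    def tile_exists(self, tile_id):
        return self._wrapped.tile_exists(tile_id)


store = BudgetChecked(InMemoryStore(), RecordingEnforcer())
store.write_tile(b"abcd", "t1")
print(store.tile_exists("t1"))
```

Swapping in a different policy only means passing a different enforcer object; the inner store class is untouched.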
## Files added / modified
### New (production)
- `src/gps_denied_onboard/components/c6_tile_cache/cache_budget_enforcer.py`:
`EvictionResult` frozen dataclass; `_iso_ts_now` UTC helper;
`CacheBudgetEnforcer` class with one public method
`reserve_headroom(needed_bytes) -> EvictionResult` doing the
no-evict fast-path → LRU-sweep escalation flow, emitting one INFO
log per eviction and one FDR record per batch, plus the AC-12
construction-time `c6.budget.loaded` INFO log (with optional WARN
on over-budget startup); `BudgetEnforcedTileStore` decorator
implementing the `TileStore` Protocol by delegating
`read_tile_pixels` / `tile_exists` / `delete_tile` straight through
and calling `enforcer.reserve_headroom(len(tile_blob))` before
delegating `write_tile`; and an operator CLI
(`python -m gps_denied_onboard.components.c6_tile_cache.cache_budget_enforcer dry-run --pretend-needed-bytes N`)
that loads config via `load_config(os.environ)` and prints what
WOULD be evicted without performing the eviction (no `delete_tile`
call, no FDR write, no INFO log).
### Modified (production)
- `src/gps_denied_onboard/components/c6_tile_cache/errors.py` — adds
`CacheBudgetExhaustedError` to the `TileCacheError` family with
diagnostic fields `needed_bytes`, `available_bytes`,
`evicted_count` (all keyword-only, all default to `None` so the
parameter set is forward-compatible with future tightening).
- `src/gps_denied_onboard/components/c6_tile_cache/config.py` — adds
the `eviction_batch_size: int = 32` config knob (default per spec
§ Constraints, validated `> 0` in `__post_init__`); the existing
`lru_eviction_threshold_bytes` already provides the budget.
- `src/gps_denied_onboard/components/c6_tile_cache/postgres_filesystem_store.py`
— adds optional `lru_clock: Clock | None = None` ctor arg; when
present, `read_tile_pixels` calls
`self.record_lru_access(tile_id, now_dt)` after row/file existence
checks succeed, where `now_dt = datetime.fromtimestamp(
self._lru_clock.time_ns() / 1e9, tz=UTC)`. `from_config` now
injects `WallClock()` so the production path always updates the
LRU clock; AZ-305's unit tests that construct the store directly
with no clock keep the pass-through behaviour (the LRU UPDATE is
guarded by `if self._lru_clock is not None`).
- `src/gps_denied_onboard/fdr_client/records.py` — adds
`c6.eviction_batch` (payload `{trigger_tile_id, freed_bytes,
evicted_count, evicted_tile_ids}` capped to 5 ids per AC-11) to
`KNOWN_PAYLOAD_KEYS`. The per-tile `c6.evicted` event is INFO log
only (it is high-frequency under load and would dilute the FDR
ring-buffer; aggregated batch counts go to FDR).
- `src/gps_denied_onboard/runtime_root/storage_factory.py`:
`build_tile_store` now constructs a `PostgresFilesystemStore`, a
`CacheBudgetEnforcer` wired to a producer-local `FdrClient`
(`producer_id="c6_tile_cache.budget"`) and the C6 logger, with
`budget_bytes = config.tile_cache.lru_eviction_threshold_bytes`
and `eviction_batch_size = config.tile_cache.eviction_batch_size`
— then wraps the store in a `BudgetEnforcedTileStore` and returns
the decorator. `build_tile_metadata_store` is unchanged (the
decorator only intercepts `TileStore`, never the metadata store).
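For the `errors.py` change listed above, an exception with keyword-only, `None`-defaulted diagnostic fields might look roughly like the following. This is a sketch under assumptions: the base class here is a local stand-in, and the real class body is not shown in this report.

```python
class TileCacheError(Exception):
    """Local stand-in for the real base class (assumption for this sketch)."""


class CacheBudgetExhaustedError(TileCacheError):
    """Sweep finished but could not free enough bytes."""

    def __init__(
        self,
        message,
        *,
        needed_bytes=None,
        available_bytes=None,
        evicted_count=None,
    ):
        super().__init__(message)
        # Diagnostics are optional so callers can raise with a message only.
        self.needed_bytes = needed_bytes
        self.available_bytes = available_bytes
        self.evicted_count = evicted_count


try:
    raise CacheBudgetExhaustedError(
        "cannot reserve 200 bytes",
        needed_bytes=200,
        available_bytes=77,
        evicted_count=3,
    )
except TileCacheError as exc:  # callers catching the family still see it
    print(exc, exc.needed_bytes, exc.available_bytes, exc.evicted_count)
```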
### Modified (tests)
- `tests/unit/c6_tile_cache/test_cache_budget_enforcer.py`: **NEW**
suite of 18 tests:
- 4 non-docker unit tests for `CacheBudgetEnforcer` against an
in-memory `_FakeStore` covering AC-1 (no-eviction fast path),
AC-2 (single-tile sweep), AC-3 (multi-tile until shortfall met),
AC-4 (batch-size-respecting `lru_candidates` calls).
- 3 non-docker tests for the error-handling envelope: AC-5 (sweep
exhausted → `CacheBudgetExhaustedError` AFTER all candidates
deleted), AC-7 (decorator does NOT rewrap a
`ContentHashMismatchError` from the inner store), AC-9
(SELECT-count tally for no-evict vs evict paths).
- 4 non-docker tests for FDR + log payloads: AC-11 (evicted_tile_ids
truncated to 5 even when 100 evictions occurred), AC-12
(construction-time `c6.budget.loaded` INFO log + WARN-on-over-
budget), and the NFR-reliability "candidate gone mid-sweep"
case where `delete_tile` returns False.
- 1 non-docker NFR test (`reserve_headroom × 10000` no-evict path
with a strict p99 ≤ 5 ms ceiling).
- 3 `@pytest.mark.docker` Tier-2 tests against a real Postgres
(composition-root smoke): AC-6 (decorator + `write_tile`
end-to-end with near-cap state), AC-8 (real `read_tile_pixels`
bumps the LRU clock and changes `lru_candidates` ordering), and
AC-10 (synthetic-fill test — 50 MB of writes under a deliberately
tight 50 MB pre-eviction headroom; verifies eviction kicks in
and disk usage never exceeds the cap).
- 3 protocol-shape sanity tests (`EvictionResult` is frozen and
`total_freed_bytes` derives correctly, the wrapper exposes the
underlying store as `_wrapped`, and the decorator passes
`tile_exists` / `delete_tile` straight through).
- `tests/unit/c6_tile_cache/test_protocol_conformance.py` — adjusted
`_install_fake_postgres_store_module` to provide a working
`total_disk_bytes() -> 0` (the prior `NotImplementedError` stub
would break `CacheBudgetEnforcer.__init__` which reads the value
for AC-12); and rewrote
`test_ac4_build_tile_store_returns_protocol_impl` to recognise the
AZ-308 wrapper (`isinstance(store, BudgetEnforcedTileStore)`,
`isinstance(store, TileStore)`, `isinstance(store._wrapped,
fake_cls)`). No new fakes; the change is local to one shared
helper + one test.
- `tests/unit/test_az272_fdr_record_schema.py` — adds a fixture
payload for the new `c6.eviction_batch` kind so the AZ-272 per-kind
round-trip test covers it.
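A latency-ceiling check like the NFR test listed above could be sketched as follows. The real test body is not shown here, so `percentile_nearest_rank` and `p99_ms` are hypothetical helpers; the nearest-rank definition of p99 is an assumption about how the percentile is computed.

```python
import time


def percentile_nearest_rank(sorted_samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    n = len(sorted_samples)
    rank = -(-pct * n // 100)  # ceil(pct * n / 100) in integer arithmetic
    return sorted_samples[max(rank, 1) - 1]


def p99_ms(fn, iterations=10_000):
    """Time fn() repeatedly and return the p99 latency in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    samples.sort()
    return percentile_nearest_rank(samples, 99) / 1e6


# Example: time a trivial callable; a real test would call the enforcer.
print(p99_ms(lambda: None, iterations=1_000) <= 5.0)
```

Wall-clock percentile assertions are sensitive to machine load, which is consistent with the flake the report notes for a similar p95 test under the combined run.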
### Modified (docs)
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md`
— bumped to v1.3.0; added a row for `c6.eviction_batch`
(producer `c6_tile_cache.budget`, payload shape, cap-to-5 note) in
the v1.0.0 closed-enum table and a change-log entry.
- `_docs/02_document/contracts/c6_tile_cache/tile_store.md` — bumped
to v1.1.0 (additive); `CacheBudgetExhaustedError` joins the
`TileCacheError` family diagram + change-log entry per the
Versioning Rules § "new error variant added to `TileCacheError`".
## Acceptance criteria coverage
| AC | Test | Status |
|----|------|--------|
| AC-1 No-eviction fast path | `test_ac1_no_eviction_fast_path` | passing |
| AC-2 Single-tile eviction frees enough | `test_ac2_single_tile_eviction_frees_enough` | passing |
| AC-3 Multi-tile eviction iterates LRU candidates | `test_ac3_multi_tile_eviction_iterates_until_target` | passing |
| AC-4 Eviction batches respect `eviction_batch_size` | `test_ac4_eviction_batches_respect_batch_size` | passing |
| AC-5 Insufficient candidates raise `CacheBudgetExhaustedError` | `test_ac5_insufficient_candidates_raise_after_full_sweep` | passing |
| AC-6 `BudgetEnforcedTileStore` decorator integrates with `write_tile` | `test_ac6_decorator_write_tile_triggers_eviction` (Docker) | passing |
| AC-7 Decorator propagates `TileCacheError` unchanged | `test_ac7_decorator_propagates_tilecacheerror_unchanged` | passing |
| AC-8 `read_tile_pixels` updates the LRU clock | `test_ac8_read_tile_pixels_updates_lru_clock` (Docker) | passing |
| AC-9 No-evict path = 1 SELECT; evict path = 1 + N + N | `test_ac9_no_evict_path_uses_single_select` | passing |
| AC-10 10 GB budget enforcement under synthetic load | `test_ac10_synthetic_load_stays_under_cap` (Docker) | passing |
| AC-11 FDR `evicted_tile_ids` capped to 5 | `test_ac11_fdr_evicted_tile_ids_capped_at_five` | passing |
| AC-12 Construction-time disk-bytes report | `test_ac12_construction_emits_budget_loaded_info` + `test_ac12_construction_warns_when_over_budget` | passing |
| NFR-perf no-evict p99 ≤ 5 ms | `test_nfr_perf_no_evict_path_p99_under_5ms` | passing |
| NFR-reliability candidate-gone mid-sweep | `test_nfr_reliability_delete_returns_false_no_op` | passing |
## AC Test Coverage: 12 of 12 covered (+ 2 NFRs + 1 frozen-dataclass shape test)
## Code Review Verdict: PASS
## Auto-Fix Attempts: 1 (ruff `format` + `check` — 8 cosmetic findings auto-resolved: 4 ambiguous `×` characters in comments, 3 unused `noqa: ARG002` directives, 1 unescaped-metacharacter regex in `pytest.raises(match=...)`)
## Stuck Agents: None
## Findings (self-review)
| # | Severity | Category | Location | Note | Resolution |
|---|----------|----------|----------|------|------------|
| 1 | Low | Maintainability | `CacheBudgetEnforcer.__init__` | The ctor runs `self._store.total_disk_bytes()` synchronously to emit the AC-12 startup INFO log. If the metadata store's pool is contended at process start, this blocks the composition-root path. Accepted because the enforcer is constructed once per process and the cost is one indexed SELECT. | Open (Low) — accepted as-is. |
| 2 | Low | Test-quality | `test_ac10_synthetic_load_stays_under_cap` | Uses a 50 MB synthetic budget (not the 10 GB production cap) to keep the test reasonable on a dev laptop. The cap-enforcement logic is the same shape; the test verifies the loop terminates correctly and disk usage never exceeds the cap. | Open (Low) — accepted as-is. |
| 3 | Low | Test-quality | `test_ac8_read_tile_pixels_updates_lru_clock` | Wall-clock parity between the host (Python) and Postgres container is sub-second-skew on macOS/Colima, so a real `record_lru_access` UPDATE with the host wall clock can lose to `GREATEST(accessed_at, %s)` against the DB's `DEFAULT now()`. Test pins the LRU clock to a far-future timestamp (`2099-01-01`) via a fixture-local `_FakeClock`; production wiring (`storage_factory`) still injects `WallClock()`. | Open (Low) — accepted as-is. |
| 4 | Low | Adjacent-Hygiene | `tests/unit/c6_tile_cache/test_protocol_conformance.py::_FakePostgresFilesystemStore` | The AZ-303 protocol-conformance fake inherits `total_disk_bytes` from `_FullTileMetadataStore` which raises `NotImplementedError`. Once `build_tile_store` started constructing a `CacheBudgetEnforcer` (which calls `total_disk_bytes` at construction), this stub broke the test. Overrode `total_disk_bytes` on the AZ-308 path to return 0 — minimal change, no other test using the shared helper changed semantically. | **FIXED** in this batch. |
| 5 | Low | Maintainability | `BudgetEnforcedTileStore._wrapped` | The wrapper exposes the inner store via a private `_wrapped` attribute so tests + future debugging can introspect it. This is documented in the AC-4 protocol-conformance test comment; not part of the public Protocol contract (the Protocol only requires the four `TileStore` methods, which the wrapper provides). | Open (Low) — accepted as documented. |
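Finding 3 turns on the `GREATEST(accessed_at, %s)` semantics: a proposed access time only takes effect if it is later than the stored one. A tiny sketch of that rule in plain Python, where `merged_accessed_at` is a hypothetical mirror of the SQL expression and the timestamps are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def merged_accessed_at(stored, proposed):
    """Mirror of GREATEST(stored, proposed): the value never moves backwards."""
    return max(stored, proposed)


db_now = datetime(2026, 5, 12, 17, 0, 0, tzinfo=timezone.utc)
host_now = db_now - timedelta(milliseconds=300)  # host clock slightly behind
far_future = datetime(2099, 1, 1, tzinfo=timezone.utc)

# A slightly-behind host clock loses: the stored value is unchanged.
print(merged_accessed_at(db_now, host_now) == db_now)
# A far-future test clock always wins, so the LRU ordering visibly changes.
print(merged_accessed_at(db_now, far_future) == far_future)
```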
## Tracker
- AZ-308 transitioned to **In Progress** on session start; will be moved to **In Testing** post-commit per `protocols.md`.
## Test suite
- `tests/unit/c6_tile_cache/test_cache_budget_enforcer.py` (18 tests) —
passing standalone (Tier-2 + Docker Postgres) and as part of the
combined c6 suite (193 / 194 passed in the combined run; see below).
- `tests/unit/c6_tile_cache/` (194 tests) — 193 passing; the same
`test_ac13_read_tile_pixels_warm_latency_p95` flake noted in the
AZ-307 batch 29 report (Finding 3 of the AZ-305 batch 28 report)
surfaces under combined load. Verified non-regression by `git stash
-u` round-trip: with my AZ-308 changes stashed, the same test still
fails (`p95 ≈ 8 ms` vs the 5 ms ceiling) in the combined run, and
passes 3-of-3 standalone. Not a blocker for AZ-308.
- `tests/unit/test_az272_fdr_record_schema.py` — passing with the new
`c6.eviction_batch` kind fixtured.
- Full unit suite (excluding `tests/integration/` and the unrelated
c7 `test_ac8_read_host_tuple_on_jetson` that requires `pynvml`,
pre-existing) — 1267 passed, 8 environment-skipped (CUDA-only, cmake,
actionlint), 1 deselected (pynvml).
## Next batch
Cycle 1 advances per the greenfield queue — autodev re-detects the
next AZ ticket in the Step 7 batch loop and continues.
@@ -0,0 +1,448 @@
"""C6 cache-budget enforcer (AZ-308).
RESTRICT-SAT-2 enforcement: the on-disk tile cache MUST NOT exceed a
configurable hard cap (default 10 GiB). Every :meth:`TileStore.write_tile`
goes through the :class:`BudgetEnforcedTileStore` decorator which calls
:meth:`CacheBudgetEnforcer.reserve_headroom` BEFORE the underlying store
writes a byte to disk. If the cap would be breached, the enforcer runs
an LRU sweep using the store's :meth:`TileMetadataStore.lru_candidates`
+ :meth:`TileStore.delete_tile` primitives (AZ-305) until enough
head-room is freed; if even total eviction cannot fit ``needed_bytes``
it raises :class:`CacheBudgetExhaustedError` AFTER the loop completes
(partial eviction is preferable to no eviction — AC-5).
The enforcer is the SOLE eviction path during a flight: no other
component evicts tiles (Reliability constraint of AZ-308). Per-eviction
INFO logs (``kind="c6.evicted"``) carry the tile-level detail; the
per-batch FDR record (``kind="c6.eviction_batch"``) is bounded to the
first 5 evicted ids (AC-11) so the F4 producer never blows the FDR
ring with a runaway sweep.
The decorator pattern is mandatory — moving the budget check inside
``PostgresFilesystemStore.write_tile`` would couple policy to the
filesystem impl and break the single-responsibility design that lets
the store remain unit-testable in isolation.
"""
from __future__ import annotations

import argparse
import logging
import os
import sys
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Final

from gps_denied_onboard.components.c6_tile_cache._types import (
    TileId,
    TileMetadata,
)
from gps_denied_onboard.components.c6_tile_cache.errors import (
    CacheBudgetExhaustedError,
    TileCacheError,
    TileFsError,
    TileMetadataError,
)
from gps_denied_onboard.components.c6_tile_cache.interface import (
    TileMetadataStore,
    TileStore,
)
from gps_denied_onboard.fdr_client.records import CURRENT_SCHEMA_VERSION, FdrRecord

if TYPE_CHECKING:
    from gps_denied_onboard.components.c6_tile_cache._tile_pixel_handle import (
        TilePixelHandle,
    )
    from gps_denied_onboard.fdr_client.client import FdrClient

__all__ = [
    "BudgetEnforcedTileStore",
    "CacheBudgetEnforcer",
    "EvictionResult",
]

_PRODUCER_ID: Final[str] = "c6_tile_cache.budget"
_FDR_TILE_IDS_CAP: Final[int] = 5
@dataclass(frozen=True)
class EvictionResult:
    """Outcome of one :meth:`CacheBudgetEnforcer.reserve_headroom` call.

    ``evicted`` is the ordered list of :class:`TileMetadata` rows that
    were removed; empty if the no-eviction fast path fired. ``freed_bytes``
    is the integer total reclaimed (each candidate's ``disk_bytes``).
    """

    evicted: list[TileMetadata]
    freed_bytes: int


def _iso_ts_now() -> str:
    """RFC 3339 UTC timestamp with microsecond precision and ``Z`` suffix.

    Used only on the FDR record envelope ``ts`` field — distinct from the
    per-row ``accessed_at`` / ``evicted_at`` datetimes which use the same
    wall-clock source but carry the operator-facing semantics.
    """
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
class CacheBudgetEnforcer:
    """LRU-driven 10 GiB hard-cap enforcer for the C6 tile cache.

    Construction reads :meth:`TileMetadataStore.total_disk_bytes` once
    to emit a startup INFO log (AC-12) and warns if the prior flight
    ended over-budget. Construction does NOT proactively evict — the
    first :meth:`reserve_headroom` call drives any required sweep.

    The enforcer holds no per-flight state of its own; the store is the
    source of truth. This keeps the enforcer trivially restartable and
    means a mid-flight process restart resumes with whatever disk
    state the prior process left behind (subject to AZ-305's startup
    orphan-reconciliation scan).
    """

    def __init__(
        self,
        *,
        store: TileMetadataStore,
        fdr_client: FdrClient,
        logger: logging.Logger,
        budget_bytes: int,
        eviction_batch_size: int = 32,
    ) -> None:
        if budget_bytes <= 0:
            raise TileMetadataError(
                f"CacheBudgetEnforcer: budget_bytes must be > 0; got {budget_bytes}"
            )
        if eviction_batch_size <= 0:
            raise TileMetadataError(
                f"CacheBudgetEnforcer: eviction_batch_size must be > 0; got {eviction_batch_size}"
            )
        self._store = store
        # The runtime ``PostgresFilesystemStore`` instance satisfies both
        # the metadata-store and tile-store Protocols; the type hint stays
        # ``TileMetadataStore`` for clarity but we duck-type ``delete_tile``
        # off the same reference (AZ-308 spec § Outcome).
        self._tile_store: TileStore = store  # type: ignore[assignment]
        self._fdr_client = fdr_client
        self._logger = logger
        self._budget_bytes = budget_bytes
        self._eviction_batch_size = eviction_batch_size

        current = self._store.total_disk_bytes()
        headroom = max(self._budget_bytes - current, 0)
        self._logger.info(
            "c6.budget.loaded",
            extra={
                "kind": "c6.budget.loaded",
                "kv": {
                    "budget_bytes": self._budget_bytes,
                    "current_disk_bytes": current,
                    "headroom_bytes": headroom,
                    "eviction_batch_size": self._eviction_batch_size,
                },
            },
        )
        if current > self._budget_bytes:
            # AC-12: prior flight left the cache over-budget; surface
            # the overage so operators can choose whether to inspect
            # before the first F4 burst triggers cascade eviction.
            self._logger.warning(
                "c6.budget.over_budget_at_construction",
                extra={
                    "kind": "c6.budget.over_budget_at_construction",
                    "kv": {
                        "budget_bytes": self._budget_bytes,
                        "current_disk_bytes": current,
                        "overage_bytes": current - self._budget_bytes,
                    },
                },
            )
    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------

    @property
    def budget_bytes(self) -> int:
        return self._budget_bytes

    @property
    def eviction_batch_size(self) -> int:
        return self._eviction_batch_size

    def reserve_headroom(
        self,
        needed_bytes: int,
        *,
        trigger_tile_id: TileId | None = None,
    ) -> EvictionResult:
        """Ensure at least ``needed_bytes`` of head-room.

        Reads :meth:`total_disk_bytes` once, computes the available
        head-room, and either:

        - returns an empty :class:`EvictionResult` if there is already
          room (AC-1, the no-evict fast path), or
        - sweeps LRU candidates in batches of ``eviction_batch_size``
          until ``freed_bytes >= shortfall`` (AC-2, AC-3, AC-4), or
        - raises :class:`CacheBudgetExhaustedError` AFTER the sweep
          exhausts all candidates without reaching the target (AC-5).
        """
        if needed_bytes < 0:
            raise TileMetadataError(
                f"reserve_headroom: needed_bytes must be >= 0; got {needed_bytes}"
            )
        current = self._store.total_disk_bytes()
        available = self._budget_bytes - current
        if available >= needed_bytes:
            return EvictionResult(evicted=[], freed_bytes=0)

        shortfall = needed_bytes - available
        evicted_metadata: list[TileMetadata] = []
        freed_bytes = 0
        while freed_bytes < shortfall:
            candidates = self._store.lru_candidates(max_count=self._eviction_batch_size)
            if not candidates:
                break
            for persistent in candidates:
                if freed_bytes >= shortfall:
                    break
                md = persistent.metadata
                evicted_at = datetime.now(timezone.utc)
                try:
                    removed = self._tile_store.delete_tile(md.tile_id)
                except TileFsError as exc:
                    # Row delete succeeded (AZ-305 contract) but the
                    # filesystem unlink failed; the budget already
                    # reflects the row's absence so we still count
                    # disk_bytes per the spec § Exclusions.
                    removed = True
                    self._logger.warning(
                        "c6.evict.fs_error",
                        extra={
                            "kind": "c6.evict.fs_error",
                            "kv": {
                                "tile_id_str": str(md.tile_id),
                                "disk_bytes": persistent.disk_bytes,
                                "error": str(exc),
                            },
                        },
                    )
                if not removed:
                    # NFR-reliability-delete-already-gone: a concurrent
                    # path already evicted this row. Count the bytes
                    # anyway (per spec § Exclusions) so the loop makes
                    # progress; the next ``lru_candidates`` call won't
                    # return this id.
                    self._logger.info(
                        "c6.evict.already_gone",
                        extra={
                            "kind": "c6.evict.already_gone",
                            "kv": {
                                "tile_id_str": str(md.tile_id),
                                "disk_bytes": persistent.disk_bytes,
                            },
                        },
                    )
                evicted_metadata.append(md)
                freed_bytes += persistent.disk_bytes
                self._logger.info(
                    "c6.evicted",
                    extra={
                        "kind": "c6.evicted",
                        "kv": {
                            "tile_id_str": str(md.tile_id),
                            "disk_bytes": persistent.disk_bytes,
                            "accessed_at": persistent.accessed_at.isoformat(),
                            "evicted_at": evicted_at.isoformat(),
                        },
                    },
                )

        if evicted_metadata:
            self._emit_eviction_batch(
                trigger_tile_id=trigger_tile_id,
                evicted=evicted_metadata,
                freed_bytes=freed_bytes,
            )
        if freed_bytes < shortfall:
            available_post = available + freed_bytes
            raise CacheBudgetExhaustedError(
                f"CacheBudgetEnforcer: cannot reserve {needed_bytes} bytes — "
                f"available_bytes={available_post} after evicting "
                f"{len(evicted_metadata)} tiles (freed {freed_bytes} bytes); "
                f"budget_bytes={self._budget_bytes}",
                needed_bytes=needed_bytes,
                available_bytes=available_post,
                evicted_count=len(evicted_metadata),
            )
        return EvictionResult(evicted=evicted_metadata, freed_bytes=freed_bytes)
    # ------------------------------------------------------------------
    # Internal
    # ------------------------------------------------------------------

    def _emit_eviction_batch(
        self,
        *,
        trigger_tile_id: TileId | None,
        evicted: list[TileMetadata],
        freed_bytes: int,
    ) -> None:
        capped_ids = [str(md.tile_id) for md in evicted[:_FDR_TILE_IDS_CAP]]
        self._fdr_client.enqueue(
            FdrRecord(
                schema_version=CURRENT_SCHEMA_VERSION,
                ts=_iso_ts_now(),
                producer_id=_PRODUCER_ID,
                kind="c6.eviction_batch",
                payload={
                    "trigger_tile_id": str(trigger_tile_id) if trigger_tile_id else "",
                    "freed_bytes": freed_bytes,
                    "evicted_count": len(evicted),
                    "evicted_tile_ids": capped_ids,
                },
            )
        )
class BudgetEnforcedTileStore:
"""Decorator wrapping a :class:`TileStore` with pre-write budget check.
Implements the :class:`TileStore` Protocol — :meth:`write_tile` adds
the :meth:`CacheBudgetEnforcer.reserve_headroom` step before
delegating; the other three methods pass through unchanged so the
decorator is transparent to read-side consumers.
Crucially, the decorator does NOT swallow or rewrap exceptions from
the wrapped store (AC-7). Every :class:`TileCacheError` subclass
raised by the underlying ``write_tile`` (content-hash mismatch,
freshness rejection, fs/metadata errors) propagates unchanged so
the F4 producer's existing error-handling stays correct.
"""
def __init__(
self,
*,
wrapped: TileStore,
enforcer: CacheBudgetEnforcer,
) -> None:
self._wrapped = wrapped
self._enforcer = enforcer
def read_tile_pixels(self, tile_id: TileId) -> TilePixelHandle:
return self._wrapped.read_tile_pixels(tile_id)
def write_tile(self, tile_blob: bytes, metadata: TileMetadata) -> None:
# Pre-write budget check (AC-6). On eviction, the freed disk
# space is committed BEFORE the wrapped store opens its write
# transaction — there is no window where the budget is
# transiently breached.
self._enforcer.reserve_headroom(len(tile_blob), trigger_tile_id=metadata.tile_id)
self._wrapped.write_tile(tile_blob, metadata)
def tile_exists(self, tile_id: TileId) -> bool:
return self._wrapped.tile_exists(tile_id)
def delete_tile(self, tile_id: TileId) -> bool:
return self._wrapped.delete_tile(tile_id)
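The decorator contract described in the docstring (reserve first, delegate second, never rewrap errors) can be sketched with stand-ins. `InMemoryStore` and `BudgetedStore` are hypothetical simplifications; the real classes take `TileMetadata` and a `CacheBudgetEnforcer`.

```python
from __future__ import annotations

from typing import Callable


class InMemoryStore:
    """Hypothetical stand-in for the wrapped TileStore."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def write_tile(self, tile_id: str, blob: bytes) -> None:
        if not blob:
            raise ValueError("empty blob")  # stands in for a TileCacheError subclass
        self.blobs[tile_id] = blob


class BudgetedStore:
    """Calls `reserve(len(blob))` before delegating; adds no try/except."""

    def __init__(self, wrapped: InMemoryStore, reserve: Callable[[int], None]) -> None:
        self._wrapped = wrapped
        self._reserve = reserve

    def write_tile(self, tile_id: str, blob: bytes) -> None:
        self._reserve(len(blob))  # pre-write budget check
        self._wrapped.write_tile(tile_id, blob)  # errors propagate unchanged


reserved: list[int] = []
store = BudgetedStore(InMemoryStore(), reserved.append)
store.write_tile("a", b"1234")
try:
    store.write_tile("b", b"")
    propagated = False
except ValueError:
    propagated = True
```

Because the wrapper contains no exception handling, the caller sees exactly the error type the wrapped store raised.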
# ----------------------------------------------------------------------
# Operator CLI — `python -m c6_tile_cache.cache_budget_enforcer dry-run ...`
# ----------------------------------------------------------------------
def _build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(
prog="c6_tile_cache.cache_budget_enforcer",
description=(
"Operator-side dry-run of the cache-budget enforcer. Reports "
"what the LRU sweep WOULD evict to make room for a hypothetical "
"write of N bytes, without actually deleting anything."
),
)
sub = parser.add_subparsers(dest="cmd", required=True)
dry = sub.add_parser(
"dry-run",
help="Show what would be evicted for a given needed-bytes target.",
)
dry.add_argument(
"--pretend-needed-bytes",
type=int,
required=True,
help="Hypothetical incoming write size (bytes).",
)
return parser
def _dry_run(args: argparse.Namespace) -> int:
    # Imported lazily so `--help` works without the storage dependencies.
    from gps_denied_onboard.components.c6_tile_cache.postgres_filesystem_store import (
        PostgresFilesystemStore,
    )
    from gps_denied_onboard.config import load_config
    config = load_config(os.environ)
    block = config.components["c6_tile_cache"]
    store = PostgresFilesystemStore.from_config(config)
    current = store.total_disk_bytes()
    budget = block.lru_eviction_threshold_bytes
    available = budget - current
    needed = args.pretend_needed_bytes
    print(f"budget_bytes: {budget}")
    print(f"current_disk_bytes: {current}")
    print(f"available_bytes: {available}")
    print(f"needed_bytes: {needed}")
    if available >= needed:
        print("decision: NO_EVICTION (already enough head-room)")
        return 0
    shortfall = needed - available
    print(f"shortfall_bytes: {shortfall}")
    # Inspect the first LRU batch only. Nothing is deleted here, so a
    # second lru_candidates call would return the same rows; the real
    # sweep deletes as it goes and keeps fetching further batches.
    print(f"eviction_batch_size: {block.eviction_batch_size}")
    print(f"would_evict (up to {block.eviction_batch_size}):")
    freed = 0
    count = 0
    for persistent in store.lru_candidates(max_count=block.eviction_batch_size):
        if freed >= shortfall:
            break
        print(
            f" - tile_id={persistent.metadata.tile_id} "
            f"accessed_at={persistent.accessed_at.isoformat()} "
            f"disk_bytes={persistent.disk_bytes}"
        )
        freed += persistent.disk_bytes
        count += 1
    print(f"would_free_bytes: {freed}")
    print(f"would_evict_count: {count}")
    if freed >= shortfall:
        print("decision: EVICT (sweep stops as soon as shortfall is covered)")
    elif count < block.eviction_batch_size:
        # Fewer rows than a full batch came back, so no candidates remain.
        print("decision: WOULD_RAISE_CacheBudgetExhaustedError (not enough candidates)")
    else:
        print("decision: UNDETERMINED (first batch insufficient; the real sweep would fetch further batches)")
    return 0
def main(argv: list[str] | None = None) -> int:
parser = _build_parser()
args = parser.parse_args(argv)
if args.cmd == "dry-run":
try:
return _dry_run(args)
except TileCacheError as exc:
print(f"error: {exc}", file=sys.stderr)
return 1
parser.error(f"unknown subcommand {args.cmd!r}")
return 2 # unreachable; argparse exits non-zero on error
if __name__ == "__main__":
sys.exit(main())
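The dry-run arithmetic can be separated from the store and the printing, which makes the decision logic easy to check. A minimal sketch under assumptions: `plan_dry_run` and the `FIRST_BATCH_INSUFFICIENT` label are hypothetical, and candidates are reduced to their `disk_bytes` values in LRU order.

```python
from __future__ import annotations


def plan_dry_run(
    budget: int, current: int, needed: int, first_batch: list[int]
) -> tuple[str, int, int]:
    """Return (decision, would_free_bytes, would_evict_count)."""
    available = budget - current
    if available >= needed:
        return ("NO_EVICTION", 0, 0)
    shortfall = needed - available
    freed = 0
    count = 0
    for disk_bytes in first_batch:  # least recently used first
        if freed >= shortfall:
            break
        freed += disk_bytes
        count += 1
    decision = "EVICT" if freed >= shortfall else "FIRST_BATCH_INSUFFICIENT"
    return (decision, freed, count)


fits = plan_dry_run(100, 10, 50, [])
evicts = plan_dry_run(100, 90, 30, [15, 15, 15])
short = plan_dry_run(100, 100, 50, [10, 10])
```

In the second call the head-room is 10 bytes and the shortfall 20, so two 15-byte candidates are selected and the third is left alone.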
@@ -59,6 +59,7 @@ class C6TileCacheConfig:
postgres_dsn: str = ""
postgres_pool_size: int = 4
lru_eviction_threshold_bytes: int = 10 * 1024**3
eviction_batch_size: int = 32
def __post_init__(self) -> None:
if self.store_runtime not in KNOWN_TILE_STORE_RUNTIMES:
@@ -88,3 +89,7 @@ class C6TileCacheConfig:
f"C6TileCacheConfig.lru_eviction_threshold_bytes must be > 0; "
f"got {self.lru_eviction_threshold_bytes}"
)
if self.eviction_batch_size <= 0:
raise ConfigError(
f"C6TileCacheConfig.eviction_batch_size must be > 0; got {self.eviction_batch_size}"
)
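The two budget fields and their validation can be exercised in a reduced form. This sketch assumes a frozen dataclass named `BudgetConfig` and raises `ValueError`; the real class is `C6TileCacheConfig` and raises `ConfigError`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetConfig:
    """Hypothetical two-field subset of the C6 config block."""

    lru_eviction_threshold_bytes: int = 10 * 1024**3  # 10 GiB default
    eviction_batch_size: int = 32

    def __post_init__(self) -> None:
        if self.lru_eviction_threshold_bytes <= 0:
            raise ValueError(
                f"lru_eviction_threshold_bytes must be > 0; "
                f"got {self.lru_eviction_threshold_bytes}"
            )
        if self.eviction_batch_size <= 0:
            raise ValueError(
                f"eviction_batch_size must be > 0; got {self.eviction_batch_size}"
            )


defaults = BudgetConfig()
try:
    BudgetConfig(eviction_batch_size=0)
    rejected = False
except ValueError:
    rejected = True
```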
@@ -22,6 +22,7 @@ if TYPE_CHECKING:
)
__all__ = [
"CacheBudgetExhaustedError",
"ContentHashMismatchError",
"FreshnessRejectionError",
"IndexBuildError",
@@ -107,6 +108,35 @@ class FreshnessRejectionError(TileCacheError):
self.rule = rule
class CacheBudgetExhaustedError(TileCacheError):
"""The cache-budget enforcer cannot make room for a new write.
Raised by :meth:`CacheBudgetEnforcer.reserve_headroom` AFTER the
LRU sweep has run to completion and still could not free
``needed_bytes``. Partial eviction is preferable to no eviction
even when the budget cannot be met, so the loop runs first and the
error fires last (AZ-308 AC-5).
Diagnostic fields are populated for the F4 producer / operator
recovery path: ``needed_bytes`` is what the failed write asked
for; ``available_bytes`` is what was left after the sweep;
``evicted_count`` is how many tiles the sweep removed.
"""
def __init__(
self,
message: str,
*,
needed_bytes: int | None = None,
available_bytes: int | None = None,
evicted_count: int | None = None,
) -> None:
super().__init__(message)
self.needed_bytes = needed_bytes
self.available_bytes = available_bytes
self.evicted_count = evicted_count
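A caller on the recovery path might turn the diagnostic fields into a single log line. The sketch below uses a local stand-in exception with the same keyword-only fields; `BudgetExhausted` and `describe` are hypothetical names, not part of the package.

```python
from __future__ import annotations


class BudgetExhausted(Exception):
    """Local stand-in carrying the same optional diagnostic fields."""

    def __init__(
        self,
        message: str,
        *,
        needed_bytes: int | None = None,
        available_bytes: int | None = None,
        evicted_count: int | None = None,
    ) -> None:
        super().__init__(message)
        self.needed_bytes = needed_bytes
        self.available_bytes = available_bytes
        self.evicted_count = evicted_count


def describe(exc: BudgetExhausted) -> str:
    """Render the message plus every diagnostic field that was populated."""
    parts = [str(exc)]
    for name in ("needed_bytes", "available_bytes", "evicted_count"):
        value = getattr(exc, name)
        if value is not None:  # fields default to None and may be absent
            parts.append(f"{name}={value}")
    return " ".join(parts)


summary = describe(
    BudgetExhausted("cache budget exhausted", needed_bytes=1024, evicted_count=3)
)
```

Checking for `None` rather than truthiness keeps a legitimate `evicted_count=0` in the output.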
class IndexUnavailableError(TileCacheError):
"""The descriptor index could not satisfy a read.
@@ -83,6 +83,7 @@ from gps_denied_onboard.helpers.sha256_sidecar import (
from gps_denied_onboard.helpers.wgs_converter import WgsConverter
if TYPE_CHECKING:
from gps_denied_onboard.clock.interface import Clock
from gps_denied_onboard.config.schema import Config
__all__ = ["MmapTilePixelHandle", "PostgresFilesystemStore"]
@@ -182,6 +183,7 @@ class PostgresFilesystemStore:
fdr_client: FdrClient,
logger: logging.Logger,
freshness_gate: FreshnessGate | None = None,
lru_clock: Clock | None = None,
) -> None:
self._root_dir = Path(root_dir)
self._tiles_dir = self._root_dir / "tiles"
@@ -194,6 +196,11 @@ class PostgresFilesystemStore:
# ``None`` keeps the AZ-305-only test path working (no gate wiring
# required for unit tests of the store in isolation).
self._freshness_gate = freshness_gate
# AZ-308 AC-8: when injected, every ``read_tile_pixels`` call
# records an LRU access stamp so :meth:`lru_candidates` picks
# accurate eviction targets. ``None`` keeps AZ-305 unit-test
# wiring (no LRU UPDATE on read) intact.
self._lru_clock = lru_clock
try:
self._tiles_dir.mkdir(parents=True, exist_ok=True)
except OSError as exc:
@@ -268,6 +275,7 @@ class PostgresFilesystemStore:
fdr_client=fdr_client,
logger=logger,
freshness_gate=freshness_gate,
lru_clock=WallClock(),
)
# ------------------------------------------------------------------
@@ -322,6 +330,15 @@ class PostgresFilesystemStore:
},
)
raise TileMetadataError(msg)
# AZ-308 AC-8: refresh the tile's LRU access stamp on every read so
# eviction picks the actually-coldest tiles. The UPDATE runs only when
# the Clock was injected at construction (production wiring); AZ-305
# unit tests pass ``lru_clock=None`` and pay zero cost.
if self._lru_clock is not None:
now_dt = datetime.fromtimestamp(
self._lru_clock.time_ns() / 1_000_000_000, tz=timezone.utc
)
self.record_lru_access(tile_id, now_dt)
return MmapTilePixelHandle(path)
def write_tile(self, tile_blob: bytes, metadata: TileMetadata) -> None:
@@ -145,6 +145,19 @@ KNOWN_PAYLOAD_KEYS: Final[dict[str, frozenset[str]]] = {
"c6.freshness.downgraded": frozenset(
{"tile_id", "age_seconds", "classification", "rule_action", "rule_max_age_seconds"}
),
# AZ-308 / E-C6: emitted by the CacheBudgetEnforcer at the end of every
# LRU sweep that evicted at least one tile (RESTRICT-SAT-2). One record
# per ``reserve_headroom`` call that hit the eviction path, never per
# evicted tile (per-tile detail goes to logs). ``trigger_tile_id`` is
# the canonical UUIDv5 of the tile whose write triggered the sweep;
# ``freed_bytes`` is the integer total reclaimed; ``evicted_count`` is
# the full count regardless of how many ids fit in
# ``evicted_tile_ids`` (capped to 5 to keep the record bounded —
# AC-11). The full eviction list is replayable from the per-tile
# ``c6.evicted`` INFO log records.
"c6.eviction_batch": frozenset(
{"trigger_tile_id", "freed_bytes", "evicted_count", "evicted_tile_ids"}
),
}
KNOWN_KINDS: Final[frozenset[str]] = frozenset(KNOWN_PAYLOAD_KEYS.keys())
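A producer-side check against the key set registered above might look like the sketch below. It assumes the registered set is the exact set of payload keys; whether the real schema treats keys as required or merely allowed is not shown in this diff, and `payload_key_diff` is a hypothetical helper.

```python
from __future__ import annotations

KNOWN_KEYS: dict[str, frozenset[str]] = {
    "c6.eviction_batch": frozenset(
        {"trigger_tile_id", "freed_bytes", "evicted_count", "evicted_tile_ids"}
    ),
}


def payload_key_diff(kind: str, payload: dict[str, object]) -> tuple[set[str], set[str]]:
    """Return (unexpected_keys, missing_keys) for a payload of the given kind."""
    expected = KNOWN_KEYS[kind]
    present = set(payload)
    return (present - expected, set(expected) - present)


ok = payload_key_diff(
    "c6.eviction_batch",
    {"trigger_tile_id": "", "freed_bytes": 0, "evicted_count": 0, "evicted_tile_ids": []},
)
bad = payload_key_diff(
    "c6.eviction_batch",
    {"trigger_tile_id": "", "evicted_count": 0, "evicted_tile_ids": [], "foo": 1},
)
```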
@@ -71,14 +71,26 @@ def build_tile_store(config: Config) -> TileStore:
invoked via ``PostgresFilesystemStore.from_config(config)`` which
wires the ``ConnectionPool`` / ``FdrClient`` / logger / static
helper dependencies from the config block.
AZ-308: the returned :class:`TileStore` is wrapped in a
:class:`BudgetEnforcedTileStore` so every ``write_tile`` first
reserves head-room against the configured
``lru_eviction_threshold_bytes`` budget (RESTRICT-SAT-2). The
wrapper is transparent for read-side consumers.
"""
block = _c6_config(config)
runtime = block.store_runtime
if runtime == "postgres_filesystem":
try:
from gps_denied_onboard.components.c6_tile_cache.cache_budget_enforcer import (
BudgetEnforcedTileStore,
CacheBudgetEnforcer,
)
from gps_denied_onboard.components.c6_tile_cache.postgres_filesystem_store import (
PostgresFilesystemStore,
)
from gps_denied_onboard.fdr_client.client import make_fdr_client
from gps_denied_onboard.logging import get_logger
except ModuleNotFoundError as exc:
raise RuntimeNotAvailableError(
f"TileStore runtime {runtime!r} is configured but its "
@@ -86,7 +98,15 @@ def build_tile_store(config: Config) -> TileStore:
"'c6_tile_cache.postgres_filesystem_store' has not been "
"built into this binary yet (AZ-305 pending)."
) from exc
return PostgresFilesystemStore.from_config(config)
store = PostgresFilesystemStore.from_config(config)
enforcer = CacheBudgetEnforcer(
store=store,
fdr_client=make_fdr_client("c6_tile_cache.budget", config),
logger=get_logger("c6_tile_cache.budget"),
budget_bytes=block.lru_eviction_threshold_bytes,
eviction_batch_size=block.eviction_batch_size,
)
return BudgetEnforcedTileStore(wrapped=store, enforcer=enforcer)
raise RuntimeNotAvailableError(
f"TileStore runtime {runtime!r} is not buildable in this binary."
)
@@ -0,0 +1,774 @@
"""AZ-308 — ``CacheBudgetEnforcer`` + ``BudgetEnforcedTileStore`` tests.
Most ACs are exercised against an in-memory fake ``TileMetadataStore``
that satisfies the AZ-303 Protocol so the unit tests run on Tier-1.
The few ACs that genuinely need a real Postgres + filesystem layout
(``AC-6`` decorator+write_tile end-to-end, ``AC-8`` LRU clock wired
into ``read_tile_pixels``, ``AC-10`` synthetic 10 GiB fill) carry
``@pytest.mark.docker`` and are auto-skipped on Tier-1.
To run the docker tests locally::
docker compose -f docker-compose.test.yml up -d db
GPS_DENIED_TIER=2 DB_URL=postgresql://gps_denied:dev@localhost:55432/gps_denied \\
pytest tests/unit/c6_tile_cache/test_cache_budget_enforcer.py
"""
from __future__ import annotations
import hashlib
import logging
import os
import time
from collections.abc import Iterator
from datetime import datetime, timedelta, timezone
from pathlib import Path
import psycopg
import pytest
from psycopg_pool import ConnectionPool
from gps_denied_onboard.components.c6_tile_cache._types import (
FreshnessLabel,
TileId,
TileMetadata,
TileMetadataPersistent,
TileSource,
VotingStatus,
)
from gps_denied_onboard.components.c6_tile_cache.cache_budget_enforcer import (
BudgetEnforcedTileStore,
CacheBudgetEnforcer,
EvictionResult,
)
from gps_denied_onboard.components.c6_tile_cache.config import C6TileCacheConfig
from gps_denied_onboard.components.c6_tile_cache.errors import (
CacheBudgetExhaustedError,
ContentHashMismatchError,
TileFsError,
TileMetadataError,
)
from gps_denied_onboard.components.c6_tile_cache.migrations import apply_migrations
from gps_denied_onboard.components.c6_tile_cache.postgres_filesystem_store import (
PostgresFilesystemStore,
)
from gps_denied_onboard.config.schema import Config
from gps_denied_onboard.fdr_client.fakes import FakeFdrSink
from gps_denied_onboard.helpers.sha256_sidecar import Sha256Sidecar
from gps_denied_onboard.helpers.wgs_converter import WgsConverter
from gps_denied_onboard.logging import get_logger
_docker = pytest.mark.docker
_NS_PER_S = 1_000_000_000
# ----------------------------------------------------------------------
# Test doubles
# ----------------------------------------------------------------------
class _FakeClock:
def __init__(self, now_dt: datetime) -> None:
self._now_ns = int(now_dt.timestamp() * _NS_PER_S)
def monotonic_ns(self) -> int:
return self._now_ns
def time_ns(self) -> int:
return self._now_ns
def sleep_until_ns(self, target_ns: int) -> None:
if target_ns > self._now_ns:
self._now_ns = target_ns
class _FakeStore:
"""In-memory ``TileMetadataStore`` + ``TileStore`` for unit tests.
Tracks LRU order via insertion order; ``delete_tile`` is idempotent;
counters on every Protocol method let tests assert AC-4 / AC-9 query
discipline without round-tripping a real DB.
"""
def __init__(
self,
*,
rows: list[TileMetadataPersistent] | None = None,
delete_returns: dict[TileId, bool] | None = None,
delete_raises: dict[TileId, BaseException] | None = None,
) -> None:
self._rows: list[TileMetadataPersistent] = list(rows or [])
self._delete_returns = delete_returns or {}
self._delete_raises = delete_raises or {}
self.lru_calls: list[int] = []
self.delete_calls: list[TileId] = []
self.total_disk_bytes_calls = 0
def total_disk_bytes(self) -> int:
self.total_disk_bytes_calls += 1
return sum(p.disk_bytes for p in self._rows)
def lru_candidates(self, *, max_count: int) -> list[TileMetadataPersistent]:
self.lru_calls.append(max_count)
return list(self._rows[:max_count])
def delete_tile(self, tile_id: TileId) -> bool:
self.delete_calls.append(tile_id)
if tile_id in self._delete_raises:
raise self._delete_raises[tile_id]
self._rows = [p for p in self._rows if p.metadata.tile_id != tile_id]
return self._delete_returns.get(tile_id, True)
def _persistent_row(
*,
tile_id_seed: tuple[int, float, float],
disk_bytes: int,
accessed_at: datetime,
) -> TileMetadataPersistent:
zoom, lat, lon = tile_id_seed
tile_id = TileId(zoom_level=zoom, lat=lat, lon=lon)
blob_hash = hashlib.sha256(f"{tile_id_seed}-{disk_bytes}".encode()).hexdigest()
md = TileMetadata(
tile_id=tile_id,
tile_size_meters=256.0,
tile_size_pixels=256,
capture_timestamp=datetime(2026, 5, 12, tzinfo=timezone.utc),
source=TileSource.GOOGLEMAPS,
content_sha256_hex=blob_hash,
freshness_label=FreshnessLabel.FRESH,
flight_id=None,
companion_id=None,
quality_metadata=None,
voting_status=VotingStatus.TRUSTED,
)
return TileMetadataPersistent(
metadata=md,
accessed_at=accessed_at,
uploaded_at=None,
disk_bytes=disk_bytes,
)
def _build_enforcer(
store: _FakeStore,
sink: FakeFdrSink,
*,
budget_bytes: int,
eviction_batch_size: int = 32,
) -> CacheBudgetEnforcer:
return CacheBudgetEnforcer(
store=store, # type: ignore[arg-type]
fdr_client=sink, # type: ignore[arg-type]
logger=get_logger("c6_tile_cache.budget.test"),
budget_bytes=budget_bytes,
eviction_batch_size=eviction_batch_size,
)
@pytest.fixture
def fake_sink() -> FakeFdrSink:
return FakeFdrSink(producer_id="c6_tile_cache.budget", capacity=256)
@pytest.fixture
def now_dt() -> datetime:
return datetime(2026, 5, 12, 12, 0, 0, tzinfo=timezone.utc)
# ======================================================================
# Non-docker unit tests
# ======================================================================
def test_construction_emits_loaded_log_with_disk_bytes_snapshot(
fake_sink: FakeFdrSink, caplog: pytest.LogCaptureFixture
) -> None:
# Arrange
store = _FakeStore(
rows=[
_persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=1_000_000,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
]
)
# Act
with caplog.at_level(logging.INFO, logger="c6_tile_cache.budget.test"):
_build_enforcer(store, fake_sink, budget_bytes=10 * 1024**3)
# Assert
loaded = [rec for rec in caplog.records if getattr(rec, "kind", "") == "c6.budget.loaded"]
assert len(loaded) == 1
kv = loaded[0].kv # type: ignore[attr-defined]
assert kv["budget_bytes"] == 10 * 1024**3
assert kv["current_disk_bytes"] == 1_000_000
assert kv["headroom_bytes"] == 10 * 1024**3 - 1_000_000
def test_ac12_construction_warns_when_over_budget(
fake_sink: FakeFdrSink, caplog: pytest.LogCaptureFixture
) -> None:
# Arrange — prior flight ended over the cap.
store = _FakeStore(
rows=[
_persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=200,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
]
)
# Act
with caplog.at_level(logging.WARNING, logger="c6_tile_cache.budget.test"):
_build_enforcer(store, fake_sink, budget_bytes=100)
# Assert
warn = [
rec
for rec in caplog.records
if getattr(rec, "kind", "") == "c6.budget.over_budget_at_construction"
]
assert len(warn) == 1
assert warn[0].kv["overage_bytes"] == 100 # type: ignore[attr-defined]
def test_construction_rejects_non_positive_budget(fake_sink: FakeFdrSink) -> None:
# Arrange
store = _FakeStore()
# Act + Assert
with pytest.raises(TileMetadataError, match="budget_bytes must be > 0"):
_build_enforcer(store, fake_sink, budget_bytes=0)
def test_construction_rejects_non_positive_batch_size(fake_sink: FakeFdrSink) -> None:
# Arrange
store = _FakeStore()
# Act + Assert
with pytest.raises(TileMetadataError, match="eviction_batch_size must be > 0"):
_build_enforcer(store, fake_sink, budget_bytes=1024, eviction_batch_size=0)
def test_ac1_no_eviction_fast_path(fake_sink: FakeFdrSink) -> None:
# Arrange — 10 GB budget, 1 GB used, 10 MB needed → trivially fits.
store = _FakeStore(
rows=[
_persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=1 * 1024**3,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
]
)
enforcer = _build_enforcer(store, fake_sink, budget_bytes=10 * 1024**3)
fake_sink.records.clear()
# Act
result = enforcer.reserve_headroom(10 * 1024 * 1024)
# Assert
assert result == EvictionResult(evicted=[], freed_bytes=0)
assert store.lru_calls == [] # AC-1: no lru_candidates call on fast path
eviction_records = [r for r in fake_sink.records if r.kind == "c6.eviction_batch"]
assert eviction_records == []
def test_ac2_single_tile_eviction_frees_enough(
fake_sink: FakeFdrSink, caplog: pytest.LogCaptureFixture
) -> None:
# Arrange — 10 GB budget, 9.99 GB used → 10 MB head-room.
# One LRU candidate of 50 MB; we ask for 30 MB more.
budget = 10 * 1024**3
used = budget - 10 * 1024 * 1024
fill = _persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=used - 50 * 1024 * 1024,
accessed_at=datetime(2026, 4, 1, tzinfo=timezone.utc),
)
candidate = _persistent_row(
tile_id_seed=(18, 49.95, 36.32),
disk_bytes=50 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
# _FakeStore.lru_candidates returns insertion order and ignores
# accessed_at, so candidate is the LRU pick because it is first in the list.
store = _FakeStore(rows=[candidate, fill])
enforcer = _build_enforcer(store, fake_sink, budget_bytes=budget)
fake_sink.records.clear()
# Act
with caplog.at_level(logging.INFO, logger="c6_tile_cache.budget.test"):
result = enforcer.reserve_headroom(30 * 1024 * 1024)
# Assert
assert [md.tile_id for md in result.evicted] == [candidate.metadata.tile_id]
assert result.freed_bytes == 50 * 1024 * 1024
info_logs = [rec for rec in caplog.records if getattr(rec, "kind", "") == "c6.evicted"]
assert len(info_logs) == 1
eviction_records = [r for r in fake_sink.records if r.kind == "c6.eviction_batch"]
assert len(eviction_records) == 1
assert eviction_records[0].payload["evicted_count"] == 1
assert eviction_records[0].payload["freed_bytes"] == 50 * 1024 * 1024
def test_ac3_multi_tile_eviction_iterates_until_target(fake_sink: FakeFdrSink) -> None:
# Arrange — 10 candidates of 5 MB each; need to free 30 MB.
candidates = [
_persistent_row(
tile_id_seed=(18, 49.0 + i * 0.001, 36.0),
disk_bytes=5 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc) + timedelta(minutes=i),
)
for i in range(10)
]
# Budget configured to be exactly current → 30 MB shortfall on a 30 MB ask.
used = sum(c.disk_bytes for c in candidates)
store = _FakeStore(rows=list(candidates))
enforcer = _build_enforcer(store, fake_sink, budget_bytes=used)
# Act
result = enforcer.reserve_headroom(30 * 1024 * 1024)
# Assert — exactly 6 evictions (6 of 5 MB = 30 MB shortfall).
assert len(result.evicted) == 6
assert result.freed_bytes == 30 * 1024 * 1024
# The 7th onwards are still in the fake store (i.e. not evicted).
assert len(store.delete_calls) == 6
def test_ac4_eviction_batches_respect_batch_size(fake_sink: FakeFdrSink) -> None:
# Arrange — 100 candidates of 1 MB each; batch size 32; need 50 MB.
candidates = [
_persistent_row(
tile_id_seed=(18, 49.0 + i * 0.001, 36.0),
disk_bytes=1 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc) + timedelta(minutes=i),
)
for i in range(100)
]
used = sum(c.disk_bytes for c in candidates)
store = _FakeStore(rows=list(candidates))
enforcer = _build_enforcer(store, fake_sink, budget_bytes=used, eviction_batch_size=32)
# Act
result = enforcer.reserve_headroom(50 * 1024 * 1024)
# Assert
assert result.freed_bytes == 50 * 1024 * 1024
# lru_candidates must be called with max_count=32 each time.
assert all(call == 32 for call in store.lru_calls)
# Two SELECTs cover candidates [0..31] (32) + [32..49] (18 needed, 32 returned).
assert len(store.lru_calls) == 2
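The batching behaviour asserted above can be sketched as a standalone loop. This is an illustration under assumptions, not the enforcer's code: `sweep` is a hypothetical name, candidates are reduced to their sizes in LRU order, and the extra empty fetch that ends an exhausted sweep is a guess at how termination is detected.

```python
from __future__ import annotations


def sweep(sizes: list[int], shortfall: int, batch_size: int) -> tuple[int, int, int]:
    """Evict LRU-first in batches; return (evicted, freed_bytes, selects)."""
    remaining = list(sizes)  # index 0 is the least recently used tile
    freed = 0
    evicted = 0
    selects = 0
    while freed < shortfall:
        batch = remaining[:batch_size]  # one lru_candidates(max_count=...) call
        selects += 1
        if not batch:
            break  # candidates exhausted; the caller would raise here
        for size in batch:
            remaining.pop(0)  # delete_tile removes the row from the store
            freed += size
            evicted += 1
            if freed >= shortfall:
                break  # stop mid-batch once the shortfall is covered
    return (evicted, freed, selects)


ac4 = sweep([1] * 100, shortfall=50, batch_size=32)
ac5 = sweep([10] * 10, shortfall=1000, batch_size=32)
```

With 100 one-unit candidates and a shortfall of 50, the first fetch evicts 32 and the second evicts 18 of the 32 it returns, which matches the two calls the test expects.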
def test_ac5_insufficient_candidates_raises_after_full_sweep(fake_sink: FakeFdrSink) -> None:
# Arrange — only 100 MB worth of candidates exist; we ask for 1 GB.
candidates = [
_persistent_row(
tile_id_seed=(18, 49.0 + i * 0.001, 36.0),
disk_bytes=10 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc) + timedelta(minutes=i),
)
for i in range(10)
]
used = sum(c.disk_bytes for c in candidates)
store = _FakeStore(rows=list(candidates))
enforcer = _build_enforcer(store, fake_sink, budget_bytes=used, eviction_batch_size=32)
# Act + Assert
with pytest.raises(CacheBudgetExhaustedError) as excinfo:
enforcer.reserve_headroom(1 * 1024**3)
# All candidates evicted before the raise (partial-eviction principle).
assert excinfo.value.evicted_count == 10
assert excinfo.value.needed_bytes == 1 * 1024**3
# No candidates remain → next total_disk_bytes would be 0.
assert len(store.delete_calls) == 10
def test_ac7_decorator_propagates_wrapped_errors(fake_sink: FakeFdrSink) -> None:
# Arrange
store = _FakeStore()
enforcer = _build_enforcer(store, fake_sink, budget_bytes=10 * 1024**3)
class _RaisingStore:
def read_tile_pixels(self, _tile_id: TileId) -> object:
raise AssertionError("not exercised here")
def write_tile(self, _tile_blob: bytes, _metadata: TileMetadata) -> None:
raise ContentHashMismatchError("declared a..a, computed 0..0")
def tile_exists(self, _tile_id: TileId) -> bool:
return False
def delete_tile(self, _tile_id: TileId) -> bool:
return False
wrapper = BudgetEnforcedTileStore(wrapped=_RaisingStore(), enforcer=enforcer)
blob = b"\xff\xd8" + b"\x00" * 16
md = TileMetadata(
tile_id=TileId(zoom_level=18, lat=49.94, lon=36.31),
tile_size_meters=256.0,
tile_size_pixels=256,
capture_timestamp=datetime(2026, 5, 12, tzinfo=timezone.utc),
source=TileSource.GOOGLEMAPS,
content_sha256_hex=hashlib.sha256(blob).hexdigest(),
freshness_label=FreshnessLabel.FRESH,
flight_id=None,
companion_id=None,
quality_metadata=None,
voting_status=VotingStatus.TRUSTED,
)
# Act + Assert — decorator does NOT rewrap the underlying error.
with pytest.raises(ContentHashMismatchError, match=r"declared a\.\.a"):
wrapper.write_tile(blob, md)
def test_ac9_no_evict_path_uses_single_select(fake_sink: FakeFdrSink) -> None:
# Arrange — head-room exists; reserve_headroom should ONLY hit total_disk_bytes.
store = _FakeStore(
rows=[
_persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
]
)
enforcer = _build_enforcer(store, fake_sink, budget_bytes=10 * 1024**3)
# Reset the counter so we ignore the construction-time read.
store.total_disk_bytes_calls = 0
# Act
enforcer.reserve_headroom(1024)
# Assert
assert store.total_disk_bytes_calls == 1
assert store.lru_calls == []
assert store.delete_calls == []
def test_ac11_fdr_eviction_batch_payload_caps_tile_ids_at_5(
fake_sink: FakeFdrSink,
) -> None:
# Arrange — 100 candidates of 1 MB each; force 100 evictions.
candidates = [
_persistent_row(
tile_id_seed=(18, 49.0 + i * 0.0001, 36.0),
disk_bytes=1 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc) + timedelta(minutes=i),
)
for i in range(100)
]
used = sum(c.disk_bytes for c in candidates)
store = _FakeStore(rows=list(candidates))
enforcer = _build_enforcer(store, fake_sink, budget_bytes=used, eviction_batch_size=32)
fake_sink.records.clear()
# Act: request 100 MB so that all 100 candidates must be evicted.
enforcer.reserve_headroom(100 * 1024 * 1024)
# Assert
eviction_records = [r for r in fake_sink.records if r.kind == "c6.eviction_batch"]
assert len(eviction_records) == 1
payload = eviction_records[0].payload
assert payload["evicted_count"] == 100
assert len(payload["evicted_tile_ids"]) == 5 # bounded
def test_reliability_delete_returns_false_logs_and_continues(
fake_sink: FakeFdrSink, caplog: pytest.LogCaptureFixture
) -> None:
# Arrange — first candidate raced away; second candidate is real.
raced = _persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=10 * 1024 * 1024,
accessed_at=datetime(2026, 4, 1, tzinfo=timezone.utc),
)
second = _persistent_row(
tile_id_seed=(18, 49.95, 36.32),
disk_bytes=10 * 1024 * 1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
store = _FakeStore(
rows=[raced, second],
delete_returns={raced.metadata.tile_id: False},
)
enforcer = _build_enforcer(store, fake_sink, budget_bytes=raced.disk_bytes + second.disk_bytes)
# Act
with caplog.at_level(logging.INFO, logger="c6_tile_cache.budget.test"):
result = enforcer.reserve_headroom(15 * 1024 * 1024)
# Assert — both are counted as freed (spec § Exclusions).
already_gone = [
rec for rec in caplog.records if getattr(rec, "kind", "") == "c6.evict.already_gone"
]
assert len(already_gone) == 1
assert result.freed_bytes == 20 * 1024 * 1024
assert len(store.delete_calls) == 2
def test_reliability_delete_raises_tile_fs_error_logs_and_continues(
fake_sink: FakeFdrSink, caplog: pytest.LogCaptureFixture
) -> None:
# Arrange: models the case where delete_tile raises TileFsError although the row delete succeeded.
candidate = _persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=20 * 1024 * 1024,
accessed_at=datetime(2026, 4, 1, tzinfo=timezone.utc),
)
store = _FakeStore(
rows=[candidate],
delete_raises={candidate.metadata.tile_id: TileFsError("unlink failed")},
)
enforcer = _build_enforcer(store, fake_sink, budget_bytes=candidate.disk_bytes)
# Act
with caplog.at_level(logging.WARNING, logger="c6_tile_cache.budget.test"):
result = enforcer.reserve_headroom(10 * 1024 * 1024)
# Assert
fs_errors = [rec for rec in caplog.records if getattr(rec, "kind", "") == "c6.evict.fs_error"]
assert len(fs_errors) == 1
assert result.freed_bytes == candidate.disk_bytes
def test_nfr_perf_no_evict_path_p99_under_5ms(fake_sink: FakeFdrSink) -> None:
# Arrange — head-room exists.
store = _FakeStore(
rows=[
_persistent_row(
tile_id_seed=(18, 49.94, 36.31),
disk_bytes=1024,
accessed_at=datetime(2026, 5, 1, tzinfo=timezone.utc),
)
]
)
enforcer = _build_enforcer(store, fake_sink, budget_bytes=10 * 1024**3)
durations_us: list[float] = []
# Act — 1000 reps is enough for a stable p99 on the no-DB path.
for _ in range(1000):
t0 = time.perf_counter()
enforcer.reserve_headroom(1024)
durations_us.append((time.perf_counter() - t0) * 1_000_000.0)
# Assert — relaxed 5 ms ceiling matches the AZ-308 NFR text.
durations_us.sort()
p99 = durations_us[int(0.99 * len(durations_us))]
assert p99 < 5_000.0, f"reserve_headroom p99={p99:.1f} us exceeds 5 ms ceiling"
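The percentile pick used in the test above (sort, then index at `int(fraction * n)`) can be factored into a small helper. `percentile_by_index` is a hypothetical name; the clamp to the last element is an addition so that `fraction=1.0` does not index past the end.

```python
from __future__ import annotations

from typing import Sequence


def percentile_by_index(samples: Sequence[float], fraction: float) -> float:
    """Return the sample at position int(fraction * n) of the sorted data."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)  # clamp for 1.0
    return ordered[index]


median_like = percentile_by_index([5, 1, 4, 2, 3, 9, 8, 7, 6, 0], 0.5)
maximum = percentile_by_index([3, 1, 2], 1.0)
```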
# ======================================================================
# Docker integration tests (real Postgres + filesystem)
# ======================================================================
@pytest.fixture
def db_url() -> str:
url = os.environ.get("DB_URL")
if not url:
pytest.skip("DB_URL not set — start docker-compose.test.yml `db` service first")
return url
@pytest.fixture
def fresh_head_db(db_url: str) -> Iterator[str]:
tables = ", ".join(
(
"tile_freshness_rules",
"engine_cache_entries",
"manifests",
"tiles",
"sector_classifications",
"flights",
"alembic_version",
)
)
with psycopg.connect(db_url, autocommit=True) as conn:
with conn.cursor() as cur:
cur.execute(f"DROP TABLE IF EXISTS {tables} CASCADE")
block = C6TileCacheConfig(postgres_dsn=db_url)
apply_migrations(Config.with_blocks(c6_tile_cache=block))
yield db_url
@pytest.fixture
def pool(fresh_head_db: str) -> Iterator[ConnectionPool]:
p = ConnectionPool(
fresh_head_db, min_size=1, max_size=4, open=True, kwargs={"autocommit": False}
)
yield p
p.close()
@pytest.fixture
def real_store(
pool: ConnectionPool, tmp_path: Path, fake_sink: FakeFdrSink
) -> PostgresFilesystemStore:
from gps_denied_onboard.clock.wall_clock import WallClock
return PostgresFilesystemStore(
root_dir=tmp_path,
postgres_pool=pool,
sha256_sidecar=Sha256Sidecar,
wgs_converter=WgsConverter,
fdr_client=fake_sink, # type: ignore[arg-type]
logger=get_logger("c6_tile_cache.store.test"),
lru_clock=WallClock(),
)
@pytest.fixture
def future_clock_store(
pool: ConnectionPool, tmp_path: Path, fake_sink: FakeFdrSink
) -> PostgresFilesystemStore:
"""Store wired with a deterministic far-future clock for LRU tests.
Wall-clock parity between the host (Python) and the Postgres container
is not always tight on macOS/Colima — a sub-second skew can leave the
AZ-305 ``DEFAULT now()`` ``accessed_at`` after the host's
:func:`time.time_ns`, so a real ``record_lru_access`` UPDATE with the
host's wall clock loses to ``GREATEST(accessed_at, %s)``. Pinning the
clock to a far-future timestamp removes that flakiness without
changing the production wiring (which uses ``WallClock``).
"""
return PostgresFilesystemStore(
root_dir=tmp_path,
postgres_pool=pool,
sha256_sidecar=Sha256Sidecar,
wgs_converter=WgsConverter,
fdr_client=fake_sink, # type: ignore[arg-type]
logger=get_logger("c6_tile_cache.store.test"),
lru_clock=_FakeClock(datetime(2099, 1, 1, tzinfo=timezone.utc)),
)
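The skew problem described in the fixture docstring comes down to `GREATEST` behaving like `max`. A minimal sketch, assuming the UPDATE has the form `accessed_at = GREATEST(accessed_at, %s)` as the docstring states; `bumped` is a hypothetical helper and the 300 ms skew is an invented example value.

```python
from datetime import datetime, timedelta, timezone


def bumped(stored: datetime, candidate: datetime) -> datetime:
    """Model GREATEST(accessed_at, candidate): the stamp never moves backwards."""
    return max(stored, candidate)


# The row was stamped by the database clock, which runs slightly ahead of the host.
db_now = datetime(2026, 5, 12, 12, 0, 0, 300000, tzinfo=timezone.utc)
host_now = db_now - timedelta(milliseconds=300)
far_future = datetime(2099, 1, 1, tzinfo=timezone.utc)

after_host_read = bumped(db_now, host_now)  # no visible change
after_fake_read = bumped(db_now, far_future)  # always moves forward
```

With the host clock behind, a read leaves the stamp untouched and the LRU order cannot change, which is the flakiness the far-future clock avoids.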
def _make_tile_blob(content: str) -> bytes:
return b"\xff\xd8\xff\xe0" + content.encode("ascii") + b"\x00" * 256 + b"\xff\xd9"
def _metadata_for(
blob: bytes,
*,
lat: float = 49.94,
lon: float = 36.31,
capture_timestamp: datetime | None = None,
) -> TileMetadata:
return TileMetadata(
tile_id=TileId(zoom_level=18, lat=lat, lon=lon),
tile_size_meters=256.0,
tile_size_pixels=256,
capture_timestamp=capture_timestamp or datetime(2026, 5, 12, tzinfo=timezone.utc),
source=TileSource.GOOGLEMAPS,
content_sha256_hex=hashlib.sha256(blob).hexdigest(),
freshness_label=FreshnessLabel.FRESH,
flight_id=None,
companion_id=None,
quality_metadata=None,
voting_status=VotingStatus.TRUSTED,
)
@_docker
def test_ac6_decorator_evicts_then_writes(
real_store: PostgresFilesystemStore, fake_sink: FakeFdrSink
) -> None:
# Arrange: seed the cache with a single tile, then construct an
# enforcer with a budget tight enough that the next write triggers
# eviction. The two tiles use lat/lon values in distinct tile cells.
seed_blob = _make_tile_blob("seed-tile")
seed_md = _metadata_for(seed_blob, lat=49.94, lon=36.31)
real_store.write_tile(seed_blob, seed_md)
seed_disk_bytes = real_store.total_disk_bytes()
# Budget = current disk bytes + 64 B, which leaves too little head-room for another tile.
enforcer = _build_enforcer(real_store, fake_sink, budget_bytes=seed_disk_bytes + 64)
wrapper = BudgetEnforcedTileStore(wrapped=real_store, enforcer=enforcer)
# Read the seed tile once to exercise the read path and prove the
# AC-8 LRU update fires. The seed is the only eviction candidate, so
# the bump does not change which tile gets evicted.
handle = real_store.read_tile_pixels(seed_md.tile_id)
with handle:
pass
new_blob = _make_tile_blob("post-eviction-tile")
new_md = _metadata_for(new_blob, lat=50.0, lon=37.0)
# Act
wrapper.write_tile(new_blob, new_md)
# Assert
# Seed tile evicted, new tile present.
assert real_store.tile_exists(seed_md.tile_id) is False
assert real_store.tile_exists(new_md.tile_id) is True
# FDR batch emitted with trigger_tile_id pointing at the new tile.
eviction = [r for r in fake_sink.records if r.kind == "c6.eviction_batch"]
assert len(eviction) == 1
assert eviction[0].payload["trigger_tile_id"] == str(new_md.tile_id)
@_docker
def test_ac8_read_tile_pixels_updates_lru_clock(
future_clock_store: PostgresFilesystemStore,
) -> None:
# Arrange — both tiles get DEFAULT now() at INSERT. Then read A; the
# far-future fake clock guarantees A.accessed_at is bumped above
# B.accessed_at regardless of host/container clock skew.
blob_a = _make_tile_blob("ac8-a")
md_a = _metadata_for(blob_a, lat=49.94, lon=36.31)
blob_b = _make_tile_blob("ac8-b")
md_b = _metadata_for(blob_b, lat=50.0, lon=37.0)
future_clock_store.write_tile(blob_a, md_a)
future_clock_store.write_tile(blob_b, md_b)
handle = future_clock_store.read_tile_pixels(md_a.tile_id)
with handle:
pass
# Act
candidates = future_clock_store.lru_candidates(max_count=2)
# Assert — after the LRU-clock-driven read, A is now most-recently
# accessed and B is the oldest candidate.
assert candidates[0].metadata.tile_id == md_b.tile_id
assert candidates[1].metadata.tile_id == md_a.tile_id
@_docker
def test_ac10_synthetic_fill_keeps_disk_under_cap(
real_store: PostgresFilesystemStore, fake_sink: FakeFdrSink
) -> None:
# Arrange: use a small synthetic cap so the fill is fast. Seed 5 small
# tiles, then cap the budget at the resulting disk usage so that the
# next write forces eviction.
seed_blobs: list[bytes] = []
seed_mds: list[TileMetadata] = []
for i in range(5):
b = _make_tile_blob(f"ac10-fill-{i}")
seed_blobs.append(b)
seed_mds.append(_metadata_for(b, lat=49.94 + i * 0.001, lon=36.31))
real_store.write_tile(b, seed_mds[-1])
current = real_store.total_disk_bytes()
cap = current # budget == used → next write evicts oldest tile.
enforcer = _build_enforcer(real_store, fake_sink, budget_bytes=cap)
wrapper = BudgetEnforcedTileStore(wrapped=real_store, enforcer=enforcer)
# Act — insert 5 more tiles; every write should keep disk <= cap.
fake_sink.records.clear()
for i in range(5, 10):
b = _make_tile_blob(f"ac10-overflow-{i}")
md = _metadata_for(b, lat=50.0 + i * 0.001, lon=37.0)
wrapper.write_tile(b, md)
assert real_store.total_disk_bytes() <= cap, (
f"iteration {i}: disk={real_store.total_disk_bytes()} cap={cap}"
)
# Assert — at least one eviction FDR record was emitted.
eviction_records = [r for r in fake_sink.records if r.kind == "c6.eviction_batch"]
assert eviction_records, "expected at least one c6.eviction_batch record"
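
The sweep that the AC-6 and AC-10 tests exercise can be illustrated with a minimal in-memory sketch. It is not the code from this commit: `InMemoryStore` is a hypothetical stand-in for the real store, `reserve_headroom` is written here as a free function, and stopping mid-batch as soon as the write fits is an assumption. Only the method names (`total_disk_bytes`, `lru_candidates`, `delete_tile`) and the error name mirror the commit message.

```python
from __future__ import annotations

from collections import OrderedDict


class CacheBudgetExhaustedError(Exception):
    """Raised when a full LRU sweep still cannot free enough space."""


class InMemoryStore:
    """Hypothetical stand-in for the tile store; dict order is LRU order."""

    def __init__(self) -> None:
        self._tiles: OrderedDict[str, bytes] = OrderedDict()

    def write_tile(self, tile_id: str, blob: bytes) -> None:
        self._tiles[tile_id] = blob
        self._tiles.move_to_end(tile_id)  # newest entry goes to the tail

    def delete_tile(self, tile_id: str) -> None:
        del self._tiles[tile_id]

    def total_disk_bytes(self) -> int:
        return sum(len(blob) for blob in self._tiles.values())

    def lru_candidates(self, max_count: int) -> list[str]:
        # Least-recently-used tiles sit at the head of the ordered dict.
        return list(self._tiles)[:max_count]


def reserve_headroom(
    store: InMemoryStore, budget_bytes: int, needed_bytes: int, batch_size: int = 2
) -> list[str]:
    """Evict LRU tiles in batches until needed_bytes fits under the budget."""
    evicted: list[str] = []
    while store.total_disk_bytes() + needed_bytes > budget_bytes:
        batch = store.lru_candidates(batch_size)
        if not batch:
            # Nothing left to evict: the sweep is complete and still too small.
            raise CacheBudgetExhaustedError(
                f"need {needed_bytes} B, budget {budget_bytes} B"
            )
        for tile_id in batch:
            store.delete_tile(tile_id)
            evicted.append(tile_id)
            # Assumption: stop as soon as the pending write fits.
            if store.total_disk_bytes() + needed_bytes <= budget_bytes:
                break
    return evicted


# Mirror the AC-10 setup: fill five tiles, then cap the budget at current usage.
store = InMemoryStore()
for i in range(5):
    store.write_tile(f"t{i}", b"x" * 100)
cap = store.total_disk_bytes()

evicted = reserve_headroom(store, budget_bytes=cap, needed_bytes=100)
store.write_tile("t5", b"y" * 100)
```

With the cap equal to current usage, each same-sized write evicts exactly the oldest tile, which is the invariant the `total_disk_bytes() <= cap` assertion in the loop above checks on every iteration.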
@@ -56,13 +56,9 @@ from gps_denied_onboard.runtime_root.storage_factory import (
build_tile_store,
)
_CONTRACT_DIR = Path(__file__).resolve().parents[3] / (
"_docs/02_document/contracts/c6_tile_cache"
)
_CONTRACT_DIR = Path(__file__).resolve().parents[3] / ("_docs/02_document/contracts/c6_tile_cache")
_FAKE_IMPL_MODULE = "gps_denied_onboard.components.c6_tile_cache.faiss_descriptor_index"
_FAKE_STORE_MODULE = (
"gps_denied_onboard.components.c6_tile_cache.postgres_filesystem_store"
)
_FAKE_STORE_MODULE = "gps_denied_onboard.components.c6_tile_cache.postgres_filesystem_store"
def _valid_tile_id(zoom: int = 18, lat: float = 49.94, lon: float = 36.31) -> TileId:
@@ -320,6 +316,15 @@ def _install_fake_postgres_store_module() -> type:
# preserves the single-config-arg shape via this classmethod.
return cls(config)
# AZ-308: ``build_tile_store`` now wraps the store in a
# ``BudgetEnforcedTileStore`` whose constructor reads
# ``total_disk_bytes`` for the AC-12 startup log. Override the
# ``_FullTileMetadataStore`` NotImplementedError stub with a
# working zero-byte response so the factory can construct the
# wrapper without touching a real DB.
def total_disk_bytes(self) -> int:
return 0
fake_module = types.ModuleType(_FAKE_STORE_MODULE)
fake_module.PostgresFilesystemStore = _FakePostgresFilesystemStore # type: ignore[attr-defined]
sys.modules[_FAKE_STORE_MODULE] = fake_module
@@ -349,11 +354,21 @@ def test_ac5_build_descriptor_index_flag_off_raises_no_import(
def test_ac4_build_tile_store_returns_protocol_impl(store_module_cleanup) -> None:
# AZ-308: ``build_tile_store`` now returns a ``BudgetEnforcedTileStore``
# decorator wrapping the inner :class:`TileStore` impl. The decorator
# implements the Protocol surface; the wrapped instance is reachable
# via the private ``_wrapped`` attribute for tests that need to
# introspect the inner store.
from gps_denied_onboard.components.c6_tile_cache.cache_budget_enforcer import (
BudgetEnforcedTileStore,
)
fake_cls = _install_fake_postgres_store_module()
config = _config_with_c6()
store = build_tile_store(config)
assert isinstance(store, fake_cls)
assert isinstance(store, BudgetEnforcedTileStore)
assert isinstance(store, TileStore)
assert isinstance(store._wrapped, fake_cls) # type: ignore[attr-defined]
def test_ac4_build_tile_metadata_store_returns_protocol_impl(
@@ -366,9 +381,7 @@ def test_ac4_build_tile_metadata_store_returns_protocol_impl(
assert isinstance(md, TileMetadataStore)
def test_ac5_tile_store_runtime_module_missing_raises(
store_module_cleanup, monkeypatch
) -> None:
def test_ac5_tile_store_runtime_module_missing_raises(store_module_cleanup, monkeypatch) -> None:
"""AC-5 historical name; after AZ-305 the impl module always exists, so
"missing" is exercised by deleting it from ``sys.modules`` AND making
``importlib`` refuse the import. We patch the module-level lazy import
@@ -378,14 +391,18 @@ def test_ac5_tile_store_runtime_module_missing_raises(
config = _config_with_c6()
import gps_denied_onboard.runtime_root.storage_factory as factory_mod
real_import = __builtins__["__import__"] if isinstance(__builtins__, dict) else __builtins__.__import__
real_import = (
__builtins__["__import__"] if isinstance(__builtins__, dict) else __builtins__.__import__
)
def _block_postgres_import(name, *args, **kwargs):
if name.endswith("postgres_filesystem_store"):
raise ModuleNotFoundError(name)
return real_import(name, *args, **kwargs)
monkeypatch.setattr(factory_mod, "__builtins__", {"__import__": _block_postgres_import}, raising=False)
monkeypatch.setattr(
factory_mod, "__builtins__", {"__import__": _block_postgres_import}, raising=False
)
monkeypatch.setitem(sys.modules, _FAKE_STORE_MODULE, None) # type: ignore[arg-type]
with pytest.raises(RuntimeNotAvailableError) as exc_info:
build_tile_store(config)
@@ -428,9 +445,7 @@ def test_ac6_unknown_metadata_runtime_rejected() -> None:
({"zoom_level": 18, "lat": 0.0, "lon": -200.0}, "lon"),
],
)
def test_ac7_tile_id_rejects_bad_input(
kwargs: dict[str, float], offending_field: str
) -> None:
def test_ac7_tile_id_rejects_bad_input(kwargs: dict[str, float], offending_field: str) -> None:
with pytest.raises(ValueError) as exc_info:
TileId(**kwargs) # type: ignore[arg-type]
assert offending_field in str(exc_info.value)
@@ -504,9 +519,7 @@ def _methods_from_contract(contract_file: Path) -> set[str]:
def _protocol_methods(proto: type) -> set[str]:
"""Reflect over a Protocol's method names."""
return {
name
for name in dir(proto)
if not name.startswith("_") and callable(getattr(proto, name))
name for name in dir(proto) if not name.startswith("_") and callable(getattr(proto, name))
}
@@ -518,9 +531,7 @@ def _protocol_methods(proto: type) -> set[str]:
("descriptor_index.md", DescriptorIndex),
],
)
def test_ac9_contract_methods_match_protocol(
contract_filename: str, proto: type
) -> None:
def test_ac9_contract_methods_match_protocol(contract_filename: str, proto: type) -> None:
contract_path = _CONTRACT_DIR / contract_filename
contract_methods = _methods_from_contract(contract_path)
protocol_methods = _protocol_methods(proto)
@@ -171,6 +171,19 @@ def _kind_payload(kind: str) -> dict[str, object]:
"rule_action": "downgrade",
"rule_max_age_seconds": 31_104_000,
}
if kind == "c6.eviction_batch":
return {
"trigger_tile_id": "00000000-0000-0000-0000-000000000003",
"freed_bytes": 250_000_000,
"evicted_count": 12,
"evicted_tile_ids": [
"00000000-0000-0000-0000-000000000010",
"00000000-0000-0000-0000-000000000011",
"00000000-0000-0000-0000-000000000012",
"00000000-0000-0000-0000-000000000013",
"00000000-0000-0000-0000-000000000014",
],
}
raise AssertionError(f"unhandled kind in fixture: {kind!r}")