[AZ-421] Batch 82: FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention

FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest
entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source,
compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`).

FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>`
snapshots; assert `Internal == true` AND SUT attached only to e2e-net.
The 0-egress semantic of AC-8.3 is enforced structurally.

FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF
parser, reject any file matching nav-camera raw pattern (5472x3648 or
880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB.

Adds runner.helpers.tile_cache_inspector with five evaluators
(manifest schema, offline mode, raw-frame detection, thumbnail budget,
JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43
new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746).
Scenarios skip locally when SITL replay fixture or docker-inspect
env vars are missing; production hooks (cache-self-check FDR record,
tile-load-rejected events, docker-inspect snapshots) are tracked
outside this task.

See _docs/03_implementation/batch_82_report.md +
reviews/batch_82_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 15:09:58 +03:00
parent b0296da911
commit 7d1288e4ba
9 changed files with 1693 additions and 3 deletions
@@ -0,0 +1,187 @@
# Batch 82 Report — FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention
**Batch**: 82
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-421 (3 cp) — single task covering 3 sub-scenarios
**Cycle**: 1
**Verdict**: COMPLETE — PASS_WITH_WARNINGS (self-reviewed; see `reviews/batch_82_review.md`)
## Summary
Implements three storage / cache compliance scenarios that share the
`tile-cache-fixture` + FDR-archive observation surface:
* **FT-P-15** — Tile manifest schema completeness + 0.5 m/px floor
(AC-8.1). Reads FDR `cache-self-check` record + `tile-load-rejected`
events, validates every entry has CRS, tile_matrix, dimension,
m_per_px, capture_date, source, compression; entries below floor
must be explicitly rejected.
* **FT-P-16** — Offline-only operation (AC-8.3 / RESTRICT-SAT-1).
Reads `docker network inspect e2e-net` + `docker inspect <sut>`
JSON snapshots; asserts `e2e-net.Internal == true` AND the SUT is
attached to that network only. The 0-egress semantic is enforced
structurally — no other network is reachable.
* **FT-P-18** — No raw nav/AI-camera frame retention (AC-8.5). Walks
FDR + tile-cache, probes JPEG dimensions, rejects any file whose
extension + dimensions match the nav-camera raw pattern
(5472×3648 or 880×720). Extrapolates thumbnail-log size to 8 h
and asserts < 1 GB.
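
The 8 h extrapolation in FT-P-18 is plain linear scaling of the observed thumbnail-log size by the replay duration. A minimal sketch of that arithmetic follows, under stated assumptions: `BudgetResult`, `replay_duration_h` and `project_thumbnail_budget` are hypothetical names (not the helper's actual API), and "1 GB" is read as 10^9 bytes, which the report does not specify.

```python
from dataclasses import dataclass
from typing import Iterable

GB = 10 ** 9  # assumption: decimal gigabyte; the report does not say GB vs GiB


@dataclass(frozen=True)
class BudgetResult:
    """Hypothetical stand-in for the report object returned by the real helper."""
    projected_bytes_8h: float
    passed: bool


def replay_duration_h(monotonic_ms: Iterable[int]) -> float:
    """Replay duration in hours, taken as the span of FDR monotonic_ms stamps."""
    stamps = list(monotonic_ms)
    if len(stamps) < 2:
        raise ValueError("need at least two timestamps to measure a span")
    return (max(stamps) - min(stamps)) / 3_600_000


def project_thumbnail_budget(size_bytes: int, duration_h: float, *,
                             window_h: float = 8.0,
                             max_bytes: int = GB) -> BudgetResult:
    """Scale the observed log size linearly to window_h and compare to the cap."""
    if duration_h <= 0:
        raise ValueError("duration_h must be positive")
    projected = size_bytes * (window_h / duration_h)
    # Strict "<" mirrors the "< 1 GB" wording of the scenario.
    return BudgetResult(projected_bytes_8h=projected, passed=projected < max_bytes)
```

For example, a 30 MB log observed over a 0.5 h replay projects to 480 MB over 8 h and passes, while 70 MB over the same replay projects to 1.12 GB and fails.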
### AZ-421 — FT-P-15 + FT-P-16 + FT-P-18 (3 cp)
* **`e2e/runner/helpers/tile_cache_inspector.py`** (new, ~370 lines):
pure-logic evaluators sourced from FDR / docker-inspect /
filesystem walks.
  * `evaluate_manifest_schema(entries, *, tile_load_rejected_ids,
    m_per_px_floor)` → `ManifestSchemaReport` (AC-1, AC-2).
  * `evaluate_offline_mode(network_inspect, container_inspect)` →
    `OfflineModeReport` (AC-3).
  * `detect_raw_frames(file_specs, *, raw_dimensions,
    decoded_dimensions, raw_extensions)` → `RawFrameDetectionReport`
    (AC-4).
  * `evaluate_thumbnail_budget(size_bytes, duration_h)` →
    `ThumbnailLogBudgetReport` (AC-5).
  * `walk_files(*roots)`: convenience recursive walker.
  * `probe_jpeg_dimensions(path)` → `(w, h)` via SOF marker parse,
    stdlib-only.
  * Module-level constants: `CACHE_SELF_CHECK_FDR_KIND`,
    `TILE_LOAD_REJECTED_FDR_KIND`, `MANIFEST_REQUIRED_FIELDS`,
    `MANIFEST_M_PER_PX_FLOOR`, `NAV_CAMERA_RAW_DIMENSIONS`,
    `THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H`.
* **`e2e/tests/positive/test_ft_p_15_cache_schema.py`** (new, ~115 lines):
FT-P-15 scenario. Skips on missing fixture; fails loudly on empty
`cache-self-check` record. `traces_to(AC-8.1,AC-1,AC-2,AC-6)`.
* **`e2e/tests/positive/test_ft_p_16_offline_only.py`** (new, ~115 lines):
FT-P-16 scenario. Skips on missing `DOCKER_NETWORK_INSPECT_PATH` /
`DOCKER_CONTAINER_INSPECT_PATH` env vars (fixture builder
pre-snapshots these because the runner has no docker-socket access).
`traces_to(AC-8.3,AC-3,AC-6,RESTRICT-SAT-1)`.
* **`e2e/tests/positive/test_ft_p_18_no_raw_retention.py`** (new,
~125 lines): FT-P-18 scenario. Walks FDR + tile-cache once;
probes JPEGs; computes replay duration from FDR `monotonic_ms`
span; evaluates AC-4 + AC-5. `traces_to(AC-8.5,AC-4,AC-5,AC-6)`.
* **`e2e/_unit_tests/helpers/test_tile_cache_inspector.py`** (new,
43 tests): pure-logic coverage for every evaluator + walker +
probe.
* **`e2e/_unit_tests/test_directory_layout.py`** (edited): registers
`runner/helpers/tile_cache_inspector.py` and three new scenario
test paths.
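
To illustrate what a stdlib-only SOF marker parse involves, here is a minimal sketch. It is not the code in `tile_cache_inspector.py`: the function name `sketch_probe_jpeg_dimensions` is hypothetical, and the return-`None`-instead-of-raise behaviour is modelled on the description in this report.

```python
import struct
from typing import Optional, Tuple

# SOF0..SOF15 carry frame dimensions; 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC)
# share the numeric range but are not start-of-frame markers.
_SOF_MARKERS = frozenset(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers with no length field: TEM, RST0..RST7, SOI, EOI.
_STANDALONE = frozenset({0x01}) | frozenset(range(0xD0, 0xDA))


def sketch_probe_jpeg_dimensions(path) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the first SOF segment, or None on any problem."""
    try:
        with open(path, "rb") as fh:
            if fh.read(2) != b"\xff\xd8":      # every JPEG starts with SOI
                return None
            while True:
                if fh.read(1) != b"\xff":      # EOF or lost marker sync
                    return None
                marker = fh.read(1)
                while marker == b"\xff":       # skip optional fill bytes
                    marker = fh.read(1)
                if not marker:
                    return None
                code = marker[0]
                if code in (0x00, 0xD9, 0xDA): # stuffed byte, EOI, or SOS: no SOF found
                    return None
                if code in _STANDALONE:
                    continue
                header = fh.read(2)
                if len(header) < 2:
                    return None
                (seg_len,) = struct.unpack(">H", header)
                if seg_len < 2:
                    return None
                if code in _SOF_MARKERS:
                    body = fh.read(5)          # precision (1), height (2), width (2)
                    if len(body) < 5:
                        return None
                    _precision, height, width = struct.unpack(">BHH", body)
                    return (width, height)
                fh.seek(seg_len - 2, 1)        # hop over this segment's payload
    except OSError:
        return None
```

Each hop reads only the marker and length bytes and then seeks past the payload, so no pixel data is decoded.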
## Tests
Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s** (baseline
746 → +47 net). Run via `python -m pytest e2e/_unit_tests/` from
the workspace root. No flakes, no skips outside the pre-existing
intentional skips.
Collection check on the three new scenario tests: 18 items
(3 tests × 6 `(fc_adapter, vio_strategy)` combinations). Scenario
tests skip locally because `E2E_SITL_REPLAY_DIR` is unset and the
docker-inspect env vars are unset — intended container-vs-host
boundary.
Per-area test counts (this batch):
| File | Tests added |
|------|-------------|
| `test_tile_cache_inspector.py` (new) | 43 |
| `test_directory_layout.py` (edited) | 4 (4 path entries) |
| `test_no_sut_imports.py` (no edit; broader walk) | implicit +1 module covered |
| **Total** | **+47** |
## Acceptance Criteria Verification
| AC | Status | Evidence |
|-----|--------|----------|
| AC-1 — manifest schema completeness | ✓ | `test_ft_p_15_cache_schema` + 12 `test_evaluate_manifest_schema_*` |
| AC-2 — m/px ≥ 0.5 floor (or rejected) | ✓ | Same scenario; below-floor-with-rejection / without-rejection unit tests |
| AC-3 — offline operation (no non-e2e-net egress) | ✓ | `test_ft_p_16_offline_only` + 7 `test_evaluate_offline_mode_*` |
| AC-4 — no raw-frame retention | ✓ | `test_ft_p_18_no_raw_retention` + 9 `test_detect_raw_frames_*` + 5 `test_probe_jpeg_dimensions_*` |
| AC-5 — thumbnail log < 1 GB / 8 h | ✓ | Same scenario; 7 `test_evaluate_thumbnail_budget_*` |
| AC-6 — parameterisation | ✓ | 6 param IDs per scenario; 18 total items collected |
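
The AC-1 / AC-2 checks in the table above boil down to a per-entry field check plus a floor check that tolerates explicitly rejected tiles. A minimal sketch, under stated assumptions: `SchemaResult` and `check_manifest` are hypothetical names, the `tile_id` key and the exact key spelling of the seven required fields are assumptions, and the real `evaluate_manifest_schema` may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Mapping

# Field names as listed in this report; exact key casing is an assumption.
REQUIRED_FIELDS = ("CRS", "tile_matrix", "dimension", "m_per_px",
                   "capture_date", "source", "compression")
M_PER_PX_FLOOR = 0.5


@dataclass
class SchemaResult:
    """Hypothetical stand-in for ManifestSchemaReport."""
    missing_fields: Dict[str, List[str]] = field(default_factory=dict)
    unrejected_below_floor: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.missing_fields and not self.unrejected_below_floor


def check_manifest(entries: Iterable[Mapping[str, Any]], *,
                   rejected_ids: Iterable[str],
                   floor: float = M_PER_PX_FLOOR) -> SchemaResult:
    rejected = set(rejected_ids)
    result = SchemaResult()
    for entry in entries:
        tile_id = str(entry.get("tile_id", "<unknown>"))  # hypothetical id key
        # AC-1: every required field must be present and non-empty.
        missing = [name for name in REQUIRED_FIELDS
                   if entry.get(name) in (None, "")]
        if missing:
            result.missing_fields[tile_id] = missing
        # AC-2: a below-floor entry is acceptable only if it was explicitly rejected.
        m_per_px = entry.get("m_per_px")
        if (isinstance(m_per_px, (int, float)) and m_per_px < floor
                and tile_id not in rejected):
            result.unrejected_below_floor.append(tile_id)
    return result
```

In the scenario, `rejected_ids` would come from the `tile-load-rejected` FDR events and `entries` from the `cache-self-check` record.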
## Code Review Verdict
PASS_WITH_WARNINGS (no Critical, no High; 3 Low notes — see
`reviews/batch_82_review.md`).
## Auto-Fix Attempts
0 (no auto-fix-eligible findings).
## Stuck Agents
None.
## Notable Decisions
* **Single task in batch 82.** AZ-421 internally covers 3
sub-scenarios (FT-P-15 / 16 / 18) — the task spec itself groups
them because they share the `tile-cache-fixture` + FDR
observation surface. Pulling AZ-422/423/427 in would have
produced 7 test files + multiple new helpers in one batch,
exceeding the recent empirical scope per batch (12 sub-scenarios).
AZ-422 / AZ-423 / AZ-427 land as their own batches.
* **AC-3 (offline-only) is enforced structurally, not by packet
count.** The spec says "all egress to non-`e2e-net` destinations
is 0". With `e2e-net.Internal == true` and the SUT attached only
to `e2e-net`, Docker gives the SUT no route off that network, so
non-`e2e-net` egress is 0 by construction rather than by
measurement. Checking the docker-inspect snapshots is cheaper and
more reliable than per-packet counters.
* **JPEG SOF dimension probe is stdlib-only.** Loading every JPEG
through OpenCV / Pillow just to read `(width, height)` would
decode pixel data we discard. The 30-line SOF parser reads ≤16
bytes per segment hop and terminates in <30 hops on real JPEGs.
* **`probe_jpeg_dimensions` returns `None` on truncation /
non-JPEG / OSError; it does NOT raise.** The downstream
`detect_raw_frames` explicitly treats `None` as "dimension
unknown ≠ raw frame match" (documented). This keeps the test from
failing on every directory walk that happens to contain a
corrupt JPEG, while still surfacing real raw-frame retention.
* **Docker inspect via env-var indirection.** The e2e-runner
container does not have docker-socket access (an intentional
security boundary). The fixture builder must `docker network
inspect e2e-net > /e2e-results/net.json` + `docker inspect
gps-denied-onboard > /e2e-results/sut.json` before the runner
starts, and the runner reads those snapshots through env vars.
This is the same pattern AZ-420 used for `gcs_tlog_<host>.tlog`
(fixture-builder responsibility).
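
The structural offline check and the env-var indirection described in the decisions above can be sketched as follows. This is an illustration under stated assumptions, not the actual `evaluate_offline_mode`: `load_snapshot` and `offline_violations` are hypothetical names. The `Internal` and `NetworkSettings.Networks` keys are the ones `docker network inspect` and `docker inspect` emit in their JSON output.

```python
import json
import os
from pathlib import Path
from typing import Any, List, Optional


def load_snapshot(env_var: str) -> Optional[Any]:
    """Read a pre-captured docker-inspect JSON file whose path is in env_var.

    Returns None when the variable is unset, so the caller can decide to skip.
    """
    raw = os.environ.get(env_var)
    if not raw:
        return None
    return json.loads(Path(raw).read_text(encoding="utf-8"))


def _first_object(doc: Any) -> dict:
    # Both inspect commands print a JSON array with one object per target.
    if isinstance(doc, list):
        return doc[0] if doc else {}
    return doc or {}


def offline_violations(network_inspect: Any, container_inspect: Any, *,
                       network_name: str = "e2e-net") -> List[str]:
    """Return human-readable violations; an empty list means the check passed."""
    net = _first_object(network_inspect)
    ctr = _first_object(container_inspect)
    violations = []
    if net.get("Internal") is not True:
        violations.append(f"{network_name} is not an internal network")
    attached = sorted((ctr.get("NetworkSettings") or {}).get("Networks") or {})
    if attached != [network_name]:
        violations.append(f"SUT attached to {attached}, expected only {network_name}")
    return violations
```

A scenario would call `load_snapshot("DOCKER_NETWORK_INSPECT_PATH")` and `load_snapshot("DOCKER_CONTAINER_INSPECT_PATH")`, skip if either is `None`, and assert that `offline_violations(...)` is empty.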
## Production Dependencies (forward-look)
FT-P-15 / FT-P-16 / FT-P-18 transitively depend on:
* **FDR `cache-self-check` record** at SUT cold-start — the SUT's
C6 tile-cache loader must emit one record carrying every manifest
entry it loaded. (Cross-checked against the FDR schema documented
in `_docs/02_document/components/c6_*` — slot is reserved; no
producer wires it yet.)
* **FDR `tile-load-rejected` events** — for entries below the m/px
floor (or otherwise rejected by the freshness gate). Reserved
same way.
* **Docker compose `e2e-net` attribute `internal: true`** — owned
by AZ-406. Already wired per the existing compose file.
* **Fixture builder snapshots** of `docker inspect` (AZ-595).
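Scenarios skip only when the replay fixture or the docker-inspect
env vars are absent altogether. When the fixture is present but the
expected data is missing (for example an empty `cache-self-check`
record), the tests fail loudly rather than silently skipping: the
"tests as gates" pattern.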
## Out of Scope (deferred)
* DNS blackhole defense-in-depth — owned by NFT-SEC-05 (AZ-437).
* Cache-poisoning safety — owned by NFT-SEC-01 (AZ-436).
* Stale-tile rejection on aged source tiles — owned by FT-N-05
(AZ-427).
* The fixture builder's actual `cache-self-check` FDR synthesis +
docker-inspect JSON capture — owned by AZ-595.
## Next Batch
Batch 83 candidates from `_docs/02_tasks/todo/` (20 remaining): AZ-422
(FT-P-17 + FT-N-06 mid-flight tiles, 3 cp), AZ-423 (FT-P-19 sat
reloc, 3 cp), AZ-427 (FT-N-05 stale-tile rejection, 2 cp). Topo-order
leader is AZ-422. Pick at next `/autodev` invocation.