[AZ-421] Batch 82: FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention

FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source, compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`). FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>` snapshots; assert `Internal == true` AND SUT attached only to e2e-net. The 0-egress semantic of AC-8.3 is enforced structurally. FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF parser, reject any file matching nav-camera raw pattern (5472x3648 or 880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB. Adds runner.helpers.tile_cache_inspector with five evaluators (manifest schema, offline mode, raw-frame detection, thumbnail budget, JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43 new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746). Scenarios skip locally when SITL replay fixture or docker-inspect env vars are missing; production hooks (cache-self-check FDR record, tile-load-rejected events, docker-inspect snapshots) are tracked outside this task. See _docs/03_implementation/batch_82_report.md + reviews/batch_82_review.md. Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 20:11:14 +00:00 · 2026-05-17 15:09:58 +03:00
parent b0296da911
commit 7d1288e4ba
9 changed files with 1693 additions and 3 deletions
@@ -0,0 +1,224 @@
+# Code Review Report
+
+**Batch**: 82 — AZ-421 (FT-P-15 + FT-P-16 + FT-P-18 cache/offline/no-raw-retention)
+**Date**: 2026-05-17
+**Verdict**: PASS_WITH_WARNINGS
+
+## Findings
+
+| #  | Severity | Category        | File:Line                                                                  | Title                                                          |
+|----|----------|-----------------|----------------------------------------------------------------------------|----------------------------------------------------------------|
+| 1  | Low      | Maintainability | `e2e/runner/helpers/tile_cache_inspector.py:120`                           | `_resolve_entry_id` falls back to `tile_matrix` before synth   |
+| 2  | Low      | Style           | `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139`                 | Multi-OR assert in synthesised-id test                         |
+| 3  | Low      | Scope           | `e2e/tests/positive/test_ft_p_16_offline_only.py:80`                       | Docker inspect JSON env-var indirection requires fixture support |
+
+### Finding Details
+
+**F1: `_resolve_entry_id` lookup order may surface `tile_matrix` as an id**
+(Low / Maintainability)
+
+- Location: `e2e/runner/helpers/tile_cache_inspector.py:120-124`
+- Description: When an entry lacks both `id` and `tile_id`, the
+  resolver falls through to `tile_matrix` before synthesising an
+  `entry_N` placeholder. This can produce duplicate "id" values if
+  several entries share a tile-matrix, which would in turn block
+  the `rejected_below_floor_ids` lookup from matching the right
+  entry.
+- Suggestion: leave as-is for now; the FDR schema commits to `id`
+  being present per `_docs/02_document/components/c6_*` contracts.
+  The fallback is a defensive read for malformed fixtures. If the
+  fixture builder ever produces entries without `id`, the AC-1
+  "missing_fields" check already fails first — the entry-id
+  resolution is then for diagnostic display only.
+- Task: AZ-421
+
+**F2: Multi-OR assert in synthesised-id test** (Low / Style)
+
+- Location: `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139`
+- Description:
+  `test_evaluate_manifest_schema_entry_id_falls_back_to_synthesised`
+  uses a 3-way OR assert because the `_resolve_entry_id` resolver
+  inspects `id` → `tile_id` → `tile_matrix` → `entry_N` and the
+  test entry happens to have `tile_matrix`. The assert is correct
+  (covers the actual lookup order) but reads ambiguously.
+- Suggestion: leave as-is; tightening the assert would force the
+  test to know the resolver's internal lookup chain, which is the
+  exact coupling code review usually flags. Documented here for
+  future cleanup if the resolver simplifies.
+- Task: AZ-421
+
+**F3: Docker inspect indirection requires fixture-builder support**
+(Low / Scope)
+
+- Location: `e2e/tests/positive/test_ft_p_16_offline_only.py:80-92`
+- Description: The FT-P-16 scenario reads
+  `docker network inspect e2e-net` + `docker inspect <sut-ctr>` from
+  JSON files (env vars `DOCKER_NETWORK_INSPECT_PATH` /
+  `DOCKER_CONTAINER_INSPECT_PATH`) rather than calling `docker`
+  directly. This is intentional — the e2e-runner container does not
+  have docker-socket access, and the fixture builder must snapshot
+  inspect output before the runner starts.
+- Suggestion: the fixture builder (AZ-595) needs a thin wrapper
+  that produces both JSON files at the start of every scenario run
+  that needs them. Tracked outside this batch.
+- Task: AZ-421
+
+## Findings Sweep
+
+### Phase 1 — Context Loading
+
+Read AZ-421 spec, blackbox-tests § FT-P-15/16/18, module-layout (confirmed
+`blackbox_tests` owns `e2e/**`), conftest (fixture surface), existing FDR
+reader, and recent helpers as templates (`gcs_telemetry_evaluator.py`,
+`ap_contract_evaluator.py`).
+
+### Phase 2 — Spec Compliance (AC trace)
+
+* **AC-1 (FT-P-15 manifest schema completeness)** ✓
+  - Scenario: `test_ft_p_15_cache_schema` walks FDR for
+    `cache-self-check` records, builds the manifest entry list, calls
+    `evaluate_manifest_schema`, asserts `report.passes`.
+  - Pure-logic: 12 `test_evaluate_manifest_schema_*` unit tests
+    covering full-fields-pass, missing-fields-fail (single + multi
+    + ordered), at-floor exactly, empty list, non-numeric m/px,
+    invalid floor → ValueError, custom required fields.
+
+* **AC-2 (FT-P-15 m/px floor ≥ 0.5)** ✓
+  - Covered by `ManifestEntryReport.passes_floor` +
+    `ManifestSchemaReport.passes` (rejects below-floor entries
+    unless `tile_load_rejected_ids` includes them).
+  - Pure-logic: below-floor-no-rejection-fails,
+    below-floor-with-rejection-passes, at-floor-exactly-passes.
+
+* **AC-3 (FT-P-16 offline operation)** ✓
+  - Scenario: `test_ft_p_16_offline_only` loads two docker-inspect
+    JSON files, calls `evaluate_offline_mode`, asserts
+    `report.passes`.
+  - Pure-logic: 7 `test_evaluate_offline_mode_*` unit tests
+    (passes, non-internal-fails, extra-network-fails,
+    no-networks-fails, missing-Internal-key-fails,
+    non-bool-Internal-fails, custom-expected-network-passes).
+
+* **AC-4 (FT-P-18 no raw-frame retention)** ✓
+  - Scenario: `test_ft_p_18_no_raw_retention` walks FDR + tile-cache
+    via `walk_files`, probes JPEG dimensions, calls
+    `detect_raw_frames`, asserts `report.passes`.
+  - Pure-logic: 9 `test_detect_raw_frames_*` + 5
+    `test_probe_jpeg_dimensions_*` + 3 `test_walk_files_*` =
+    17 unit tests.
+
+* **AC-5 (FT-P-18 thumbnail budget < 1 GB / 8 h)** ✓
+  - Scenario: computes `thumbnail_log_size_bytes` from the walk +
+    replay duration from FDR `monotonic_ms` span; calls
+    `evaluate_thumbnail_budget`; asserts `report.passes`.
+  - Pure-logic: 7 `test_evaluate_thumbnail_budget_*` unit tests
+    (under-budget, over-budget, extrapolation math,
+    zero-duration-fails, negative-size raises, invalid budget
+    raises, custom-budget-passes).
+
+* **AC-6 (parameterisation)** ✓
+  - `pytest --collect-only` confirms 6 param IDs per scenario
+    (`[ardupilot|inav]-[okvis2|klt_ransac|vins_mono]`). All three
+    tests accept `fc_adapter` + `vio_strategy` fixtures.
+
+### Phase 3 — Code Quality
+
+* SRP: `tile_cache_inspector.py` carries five evaluators
+  (`evaluate_manifest_schema`, `evaluate_offline_mode`,
+  `detect_raw_frames`, `evaluate_thumbnail_budget`,
+  `probe_jpeg_dimensions`) + one walker (`walk_files`). Each
+  evaluator handles one AC family of one sub-scenario; the JPEG
+  dimension probe is co-located because it pairs structurally with
+  `detect_raw_frames`. ✓
+* Naming: `m_per_px_floor`, `observed_size_bytes`,
+  `extrapolated_8h_size_bytes`, `nav_camera_raw_dimensions` —
+  units in names. ✓
+* AAA pattern in unit tests with `# Arrange / # Act / # Assert`
+  comments per coding rule. ✓
+* No `try/except` swallows errors. `probe_jpeg_dimensions` catches
+  `OSError` and returns `None` — documented as "the file is not a
+  JPEG, the SOF marker is not present, or the file is truncated".
+  Callers of `probe_jpeg_dimensions` correctly treat `None` as
+  "dimension unknown" rather than silently zero. ✓
+* No code comments narrating mechanics — only docstrings + one
+  one-liner on the SOF marker byte map (the byte list is part of
+  the JPEG standard; the link inside the docstring isn't needed
+  given the standard reference is universally known). ✓
+* Function lengths: longest is `probe_jpeg_dimensions` at ~30 lines
+  including docstring; all under the 50-line / cyclomatic-10
+  threshold. ✓
+
+### Phase 4 — Security Quick-Scan
+
+* No SQL, no `shell=True`, no `eval`/`exec`. ✓
+* No hardcoded secrets / API keys. ✓
+* The JPEG SOF parser does bounded reads (every `read` checks
+  return-length); a malformed JPEG cannot cause unbounded memory
+  consumption. ✓
+* `evaluate_offline_mode` validates `Internal` is a `bool` (not
+  truthy-coerced) — a string "true" or integer 1 in the inspect
+  JSON will not silently pass the gate. ✓
+* `evaluate_thumbnail_budget` rejects negative size and
+  zero-or-negative budget. ✓
+
+### Phase 5 — Performance
+
+* `evaluate_manifest_schema`: O(N entries × F fields) — typically
+  <100 entries × 7 fields, trivial. ✓
+* `detect_raw_frames`: O(N files), single pass; extension check
+  uses a tuple membership test (O(K) where K=8). ✓
+* `evaluate_offline_mode`: O(M networks) where M is usually 1. ✓
+* `evaluate_thumbnail_budget`: O(1). ✓
+* `probe_jpeg_dimensions`: reads only segment headers (≤16 bytes
+  per segment hop) until SOF; even a multi-MB JPEG terminates
+  in <30 hops. ✓
+* `walk_files`: O(total files under the roots), standard rglob
+  iteration; no in-memory list buffering. ✓
+
+### Phase 6 — Cross-Task Consistency (single-task batch)
+
+* Naming follows the recent `gcs_telemetry_evaluator` / `*_report`
+  / `passes` property convention. ✓
+* FDR record types declared as module-level constants
+  (`CACHE_SELF_CHECK_FDR_KIND`, `TILE_LOAD_REJECTED_FDR_KIND`)
+  mirrors the b81 pattern (`HINT_FDR_KIND`,
+  `ANCHOR_SEARCH_REGION_FDR_KIND`). ✓
+* Skip-rule pattern (`if not sitl_replay_ready: pytest.skip(...)`)
+  is consistent with the 18 other scenario tests in `tests/positive`. ✓
+
+### Phase 7 — Architecture Compliance
+
+`_docs/02_document/module-layout.md` declares `blackbox_tests` as the
+sole owner of `e2e/**`.
+
+1. **Layer direction**: every import in the six new/edited files
+   resolves to `runner.helpers.*`, `runner.helpers.fdr_reader`,
+   `runner.helpers.tile_cache_inspector`, stdlib, or pytest. No
+   `src/gps_denied_onboard` imports. ✓ (verified by
+   `test_no_sut_imports.py`).
+2. **Public API respect**: scenario tests import only top-level
+   module symbols from `runner.helpers.*` (no `_private`). ✓
+3. **No new cyclic deps**: `tile_cache_inspector` is a leaf consumed
+   by 3 scenario tests + 1 unit-test module; no back-edges. ✓
+4. **Duplicate symbols**: `probe_jpeg_dimensions` is the first JPEG
+   header parser in the e2e tree. If a future scenario needs the
+   same probe (e.g., NFT-LIM-02 size budgeting), promote to a
+   shared `runner/helpers/image_probe.py`. Tracked, not flagged.
+5. **Cross-cutting concerns**: file-system walks (`walk_files`)
+   are local to `tile_cache_inspector` for now. If another scenario
+   needs filesystem walks for different reasons (e.g., FT-P-17
+   tile-output verification), promote. ✓
+
+## Regression Gate
+
+Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s**, single run,
+no flakes. Up from 746 (batch 81) by +47:
+
+* +43 in new `test_tile_cache_inspector.py` (12 manifest, 7 offline,
+  9 raw-frames, 7 thumbnail-budget, 5 JPEG-probe, 3 walk).
+* +3 new entries in `test_directory_layout.py` (3 scenario test paths).
+* +1 from a `test_no_sut_imports.py` walk that now covers the new
+  helper.
+
+No tests removed. Scenario tests skip locally because
+`E2E_SITL_REPLAY_DIR` is unset (intended docker-vs-host boundary).