# Code Review Report **Batch**: 82 — AZ-421 (FT-P-15 + FT-P-16 + FT-P-18 cache/offline/no-raw-retention) **Date**: 2026-05-17 **Verdict**: PASS_WITH_WARNINGS ## Findings | # | Severity | Category | File:Line | Title | |----|----------|-----------------|----------------------------------------------------------------------------|----------------------------------------------------------------| | 1 | Low | Maintainability | `e2e/runner/helpers/tile_cache_inspector.py:120` | `_resolve_entry_id` falls back to `tile_matrix` before synth | | 2 | Low | Style | `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139` | Multi-OR assert in synthesised-id test | | 3 | Low | Scope | `e2e/tests/positive/test_ft_p_16_offline_only.py:80` | Docker inspect JSON env-var indirection requires fixture support | ### Finding Details **F1: `_resolve_entry_id` lookup order may surface `tile_matrix` as an id** (Low / Maintainability) - Location: `e2e/runner/helpers/tile_cache_inspector.py:120-124` - Description: When an entry lacks both `id` and `tile_id`, the resolver falls through to `tile_matrix` before synthesising an `entry_N` placeholder. This can produce duplicate "id" values if several entries share a tile-matrix, which would in turn block the `rejected_below_floor_ids` lookup from matching the right entry. - Suggestion: leave as-is for now; the FDR schema commits to `id` being present per `_docs/02_document/components/c6_*` contracts. The fallback is a defensive read for malformed fixtures. If the fixture builder ever produces entries without `id`, the AC-1 "missing_fields" check already fails first — the entry-id resolution is then for diagnostic display only. - Task: AZ-421 **F2: Multi-OR assert in synthesised-id test** (Low / Style) - Location: `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139` - Description: `test_evaluate_manifest_schema_entry_id_falls_back_to_synthesised` uses a 3-way OR assert because the `_resolve_entry_id` resolver inspects `id` → `tile_id` → `tile_matrix` → `entry_N` and the test entry happens to have `tile_matrix`. The assert is correct (covers the actual lookup order) but reads ambiguously. - Suggestion: leave as-is; tightening the assert would force the test to know the resolver's internal lookup chain, which is the exact coupling code review usually flags. Documented here for future cleanup if the resolver simplifies. - Task: AZ-421 **F3: Docker inspect indirection requires fixture-builder support** (Low / Scope) - Location: `e2e/tests/positive/test_ft_p_16_offline_only.py:80-92` - Description: The FT-P-16 scenario reads `docker network inspect e2e-net` + `docker inspect ` from JSON files (env vars `DOCKER_NETWORK_INSPECT_PATH` / `DOCKER_CONTAINER_INSPECT_PATH`) rather than calling `docker` directly. This is intentional — the e2e-runner container does not have docker-socket access, and the fixture builder must snapshot inspect output before the runner starts. - Suggestion: the fixture builder (AZ-595) needs a thin wrapper that produces both JSON files at the start of every scenario run that needs them. Tracked outside this batch. - Task: AZ-421 ## Findings Sweep ### Phase 1 — Context Loading Read AZ-421 spec, blackbox-tests § FT-P-15/16/18, module-layout (confirmed `blackbox_tests` owns `e2e/**`), conftest (fixture surface), existing FDR reader, and recent helpers as templates (`gcs_telemetry_evaluator.py`, `ap_contract_evaluator.py`). ### Phase 2 — Spec Compliance (AC trace) * **AC-1 (FT-P-15 manifest schema completeness)** ✓ - Scenario: `test_ft_p_15_cache_schema` walks FDR for `cache-self-check` records, builds the manifest entry list, calls `evaluate_manifest_schema`, asserts `report.passes`. - Pure-logic: 12 `test_evaluate_manifest_schema_*` unit tests covering full-fields-pass, missing-fields-fail (single + multi + ordered), at-floor exactly, empty list, non-numeric m/px, invalid floor → ValueError, custom required fields. * **AC-2 (FT-P-15 m/px floor ≥ 0.5)** ✓ - Covered by `ManifestEntryReport.passes_floor` + `ManifestSchemaReport.passes` (rejects below-floor entries unless `tile_load_rejected_ids` includes them). - Pure-logic: below-floor-no-rejection-fails, below-floor-with-rejection-passes, at-floor-exactly-passes. * **AC-3 (FT-P-16 offline operation)** ✓ - Scenario: `test_ft_p_16_offline_only` loads two docker-inspect JSON files, calls `evaluate_offline_mode`, asserts `report.passes`. - Pure-logic: 7 `test_evaluate_offline_mode_*` unit tests (passes, non-internal-fails, extra-network-fails, no-networks-fails, missing-Internal-key-fails, non-bool-Internal-fails, custom-expected-network-passes). * **AC-4 (FT-P-18 no raw-frame retention)** ✓ - Scenario: `test_ft_p_18_no_raw_retention` walks FDR + tile-cache via `walk_files`, probes JPEG dimensions, calls `detect_raw_frames`, asserts `report.passes`. - Pure-logic: 9 `test_detect_raw_frames_*` + 5 `test_probe_jpeg_dimensions_*` + 3 `test_walk_files_*` = 17 unit tests. * **AC-5 (FT-P-18 thumbnail budget < 1 GB / 8 h)** ✓ - Scenario: computes `thumbnail_log_size_bytes` from the walk + replay duration from FDR `monotonic_ms` span; calls `evaluate_thumbnail_budget`; asserts `report.passes`. - Pure-logic: 7 `test_evaluate_thumbnail_budget_*` unit tests (under-budget, over-budget, extrapolation math, zero-duration-fails, negative-size raises, invalid budget raises, custom-budget-passes). * **AC-6 (parameterisation)** ✓ - `pytest --collect-only` confirms 6 param IDs per scenario (`[ardupilot|inav]-[okvis2|klt_ransac|vins_mono]`). All three tests accept `fc_adapter` + `vio_strategy` fixtures. ### Phase 3 — Code Quality * SRP: `tile_cache_inspector.py` carries five evaluators (`evaluate_manifest_schema`, `evaluate_offline_mode`, `detect_raw_frames`, `evaluate_thumbnail_budget`, `probe_jpeg_dimensions`) + one walker (`walk_files`). Each evaluator handles one AC family of one sub-scenario; the JPEG dimension probe is co-located because it pairs structurally with `detect_raw_frames`. ✓ * Naming: `m_per_px_floor`, `observed_size_bytes`, `extrapolated_8h_size_bytes`, `nav_camera_raw_dimensions` — units in names. ✓ * AAA pattern in unit tests with `# Arrange / # Act / # Assert` comments per coding rule. ✓ * No `try/except` swallows errors. `probe_jpeg_dimensions` catches `OSError` and returns `None` — documented as "the file is not a JPEG, the SOF marker is not present, or the file is truncated". Callers of `probe_jpeg_dimensions` correctly treat `None` as "dimension unknown" rather than silently zero. ✓ * No code comments narrating mechanics — only docstrings + one one-liner on the SOF marker byte map (the byte list is part of the JPEG standard; the link inside the docstring isn't needed given the standard reference is universally known). ✓ * Function lengths: longest is `probe_jpeg_dimensions` at ~30 lines including docstring; all under the 50-line / cyclomatic-10 threshold. ✓ ### Phase 4 — Security Quick-Scan * No SQL, no `shell=True`, no `eval`/`exec`. ✓ * No hardcoded secrets / API keys. ✓ * The JPEG SOF parser does bounded reads (every `read` checks return-length); a malformed JPEG cannot cause unbounded memory consumption. ✓ * `evaluate_offline_mode` validates `Internal` is a `bool` (not truthy-coerced) — a string "true" or integer 1 in the inspect JSON will not silently pass the gate. ✓ * `evaluate_thumbnail_budget` rejects negative size and zero-or-negative budget. ✓ ### Phase 5 — Performance * `evaluate_manifest_schema`: O(N entries × F fields) — typically <100 entries × 7 fields, trivial. ✓ * `detect_raw_frames`: O(N files), single pass; extension check uses a tuple membership test (O(K) where K=8). ✓ * `evaluate_offline_mode`: O(M networks) where M is usually 1. ✓ * `evaluate_thumbnail_budget`: O(1). ✓ * `probe_jpeg_dimensions`: reads only segment headers (≤16 bytes per segment hop) until SOF; even a multi-MB JPEG terminates in <30 hops. ✓ * `walk_files`: O(total files under the roots), standard rglob iteration; no in-memory list buffering. ✓ ### Phase 6 — Cross-Task Consistency (single-task batch) * Naming follows the recent `gcs_telemetry_evaluator` / `*_report` / `passes` property convention. ✓ * FDR record types declared as module-level constants (`CACHE_SELF_CHECK_FDR_KIND`, `TILE_LOAD_REJECTED_FDR_KIND`) mirrors the b81 pattern (`HINT_FDR_KIND`, `ANCHOR_SEARCH_REGION_FDR_KIND`). ✓ * Skip-rule pattern (`if not sitl_replay_ready: pytest.skip(...)`) is consistent with the 18 other scenario tests in `tests/positive`. ✓ ### Phase 7 — Architecture Compliance `_docs/02_document/module-layout.md` declares `blackbox_tests` as the sole owner of `e2e/**`. 1. **Layer direction**: every import in the six new/edited files resolves to `runner.helpers.*`, `runner.helpers.fdr_reader`, `runner.helpers.tile_cache_inspector`, stdlib, or pytest. No `src/gps_denied_onboard` imports. ✓ (verified by `test_no_sut_imports.py`). 2. **Public API respect**: scenario tests import only top-level module symbols from `runner.helpers.*` (no `_private`). ✓ 3. **No new cyclic deps**: `tile_cache_inspector` is a leaf consumed by 3 scenario tests + 1 unit-test module; no back-edges. ✓ 4. **Duplicate symbols**: `probe_jpeg_dimensions` is the first JPEG header parser in the e2e tree. If a future scenario needs the same probe (e.g., NFT-LIM-02 size budgeting), promote to a shared `runner/helpers/image_probe.py`. Tracked, not flagged. 5. **Cross-cutting concerns**: file-system walks (`walk_files`) are local to `tile_cache_inspector` for now. If another scenario needs filesystem walks for different reasons (e.g., FT-P-17 tile-output verification), promote. ✓ ## Regression Gate Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s**, single run, no flakes. Up from 746 (batch 81) by +47: * +43 in new `test_tile_cache_inspector.py` (12 manifest, 7 offline, 9 raw-frames, 7 thumbnail-budget, 5 JPEG-probe, 3 walk). * +3 new entries in `test_directory_layout.py` (3 scenario test paths). * +1 from a `test_no_sut_imports.py` walk that now covers the new helper. No tests removed. Scenario tests skip locally because `E2E_SITL_REPLAY_DIR` is unset (intended docker-vs-host boundary).