[AZ-421] Batch 82: FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention

FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest
entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source,
compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`).

FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>`
snapshots; assert `Internal == true` AND SUT attached only to e2e-net.
The 0-egress semantic of AC-8.3 is enforced structurally.

FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF
parser, reject any file matching nav-camera raw pattern (5472x3648 or
880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB.

Adds runner.helpers.tile_cache_inspector with five evaluators
(manifest schema, offline mode, raw-frame detection, thumbnail budget,
JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43
new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746).
Scenarios skip locally when SITL replay fixture or docker-inspect
env vars are missing; production hooks (cache-self-check FDR record,
tile-load-rejected events, docker-inspect snapshots) are tracked
outside this task.

See _docs/03_implementation/batch_82_report.md +
reviews/batch_82_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-17 15:09:58 +03:00
parent b0296da911
commit 7d1288e4ba
9 changed files with 1693 additions and 3 deletions
@@ -0,0 +1,224 @@
# Code Review Report
**Batch**: 82 — AZ-421 (FT-P-15 + FT-P-16 + FT-P-18 cache/offline/no-raw-retention)
**Date**: 2026-05-17
**Verdict**: PASS_WITH_WARNINGS
## Findings
| # | Severity | Category | File:Line | Title |
|----|----------|-----------------|----------------------------------------------------------------------------|----------------------------------------------------------------|
| 1 | Low | Maintainability | `e2e/runner/helpers/tile_cache_inspector.py:120` | `_resolve_entry_id` falls back to `tile_matrix` before synth |
| 2 | Low | Style | `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139` | Multi-OR assert in synthesised-id test |
| 3 | Low | Scope | `e2e/tests/positive/test_ft_p_16_offline_only.py:80` | Docker inspect JSON env-var indirection requires fixture support |
### Finding Details
**F1: `_resolve_entry_id` lookup order may surface `tile_matrix` as an id**
(Low / Maintainability)
- Location: `e2e/runner/helpers/tile_cache_inspector.py:120-124`
- Description: When an entry lacks both `id` and `tile_id`, the
resolver falls through to `tile_matrix` before synthesising an
`entry_N` placeholder. This can produce duplicate "id" values if
several entries share a tile-matrix, which would in turn block
the `rejected_below_floor_ids` lookup from matching the right
entry.
- Suggestion: leave as-is for now; the FDR schema commits to `id`
being present per `_docs/02_document/components/c6_*` contracts.
The fallback is a defensive read for malformed fixtures. If the
fixture builder ever produces entries without `id`, the AC-1
"missing_fields" check already fails first — the entry-id
resolution is then for diagnostic display only.
- Task: AZ-421
**F2: Multi-OR assert in synthesised-id test** (Low / Style)
- Location: `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139`
- Description:
`test_evaluate_manifest_schema_entry_id_falls_back_to_synthesised`
uses a 3-way OR assert because the `_resolve_entry_id` resolver
inspects `id``tile_id``tile_matrix``entry_N` and the
test entry happens to have `tile_matrix`. The assert is correct
(covers the actual lookup order) but reads ambiguously.
- Suggestion: leave as-is; tightening the assert would force the
test to know the resolver's internal lookup chain, which is the
exact coupling code review usually flags. Documented here for
future cleanup if the resolver simplifies.
- Task: AZ-421
**F3: Docker inspect indirection requires fixture-builder support**
(Low / Scope)
- Location: `e2e/tests/positive/test_ft_p_16_offline_only.py:80-92`
- Description: The FT-P-16 scenario reads
`docker network inspect e2e-net` + `docker inspect <sut-ctr>` from
JSON files (env vars `DOCKER_NETWORK_INSPECT_PATH` /
`DOCKER_CONTAINER_INSPECT_PATH`) rather than calling `docker`
directly. This is intentional — the e2e-runner container does not
have docker-socket access, and the fixture builder must snapshot
inspect output before the runner starts.
- Suggestion: the fixture builder (AZ-595) needs a thin wrapper
that produces both JSON files at the start of every scenario run
that needs them. Tracked outside this batch.
- Task: AZ-421
## Findings Sweep
### Phase 1 — Context Loading
Read AZ-421 spec, blackbox-tests § FT-P-15/16/18, module-layout (confirmed
`blackbox_tests` owns `e2e/**`), conftest (fixture surface), existing FDR
reader, and recent helpers as templates (`gcs_telemetry_evaluator.py`,
`ap_contract_evaluator.py`).
### Phase 2 — Spec Compliance (AC trace)
* **AC-1 (FT-P-15 manifest schema completeness)** ✓
- Scenario: `test_ft_p_15_cache_schema` walks FDR for
`cache-self-check` records, builds the manifest entry list, calls
`evaluate_manifest_schema`, asserts `report.passes`.
- Pure-logic: 12 `test_evaluate_manifest_schema_*` unit tests
covering full-fields-pass, missing-fields-fail (single + multi
+ ordered), at-floor exactly, empty list, non-numeric m/px,
invalid floor → ValueError, custom required fields.
* **AC-2 (FT-P-15 m/px floor ≥ 0.5)** ✓
- Covered by `ManifestEntryReport.passes_floor` +
`ManifestSchemaReport.passes` (rejects below-floor entries
unless `tile_load_rejected_ids` includes them).
- Pure-logic: below-floor-no-rejection-fails,
below-floor-with-rejection-passes, at-floor-exactly-passes.
* **AC-3 (FT-P-16 offline operation)** ✓
- Scenario: `test_ft_p_16_offline_only` loads two docker-inspect
JSON files, calls `evaluate_offline_mode`, asserts
`report.passes`.
- Pure-logic: 7 `test_evaluate_offline_mode_*` unit tests
(passes, non-internal-fails, extra-network-fails,
no-networks-fails, missing-Internal-key-fails,
non-bool-Internal-fails, custom-expected-network-passes).
* **AC-4 (FT-P-18 no raw-frame retention)** ✓
- Scenario: `test_ft_p_18_no_raw_retention` walks FDR + tile-cache
via `walk_files`, probes JPEG dimensions, calls
`detect_raw_frames`, asserts `report.passes`.
- Pure-logic: 9 `test_detect_raw_frames_*` + 5
`test_probe_jpeg_dimensions_*` + 3 `test_walk_files_*` =
17 unit tests.
* **AC-5 (FT-P-18 thumbnail budget < 1 GB / 8 h)** ✓
- Scenario: computes `thumbnail_log_size_bytes` from the walk +
replay duration from FDR `monotonic_ms` span; calls
`evaluate_thumbnail_budget`; asserts `report.passes`.
- Pure-logic: 7 `test_evaluate_thumbnail_budget_*` unit tests
(under-budget, over-budget, extrapolation math,
zero-duration-fails, negative-size raises, invalid budget
raises, custom-budget-passes).
* **AC-6 (parameterisation)** ✓
- `pytest --collect-only` confirms 6 param IDs per scenario
(`[ardupilot|inav]-[okvis2|klt_ransac|vins_mono]`). All three
tests accept `fc_adapter` + `vio_strategy` fixtures.
### Phase 3 — Code Quality
* SRP: `tile_cache_inspector.py` carries five evaluators
(`evaluate_manifest_schema`, `evaluate_offline_mode`,
`detect_raw_frames`, `evaluate_thumbnail_budget`,
`probe_jpeg_dimensions`) + one walker (`walk_files`). Each
evaluator handles one AC family of one sub-scenario; the JPEG
dimension probe is co-located because it pairs structurally with
`detect_raw_frames`. ✓
* Naming: `m_per_px_floor`, `observed_size_bytes`,
`extrapolated_8h_size_bytes`, `nav_camera_raw_dimensions`
units in names. ✓
* AAA pattern in unit tests with `# Arrange / # Act / # Assert`
comments per coding rule. ✓
* No `try/except` swallows errors. `probe_jpeg_dimensions` catches
`OSError` and returns `None` — documented as "the file is not a
JPEG, the SOF marker is not present, or the file is truncated".
Callers of `probe_jpeg_dimensions` correctly treat `None` as
"dimension unknown" rather than silently zero. ✓
* No code comments narrating mechanics — only docstrings + one
one-liner on the SOF marker byte map (the byte list is part of
the JPEG standard; the link inside the docstring isn't needed
given the standard reference is universally known). ✓
* Function lengths: longest is `probe_jpeg_dimensions` at ~30 lines
including docstring; all under the 50-line / cyclomatic-10
threshold. ✓
### Phase 4 — Security Quick-Scan
* No SQL, no `shell=True`, no `eval`/`exec`. ✓
* No hardcoded secrets / API keys. ✓
* The JPEG SOF parser does bounded reads (every `read` checks
return-length); a malformed JPEG cannot cause unbounded memory
consumption. ✓
* `evaluate_offline_mode` validates `Internal` is a `bool` (not
truthy-coerced) — a string "true" or integer 1 in the inspect
JSON will not silently pass the gate. ✓
* `evaluate_thumbnail_budget` rejects negative size and
zero-or-negative budget. ✓
### Phase 5 — Performance
* `evaluate_manifest_schema`: O(N entries × F fields) — typically
<100 entries × 7 fields, trivial. ✓
* `detect_raw_frames`: O(N files), single pass; extension check
uses a tuple membership test (O(K) where K=8). ✓
* `evaluate_offline_mode`: O(M networks) where M is usually 1. ✓
* `evaluate_thumbnail_budget`: O(1). ✓
* `probe_jpeg_dimensions`: reads only segment headers (≤16 bytes
per segment hop) until SOF; even a multi-MB JPEG terminates
in <30 hops. ✓
* `walk_files`: O(total files under the roots), standard rglob
iteration; no in-memory list buffering. ✓
### Phase 6 — Cross-Task Consistency (single-task batch)
* Naming follows the recent `gcs_telemetry_evaluator` / `*_report`
/ `passes` property convention. ✓
* FDR record types declared as module-level constants
(`CACHE_SELF_CHECK_FDR_KIND`, `TILE_LOAD_REJECTED_FDR_KIND`)
mirrors the b81 pattern (`HINT_FDR_KIND`,
`ANCHOR_SEARCH_REGION_FDR_KIND`). ✓
* Skip-rule pattern (`if not sitl_replay_ready: pytest.skip(...)`)
is consistent with the 18 other scenario tests in `tests/positive`. ✓
### Phase 7 — Architecture Compliance
`_docs/02_document/module-layout.md` declares `blackbox_tests` as the
sole owner of `e2e/**`.
1. **Layer direction**: every import in the six new/edited files
resolves to `runner.helpers.*`, `runner.helpers.fdr_reader`,
`runner.helpers.tile_cache_inspector`, stdlib, or pytest. No
`src/gps_denied_onboard` imports. ✓ (verified by
`test_no_sut_imports.py`).
2. **Public API respect**: scenario tests import only top-level
module symbols from `runner.helpers.*` (no `_private`). ✓
3. **No new cyclic deps**: `tile_cache_inspector` is a leaf consumed
by 3 scenario tests + 1 unit-test module; no back-edges. ✓
4. **Duplicate symbols**: `probe_jpeg_dimensions` is the first JPEG
header parser in the e2e tree. If a future scenario needs the
same probe (e.g., NFT-LIM-02 size budgeting), promote to a
shared `runner/helpers/image_probe.py`. Tracked, not flagged.
5. **Cross-cutting concerns**: file-system walks (`walk_files`)
are local to `tile_cache_inspector` for now. If another scenario
needs filesystem walks for different reasons (e.g., FT-P-17
tile-output verification), promote. ✓
## Regression Gate
Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s**, single run,
no flakes. Up from 746 (batch 81) by +47:
* +43 in new `test_tile_cache_inspector.py` (12 manifest, 7 offline,
9 raw-frames, 7 thumbnail-budget, 5 JPEG-probe, 3 walk).
* +3 new entries in `test_directory_layout.py` (3 scenario test paths).
* +1 from a `test_no_sut_imports.py` walk that now covers the new
helper.
No tests removed. Scenario tests skip locally because
`E2E_SITL_REPLAY_DIR` is unset (intended docker-vs-host boundary).