[AZ-421] Batch 82: FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention

FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest
entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source,
compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`).

FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>`
snapshots; assert `Internal == true` AND SUT attached only to e2e-net.
The 0-egress semantic of AC-8.3 is enforced structurally.

FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF
parser, reject any file matching nav-camera raw pattern (5472x3648 or
880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB.

Adds runner.helpers.tile_cache_inspector with five evaluators
(manifest schema, offline mode, raw-frame detection, thumbnail budget,
JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43
new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746).
Scenarios skip locally when SITL replay fixture or docker-inspect
env vars are missing; production hooks (cache-self-check FDR record,
tile-load-rejected events, docker-inspect snapshots) are tracked
outside this task.

See _docs/03_implementation/batch_82_report.md +
reviews/batch_82_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 15:09:58 +03:00
parent b0296da911
commit 7d1288e4ba
9 changed files with 1693 additions and 3 deletions
+187
@@ -0,0 +1,187 @@
# Batch 82 Report — FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention
**Batch**: 82
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-421 (3 cp) — single task covering 3 sub-scenarios
**Cycle**: 1
**Verdict**: COMPLETE — PASS_WITH_WARNINGS (self-reviewed; see `reviews/batch_82_review.md`)
## Summary
Implements three storage / cache compliance scenarios that share the
`tile-cache-fixture` + FDR-archive observation surface:
* **FT-P-15** — Tile manifest schema completeness + 0.5 m/px floor
(AC-8.1). Reads FDR `cache-self-check` record + `tile-load-rejected`
events, validates every entry has CRS, tile_matrix, dimension,
m_per_px, capture_date, source, compression; entries below floor
must be explicitly rejected.
* **FT-P-16** — Offline-only operation (AC-8.3 / RESTRICT-SAT-1).
Reads `docker network inspect e2e-net` + `docker inspect <sut>`
JSON snapshots; asserts `e2e-net.Internal == true` AND the SUT is
attached to that network only. The 0-egress semantic is enforced
structurally — no other network is reachable.
* **FT-P-18** — No raw nav/AI-camera frame retention (AC-8.5). Walks
FDR + tile-cache, probes JPEG dimensions, rejects any file whose
extension + dimensions match the nav-camera raw pattern
(5472×3648 or 880×720). Extrapolates thumbnail-log size to 8 h
and asserts < 1 GB.
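The 8 h extrapolation in FT-P-18 is a linear scaling of the observed thumbnail-log size by the replay duration. The block below is a minimal sketch of that arithmetic, not the helper itself: `sketch_thumbnail_budget` and `BudgetSketch` are hypothetical names, and reading "1 GB" as 1 GiB (`1024**3` bytes) is an assumption.

```python
from dataclasses import dataclass

GIB = 1024**3  # assumption: the "1 GB" budget is read as 1 GiB


@dataclass(frozen=True)
class BudgetSketch:
    extrapolated_8h_size_bytes: float
    passes: bool


def sketch_thumbnail_budget(size_bytes, duration_h, max_bytes_per_8h=GIB):
    """Scale the observed size linearly to an 8 h window and compare to the budget."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if max_bytes_per_8h <= 0:
        raise ValueError("max_bytes_per_8h must be positive")
    if duration_h <= 0:
        # No usable duration means the rate is undefined, so report a failure.
        return BudgetSketch(float("inf"), False)
    extrapolated = size_bytes * 8.0 / duration_h
    return BudgetSketch(extrapolated, extrapolated < max_bytes_per_8h)
```

With these assumptions, 100 MiB observed over 1 h extrapolates to 800 MiB and passes, while 200 MiB over 1 h extrapolates to 1.6 GiB and fails.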
### AZ-421 — FT-P-15 + FT-P-16 + FT-P-18 (3 cp)
* **`e2e/runner/helpers/tile_cache_inspector.py`** (new, ~370 lines):
pure-logic evaluators sourced from FDR / docker-inspect /
filesystem walks.
* `evaluate_manifest_schema(entries, *, tile_load_rejected_ids,
m_per_px_floor)` → `ManifestSchemaReport` (AC-1, AC-2).
* `evaluate_offline_mode(network_inspect, container_inspect)` →
`OfflineModeReport` (AC-3).
* `detect_raw_frames(file_specs, *, raw_dimensions,
decoded_dimensions, raw_extensions)` → `RawFrameDetectionReport`
(AC-4).
* `evaluate_thumbnail_budget(observed_size_bytes, observed_duration_h)` →
`ThumbnailLogBudgetReport` (AC-5).
* `walk_files(*roots)` — convenience recursive walker.
* `probe_jpeg_dimensions(path)` → `(w, h)` via SOF marker parse,
stdlib-only.
* Module-level constants: `CACHE_SELF_CHECK_FDR_KIND`,
`TILE_LOAD_REJECTED_FDR_KIND`, `MANIFEST_REQUIRED_FIELDS`,
`MANIFEST_M_PER_PX_FLOOR`, `NAV_CAMERA_RAW_DIMENSIONS`,
`THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H`.
* **`e2e/tests/positive/test_ft_p_15_cache_schema.py`** (new, ~115 lines):
FT-P-15 scenario. Skips on missing fixture; fails loudly on empty
`cache-self-check` record. `traces_to(AC-8.1,AC-1,AC-2,AC-6)`.
* **`e2e/tests/positive/test_ft_p_16_offline_only.py`** (new, ~115 lines):
FT-P-16 scenario. Skips on missing `DOCKER_NETWORK_INSPECT_PATH` /
`DOCKER_CONTAINER_INSPECT_PATH` env vars (fixture builder
pre-snapshots these because the runner has no docker-socket access).
`traces_to(AC-8.3,AC-3,AC-6,RESTRICT-SAT-1)`.
* **`e2e/tests/positive/test_ft_p_18_no_raw_retention.py`** (new,
~125 lines): FT-P-18 scenario. Walks FDR + tile-cache once;
probes JPEGs; computes replay duration from FDR `monotonic_ms`
span; evaluates AC-4 + AC-5. `traces_to(AC-8.5,AC-4,AC-5,AC-6)`.
* **`e2e/_unit_tests/helpers/test_tile_cache_inspector.py`** (new,
43 tests): pure-logic coverage for every evaluator + walker +
probe.
* **`e2e/_unit_tests/test_directory_layout.py`** (edited): registers
`runner/helpers/tile_cache_inspector.py` and three new scenario
test paths.
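As a rough illustration of the AC-1 / AC-2 logic that `evaluate_manifest_schema` is described as covering, the sketch below checks required fields and the m/px floor over plain dicts. It is an assumption-laden stand-in: `sketch_manifest_check`, its tuple return value, and the `entry_N` fallback id are hypothetical and do not reproduce the real `ManifestSchemaReport`.

```python
from numbers import Real

# The seven required fields named in the FT-P-15 description.
REQUIRED = ("crs", "tile_matrix", "dimension", "m_per_px",
            "capture_date", "source", "compression")


def sketch_manifest_check(entries, rejected_ids=(), floor=0.5):
    """Return (passes, problems) for a list of manifest-entry dicts."""
    problems = []
    for index, entry in enumerate(entries):
        entry_id = entry.get("id", f"entry_{index}")
        missing = [field for field in REQUIRED if field not in entry]
        if missing:
            problems.append((entry_id, "missing: " + ", ".join(missing)))
        m_per_px = entry.get("m_per_px")
        # bool is a Real subclass in Python; exclude it along with strings.
        numeric = isinstance(m_per_px, Real) and not isinstance(m_per_px, bool)
        if "m_per_px" in entry and not numeric:
            problems.append((entry_id, "m_per_px is not numeric"))
        elif numeric and m_per_px < floor and entry_id not in rejected_ids:
            problems.append((entry_id, "below floor without rejection"))
    # An empty manifest is treated as a failure, mirroring the unit tests.
    return (bool(entries) and not problems, problems)
```

An entry below the floor only counts as a problem when its id is absent from the rejected set, which is how the "or rejected by `tile-load-rejected`" clause is modelled here.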
## Tests
Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s** (baseline
746 → +47 net). Run via `python -m pytest e2e/_unit_tests/` from
the workspace root. No flakes, no skips outside the pre-existing
intentional skips.
Collection check on the three new scenario tests: 18 items
(3 tests × 6 `(fc_adapter, vio_strategy)` combinations). Scenario
tests skip locally because `E2E_SITL_REPLAY_DIR` is unset and the
docker-inspect env vars are unset — intended container-vs-host
boundary.
Per-area test counts (this batch):
| File | Tests added |
|------|-------------|
| `test_tile_cache_inspector.py` (new) | 43 |
| `test_directory_layout.py` (edited) | 4 (4 path entries) |
| `test_no_sut_imports.py` (no edit; broader walk) | implicit +1 module covered |
| **Total** | **+47** |
## Acceptance Criteria Verification
| AC | Status | Evidence |
|-----|--------|----------|
| AC-1 — manifest schema completeness | ✓ | `test_ft_p_15_cache_schema` + 12 `test_evaluate_manifest_schema_*` |
| AC-2 — m/px ≥ 0.5 floor (or rejected) | ✓ | Same scenario; below-floor-with-rejection / without-rejection unit tests |
| AC-3 — offline operation (no non-e2e-net egress) | ✓ | `test_ft_p_16_offline_only` + 7 `test_evaluate_offline_mode_*` |
| AC-4 — no raw-frame retention | ✓ | `test_ft_p_18_no_raw_retention` + 9 `test_detect_raw_frames_*` + 5 `test_probe_jpeg_dimensions_*` |
| AC-5 — thumbnail log < 1 GB / 8 h | ✓ | Same scenario; 7 `test_evaluate_thumbnail_budget_*` |
| AC-6 — parameterisation | ✓ | 6 param IDs per scenario; 18 total items collected |
## Code Review Verdict
PASS_WITH_WARNINGS (no Critical, no High; 3 Low notes — see
`reviews/batch_82_review.md`).
## Auto-Fix Attempts
0 (no auto-fix-eligible findings).
## Stuck Agents
None.
## Notable Decisions
* **Single task in batch 82.** AZ-421 internally covers 3
sub-scenarios (FT-P-15 / 16 / 18) — the task spec itself groups
them because they share the `tile-cache-fixture` + FDR
observation surface. Pulling AZ-422/423/427 in would have
produced 7 test files + multiple new helpers in one batch,
exceeding the recent empirical scope per batch (12 sub-scenarios).
AZ-422 / AZ-423 / AZ-427 land as their own batches.
* **AC-3 (offline-only) is enforced structurally, not by packet
count.** The spec says "all egress to non-`e2e-net` destinations
is 0". With `e2e-net.Internal == true` and the SUT attached only
to `e2e-net`, the packet count is provably 0 by Docker's network
policy — there is literally no other network the SUT can reach.
Checking the docker-inspect snapshots is cheaper and more
reliable than per-packet counters.
* **JPEG SOF dimension probe is stdlib-only.** Loading every JPEG
through OpenCV / Pillow just to read `(width, height)` would
decode pixel data we discard. The 30-line SOF parser reads ≤16
bytes per segment hop and terminates in <30 hops on real JPEGs.
* **`probe_jpeg_dimensions` returns `None` on truncation /
non-JPEG / `OSError`; it does NOT raise.** The downstream
`detect_raw_frames` explicitly treats `None` as "dimension
unknown ≠ raw frame match" (documented). This avoids the test
failing on every directory walk that happens to contain a
corrupt JPEG, while still surfacing real raw-frame retention.
* **Docker inspect via env-var indirection.** The e2e-runner
container does not have docker-socket access (an intentional
security boundary). The fixture builder must `docker network
inspect e2e-net > /e2e-results/net.json` + `docker inspect
gps-denied-onboard > /e2e-results/sut.json` before the runner
starts, and the runner reads those snapshots through env vars.
This is the same pattern AZ-420 used for `gcs_tlog_<host>.tlog`
(fixture-builder responsibility).
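The stdlib-only SOF probe described in the decisions above can be sketched roughly as follows. This is a simplified illustration, not the code in `tile_cache_inspector.py`: `sketch_probe_jpeg` is a hypothetical name, and the sketch assumes every segment before the SOF carries a length field (it does not handle fill bytes or standalone markers).

```python
import struct

# SOF0..SOF15 carry frame dimensions, except DHT (C4), JPG (C8) and DAC (CC).
SOF_MARKERS = {0xC0 + i for i in range(16)} - {0xC4, 0xC8, 0xCC}


def sketch_probe_jpeg(path):
    """Return (width, height) from the first SOF segment, or None."""
    try:
        with open(path, "rb") as fh:
            if fh.read(2) != b"\xff\xd8":      # SOI missing: not a JPEG
                return None
            while True:
                header = fh.read(4)            # FF, marker, 2-byte length
                if len(header) < 4 or header[0] != 0xFF:
                    return None                # truncated or malformed
                marker = header[1]
                (length,) = struct.unpack(">H", header[2:4])
                if length < 2:
                    return None
                if marker in SOF_MARKERS:
                    body = fh.read(5)          # precision, height, width
                    if len(body) < 5:
                        return None
                    height, width = struct.unpack(">HH", body[1:5])
                    return (width, height)
                fh.seek(length - 2, 1)         # skip the segment payload
    except OSError:
        return None
```

Only segment headers are read, so no pixel data is decoded; every short read returns `None` instead of raising, matching the "dimension unknown" contract described above.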
## Production Dependencies (forward-look)
FT-P-15 / FT-P-16 / FT-P-18 transitively depend on:
* **FDR `cache-self-check` record** at SUT cold-start — the SUT's
C6 tile-cache loader must emit one record carrying every manifest
entry it loaded. (Cross-checked against the FDR schema documented
in `_docs/02_document/components/c6_*` — slot is reserved; no
producer wires it yet.)
* **FDR `tile-load-rejected` events** — for entries below the m/px
floor (or otherwise rejected by the freshness gate). Reserved
same way.
* **Docker compose `e2e-net` attribute `internal: true`** — owned
by AZ-406. Already wired per the existing compose file.
* **Fixture builder snapshots** of `docker inspect` (AZ-595).
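The scenarios skip when the replay fixture or the docker-inspect env
vars are absent, but once a fixture is supplied, required data missing
from it (for example an empty `cache-self-check` record) fails loudly
rather than skipping silently: the "tests as gates" pattern.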
## Out of Scope (deferred)
* DNS blackhole defense-in-depth — owned by NFT-SEC-05 (AZ-437).
* Cache-poisoning safety — owned by NFT-SEC-01 (AZ-436).
* Stale-tile rejection on aged source tiles — owned by FT-N-05
(AZ-427).
* The fixture builder's actual `cache-self-check` FDR synthesis +
docker-inspect JSON capture — owned by AZ-595.
## Next Batch
Batch 83 candidates from `_docs/02_tasks/todo/` (20 remaining): AZ-422
(FT-P-17 + FT-N-06 mid-flight tiles, 3 cp), AZ-423 (FT-P-19 sat
reloc, 3 cp), AZ-427 (FT-N-05 stale-tile rejection, 2 cp). Topo-order
leader is AZ-422. Pick at next `/autodev` invocation.
@@ -0,0 +1,224 @@
# Code Review Report
**Batch**: 82 — AZ-421 (FT-P-15 + FT-P-16 + FT-P-18 cache/offline/no-raw-retention)
**Date**: 2026-05-17
**Verdict**: PASS_WITH_WARNINGS
## Findings
| # | Severity | Category | File:Line | Title |
|----|----------|-----------------|----------------------------------------------------------------------------|----------------------------------------------------------------|
| 1 | Low | Maintainability | `e2e/runner/helpers/tile_cache_inspector.py:120` | `_resolve_entry_id` falls back to `tile_matrix` before synth |
| 2 | Low | Style | `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139` | Multi-OR assert in synthesised-id test |
| 3 | Low | Scope | `e2e/tests/positive/test_ft_p_16_offline_only.py:80` | Docker inspect JSON env-var indirection requires fixture support |
### Finding Details
**F1: `_resolve_entry_id` lookup order may surface `tile_matrix` as an id**
(Low / Maintainability)
- Location: `e2e/runner/helpers/tile_cache_inspector.py:120-124`
- Description: When an entry lacks both `id` and `tile_id`, the
resolver falls through to `tile_matrix` before synthesising an
`entry_N` placeholder. This can produce duplicate "id" values if
several entries share a tile-matrix, which would in turn block
the `rejected_below_floor_ids` lookup from matching the right
entry.
- Suggestion: leave as-is for now; the FDR schema commits to `id`
being present per `_docs/02_document/components/c6_*` contracts.
The fallback is a defensive read for malformed fixtures. If the
fixture builder ever produces entries without `id`, the AC-1
"missing_fields" check already fails first — the entry-id
resolution is then for diagnostic display only.
- Task: AZ-421
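To make the F1 concern concrete, here is a minimal sketch of a resolver with the lookup order described above. `sketch_resolve_entry_id` is a hypothetical stand-in for `_resolve_entry_id`, whose actual body is not shown in this review.

```python
def sketch_resolve_entry_id(entry, index):
    """Lookup order described in F1: id, tile_id, tile_matrix, then entry_N."""
    for key in ("id", "tile_id", "tile_matrix"):
        value = entry.get(key)
        if value:
            return str(value)
    return f"entry_{index}"


# Two malformed entries that share a tile matrix resolve to the same "id".
entries = [{"tile_matrix": "WGS84_Quad/16"}, {"tile_matrix": "WGS84_Quad/16"}]
ids = [sketch_resolve_entry_id(e, i) for i, e in enumerate(entries)]
```

Under this sketch both entries resolve to `WGS84_Quad/16`, which is the duplicate-id situation the finding warns about; an entry with none of the three keys falls through to the `entry_N` placeholder.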
**F2: Multi-OR assert in synthesised-id test** (Low / Style)
- Location: `e2e/_unit_tests/helpers/test_tile_cache_inspector.py:139`
- Description:
`test_evaluate_manifest_schema_entry_id_falls_back_to_synthesised`
uses a 3-way OR assert because the `_resolve_entry_id` resolver
inspects `id` → `tile_id` → `tile_matrix` → `entry_N` and the
test entry happens to have `tile_matrix`. The assert is correct
(covers the actual lookup order) but reads ambiguously.
- Suggestion: leave as-is; tightening the assert would force the
test to know the resolver's internal lookup chain, which is the
exact coupling code review usually flags. Documented here for
future cleanup if the resolver simplifies.
- Task: AZ-421
**F3: Docker inspect indirection requires fixture-builder support**
(Low / Scope)
- Location: `e2e/tests/positive/test_ft_p_16_offline_only.py:80-92`
- Description: The FT-P-16 scenario reads
`docker network inspect e2e-net` + `docker inspect <sut-ctr>` from
JSON files (env vars `DOCKER_NETWORK_INSPECT_PATH` /
`DOCKER_CONTAINER_INSPECT_PATH`) rather than calling `docker`
directly. This is intentional — the e2e-runner container does not
have docker-socket access, and the fixture builder must snapshot
inspect output before the runner starts.
- Suggestion: the fixture builder (AZ-595) needs a thin wrapper
that produces both JSON files at the start of every scenario run
that needs them. Tracked outside this batch.
- Task: AZ-421
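The env-var indirection in F3 might look roughly like the sketch below. `load_snapshot` is a hypothetical helper, and the unwrapping of a single-element JSON array is an assumption about how the fixture builder stores the `docker inspect` output (the CLI prints an array).

```python
import json
import os
from pathlib import Path


def load_snapshot(env_var):
    """Return the parsed JSON snapshot named by env_var, or None if unavailable."""
    raw = os.environ.get(env_var)
    if not raw:
        return None
    path = Path(raw)
    if not path.is_file():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    # `docker inspect` prints a JSON array; take the single element if so.
    if isinstance(data, list):
        return data[0] if len(data) == 1 else None
    return data
```

A scenario test could call this for `DOCKER_NETWORK_INSPECT_PATH` and `DOCKER_CONTAINER_INSPECT_PATH` and skip when either returns `None`.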
## Findings Sweep
### Phase 1 — Context Loading
Read AZ-421 spec, blackbox-tests § FT-P-15/16/18, module-layout (confirmed
`blackbox_tests` owns `e2e/**`), conftest (fixture surface), existing FDR
reader, and recent helpers as templates (`gcs_telemetry_evaluator.py`,
`ap_contract_evaluator.py`).
### Phase 2 — Spec Compliance (AC trace)
* **AC-1 (FT-P-15 manifest schema completeness)** ✓
- Scenario: `test_ft_p_15_cache_schema` walks FDR for
`cache-self-check` records, builds the manifest entry list, calls
`evaluate_manifest_schema`, asserts `report.passes`.
- Pure-logic: 12 `test_evaluate_manifest_schema_*` unit tests
covering full-fields-pass, missing-fields-fail (single + multi
+ ordered), at-floor exactly, empty list, non-numeric m/px,
invalid floor → ValueError, custom required fields.
* **AC-2 (FT-P-15 m/px floor ≥ 0.5)** ✓
- Covered by `ManifestEntryReport.passes_floor` +
`ManifestSchemaReport.passes` (rejects below-floor entries
unless `tile_load_rejected_ids` includes them).
- Pure-logic: below-floor-no-rejection-fails,
below-floor-with-rejection-passes, at-floor-exactly-passes.
* **AC-3 (FT-P-16 offline operation)** ✓
- Scenario: `test_ft_p_16_offline_only` loads two docker-inspect
JSON files, calls `evaluate_offline_mode`, asserts
`report.passes`.
- Pure-logic: 7 `test_evaluate_offline_mode_*` unit tests
(passes, non-internal-fails, extra-network-fails,
no-networks-fails, missing-Internal-key-fails,
non-bool-Internal-fails, custom-expected-network-passes).
* **AC-4 (FT-P-18 no raw-frame retention)** ✓
- Scenario: `test_ft_p_18_no_raw_retention` walks FDR + tile-cache
via `walk_files`, probes JPEG dimensions, calls
`detect_raw_frames`, asserts `report.passes`.
- Pure-logic: 9 `test_detect_raw_frames_*` + 5
`test_probe_jpeg_dimensions_*` + 3 `test_walk_files_*` =
17 unit tests.
* **AC-5 (FT-P-18 thumbnail budget < 1 GB / 8 h)** ✓
- Scenario: computes `thumbnail_log_size_bytes` from the walk +
replay duration from FDR `monotonic_ms` span; calls
`evaluate_thumbnail_budget`; asserts `report.passes`.
- Pure-logic: 7 `test_evaluate_thumbnail_budget_*` unit tests
(under-budget, over-budget, extrapolation math,
zero-duration-fails, negative-size raises, invalid budget
raises, custom-budget-passes).
* **AC-6 (parameterisation)** ✓
- `pytest --collect-only` confirms 6 param IDs per scenario
(`[ardupilot|inav]-[okvis2|klt_ransac|vins_mono]`). All three
tests accept `fc_adapter` + `vio_strategy` fixtures.
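The AC-6 item count follows from the cross product of the two fixture axes. The sketch below only reproduces that arithmetic with the stdlib; the constant names are hypothetical, and how `conftest` actually parameterises `fc_adapter` and `vio_strategy` is not shown in this review.

```python
from itertools import product

# Axis values taken from the param-ID pattern quoted above.
FC_ADAPTERS = ("ardupilot", "inav")
VIO_STRATEGIES = ("okvis2", "klt_ransac", "vins_mono")

# One ID per (fc_adapter, vio_strategy) combination.
param_ids = [f"{fc}-{vio}" for fc, vio in product(FC_ADAPTERS, VIO_STRATEGIES)]

scenario_count = 3  # FT-P-15, FT-P-16, FT-P-18
total_items = scenario_count * len(param_ids)
```

Two adapters times three strategies gives 6 IDs per scenario, and three scenarios give the 18 collected items.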
### Phase 3 — Code Quality
* SRP: `tile_cache_inspector.py` carries five evaluators
(`evaluate_manifest_schema`, `evaluate_offline_mode`,
`detect_raw_frames`, `evaluate_thumbnail_budget`,
`probe_jpeg_dimensions`) + one walker (`walk_files`). Each
evaluator handles one AC family of one sub-scenario; the JPEG
dimension probe is co-located because it pairs structurally with
`detect_raw_frames`. ✓
* Naming: `m_per_px_floor`, `observed_size_bytes`,
`extrapolated_8h_size_bytes`, `nav_camera_raw_dimensions` carry
their units in the names. ✓
* AAA pattern in unit tests with `# Arrange / # Act / # Assert`
comments per coding rule. ✓
* No `try/except` swallows errors. `probe_jpeg_dimensions` catches
`OSError` and returns `None` — documented as "the file is not a
JPEG, the SOF marker is not present, or the file is truncated".
Callers of `probe_jpeg_dimensions` correctly treat `None` as
"dimension unknown" rather than silently zero. ✓
* No code comments narrating mechanics — only docstrings + one
one-liner on the SOF marker byte map (the byte list is part of
the JPEG standard; the link inside the docstring isn't needed
given the standard reference is universally known). ✓
* Function lengths: longest is `probe_jpeg_dimensions` at ~30 lines
including docstring; all under the 50-line / cyclomatic-10
threshold. ✓
### Phase 4 — Security Quick-Scan
* No SQL, no `shell=True`, no `eval`/`exec`. ✓
* No hardcoded secrets / API keys. ✓
* The JPEG SOF parser does bounded reads (every `read` checks
return-length); a malformed JPEG cannot cause unbounded memory
consumption. ✓
* `evaluate_offline_mode` validates `Internal` is a `bool` (not
truthy-coerced) — a string "true" or integer 1 in the inspect
JSON will not silently pass the gate. ✓
* `evaluate_thumbnail_budget` rejects negative size and
zero-or-negative budget. ✓
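The strict `bool` validation noted for `evaluate_offline_mode` can be illustrated with a short sketch. `sketch_offline_check` and its tuple return value are hypothetical; the dict keys follow the `docker inspect` shapes used in the unit tests.

```python
def sketch_offline_check(network_inspect, container_inspect,
                         expected_network="e2e-net"):
    """Return (passes, network_internal, container_networks)."""
    internal = network_inspect.get("Internal")
    # Only a real bool is accepted; "true", 1 and a missing key become None.
    network_internal = internal if isinstance(internal, bool) else None
    networks = tuple(
        container_inspect.get("NetworkSettings", {}).get("Networks", {})
    )
    passes = network_internal is True and networks == (expected_network,)
    return passes, network_internal, networks
```

Because the gate uses `is True` on a value already filtered by `isinstance(..., bool)`, a truthy string or integer in the snapshot cannot pass it, and a container attached to any additional network fails the tuple comparison.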
### Phase 5 — Performance
* `evaluate_manifest_schema`: O(N entries × F fields) — typically
<100 entries × 7 fields, trivial. ✓
* `detect_raw_frames`: O(N files), single pass; extension check
uses a tuple membership test (O(K) where K=8). ✓
* `evaluate_offline_mode`: O(M networks) where M is usually 1. ✓
* `evaluate_thumbnail_budget`: O(1). ✓
* `probe_jpeg_dimensions`: reads only segment headers (≤16 bytes
per segment hop) until SOF; even a multi-MB JPEG terminates
in <30 hops. ✓
* `walk_files`: O(total files under the roots), standard rglob
iteration; no in-memory list buffering. ✓
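A generator-based walker with the properties listed for `walk_files` (lazy iteration, missing roots skipped, files only) could be sketched as below. `sketch_walk_files` is a hypothetical name and the body is an assumption based on the unit-test behaviour, not the helper's source.

```python
from pathlib import Path


def sketch_walk_files(*roots):
    """Lazily yield every regular file under each existing root."""
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # missing roots are skipped, not treated as errors
        for path in root.rglob("*"):
            if path.is_file():
                yield path
```

Since the function yields paths one at a time, a caller that probes each file as it arrives never holds the full file list in memory.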
### Phase 6 — Cross-Task Consistency (single-task batch)
* Naming follows the recent `gcs_telemetry_evaluator` / `*_report`
/ `passes` property convention. ✓
* FDR record types declared as module-level constants
(`CACHE_SELF_CHECK_FDR_KIND`, `TILE_LOAD_REJECTED_FDR_KIND`)
mirrors the b81 pattern (`HINT_FDR_KIND`,
`ANCHOR_SEARCH_REGION_FDR_KIND`). ✓
* Skip-rule pattern (`if not sitl_replay_ready: pytest.skip(...)`)
is consistent with the 18 other scenario tests in `tests/positive`. ✓
### Phase 7 — Architecture Compliance
`_docs/02_document/module-layout.md` declares `blackbox_tests` as the
sole owner of `e2e/**`.
1. **Layer direction**: every import in the six new/edited files
resolves to `runner.helpers.*`, `runner.helpers.fdr_reader`,
`runner.helpers.tile_cache_inspector`, stdlib, or pytest. No
`src/gps_denied_onboard` imports. ✓ (verified by
`test_no_sut_imports.py`).
2. **Public API respect**: scenario tests import only top-level
module symbols from `runner.helpers.*` (no `_private`). ✓
3. **No new cyclic deps**: `tile_cache_inspector` is a leaf consumed
by 3 scenario tests + 1 unit-test module; no back-edges. ✓
4. **Duplicate symbols**: `probe_jpeg_dimensions` is the first JPEG
header parser in the e2e tree. If a future scenario needs the
same probe (e.g., NFT-LIM-02 size budgeting), promote to a
shared `runner/helpers/image_probe.py`. Tracked, not flagged.
5. **Cross-cutting concerns**: file-system walks (`walk_files`)
are local to `tile_cache_inspector` for now. If another scenario
needs filesystem walks for different reasons (e.g., FT-P-17
tile-output verification), promote. ✓
## Regression Gate
Full `e2e/_unit_tests/` suite: **793 passed in 139.27 s**, single run,
no flakes. Up from 746 (batch 81) by +47:
* +43 in new `test_tile_cache_inspector.py` (12 manifest, 7 offline,
9 raw-frames, 7 thumbnail-budget, 5 JPEG-probe, 3 walk).
* +3 new entries in `test_directory_layout.py` (3 scenario test paths).
* +1 from a `test_no_sut_imports.py` walk that now covers the new
helper.
No tests removed. Scenario tests skip locally because
`E2E_SITL_REPLAY_DIR` is unset (intended docker-vs-host boundary).
+3 -3
@@ -6,9 +6,9 @@ step: 10
name: Implement Tests
status: in_progress
sub_step:
-  phase: 0
-  name: awaiting-invocation
-  detail: ""
+  phase: 11
+  name: commit-batch
+  detail: "batch 82"
retry_count: 0
cycle: 1
tracker: jira
@@ -0,0 +1,491 @@
"""Unit tests for ``runner.helpers.tile_cache_inspector`` (AZ-421).
Pure-logic AC-8.1 / AC-8.3 / AC-8.5 coverage for FT-P-15 / FT-P-16 /
FT-P-18. The full e2e scenarios in ``e2e/tests/positive/test_ft_p_1[568]_*.py``
exercise the same helpers end-to-end when ``E2E_SITL_REPLAY_DIR`` is
prepared; this file covers the helpers in isolation so AC verification
does not depend on the SITL fixture or a live docker daemon.
"""
from __future__ import annotations

import struct
from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci

# ─────────────────────── evaluate_manifest_schema ───────────────────────


def _full_entry(**overrides: object) -> dict:
    """Construct a manifest entry that has every required field by default."""
    # Arrange — return a complete dict the caller can selectively break
    base: dict[str, object] = {
        "id": "tile_001",
        "crs": "EPSG:3857",
        "tile_matrix": "WGS84_Quad/16",
        "dimension": 256,
        "m_per_px": 0.5,
        "capture_date": "2025-04-12",
        "source": "internal_drone_2024_capture",
        "compression": "JPEG-Q85",
    }
    base.update(overrides)
    return base


def test_evaluate_manifest_schema_all_fields_present_floor_met_passes() -> None:
    # Arrange
    entries = [_full_entry(id=f"t_{i}", m_per_px=0.5 + i * 0.1) for i in range(3)]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert report.passes
    assert report.total_entries == 3
    assert report.entries_with_missing_fields == ()
    assert report.entries_below_floor == ()


def test_evaluate_manifest_schema_missing_field_fails() -> None:
    # Arrange
    entries = [_full_entry()]
    del entries[0]["compression"]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert not report.passes
    assert report.entries_with_missing_fields[0].missing_fields == ("compression",)


def test_evaluate_manifest_schema_multiple_missing_fields_listed_in_order() -> None:
    # Arrange
    entries = [_full_entry()]
    del entries[0]["crs"]
    del entries[0]["compression"]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert report.entries_with_missing_fields[0].missing_fields == ("crs", "compression")


def test_evaluate_manifest_schema_below_floor_without_rejection_fails() -> None:
    # Arrange
    entries = [_full_entry(id="lowres", m_per_px=0.4)]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert not report.passes
    assert report.entries_below_floor[0].entry_id == "lowres"


def test_evaluate_manifest_schema_below_floor_with_rejection_passes() -> None:
    # Arrange
    entries = [
        _full_entry(id="good", m_per_px=0.5),
        _full_entry(id="lowres", m_per_px=0.4),
    ]
    # Act
    report = tci.evaluate_manifest_schema(entries, tile_load_rejected_ids=("lowres",))
    # Assert
    assert report.passes


def test_evaluate_manifest_schema_at_floor_exactly_passes() -> None:
    # Arrange
    entries = [_full_entry(m_per_px=0.5)]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert report.passes


def test_evaluate_manifest_schema_empty_list_fails() -> None:
    # Act
    report = tci.evaluate_manifest_schema([])
    # Assert
    assert not report.passes
    assert report.total_entries == 0


def test_evaluate_manifest_schema_non_numeric_m_per_px_fails() -> None:
    # Arrange
    entries = [_full_entry(m_per_px="0.5")]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert not report.passes
    assert report.entries[0].m_per_px is None


def test_evaluate_manifest_schema_entry_id_falls_back_to_synthesised() -> None:
    # Arrange
    entry = _full_entry()
    del entry["id"]
    # Act
    report = tci.evaluate_manifest_schema([entry])
    # Assert
    entry_id = report.entries[0].entry_id
    assert (
        entry_id == "tile_matrix"
        or entry_id.startswith("entry_")
        or entry_id == "WGS84_Quad/16"
    )


def test_evaluate_manifest_schema_invalid_floor_raises() -> None:
    with pytest.raises(ValueError, match="m_per_px_floor"):
        tci.evaluate_manifest_schema([_full_entry()], m_per_px_floor=0)


def test_evaluate_manifest_schema_custom_required_fields() -> None:
    # Arrange — using a minimal field set the test owns
    entries = [{"id": "t1", "m_per_px": 1.0, "crs": "EPSG:3857"}]
    # Act
    report = tci.evaluate_manifest_schema(
        entries, required_fields=("id", "crs", "m_per_px")
    )
    # Assert
    assert report.passes


def test_evaluate_manifest_schema_one_good_one_bad_fails() -> None:
    # Arrange
    entries = [_full_entry(id="ok"), _full_entry(id="bad", m_per_px=0.3)]
    # Act
    report = tci.evaluate_manifest_schema(entries)
    # Assert
    assert not report.passes
    assert len(report.entries_below_floor) == 1
    assert report.entries_below_floor[0].entry_id == "bad"


# ─────────────────────── evaluate_offline_mode ───────────────────────


def _network_inspect(*, name: str = "e2e-net", internal: bool = True) -> dict:
    return {"Name": name, "Internal": internal, "Driver": "bridge"}


def _container_inspect(*networks: str) -> dict:
    return {
        "Id": "deadbeef",
        "NetworkSettings": {"Networks": {n: {"IPAddress": "172.20.0.2"} for n in networks}},
    }


def test_evaluate_offline_mode_internal_and_only_e2e_net_passes() -> None:
    # Act
    report = tci.evaluate_offline_mode(_network_inspect(), _container_inspect("e2e-net"))
    # Assert
    assert report.passes
    assert report.network_internal is True
    assert report.container_networks == ("e2e-net",)


def test_evaluate_offline_mode_non_internal_fails() -> None:
    # Act
    report = tci.evaluate_offline_mode(
        _network_inspect(internal=False), _container_inspect("e2e-net")
    )
    # Assert
    assert not report.passes


def test_evaluate_offline_mode_extra_network_fails() -> None:
    # Act
    report = tci.evaluate_offline_mode(
        _network_inspect(), _container_inspect("e2e-net", "bridge")
    )
    # Assert
    assert not report.passes


def test_evaluate_offline_mode_no_networks_fails() -> None:
    # Act
    report = tci.evaluate_offline_mode(_network_inspect(), _container_inspect())
    # Assert
    assert not report.passes


def test_evaluate_offline_mode_missing_internal_key_fails() -> None:
    # Arrange
    net = {"Name": "e2e-net", "Driver": "bridge"}
    # Act
    report = tci.evaluate_offline_mode(net, _container_inspect("e2e-net"))
    # Assert
    assert not report.passes
    assert report.network_internal is None


def test_evaluate_offline_mode_non_bool_internal_fails() -> None:
    # Arrange
    net = {"Name": "e2e-net", "Internal": "true"}  # string, not bool
    # Act
    report = tci.evaluate_offline_mode(net, _container_inspect("e2e-net"))
    # Assert
    assert not report.passes


def test_evaluate_offline_mode_custom_expected_network() -> None:
    # Act
    report = tci.evaluate_offline_mode(
        _network_inspect(name="custom-net"),
        _container_inspect("custom-net"),
        expected_network="custom-net",
    )
    # Assert
    assert report.passes
    assert report.expected_network == "custom-net"


# ─────────────────────── detect_raw_frames ───────────────────────


def test_detect_raw_frames_nav_camera_raw_dimension_match() -> None:
    # Arrange
    specs = [(Path("/data/frame.jpg"), 12345, (5472, 3648))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert not report.passes
    assert report.candidate_count == 1
    assert report.candidates[0].dimensions == (5472, 3648)


def test_detect_raw_frames_h264_decoded_dimension_match() -> None:
    # Arrange
    specs = [(Path("/cache/buf.jpg"), 500, (880, 720))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert not report.passes


def test_detect_raw_frames_dimension_order_insensitive() -> None:
    # Arrange — (3648, 5472) is a sideways encoding of the raw nav-cam shape
    specs = [(Path("/data/frame.jpg"), 12345, (3648, 5472))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert report.candidate_count == 1


def test_detect_raw_frames_thumbnail_dimensions_pass() -> None:
    # Arrange — small thumbnail
    specs = [(Path("/cache/thumb.jpg"), 4096, (128, 96))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert report.passes


def test_detect_raw_frames_no_raw_extension_pass() -> None:
    # Arrange — .png is not in the raw-extension list
    specs = [(Path("/cache/snap.png"), 1024, (5472, 3648))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert report.passes


def test_detect_raw_frames_unknown_dimensions_pass() -> None:
    # Arrange — dimension probe failed; per docstring this is NOT a match
    specs = [(Path("/cache/frame.jpg"), 1024, None)]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert report.passes


def test_detect_raw_frames_empty_list_passes() -> None:
    # Act
    report = tci.detect_raw_frames([])
    # Assert
    assert report.passes
    assert report.candidate_count == 0


def test_detect_raw_frames_dng_extension_matches() -> None:
    # Arrange
    specs = [(Path("/data/img.dng"), 1024, (5472, 3648))]
    # Act
    report = tci.detect_raw_frames(specs)
    # Assert
    assert not report.passes


def test_detect_raw_frames_custom_dimensions() -> None:
    # Arrange
    specs = [(Path("/data/img.jpg"), 1024, (100, 100))]
    # Act
    report = tci.detect_raw_frames(
        specs, raw_dimensions=(100, 100), decoded_dimensions=(50, 50)
    )
    # Assert
    assert not report.passes


# ─────────────────────── evaluate_thumbnail_budget ───────────────────────


def test_evaluate_thumbnail_budget_under_budget_passes() -> None:
    # Arrange — 100 MB over 1 h extrapolates to 800 MB / 8 h (< 1 GB)
    size = 100 * 1024**2
    # Act
    report = tci.evaluate_thumbnail_budget(size, observed_duration_h=1.0)
    # Assert
    assert report.passes


def test_evaluate_thumbnail_budget_over_budget_fails() -> None:
    # Arrange — 200 MB over 1 h extrapolates to 1.6 GB / 8 h (> 1 GB)
    size = 200 * 1024**2
    # Act
    report = tci.evaluate_thumbnail_budget(size, observed_duration_h=1.0)
    # Assert
    assert not report.passes


def test_evaluate_thumbnail_budget_extrapolation_math() -> None:
    # Arrange — 1 MB over 2 h extrapolates to 4 MB / 8 h
    one_mb = 1024**2
    # Act
    report = tci.evaluate_thumbnail_budget(one_mb, observed_duration_h=2.0)
    # Assert
    assert report.extrapolated_8h_size_bytes == 4 * one_mb


def test_evaluate_thumbnail_budget_zero_duration_fails() -> None:
    # Act
    report = tci.evaluate_thumbnail_budget(1024, observed_duration_h=0.0)
    # Assert
    assert not report.passes


def test_evaluate_thumbnail_budget_negative_size_raises() -> None:
    with pytest.raises(ValueError, match="observed_size_bytes"):
        tci.evaluate_thumbnail_budget(-1, observed_duration_h=1.0)


def test_evaluate_thumbnail_budget_invalid_budget_raises() -> None:
    with pytest.raises(ValueError, match="max_size_bytes_per_8h"):
        tci.evaluate_thumbnail_budget(1024, observed_duration_h=1.0, max_size_bytes_per_8h=0)


def test_evaluate_thumbnail_budget_custom_budget() -> None:
    # Arrange — 500 MB over 1 h ≈ 4 GB / 8 h; budget = 10 GB → passes
    size = 500 * 1024**2
    budget = 10 * 1024**3
    # Act
    report = tci.evaluate_thumbnail_budget(
        size, observed_duration_h=1.0, max_size_bytes_per_8h=budget
    )
    # Assert
    assert report.passes


# ─────────────────────── walk_files ───────────────────────


def test_walk_files_skips_missing_roots(tmp_path: Path) -> None:
    # Arrange
    (tmp_path / "real").mkdir()
    (tmp_path / "real" / "f.txt").write_text("x")
    missing = tmp_path / "missing"
    # Act
    files = list(tci.walk_files(missing, tmp_path / "real"))
    # Assert
    assert len(files) == 1
    assert files[0].name == "f.txt"


def test_walk_files_recursive(tmp_path: Path) -> None:
    # Arrange
    (tmp_path / "a" / "b").mkdir(parents=True)
    (tmp_path / "a" / "top.txt").write_text("x")
    (tmp_path / "a" / "b" / "nested.txt").write_text("x")
    # Act
    files = sorted(tci.walk_files(tmp_path), key=lambda p: p.name)
    # Assert
    assert [f.name for f in files] == ["nested.txt", "top.txt"]


def test_walk_files_no_directories_yielded(tmp_path: Path) -> None:
    # Arrange
    (tmp_path / "subdir").mkdir()
    (tmp_path / "subdir" / "f.txt").write_text("x")
    # Act
    files = list(tci.walk_files(tmp_path))
    # Assert — only the file, not the directory itself
    assert all(p.is_file() for p in files)


# ─────────────────────── probe_jpeg_dimensions ───────────────────────


def _make_minimal_jpeg(width: int, height: int) -> bytes:
    """Construct a minimal JPEG header with the given SOF0 dimensions.

    The result starts with SOI then jumps straight to an SOF0 segment
    that encodes the requested w/h. It is not a decodable image (no
    scan data, no EOI); nothing past the SOF needs to be valid for the
    dimension probe to succeed.
    """
    # SOI marker
    soi = b"\xff\xd8"
    # SOF0 segment: marker (FFC0) + length (2) + precision (1) + h (2) + w (2) + nf (1) + components (3*nf)
    # length = 8 + 3 (1 component); the length field excludes the 2-byte marker
    sof0 = (
        b"\xff\xc0"
        + struct.pack(">H", 11)
        + b"\x08"  # precision
        + struct.pack(">H", height)
        + struct.pack(">H", width)
        + b"\x01"  # n components
        + b"\x01\x22\x00"  # component spec
    )
    return soi + sof0


def test_probe_jpeg_dimensions_returns_width_height(tmp_path: Path) -> None:
    # Arrange
    f = tmp_path / "img.jpg"
    f.write_bytes(_make_minimal_jpeg(640, 480))
    # Act
    dims = tci.probe_jpeg_dimensions(f)
    # Assert
    assert dims == (640, 480)


def test_probe_jpeg_dimensions_handles_raw_nav_camera_dims(tmp_path: Path) -> None:
    # Arrange
    f = tmp_path / "raw.jpg"
    f.write_bytes(_make_minimal_jpeg(5472, 3648))
    # Act
    dims = tci.probe_jpeg_dimensions(f)
    # Assert
    assert dims == (5472, 3648)


def test_probe_jpeg_dimensions_not_a_jpeg(tmp_path: Path) -> None:
    # Arrange
    f = tmp_path / "not.jpg"
    f.write_bytes(b"PNG\x00not a jpeg")
    # Act
    dims = tci.probe_jpeg_dimensions(f)
    # Assert
    assert dims is None


def test_probe_jpeg_dimensions_truncated(tmp_path: Path) -> None:
    # Arrange: SOI marker only, no SOF segment
    f = tmp_path / "trunc.jpg"
    f.write_bytes(b"\xff\xd8")
    # Act
    dims = tci.probe_jpeg_dimensions(f)
    # Assert
    assert dims is None


def test_probe_jpeg_dimensions_nonexistent(tmp_path: Path) -> None:
    # Act
    dims = tci.probe_jpeg_dimensions(tmp_path / "missing.jpg")
    # Assert
    assert dims is None
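
The JPEG fixtures shown here place SOF0 directly after SOI, so none of them makes the probe skip over a preceding segment. The sketch below is one way such a fixture could look. Both `make_jpeg_with_app0` and `probe_dims` are hypothetical names; `probe_dims` is an in-memory stand-in written for this illustration, not the module's `probe_jpeg_dimensions`.

```python
from __future__ import annotations

import struct


def make_jpeg_with_app0(width: int, height: int) -> bytes:
    """Hypothetical fixture: SOI, a dummy APP0 segment, then SOF0."""
    soi = b"\xff\xd8"
    app0_payload = b"JFIF\x00" + b"\x00" * 9  # contents are irrelevant to the probe
    # A segment length counts its own two bytes plus the payload.
    app0 = b"\xff\xe0" + struct.pack(">H", 2 + len(app0_payload)) + app0_payload
    sof0 = (
        b"\xff\xc0"
        + struct.pack(">H", 11)  # 8 header bytes + 3 bytes for one component
        + b"\x08"  # precision
        + struct.pack(">HH", height, width)
        + b"\x01"  # one component
        + b"\x01\x22\x00"
    )
    return soi + app0 + sof0


# SOF markers are 0xC0-0xCF except DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_MARKERS = frozenset(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def probe_dims(data: bytes) -> tuple[int, int] | None:
    """Illustration only: walk segments in memory until an SOF is found."""
    if data[:2] != b"\xff\xd8":
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None
        marker = data[pos + 1]
        seg_len = struct.unpack(">H", data[pos + 2 : pos + 4])[0]
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                return None
            # Layout after the marker: length (2), precision (1), height (2), width (2).
            height, width = struct.unpack(">HH", data[pos + 5 : pos + 9])
            return (width, height)
        if seg_len < 2:
            return None
        pos += 2 + seg_len  # skip the marker and the whole segment
    return None
```

A test built on this fixture would assert that the probe still reports the SOF0 dimensions when an APP0 segment comes first.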
@@ -52,6 +52,7 @@ E2E_ROOT = Path(__file__).resolve().parents[1]
"runner/helpers/msp_frame_observer.py",
"runner/helpers/ap_contract_evaluator.py",
"runner/helpers/gcs_telemetry_evaluator.py",
"runner/helpers/tile_cache_inspector.py",
"runner/helpers/cold_start_evaluator.py",
"runner/helpers/outlier_tolerance_evaluator.py",
"runner/helpers/outage_request_evaluator.py",
@@ -109,6 +110,9 @@ E2E_ROOT = Path(__file__).resolve().parents[1]
"tests/positive/test_ft_p_11_cold_start_init.py",
"tests/positive/test_ft_p_12_gcs_downsample.py",
"tests/positive/test_ft_p_13_gcs_command.py",
"tests/positive/test_ft_p_15_cache_schema.py",
"tests/positive/test_ft_p_16_offline_only.py",
"tests/positive/test_ft_p_18_no_raw_retention.py",
"tests/negative/test_ft_n_01_outlier_tolerance.py",
"tests/negative/test_ft_n_02_sharp_turn_failure.py",
"tests/negative/test_ft_n_03_outage_reloc.py",
@@ -0,0 +1,427 @@
"""Tile-cache + storage compliance evaluators (AZ-421 / FT-P-15/16/18).

Pure-logic evaluators sourced from:

* **FDR archive**: the SUT's startup ``cache-self-check`` record carries
  the tile manifest entries the freshness/source/CRS contract has to
  hold over (FT-P-15 / AC-8.1, AC-NEW-2).
* **Docker network + container inspect JSON**: verifies the SUT
  container is attached only to the ``e2e-net`` network and the
  network is configured with ``Internal: true`` (FT-P-16 / AC-8.3,
  RESTRICT-SAT-1).
* **Filesystem walks** of ``${FDR_OUTPUT}`` and ``/var/azaion/tile-cache``:
  verifies the SUT does NOT retain raw nav-camera / AI-camera
  frames (FT-P-18 / AC-8.5).

The shared shape across all three sub-scenarios is a frozen
``...Report`` dataclass that exposes a boolean ``passes`` property. A
scenario test that wants to assert all three pulls the report objects
and asserts ``passes`` on each.

Public-boundary discipline: this module imports nothing from
``src/gps_denied_onboard``. Inputs are filesystem paths, parsed FDR
records, and dicts decoded from ``docker network inspect`` /
``docker inspect`` JSON.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Sequence


# ─────────────────────────── FT-P-15 / AC-8.1 ───────────────────────────

MANIFEST_M_PER_PX_FLOOR = 0.5
MANIFEST_REQUIRED_FIELDS: tuple[str, ...] = (
    "crs",
    "tile_matrix",
    "dimension",
    "m_per_px",
    "capture_date",
    "source",
    "compression",
)
CACHE_SELF_CHECK_FDR_KIND = "cache-self-check"
TILE_LOAD_REJECTED_FDR_KIND = "tile-load-rejected"


@dataclass(frozen=True)
class ManifestEntryReport:
    """Per-entry result of the manifest schema + resolution-floor checks."""

    entry_id: str
    missing_fields: tuple[str, ...]
    m_per_px: float | None

    @property
    def has_all_fields(self) -> bool:
        return not self.missing_fields

    @property
    def passes_floor(self) -> bool:
        return self.m_per_px is not None and self.m_per_px >= MANIFEST_M_PER_PX_FLOOR

    @property
    def passes(self) -> bool:
        return self.has_all_fields and self.passes_floor


@dataclass(frozen=True)
class ManifestSchemaReport:
    """AC-1 + AC-2 of FT-P-15: schema completeness + resolution floor."""

    entries: tuple[ManifestEntryReport, ...]
    rejected_below_floor_ids: tuple[str, ...]
    m_per_px_floor: float = MANIFEST_M_PER_PX_FLOOR

    def _meets_floor(self, entry: ManifestEntryReport) -> bool:
        # Compare against this report's own floor so that a custom
        # ``m_per_px_floor`` is honoured, not only the module default.
        return entry.m_per_px is not None and entry.m_per_px >= self.m_per_px_floor

    @property
    def total_entries(self) -> int:
        return len(self.entries)

    @property
    def entries_with_missing_fields(self) -> tuple[ManifestEntryReport, ...]:
        return tuple(e for e in self.entries if not e.has_all_fields)

    @property
    def entries_below_floor(self) -> tuple[ManifestEntryReport, ...]:
        return tuple(
            e for e in self.entries if e.m_per_px is not None and not self._meets_floor(e)
        )

    @property
    def passes(self) -> bool:
        if not self.entries:
            return False
        if self.entries_with_missing_fields:
            return False
        for entry in self.entries:
            if entry.m_per_px is None:
                return False
            if self._meets_floor(entry):
                continue
            # below floor: must be rejected at load
            if entry.entry_id not in self.rejected_below_floor_ids:
                return False
        return True


def evaluate_manifest_schema(
    manifest_entries: Sequence[dict],
    *,
    tile_load_rejected_ids: Sequence[str] = (),
    m_per_px_floor: float = MANIFEST_M_PER_PX_FLOOR,
    required_fields: Sequence[str] = MANIFEST_REQUIRED_FIELDS,
) -> ManifestSchemaReport:
    """Evaluate AC-1 + AC-2 of FT-P-15 against parsed manifest entries.

    Each ``manifest_entries`` element is the ``payload.entries[i]`` dict
    extracted from an FDR ``cache-self-check`` record. ``entry_id`` is
    looked up under ``"id"``, then ``"tile_id"``, then ``"tile_matrix"``,
    and is otherwise synthesised from the entry's index. Scenarios whose
    schema names the identifier differently should map it to ``"id"``
    upstream.

    ``tile_load_rejected_ids`` is the set of tile IDs the SUT has
    rejected at load time via FDR ``tile-load-rejected`` events; an
    entry with ``m_per_px < floor`` only passes if its ID appears in
    this set.
    """
    if m_per_px_floor <= 0:
        raise ValueError(f"m_per_px_floor must be > 0, got {m_per_px_floor}")
    rejected = tuple(tile_load_rejected_ids)
    entries: list[ManifestEntryReport] = []
    for idx, entry in enumerate(manifest_entries):
        entry_id = _resolve_entry_id(entry, idx)
        missing = tuple(f for f in required_fields if f not in entry)
        raw_m_per_px = entry.get("m_per_px")
        m_per_px: float | None
        # bool is a subclass of int; a JSON ``true`` must not count as 1.0 m/px.
        if isinstance(raw_m_per_px, (int, float)) and not isinstance(raw_m_per_px, bool):
            m_per_px = float(raw_m_per_px)
        else:
            m_per_px = None
        entries.append(
            ManifestEntryReport(
                entry_id=entry_id,
                missing_fields=missing,
                m_per_px=m_per_px,
            )
        )
    return ManifestSchemaReport(
        entries=tuple(entries),
        rejected_below_floor_ids=rejected,
        m_per_px_floor=m_per_px_floor,
    )


def _resolve_entry_id(entry: dict, idx: int) -> str:
    # First non-empty string among the candidate keys wins.
    for key in ("id", "tile_id", "tile_matrix"):
        if key in entry and isinstance(entry[key], str) and entry[key]:
            return entry[key]
    return f"entry_{idx}"


# ─────────────────────────── FT-P-16 / AC-8.3 ───────────────────────────

E2E_NETWORK_NAME = "e2e-net"


@dataclass(frozen=True)
class OfflineModeReport:
    """AC-3 of FT-P-16: SUT container is on `e2e-net` only and the net is internal."""

    network_name: str
    network_internal: bool | None
    container_networks: tuple[str, ...]
    expected_network: str = E2E_NETWORK_NAME

    @property
    def container_has_only_expected_network(self) -> bool:
        return self.container_networks == (self.expected_network,)

    @property
    def passes(self) -> bool:
        if self.network_internal is not True:
            return False
        return self.container_has_only_expected_network


def evaluate_offline_mode(
    network_inspect: dict,
    container_inspect: dict,
    *,
    expected_network: str = E2E_NETWORK_NAME,
) -> OfflineModeReport:
    """Evaluate AC-3 of FT-P-16 from ``docker network inspect`` + ``docker inspect``.

    ``network_inspect`` is a single network object (``docker network
    inspect <name>`` returns it inside a JSON list; the scenario
    unwraps the list). Required key: ``Internal: bool``.

    ``container_inspect`` is a single container object. Required key
    path: ``NetworkSettings.Networks`` (a dict whose keys are network
    names the container is attached to).
    """
    network_internal = network_inspect.get("Internal")
    if not isinstance(network_internal, bool):
        network_internal = None
    nets_map = (
        container_inspect.get("NetworkSettings", {}).get("Networks", {})
        if isinstance(container_inspect.get("NetworkSettings"), dict)
        else {}
    )
    container_networks: tuple[str, ...] = (
        tuple(sorted(nets_map.keys())) if isinstance(nets_map, dict) else ()
    )
    return OfflineModeReport(
        network_name=str(network_inspect.get("Name", "")),
        network_internal=network_internal,
        container_networks=container_networks,
        expected_network=expected_network,
    )


# ─────────────────────────── FT-P-18 / AC-8.5 ───────────────────────────

NAV_CAMERA_RAW_DIMENSIONS = (5472, 3648)
NAV_CAMERA_DECODED_DIMENSIONS = (880, 720)
RAW_FRAME_EXTENSIONS = (".jpg", ".jpeg", ".raw", ".dng", ".cr2", ".nef", ".arw", ".bin")
THUMBNAIL_LOG_EXTENSIONS = (".log", ".jsonl", ".txt")
# "1 GB" is taken as 2**30 bytes here.
THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H = 1.0
THUMBNAIL_LOG_MAX_SIZE_BYTES_PER_8H = int(THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H * 1024**3)


@dataclass(frozen=True)
class RawFrameCandidate:
    """One filesystem entry that matched the raw-frame heuristic."""

    path: Path
    size_bytes: int
    dimensions: tuple[int, int] | None
    reason: str


@dataclass(frozen=True)
class RawFrameDetectionReport:
    """AC-4 of FT-P-18: zero raw-frame retention."""

    candidates: tuple[RawFrameCandidate, ...]
    nav_camera_raw_dimensions: tuple[int, int] = NAV_CAMERA_RAW_DIMENSIONS
    nav_camera_decoded_dimensions: tuple[int, int] = NAV_CAMERA_DECODED_DIMENSIONS

    @property
    def candidate_count(self) -> int:
        return len(self.candidates)

    @property
    def passes(self) -> bool:
        return self.candidate_count == 0


def detect_raw_frames(
    file_specs: Iterable[tuple[Path, int, tuple[int, int] | None]],
    *,
    raw_dimensions: tuple[int, int] = NAV_CAMERA_RAW_DIMENSIONS,
    decoded_dimensions: tuple[int, int] = NAV_CAMERA_DECODED_DIMENSIONS,
    raw_extensions: Sequence[str] = RAW_FRAME_EXTENSIONS,
) -> RawFrameDetectionReport:
    """AC-4: detect any file whose extension + dimensions match raw nav frames.

    ``file_specs`` is an iterable of ``(path, size_bytes, dimensions)``
    triples. The scenario test produces this by walking the filesystem
    and probing each image file's dimensions; this evaluator only
    decides *which* of those triples count as raw frames.

    A file matches when:

    1. Extension is in ``raw_extensions``, AND
    2. ``dimensions`` equals either the raw nav-cam dims (5472×3648,
       order-insensitive) OR the H.264-decoded dims (880×720,
       order-insensitive).

    A file with a raw extension but unknown dimensions does NOT match.
    The scenario, not this evaluator, is responsible for failing loudly
    when a dimension probe fails; the evaluator never guesses.
    """
    targets = {tuple(sorted(raw_dimensions)), tuple(sorted(decoded_dimensions))}
    raw_ext_lower = tuple(ext.lower() for ext in raw_extensions)
    candidates: list[RawFrameCandidate] = []
    for path, size_bytes, dims in file_specs:
        if path.suffix.lower() not in raw_ext_lower:
            continue
        if dims is None:
            continue
        if tuple(sorted(dims)) not in targets:
            continue
        candidates.append(
            RawFrameCandidate(
                path=path,
                size_bytes=size_bytes,
                dimensions=dims,
                reason=(
                    f"extension {path.suffix} + dimensions {dims} match nav-camera raw pattern"
                ),
            )
        )
    return RawFrameDetectionReport(
        candidates=tuple(candidates),
        nav_camera_raw_dimensions=raw_dimensions,
        nav_camera_decoded_dimensions=decoded_dimensions,
    )


@dataclass(frozen=True)
class ThumbnailLogBudgetReport:
    """AC-5 of FT-P-18: thumbnail log size budget under 1 GB / 8 h."""

    observed_size_bytes: int
    observed_duration_h: float
    extrapolated_8h_size_bytes: int
    max_size_bytes_per_8h: int = THUMBNAIL_LOG_MAX_SIZE_BYTES_PER_8H

    @property
    def passes(self) -> bool:
        if self.observed_duration_h <= 0:
            return False
        return self.extrapolated_8h_size_bytes < self.max_size_bytes_per_8h


def evaluate_thumbnail_budget(
    observed_size_bytes: int,
    observed_duration_h: float,
    *,
    max_size_bytes_per_8h: int = THUMBNAIL_LOG_MAX_SIZE_BYTES_PER_8H,
) -> ThumbnailLogBudgetReport:
    """AC-5: extrapolate observed thumbnail log size to an 8h flight.

    ``observed_size_bytes`` is the sum of every thumbnail-log file
    under the FDR + cache walk (extensions in
    ``THUMBNAIL_LOG_EXTENSIONS``). ``observed_duration_h`` is the
    wall-clock duration of the replay segment that produced them.
    Extrapolation is linear: ``size * (8 / duration_h)``.

    Returns a report whose ``passes`` flag holds when
    ``extrapolated_8h_size_bytes < max_size_bytes_per_8h``. A
    non-positive ``observed_duration_h`` cannot be extrapolated: the
    report then carries ``extrapolated_8h_size_bytes == -1`` as a
    sentinel and ``passes`` is false.

    Raises ``ValueError`` for a negative ``observed_size_bytes`` or a
    non-positive ``max_size_bytes_per_8h``.
    """
    if observed_size_bytes < 0:
        raise ValueError(f"observed_size_bytes must be ≥0, got {observed_size_bytes}")
    if max_size_bytes_per_8h <= 0:
        raise ValueError(
            f"max_size_bytes_per_8h must be >0, got {max_size_bytes_per_8h}"
        )
    if observed_duration_h <= 0:
        extrapolated = -1  # sentinel: no valid extrapolation
    else:
        extrapolated = int(observed_size_bytes * (8.0 / observed_duration_h))
    return ThumbnailLogBudgetReport(
        observed_size_bytes=observed_size_bytes,
        observed_duration_h=observed_duration_h,
        extrapolated_8h_size_bytes=extrapolated,
        max_size_bytes_per_8h=max_size_bytes_per_8h,
    )


# ─────────────────────── Filesystem walk helpers ───────────────────────


def walk_files(*roots: Path) -> Iterable[Path]:
    """Recursive file iterator over every existing root.

    Convenience for the FT-P-18 scenario: stitch together
    ``fdr_archive_root`` + ``tile_cache_root`` walks under one call.
    Non-existent roots are silently skipped (the FDR archive may be
    absent on a skip-gated local run; the scenario explicitly checks
    that elsewhere).
    """
    for root in roots:
        if not root.exists():
            continue
        for p in root.rglob("*"):
            if p.is_file():
                yield p


def probe_jpeg_dimensions(path: Path) -> tuple[int, int] | None:
    """Return ``(width, height)`` of a JPEG by parsing its SOF marker.

    Pure-stdlib JPEG SOFn parser. It avoids loading the full image (so
    a directory walk over hundreds of files is cheap) and avoids a
    runtime dep on Pillow/OpenCV here (both are available in the runner
    but adding them as a hard import would couple the evaluator to
    those packages for what is fundamentally a short header read).

    Returns ``None`` if the file is not a JPEG, the SOF marker is not
    present, or the file is truncated or unreadable.
    """
    try:
        with path.open("rb") as fh:
            head = fh.read(2)
            if head != b"\xff\xd8":
                return None
            while True:
                marker_prefix = fh.read(1)
                if not marker_prefix:
                    return None
                if marker_prefix != b"\xff":
                    return None
                marker = fh.read(1)
                # A marker may be preceded by any number of 0xFF fill bytes.
                while marker == b"\xff":
                    marker = fh.read(1)
                if not marker:
                    return None
                # SOF markers: 0xC0-0xCF except 0xC4 (DHT), 0xC8 (JPG), 0xCC (DAC)
                if marker[0] in (0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7, 0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF):
                    fh.read(3)  # segment length (2) + precision (1)
                    h_bytes = fh.read(2)
                    w_bytes = fh.read(2)
                    if len(h_bytes) != 2 or len(w_bytes) != 2:
                        return None
                    height = int.from_bytes(h_bytes, "big")
                    width = int.from_bytes(w_bytes, "big")
                    return (width, height)
                seg_len_bytes = fh.read(2)
                if len(seg_len_bytes) != 2:
                    return None
                seg_len = int.from_bytes(seg_len_bytes, "big")
                if seg_len < 2:
                    return None
                # The length field counts its own two bytes; skip the payload.
                fh.seek(seg_len - 2, 1)
    except OSError:
        return None
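
The linear extrapolation used by `evaluate_thumbnail_budget` is easy to check by hand. The sketch below reproduces only the arithmetic, with hypothetical observation values (a 30-minute replay) chosen so the floating-point factor is exact. `extrapolate_to_8h` is an illustrative name, not part of the module.

```python
BUDGET_BYTES_PER_8H = 1024**3  # same figure as the module default budget
MIB = 1024**2


def extrapolate_to_8h(observed_size_bytes: int, observed_duration_h: float) -> int:
    """Linear extrapolation: size * (8 / duration_h), truncated to an int."""
    return int(observed_size_bytes * (8.0 / observed_duration_h))


# Hypothetical run 1: 60 MiB of thumbnail logs over 0.5 h, scaled by 16.
under = extrapolate_to_8h(60 * MIB, 0.5)
# Hypothetical run 2: 70 MiB over the same window.
over = extrapolate_to_8h(70 * MIB, 0.5)

print(under, under < BUDGET_BYTES_PER_8H)  # 1006632960 True
print(over, over < BUDGET_BYTES_PER_8H)  # 1174405120 False
```

Because the comparison is a strict `<`, a run that extrapolates to exactly the budget fails.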
@@ -0,0 +1,107 @@
"""FT-P-15 — Tile cache manifest schema + resolution floor (AZ-421 / AC-8.1).

The full scenario:

1. SUT cold-starts against the bind-mounted ``tile-cache-fixture`` and
   emits a one-time ``cache-self-check`` FDR record carrying every
   manifest entry it loaded (CRS, tile_matrix, dimension, m_per_px,
   capture_date, source, compression).
2. The SUT additionally emits ``tile-load-rejected`` FDR records for
   any entry the freshness/floor gate rejected at load time.
3. The test parses the FDR archive, evaluates the manifest schema
   contract (AC-1: every required field present; AC-2: every entry
   either ≥ 0.5 m/px or rejected), and asserts the report passes.

AC-1: every required field present per entry (``MANIFEST_REQUIRED_FIELDS``).
AC-2: m/px ≥ 0.5 OR rejected by FDR ``tile-load-rejected``.
AC-3 of FT-P-15-spec maps to AC-6 of the task (parameterisation).

Gated on:

* ``runner.helpers.fdr_reader``: owned by AZ-594; present.
* ``runner.helpers.tile_cache_inspector.evaluate_manifest_schema``:
  pure-logic evaluator covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
* ``sitl_replay_ready``: skip-gates the scenario when no FDR archive
  is present locally.
"""

from __future__ import annotations

from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci


@pytest.mark.traces_to("AC-8.1,AC-1,AC-2,AC-6")
def test_ft_p_15_cache_schema(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-15 scenario (AC-8.1)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-15 requires `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture that includes the FDR `cache-self-check` record + any "
            "`tile-load-rejected` records (AZ-595 + AZ-421 fixture builder). "
            "Pure-logic AC-8.1 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    from runner.helpers import fdr_reader

    fdr_root = Path(evidence_dir).parent / f"run-{run_id}" / "fdr"
    manifest_entries: list[dict] = []
    rejected_ids: list[str] = []
    for rec in fdr_reader.iter_records(fdr_root):
        if rec.record_type == tci.CACHE_SELF_CHECK_FDR_KIND:
            raw_entries = rec.payload.get("entries")
            if isinstance(raw_entries, list):
                for entry in raw_entries:
                    if isinstance(entry, dict):
                        manifest_entries.append(entry)
        elif rec.record_type == tci.TILE_LOAD_REJECTED_FDR_KIND:
            entry_id = rec.payload.get("id") or rec.payload.get("tile_id")
            if isinstance(entry_id, str) and entry_id:
                rejected_ids.append(entry_id)

    if not manifest_entries:
        pytest.fail(
            f"FT-P-15: no `{tci.CACHE_SELF_CHECK_FDR_KIND}` FDR record with "
            f"manifest entries found under {fdr_root}. The fixture builder "
            "must emit one at cold start."
        )

    report = tci.evaluate_manifest_schema(
        manifest_entries,
        tile_load_rejected_ids=rejected_ids,
    )

    nfr_recorder.record_metric(
        "ft_p_15.manifest_entries", float(report.total_entries), ac_id="AC-8.1"
    )
    nfr_recorder.record_metric(
        "ft_p_15.entries_missing_fields",
        float(len(report.entries_with_missing_fields)),
        ac_id="AC-1",
    )
    nfr_recorder.record_metric(
        "ft_p_15.entries_below_floor",
        float(len(report.entries_below_floor)),
        ac_id="AC-2",
    )

    assert report.passes, (
        "AC-8.1 (manifest schema + ≥0.5 m/px floor) failed: "
        f"total={report.total_entries}, "
        f"missing_fields={[(e.entry_id, e.missing_fields) for e in report.entries_with_missing_fields]}, "
        f"below_floor_not_rejected="
        f"{[e.entry_id for e in report.entries_below_floor if e.entry_id not in report.rejected_below_floor_ids]}"
    )
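
To make the AC-1 / AC-2 rule concrete, here is a small sketch with hypothetical manifest entries. Only the required field names and the 0.5 m/px floor come from the evaluator; every value, the tile IDs, and the helper `entry_passes` are assumptions made for illustration.

```python
from __future__ import annotations

REQUIRED_FIELDS = (
    "crs", "tile_matrix", "dimension", "m_per_px",
    "capture_date", "source", "compression",
)
FLOOR_M_PER_PX = 0.5

# Hypothetical field values shared by the sample entries.
BASE = {
    "crs": "EPSG:3857",
    "tile_matrix": "z18",
    "dimension": 256,
    "capture_date": "2025-06-01",
    "source": "example-provider",
    "compression": "jpeg",
}

entries = [
    {**BASE, "id": "tile-a", "m_per_px": 0.6},  # complete, above the floor
    {**BASE, "id": "tile-b", "m_per_px": 0.3},  # below the floor
    {**{k: v for k, v in BASE.items() if k != "source"}, "id": "tile-c", "m_per_px": 0.6},
]
# IDs that a `tile-load-rejected` record would have reported.
rejected_ids = {"tile-b"}


def entry_passes(entry: dict, rejected: set[str]) -> bool:
    """AC-1: all fields present. AC-2: at or above the floor, or rejected at load."""
    if any(field not in entry for field in REQUIRED_FIELDS):
        return False
    return entry["m_per_px"] >= FLOOR_M_PER_PX or entry["id"] in rejected


verdicts = {e["id"]: entry_passes(e, rejected_ids) for e in entries}
print(verdicts)
```

Here `tile-b` is acceptable only because it was rejected at load, and `tile-c` fails on the missing `source` field regardless of its resolution.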
@@ -0,0 +1,121 @@
"""FT-P-16 — Offline-only operation (AZ-421 / AC-8.3, RESTRICT-SAT-1).

The full scenario:

1. The SUT runs against the local tile-cache mount only.
2. The Docker compose harness attaches the SUT container to
   ``e2e-net`` with ``Internal: true``. Docker itself blocks egress
   to anything outside that network (AZ-406 owns the compose wiring).
3. A 60 s Derkachi replay generates load; during the replay the
   scenario reads ``docker network inspect e2e-net`` and
   ``docker inspect <sut-container>`` and asserts:

   - ``e2e-net.Internal == true``
   - The SUT container is attached to ``e2e-net`` only.

The "0 packets to non-e2e-net destinations" semantic of AC-8.3 is
enforced structurally: there is no other network the SUT can reach,
so the packet count is provably 0 without per-packet counters.

Gated on:

* ``sitl_replay_ready``: full replay needs the SITL fixture (skip
  cleanly otherwise).
* ``DOCKER_NETWORK_INSPECT_PATH`` / ``DOCKER_CONTAINER_INSPECT_PATH``
  env vars: point at JSON files produced by the fixture builder
  ahead of test invocation. When unset, the scenario skips with a
  clear reason (the docker CLI is not available inside the runner
  container without volume-mounting the docker socket; the fixture
  builder snapshots the inspect output instead).
* ``runner.helpers.tile_cache_inspector.evaluate_offline_mode``:
  pure-logic evaluator covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
"""

from __future__ import annotations

import json
import os
from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci

DOCKER_NETWORK_INSPECT_ENV = "DOCKER_NETWORK_INSPECT_PATH"
DOCKER_CONTAINER_INSPECT_ENV = "DOCKER_CONTAINER_INSPECT_PATH"


@pytest.mark.traces_to("AC-8.3,AC-3,AC-6,RESTRICT-SAT-1")
def test_ft_p_16_offline_only(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-16 scenario (AC-8.3 / RESTRICT-SAT-1)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-16 needs `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture (AZ-595). Pure-logic AC-8.3 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    net_path = os.environ.get(DOCKER_NETWORK_INSPECT_ENV)
    ctr_path = os.environ.get(DOCKER_CONTAINER_INSPECT_ENV)
    if not net_path or not ctr_path:
        pytest.skip(
            f"FT-P-16 needs `{DOCKER_NETWORK_INSPECT_ENV}` and "
            f"`{DOCKER_CONTAINER_INSPECT_ENV}` env vars set to JSON files "
            "produced by the compose harness (`docker network inspect "
            "e2e-net` + `docker inspect gps-denied-onboard`). The fixture "
            "builder snapshots both before the test runs."
        )

    net_inspect = _load_docker_inspect_object(Path(net_path), kind="network")
    ctr_inspect = _load_docker_inspect_object(Path(ctr_path), kind="container")

    report = tci.evaluate_offline_mode(net_inspect, ctr_inspect)

    nfr_recorder.record_metric(
        "ft_p_16.network_internal", 1.0 if report.network_internal else 0.0, ac_id="AC-8.3"
    )
    nfr_recorder.record_metric(
        "ft_p_16.container_network_count", float(len(report.container_networks)), ac_id="AC-3"
    )

    assert report.passes, (
        "AC-8.3 (offline-only operation) failed: "
        f"network_internal={report.network_internal}, "
        f"container_networks={report.container_networks}, "
        f"expected_network={report.expected_network}"
    )


def _load_docker_inspect_object(path: Path, *, kind: str) -> dict:
    """Load a single inspect object from a JSON file.

    ``docker inspect`` returns a JSON array. The scenario accepts
    either the wrapped array OR an unwrapped single-object payload,
    for forwards-compatibility with fixture builders that pre-unwrap.
    """
    if not path.exists():
        pytest.fail(f"FT-P-16: {kind} inspect JSON not found at {path}")
    raw = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(raw, list):
        if not raw:
            pytest.fail(f"FT-P-16: {kind} inspect JSON at {path} is an empty array")
        if not isinstance(raw[0], dict):
            pytest.fail(
                f"FT-P-16: {kind} inspect JSON at {path} array element is not an object"
            )
        return raw[0]
    if isinstance(raw, dict):
        return raw
    pytest.fail(
        f"FT-P-16: {kind} inspect JSON at {path} is neither object nor array: "
        f"type={type(raw).__name__}"
    )
    return {}  # unreachable; pytest.fail raises
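
For orientation, the sketch below shows the two snapshot shapes the scenario reads and the structural check applied to them. The JSON strings are trimmed, hypothetical snapshots (real inspect output carries many more keys), and `first_object` / `is_offline_only` are illustrative helpers, not the module's API.

```python
from __future__ import annotations

import json

# Trimmed, hypothetical snapshots of `docker network inspect e2e-net`
# and `docker inspect <container>`; both commands print a JSON array.
NETWORK_SNAPSHOT = '[{"Name": "e2e-net", "Internal": true}]'
CONTAINER_SNAPSHOT = '[{"NetworkSettings": {"Networks": {"e2e-net": {}}}}]'
DUAL_HOMED_SNAPSHOT = '[{"NetworkSettings": {"Networks": {"bridge": {}, "e2e-net": {}}}}]'


def first_object(text: str) -> dict:
    """Unwrap the JSON array, or pass a pre-unwrapped object through."""
    raw = json.loads(text)
    return raw[0] if isinstance(raw, list) else raw


def is_offline_only(network: dict, container: dict, expected: str = "e2e-net") -> bool:
    """True when the network is internal and is the container's only network."""
    settings = container.get("NetworkSettings") or {}
    attached = tuple(sorted((settings.get("Networks") or {}).keys()))
    return network.get("Internal") is True and attached == (expected,)


network = first_object(NETWORK_SNAPSHOT)
ok = is_offline_only(network, first_object(CONTAINER_SNAPSHOT))
dual = is_offline_only(network, first_object(DUAL_HOMED_SNAPSHOT))
print(ok, dual)  # True False
```

The second container fails because it is also attached to `bridge`, which would give it a route off the internal network.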
@@ -0,0 +1,129 @@
"""FT-P-18 — No raw nav/AI-camera frame retention (AZ-421 / AC-8.5).

The full scenario:

1. After a completed Derkachi replay, walk both ``fdr-output/`` and
   the bind-mounted ``tile-cache`` for any file whose extension AND
   dimensions match the nav-camera raw-frame pattern (5472×3648 raw
   or 880×720 H.264-decoded).
2. Sum the size of every ``THUMBNAIL_LOG_EXTENSIONS`` file and
   extrapolate to an 8-hour flight.
3. Assert no raw-frame match (AC-4) and the extrapolated 8 h
   thumbnail-log size < 1 GB (AC-5).

The replay-duration input to the extrapolation comes from the FDR's
last record's ``monotonic_ms`` minus the first record's ``monotonic_ms``,
a public-boundary signal the runner already has.

Gated on:

* ``sitl_replay_ready``: full replay needs the SITL fixture (skip
  cleanly otherwise).
* ``TILE_CACHE_ROOT`` env var: bind-mount path inside the runner
  container. Defaults to ``/var/azaion/tile-cache``.
* ``runner.helpers.tile_cache_inspector``: covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
"""

from __future__ import annotations

import os
from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci

TILE_CACHE_ROOT_ENV = "TILE_CACHE_ROOT"
DEFAULT_TILE_CACHE_ROOT = Path("/var/azaion/tile-cache")


@pytest.mark.traces_to("AC-8.5,AC-4,AC-5,AC-6")
def test_ft_p_18_no_raw_retention(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-18 scenario (AC-8.5)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-18 requires `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture (AZ-595). Pure-logic AC-8.5 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    from runner.helpers import fdr_reader

    fdr_root = Path(evidence_dir).parent / f"run-{run_id}" / "fdr"
    tile_cache_root = Path(os.environ.get(TILE_CACHE_ROOT_ENV, str(DEFAULT_TILE_CACHE_ROOT)))

    # 1. Compute replay duration from the FDR archive (first to last record).
    monotonic_ms_min: int | None = None
    monotonic_ms_max: int | None = None
    for rec in fdr_reader.iter_records(fdr_root):
        if monotonic_ms_min is None or rec.monotonic_ms < monotonic_ms_min:
            monotonic_ms_min = rec.monotonic_ms
        if monotonic_ms_max is None or rec.monotonic_ms > monotonic_ms_max:
            monotonic_ms_max = rec.monotonic_ms
    if monotonic_ms_min is None or monotonic_ms_max is None:
        pytest.fail(f"FT-P-18: empty FDR archive at {fdr_root}")
    # 3_600_000 ms per hour.
    observed_duration_h = max(0.0, (monotonic_ms_max - monotonic_ms_min) / 3_600_000.0)
    if observed_duration_h <= 0:
        pytest.fail(
            f"FT-P-18: FDR archive at {fdr_root} has zero-or-negative duration "
            f"(min={monotonic_ms_min}, max={monotonic_ms_max}); cannot extrapolate."
        )

    # 2. Walk both roots once; gather (path, size, dims-if-jpeg) triples.
    file_specs: list[tuple[Path, int, tuple[int, int] | None]] = []
    thumbnail_log_size_bytes = 0
    for path in tci.walk_files(fdr_root, tile_cache_root):
        size_bytes = path.stat().st_size
        suffix = path.suffix.lower()
        # Only JPEGs are probed. Every other extension keeps dims=None and
        # therefore cannot match in detect_raw_frames.
        dims: tuple[int, int] | None = None
        if suffix in (".jpg", ".jpeg"):
            dims = tci.probe_jpeg_dimensions(path)
        file_specs.append((path, size_bytes, dims))
        if suffix in tci.THUMBNAIL_LOG_EXTENSIONS:
            thumbnail_log_size_bytes += size_bytes

    # 3. Evaluate AC-4 (no raw frames) + AC-5 (thumbnail log under budget).
    raw_report = tci.detect_raw_frames(file_specs)
    thumbnail_report = tci.evaluate_thumbnail_budget(
        thumbnail_log_size_bytes, observed_duration_h
    )

    # 4. NFR metrics.
    nfr_recorder.record_metric(
        "ft_p_18.raw_frame_candidates", float(raw_report.candidate_count), ac_id="AC-8.5"
    )
    nfr_recorder.record_metric(
        "ft_p_18.thumbnail_log_size_bytes",
        float(thumbnail_report.observed_size_bytes),
        ac_id="AC-5",
    )
    nfr_recorder.record_metric(
        "ft_p_18.thumbnail_log_extrapolated_8h_bytes",
        float(thumbnail_report.extrapolated_8h_size_bytes),
        ac_id="AC-5",
    )
    nfr_recorder.record_metric(
        "ft_p_18.replay_duration_h", observed_duration_h, ac_id="AC-5"
    )

    # 5. AC assertions.
    assert raw_report.passes, (
        f"AC-4 (no raw-frame retention) failed: {raw_report.candidate_count} "
        f"matching files found: "
        f"{[(str(c.path), c.dimensions) for c in raw_report.candidates]}"
    )
    assert thumbnail_report.passes, (
        f"AC-5 (thumbnail-log < {tci.THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H} GB / 8h) "
        f"failed: observed={thumbnail_report.observed_size_bytes} B over "
        f"{observed_duration_h:.3f} h → "
        f"extrapolated_8h={thumbnail_report.extrapolated_8h_size_bytes} B "
        f"(budget={thumbnail_report.max_size_bytes_per_8h} B)"
    )
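
Two small pieces of this scenario are worth seeing in isolation: the duration derived from `monotonic_ms`, and the order-insensitive dimension match. The sketch below assumes hypothetical timestamps, and `replay_duration_h` / `looks_like_raw_frame` are illustrative names that only mirror the logic.

```python
from __future__ import annotations

RAW_DIMS = (5472, 3648)
DECODED_DIMS = (880, 720)
# Sorting each pair makes the comparison independent of orientation.
TARGETS = {tuple(sorted(RAW_DIMS)), tuple(sorted(DECODED_DIMS))}


def replay_duration_h(first_ms: int, last_ms: int) -> float:
    """Hours between the first and last FDR record, clamped at zero."""
    return max(0.0, (last_ms - first_ms) / 3_600_000.0)


def looks_like_raw_frame(dims: tuple[int, int] | None) -> bool:
    """True when the probed dimensions match either nav-camera pattern."""
    return dims is not None and tuple(sorted(dims)) in TARGETS


# Hypothetical archive: first record at 1 s, last record 30 minutes later.
duration_h = replay_duration_h(1_000, 1_801_000)

portrait_raw = looks_like_raw_frame((3648, 5472))  # rotated raw frame
thumbnail = looks_like_raw_frame((640, 480))  # ordinary thumbnail
unprobed = looks_like_raw_frame(None)  # probe failed or file is not a JPEG
print(duration_h, portrait_raw, thumbnail, unprobed)
```

The `None` case is the one to keep in mind: a file whose dimensions could not be probed is never reported as a raw frame.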