[AZ-421] Batch 82: FT-P-15 + FT-P-16 + FT-P-18 cache / offline / no-raw-retention

FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest
entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source,
compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`).
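
Illustration only, not the code in this commit: a minimal sketch of the FT-P-15 check described above. The function name `check_manifest`, the `ManifestCheck` shape, and the `id` key used to match rejections are assumptions; the field names and the 0.5 m/px floor come from this message.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Field names as listed in the commit message; exact key casing is assumed.
REQUIRED_FIELDS = (
    "CRS", "tile_matrix", "dimension", "m_per_px",
    "capture_date", "source", "compression",
)
MIN_M_PER_PX = 0.5


@dataclass
class ManifestCheck:
    # entry id -> names of required fields that are absent
    missing: dict[str, list[str]] = field(default_factory=dict)
    # ids below the floor that no tile-load-rejected record covers
    below_floor_unrejected: list[str] = field(default_factory=list)

    @property
    def passes(self) -> bool:
        return not self.missing and not self.below_floor_unrejected


def check_manifest(entries, rejected_ids=()):
    """Check required fields and the resolution floor for each entry."""
    rejected = set(rejected_ids)
    result = ManifestCheck()
    for index, entry in enumerate(entries):
        entry_id = str(entry.get("id", f"#{index}"))  # "id" key is hypothetical
        absent = [name for name in REQUIRED_FIELDS if entry.get(name) is None]
        if absent:
            result.missing[entry_id] = absent
        m_per_px = entry.get("m_per_px")
        if isinstance(m_per_px, (int, float)) and m_per_px < MIN_M_PER_PX:
            if entry_id not in rejected:
                result.below_floor_unrejected.append(entry_id)
    return result
```

An entry below the floor only fails the check when no rejection id matches it, which mirrors the "or rejected" clause above.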

FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>`
snapshots; assert `Internal == true` AND SUT attached only to e2e-net.
The 0-egress semantic of AC-8.3 is enforced structurally.
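
Illustration only: a minimal sketch of the structural check above, assuming the usual `docker network inspect` / `docker inspect` JSON layout (top-level `Internal`, and `NetworkSettings.Networks` keyed by network name). The helper names are hypothetical and are not the evaluator added by this commit.

```python
import json


def first_object(raw_json):
    """`docker inspect` prints a JSON array; return its first object."""
    data = json.loads(raw_json)
    return data[0] if isinstance(data, list) else data


def is_offline_only(network, container, expected="e2e-net"):
    """True when the network is internal and is the container's only network."""
    internal = network.get("Internal") is True
    settings = container.get("NetworkSettings") or {}
    attached = sorted(settings.get("Networks") or {})
    return internal and attached == [expected]
```

Under Compose the network name may carry a project prefix, so `expected` would need to match whatever name the snapshot actually contains.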

FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF
parser, reject any file matching nav-camera raw pattern (5472x3648 or
880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB.
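
Illustration only: a minimal stdlib sketch of a JPEG dimension probe of the kind described above. It walks the marker segments until the first start-of-frame (SOF) header and reads height and width from it. The names `read_jpeg_size` and `looks_like_raw_frame` are hypothetical, and exact-size matching (no rotated variants) is an assumption.

```python
import io
import struct

# SOF0..SOF15 carry the frame size; 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC)
# share the numeric range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers with no length field: TEM and RST0..RST7.
STANDALONE_MARKERS = {0x01} | set(range(0xD0, 0xD8))
# Nav-camera raw sizes quoted in this message, as (width, height).
RAW_FRAME_SIZES = {(5472, 3648), (880, 720)}


def read_jpeg_size(stream):
    """Return (width, height) from the first SOF segment, or None."""
    if stream.read(2) != b"\xff\xd8":  # every JPEG starts with SOI
        return None
    while True:
        if stream.read(1) != b"\xff":
            return None  # truncated file or lost marker sync
        marker = stream.read(1)
        while marker == b"\xff":  # skip optional fill bytes
            marker = stream.read(1)
        if not marker:
            return None
        code = marker[0]
        if code in STANDALONE_MARKERS:
            continue
        if code in (0xD9, 0xDA):  # EOI or start of scan: no SOF was seen
            return None
        header = stream.read(2)
        if len(header) < 2:
            return None
        (length,) = struct.unpack(">H", header)  # length includes itself
        if length < 2:
            return None
        if code in SOF_MARKERS:
            body = stream.read(5)
            if len(body) < 5:
                return None
            _precision, height, width = struct.unpack(">BHH", body)
            return width, height
        stream.seek(length - 2, io.SEEK_CUR)  # skip this segment's payload


def looks_like_raw_frame(size):
    """True when (width, height) equals one of the raw nav-camera sizes."""
    return size in RAW_FRAME_SIZES
```

Only the headers are read, so the probe stays cheap even on large files; a file opened with `open(path, "rb")` can be passed as `stream`.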

Adds runner.helpers.tile_cache_inspector with five evaluators
(manifest schema, offline mode, raw-frame detection, thumbnail budget,
JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43
new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746).
Scenarios skip locally when SITL replay fixture or docker-inspect
env vars are missing; production hooks (cache-self-check FDR record,
tile-load-rejected events, docker-inspect snapshots) are tracked
outside this task.
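
Illustration only: a minimal sketch of the thumbnail-budget extrapolation listed among the evaluators above, assuming a linear scale-up from the observed replay duration and 1 GB = 10^9 bytes. Both the function names and the GB definition are assumptions, not taken from the module.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabyte
TARGET_HOURS = 8.0
BUDGET_GB = 1.0


def extrapolate_to_8h(observed_bytes, observed_hours):
    """Scale the observed log size linearly to an 8-hour flight."""
    if observed_hours <= 0:
        raise ValueError("observed duration must be positive")
    return observed_bytes * TARGET_HOURS / observed_hours


def within_thumbnail_budget(observed_bytes, observed_hours):
    """True when the extrapolated 8 h size is strictly under the budget."""
    projected = extrapolate_to_8h(observed_bytes, observed_hours)
    return projected < BUDGET_GB * BYTES_PER_GB
```

For a 60 s replay the scale factor is 480, so a 1 MB log projects to roughly 0.48 GB and passes, while a 3 MB log projects to roughly 1.44 GB and fails.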

See _docs/03_implementation/batch_82_report.md +
reviews/batch_82_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 15:09:58 +03:00
parent b0296da911
commit 7d1288e4ba
9 changed files with 1693 additions and 3 deletions
@@ -0,0 +1,107 @@
"""FT-P-15 — Tile cache manifest schema + resolution floor (AZ-421 / AC-8.1).

The full scenario:

1. SUT cold-starts against the bind-mounted ``tile-cache-fixture`` and
   emits a one-time ``cache-self-check`` FDR record carrying every
   manifest entry it loaded (CRS, tile_matrix, dimension, m_per_px,
   capture_date, source, compression).
2. The SUT additionally emits ``tile-load-rejected`` FDR records for
   any entry the freshness/floor gate rejected at load time.
3. The test parses the FDR archive, evaluates the manifest schema
   contract (AC-1: every required field present; AC-2: every entry
   either ≥ 0.5 m/px or rejected), and asserts the report passes.

AC-1: every required field present per entry (``MANIFEST_REQUIRED_FIELDS``).
AC-2: m/px ≥ 0.5 OR rejected by FDR ``tile-load-rejected``.
AC-3 of FT-P-15-spec maps to AC-6 of the task (parameterisation).

Gated on:

* ``runner.helpers.fdr_reader``: owned by AZ-594; present.
* ``runner.helpers.tile_cache_inspector.evaluate_manifest_schema``:
  pure-logic evaluator covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
* ``sitl_replay_ready``: skip-gates the scenario when no FDR archive
  is present locally.
"""
from __future__ import annotations

from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci


@pytest.mark.traces_to("AC-8.1,AC-1,AC-2,AC-6")
def test_ft_p_15_cache_schema(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-15 scenario (AC-8.1)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-15 requires `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture that includes the FDR `cache-self-check` record + any "
            "`tile-load-rejected` records (AZ-595 + AZ-421 fixture builder). "
            "Pure-logic AC-8.1 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    from runner.helpers import fdr_reader

    fdr_root = Path(evidence_dir).parent / f"run-{run_id}" / "fdr"
    manifest_entries: list[dict] = []
    rejected_ids: list[str] = []
    for rec in fdr_reader.iter_records(fdr_root):
        if rec.record_type == tci.CACHE_SELF_CHECK_FDR_KIND:
            raw_entries = rec.payload.get("entries")
            if isinstance(raw_entries, list):
                for entry in raw_entries:
                    if isinstance(entry, dict):
                        manifest_entries.append(entry)
        elif rec.record_type == tci.TILE_LOAD_REJECTED_FDR_KIND:
            entry_id = rec.payload.get("id") or rec.payload.get("tile_id")
            if isinstance(entry_id, str) and entry_id:
                rejected_ids.append(entry_id)

    if not manifest_entries:
        pytest.fail(
            f"FT-P-15: no `{tci.CACHE_SELF_CHECK_FDR_KIND}` FDR record with "
            f"manifest entries found under {fdr_root}. The fixture builder "
            "must emit one at cold start."
        )

    report = tci.evaluate_manifest_schema(
        manifest_entries,
        tile_load_rejected_ids=rejected_ids,
    )

    nfr_recorder.record_metric(
        "ft_p_15.manifest_entries", float(report.total_entries), ac_id="AC-8.1"
    )
    nfr_recorder.record_metric(
        "ft_p_15.entries_missing_fields",
        float(len(report.entries_with_missing_fields)),
        ac_id="AC-1",
    )
    nfr_recorder.record_metric(
        "ft_p_15.entries_below_floor",
        float(len(report.entries_below_floor)),
        ac_id="AC-2",
    )

    assert report.passes, (
        "AC-8.1 (manifest schema + ≥0.5 m/px floor) failed: "
        f"total={report.total_entries}, "
        f"missing_fields={[(e.entry_id, e.missing_fields) for e in report.entries_with_missing_fields]}, "
        f"below_floor_not_rejected="
        f"{[e.entry_id for e in report.entries_below_floor if e.entry_id not in report.rejected_below_floor_ids]}"
    )
@@ -0,0 +1,121 @@
"""FT-P-16 — Offline-only operation (AZ-421 / AC-8.3, RESTRICT-SAT-1).

The full scenario:

1. The SUT runs against the local tile-cache mount only.
2. The Docker compose harness attaches the SUT container to
   ``e2e-net`` with ``Internal: true``; Docker itself blocks egress
   to anything outside that network (AZ-406 owns the compose wiring).
3. A 60 s Derkachi replay generates load; during the replay the
   scenario reads ``docker network inspect e2e-net`` and
   ``docker inspect <sut-container>`` and asserts:

   - ``e2e-net.Internal == true``
   - The SUT container is attached to ``e2e-net`` only.

The "0 packets to non-e2e-net destinations" semantic of AC-8.3 is
enforced structurally: there is no other network the SUT can reach,
so the packet count is provably 0 without per-packet counters.

Gated on:

* ``sitl_replay_ready``: full replay needs the SITL fixture (skip
  cleanly otherwise).
* ``DOCKER_NETWORK_INSPECT_PATH`` / ``DOCKER_CONTAINER_INSPECT_PATH``
  env vars: point at JSON files produced by the fixture builder
  ahead of test invocation. When unset, the scenario skips with a
  clear reason (the docker CLI is not available inside the runner
  container without volume-mounting the docker socket; the fixture
  builder snapshots the inspect output instead).
* ``runner.helpers.tile_cache_inspector.evaluate_offline_mode``:
  pure-logic evaluator covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
"""
from __future__ import annotations

import json
import os
from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci

DOCKER_NETWORK_INSPECT_ENV = "DOCKER_NETWORK_INSPECT_PATH"
DOCKER_CONTAINER_INSPECT_ENV = "DOCKER_CONTAINER_INSPECT_PATH"


@pytest.mark.traces_to("AC-8.3,AC-3,AC-6,RESTRICT-SAT-1")
def test_ft_p_16_offline_only(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-16 scenario (AC-8.3 / RESTRICT-SAT-1)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-16 needs `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture (AZ-595). Pure-logic AC-8.3 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    net_path = os.environ.get(DOCKER_NETWORK_INSPECT_ENV)
    ctr_path = os.environ.get(DOCKER_CONTAINER_INSPECT_ENV)
    if not net_path or not ctr_path:
        pytest.skip(
            f"FT-P-16 needs `{DOCKER_NETWORK_INSPECT_ENV}` and "
            f"`{DOCKER_CONTAINER_INSPECT_ENV}` env vars set to JSON files "
            "produced by the compose harness (`docker network inspect "
            "e2e-net` + `docker inspect gps-denied-onboard`). The fixture "
            "builder snapshots both before the test runs."
        )

    net_inspect = _load_docker_inspect_object(Path(net_path), kind="network")
    ctr_inspect = _load_docker_inspect_object(Path(ctr_path), kind="container")

    report = tci.evaluate_offline_mode(net_inspect, ctr_inspect)

    nfr_recorder.record_metric(
        "ft_p_16.network_internal", 1.0 if report.network_internal else 0.0, ac_id="AC-8.3"
    )
    nfr_recorder.record_metric(
        "ft_p_16.container_network_count", float(len(report.container_networks)), ac_id="AC-3"
    )

    assert report.passes, (
        "AC-8.3 (offline-only operation) failed: "
        f"network_internal={report.network_internal}, "
        f"container_networks={report.container_networks}, "
        f"expected_network={report.expected_network}"
    )


def _load_docker_inspect_object(path: Path, *, kind: str) -> dict:
    """Load a single inspect object from a JSON file.

    ``docker inspect`` returns a JSON array. The scenario expects
    either the wrapped array OR an unwrapped single-object payload;
    accept both shapes for forwards-compatibility with fixture
    builders that pre-unwrap.
    """
    if not path.exists():
        pytest.fail(f"FT-P-16: {kind} inspect JSON not found at {path}")
    raw = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(raw, list):
        if not raw:
            pytest.fail(f"FT-P-16: {kind} inspect JSON at {path} is an empty array")
        if not isinstance(raw[0], dict):
            pytest.fail(
                f"FT-P-16: {kind} inspect JSON at {path} array element is not an object"
            )
        return raw[0]
    if isinstance(raw, dict):
        return raw
    pytest.fail(
        f"FT-P-16: {kind} inspect JSON at {path} is neither object nor array: "
        f"type={type(raw).__name__}"
    )
    return {}  # unreachable; pytest.fail raises
@@ -0,0 +1,129 @@
"""FT-P-18 — No raw nav/AI-camera frame retention (AZ-421 / AC-8.5).

The full scenario:

1. After a completed Derkachi replay, walk both ``fdr-output/`` and
   the bind-mounted ``tile-cache`` for any file whose extension AND
   dimensions match the nav-camera raw-frame pattern (5472×3648 raw
   or 880×720 H.264-decoded).
2. Sum the size of every ``THUMBNAIL_LOG_EXTENSIONS`` file and
   extrapolate to an 8-hour flight.
3. Assert no raw-frame match (AC-4) and the extrapolated 8 h
   thumbnail-log size < 1 GB (AC-5).

The replay-duration input to the extrapolation comes from the FDR's
last record's ``monotonic_ms`` minus the first record's ``monotonic_ms``,
a public-boundary signal the runner already has.

Gated on:

* ``sitl_replay_ready``: full replay needs the SITL fixture (skip
  cleanly otherwise).
* ``TILE_CACHE_ROOT`` env var: bind-mount path inside the runner
  container. Defaults to ``/var/azaion/tile-cache``.
* ``runner.helpers.tile_cache_inspector``: covered by
  ``e2e/_unit_tests/helpers/test_tile_cache_inspector.py``.
"""
from __future__ import annotations

import os
from pathlib import Path

import pytest

from runner.helpers import tile_cache_inspector as tci

TILE_CACHE_ROOT_ENV = "TILE_CACHE_ROOT"
DEFAULT_TILE_CACHE_ROOT = Path("/var/azaion/tile-cache")


@pytest.mark.traces_to("AC-8.5,AC-4,AC-5,AC-6")
def test_ft_p_18_no_raw_retention(
    fc_adapter: str,
    vio_strategy: str,
    evidence_dir,  # type: ignore[no-untyped-def]
    run_id: str,
    nfr_recorder,  # type: ignore[no-untyped-def]
    sitl_replay_ready: bool,
) -> None:
    """Full FT-P-18 scenario (AC-8.5)."""
    if not sitl_replay_ready:
        pytest.skip(
            "FT-P-18 requires `E2E_SITL_REPLAY_DIR` to point at a SITL replay "
            "fixture (AZ-595). Pure-logic AC-8.5 coverage lives in "
            "e2e/_unit_tests/helpers/test_tile_cache_inspector.py."
        )

    from runner.helpers import fdr_reader

    fdr_root = Path(evidence_dir).parent / f"run-{run_id}" / "fdr"
    tile_cache_root = Path(os.environ.get(TILE_CACHE_ROOT_ENV, str(DEFAULT_TILE_CACHE_ROOT)))

    # 1. Compute replay duration from the FDR archive (first to last record).
    monotonic_ms_min: int | None = None
    monotonic_ms_max: int | None = None
    for rec in fdr_reader.iter_records(fdr_root):
        if monotonic_ms_min is None or rec.monotonic_ms < monotonic_ms_min:
            monotonic_ms_min = rec.monotonic_ms
        if monotonic_ms_max is None or rec.monotonic_ms > monotonic_ms_max:
            monotonic_ms_max = rec.monotonic_ms
    if monotonic_ms_min is None or monotonic_ms_max is None:
        pytest.fail(f"FT-P-18: empty FDR archive at {fdr_root}")
    observed_duration_h = max(0.0, (monotonic_ms_max - monotonic_ms_min) / 3_600_000.0)
    if observed_duration_h <= 0:
        pytest.fail(
            f"FT-P-18: FDR archive at {fdr_root} has zero-or-negative duration "
            f"(min={monotonic_ms_min}, max={monotonic_ms_max}); cannot extrapolate."
        )

    # 2. Walk both roots once; gather (path, size, dims-if-jpeg) triples.
    file_specs: list[tuple[Path, int, tuple[int, int] | None]] = []
    thumbnail_log_size_bytes = 0
    for path in tci.walk_files(fdr_root, tile_cache_root):
        size_bytes = path.stat().st_size
        suffix = path.suffix.lower()
        dims: tuple[int, int] | None = None
        if suffix in (".jpg", ".jpeg"):
            dims = tci.probe_jpeg_dimensions(path)
        file_specs.append((path, size_bytes, dims))
        if suffix in tci.THUMBNAIL_LOG_EXTENSIONS:
            thumbnail_log_size_bytes += size_bytes

    # 3. Evaluate AC-4 (no raw frames) + AC-5 (thumbnail log under budget).
    raw_report = tci.detect_raw_frames(file_specs)
    thumbnail_report = tci.evaluate_thumbnail_budget(
        thumbnail_log_size_bytes, observed_duration_h
    )

    # 4. NFR metrics.
    nfr_recorder.record_metric(
        "ft_p_18.raw_frame_candidates", float(raw_report.candidate_count), ac_id="AC-8.5"
    )
    nfr_recorder.record_metric(
        "ft_p_18.thumbnail_log_size_bytes",
        float(thumbnail_report.observed_size_bytes),
        ac_id="AC-5",
    )
    nfr_recorder.record_metric(
        "ft_p_18.thumbnail_log_extrapolated_8h_bytes",
        float(thumbnail_report.extrapolated_8h_size_bytes),
        ac_id="AC-5",
    )
    nfr_recorder.record_metric(
        "ft_p_18.replay_duration_h", observed_duration_h, ac_id="AC-5"
    )

    # 5. AC assertions.
    assert raw_report.passes, (
        f"AC-4 (no raw-frame retention) failed: {raw_report.candidate_count} "
        f"matching files found: "
        f"{[(str(c.path), c.dimensions) for c in raw_report.candidates]}"
    )
    assert thumbnail_report.passes, (
        f"AC-5 (thumbnail-log < {tci.THUMBNAIL_LOG_MAX_SIZE_GB_PER_8H} GB / 8h) "
        f"failed: observed={thumbnail_report.observed_size_bytes} B over "
        f"{observed_duration_h:.3f} h → "
        f"extrapolated_8h={thumbnail_report.extrapolated_8h_size_bytes} B "
        f"(budget={thumbnail_report.max_size_bytes_per_8h} B)"
    )