[AZ-422] Add FT-P-17 + FT-N-06 mid-flight tile blackbox tests

Implement the AC-8.4 and AC-NEW-6 blackbox scenarios for mid-flight
tile generation, dedup, landing-time upload, and freshness gating.

Helpers:
- runner/helpers/mid_flight_tile_evaluator.py — pure-logic evaluators
  for tile generation rate, Mode B Fact #105 schema check, footprint+
  GSD dedup (via geo.distance_m), upload-audit reconciliation, and
  the AC-5/AC-6 capture_utc + freshness-gate checks.
- runner/helpers/mock_suite_sat_audit.py — httpx wrapper for the
  mock-suite-sat-service /tiles/audit endpoint with strict response-
  shape validation.

Scenarios:
- tests/positive/test_ft_p_17_mid_flight_tiles.py
- tests/negative/test_ft_n_06_mid_flight_freshness.py

Both skip when sitl_replay_ready is false and fail loudly when fixture
records are missing (tests-as-gates discipline). 52 new unit tests
(41 evaluator + 11 audit client) cover every helper branch.

Review: PASS_WITH_WARNINGS (2 Low — duplicate haversine carry-over,
upstream production dependency surface).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 15:28:39 +03:00
parent 1ee54b414b
commit 5def1a3eb3
11 changed files with 1782 additions and 2 deletions
@@ -1,78 +0,0 @@
# FT-P-17 + FT-N-06 — Mid-flight tile generation + freshness
**Task**: AZ-422_ft_p_17_ftn_06_mid_flight_tiles
**Name**: Mid-flight tile generation + current-timestamp freshness (AC-8.4, AC-NEW-6)
**Description**: Implement FT-P-17 (continuous orthorectification of nav-camera frames into basemap-projected tiles, dedup, locally stored for landing-time upload; landing → upload to mock-suite-sat-service with HTTP 202) and FT-N-06 (mid-flight tiles' `capture_date` within ±60 s of generation wall-clock; treated as fresh by the freshness gate).
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive + Negative / Tile generation (epic AZ-262)
**Tracker**: AZ-422
**Epic**: AZ-262 (E-BBT)
## Problem
Mid-flight tile generation is the project's contribution to the parent-suite Service voting layer. Two related properties must be measured: that tiles are generated + uploaded with quality metadata (FT-P-17, AC-8.4), and that they are timestamped as current and treated as fresh (FT-N-06, AC-NEW-6).
## Outcome
- pytest scenarios at `e2e/tests/positive/test_ft_p_17_mid_flight_tiles.py` (FT-P-17) and `e2e/tests/negative/test_ft_n_06_mid_flight_freshness.py` (FT-N-06).
- FT-P-17: 5 min Derkachi segment; SUT generates and writes tiles to FDR's `mid-flight-tile-output/`; ≥1 tile per ~3 s of high-quality nav frames; each tile carries quality metadata sufficient for the Service voting layer (per Mode B Fact #105); landing-event simulation triggers upload; mock-suite-sat-service receives all tiles with HTTP 202.
- FT-N-06: same 5 min replay; inspect each generated tile's manifest entry; assert `capture_date` within ±60 s of generation wall-clock; assert each is treated as fresh by the freshness gate (no `tile-load-rejected: stale` events for the freshly generated tiles).
## Scope
### Included
- 5 min Derkachi replay against empty `mid-flight-tile-output`.
- Per-tile metadata inspection (FT-P-17 quality fields; FT-N-06 capture_date).
- Mock-suite-sat-service audit-log read for upload verification.
- Freshness-gate event check (FT-N-06).
### Excluded
- Cache-poisoning probability budget (multi-flight) — owned by NFT-SEC-01 (AZ-436).
- Stale-tile rejection on aged source tiles — owned by FT-N-05 (AZ-427).
## Acceptance Criteria
**AC-1: tile generation rate**
Given a 5 min Derkachi replay against empty `mid-flight-tile-output`
Then ≥1 tile is generated per ~3 s of high-quality nav frames (high-quality criterion per AC-2.1a normal-segment definition).
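A minimal sketch of how the AC-1 rate check could be expressed as pure logic. The function name, its parameters, and the choice to round the required count down are illustrative assumptions; the spec says only "~3 s" and does not fix a tolerance.

```python
import math


def meets_generation_rate(tile_count: int,
                          high_quality_seconds: float,
                          period_s: float = 3.0) -> bool:
    """Return True when at least one tile exists per `period_s` of
    high-quality nav footage.

    Hypothetical helper: rounding down is an assumed reading of "~3 s".
    """
    if high_quality_seconds < 0 or period_s <= 0:
        raise ValueError("durations must be non-negative and period positive")
    required = math.floor(high_quality_seconds / period_s)
    return tile_count >= required
```

For a 5 min segment that is high-quality throughout, this reading requires at least 100 tiles.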
**AC-2: tile quality metadata sufficiency**
Given each generated tile
Then the tile's metadata includes the fields documented in Mode B Fact #105 sufficient for the Service voting layer (e.g., onboard-asserted geo-alignment, ground sample distance, heading at capture, registration-quality score). Missing fields fail the AC.
**AC-3: dedup**
Given the tile-generation stream
Then no two generated tiles cover the same (footprint, GSD) bin (within ±1 m and ±5 % GSD); duplicates are dropped at write-time.
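The dedup criterion can be sketched as a pairwise check. This sketch assumes that "same footprint" means tile centres within 1 m of each other and that the GSD tolerance is relative to the larger of the two values; the `Tile` fields, the local haversine `distance_m`, and `find_duplicates` are hypothetical names, not the project's helpers.

```python
import math
from itertools import combinations
from typing import NamedTuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


class Tile(NamedTuple):
    tile_id: str
    lat_deg: float  # footprint centre latitude (assumed representation)
    lon_deg: float  # footprint centre longitude
    gsd_m: float    # ground sample distance in metres per pixel


def distance_m(a: Tile, b: Tile) -> float:
    """Great-circle (haversine) distance between two tile centres."""
    phi1, phi2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon_deg - a.lon_deg)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def find_duplicates(tiles, max_dist_m=1.0, max_gsd_rel=0.05):
    """Return ID pairs that fall in the same (footprint, GSD) bin."""
    duplicates = []
    for a, b in combinations(tiles, 2):
        same_footprint = distance_m(a, b) <= max_dist_m
        same_gsd = abs(a.gsd_m - b.gsd_m) <= max_gsd_rel * max(a.gsd_m, b.gsd_m)
        if same_footprint and same_gsd:
            duplicates.append((a.tile_id, b.tile_id))
    return duplicates
```

AC-3 passes under this reading when `find_duplicates` returns an empty list for the generated tile set. The pairwise loop is quadratic, which is acceptable for the tile counts a 5 min replay produces.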
**AC-4: landing-event upload**
Given a simulated landing event after replay
Then the SUT uploads all generated tiles to `mock-suite-sat-service`; the mock's audit log shows HTTP 202 for every tile.
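One way to reconcile the generated tiles against the mock's audit log is sketched below. The audit-entry shape (`tile_id` and `status` keys) is an assumption made for illustration; the actual response of the mock's audit endpoint is not described in this task.

```python
def reconcile_uploads(generated_ids, audit_entries):
    """Compare generated tile IDs with audit-log entries.

    Hypothetical shape: each audit entry is a dict with "tile_id" and
    an integer HTTP "status". Returns the IDs that never appear in the
    audit log and the IDs with at least one non-202 response.
    """
    statuses = {}
    for entry in audit_entries:
        statuses.setdefault(entry["tile_id"], []).append(entry["status"])

    generated = set(generated_ids)
    missing = sorted(generated - set(statuses))
    not_accepted = sorted(
        tile_id for tile_id in generated & set(statuses)
        if any(status != 202 for status in statuses[tile_id])
    )
    return {"missing": missing, "not_accepted": not_accepted}
```

Under these assumptions AC-4 holds when both lists are empty.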
**AC-5: FT-N-06 — current-timestamp**
Given each generated tile's manifest entry
Then `|capture_date − generation_wall_clock| ≤ 60 s`.
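The timestamp bound can be sketched with the standard library. The function name is hypothetical, and the sketch assumes both timestamps have already been parsed into timezone-aware `datetime` values, with the generation wall-clock taken from the SUT's per-tile log line rather than from the runner.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(seconds=60)


def capture_date_is_current(capture_date: datetime,
                            generation_wall_clock: datetime,
                            max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True when the absolute difference between the two
    timestamps is at most max_skew (inclusive bound)."""
    if capture_date.tzinfo is None or generation_wall_clock.tzinfo is None:
        # Naive timestamps would make the comparison ambiguous.
        raise ValueError("timestamps must be timezone-aware")
    return abs(capture_date - generation_wall_clock) <= max_skew
```

Rejecting naive timestamps is a design choice in this sketch: it keeps a missing UTC offset from silently passing or failing the 60 s bound.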
**AC-6: FT-N-06 — fresh treatment**
Given the freshly generated tiles loaded by the freshness gate
Then no `tile-load-rejected: stale` event is observed for these tiles in FDR.
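The fresh-treatment check reduces to filtering FDR events, as in the sketch below. The event representation is an assumption: each event is taken to be a dict with `event`, `reason`, and `tile_id` keys, with `tile-load-rejected: stale` split into an event name and a reason. The real FDR record format may differ.

```python
def stale_rejections(events, fresh_tile_ids):
    """Return the IDs of freshly generated tiles that were rejected
    as stale.

    Hypothetical event shape: {"event": ..., "reason": ..., "tile_id": ...}.
    """
    fresh = set(fresh_tile_ids)
    return [
        event["tile_id"]
        for event in events
        if event.get("event") == "tile-load-rejected"
        and event.get("reason") == "stale"
        and event.get("tile_id") in fresh
    ]
```

AC-6 passes under this reading when the returned list is empty. Stale rejections for tiles outside the freshly generated set are ignored here, since the spec assigns them to FT-N-05.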
**AC-7: parameterization**
Given conftest parameterization
Then both methods run per `(fc_adapter, vio_strategy)`.
## System Under Test Boundary
End-to-end through public boundaries.
- **Allowed**: FDR read for tile manifest + freshness events; mock-suite-sat-service audit log read; landing-event simulation (a documented public-input mechanism — `simulate_landing()` MAVLink command OR config flag).
- **Forbidden**: importing C13 / C6 internal state; stubbing the freshness gate.
## Constraints
- The mock-suite-sat-service must accept all well-formed publish requests with HTTP 202 (per its declared mock contract); rejection here is itself a defect signal.
- "Generation wall-clock" is the SUT's emission timestamp on the per-tile generation log line; not the runner's wall-clock.
## Document Dependencies
- `_docs/02_document/tests/blackbox-tests.md` § FT-P-17, § FT-N-06