mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 00:31:14 +00:00
[AZ-422] Add FT-P-17 + FT-N-06 mid-flight tile blackbox tests
Implement the AC-8.4 and AC-NEW-6 blackbox scenarios for mid-flight tile generation, dedup, landing-time upload, and freshness gating.

Helpers:
- runner/helpers/mid_flight_tile_evaluator.py — pure-logic evaluators for tile generation rate, Mode B Fact #105 schema check, footprint + GSD dedup (via geo.distance_m), upload-audit reconciliation, and the AC-5/AC-6 capture_utc + freshness-gate checks.
- runner/helpers/mock_suite_sat_audit.py — httpx wrapper for the mock-suite-sat-service /tiles/audit endpoint with strict response-shape validation.

Scenarios:
- tests/positive/test_ft_p_17_mid_flight_tiles.py
- tests/negative/test_ft_n_06_mid_flight_freshness.py

Both skip when sitl_replay_ready is false and fail loudly when fixture records are missing (tests-as-gates discipline).

52 new unit tests (41 evaluator + 11 audit client) cover every helper branch.

Review: PASS_WITH_WARNINGS (2 Low — duplicate haversine carry-over, upstream production dependency surface).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,107 @@
# Batch 83 — AZ-422 (FT-P-17 + FT-N-06 mid-flight tile generation + freshness)

**Tracker**: AZ-422
**Tasks**: 1 task / 3 complexity points
**Date**: 2026-05-17
**Verdict**: PASS_WITH_WARNINGS
**Review**: `_docs/03_implementation/reviews/batch_83_review.md`

## Scope

- FT-P-17 (positive, AC-8.4): mid-flight orthorectified tile generation, per-tile quality metadata, dedup, landing-time upload to mock-suite-sat-service.
- FT-N-06 (negative, AC-NEW-6): per-tile `capture_utc` within ±60 s of generation wall-clock; freshness gate must not reject freshly generated tiles as stale.
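
The ±60 s bound in FT-N-06 can be illustrated with a small timestamp comparison. This is a minimal sketch, not the code in `mid_flight_tile_evaluator.py`: the function names, the treatment of naive timestamps as UTC, and the use of wall-clock timestamps alone (the real evaluator is described as also checking monotonic-ms drift) are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Tolerance taken from the ±60 s bound stated for FT-N-06.
MAX_DRIFT = timedelta(seconds=60)


def _parse_utc(value: str) -> datetime:
    """Parse an ISO-8601 timestamp into an aware datetime."""
    # A trailing "Z" is rewritten because fromisoformat rejects it before Python 3.11.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Hypothetical policy: treat naive timestamps as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def capture_is_fresh(capture_utc: str, generated_at: str) -> bool:
    """Return True when the two timestamps differ by at most 60 s."""
    drift = abs(_parse_utc(capture_utc) - _parse_utc(generated_at))
    return drift <= MAX_DRIFT
```

The comparison is inclusive, so a drift of exactly 60 s passes; whether the real evaluator treats the boundary the same way is not stated in this report.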

Both scenarios parameterize across `(fc_adapter ∈ {ardupilot, inav}, vio_strategy ∈ {okvis2, klt_ransac, vins_mono})` → 12 collected test cases.
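
The count of 12 follows from the Cartesian product of the two fixture axes, taken once per scenario. The sketch below only reproduces that arithmetic with `itertools.product`; the constant and function names are illustrative, and the actual fixtures live in the conftest.

```python
from itertools import product

FC_ADAPTERS = ("ardupilot", "inav")
VIO_STRATEGIES = ("okvis2", "klt_ransac", "vins_mono")
SCENARIOS = ("FT-P-17", "FT-N-06")


def variants_per_scenario():
    """All (fc_adapter, vio_strategy) pairs a single scenario is run with."""
    return list(product(FC_ADAPTERS, VIO_STRATEGIES))


def collected_cases():
    """Every (scenario, fc_adapter, vio_strategy) triple across both scenarios."""
    return [
        (scenario, fc, vio)
        for scenario in SCENARIOS
        for fc, vio in variants_per_scenario()
    ]


if __name__ == "__main__":
    print(len(variants_per_scenario()), "variants per scenario")
    print(len(collected_cases()), "collected cases in total")
```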

## Files

### Created

- `e2e/runner/helpers/mid_flight_tile_evaluator.py` — pure-logic evaluators for AC-1..AC-6:
  * `evaluate_tile_generation_rate` (AC-1)
  * `evaluate_tile_quality_metadata` (AC-2; Mode B Fact #105 schema mirror)
  * `evaluate_dedup` (AC-3; Vincenty distance via `geo.distance_m` + GSD-fraction check)
  * `evaluate_upload_acks` (AC-4)
  * `evaluate_capture_date_freshness` (AC-5; ISO-8601 parse + monotonic-ms drift)
  * `evaluate_freshness_gate` (AC-6)
- `e2e/runner/helpers/mock_suite_sat_audit.py` — thin `httpx` client for `GET /tiles/audit`; input-validation failures, HTTP errors, and JSON shape errors are all raised as `RuntimeError`.
- `e2e/tests/positive/test_ft_p_17_mid_flight_tiles.py` — FT-P-17 scenario covering AC-1..AC-4 + AC-7.
- `e2e/tests/negative/test_ft_n_06_mid_flight_freshness.py` — FT-N-06 scenario covering AC-5 + AC-6 + AC-7.
- `e2e/_unit_tests/helpers/test_mid_flight_tile_evaluator.py` — 41 unit tests covering happy paths + boundary + error cases for every evaluator.
- `e2e/_unit_tests/helpers/test_mock_suite_sat_audit.py` — 11 unit tests covering happy paths + every error branch with `httpx.MockTransport`.
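
The "every failure becomes a `RuntimeError`" convention of the audit client can be sketched for the response-shape part alone. The HTTP transport is deliberately left out so the example stays standard-library only, and the payload layout (`{"tiles": [{"tile_id": ..., "status": ...}]}`) and the function name are hypothetical; the real `/tiles/audit` schema is not given in this report.

```python
from __future__ import annotations

import json
from typing import Any


def parse_audit_body(body: str) -> list[dict[str, Any]]:
    """Validate a (hypothetical) audit payload and return its tile entries.

    Any decoding or shape problem is re-raised as RuntimeError so callers
    deal with a single exception type.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"audit response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict) or not isinstance(payload.get("tiles"), list):
        raise RuntimeError("audit response must be an object with a 'tiles' list")

    for index, entry in enumerate(payload["tiles"]):
        if not isinstance(entry, dict):
            raise RuntimeError(f"tiles[{index}] is not an object")
        if not isinstance(entry.get("tile_id"), str):
            raise RuntimeError(f"tiles[{index}].tile_id must be a string")
        status = entry.get("status")
        # bool is a subclass of int, so it is excluded explicitly.
        if not isinstance(status, int) or isinstance(status, bool):
            raise RuntimeError(f"tiles[{index}].status must be an integer")

    return payload["tiles"]
```

Collapsing all failure modes into one exception type keeps scenario code simple: a test only has to distinguish "audit available and well-formed" from "anything else".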

### Modified

- `e2e/_unit_tests/test_directory_layout.py` — registered 4 new paths under the AZ-406 layout invariant.

## Test Results

```
$ pytest _unit_tests/helpers/test_mid_flight_tile_evaluator.py \
         _unit_tests/helpers/test_mock_suite_sat_audit.py \
         _unit_tests/test_directory_layout.py -x
============================= 157 passed in 1.07s ==============================
```

Scenario collection:

```
$ pytest tests/positive/test_ft_p_17_mid_flight_tiles.py \
         tests/negative/test_ft_n_06_mid_flight_freshness.py --collect-only
collected 12 items (6 per scenario: {ardupilot,inav} × {okvis2,klt_ransac,vins_mono})
```

(The pre-existing `OSError: Read-only file system: '/e2e-results'` in `pytest_sessionfinish` is unrelated NFR-recorder teardown noise; it does not affect collection or assertion logic.)

## AC Verification

| AC | Coverage |
|----|----------|
| AC-1 tile rate ≥1 per ~3 s high-quality nav frames | `evaluate_tile_generation_rate` + scenario assertion + 5 unit tests |
| AC-2 quality metadata (Mode B Fact #105) | `evaluate_tile_quality_metadata` + scenario assertion + 7 unit tests |
| AC-3 dedup (±1 m footprint AND ±5 % GSD) | `evaluate_dedup` + scenario assertion + 8 unit tests |
| AC-4 landing upload HTTP 202 for every tile | `evaluate_upload_acks` + `fetch_audit` + scenario assertion + 5 unit tests + 11 HTTP unit tests |
| AC-5 \|capture_utc − generated_at\| ≤ 60 s | `evaluate_capture_date_freshness` + scenario assertion + 8 unit tests |
| AC-6 no `tile-load-rejected: stale` for fresh tiles | `evaluate_freshness_gate` + scenario assertion + 7 unit tests |
| AC-7 parameterization | 12 collected variants (6 per scenario) via conftest `fc_adapter` / `vio_strategy` fixtures |
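
The AC-3 criterion combines two tolerances: two tiles count as duplicates only when their footprints lie within 1 m of each other AND their GSD values differ by at most 5 %. The sketch below illustrates that conjunction under stated assumptions: it uses a plain haversine distance as a stand-in for `geo.distance_m`, the tile fields `lat`, `lon`, and `gsd_m` are hypothetical, and measuring the GSD difference relative to the larger of the two values is one possible reading of "±5 %".

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_008.8      # mean Earth radius
FOOTPRINT_TOLERANCE_M = 1.0       # from AC-3: ±1 m footprint
GSD_TOLERANCE_FRACTION = 0.05     # from AC-3: ±5 % GSD


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (stand-in for geo.distance_m)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_duplicate(a, b):
    """True when both the footprint and the GSD tolerance are satisfied."""
    close = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= FOOTPRINT_TOLERANCE_M
    gsd_limit = GSD_TOLERANCE_FRACTION * max(a["gsd_m"], b["gsd_m"])
    similar_gsd = abs(a["gsd_m"] - b["gsd_m"]) <= gsd_limit
    return close and similar_gsd


def find_duplicates(tiles):
    """Return index pairs (i, j), i < j, of tiles that duplicate each other."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(tiles), 2)
        if is_duplicate(a, b)
    ]
```

A dedup assertion in a scenario would then expect `find_duplicates(uploaded_tiles)` to be empty. The pairwise scan is quadratic, which is acceptable for the tile counts of a short replay but would need spatial bucketing for larger sets.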

`traces_to` markers wire scenarios to the traceability matrix:
- FT-P-17: `AC-8.4,AC-1,AC-2,AC-3,AC-4,AC-7`
- FT-N-06: `AC-NEW-6,AC-5,AC-6,AC-7`

## Code Review

**Verdict**: PASS_WITH_WARNINGS — 0 Critical, 0 High, 2 Low.

- **F1 (carry-over)**: `gcs_telemetry_evaluator.py`'s private haversine duplicates `geo.distance_m`. Already surfaced in the batches 79–81 cumulative review; deferred to a dedicated refactor batch.
- **F2 (production-dependency surface)**: both scenarios depend on upstream features (see Production Dependencies below). Tests skip cleanly when fixtures are missing and fail loudly when fixtures exist but records are missing, in line with the "tests as gates" principle.

Full review: `_docs/03_implementation/reviews/batch_83_review.md`.
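
The skip-versus-fail behaviour described in F2 can be sketched as a small guard. This is an illustration only: the helper name and the `kind` key on records are hypothetical, and `unittest.SkipTest` is used to keep the example standard-library only (a pytest scenario would more likely call `pytest.skip`).

```python
import unittest


def require_fixture_records(sitl_replay_ready, records, kind):
    """Skip when the fixture is absent; fail when it exists but lacks records.

    sitl_replay_ready: whether the replay fixture is available at all.
    records: decoded fixture records (dicts with a hypothetical "kind" key).
    kind: record kind the scenario depends on, e.g. "mid-flight-tile-output".
    """
    if not sitl_replay_ready:
        # Missing fixture: nothing to gate on, so the scenario is skipped.
        raise unittest.SkipTest("SITL replay fixture is not ready")

    matching = [record for record in records if record.get("kind") == kind]
    if not matching:
        # Fixture exists but the SUT never emitted the record: a real failure.
        raise AssertionError(f"fixture is ready but contains no '{kind}' records")
    return matching
```

Separating the two cases keeps a missing upstream feature from silently passing once the fixture lands: the scenario turns red instead of staying skipped.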

## Production Dependencies

These features must exist for the scenarios to actually run (rather than skip):

1. **SUT-side** `mid-flight-tile-output` FDR record kind matching the Mode B Fact #105 schema (`TILE_REQUIRED_TOP_LEVEL_FIELDS` + `TILE_REQUIRED_QUALITY_FIELDS`).
2. **SUT-side** `tile-load-rejected` FDR record with `reason="stale"` emitted by the freshness gate.
3. **SUT-side** `simulate_landing()` MAVLink command (or equivalent public-input trigger) for landing-event tile upload.
4. **Fixture-builder-side** Derkachi 5-min replay scenario emitting both record kinds for the parameterized FC × VIO grid.
5. **Fixture-builder-side** `FT_P_17_HIGH_QUALITY_WINDOW_S` env var injection (total seconds of high-quality nav frames per AC-2.1a normal-segment criterion).
6. **Already exists**: `mock-suite-sat-service` `/tiles/audit` endpoint (`e2e/fixtures/mock-suite-sat/app.py`).
7. **Already exists**: `mock_suite_sat_url` and `sitl_replay_ready` pytest fixtures (used by sibling scenarios).

Dependencies 1–5 are tracked against epic E-OBC (Mode B work) and AZ-595 fixture builder — outside the blackbox-test workspace.
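
To show how dependency 5 feeds the AC-1 rate check, the sketch below reads `FT_P_17_HIGH_QUALITY_WINDOW_S` and derives a minimum tile count from the "one tile per ~3 s" rule. The function names, the exact 3.0 s interval, and the floor-based rounding are assumptions; the report only states the approximate rate.

```python
import math
import os

# Assumption: the "~3 s" from AC-1 is treated as exactly 3.0 s here.
TILE_INTERVAL_S = 3.0


def high_quality_window_s(environ=os.environ):
    """Read the injected window length (seconds of high-quality nav frames)."""
    raw = environ.get("FT_P_17_HIGH_QUALITY_WINDOW_S")
    if raw is None:
        raise RuntimeError("FT_P_17_HIGH_QUALITY_WINDOW_S is not set")
    try:
        window = float(raw)
    except ValueError as exc:
        raise RuntimeError(f"window is not a number: {raw!r}") from exc
    if not math.isfinite(window) or window <= 0:
        raise RuntimeError(f"window must be a positive finite number: {raw!r}")
    return window


def min_expected_tiles(window_s):
    """Smallest tile count that satisfies one tile per TILE_INTERVAL_S."""
    return math.floor(window_s / TILE_INTERVAL_S)


def tile_rate_ok(tile_count, window_s):
    """True when the observed tile count meets the minimum rate."""
    return tile_count >= min_expected_tiles(window_s)
```

For example, a fully high-quality 5-minute replay (300 s) would require at least 100 tiles under these assumptions.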

## Architecture Compliance

- All new files under `e2e/`, owned by the Blackbox Tests component per `_docs/02_document/module-layout.md`.
- No imports from `src/gps_denied_onboard` (explicit public-boundary discipline note in `mid_flight_tile_evaluator.py`).
- No new cyclic dependencies.
- `httpx` and `pyproj` (via `geo`) reuse — no new infrastructure libraries introduced.

## Sub-step Trace

Phases executed per `implement/SKILL.md`:
- phase 5 (load-spec) → AZ-422 spec read
- phase 6 (implement-tasks-sequentially) → helper + scenarios + unit tests
- phase 7 (verify-ac-coverage) → 7-AC trace above
- phase 8 (code-review) → batch_83_review.md (PASS_WITH_WARNINGS)
- phase 11 (commit-batch) → next.