mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 18:21:16 +00:00
[AZ-840] [AZ-835] e2e orchestrator test (E-AZ-835 C4)
Wraps the AZ-699 verdict-report path with the AZ-839 operator_pre_flight_setup C3 fixture so a single Tier-2 test takes only (tlog, video, calibration) and runs the full 7-step pipeline on the Jetson harness without operator hand-curation. New surface (tests-only, no src/ changes): - tests/e2e/replay/_e2e_orchestrator.py — orchestrator with OrchestratorStep enum, OrchestrationFailure exception (step prefix per AC-5), OrchestrationReport dataclass, write_effective_replay_config helper, and run_e2e_orchestration entry point covering steps 1-2-6-7. - tests/e2e/replay/test_e2e_orchestrator_unit.py — 17 unit tests covering each failure mode + happy path with mocked subprocess + ground-truth loader (AC-8). - tests/e2e/replay/test_az835_e2e_real_flight.py — Tier-2 + RUN_REPLAY_E2E gated integration test asserting verdict report exists, 15-min budget held (AC-1, AC-2, AC-3, AC-4, AC-6). The effective config write overlays c6_tile_cache.root_dir onto the static operator YAML at runtime so the airborne subprocess shares the cache_root the C3 fixture chose. Field- level merge — every other operator-config block stays verbatim. The static YAML on disk is never touched. Test run: tests/e2e/replay 45 passed, 10 skipped (10 skips were 9 pre-existing + 1 new tier2). No src/ touched, no AZ-839 driver changes; AC-7 (AZ-699 still passes) holds by inspection. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,75 @@
|
||||
# E2E orchestrator test (AZ-835 C4)
|
||||
|
||||
**Task**: AZ-840_e2e_orchestrator_test
|
||||
**Name**: E2E orchestrator test ingesting raw (tlog, video, calibration) and running steps 1-7 (AZ-835 C4)
|
||||
**Description**: Fourth building block of Epic AZ-835. A single pytest test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on the Jetson harness — without any operator hand-curation between steps. Extends or wraps the existing AZ-699 verdict test (`tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`) so the verdict-report-writing path is preserved. This is the test that closes the Epic's narrative: "give it a tlog + video, and the system does everything else."
|
||||
**Complexity**: 3 SP
|
||||
**Dependencies**: AZ-839 (C3, `operator_pre_flight_setup` real fixture — HARD); AZ-836 (C1, RouteSpec — In Testing); AZ-838 (C2, SatelliteProviderRouteClient — In Testing); AZ-699 (real flight validation runner — done); AZ-405 (tlog/video auto-sync — done); AZ-702 (camera factory-sheet calibration — done); AZ-696 (≥ 80 % within 100 m threshold — done); AZ-835 (parent Epic)
|
||||
**Component**: `tests/e2e/replay/test_az835_e2e_real_flight.py` (new) OR extend `test_derkachi_real_tlog.py`
|
||||
**Tracker**: AZ-840 (https://denyspopov.atlassian.net/browse/AZ-840)
|
||||
**Parent Epic**: AZ-835
|
||||
|
||||
Jira AZ-840 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Inputs (test parameters)
|
||||
|
||||
- `tlog_path: Path` — raw ArduPilot tlog binary (Derkachi as the reference fixture; parametrize for future tlogs).
|
||||
- `video_path: Path` — raw flight video.
|
||||
- `calibration_path: Path` — camera factory-sheet calibration (AZ-702).
|
||||
|
||||
## Pipeline orchestration
|
||||
|
||||
The 7 steps from the Epic:
|
||||
|
||||
1. **Active flight cut + tlog/video sync** — call AZ-405's `tlog_video_adapter`. If active-segment detection needs a small extension, file as an in-scope sub-fix; if it needs a meaningful new feature, STOP and propose a sibling ticket.
|
||||
2. **On-fly frame + IMU extraction** — `VideoFileFrameSource` + `TlogReplayFcAdapter`. No change.
|
||||
3. **Auto-create route** — call AZ-836's `extract_route_from_tlog(tlog, max_waypoints=10)`. Assert the returned `RouteSpec` materially follows the tlog trajectory.
|
||||
4. **POST route to satellite-provider** — delegate to AZ-839 (C3) fixture `operator_pre_flight_setup` (which itself calls AZ-838's `SatelliteProviderRouteClient.seed_route`). The fixture's `PopulatedC6Cache` is the dependency boundary.
|
||||
5. **Build FAISS index** — driven by C3 fixture as part of populating the cache.
|
||||
6. **Run gps-denied airborne pipeline** — invoke the `gps-denied-replay` console-script or equivalent direct-call entry point against the populated cache + tlog/video/calibration. Reuse the airborne composition root path AZ-699 exercises today.
|
||||
7. **Get GPS fixes, check vs tlog GPS** — call `helpers/accuracy_report.py` + `helpers/gps_compare.py` to compute the horizontal-error distribution and emit the verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`.
|
||||
|
||||
## Test gating
|
||||
|
||||
- `@pytest.mark.tier2`.
|
||||
- Skip-unless-env(`RUN_REPLAY_E2E=1`) with an explicit skip reason that names the missing env var — no silent skip.
|
||||
|
||||
## Verdict report
|
||||
|
||||
Emit ALWAYS, even on FAIL. The success criterion for AC-1 is that the report exists and the distribution is honest — NOT that the verdict is PASS.
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
| # | Criterion |
|
||||
|---|-----------|
|
||||
| AC-1 | Test takes only `(tlog, video, calibration)` and runs steps 1-7 end-to-end on Tier-2 Jetson. No operator hand-curation between steps. |
|
||||
| AC-2 | Test produces the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the honest horizontal-error distribution, REGARDLESS of PASS/FAIL on the AZ-696 AC-3 threshold (≥ 80 % within 100 m). |
|
||||
| AC-3 | Test reuses the C3 fixture's `operator_pre_flight_setup` for steps 3-5; no duplicate seeding/downloading logic. |
|
||||
| AC-4 | Test runs to completion within 15 min wall time on the Derkachi clip (soft target for first delivery; hard NFR set after first measurement is recorded in the report). |
|
||||
| AC-5 | Mid-pipeline failure (e.g. step 4 satellite-provider rejection, step 5 FAISS sidecar mismatch) fails LOUD with a clear error pointing at the failing step. No silent skip past a failing step. |
|
||||
| AC-6 | Test is gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`; explicit skip reason names the missing env var. |
|
||||
| AC-7 | The existing AZ-699 verdict test continues to pass (this test does not break or supersede it; either it lives alongside, or AZ-699 is folded into this test with the verdict-writing path preserved). |
|
||||
| AC-8 | Unit tests cover the orchestration helper layer (parameter validation, error propagation between steps). The end-to-end happy path is the Jetson integration test. |
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Un-xfailing the AZ-777 AC-4 / AC-5 tests (AZ-841 / C5).
|
||||
- Documentation updates beyond the test file's own docstring (AZ-842 / C6).
|
||||
- Real-time tlog ingestion (one finished `.tlog` per test invocation).
|
||||
- Multi-flight aggregate validation.
|
||||
- Performance optimization beyond the AC-4 soft target.
|
||||
- Modifying the airborne composition root.
|
||||
|
||||
## Risks
|
||||
|
||||
**Risk 1 — Integration glue between AZ-405 tlog/video sync and the airborne pipeline's frame-source contract.** The auto-sync adapter and the airborne composition root were authored in different cycles; small impedance mismatches are likely. Mitigation: if the glue exceeds the 3 SP budget, STOP and propose a sub-ticket rather than expanding scope.
|
||||
|
||||
**Risk 2 — Step 1 active-segment detection may need extension.** AZ-405 covered tlog↔video sync; take-off/landing boundary detection may not be implemented. Mitigation: file an in-scope sub-fix if small; STOP and propose a sibling ticket if not.
|
||||
|
||||
## References
|
||||
|
||||
- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
|
||||
- Hard dep (C3 fixture): AZ-839 — https://denyspopov.atlassian.net/browse/AZ-839
|
||||
- Existing verdict test: `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`
|
||||
- Tlog/video adapter: `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` (AZ-405)
|
||||
- Helpers: `src/gps_denied_onboard/helpers/accuracy_report.py`, `src/gps_denied_onboard/helpers/gps_compare.py`
|
||||
Reference in New Issue
Block a user