mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 08:51:12 +00:00
[AZ-835] [AZ-777] Decompose Epic into C3-C6 + close AZ-777
- AZ-839 (C3, 5pt) operator_pre_flight_setup real fixture: wire C1+C2+C11+C10, supersedes AZ-777 Phase 3 (route-driven, not bbox).
- AZ-840 (C4, 3pt) E2E orchestrator test ingesting raw (tlog, video, calibration), runs steps 1-7 end-to-end on Jetson.
- AZ-841 (C5, 1pt) Un-xfail AZ-777 AC-4 + AC-5 once C3 + C4 land.
- AZ-842 (C6, 2pt) Docs: replay_protocol Invariant 12 + architecture + orchestrator-test README.

AZ-777 transitioned to Done in Jira (Phases 1+2 shipped batches 104-106; Phases 3-5 superseded per 2026-05-22 route-driven directive). Closure comment 11177 added with phase-by-phase status. Local spec moved todo/ -> done/ with a status banner at the top.

Dependencies table preamble bumped to 173 tasks / 557 SP and a 2026-05-23 entry prepended. Autodev state sub_step.detail set to "batch 108 next; AZ-839 C3".

Co-authored-by: Cursor <cursoragent@cursor.com>
File diff suppressed because one or more lines are too long
+16
@@ -1,5 +1,21 @@
# Derkachi e2e: wire EXISTING parent-suite satellite-provider into the operator pre-flight fixture

> **Status (2026-05-23)**: **CLOSED** — Phases 1+2 shipped (cycle 3); Phases 3–5 **superseded by Epic AZ-835** per the 2026-05-22 user directive (route-driven seeding instead of bbox).
>
> | Phase | Outcome |
> |-------|---------|
> | Phase 1 (e2e-runner wire + C11 contract adapt + smoke test) | **SHIPPED** — batch 104, 2026-05-21 |
> | Phase 2 (`seed_region.py` CLI + `bbox.yaml` + license attribution) | **SHIPPED** — between batches 104 and 106 |
> | Phase 3 (real `operator_pre_flight_setup` fixture) | **SUPERSEDED** → AZ-839 (Epic AZ-835 C3, 5 SP) — route-driven, not bbox |
> | Phase 4 (un-xfail AC-4 + AC-5) | **SUPERSEDED** → AZ-841 (Epic AZ-835 C5, 1 SP) |
> | Phase 5 (docs) | **SUPERSEDED** → AZ-842 (Epic AZ-835 C6, 2 SP) |
>
> Total credited to AZ-777: 8 SP (per the 2026-05-21 single-ticket-containment override; Phases 1+2 fit within that envelope). Remaining work (~11 SP including AZ-836 / AZ-838 already shipped) is tracked under Epic AZ-835 children.
>
> Spec preserved as historical reference. **Do not implement Phases 3–5 from this file** — see the Epic AZ-835 children instead.
>
> See also: `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md` (decision log).

**Task**: AZ-777_derkachi_c6_reference_fixture
**Name**: Drive the production C10/C11 pre-flight pipeline against the parent-suite `satellite-provider` .NET service ALREADY running in the Jetson e2e harness so the Derkachi clip produces a real FAISS-anchored C4/C5 satellite-fix loop end-to-end
**Description**: The Jetson e2e harness already runs the real `satellite-provider` .NET 8 service (lineage AZ-688 / AZ-691 / AZ-692, services `satellite-provider` + `satellite-provider-postgres` in `docker-compose.test.jetson.yml`), but the e2e-runner still points its `SATELLITE_PROVIDER_URL` at the legacy `mock-sat` fixture and the placeholder `operator_pre_flight_setup` fixture never drives the C10/C11 pipeline. Compounding this, C11's `HttpTileDownloader` path constants (`_LIST_PATH=/api/satellite/tiles`, `_GET_PATH=/api/satellite/tiles/{tile_id}`) do not match the real satellite-provider API surface (`POST /api/satellite/tiles/inventory` for LIST, `GET /tiles/{z}/{x}/{y}` for tile fetch). This task wires the existing service into the e2e-runner, adapts C11 to the real contract, seeds the Derkachi-bbox tile catalog via `POST /api/satellite/request`, replaces the placeholder fixture with a real C10+C11 driver, and un-xfails the Tier-2 Derkachi + AZ-699 verdict tests.
@@ -0,0 +1,85 @@
# operator_pre_flight_setup real fixture (AZ-835 C3)

**Task**: AZ-839_operator_pre_flight_setup_real_fixture
**Name**: operator_pre_flight_setup fixture: wire C1+C2+C11+C10 into real fixture, supersede AZ-777 Phase 3 (AZ-835 C3)
**Description**: Third building block of Epic AZ-835. Replace the placeholder `operator_pre_flight_setup` fixture (currently a `mkdir` stub at `tests/e2e/replay/conftest.py` lines 293-310) with a real driver that wires C1 (AZ-836) + C2 (AZ-838) + C11 (AZ-777 Phase 1) + C10 to populate C6 from a tlog-derived route. Supersedes AZ-777 Phase 3 (the bbox-seeded placeholder-replacement design) per the 2026-05-22 user directive — route-driven seeding is ~100x more tile-efficient and pre-commits to where the operator did fly per the tlog.
**Complexity**: 5 SP
**Dependencies**: AZ-836 (C1, RouteSpec + extractor — In Testing); AZ-838 (C2, SatelliteProviderRouteClient + seed_route.py CLI — In Testing); AZ-777 Phase 1 (e2e-runner ↔ satellite-provider wire + C11 contract adaptation — done, batch 104); AZ-322 (C10 DescriptorBatcher — done); AZ-316+AZ-777 Phase 1 (C11 HttpTileDownloader.download_for_bbox — done); AZ-306 (FAISS sidecar triple-consistency — done); AZ-835 (parent Epic)
**Component**: `tests/e2e/replay/conftest.py` (`operator_pre_flight_setup` fixture rewrite + new `PopulatedC6Cache` dataclass)
**Tracker**: AZ-839 (https://denyspopov.atlassian.net/browse/AZ-839)
**Parent Epic**: AZ-835

Jira AZ-839 is the authoritative spec; this file is the in-workspace mirror.

## Public surface

```python
from dataclasses import dataclass
from pathlib import Path

from gps_denied_onboard.replay_input.tlog_route import RouteSpec  # AZ-836

@dataclass(frozen=True, slots=True)
class PopulatedC6Cache:
    cache_root: Path                 # named-volume mount inside the e2e-runner container
    tile_store_path: Path            # postgres + filesystem store root
    faiss_index_path: Path           # .index file
    faiss_sidecar_sha256_path: Path  # .sha256 file
    faiss_sidecar_meta_path: Path    # .meta.json file
    route_spec: RouteSpec            # provenance — which tlog/route produced this cache
    tile_count: int                  # how many tiles ended up in C6
    elapsed_seconds: float           # wall time, for the AC-1/AC-2 perf budget
```

The fixture remains a pytest fixture at `tests/e2e/replay/conftest.py::operator_pre_flight_setup`, same `session` scope as today. Input contract unchanged (same args the placeholder takes) plus a new dependency on `RouteSpec` — either fixture-injected or extracted from the test's tlog parameter via `extract_route_from_tlog`.

## Behaviour

1. Read the route spec (fixture-injected or extracted from test tlog via `extract_route_from_tlog`).
2. Instantiate `SatelliteProviderRouteClient` from env (`SATELLITE_PROVIDER_URL`, `SATELLITE_PROVIDER_API_KEY`, `SATELLITE_PROVIDER_TLS_INSECURE`).
3. Call `seed_route(route_spec)`. On `RouteValidationError` / `RouteTerminalFailureError` → re-raise with original cause. On `RouteTransientError` → retry up to 3 attempts using C11's `_DEFAULT_BACKOFF_SCHEDULE_S = (1, 2, 4, 8)`.
4. Enumerate tile coverage locally (mirror `route_client._enumerate_route_tile_coords` from AZ-838); call C11 `HttpTileDownloader.download_for_bbox` to pull every tile into C6.
5. Invoke C10 `DescriptorBatcher` against the populated C6 to build the FAISS HNSW index using the NetVLAD backbone (per `c2_vpr/config.py:67` default).
6. Verify sidecar coherence (`.index` + `.sha256` + `.meta.json` triple-consistency per AZ-306). Mismatch → `IndexUnavailableError`.
7. Yield `PopulatedC6Cache(...)`. On any failure path, clean up partial cache state (no half-built FAISS index left behind).
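
The retry rule in step 3 can be sketched as below. This is a minimal illustration under stated assumptions, not the fixture's code: the three exception classes are local stand-ins for the AZ-838 error types, `seed_with_retry` and its `sleep` parameter are hypothetical names, and using the first entries of the schedule as the waits between attempts is one possible reading of the spec.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


# Local stand-ins for the AZ-838 error types (assumption: the real classes
# are importable from the route client module).
class RouteValidationError(Exception):
    pass


class RouteTerminalFailureError(Exception):
    pass


class RouteTransientError(Exception):
    pass


# Values copied from the spec text (C11's _DEFAULT_BACKOFF_SCHEDULE_S).
BACKOFF_SCHEDULE_S: Sequence[float] = (1, 2, 4, 8)
MAX_ATTEMPTS = 3


def seed_with_retry(
    seed: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `seed`, retrying only on RouteTransientError.

    Validation and terminal errors are not caught, so they propagate
    unchanged with their original cause.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return seed()
        except RouteTransientError:
            if attempt == MAX_ATTEMPTS:
                raise  # the final attempt's exception is propagated
            sleep(BACKOFF_SCHEDULE_S[attempt - 1])
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps unit tests for the transient-retry path fast, since a test can record the requested waits instead of actually sleeping.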

**Mount strategy**: write into a named docker volume that survives across pytest sessions. Cold first invocation populates; subsequent invocations within the same compose session reuse (warm cache). Same pattern AZ-777 Phase 3 originally specced; only the cache **source** changes (route, not bbox).

## Acceptance criteria

| # | Criterion |
|---|-----------|
| AC-1 | Cold first invocation on the Derkachi tlog completes in ≤ 5 min on Tier-2 Jetson (includes satellite-provider Google Maps round-trips). |
| AC-2 | Warm invocation within the same compose session completes in ≤ 30 s (named-volume reuse). |
| AC-3 | Yielded `PopulatedC6Cache` has all paths populated; `tile_count > 0`; FAISS sidecar triple-consistency passes (AZ-306). |
| AC-4 | `RouteValidationError` / `RouteTerminalFailureError` from `seed_route` is re-raised with original cause; no silent swallow. |
| AC-5 | `RouteTransientError` is retried up to 3 attempts using C11's existing backoff schedule; final attempt's exception is propagated. |
| AC-6 | Tamper test — corrupt one of the three sidecar files between fixture runs; next invocation raises `IndexUnavailableError`. |
| AC-7 | On any failure path inside the fixture, partial state is cleaned up (no half-built FAISS index, no orphaned postgres rows). |
| AC-8 | Unit tests (stubbed `SatelliteProviderRouteClient` + stubbed C11 + stubbed C10) cover: happy path, transient-retry, terminal-failure, validation-error, tamper-detection, cleanup-on-failure. |
| AC-9 | Integration test gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2` against the Jetson harness produces a real `PopulatedC6Cache` from the Derkachi tlog. |
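
The tamper check behind AC-3 and AC-6 might look roughly like the sketch below. The sidecar layout here is an assumption made for illustration: the `.sha256` file is taken to hold the hex digest of the `.index` bytes, and the `.meta.json` file is taken to repeat it under a hypothetical `"sha256"` key. AZ-306 defines the real format. `IndexUnavailableError` is a local stand-in for the project's error type, and `verify_sidecars` is a hypothetical helper name.

```python
import hashlib
import json
from pathlib import Path


class IndexUnavailableError(RuntimeError):
    """Local stand-in for the project's error type."""


def verify_sidecars(index_path: Path, sha256_path: Path, meta_path: Path) -> None:
    """Raise IndexUnavailableError unless all three files agree on one digest."""
    try:
        actual = hashlib.sha256(index_path.read_bytes()).hexdigest()
        # Assumed layout: first whitespace-separated token is the hex digest.
        recorded = sha256_path.read_text().split()[0].lower()
        meta = json.loads(meta_path.read_text())
    except (OSError, ValueError, IndexError) as exc:
        # Missing, unreadable, empty, or malformed sidecar files land here.
        raise IndexUnavailableError(f"sidecar unreadable: {exc}") from exc

    if recorded != actual:
        raise IndexUnavailableError(".sha256 does not match .index contents")
    # Assumed layout: the metadata file repeats the digest under "sha256".
    if not isinstance(meta, dict) or meta.get("sha256") != actual:
        raise IndexUnavailableError(".meta.json does not match .index contents")
```

Under these assumptions, corrupting any one of the three files changes either the computed digest or one of the recorded ones, so the next call raises.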

## Out of scope

- Driving the airborne replay pipeline against the populated cache (AZ-840 / C4).
- Un-xfailing the existing AZ-777 AC-4 / AC-5 tests (AZ-841 / C5).
- Updating `replay_protocol.md` (AZ-842 / C6).
- Switching the C2 default backbone away from NetVLAD.
- Multi-tlog aggregate caches (one route per fixture invocation).

## Risks

**Risk 1 — Docker named-volume lifecycle across pytest sessions.** A crash during the first invocation may leave a half-populated volume, so the cleanup-on-failure path in step 7 must be robust. Mitigation: AC-7 covers this explicitly, backed by a `try/finally` around the four wiring steps.
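
One way to keep a crash from leaving a half-built cache is to build into a staging directory and publish it only on success. The sketch below swaps in that technique as an illustration; the spec itself only asks for a `try/finally`. The `staged_cache` name and the `.staging` / `current` directory names are hypothetical, and the sketch covers the filesystem only, not the postgres rows that AC-7 also mentions.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def staged_cache(cache_root: Path) -> Iterator[Path]:
    """Yield a staging directory; publish it as `current` only on success."""
    staging = cache_root / ".staging"
    # Remove leftovers from a previous run that was killed mid-build.
    shutil.rmtree(staging, ignore_errors=True)
    staging.mkdir(parents=True)
    try:
        yield staging
    except BaseException:
        # Any failure inside the `with` body discards the partial state.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    else:
        final = cache_root / "current"
        shutil.rmtree(final, ignore_errors=True)
        staging.rename(final)  # same filesystem, so this is a plain rename
```

A fixture could run its wiring steps inside `with staged_cache(cache_root) as staging:` and read only from `current` afterwards, so a warm invocation never sees a partially written index.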

**Risk 2 — Cold-start budget (AC-1, 5 min) is tight on the first Jetson run.** Google Maps round-trips for ~50-100 tiles may exceed the budget on slow networks. Mitigation: instrument `elapsed_seconds` on every step and surface it in the verdict report; if AC-1 fails, file a perf-tuning ticket rather than skipping the AC.

## References

- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
- Existing placeholder: `tests/e2e/replay/conftest.py` lines 293-310
- C1: AZ-836 (`extract_route_from_tlog`) — https://denyspopov.atlassian.net/browse/AZ-836
- C2: AZ-838 (`SatelliteProviderRouteClient`) — https://denyspopov.atlassian.net/browse/AZ-838
- AZ-777 (Phase 3+ superseded): `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md`
- C10 DescriptorBatcher: `src/gps_denied_onboard/components/c10_provisioning/descriptor_batcher.py`
- C11 HttpTileDownloader: `src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py`
- AZ-306 FAISS sidecar triple-consistency reference
@@ -0,0 +1,75 @@
# E2E orchestrator test (AZ-835 C4)

**Task**: AZ-840_e2e_orchestrator_test
**Name**: E2E orchestrator test ingesting raw (tlog, video, calibration) and running steps 1-7 (AZ-835 C4)
**Description**: Fourth building block of Epic AZ-835. A single pytest test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on the Jetson harness — without any operator hand-curation between steps. Extends or wraps the existing AZ-699 verdict test (`tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`) so the verdict-report-writing path is preserved. This is the test that closes the Epic's narrative: "give it a tlog + video, and the system does everything else."
**Complexity**: 3 SP
**Dependencies**: AZ-839 (C3, `operator_pre_flight_setup` real fixture — HARD); AZ-836 (C1, RouteSpec — In Testing); AZ-838 (C2, SatelliteProviderRouteClient — In Testing); AZ-699 (real flight validation runner — done); AZ-405 (tlog/video auto-sync — done); AZ-702 (camera factory-sheet calibration — done); AZ-696 (≥ 80 % within 100 m threshold — done); AZ-835 (parent Epic)
**Component**: `tests/e2e/replay/test_az835_e2e_real_flight.py` (new) OR extend `test_derkachi_real_tlog.py`
**Tracker**: AZ-840 (https://denyspopov.atlassian.net/browse/AZ-840)
**Parent Epic**: AZ-835

Jira AZ-840 is the authoritative spec; this file is the in-workspace mirror.

## Inputs (test parameters)

- `tlog_path: Path` — raw ArduPilot tlog binary (Derkachi as the reference fixture; parametrize for future tlogs).
- `video_path: Path` — raw flight video.
- `calibration_path: Path` — camera factory-sheet calibration (AZ-702).

## Pipeline orchestration

The 7 steps from the Epic:

1. **Active flight cut + tlog/video sync** — call AZ-405's `tlog_video_adapter`. If active-segment detection needs a small extension, file as an in-scope sub-fix; if it needs a meaningful new feature, STOP and propose a sibling ticket.
2. **On-fly frame + IMU extraction** — `VideoFileFrameSource` + `TlogReplayFcAdapter`. No change.
3. **Auto-create route** — call AZ-836's `extract_route_from_tlog(tlog, max_waypoints=10)`. Assert the returned `RouteSpec` materially follows the tlog trajectory.
4. **POST route to satellite-provider** — delegate to AZ-839 (C3) fixture `operator_pre_flight_setup` (which itself calls AZ-838's `SatelliteProviderRouteClient.seed_route`). The fixture's `PopulatedC6Cache` is the dependency boundary.
5. **Build FAISS index** — driven by C3 fixture as part of populating the cache.
6. **Run gps-denied airborne pipeline** — invoke the `gps-denied-replay` console-script or equivalent direct-call entry point against the populated cache + tlog/video/calibration. Reuse the airborne composition root path AZ-699 exercises today.
7. **Get GPS fixes, check vs tlog GPS** — call `helpers/accuracy_report.py` + `helpers/gps_compare.py` to compute the horizontal-error distribution and emit the verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`.
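
The comparison in step 7 boils down to a horizontal distance per tick and the fraction of ticks inside the AZ-696 radius. The sketch below illustrates that arithmetic with the haversine formula; it is not the implementation in `helpers/gps_compare.py` or `helpers/accuracy_report.py`, and the function names and the `(lat, lon)` pair layout are assumptions.

```python
import math
from typing import Iterable, Tuple

LatLon = Tuple[float, float]  # (latitude_deg, longitude_deg), assumed layout

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS-84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def fraction_within(pairs: Iterable[Tuple[LatLon, LatLon]], radius_m: float = 100.0) -> float:
    """Fraction of (estimate, truth) pairs whose horizontal error is <= radius_m."""
    errors = [haversine_m(*est, *truth) for est, truth in pairs]
    if not errors:
        raise ValueError("no ticks to compare")
    return sum(e <= radius_m for e in errors) / len(errors)


def verdict(pairs: Iterable[Tuple[LatLon, LatLon]],
            radius_m: float = 100.0, min_fraction: float = 0.80) -> str:
    """PASS when at least min_fraction of ticks fall within radius_m."""
    return "PASS" if fraction_within(pairs, radius_m) >= min_fraction else "FAIL"
```

The defaults mirror the threshold quoted in this spec (≥ 80 % within 100 m). Whichever verdict comes out, the report is still written, as the spec requires.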

## Test gating

- `@pytest.mark.tier2`.
- Skip-unless-env(`RUN_REPLAY_E2E=1`) with an explicit skip reason that names the missing env var — no silent skip.
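
A small helper can keep the skip reason explicit and testable. The sketch below uses only the standard library; `replay_e2e_skip_reason` is a hypothetical name, and the exact wording of the reason is an assumption.

```python
import os
from typing import Mapping, Optional


def replay_e2e_skip_reason(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return None when the gate is open, else a reason naming the env var."""
    if env.get("RUN_REPLAY_E2E") == "1":
        return None
    return "RUN_REPLAY_E2E=1 is not set; skipping Tier-2 replay e2e test"
```

In a test module this would typically feed `pytest.mark.skipif`, for example `@pytest.mark.skipif(replay_e2e_skip_reason() is not None, reason=replay_e2e_skip_reason() or "")`, stacked with `@pytest.mark.tier2`.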

## Verdict report

Emit ALWAYS, even on FAIL. The success criterion for AC-1 is that the report exists and the distribution is honest — NOT that the verdict is PASS.

## Acceptance criteria

| # | Criterion |
|---|-----------|
| AC-1 | Test takes only `(tlog, video, calibration)` and runs steps 1-7 end-to-end on Tier-2 Jetson. No operator hand-curation between steps. |
| AC-2 | Test produces the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the honest horizontal-error distribution, REGARDLESS of PASS/FAIL on the AZ-696 AC-3 threshold (≥ 80 % within 100 m). |
| AC-3 | Test reuses the C3 fixture's `operator_pre_flight_setup` for steps 3-5; no duplicate seeding/downloading logic. |
| AC-4 | Test runs to completion within 15 min wall time on the Derkachi clip (soft target for first delivery; hard NFR set after first measurement is recorded in the report). |
| AC-5 | Mid-pipeline failure (e.g. step 4 satellite-provider rejection, step 5 FAISS sidecar mismatch) fails LOUD with a clear error pointing at the failing step. No silent skip past a failing step. |
| AC-6 | Test is gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`; explicit skip reason names the missing env var. |
| AC-7 | The existing AZ-699 verdict test continues to pass (this test does not break or supersede it; either it lives alongside, or AZ-699 is folded into this test with the verdict-writing path preserved). |
| AC-8 | Unit tests cover the orchestration helper layer (parameter validation, error propagation between steps). The end-to-end happy path is the Jetson integration test. |
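
For AC-5 and AC-8, the orchestration helper layer could wrap each step so that a failure names the step that raised. The sketch below is one possible shape under stated assumptions: `PipelineStepError` and `run_steps` are hypothetical names, and the real steps would be the seven listed under Pipeline orchestration rather than the plain callables used here.

```python
from typing import Any, Callable, Sequence, Tuple

Step = Tuple[str, Callable[[Any], Any]]  # (step name, callable taking the previous result)


class PipelineStepError(RuntimeError):
    """Raised when a pipeline step fails; records which step it was."""

    def __init__(self, number: int, name: str, cause: BaseException) -> None:
        super().__init__(f"step {number} ({name}) failed: {cause!r}")
        self.number = number
        self.name = name


def run_steps(steps: Sequence[Step], initial: Any) -> Any:
    """Run steps in order, feeding each result into the next.

    The first failure stops the pipeline; later steps never run, so a
    failing step cannot be silently skipped.
    """
    value = initial
    for number, (name, step) in enumerate(steps, start=1):
        try:
            value = step(value)
        except Exception as exc:
            # Chain the original exception so the root cause stays visible.
            raise PipelineStepError(number, name, exc) from exc
    return value
```

Unit tests can then assert on `number` and `__cause__` without needing the Jetson harness.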

## Out of scope

- Un-xfailing the AZ-777 AC-4 / AC-5 tests (AZ-841 / C5).
- Documentation updates beyond the test file's own docstring (AZ-842 / C6).
- Real-time tlog ingestion (one finished `.tlog` per test invocation).
- Multi-flight aggregate validation.
- Performance optimization beyond the AC-4 soft target.
- Modifying the airborne composition root.

## Risks

**Risk 1 — Integration glue between AZ-405 tlog/video sync and the airborne pipeline's frame-source contract.** The auto-sync adapter and the airborne composition root were authored in different cycles; small impedance mismatches are likely. Mitigation: if the glue exceeds the 3 SP budget, STOP and propose a sub-ticket rather than expanding scope.

**Risk 2 — Step 1 active-segment detection may need extension.** AZ-405 covered tlog↔video sync; take-off/landing boundary detection may not be implemented. Mitigation: file an in-scope sub-fix if small; STOP and propose a sibling ticket if not.

## References

- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
- Hard dep (C3 fixture): AZ-839 — https://denyspopov.atlassian.net/browse/AZ-839
- Existing verdict test: `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`
- Tlog/video adapter: `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` (AZ-405)
- Helpers: `src/gps_denied_onboard/helpers/accuracy_report.py`, `src/gps_denied_onboard/helpers/gps_compare.py`
@@ -0,0 +1,57 @@
# Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (AZ-835 C5)

**Task**: AZ-841_unxfail_az777_tier2_tests
**Name**: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 fixture + C4 orchestrator land (AZ-835 C5)
**Description**: Fifth building block of Epic AZ-835. Once C3 (AZ-839, `operator_pre_flight_setup` real fixture) and C4 (AZ-840, e2e orchestrator test) land, remove the `@pytest.mark.xfail` markers from the AZ-777 Tier-2 tests. The verdict — PASS or FAIL — becomes the honest signal. Both tests remain gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
**Complexity**: 1 SP
**Dependencies**: AZ-839 (C3, `operator_pre_flight_setup` real fixture — HARD); AZ-840 (C4, e2e orchestrator test — HARD); AZ-777 (being closed/superseded by this Epic; tests live in same file tree); AZ-835 (parent Epic)
**Component**: `tests/e2e/replay/test_derkachi_1min.py` (xfail removal) + `tests/e2e/replay/test_derkachi_real_tlog.py` (xfail removal)
**Tracker**: AZ-841 (https://denyspopov.atlassian.net/browse/AZ-841)
**Parent Epic**: AZ-835

Jira AZ-841 is the authoritative spec; this file is the in-workspace mirror.

## Targets

1. `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` (AZ-777 AC-4) — remove `@pytest.mark.xfail`; verify `@pytest.mark.tier2` + `RUN_REPLAY_E2E` gating stays in place.
2. `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report` (AZ-777 AC-5) — remove `@pytest.mark.xfail`; verify gating stays in place.

## Verification

**On Tier-2 Jetson** (`RUN_REPLAY_E2E=1`):
- `test_ac3_within_100m_80pct_of_ticks` PASSES (≥ 80 % of ticks within 100 m of ground truth, log lines `replay.satellite_anchor_inserted` visible).
- `test_az699_real_flight_validation_emits_verdict_and_report` runs to completion within 15 min and emits `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the honest distribution. PASS preferred but NOT required for AC-4 — emitting the honest report IS the success criterion.

**Locally** (no env):
- Both tests skip explicitly with a reason naming `RUN_REPLAY_E2E` — they MUST NOT pass as a side effect of being skipped.

## Acceptance criteria

| # | Criterion |
|---|-----------|
| AC-1 | `@pytest.mark.xfail` removed from both AZ-777 tests. |
| AC-2 | Both tests still gated by `@pytest.mark.tier2` + skip-unless-env(`RUN_REPLAY_E2E=1`). Skip reason names the missing env. |
| AC-3 | On Jetson with `RUN_REPLAY_E2E=1`, `test_ac3_within_100m_80pct_of_ticks` PASSES (≥ 80 % within 100 m). |
| AC-4 | On Jetson with `RUN_REPLAY_E2E=1`, `test_az699_real_flight_validation_emits_verdict_and_report` completes within 15 min and emits the verdict report. PASS preferred but not required for AC-4. |
| AC-5 | If either test FAILS on the metric (e.g. only 60 % within 100 m), the test reports FAIL honestly — no fallback to xfail or skip. Failure mode is a feature, not a bug. |
| AC-6 | Locally on a machine without `RUN_REPLAY_E2E`, both tests skip with an explicit reason. |

## Out of scope

- Modifying the airborne pipeline to improve metric performance (separate optimization tickets if AC-3 fails).
- Adding new test cases (this ticket only removes xfail; new cases belong to other tickets).
- Documentation updates (AZ-842 / C6).
- Modifying the verdict thresholds (AZ-696).

## Risks

**Risk 1 — Un-xfailed tests may FAIL on the metric.** If horizontal-error distribution comes in worse than the 80 % @ 100 m gate, this test reports FAIL. That outcome is in-scope for AC-5 (report honestly) and out-of-scope for this ticket's fix (file a separate optimization ticket).

## References

- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
- Hard deps: AZ-839 (C3), AZ-840 (C4)
- Tests: `tests/e2e/replay/test_derkachi_1min.py`, `tests/e2e/replay/test_derkachi_real_tlog.py`
- AZ-777 spec: `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` (post-closure)
- Threshold spec: AZ-696 (≥ 80 % within 100 m)
- Verdict writer: `src/gps_denied_onboard/helpers/accuracy_report.py`
@@ -0,0 +1,65 @@
# Docs: replay_protocol.md + architecture.md + orchestrator-test README (AZ-835 C6)

**Task**: AZ-842_replay_protocol_and_orchestrator_docs
**Name**: Docs: replay_protocol.md Invariant 12 + AZ-777 Phase 3+ superseded note + orchestrator-test README (AZ-835 C6)
**Description**: Sixth and final building block of Epic AZ-835. Capture the route-driven flow in the authoritative documents so future implementers, operators, and reviewers understand what changed and why.
**Complexity**: 2 SP
**Dependencies**: AZ-841 (C5, un-xfail — SOFT; README describes test outcomes assuming C5 has landed); AZ-777 (being closed/superseded by this Epic — AZ-777 spec is updated during the AZ-777 closure step, verified by AC-6); AZ-835 (parent Epic)
**Component**: `_docs/02_document/contracts/replay/replay_protocol.md` + `_docs/02_document/architecture.md` + `tests/e2e/replay/README*.md`
**Tracker**: AZ-842 (https://denyspopov.atlassian.net/browse/AZ-842)
**Parent Epic**: AZ-835

Jira AZ-842 is the authoritative spec; this file is the in-workspace mirror.

## Modified files

### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension

Extend **Invariant 12** with an AZ-835 sub-section describing:

- The route-driven `operator_pre_flight_setup` fixture (AZ-839 / C3) flow: tlog → `RouteSpec` → `POST /api/satellite/route` → tile download → FAISS build → yield `PopulatedC6Cache`.
- Why route-driven supersedes the AZ-777 bbox approach (efficiency: ~100× fewer tiles; honesty: pre-commits to where the operator did fly).
- The C3 fixture's failure-handling contract (validation/terminal → re-raise; transient → retry up to 3 attempts using C11's existing backoff schedule).

### 2. `_docs/02_document/architecture.md` — satellite-provider entry extension

Append a sub-section to the existing satellite-provider entry noting that Epic AZ-835 + its C1-C5 children landed the full e2e real-flight validation path on top of AZ-777 Phase 1's wire + C11 contract adaptation. Mark AZ-777 Phase 3+ as superseded by Epic AZ-835 (pointer-only — the AZ-777 spec itself is updated in C5's wake during the AZ-777 closure step).

### 3. `tests/e2e/replay/README*.md` — orchestrator-test README

Either extend `tests/e2e/replay/README.md` or create a dedicated `tests/e2e/replay/README_AZ835.md` (prefer dedicated file if the existing README is already long). Short operator-facing content:

- How to run the new orchestrator test locally (env vars, Jetson SSH alias, expected runtime).
- What `(tlog, video, calibration)` triple to provide and where the reference Derkachi fixture lives.
- Where the verdict report is written and how to interpret it (PASS/FAIL on AZ-696 AC-3 threshold).
- Imagery-source caveat: Google Maps satellite (dev/research use only; production needs CC-BY migration on the satellite-provider side).

## Acceptance criteria

| # | Criterion |
|---|-----------|
| AC-1 | `replay_protocol.md` Invariant 12 has a new AZ-835 sub-section covering the route-driven flow, the bbox-supersedure rationale, and the failure-handling contract. |
| AC-2 | `architecture.md` satellite-provider entry has a sub-section noting Epic AZ-835's contribution and pointing at AZ-777 Phase 3+ as superseded. |
| AC-3 | `tests/e2e/replay/README*.md` exists and a new contributor can run the orchestrator test on Jetson using only the README's instructions (no out-of-band knowledge required). |
| AC-4 | All three docs link to the Epic (AZ-835) and to the relevant child tickets (AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-841). |
| AC-5 | License attribution string ("Imagery © Google") and the dev-only caveat are present in the test README. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children. |

## Out of scope

- Updating consumer-facing API/contract docs in `../satellite-provider/` (parent-suite owns those).
- Migrating imagery source to a CC-BY provider (parent-suite, out of scope for this Epic).
- Writing additional tutorials beyond the orchestrator-test README.
- ADR creation — no new architectural decision; this Epic implements existing decisions.

## Risks

**Risk 1 — Scope creep into reformatting unrelated doc sections.** Resist; this ticket only adds what AC-1..AC-5 require.

## References

- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
- Replay protocol: `_docs/02_document/contracts/replay/replay_protocol.md` Invariant 12
- Architecture: `_docs/02_document/architecture.md` (satellite-provider section)
- Tests directory: `tests/e2e/replay/`
- AZ-777 spec (being superseded): `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` (post-closure)
@@ -8,7 +8,7 @@ status: in_progress
 sub_step:
 phase: 7
 name: batch-loop
-detail: ""
+detail: "batch 108 next; AZ-839 C3"
 retry_count: 0
 cycle: 3
 tracker: jira