diff --git a/_docs/02_document/architecture.md b/_docs/02_document/architecture.md
index 5f6c88f..8fb7bf3 100644
--- a/_docs/02_document/architecture.md
+++ b/_docs/02_document/architecture.md
@@ -279,8 +279,25 @@ Two consequences for the architecture:

 **Imagery source license attribution (cycle 3)**: the Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the parent-suite side (parent-suite ticket TBD). Operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution.

+**AZ-777 Phase 3+ superseded by Epic AZ-835**: AZ-777 originally proposed five phases — wire e2e-runner (Phase 1), seed Derkachi bbox (Phase 2), rewrite `operator_pre_flight_setup` fixture (Phase 3), un-xfail AC-4 / AC-5 (Phase 4), docs (Phase 5). Phases 1+2 shipped under AZ-777 itself (batch 104, cycle 3). Phases 3 and 5 were **superseded** when the user redirected the work to a route-driven flow: Phase 3 → AZ-839 (real fixture wiring C1+C2+C11+C10), Phase 5 → AZ-842 (this docs ticket). Phase 4 (un-xfail) was deferred to backlog after the cycle-4 redesign (AZ-895) took the un-xfail target along a different path and is not on the active epic. The AZ-777 task spec at `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` carries the supersedure banner; this architecture document is the authoritative high-level pointer for that decision.
+
 No new ADR — this is execution of existing decisions (architectural principle #5 satellite-provider on-disk layout end-to-end; ADR-004 process-level isolation unchanged; ADR-011 replay is a configuration unchanged). The architectural surface gained the route-driven seeding path inside C11; nothing else moved.
+### Replay input redesign (cycle 4 — single canonical clock + CSV-driven path)
+
+Cycle 4 rebuilt the replay-mode operator-input surface around a single canonical clock to close the AZ-848 ESKF out-of-order regression and to retire the tlog auto-sync surface that produced the misalignment risk in the first place. Four tickets ship the change:
+
+| Ticket | Role | Description |
+|--------|------|-------------|
+| **AZ-894** (CSV adapter) | New primary path | `csv_replay_input.CsvReplayInputAdapter` consumes a paired `(video, CSV)` where the CSV's `Time` column is the canonical clock for every IMU/GPS sample. Gated `BUILD_CSV_REPLAY_ADAPTER=ON` in airborne and research binaries; OFF in operator-orchestrator. |
+| **AZ-895** (auto-sync deprecation) | Removed legacy | `replay_input.auto_sync` (AZ-405) reduced to a no-op stub that raises on first call; `tlog_video_adapter.py` reduced to a deprecated stub whose `open()` raises immediately. The legacy `--time-offset-ms` / `--skip-auto-sync` / `--auto-trim` CLI flags accepted-with-warning, ignored. Hard removal tracked in AZ-908 (cycle 5+ backlog). |
+| **AZ-896** (CSV format spec) | Contract | `_docs/02_document/contracts/replay/csv_replay_format.md` documents the CSV row schema, the row-0-alignment-with-video-frame-0 invariant, and an example `data_imu.csv` shipped under the same path. |
+| **AZ-897** (operator UI) | Cycle-5+ follow-up | First operator-facing UI surface — a React + Tailwind single-page form that uploads a paired `(video, CSV)`, links to AZ-896's format docs + example CSV, and tails the verdict from the headless `gps-denied-replay` invocation. Not on cycle-4 critical path; flagged here so the CSV format stays UI-friendly. |
+
+The architectural rationale is captured in **Invariant 14** of the replay protocol (`_docs/02_document/contracts/replay/replay_protocol.md`): the system runs as a single edge process on a single device; there must be exactly one wall/monotonic clock authoritative for timestamps that cross component boundaries. In live mode that clock is the C8 inbound `FcAdapter`'s FC-boot-relative timestamp; in replay mode (after cycle 4) it is the CSV row's `Time` column. The previous design's two-clock surface (Jetson monotonic at C1 VIO emission, FC-boot at C8 IMU window arrival) produced the AZ-848 regression and is retired with the auto-sync deprecation.
+
+The legacy `TlogReplayFcAdapter` is retained for two audit-only paths — offline FDR analysis from `tools/` and a one-shot `gps-denied-tlog-to-csv` migration utility that exports legacy tlog inputs to the canonical CSV. Neither path runs from the airborne composition root after cycle 4.
+
 ### `satellite-provider` upload contract (per D-PROJ-2 carryforward)

 The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. From this architecture's standpoint:
diff --git a/_docs/02_document/contracts/replay/replay_protocol.md b/_docs/02_document/contracts/replay/replay_protocol.md
index 80d84cc..dc81953 100644
--- a/_docs/02_document/contracts/replay/replay_protocol.md
+++ b/_docs/02_document/contracts/replay/replay_protocol.md
@@ -257,8 +257,39 @@ The two **invalid** cells (`true` + `eskf` and `false` + `gtsam_isam2`) raise `C

     **Sub-invariant 12.a (cycle 3 — AZ-839 / Epic AZ-835 C3)**: the e2e `operator_pre_flight_setup` fixture replaces the cycle-1 `mkdir` placeholder with a real driver that wires C1 (`replay_input.tlog_route.extract_route_from_tlog` — AZ-836) + C2 (`c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route` — AZ-838) + C11 (`tile_downloader.HttpTileDownloader.download_for_bbox`) + C10 (`DescriptorBatcher`) to populate C6 from a tlog-derived corridor. The fixture yields a `PopulatedC6Cache` dataclass (`cache_root`, `tile_store_path`, `faiss_index_path`, `faiss_sidecar_sha256_path`, `faiss_sidecar_meta_path`, `route_spec`, `tile_count`, `elapsed_seconds`). The cache is mounted into a named docker volume that survives across pytest sessions (cold first invocation populates; subsequent invocations within the same compose session reuse — warm cache). Cold-start budget: ≤ 5 min on Tier-2 Jetson; warm: ≤ 30 s. Sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306 is verified at every fixture yield; mismatch raises `IndexUnavailableError`. The C12 production binding for the route-driven path is a future-cycle integration; production pre-flight still uses the bbox-driven `download_tiles_for_area` path today.
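The yield-time sidecar check that Sub-invariant 12.a describes could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the fixture's actual code: only the `PopulatedC6Cache` field names, the three-file triple, and the `IndexUnavailableError` name come from the contract text. The field types, the helper name `verify_sidecar_triple`, the `.sha256` sidecar holding a hex digest of the `.index` bytes, and the `.meta.json` check being limited to "parses as JSON" are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from pathlib import Path


class IndexUnavailableError(RuntimeError):
    """Local stand-in for the project's error type (name taken from the contract)."""


@dataclass(frozen=True)
class PopulatedC6Cache:
    # Field names follow Sub-invariant 12.a; the types are assumptions.
    cache_root: Path
    tile_store_path: Path
    faiss_index_path: Path
    faiss_sidecar_sha256_path: Path
    faiss_sidecar_meta_path: Path
    route_spec: object
    tile_count: int
    elapsed_seconds: float


def verify_sidecar_triple(cache: PopulatedC6Cache) -> None:
    """Raise IndexUnavailableError unless .index, .sha256 and .meta.json agree."""
    triple = (
        cache.faiss_index_path,
        cache.faiss_sidecar_sha256_path,
        cache.faiss_sidecar_meta_path,
    )
    missing = [str(p) for p in triple if not p.is_file()]
    if missing:
        raise IndexUnavailableError(f"sidecar triple incomplete: {missing}")

    actual = hashlib.sha256(cache.faiss_index_path.read_bytes()).hexdigest()
    # Assumed sidecar layout: the hex digest is the first whitespace-separated token.
    tokens = cache.faiss_sidecar_sha256_path.read_text().split()
    recorded = tokens[0].lower() if tokens else ""
    if actual != recorded:
        raise IndexUnavailableError("index digest does not match the .sha256 sidecar")

    try:
        json.loads(cache.faiss_sidecar_meta_path.read_text())
    except json.JSONDecodeError as exc:
        raise IndexUnavailableError("meta sidecar is not valid JSON") from exc
```

Running the check immediately before the fixture yields, rather than once after the cold build, matches the "verified at every fixture yield" wording and means a warm cache that was modified between sessions is rejected as well.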
+    **Sub-invariant 12.c (cycle 3 — Epic AZ-835: route-driven supersedes bbox)**: route-driven seeding (operator's tlog-derived `RouteSpec` → `POST /api/satellite/route` → corridor materialised by `satellite-provider`) supersedes the legacy AZ-777 bbox-driven approach (`POST /api/satellite/request` over a fixed lat/lon box) for the real-flight validation path. The supersedure rationale is twofold:
+
+    - **Tile efficiency (~100×)**: the AZ-777 bbox for a typical Derkachi-style flight produces ~11,400 z15-z18 tiles (~140 MB, 48 % over the C6 cache budget). A 10-point coarsened route with `regionSizeMeters=500` per point produces ~50-100 unique tiles (~1.5 MB) for the same VPR descriptor lock area. The route-driven path is the only one that fits the AZ-696 reference-fixture budget on Jetson.
+    - **Pre-commitment honesty**: a bbox pre-commits to where the operator *might* fly. A route pre-commits to where they *did* fly. For real-flight validation against ground-truth GPS, the latter is the right primitive — it ensures the FAISS index is populated with descriptors of the tiles the airborne pipeline will actually query, not a superset whose VPR misses are statistically indistinguishable from the AZ-696 AC-3 ≤ 100 m threshold violations.
+
+    AZ-777 Phase 1 (e2e-runner wiring + C11 read-contract adaptation) is **retained and reused** by Epic AZ-835. AZ-777 Phases 3 and 5 are **superseded** by Epic AZ-835 children (AZ-839 for the operator-fixture rewrite, AZ-842 for the docs work). Phase 4 (un-xfail of AC-4/AC-5) was deferred to backlog after cycle-4 AZ-895 took the un-xfail target along a different path; it is not on the active epic.
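The tile-efficiency figures in Sub-invariant 12.c can be sanity-checked with a few lines of arithmetic. The sketch below only restates the approximate numbers quoted above; the implied cache budget is derived from the "48 % over" statement and is an assumption, not a value read from the C6 configuration.

```python
# Approximate figures quoted in Sub-invariant 12.c.
BBOX_TILES = 11_400
BBOX_MB = 140.0
BBOX_OVER_BUDGET = 0.48      # "48 % over the C6 cache budget"
ROUTE_TILES = (50, 100)      # low and high end of "~50-100 unique tiles"
ROUTE_MB = 1.5

# Budget implied by "140 MB is 48 % over budget" (derived, not from config).
implied_budget_mb = BBOX_MB / (1.0 + BBOX_OVER_BUDGET)

# Tile-count reduction at both ends of the route estimate.
tile_ratio_low = BBOX_TILES / ROUTE_TILES[1]    # against 100 route tiles
tile_ratio_high = BBOX_TILES / ROUTE_TILES[0]   # against 50 route tiles
size_ratio = BBOX_MB / ROUTE_MB

print(f"implied C6 budget: about {implied_budget_mb:.1f} MB")
print(f"tile reduction: about {tile_ratio_low:.0f}x to {tile_ratio_high:.0f}x")
print(f"size reduction: about {size_ratio:.0f}x")
print(f"route share of budget: about {100 * ROUTE_MB / implied_budget_mb:.1f} %")
```

With the quoted inputs the tile-count reduction falls between roughly 114× and 228× and the size reduction is about 93×, which is consistent with the "~100×" label; the route-driven cache would occupy under 2 % of the implied budget.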
+
+    **Sub-invariant 12.d (cycle 3 — AZ-839 / Epic AZ-835 C3: fixture failure-handling contract)**: the `operator_pre_flight_setup` fixture must distinguish three failure classes from `SatelliteProviderRouteClient.seed_route` / `HttpTileDownloader.download_for_bbox` and surface them honestly:
+
+    | Class | Source | Fixture response |
+    |-------|--------|------------------|
+    | Validation | `RouteValidationError` (pre-emptive AZ-809 bound violation) or `IndexUnavailableError` (sidecar triple mismatch at yield-time) | Re-raise — operator/test author error, no remediation in the fixture |
+    | Terminal | `RouteTerminalFailureError` (satellite-provider rejected the route id or status polling returned `mapsReady=false` past `poll_max_attempts`) | Re-raise — service-side state cannot be recovered by retry |
+    | Transient | `RouteTransientError` or `TileDownloadError` with HTTP 5xx / network reset | **Retry up to 3 attempts** using C11's existing exponential backoff schedule (`HttpTileDownloader.RETRY_*` constants); re-raise on exhaustion |
+
+    The fixture does NOT swallow transient failures silently — the third attempt's exception surfaces with the full retry history in the message so the test report can distinguish "fixture genuinely tried 3×" from "fixture short-circuited". Cold-start budget of ≤ 5 min on Tier-2 Jetson is measured wall-clock around the entire retry loop, not per-attempt.
+
     **Sub-invariant 12.b (cycle 3 — AZ-840 / Epic AZ-835 C4)**: the E2E orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on Tier-2 Jetson — no operator hand-curation between steps. The 7 steps are: (1) active flight cut + tlog/video sync via AZ-405; (2) on-fly frame + IMU extraction; (3) auto-create route via AZ-836; (4) POST route to satellite-provider via the C3 fixture's `operator_pre_flight_setup` (delegates to AZ-838); (5) build FAISS index (driven by C3); (6) run gps-denied airborne pipeline against the populated cache + tlog/video/calibration (reuses the airborne composition root path AZ-699 exercises); (7) compute horizontal-error distribution and emit the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_.md`. The verdict report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696 ≥ 80 % within 100 m gate — the success criterion is that the report exists with the honest distribution, not that the verdict is PASS. Gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
 13. **C4↔C5 pairing matrix is enforced at compose time** (AZ-776 / ADR-012): `compose_root` rejects the two off-diagonal cells of the (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. `enabled=False` + `gtsam_isam2` and `enabled=True` + `eskf` are forbidden. The two valid cells are `enabled=True` + `gtsam_isam2` (production steady-state per ADR-003 / ADR-009) and `enabled=False` + `eskf` (open-loop ESKF — replay Tier-2 smoke baseline; satellite anchoring deferred to AZ-777). Verified by `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` AC-3a and AC-3b.
+14. **Single canonical clock & CSV-driven replay path (cycle 4 — AZ-894 / AZ-895 / AZ-896)**: production runs as a single edge process on a single device. There is exactly **one** wall/monotonic clock authoritative for timestamps that cross component boundaries — the clock at the C8 inbound boundary (`FcAdapter`) where IMU windows enter the system. Two-clock surfaces — for example a C1 `VioOutput.emitted_at_ns` derived from the Jetson `monotonic_ns()` paired against a C8 `ImuWindow.ts_end_ns` derived from FC-boot — produced the AZ-848 ESKF out-of-order regression observed in cycle 3 (Jetson clock advanced between IMU window arrival and VIO emission, so the VIO emission timestamp routinely landed *before* the IMU window's `ts_end_ns` when the two were compared as if on the same axis, and ESKF rejected its own VIO updates). All downstream timestamps (`EstimatorOutput.ts_ns`, `JsonlReplaySink` per-row `t`, FDR `flight_event.ts_ns`) MUST derive from a single canonical clock that produces deterministic per-record values for a given input. In live mode the canonical clock is the C8 inbound IMU window's FC-boot-relative timestamp; in replay mode it is the CSV row's `Time` column.
+
+    **Sub-invariant 14.a (CSV-driven replay path — AZ-894)**: the replay-mode operator input is `(video, CSV)`. The CSV row's `Time` column is the canonical clock for the entire replay run: every IMU window emitted by the new `csv_replay_input.CsvReplayInputAdapter` (gated `BUILD_CSV_REPLAY_ADAPTER=ON` in the airborne and research binaries) carries `ts_end_ns` derived from the CSV `Time` column; the `Clock` strategy injected into the composition root is `CsvDerivedClock` which uses the same column. There is no auto-sync (see 14.c below). The CSV must satisfy the format spec at `_docs/02_document/contracts/replay/csv_replay_format.md` (AZ-896) — including the requirement that row 0's `Time` equals video frame 0 (`t=0`) so the airborne pipeline does not need to apply any per-stream offset.
+
+    **Sub-invariant 14.b (tlog adapter audit-only role — AZ-895)**: `TlogReplayFcAdapter` (Sub-invariant 14 of the prior cycles' design) is retained in source for two audit / migration paths and removed from the replay test/demo critical path:
+
+    - **FDR analysis**: one-shot tlog parsing for incident review (e.g. AZ-848 timestamp investigation) — invoked from offline analysis scripts under `tools/`, not from the airborne composition root.
+    - **One-shot tlog → CSV export**: a CLI utility (`gps-denied-tlog-to-csv`) that reads a pymavlink tlog and writes the canonical CSV per AZ-896. This is the migration ramp for users who only have legacy tlog inputs.
+
+    The previous `compose_root(config={"mode": "replay", "replay_input.adapter": "tlog"})` code path is preserved with a one-cycle deprecation warning on startup; removal is tracked in AZ-908 (cycle-5+ backlog). The CSV adapter (`BUILD_CSV_REPLAY_ADAPTER=ON`) is the default and the only path the e2e fixture suite exercises after cycle 4.
+
+    **Sub-invariant 14.c (auto-sync deprecation — AZ-895)**: the `replay_input.auto_sync` module (AZ-405) is reduced to a deprecated no-op stub that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")` from every public entry point. The CLI flags `--time-offset-ms`, `--skip-auto-sync`, and `--auto-trim` are accepted with a deprecation warning and ignored. The justification: with a single canonical clock at the CSV row level (14.a), there is no second clock to align against — the operator authors the CSV with the correct row-0 alignment, and the fixture verifies row 0's `Time == 0`. Hard removal of the deprecated surface is tracked in AZ-908; this cycle ships only the stub + warnings to preserve source-compat for any downstream caller built against AZ-405's pre-deprecation shape.
+
+    **Sub-invariant 14.d (operator-facing UI — AZ-897, future cycle)**: the cycle-4 deliverable is the headless `gps-denied-replay --video X --imu Y` shape. An operator-facing web UI (single-page React + Tailwind form that uploads a paired `(video, CSV)` and tails the verdict) is tracked separately in AZ-897 and is NOT on the critical path of the CSV redesign; this sub-invariant exists only to record that the format spec (AZ-896) and the CSV adapter (AZ-894) MUST stay UI-friendly (CSV example, format docs link, clear error messages on row-0-misalignment) so AZ-897 lands without contract drift.

 ## Producer / Consumer Split

diff --git a/_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md b/_docs/02_tasks/done/AZ-842_replay_protocol_and_orchestrator_docs.md
similarity index 100%
rename from _docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md
rename to _docs/02_tasks/done/AZ-842_replay_protocol_and_orchestrator_docs.md
diff --git a/_docs/03_implementation/batch_04_cycle4_report.md b/_docs/03_implementation/batch_04_cycle4_report.md
new file mode 100644
index 0000000..d3ff8bb
--- /dev/null
+++ b/_docs/03_implementation/batch_04_cycle4_report.md
@@ -0,0 +1,134 @@
+# Batch Report — cycle 4, batch 04
+
+**Batch**: 04
+**Cycle**: 4
+**Tasks**: AZ-842
+**Total complexity**: 3 SP
+**Date**: 2026-05-29
+**Commit**: pending (this batch)
+
+## Task Selection
+
+AZ-842 (docs — replay_protocol.md Invariant 12 extension + Invariant 14
+cycle-4 + architecture.md AZ-777 supersession + cycle-4 redesign
+sub-section + tests/e2e/replay/README.md AZ-835 orchestrator-test
+section + license attribution) ships solo. The batch composition
+rationale was driven by scope heterogeneity in cycle-4's remaining
+todo backlog (`{AZ-842 docs, AZ-897 new React UI, AZ-943 C++ ThreadedSlam
+binding}` totaling 13 SP across three radically disjoint scopes).
+Single-task batch keeps code review tractable; AZ-897 and AZ-943 each
+remain non-trivial (5 SP) and trigger their own Complexity Budget Check
+when their batches start.
+
+## Task Results
+
+| Task | Status | Files Modified | Tests | AC Coverage | Issues |
+|------|--------|----------------|-------|-------------|--------|
+| AZ-842_replay_protocol_and_orchestrator_docs | Done | 3 modified | n/a (docs only) | 8/8 (AC-1, AC-1b, AC-2, AC-2b, AC-3, AC-4, AC-5, AC-6) | 1 documented spec deviation + 1 out-of-scope hygiene gap |
+
+### Files touched
+
+Documentation (`_docs/02_document/`):
+
+- MODIFIED `_docs/02_document/contracts/replay/replay_protocol.md`:
+  - Sub-invariant 12.c added — route-driven seeding supersedes the
+    legacy AZ-777 bbox-driven approach (~100× tile efficiency,
+    "did fly vs. might fly" honesty rationale).
+  - Sub-invariant 12.d added — fixture failure-handling contract
+    (validation/terminal re-raise; transient → C11 backoff retry × 3
+    with full-history-on-exhaust message).
+  - Invariant 14 added with sub-invariants 14.a-14.d covering
+    cycle-4's single-canonical-clock model, the CSV-driven primary
+    path (AZ-894), the tlog adapter's audit-only role (AZ-895), the
+    auto-sync deprecation (AZ-895), and the operator-UI follow-up
+    pointer (AZ-897).
+- MODIFIED `_docs/02_document/architecture.md`:
+  - Added "AZ-777 Phase 3+ superseded by Epic AZ-835" supersession
+    block inside the satellite-provider integration section.
+  - Added new sub-section "Replay input redesign (cycle 4 — single
+    canonical clock + CSV-driven path)" with a 4-row ticket table
+    (AZ-894 / AZ-895 / AZ-896 / AZ-897) and the architectural
+    rationale tying back to Invariant 14 of the replay protocol.
+
+Tests-adjacent documentation (`tests/e2e/replay/`):
+
+- MODIFIED `tests/e2e/replay/README.md`:
+  - Top header restructured for two distinct entry points
+    (AZ-265/AZ-404 derkachi_1min vs. AZ-835/AZ-840 orchestrator).
+  - New section "AZ-835 orchestrator test — full `(tlog, video,
+    calibration)` loop (Tier-2 only)" covering required inputs,
+    Tier-2 invocation (Jetson SSH + env vars), skip gates in
+    evaluation order, expected runtime (≤ 8 min cold, ≤ 60 s warm),
+    and verdict report location semantics.
+  - New section "Imagery source license attribution (dev/research
+    use only)" carrying the "Imagery © Google" attribution and the
+    production-deployment caveat (Google Maps Platform licensing
+    review or CC-BY migration TBD).
+  - New section "Epic AZ-835 ticket map" with explicit Jira links to
+    AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-842 + cycle-4 redesign
+    tickets AZ-894 / AZ-895 / AZ-896 / AZ-897.
+
+### AC verification
+
+Each AC verified by Grep on the modified file's content (no code-path
+tests exist for prose):
+
+| AC | Verification |
+|----|--------------|
+| AC-1 | `Sub-invariant 12.c` + `Sub-invariant 12.d` present in `replay_protocol.md` — bbox-supersedure rationale + transient-retry-3-attempts contract |
+| AC-1b | `Invariant 14` block with sub-invariants `14.a` (CSV path, AZ-894), `14.b` (tlog audit-only, AZ-895), `14.c` (auto-sync deprecation, AZ-895), `14.d` (UI follow-up, AZ-897), plus cross-link to `csv_replay_format.md` (AZ-896) |
+| AC-2 | `AZ-777 Phase 3+ superseded by Epic AZ-835` block in `architecture.md` satellite-provider integration section, pointing at AZ-839 (Phase 3) + AZ-842 (Phase 5) child tickets |
+| AC-2b | `### Replay input redesign (cycle 4 — single canonical clock + CSV-driven path)` sub-section in `architecture.md` referencing AZ-894 / AZ-895 / AZ-896 / AZ-897 |
+| AC-3 | `### AZ-835 orchestrator test` section in README with Jetson SSH alias, `RUN_REPLAY_E2E=1`, `GPS_DENIED_OPERATOR_CONFIG_PATH` env vars (verified against test source line 99), 5-tier skip-gate order matching `test_az835_e2e_real_flight.py` lines 29-36, expected runtime, and verdict report path |
+| AC-4 | Epic AZ-835 + children AZ-836 / AZ-838 / AZ-839 / AZ-840 + cycle-4 redesign AZ-894 / AZ-895 / AZ-896 / AZ-897 referenced in all three modified docs (AZ-841 omitted as an active-epic link per the AC; mentioned once in `architecture.md` AZ-777 supersession block as a backlog-deferred historical note only) |
+| AC-5 | `Imagery © Google` + `dev/research use only` strings present in `tests/e2e/replay/README.md` |
+| AC-6 | `_docs/02_tasks/_dependencies_table.md` preamble already covers AZ-835 + children + cycle-4 redesign (verified in cycle-3/cycle-4 prior preamble updates); `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` already carries the SUPERSEDED banner pointing at AZ-839 / AZ-841 / AZ-842 — both cross-reference obligations were satisfied by prior work and verified during this batch |
+
+## AC Test Coverage: 8 of 8 covered (docs-only — coverage = content presence verified by Grep)
+
+## Code Review Verdict: PASS_WITH_WARNINGS
+
+### Findings
+
+**Finding 1 — Spec deviation (documented, accepted by agent; flagged for user awareness)**
+
+- **Severity**: Medium
+- **Category**: Spec-Gap
+- **Location**: `_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md` lines 27, 37, 39, 65 (AC-1b)
+- **Description**: AC-1b directs "new Invariant 13 (cycle-4)" but Invariant 13 already exists in `replay_protocol.md` (C4↔C5 composition-profile pairing matrix, added by AZ-776 / ADR-012 cycle 3). It is referenced by number in `architecture.md:781` (ADR-012 consequences), `_docs/02_document/components/06_c4_pose/description.md:11` (component doc), and the AZ-776 unit test docstring.
+- **Resolution**: Added the cycle-4 content as **Invariant 14** instead. Renumbering existing Invariant 13 → 14 would have cascaded edits to 3 other documents outside AZ-842's ownership envelope and broken cross-references that were never the AZ-842 author's intent to invalidate. The AZ-842 spec was authored before the Invariant 13 collision was visible.
+- **Suggested follow-up**: refresh the local AZ-842 spec mirror to say "Invariant 14" in the AC text (post-close hygiene). Not a tracker-write blocker.
+
+**Finding 2 — Out-of-scope hygiene gap (do NOT auto-fix)**
+
+- **Severity**: Low
+- **Category**: Maintainability
+- **Location**: `_docs/02_document/module-layout.md` Build-Time Exclusion Map
+- **Description**: `BUILD_CSV_REPLAY_ADAPTER` flag is now mentioned in `_docs/02_document/architecture.md` and `_docs/02_document/contracts/replay/replay_protocol.md` (this batch's edits) and exists in `src/`, `docker-compose.test.yml`, `docker-compose.test.jetson.yml`, and unit tests, but is NOT enumerated in `module-layout.md`'s Build-Time Exclusion Map. Inherited gap from cycle-4 AZ-894.
+- **Resolution**: NOT fixed in this batch — `module-layout.md` is outside AZ-842's OWNED envelope (the file is owned by the decompose Step 1.5 / refactor cycle-3 AZ-846 cadence). Suggested as a cycle-5+ hygiene PBI (no blocker filed this session per scope-discipline rule).
+
+### Auto-fix Attempts
+
+0 — neither finding is auto-fix-eligible (Finding 1 is a documented design choice; Finding 2 is out of OWNED scope).
+
+## Stuck Agents: None
+
+## Jira description sync
+
+The Jira description on AZ-842 is the pre-cycle-4-rescope version
+(2 SP, AC-1..AC-6 without AC-1b / AC-2b / AC-7, no cycle-4 narrative).
+The local spec mirror is the more current source. Description sync
+will happen at the Step 12 transition (In Progress → In Testing) so
+the ticket-side AC list matches what shipped.
+
+## Next Batch
+
+Remaining cycle-4 todo backlog: AZ-897 (5 SP — first operator-facing
+React + Tailwind UI), AZ-943 (5 SP — OKVIS2 ThreadedSlam binding,
+replaces AZ-332 skeleton). AZ-835 epic file moves to `done/` with this
+batch (its last todo-leaf child AZ-842 closes here).
+
+Recommended next batch composition (subject to Complexity Budget
+Check at planning time): batch 05 = AZ-897 alone or batch 05 = AZ-943
+alone. Either ordering is valid — they have no inter-dependency. The
+implement skill's batch loop will re-evaluate.
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md
index baee421..d827158 100644
--- a/_docs/_autodev_state.md
+++ b/_docs/_autodev_state.md
@@ -6,9 +6,9 @@ step: 10
 name: Implement
 status: in_progress
 sub_step:
-  phase: 0
-  name: awaiting-invocation
-  detail: ""
+  phase: 7
+  name: batch-loop
+  detail: "batch 4 closed (AZ-842); next: AZ-897 or AZ-943"
 retry_count: 0
 cycle: 4
 tracker: jira
diff --git a/tests/e2e/replay/README.md b/tests/e2e/replay/README.md
index c8ae28b..3acf21e 100644
--- a/tests/e2e/replay/README.md
+++ b/tests/e2e/replay/README.md
@@ -1,20 +1,104 @@
-# E2E replay tests (AZ-404)
+# E2E replay tests (AZ-404 + AZ-835 + cycle-4)

-End-to-end regression suite that runs the `gps-denied-replay`
-console-script (AZ-402) against the Derkachi 60 s clip and asserts
-the AZ-265 epic acceptance criteria.
+End-to-end regression suite for the `gps-denied-replay` console-script
+(AZ-402). Two distinct entry points live here:
+
+| Entry point | Source | Coverage |
+|-------------|--------|----------|
+| **AZ-265 / AZ-404** — 60 s Derkachi clip with synthetic tlog | `test_derkachi_1min.py` | Original AC-1..AC-10 of the replay epic; runs on Tier-1 + Tier-2 |
+| **AZ-835 / AZ-840** — full `(tlog, video, calibration)` orchestrator | `test_az835_e2e_real_flight.py` | Tier-2 only; closes the real-flight validation loop end-to-end (extract → seed → FAISS → run → verdict) |
+
+The cycle-4 replay-input redesign (AZ-894 / AZ-895 / AZ-896) replaces
+the tlog auto-sync surface with a CSV-driven path; the AZ-265 suite is
+the regression net that catches drift in the legacy path during the
+deprecation window. See `replay_protocol.md` Invariants 12-14 for the
+authoritative contract.
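For orientation, the row-0 alignment rule that the CSV-driven path depends on (Sub-invariant 14.a of `replay_protocol.md`) could be checked before a run with something like the sketch below. It assumes only that the CSV has a header row with a `Time` column, as the contract states. The function name `check_row0_alignment`, the tolerance, and the extra non-decreasing check are illustrative assumptions and are not part of `CsvReplayInputAdapter`; the authoritative schema is `csv_replay_format.md`.

```python
import csv
import io
import math


def check_row0_alignment(csv_text: str, tol: float = 1e-9) -> int:
    """Return the number of data rows; raise ValueError on misalignment."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "Time" not in reader.fieldnames:
        raise ValueError("CSV has no 'Time' column")

    previous = None
    count = 0
    for row_number, row in enumerate(reader):
        t = float(row["Time"])
        # Contract: row 0's Time must equal video frame 0 (t=0).
        if row_number == 0 and not math.isclose(t, 0.0, abs_tol=tol):
            raise ValueError(f"row 0 Time is {t}, expected 0 (video frame 0)")
        # Assumption (not stated in the contract): the clock never steps backwards.
        if previous is not None and t < previous:
            raise ValueError(f"Time decreases at data row {row_number}")
        previous = t
        count += 1

    if count == 0:
        raise ValueError("CSV has no data rows")
    return count
```

The check is unit-agnostic on purpose: it does not assume whether `Time` is in seconds or milliseconds, only that the first row sits at zero.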
 ## How to run

+### AZ-404 Derkachi 60 s suite (Tier-1 + Tier-2)
+
 ```bash
 # In a fresh venv with the package installed:
-RUN_REPLAY_E2E=1 pytest tests/e2e/replay/ -v
+RUN_REPLAY_E2E=1 pytest tests/e2e/replay/test_derkachi_1min.py -v
 ```

 Without `RUN_REPLAY_E2E=1` the heavy tests skip cleanly. The two
 unconditional tests (AC-4a mode-agnosticism scan + AC-7 skip-gate
 self-check + the helpers in `test_helpers.py`) still run.

+### AZ-835 orchestrator test — full `(tlog, video, calibration)` loop (Tier-2 only)
+
+Closes Epic AZ-835's narrative: given a real-flight `.tlog` + the
+matching nadir video + camera calibration, the orchestrator runs the
+7-step pipeline end-to-end and writes a verdict report.
+
+**Required inputs** (already in-repo for the Derkachi reference fixture):
+
+- `.tlog` — pymavlink binary log from a real flight. Reference fixture:
+  `_docs/00_problem/input_data/flight_derkachi/data_imu.csv` (the canonical
+  CSV that `_tlog_synth.py` reconstructs the tlog from) plus the synthesised
+  tlog the conftest emits at session start.
+- Nadir video — `_docs/00_problem/input_data/flight_derkachi/*.mp4` (large
+  asset; not always checked in to the workstation clone — pull from the
+  Jetson e2e harness or git LFS if absent).
+- Calibration — `tests/fixtures/calibration/adti26.json` (factory-sheet
+  approximation for the Topotek KHP20S30; real intrinsics still TBD).
+
+**Tier-2 invocation** (Jetson):
+
+```bash
+ssh jetson-e2e
+cd /workspace/gps-denied-onboard
+export RUN_REPLAY_E2E=1
+export GPS_DENIED_OPERATOR_CONFIG_PATH=/workspace/configs/operator_replay.yaml
+pytest tests/e2e/replay/test_az835_e2e_real_flight.py -v --tb=short -m tier2
+```
+
+The bundled local-development entry point is `scripts/run-tests-jetson.sh`,
+which handles the SSH alias + rsync + remote pytest invocation. See
+`_docs/02_document/tests/tier2-jetson-testing.md` for the harness contract.
+
+**Skip gates (in evaluation order)**:
+
+1. `@pytest.mark.tier2` — the per-suite Tier-2 plugin gates this off on dev
+   macOS / Tier-1 Docker (matches the AZ-839 / AZ-699 contract).
+2. `RUN_REPLAY_E2E` not in `{1, true, yes, on}`.
+3. `gps-denied-replay` console-script not on `PATH`.
+4. Real Derkachi video missing or placeholder-sized.
+5. `operator_pre_flight_setup` fixture itself skipped — the downstream
+   consumer inherits the SKIP automatically (pytest's fixture-skip
+   propagation).
+
+**Expected runtime on Tier-2 Jetson AGX Orin** (cold cache): ≤ 8 min
+end-to-end (≤ 5 min C3 fixture cold-start budget + ≤ 3 min for the
+replay + verdict compute). Warm-cache reinvocations within the same
+compose session: ≤ 60 s.
+
+**Verdict report location**: `_docs/06_metrics/real_flight_validation_.md`.
+The report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696
+AC-3 threshold (≥ 80 % of emissions within 100 m of tlog ground truth).
+The success criterion at the fixture level is "honest report exists with
+distribution data", not "PASS". The PASS / FAIL line of the report itself
+is the operator-facing answer to "did this flight clip localise within
+the threshold".
+
+### Imagery source license attribution (dev/research use only)
+
+The Jetson e2e harness's `satellite-provider` instance downloads tiles
+from the **Google Maps satellite layer** (`mt0..mt3.google.com/vt/lyrs=s`),
+governed by Google Maps Platform Terms of Service. Every tile served by
+the harness carries the **"Imagery © Google"** attribution string.
+
+**This is dev/research use only.** Production deployment of the
+gps-denied-onboard companion against a Google-Maps-sourced
+`satellite-provider` requires either a Google Maps Platform licensing
+review or migration to a true CC-BY satellite source on the parent-suite
+side (parent-suite ticket TBD; see `_docs/02_document/architecture.md`
+§ `satellite-provider` integration). The onboard-side seed scripts
+(`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate
+the attribution into the test fixture's metadata; do not remove it.
+
 ## Fixture state

 | Artifact | Status | Source |
@@ -97,3 +181,25 @@ tests/e2e/replay/
 * **AZ-558** — closes AC-4b (route C8 encoders through `MavlinkTransport`).
 * **D-PROJ-2 mock-suite-sat-service** — unblocks AC-8 (operator workflow rehearsal).
+
+## Epic AZ-835 ticket map
+
+The Tier-2 orchestrator path shipped under Epic
+[AZ-835](https://denyspopov.atlassian.net/browse/AZ-835). Sub-tickets:
+
+| Ticket | Role |
+|--------|------|
+| [AZ-836](https://denyspopov.atlassian.net/browse/AZ-836) | `TlogRouteExtractor` — active-segment trim + Douglas-Peucker coarsen tlog GPS to ≤ N waypoints |
+| [AZ-838](https://denyspopov.atlassian.net/browse/AZ-838) | `SatelliteProviderRouteClient` + `seed_route.py` CLI — POST RouteSpec to satellite-provider, poll `mapsReady` |
+| [AZ-839](https://denyspopov.atlassian.net/browse/AZ-839) | C3 `operator_pre_flight_setup` real fixture — wires C1+C2+C11+C10 against the seeded catalog |
+| [AZ-840](https://denyspopov.atlassian.net/browse/AZ-840) | C4 E2E orchestrator test — drives the full 7-step pipeline from `(tlog, video, calibration)` |
+| [AZ-842](https://denyspopov.atlassian.net/browse/AZ-842) | C6 Docs — `replay_protocol.md` Invariants 12-14 + `architecture.md` + this README (cycle-4 rescope) |
+
+The cycle-4 replay-input redesign tickets ride alongside the Epic:
+
+| Ticket | Role |
+|--------|------|
+| [AZ-894](https://denyspopov.atlassian.net/browse/AZ-894) | `CsvReplayInputAdapter` — new CSV-driven primary path on the single canonical clock |
+| [AZ-895](https://denyspopov.atlassian.net/browse/AZ-895) | Auto-sync surface deprecation — tlog adapter reduced to audit-only role |
+| [AZ-896](https://denyspopov.atlassian.net/browse/AZ-896) | CSV format spec (`csv_replay_format.md`) + example `data_imu.csv` |
+| [AZ-897](https://denyspopov.atlassian.net/browse/AZ-897) | Operator-facing UI (React + Tailwind paired-upload form) — cycle 5+ |