mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 07:01:14 +00:00
[AZ-842] Batch 04 cycle 4: AZ-835 docs + cycle-4 redesign narrative
Closes AZ-835 Epic C6 (docs) and folds the cycle-4 replay-input redesign narrative (AZ-894 CSV adapter / AZ-895 auto-sync deprecation / AZ-896 format spec / AZ-897 UI follow-up) into the three authoritative documents. Modified: - _docs/02_document/contracts/replay/replay_protocol.md: extend Invariant 12 with sub-invariants 12.c (route-driven supersedes bbox; ~100x tile efficiency + did-fly-vs-might-fly honesty) and 12.d (fixture failure-handling: validation/terminal re-raise; transient -> C11 backoff x3). Add Invariant 14 with sub- invariants 14.a-14.d covering the single canonical clock model, the CSV-driven path, the tlog adapter's audit-only role, the auto-sync deprecation, and the AZ-897 UI follow-up pointer. - _docs/02_document/architecture.md: add the AZ-777 Phase 3+ superseded-by-Epic-AZ-835 supersession block + new "Replay input redesign (cycle 4)" sub-section with the cycle-4 ticket table. - tests/e2e/replay/README.md: top section restructured for two distinct entry points (AZ-265/AZ-404 vs. AZ-835/AZ-840); add full AZ-835 orchestrator-test section (env vars, skip gates, expected runtime, verdict report path); add Imagery (c) Google attribution + dev-only caveat; add Epic AZ-835 ticket map. Spec deviation: AC-1b says "new Invariant 13" but Invariant 13 is already taken (C4<->C5 pairing, AZ-776 / ADR-012), and is referenced by number in architecture.md, c4_pose description.md, and ADR-012 prose. Cycle-4 content shipped as Invariant 14 to preserve those cross-references; renumbering would have cascaded to 3 files outside AZ-842's ownership envelope. Documented in batch report. Out-of-scope hygiene gap (NOT fixed in this batch): BUILD_CSV_REPLAY_ADAPTER flag is not yet enumerated in _docs/02_document/module-layout.md's Build-Time Exclusion Map. Inherited from cycle-4 AZ-894. Suggested as a cycle-5+ hygiene PBI. AZ-835 epic file stays in todo/ until AZ-841 (backlog) is resolved. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -279,8 +279,25 @@ Two consequences for the architecture:
|
||||
|
||||
**Imagery source license attribution (cycle 3)**: the Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the parent-suite side (parent-suite ticket TBD). Operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution.
|
||||
|
||||
**AZ-777 Phase 3+ superseded by Epic AZ-835**: AZ-777 originally proposed five phases — wire e2e-runner (Phase 1), seed Derkachi bbox (Phase 2), rewrite `operator_pre_flight_setup` fixture (Phase 3), un-xfail AC-4 / AC-5 (Phase 4), docs (Phase 5). Phases 1+2 shipped under AZ-777 itself (batch 104, cycle 3). Phases 3 and 5 were **superseded** when the user redirected the work to a route-driven flow: Phase 3 → AZ-839 (real fixture wiring C1+C2+C11+C10), Phase 5 → AZ-842 (this docs ticket). Phase 4 (un-xfail) was deferred to backlog after the cycle-4 redesign (AZ-895) took the un-xfail target along a different path and is not on the active epic. The AZ-777 task spec at `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` carries the supersedure banner; this architecture document is the authoritative high-level pointer for that decision.
|
||||
|
||||
No new ADR — this is execution of existing decisions (architectural principle #5 satellite-provider on-disk layout end-to-end; ADR-004 process-level isolation unchanged; ADR-011 replay is a configuration unchanged). The architectural surface gained the route-driven seeding path inside C11; nothing else moved.
|
||||
|
||||
### Replay input redesign (cycle 4 — single canonical clock + CSV-driven path)
|
||||
|
||||
Cycle 4 rebuilt the replay-mode operator-input surface around a single canonical clock to close the AZ-848 ESKF out-of-order regression and to retire the tlog auto-sync surface that produced the misalignment risk in the first place. Four tickets ship the change:
|
||||
|
||||
| Ticket | Role | Description |
|
||||
|--------|------|-------------|
|
||||
| **AZ-894** (CSV adapter) | New primary path | `csv_replay_input.CsvReplayInputAdapter` consumes a paired `(video, CSV)` where the CSV's `Time` column is the canonical clock for every IMU/GPS sample. Gated `BUILD_CSV_REPLAY_ADAPTER=ON` in airborne and research binaries; OFF in operator-orchestrator. |
|
||||
| **AZ-895** (auto-sync deprecation) | Removed legacy | `replay_input.auto_sync` (AZ-405) reduced to a no-op stub that raises on first call; `tlog_video_adapter.py` reduced to a deprecated stub whose `open()` raises immediately. The legacy `--time-offset-ms` / `--skip-auto-sync` / `--auto-trim` CLI flags accepted-with-warning, ignored. Hard removal tracked in AZ-908 (cycle 5+ backlog). |
|
||||
| **AZ-896** (CSV format spec) | Contract | `_docs/02_document/contracts/replay/csv_replay_format.md` documents the CSV row schema, the row-0-alignment-with-video-frame-0 invariant, and an example `data_imu.csv` shipped under the same path. |
|
||||
| **AZ-897** (operator UI) | Cycle-5+ follow-up | First operator-facing UI surface — a React + Tailwind single-page form that uploads a paired `(video, CSV)`, links to AZ-896's format docs + example CSV, and tails the verdict from the headless `gps-denied-replay` invocation. Not on cycle-4 critical path; flagged here so the CSV format stays UI-friendly. |
|
||||
|
||||
The architectural rationale is captured in **Invariant 14** of the replay protocol (`_docs/02_document/contracts/replay/replay_protocol.md`): the system runs as a single edge process on a single device; there must be exactly one wall/monotonic clock authoritative for timestamps that cross component boundaries. In live mode that clock is the C8 inbound `FcAdapter`'s FC-boot-relative timestamp; in replay mode (after cycle 4) it is the CSV row's `Time` column. The previous design's two-clock surface (Jetson monotonic at C1 VIO emission, FC-boot at C8 IMU window arrival) produced the AZ-848 regression and is retired with the auto-sync deprecation.
|
||||
|
||||
The legacy `TlogReplayFcAdapter` is retained for two audit-only paths — offline FDR analysis from `tools/` and a one-shot `gps-denied-tlog-to-csv` migration utility that exports legacy tlog inputs to the canonical CSV. Neither path runs from the airborne composition root after cycle 4.
|
||||
|
||||
### `satellite-provider` upload contract (per D-PROJ-2 carryforward)
|
||||
|
||||
The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. From this architecture's standpoint:
|
||||
|
||||
@@ -257,8 +257,39 @@ The two **invalid** cells (`true` + `eskf` and `false` + `gtsam_isam2`) raise `C
|
||||
|
||||
**Sub-invariant 12.a (cycle 3 — AZ-839 / Epic AZ-835 C3)**: the e2e `operator_pre_flight_setup` fixture replaces the cycle-1 `mkdir` placeholder with a real driver that wires C1 (`replay_input.tlog_route.extract_route_from_tlog` — AZ-836) + C2 (`c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route` — AZ-838) + C11 (`tile_downloader.HttpTileDownloader.download_for_bbox`) + C10 (`DescriptorBatcher`) to populate C6 from a tlog-derived corridor. The fixture yields a `PopulatedC6Cache` dataclass (`cache_root`, `tile_store_path`, `faiss_index_path`, `faiss_sidecar_sha256_path`, `faiss_sidecar_meta_path`, `route_spec`, `tile_count`, `elapsed_seconds`). The cache is mounted into a named docker volume that survives across pytest sessions (cold first invocation populates; subsequent invocations within the same compose session reuse — warm cache). Cold-start budget: ≤ 5 min on Tier-2 Jetson; warm: ≤ 30 s. Sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306 is verified at every fixture yield; mismatch raises `IndexUnavailableError`. The C12 production binding for the route-driven path is a future-cycle integration; production pre-flight still uses the bbox-driven `download_tiles_for_area` path today.
|
||||
|
||||
**Sub-invariant 12.c (cycle 3 — Epic AZ-835: route-driven supersedes bbox)**: route-driven seeding (operator's tlog-derived `RouteSpec` → `POST /api/satellite/route` → corridor materialised by `satellite-provider`) supersedes the legacy AZ-777 bbox-driven approach (`POST /api/satellite/request` over a fixed lat/lon box) for the real-flight validation path. The supersedure rationale is twofold:
|
||||
|
||||
- **Tile efficiency (~100×)**: the AZ-777 bbox for a typical Derkachi-style flight produces ~11,400 z15-z18 tiles (~140 MB, 48 % over the C6 cache budget). A 10-point coarsened route with `regionSizeMeters=500` per point produces ~50-100 unique tiles (~1.5 MB) for the same VPR descriptor lock area. The route-driven path is the only one that fits the AZ-696 reference-fixture budget on Jetson.
|
||||
- **Pre-commitment honesty**: a bbox pre-commits to where the operator *might* fly. A route pre-commits to where they *did* fly. For real-flight validation against ground-truth GPS, the latter is the right primitive — it ensures the FAISS index is populated with descriptors of the tiles the airborne pipeline will actually query, not a superset whose VPR misses are statistically indistinguishable from the AZ-696 AC-3 ≤ 100 m threshold violations.
|
||||
|
||||
AZ-777 Phase 1 (e2e-runner wiring + C11 read-contract adaptation) is **retained and reused** by Epic AZ-835. AZ-777 Phases 3 and 5 are **superseded** by Epic AZ-835 children (AZ-839 for the operator-fixture rewrite, AZ-842 for the docs work). Phase 4 (un-xfail of AC-4/AC-5) was deferred to backlog after cycle-4 AZ-895 took the un-xfail target along a different path; it is not on the active epic.
|
||||
|
||||
**Sub-invariant 12.d (cycle 3 — AZ-839 / Epic AZ-835 C3: fixture failure-handling contract)**: the `operator_pre_flight_setup` fixture must distinguish three failure classes from `SatelliteProviderRouteClient.seed_route` / `HttpTileDownloader.download_for_bbox` and surface them honestly:
|
||||
|
||||
| Class | Source | Fixture response |
|
||||
|-------|--------|------------------|
|
||||
| Validation | `RouteValidationError` (pre-emptive AZ-809 bound violation) or `IndexUnavailableError` (sidecar triple mismatch at yield-time) | Re-raise — operator/test author error, no remediation in the fixture |
|
||||
| Terminal | `RouteTerminalFailureError` (satellite-provider rejected the route id or status polling returned `mapsReady=false` past `poll_max_attempts`) | Re-raise — service-side state cannot be recovered by retry |
|
||||
| Transient | `RouteTransientError` or `TileDownloadError` with HTTP 5xx / network reset | **Retry up to 3 attempts** using C11's existing exponential backoff schedule (`HttpTileDownloader.RETRY_*` constants); re-raise on exhaustion |
|
||||
|
||||
The fixture does NOT swallow transient failures silently — the third attempt's exception surfaces with the full retry history in the message so the test report can distinguish "fixture genuinely tried 3×" from "fixture short-circuited". Cold-start budget of ≤ 5 min on Tier-2 Jetson is measured wall-clock around the entire retry loop, not per-attempt.
|
||||
|
||||
**Sub-invariant 12.b (cycle 3 — AZ-840 / Epic AZ-835 C4)**: the E2E orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on Tier-2 Jetson — no operator hand-curation between steps. The 7 steps are: (1) active flight cut + tlog/video sync via AZ-405; (2) on-fly frame + IMU extraction; (3) auto-create route via AZ-836; (4) POST route to satellite-provider via the C3 fixture's `operator_pre_flight_setup` (delegates to AZ-838); (5) build FAISS index (driven by C3); (6) run gps-denied airborne pipeline against the populated cache + tlog/video/calibration (reuses the airborne composition root path AZ-699 exercises); (7) compute horizontal-error distribution and emit the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`. The verdict report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696 ≥ 80 % within 100 m gate — the success criterion is that the report exists with the honest distribution, not that the verdict is PASS. Gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
|
||||
13. **C4↔C5 pairing matrix is enforced at compose time** (AZ-776 / ADR-012): `compose_root` rejects the two off-diagonal cells of the (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. `enabled=False` + `gtsam_isam2` and `enabled=True` + `eskf` are forbidden. The two valid cells are `enabled=True` + `gtsam_isam2` (production steady-state per ADR-003 / ADR-009) and `enabled=False` + `eskf` (open-loop ESKF — replay Tier-2 smoke baseline; satellite anchoring deferred to AZ-777). Verified by `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` AC-3a and AC-3b.
|
||||
14. **Single canonical clock & CSV-driven replay path (cycle 4 — AZ-894 / AZ-895 / AZ-896)**: production runs as a single edge process on a single device. There is exactly **one** wall/monotonic clock authoritative for timestamps that cross component boundaries — the clock at the C8 inbound boundary (`FcAdapter`) where IMU windows enter the system. Two-clock surfaces — for example a C1 `VioOutput.emitted_at_ns` derived from the Jetson `monotonic_ns()` paired against a C8 `ImuWindow.ts_end_ns` derived from FC-boot — produced the AZ-848 ESKF out-of-order regression observed in cycle 3 (Jetson clock advanced between IMU window arrival and VIO emission, so the VIO emission timestamp routinely landed *before* the IMU window's `ts_end_ns` when the two were compared as if on the same axis, and ESKF rejected its own VIO updates). All downstream timestamps (`EstimatorOutput.ts_ns`, `JsonlReplaySink` per-row `t`, FDR `flight_event.ts_ns`) MUST derive from a single canonical clock that produces deterministic per-record values for a given input. In live mode the canonical clock is the C8 inbound IMU window's FC-boot-relative timestamp; in replay mode it is the CSV row's `Time` column.
|
||||
|
||||
**Sub-invariant 14.a (CSV-driven replay path — AZ-894)**: the replay-mode operator input is `(video, CSV)`. The CSV row's `Time` column is the canonical clock for the entire replay run: every IMU window emitted by the new `csv_replay_input.CsvReplayInputAdapter` (gated `BUILD_CSV_REPLAY_ADAPTER=ON` in the airborne and research binaries) carries `ts_end_ns` derived from the CSV `Time` column; the `Clock` strategy injected into the composition root is `CsvDerivedClock` which uses the same column. There is no auto-sync (see 14.c below). The CSV must satisfy the format spec at `_docs/02_document/contracts/replay/csv_replay_format.md` (AZ-896) — including the requirement that row 0's `Time` equals video frame 0 (`t=0`) so the airborne pipeline does not need to apply any per-stream offset.
|
||||
|
||||
**Sub-invariant 14.b (tlog adapter audit-only role — AZ-895)**: `TlogReplayFcAdapter` (Sub-invariant 14 of the prior cycles' design) is retained in source for two audit / migration paths and removed from the replay test/demo critical path:
|
||||
|
||||
- **FDR analysis**: one-shot tlog parsing for incident review (e.g. AZ-848 timestamp investigation) — invoked from offline analysis scripts under `tools/`, not from the airborne composition root.
|
||||
- **One-shot tlog → CSV export**: a CLI utility (`gps-denied-tlog-to-csv`) that reads a pymavlink tlog and writes the canonical CSV per AZ-896. This is the migration ramp for users who only have legacy tlog inputs.
|
||||
|
||||
The previous `compose_root(config={"mode": "replay", "replay_input.adapter": "tlog"})` code path is preserved with a one-cycle deprecation warning on startup; removal is tracked in AZ-908 (cycle-5+ backlog). The CSV adapter (`BUILD_CSV_REPLAY_ADAPTER=ON`) is the default and the only path the e2e fixture suite exercises after cycle 4.
|
||||
|
||||
**Sub-invariant 14.c (auto-sync deprecation — AZ-895)**: the `replay_input.auto_sync` module (AZ-405) is reduced to a deprecated no-op stub that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")` from every public entry point. The CLI flags `--time-offset-ms`, `--skip-auto-sync`, and `--auto-trim` are accepted with a deprecation warning and ignored. The justification: with a single canonical clock at the CSV row level (14.a), there is no second clock to align against — the operator authors the CSV with the correct row-0 alignment, and the fixture verifies row 0's `Time == 0`. Hard removal of the deprecated surface is tracked in AZ-908; this cycle ships only the stub + warnings to preserve source-compat for any downstream caller built against AZ-405's pre-deprecation shape.
|
||||
|
||||
**Sub-invariant 14.d (operator-facing UI — AZ-897, future cycle)**: the cycle-4 deliverable is the headless `gps-denied-replay --video X --imu Y` shape. An operator-facing web UI (single-page React + Tailwind form that uploads a paired `(video, CSV)` and tails the verdict) is tracked separately in AZ-897 and is NOT on the critical path of the CSV redesign; this sub-invariant exists only to record that the format spec (AZ-896) and the CSV adapter (AZ-894) MUST stay UI-friendly (CSV example, format docs link, clear error messages on row-0-misalignment) so AZ-897 lands without contract drift.
|
||||
|
||||
## Producer / Consumer Split
|
||||
|
||||
|
||||
@@ -0,0 +1,134 @@
|
||||
# Batch Report — cycle 4, batch 04
|
||||
|
||||
**Batch**: 04
|
||||
**Cycle**: 4
|
||||
**Tasks**: AZ-842
|
||||
**Total complexity**: 3 SP
|
||||
**Date**: 2026-05-29
|
||||
**Commit**: pending (this batch)
|
||||
|
||||
## Task Selection
|
||||
|
||||
AZ-842 (docs — replay_protocol.md Invariant 12 extension + Invariant 14
|
||||
cycle-4 + architecture.md AZ-777 supersession + cycle-4 redesign
|
||||
sub-section + tests/e2e/replay/README.md AZ-835 orchestrator-test
|
||||
section + license attribution) ships solo. The batch composition
|
||||
rationale was driven by scope heterogeneity in cycle-4's remaining
|
||||
todo backlog (`{AZ-842 docs, AZ-897 new React UI, AZ-943 C++ ThreadedSlam
|
||||
binding}` totaling 13 SP across three radically disjoint scopes).
|
||||
Single-task batch keeps code review tractable; AZ-897 and AZ-943 each
|
||||
remain non-trivial (5 SP) and trigger their own Complexity Budget Check
|
||||
when their batches start.
|
||||
|
||||
## Task Results
|
||||
|
||||
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|
||||
|------|--------|----------------|-------|-------------|--------|
|
||||
| AZ-842_replay_protocol_and_orchestrator_docs | Done | 3 modified | n/a (docs only) | 8/8 (AC-1, AC-1b, AC-2, AC-2b, AC-3, AC-4, AC-5, AC-6) | 1 documented spec deviation + 1 out-of-scope hygiene gap |
|
||||
|
||||
### Files touched
|
||||
|
||||
Documentation (`_docs/02_document/`):
|
||||
|
||||
- MODIFIED `_docs/02_document/contracts/replay/replay_protocol.md`:
|
||||
- Sub-invariant 12.c added — route-driven seeding supersedes the
|
||||
legacy AZ-777 bbox-driven approach (~100× tile efficiency,
|
||||
"did fly vs. might fly" honesty rationale).
|
||||
- Sub-invariant 12.d added — fixture failure-handling contract
|
||||
(validation/terminal re-raise; transient → C11 backoff retry × 3
|
||||
with full-history-on-exhaust message).
|
||||
- Invariant 14 added with sub-invariants 14.a-14.d covering
|
||||
cycle-4's single-canonical-clock model, the CSV-driven primary
|
||||
path (AZ-894), the tlog adapter's audit-only role (AZ-895), the
|
||||
auto-sync deprecation (AZ-895), and the operator-UI follow-up
|
||||
pointer (AZ-897).
|
||||
- MODIFIED `_docs/02_document/architecture.md`:
|
||||
- Added "AZ-777 Phase 3+ superseded by Epic AZ-835" supersession
|
||||
block inside the satellite-provider integration section.
|
||||
- Added new sub-section "Replay input redesign (cycle 4 — single
|
||||
canonical clock + CSV-driven path)" with a 4-row ticket table
|
||||
(AZ-894 / AZ-895 / AZ-896 / AZ-897) and the architectural
|
||||
rationale tying back to Invariant 14 of the replay protocol.
|
||||
|
||||
Tests-adjacent documentation (`tests/e2e/replay/`):
|
||||
|
||||
- MODIFIED `tests/e2e/replay/README.md`:
|
||||
- Top header restructured for two distinct entry points
|
||||
(AZ-265/AZ-404 derkachi_1min vs. AZ-835/AZ-840 orchestrator).
|
||||
- New section "AZ-835 orchestrator test — full `(tlog, video,
|
||||
calibration)` loop (Tier-2 only)" covering required inputs,
|
||||
Tier-2 invocation (Jetson SSH + env vars), skip gates in
|
||||
evaluation order, expected runtime (≤ 8 min cold, ≤ 60 s warm),
|
||||
and verdict report location semantics.
|
||||
- New section "Imagery source license attribution (dev/research
|
||||
use only)" carrying the "Imagery © Google" attribution and the
|
||||
production-deployment caveat (Google Maps Platform licensing
|
||||
review or CC-BY migration TBD).
|
||||
- New section "Epic AZ-835 ticket map" with explicit Jira links to
|
||||
AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-842 + cycle-4 redesign
|
||||
tickets AZ-894 / AZ-895 / AZ-896 / AZ-897.
|
||||
|
||||
### AC verification
|
||||
|
||||
Each AC verified by Grep on the modified file's content (no code-path
|
||||
tests exist for prose):
|
||||
|
||||
| AC | Verification |
|
||||
|----|--------------|
|
||||
| AC-1 | `Sub-invariant 12.c` + `Sub-invariant 12.d` present in `replay_protocol.md` — bbox-supersedure rationale + transient-retry-3-attempts contract |
|
||||
| AC-1b | `Invariant 14` block with sub-invariants `14.a` (CSV path, AZ-894), `14.b` (tlog audit-only, AZ-895), `14.c` (auto-sync deprecation, AZ-895), `14.d` (UI follow-up, AZ-897), plus cross-link to `csv_replay_format.md` (AZ-896) |
|
||||
| AC-2 | `AZ-777 Phase 3+ superseded by Epic AZ-835` block in `architecture.md` satellite-provider integration section, pointing at AZ-839 (Phase 3) + AZ-842 (Phase 5) child tickets |
|
||||
| AC-2b | `### Replay input redesign (cycle 4 — single canonical clock + CSV-driven path)` sub-section in `architecture.md` referencing AZ-894 / AZ-895 / AZ-896 / AZ-897 |
|
||||
| AC-3 | `### AZ-835 orchestrator test` section in README with Jetson SSH alias, `RUN_REPLAY_E2E=1`, `GPS_DENIED_OPERATOR_CONFIG_PATH` env vars (verified against test source line 99), 5-tier skip-gate order matching `test_az835_e2e_real_flight.py` lines 29-36, expected runtime, and verdict report path |
|
||||
| AC-4 | Epic AZ-835 + children AZ-836 / AZ-838 / AZ-839 / AZ-840 + cycle-4 redesign AZ-894 / AZ-895 / AZ-896 / AZ-897 referenced in all three modified docs (AZ-841 omitted as an active-epic link per the AC; mentioned once in `architecture.md` AZ-777 supersession block as a backlog-deferred historical note only) |
|
||||
| AC-5 | `Imagery © Google` + `dev/research use only` strings present in `tests/e2e/replay/README.md` |
|
||||
| AC-6 | `_docs/02_tasks/_dependencies_table.md` preamble already covers AZ-835 + children + cycle-4 redesign (verified in cycle-3/cycle-4 prior preamble updates); `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` already carries the SUPERSEDED banner pointing at AZ-839 / AZ-841 / AZ-842 — both cross-reference obligations were satisfied by prior work and verified during this batch |
|
||||
|
||||
## AC Test Coverage: 8 of 8 covered (docs-only — coverage = content presence verified by Grep)
|
||||
|
||||
## Code Review Verdict: PASS_WITH_WARNINGS
|
||||
|
||||
### Findings
|
||||
|
||||
**Finding 1 — Spec deviation (documented, accepted by agent; flagged for user awareness)**
|
||||
|
||||
- **Severity**: Medium
|
||||
- **Category**: Spec-Gap
|
||||
- **Location**: `_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md` lines 27, 37, 39, 65 (AC-1b)
|
||||
- **Description**: AC-1b directs "new Invariant 13 (cycle-4)" but Invariant 13 already exists in `replay_protocol.md` (C4↔C5 composition-profile pairing matrix, added by AZ-776 / ADR-012 cycle 3). It is referenced by number in `architecture.md:781` (ADR-012 consequences), `_docs/02_document/components/06_c4_pose/description.md:11` (component doc), and the AZ-776 unit test docstring.
|
||||
- **Resolution**: Added the cycle-4 content as **Invariant 14** instead. Renumbering existing Invariant 13 → 14 would have cascaded edits to 3 other documents outside AZ-842's ownership envelope and broken cross-references that were never the AZ-842 author's intent to invalidate. The AZ-842 spec was authored before the Invariant 13 collision was visible.
|
||||
- **Suggested follow-up**: refresh the local AZ-842 spec mirror to say "Invariant 14" in the AC text (post-close hygiene). Not a tracker-write blocker.
|
||||
|
||||
**Finding 2 — Out-of-scope hygiene gap (do NOT auto-fix)**
|
||||
|
||||
- **Severity**: Low
|
||||
- **Category**: Maintainability
|
||||
- **Location**: `_docs/02_document/module-layout.md` Build-Time Exclusion Map
|
||||
- **Description**: `BUILD_CSV_REPLAY_ADAPTER` flag is now mentioned in `_docs/02_document/architecture.md` and `_docs/02_document/contracts/replay/replay_protocol.md` (this batch's edits) and exists in `src/`, `docker-compose.test.yml`, `docker-compose.test.jetson.yml`, and unit tests, but is NOT enumerated in `module-layout.md`'s Build-Time Exclusion Map. Inherited gap from cycle-4 AZ-894.
|
||||
- **Resolution**: NOT fixed in this batch — `module-layout.md` is outside AZ-842's OWNED envelope (the file is owned by the decompose Step 1.5 / refactor cycle-3 AZ-846 cadence). Suggested as a cycle-5+ hygiene PBI (no blocker filed this session per scope-discipline rule).
|
||||
|
||||
### Auto-fix Attempts
|
||||
|
||||
0 — neither finding is auto-fix-eligible (Finding 1 is a documented design choice; Finding 2 is out of OWNED scope).
|
||||
|
||||
## Stuck Agents: None
|
||||
|
||||
## Jira description sync
|
||||
|
||||
The Jira description on AZ-842 is the pre-cycle-4-rescope version
|
||||
(2 SP, AC-1..AC-6 without AC-1b / AC-2b / AC-7, no cycle-4 narrative).
|
||||
The local spec mirror is the more current source. Description sync
|
||||
will happen at the Step 12 transition (In Progress → In Testing) so
|
||||
the ticket-side AC list matches what shipped.
|
||||
|
||||
## Next Batch
|
||||
|
||||
Remaining cycle-4 todo backlog: AZ-897 (5 SP — first operator-facing
|
||||
React + Tailwind UI), AZ-943 (5 SP — OKVIS2 ThreadedSlam binding,
|
||||
replaces AZ-332 skeleton). AZ-835 epic file moves to `done/` with this
|
||||
batch (its last todo-leaf child AZ-842 closes here).
|
||||
|
||||
Recommended next batch composition (subject to Complexity Budget
|
||||
Check at planning time): batch 05 = AZ-897 alone or batch 05 = AZ-943
|
||||
alone. Either ordering is valid — they have no inter-dependency. The
|
||||
implement skill's batch loop will re-evaluate.
|
||||
@@ -6,9 +6,9 @@ step: 10
|
||||
name: Implement
|
||||
status: in_progress
|
||||
sub_step:
|
||||
phase: 0
|
||||
name: awaiting-invocation
|
||||
detail: ""
|
||||
phase: 7
|
||||
name: batch-loop
|
||||
detail: "batch 4 closed (AZ-842); next: AZ-897 or AZ-943"
|
||||
retry_count: 0
|
||||
cycle: 4
|
||||
tracker: jira
|
||||
|
||||
+111
-5
@@ -1,20 +1,104 @@
|
||||
# E2E replay tests (AZ-404)
|
||||
# E2E replay tests (AZ-404 + AZ-835 + cycle-4)
|
||||
|
||||
End-to-end regression suite that runs the `gps-denied-replay`
|
||||
console-script (AZ-402) against the Derkachi 60 s clip and asserts
|
||||
the AZ-265 epic acceptance criteria.
|
||||
End-to-end regression suite for the `gps-denied-replay` console-script
|
||||
(AZ-402). Two distinct entry points live here:
|
||||
|
||||
| Entry point | Source | Coverage |
|
||||
|-------------|--------|----------|
|
||||
| **AZ-265 / AZ-404** — 60 s Derkachi clip with synthetic tlog | `test_derkachi_1min.py` | Original AC-1..AC-10 of the replay epic; runs on Tier-1 + Tier-2 |
|
||||
| **AZ-835 / AZ-840** — full `(tlog, video, calibration)` orchestrator | `test_az835_e2e_real_flight.py` | Tier-2 only; closes the real-flight validation loop end-to-end (extract → seed → FAISS → run → verdict) |
|
||||
|
||||
The cycle-4 replay-input redesign (AZ-894 / AZ-895 / AZ-896) replaces
|
||||
the tlog auto-sync surface with a CSV-driven path; the AZ-265 suite is
|
||||
the regression net that catches drift in the legacy path during the
|
||||
deprecation window. See `replay_protocol.md` Invariants 12-14 for the
|
||||
authoritative contract.
|
||||
|
||||
## How to run
|
||||
|
||||
### AZ-404 Derkachi 60 s suite (Tier-1 + Tier-2)
|
||||
|
||||
```bash
|
||||
# In a fresh venv with the package installed:
|
||||
RUN_REPLAY_E2E=1 pytest tests/e2e/replay/ -v
|
||||
RUN_REPLAY_E2E=1 pytest tests/e2e/replay/test_derkachi_1min.py -v
|
||||
```
|
||||
|
||||
Without `RUN_REPLAY_E2E=1` the heavy tests skip cleanly. The two
|
||||
unconditional tests (AC-4a mode-agnosticism scan + AC-7 skip-gate
|
||||
self-check + the helpers in `test_helpers.py`) still run.
|
||||
|
||||
### AZ-835 orchestrator test — full `(tlog, video, calibration)` loop (Tier-2 only)
|
||||
|
||||
Closes Epic AZ-835's narrative: given a real-flight `.tlog` + the
|
||||
matching nadir video + camera calibration, the orchestrator runs the
|
||||
7-step pipeline end-to-end and writes a verdict report.
|
||||
|
||||
**Required inputs** (already in-repo for the Derkachi reference fixture):
|
||||
|
||||
- `.tlog` — pymavlink binary log from a real flight. Reference fixture:
|
||||
`_docs/00_problem/input_data/flight_derkachi/data_imu.csv` (the canonical
|
||||
CSV that `_tlog_synth.py` reconstructs the tlog from) plus the synthesised
|
||||
tlog the conftest emits at session start.
|
||||
- Nadir video — `_docs/00_problem/input_data/flight_derkachi/*.mp4` (large
|
||||
asset; not always checked in to the workstation clone — pull from the
|
||||
Jetson e2e harness or git LFS if absent).
|
||||
- Calibration — `tests/fixtures/calibration/adti26.json` (factory-sheet
|
||||
approximation for the Topotek KHP20S30; real intrinsics still TBD).
|
||||
|
||||
**Tier-2 invocation** (Jetson):
|
||||
|
||||
```bash
|
||||
ssh jetson-e2e
|
||||
cd /workspace/gps-denied-onboard
|
||||
export RUN_REPLAY_E2E=1
|
||||
export GPS_DENIED_OPERATOR_CONFIG_PATH=/workspace/configs/operator_replay.yaml
|
||||
pytest tests/e2e/replay/test_az835_e2e_real_flight.py -v --tb=short -m tier2
|
||||
```
|
||||
|
||||
The bundled local-development entry point is `scripts/run-tests-jetson.sh`,
|
||||
which handles the SSH alias + rsync + remote pytest invocation. See
|
||||
`_docs/02_document/tests/tier2-jetson-testing.md` for the harness contract.
|
||||
|
||||
**Skip gates (in evaluation order)**:
|
||||
|
||||
1. `@pytest.mark.tier2` — the per-suite Tier-2 plugin gates this off on dev
|
||||
macOS / Tier-1 Docker (matches the AZ-839 / AZ-699 contract).
|
||||
2. `RUN_REPLAY_E2E` not in `{1, true, yes, on}`.
|
||||
3. `gps-denied-replay` console-script not on `PATH`.
|
||||
4. Real Derkachi video missing or placeholder-sized.
|
||||
5. `operator_pre_flight_setup` fixture itself skipped — the downstream
|
||||
consumer inherits the SKIP automatically (pytest's fixture-skip
|
||||
propagation).
|
||||
|
||||
**Expected runtime on Tier-2 Jetson AGX Orin** (cold cache): ≤ 8 min
|
||||
end-to-end (≤ 5 min C3 fixture cold-start budget + ≤ 3 min for the
|
||||
replay + verdict compute). Warm-cache reinvocations within the same
|
||||
compose session: ≤ 60 s.
|
||||
|
||||
**Verdict report location**: `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`.
|
||||
The report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696
|
||||
AC-3 threshold (≥ 80 % of emissions within 100 m of tlog ground truth).
|
||||
The success criterion at the fixture level is "honest report exists with
|
||||
distribution data", not "PASS". The PASS / FAIL line of the report itself
|
||||
is the operator-facing answer to "did this flight clip localise within
|
||||
the threshold".
|
||||
|
||||
### Imagery source license attribution (dev/research use only)
|
||||
|
||||
The Jetson e2e harness's `satellite-provider` instance downloads tiles
|
||||
from the **Google Maps satellite layer** (`mt0..mt3.google.com/vt/lyrs=s`),
|
||||
governed by Google Maps Platform Terms of Service. Every tile served by
|
||||
the harness carries the **"Imagery © Google"** attribution string.
|
||||
|
||||
**This is dev/research use only.** Production deployment of the
|
||||
gps-denied-onboard companion against a Google-Maps-sourced
|
||||
`satellite-provider` requires either a Google Maps Platform licensing
|
||||
review or migration to a true CC-BY satellite source on the parent-suite
|
||||
side (parent-suite ticket TBD; see `_docs/02_document/architecture.md`
|
||||
§ `satellite-provider` integration). The onboard-side seed scripts
|
||||
(`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate
|
||||
the attribution into the test fixture's metadata; do not remove it.
|
||||
|
||||
## Fixture state
|
||||
|
||||
| Artifact | Status | Source |
|
||||
@@ -97,3 +181,25 @@ tests/e2e/replay/
|
||||
* **AZ-558** — closes AC-4b (route C8 encoders through `MavlinkTransport`).
|
||||
* **D-PROJ-2 mock-suite-sat-service** — unblocks AC-8 (operator
|
||||
workflow rehearsal).
|
||||
|
||||
## Epic AZ-835 ticket map
|
||||
|
||||
The Tier-2 orchestrator path shipped under Epic
|
||||
[AZ-835](https://denyspopov.atlassian.net/browse/AZ-835). Sub-tickets:
|
||||
|
||||
| Ticket | Role |
|
||||
|--------|------|
|
||||
| [AZ-836](https://denyspopov.atlassian.net/browse/AZ-836) | `TlogRouteExtractor` — active-segment trim + Douglas-Peucker coarsen tlog GPS to ≤ N waypoints |
|
||||
| [AZ-838](https://denyspopov.atlassian.net/browse/AZ-838) | `SatelliteProviderRouteClient` + `seed_route.py` CLI — POST RouteSpec to satellite-provider, poll `mapsReady` |
|
||||
| [AZ-839](https://denyspopov.atlassian.net/browse/AZ-839) | C3 `operator_pre_flight_setup` real fixture — wires C1+C2+C11+C10 against the seeded catalog |
|
||||
| [AZ-840](https://denyspopov.atlassian.net/browse/AZ-840) | C4 E2E orchestrator test — drives the full 7-step pipeline from `(tlog, video, calibration)` |
|
||||
| [AZ-842](https://denyspopov.atlassian.net/browse/AZ-842) | C6 Docs — `replay_protocol.md` Invariants 12-14 + `architecture.md` + this README (cycle-4 rescope) |
|
||||
|
||||
The cycle-4 replay-input redesign tickets ride alongside the Epic:
|
||||
|
||||
| Ticket | Role |
|
||||
|--------|------|
|
||||
| [AZ-894](https://denyspopov.atlassian.net/browse/AZ-894) | `CsvReplayInputAdapter` — new CSV-driven primary path on the single canonical clock |
|
||||
| [AZ-895](https://denyspopov.atlassian.net/browse/AZ-895) | Auto-sync surface deprecation — tlog adapter reduced to audit-only role |
|
||||
| [AZ-896](https://denyspopov.atlassian.net/browse/AZ-896) | CSV format spec (`csv_replay_format.md`) + example `data_imu.csv` |
|
||||
| [AZ-897](https://denyspopov.atlassian.net/browse/AZ-897) | Operator-facing UI (React + Tailwind paired-upload form) — cycle 5+ |
|
||||
|
||||
Reference in New Issue
Block a user