mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 17:11:14 +00:00
chore: WIP pre-implement
Co-authored-by: Cursor <cursoragent@cursor.com>
+14
@@ -1,5 +1,19 @@
# Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (AZ-835 C5)

> **Cycle-4 deferral (2026-05-26)**: moved to `backlog/` during cycle-4 Step 9
> scope review. Blocking issues:
> - **Conflict with AZ-895 AC-4**: AZ-895 (cycle-4 cleanup) explicitly states
> `test_derkachi_real_tlog.py` stays `@xfail` with the AZ-848-scoped reason
> in cycle 4. Un-xfailing this test here contradicts AZ-895 and will fail
> the Jetson run because AZ-848 (the underlying clock bug) is in backlog/.
> - **Partial overlap with AZ-894 AC-3**: the other un-xfail target
> (`test_derkachi_1min.py::AC3`) is the same test AZ-894 (cycle-4 CSV
> adapter) covers under its own AC-3 — re-doing the un-xfail in a
> separate ticket duplicates effort.
> - **Replay condition**: revisit when EITHER (a) AZ-848 is fixed and the
> tlog adapter path is restored, OR (b) cycle 4 lands and we rescope this
> ticket to only the CSV-path tests AZ-894 doesn't already cover.

**Task**: AZ-841_unxfail_az777_tier2_tests
**Name**: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 fixture + C4 orchestrator land (AZ-835 C5)
**Description**: Fifth building block of Epic AZ-835. Once C3 (AZ-839, `operator_pre_flight_setup` real fixture) and C4 (AZ-840, e2e orchestrator test) land, remove the `@pytest.mark.xfail` markers from the AZ-777 Tier-2 tests. The verdict — PASS or FAIL — becomes the honest signal. Both tests remain gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
@@ -0,0 +1,135 @@
# [AZ-776 follow-up] derkachi_1min AC-1/2/5/6 fail on Jetson — VioOutput.emitted_at_ns clock-mismatch with FC IMU timebase

> **SCOPE UPDATE (2026-05-26, cycle-4 planning)**
>
> After user decision to switch the primary replay path to user-supplied (video, CSV) pairs (see AZ-894 / AZ-895 / AZ-896 / AZ-897), the tlog-adapter path becomes **audit-only** and this ticket is **no longer bench-blocking**. It remains a real bug and stays open for any future tlog-only flight (flights that ship with a `.tlog` but no companion `data_imu.csv`).
>
> **Priority**: backlog (deprioritised from cycle-4 candidate)
> **Bench-blocking?**: no — AZ-894 supersedes
> **Production-blocking?**: no — production single-clock model never goes through the tlog adapter
> **Complexity**: unchanged (5 SP)

**Task**: AZ-848_jetson_eskf_out_of_order_regression
**Name**: Repair the VioOutput contract — emitted_at_ns must use the frame's timeline timestamp, not process monotonic_ns, so it aligns with the FC IMU timebase that C5 ESKF tracks alongside it
**Description**: On the Jetson e2e harness (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` (AC-1, AC-5, AC-6 realtime, AC-6 asap) fail with identical deterministic root cause `EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')` at frame 3, preceded by `c5.state.eskf_out_of_order` from `imu_window` (ts_ns=187_370_418_000 < last_added_ts_ns=1_187_232_637_925_619, a gap of roughly 6,300×, i.e. nearly four orders of magnitude). Plus 1 XPASS on `test_ac3_within_100m_80pct_of_ticks` (probable vacuous-pass — when the binary exits 1 on frame 3, the ≥80 % within 100 m assertion evaluates over zero emissions).

**Revised root cause (2026-05-26 evidence-based investigation)**: NOT an IMU-vs-IMU clock-source mismatch (the original hypothesis was incorrect — RAW_IMU.time_usec and SCALED_IMU2.time_boot_ms share the same FC-boot-relative timebase in the Derkachi tlog: 187–634 s). The actual mismatch is **VioOutput.emitted_at_ns** vs **ImuWindow.ts_end_ns**:

| Source | Code site | Value on Jetson | Timebase |
|---|---|---|---|
| `VioOutput.emitted_at_ns` | `klt_ransac.py:274` — `self._clock.monotonic_ns()` | ~1.187·10¹⁵ ns (≈ 13.7 days — Jetson uptime when the run started) | Process monotonic |
| `imu_window.ts_end_ns` | `tlog_replay_adapter.py:710` — `time_usec * 1000` | ~1.87·10¹¹ ns (≈ 187 s — Pixhawk boot-relative) | FC-boot-relative |
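
As a sanity check on the two timebases in the table, the logged values can be converted to durations. This is only a sketch: the constants are copied from the error log quoted in this ticket, and the helper `ns_to_seconds` is an illustrative name that does not exist in the codebase.

```python
# Values copied from the AZ-848 error log.
VIO_EMITTED_AT_NS = 1_187_232_637_925_619   # process monotonic (Jetson)
IMU_TS_NS = 187_370_418_000                 # FC-boot-relative (Pixhawk)

NS_PER_S = 1_000_000_000
S_PER_DAY = 86_400


def ns_to_seconds(ts_ns: int) -> float:
    """Convert a nanosecond timestamp to seconds (hypothetical helper)."""
    return ts_ns / NS_PER_S


jetson_uptime_days = ns_to_seconds(VIO_EMITTED_AT_NS) / S_PER_DAY
fc_boot_relative_s = ns_to_seconds(IMU_TS_NS)
ratio = VIO_EMITTED_AT_NS / IMU_TS_NS

print(f"Jetson monotonic: {jetson_uptime_days:.1f} days")   # 13.7 days
print(f"FC boot-relative: {fc_boot_relative_s:.1f} s")      # 187.4 s
print(f"ratio: {ratio:.0f}x")
```

The ratio lands in the low thousands, so the two values can never be compared meaningfully as if they shared an origin.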

C5 ESKF tracks `_last_added_ts_ns` across BOTH `add_vio` and `add_fc_imu`. Frame 0: `add_vio` sets `_last_added_ts_ns = 1.187·10¹⁵`. Frame 1: `add_fc_imu` checks `1.87·10¹¹ + ~10⁸ < 1.187·10¹⁵` → out_of_order degraded → next add_vio with corrupted nominal state → mahalanobis² = 109.76 > 100 → fatal divergence at frame 3.
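
The failure sequence above can be reproduced with a toy model of a single last-added timestamp shared by all sources. This is a minimal sketch, not the real ESKF: `SharedTimestampGuard`, `OutOfOrderError`, and `replay_two_frames` are hypothetical names, and the real filter degrades its state rather than raising.

```python
class OutOfOrderError(Exception):
    """Raised when a measurement is older than the last accepted one."""


class SharedTimestampGuard:
    """Toy stand-in for a filter that tracks one last-added timestamp
    across all measurement sources (hypothetical, not the real ESKF)."""

    def __init__(self) -> None:
        self._last_added_ts_ns = 0

    def add(self, source: str, ts_ns: int) -> None:
        if ts_ns < self._last_added_ts_ns:
            raise OutOfOrderError(
                f"source={source} ts_ns={ts_ns} "
                f"last_added_ts_ns={self._last_added_ts_ns}"
            )
        self._last_added_ts_ns = ts_ns


def replay_two_frames(vio_ts_ns: int, imu_ts_ns: int) -> str:
    """Feed one VIO update followed by one IMU window through the guard."""
    guard = SharedTimestampGuard()
    try:
        guard.add("vio", vio_ts_ns)          # frame 0: VIO update
        guard.add("imu_window", imu_ts_ns)   # frame 1: FC IMU window
    except OutOfOrderError:
        return "out_of_order"
    return "ok"


# Mixed timebases: process-monotonic VIO vs FC-boot-relative IMU.
mixed = replay_two_frames(1_187_232_637_925_619, 187_370_418_000)
# Single timebase: both stamped on the FC-boot-relative timeline.
aligned = replay_two_frames(187_337_000_000, 187_370_418_000)
```

With mixed timebases the IMU window is rejected on the second call; with both timestamps on the same timeline the same sequence is accepted.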

**Why this hides on Tier-1**: the test is `@pytest.mark.tier2_only` (skipped on workstation runs). Unit tests use mocked VIO with synthetic clocks, so the contract clash never surfaces.

**Why this hides on a short-uptime Jetson**: a Jetson booted less than ~187 s before the run would have monotonic_ns smaller than the FC's boot-relative timestamps (which start at ~187 s in the Derkachi tlog); the inequality flips and the bug masquerades as "intermittent passes". The 13.7-day-uptime test box made it deterministic.

**Complexity**: 5 SP (revised up from 3 — the fix touches the C1 contract: `VioOutput.emitted_at_ns` semantics + every C1 strategy that populates it + `_docs/02_document/contracts/c1_vio/` doc + every consumer of `vio.emitted_at_ns` in C5 / C13 / FDR. Plus a determinism test that records monotonic_ns vs frame_ts_ns at frame 0 to lock the invariant in.)
**Dependencies**: AZ-776 (closed; produced the verification gap that hid this regression)
**Related**: AZ-883 (SCALED_IMU2 latent ts_ns=0 bug; uncovered during this investigation; separate ticket)
**Component**: c1_vio (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`, `_facade_spine.py`) + `_types/nav.py` (VioOutput dataclass) + c5_state (`eskf_baseline.py:add_vio` consumes the field) + c13_fdr (consumes `emitted_at_ns` per the docstring's "adaptive-gating decisions")
**Tracker**: AZ-848 (https://denyspopov.atlassian.net/browse/AZ-848)
**Parent Epic**: (none — bug surfaced in cycle 3 Step 11)

Jira AZ-848 is the authoritative spec; this file is the in-workspace mirror.

## Symptom

On Jetson (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` fail with identical root cause:

- `test_ac1_exits_0_jsonl_count_match`
- `test_ac5_determinism_two_runs_diff`
- `test_ac6_pace_realtime_60s_within_5pct`
- `test_ac6_pace_asap_under_30s`

All four assert `gps-denied-replay` exits 0; the binary actually exits 1 on frame 3 with:

```
ERROR c5_state.eskf_baseline c5.state.eskf_out_of_order
source=imu_window ts_ns=187,370,418,000 last_added_ts_ns=1,187,232,637,925,619
ERROR c5_state.eskf_baseline c5.state.eskf_filter_divergence
source=vio mahalanobis_sq=109.76467866548009 threshold_sq=100.0
ERROR runtime_root.replay_loop replay_loop.state_add_vio_fatal
frame=3 EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')
```

Mahalanobis distance is identical (109.765) across all four runs — fully deterministic on the Derkachi 1-min clip.

Additionally, `test_ac3_within_100m_80pct_of_ticks` reports XPASS (was `@xfail` referencing AZ-777). Appears to be a symptom of the same bug — with the binary exiting code 1 before any GPS-denied emissions land, the `≥ 80 % within 100 m` assertion evaluates against an empty population and passes vacuously. The XPASS is NOT honest evidence that AZ-777 has been completed.
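
To illustrate the vacuous-pass mechanism, here is a minimal sketch of a ratio check that refuses to evaluate an empty population. How the real test computes its ratio is not shown in this document; `fraction_within` is a hypothetical helper written for this illustration only.

```python
def fraction_within(errors_m: list[float], limit_m: float) -> float:
    """Fraction of position errors at or below limit_m.

    Raises ValueError on an empty list so that a run with zero
    emissions cannot pass vacuously.
    """
    if not errors_m:
        raise ValueError("no emissions to evaluate")
    return sum(e <= limit_m for e in errors_m) / len(errors_m)


# A naive all()-style check is True on an empty population:
naive_pass_on_empty = all(e <= 100.0 for e in [])

# Five synthetic errors, four of them within 100 m.
ratio = fraction_within([50.0, 150.0, 20.0, 30.0, 10.0], limit_m=100.0)
```

Guarding on a non-empty population turns "binary crashed before emitting anything" into an explicit failure instead of an XPASS.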

## Origin — AZ-776 verification gap

Commit `8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled` removed `@pytest.mark.xfail` decorators from AC-1 (line 61), AC-2 (line 138), AC-5 (line 413), AC-6 realtime (line 453), AC-6 asap (line 479) of `test_derkachi_1min.py`. The AZ-776 spec (`_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md`) claims under AC-7:

> `_run_replay_loop` in `runtime_root/__init__.py` is exercised end-to-end on Jetson by a non-`xfail` integration test (AC-1, AC-2, AC-5, AC-6 realtime, AC-6 asap in `tests/e2e/replay/test_derkachi_1min.py` un-xfail **and pass**).

This was not honored — AZ-776 closed without an honest Jetson run. That closure predates the `meta-rule.mdc` "Real Results, Not Simulated Ones" rule (added 2026-05), which would have caught it.

## Cycle-3 scope (not the cause)

Cycle-3 Step 11 (2026-05-24) surfaced this on the first full Jetson run since cycle 1. Cycle-3's only src change was commit `fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`, which touched four files: `_types/route.py` (new), `c11_tile_manager/route_client.py`, `replay_input/__init__.py`, and `replay_input/tlog_route.py`. None of `c5_state`, `c8_fc_adapter`, or `runtime_root` was touched. The most recent change to `c5_state/eskf_baseline.py` is AZ-389; to `c8_fc_adapter/tlog_replay_adapter.py` it is AZ-398. Both pre-date cycle 1. The latent contract clash was always there — Jetson uptime + an un-`xfail`ed test combined to make it deterministic.

## Diagnosis evidence (2026-05-26)

`/tmp/inspect_tlog.py` (ad-hoc pymavlink probe against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`) — outputs preserved in this session's chat history:

- 4326 RAW_IMU msgs, time_usec ∈ [187,274,914 ; 633,952,656] µs (boot-relative ~187s–~634s)
- 4330 SCALED_IMU2 msgs, time_boot_ms ∈ [187,274 ; 633,954] ms (same timebase, same range)
- Both IMU types share the FC's boot timebase → original "two-IMU-clock-source mismatch" hypothesis is REFUTED
- `klt_ransac.py:274` populates `VioOutput.emitted_at_ns = self._clock.monotonic_ns()` → 1.187·10¹⁵ ns on the test Jetson (uptime 13.7 days)
- `_types/nav.py:158` documents this contract explicitly: "`emitted_at_ns` is `time.monotonic_ns` at output time."
- `eskf_baseline.py:492` reads `ts_ns = vio.emitted_at_ns` and stores it in `_last_added_ts_ns` — the same field that `add_fc_imu` checks against `imu_window.ts_end_ns` (FC-boot-relative)
- Confirmed: the inequality direction MATCHES the AZ-848 error log (`ts_ns=187,370,418,000 < last_added_ts_ns=1,187,232,637,925,619`)

## Affected files

- `src/gps_denied_onboard/_types/nav.py` — `VioOutput.emitted_at_ns` field + docstring at line 158 (contract change site)
- `src/gps_denied_onboard/components/c1_vio/klt_ransac.py:274,425,463,592–619` — every site that fills `emitted_at_ns`
- `src/gps_denied_onboard/components/c1_vio/bench/okvis2.py`, `vins_mono.py` — other C1 strategies that fill `emitted_at_ns`
- `src/gps_denied_onboard/components/c1_vio/_facade_spine.py` — `frame_ts_ns(frame)` is the existing helper that should be the new source of truth
- `src/gps_denied_onboard/components/c5_state/eskf_baseline.py:492,502,565` — already reads `vio.emitted_at_ns`; no API change needed once the field's semantics are fixed
- `src/gps_denied_onboard/components/c13_fdr/**` — read `emitted_at_ns` per the docstring's "adaptive-gating decisions"; behavior change must be evaluated
- `_docs/02_document/contracts/c1_vio/` — contract docs need re-version (semantic change to a public field)
- `tests/e2e/replay/test_derkachi_1min.py` — the failing tests; AC-3 XPASS handling per AC-6 below

## Repro

```
bash scripts/run-tests-jetson.sh
# pytest report (after ~5 min):
# tests/e2e/replay/test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac5_determinism_two_runs_diff FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_realtime_60s_within_5pct FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_asap_under_30s FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks XPASS
```

## Acceptance Criteria

| # | Criterion |
|---|-----------|
| AC-1 | The `VioOutput.emitted_at_ns` contract docstring (`_types/nav.py:158`) no longer says "monotonic_ns at output time"; the field's semantics are documented as "the frame's timeline timestamp aligned with C8 FC IMU timebase, so C5 ESKF can compare against `imu_window.ts_end_ns` without a clock-source mismatch". A version bump is recorded in `_docs/02_document/contracts/c1_vio/`. |
| AC-2 | Every C1 strategy (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`) populates `emitted_at_ns` from the frame's timestamp (via `frame_ts_ns(frame)` or the strategy's own equivalent), NOT from `monotonic_ns()`. A unit test per strategy asserts the field value equals `frame_ts_ns(frame)`. |
| AC-3 | A determinism test reads two consecutive frames' `VioOutput.emitted_at_ns` values and asserts they are equal to `frame_ts_ns(frame_n)` and `frame_ts_ns(frame_n+1)` respectively — locking the new invariant. |
| AC-4 | Fix lands and `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` PASSES on Jetson with `RUN_REPLAY_E2E=1` — no `@xfail` re-add. |
| AC-5 | `test_ac5_determinism_two_runs_diff`, `test_ac6_pace_realtime_60s_within_5pct`, `test_ac6_pace_asap_under_30s` also PASS on Jetson. |
| AC-6 | XPASS on `test_ac3_within_100m_80pct_of_ticks` is investigated. If symptom of the same bug, returns to honest XFAIL referencing AZ-777 once binary exits 0 cleanly. If genuine pass, AZ-777 is closed instead. |
| AC-7 | C13 FDR consumers of `emitted_at_ns` are audited — any code path that relied on the field being monotonic-clock-wall-time has its behavior preserved via an explicit `time.monotonic_ns()` recorded under a different name (e.g., `recorded_at_ns`) or its expectation is documented as "frame timeline; not wall clock". |
| AC-8 | `meta-rule.mdc` "Real Results" gate is honored — no ticket may close `Done` until the operator has eyes on a green Jetson run log line. |
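
The invariant behind AC-2, AC-3, and AC-7 can be sketched as follows. Everything here is an assumption made for illustration: `Frame`, `VioOutputSketch`, `emit`, and `check_invariant` are hypothetical, and `frame_ts_ns` is a local stand-in for the existing helper of the same name, whose real signature is not shown in this ticket.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame carrying its timeline timestamp."""
    ts_ns: int


@dataclass(frozen=True)
class VioOutputSketch:
    emitted_at_ns: int    # frame timeline (proposed semantics)
    recorded_at_ns: int   # process monotonic, kept under a separate name


def frame_ts_ns(frame: Frame) -> int:
    """Local stand-in for the helper named in this ticket."""
    return frame.ts_ns


def emit(frame: Frame) -> VioOutputSketch:
    """Stamp the output from the frame, not from the process clock."""
    return VioOutputSketch(
        emitted_at_ns=frame_ts_ns(frame),
        recorded_at_ns=time.monotonic_ns(),
    )


def check_invariant(frames: list[Frame]) -> bool:
    """True when every output is stamped with its own frame's timestamp."""
    return all(emit(f).emitted_at_ns == frame_ts_ns(f) for f in frames)


frames = [Frame(ts_ns=187_337_000_000), Frame(ts_ns=187_370_333_333)]
invariant_holds = check_invariant(frames)
```

Keeping the process-monotonic reading under a separate field is one way to satisfy AC-7 without overloading `emitted_at_ns`.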

## Notes

- Tracker context: surfaced `cycle: 3, step: 11` on 2026-05-24; root cause re-diagnosed 2026-05-26 (operator-supervised investigation against the actual Derkachi tlog).
- Local unit suite (`pytest tests/unit/`) passes 2303 / 0 fail / 86 legitimate skips after C12 cold-start threshold relax (`05f1143 [AZ-844]`).
- Cycle 3 Step 11 verdict was PASS for cycle-3-scope; this ticket captures the wider Jetson regression for next cycle.
- Local mirror created retroactively 2026-05-24 (cycle 3 Step 12 entry) — Jira AZ-848 filed 2026-05-24 was the original signal; mirror was missing.
- 2026-05-26: spec materially revised after evidence-based investigation refuted the original "two-IMU-clock-source mismatch" hypothesis. The corrected diagnosis points at the C1 contract (`VioOutput.emitted_at_ns` semantics), not at the C8 adapter. The SCALED_IMU2 latent bug surfaced during this investigation is split out as AZ-883 to keep this ticket's scope tight.

## References

- Jira: https://denyspopov.atlassian.net/browse/AZ-848
- Run-tests report: `_docs/03_implementation/run_tests_step11_report.md` (Cycle 3 closeout, lines 617–635)
- Origin spec: `_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md`
- Related: AZ-777 (the XFAIL that the `test_ac3_within_100m_80pct_of_ticks` XPASS originally referenced); AZ-883 (SCALED_IMU2 latent bug)
@@ -0,0 +1,74 @@
# `_handle_imu` mis-reads SCALED_IMU2 timestamps — produces ts_ns=0 for every other IMU sample

> **SCOPE UPDATE (2026-05-26, cycle-4 planning)**
>
> Deprioritised behind AZ-894 (CSV-driven replay adapter). This bug only matters once the tlog-adapter path is reactivated for tlog-only flights (flights that ship with a `.tlog` but no companion `data_imu.csv`). Stays open in backlog.
>
> **Priority**: backlog (deprioritised from cycle-4 candidate)
> **Bench-blocking?**: no — AZ-894 supersedes the tlog path for Derkachi
> **Complexity**: unchanged (2 SP)

**Task**: AZ-883_scaled_imu2_ts_ns_zero_default
**Name**: Branch `_handle_imu` on message type so SCALED_IMU2 uses `time_boot_ms × 1_000_000` instead of the missing `time_usec` field
**Description**: `src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:683` routes BOTH `RAW_IMU` and `SCALED_IMU2` messages through `_handle_imu`, which at line 710 reads `getattr(msg, "time_usec", 0) * 1000` to compute `sensor_ts_ns`. SCALED_IMU2 has no `time_usec` field (its time field is `time_boot_ms`, uint32 milliseconds since FC boot), so the `getattr` default-of-zero path fires for every SCALED_IMU2 message. The resulting IMU sample stream alternates RAW_IMU timestamps with `ts_ns=0` values.

**Evidence (2026-05-26 investigation against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`)**:

- 4326 RAW_IMU messages with `time_usec` ∈ [187,274,914 ; 633,952,656] µs (boot-relative microseconds, ~187s–~634s)
- 4330 SCALED_IMU2 messages with `time_boot_ms` ∈ [187,274 ; 633,954] ms (same FC-boot timebase, same range)
- Both interleaved in arrival order — every other IMU sample is the affected type
- `_handle_imu`'s simulated output: 4266 non-monotonic transitions out of 8656 (~49 %) — almost every other transition is non-monotonic because SCALED_IMU2 collapses to ts_ns=0
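
The kind of simulation summarised in the last bullet can be sketched on a tiny synthetic stream. This is not the original probe script: `buggy_ts_ns` and `count_non_monotonic` are hypothetical helpers, and the stream below alternates perfectly, whereas the real tlog only approximately alternates.

```python
from types import SimpleNamespace


def buggy_ts_ns(msg) -> int:
    """Mirrors the read described above: a missing time_usec defaults to 0."""
    return getattr(msg, "time_usec", 0) * 1000


def count_non_monotonic(ts_values: list[int]) -> int:
    """Count transitions where the next timestamp is not strictly greater."""
    return sum(b <= a for a, b in zip(ts_values, ts_values[1:]))


# Synthetic stream: RAW_IMU, SCALED_IMU2, RAW_IMU, SCALED_IMU2, ...
stream = []
for i in range(4):
    boot_ms = 187_274 + 100 * i
    stream.append(SimpleNamespace(time_usec=boot_ms * 1000))   # RAW_IMU
    stream.append(SimpleNamespace(time_boot_ms=boot_ms))       # SCALED_IMU2

ts = [buggy_ts_ns(m) for m in stream]
bad = count_non_monotonic(ts)
print(f"{bad} non-monotonic transitions out of {len(ts) - 1}")
```

Every RAW_IMU to SCALED_IMU2 transition drops to zero, which is why roughly half of all transitions are flagged.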

**Why this is currently latent**: C5 ESKF's `add_fc_imu` reads `imu_window.ts_end_ns` (the LAST sample's ts_ns) for monotonicity guarding. If the last sample in the window happens to be RAW_IMU, the guard passes. The per-sample preintegration loop at `eskf_baseline.py:627–647` reads each `sample.ts_ns` individually for delta-t computation, but with ts_ns=0 samples interleaved, the delta-t arithmetic produces negative or near-zero intervals that get silently absorbed by the bias-correction math without raising. It WILL bite once any downstream consumer (FDR replay, latency analyser, deterministic-time gate) does a per-sample monotonicity assertion.

**Why this surfaced now**: the operator-supervised AZ-848 investigation read the Derkachi tlog through pymavlink and observed the interleaving directly. The bug has been present since `_handle_imu` was written (predates cycle 1) and was never caught because no test asserts per-sample IMU monotonicity.

**Complexity**: 2 SP
**Dependencies**: AZ-848 (split off from its investigation; can land before, after, or in parallel — no shared code path beyond `_handle_imu`)
**Component**: c8_fc_adapter (`tlog_replay_adapter.py`)
**Tracker**: AZ-883 (https://denyspopov.atlassian.net/browse/AZ-883) — Jira ticket created 2026-05-26 during cycle 3 release flow; allocated key AZ-883 (next-available, NOT the originally-planned AZ-849)
**Parent Epic**: (none — bug surfaced during AZ-848 investigation)

## Symptom

If you add a per-sample monotonicity assertion to the C5 ESKF or to the C8 tlog adapter pre-emit gate, every Jetson run against the Derkachi tlog reports 4266 zero-valued IMU sample timestamps interleaved with proper RAW_IMU values. The assertion fires immediately at message index 1 (the first SCALED_IMU2 after the first RAW_IMU).

## Proposed fix

Modify `_handle_imu` (`src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:709`) to branch on the message type via the caller's already-computed `msg_type`:

```python
def _handle_imu(self, msg: Any, *, msg_type: str) -> bool:
    if msg_type == "RAW_IMU":
        # RAW_IMU carries time_usec (microseconds since FC boot).
        sensor_ts_ns = int(getattr(msg, "time_usec", 0)) * 1000
    elif msg_type == "SCALED_IMU2":
        # SCALED_IMU2 carries time_boot_ms (milliseconds since FC boot).
        sensor_ts_ns = int(getattr(msg, "time_boot_ms", 0)) * 1_000_000
    else:
        raise FcOpenError(
            f"_handle_imu called with unsupported msg_type={msg_type!r}; "
            "expected RAW_IMU or SCALED_IMU2"
        )
    ...
```

Update the caller at line 684 to pass `msg_type=msg_type`. Add a unit test that synthesises a SimpleNamespace with `time_boot_ms=187274` (no `time_usec` field) and verifies the emitted `ImuTelemetrySample.ts_ns == 187_274_000_000`.
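
A possible shape for that unit test, written against a standalone function so it runs without the adapter. `imu_sensor_ts_ns` is a hypothetical extraction of the branch above; it uses plain attribute access and `ValueError` instead of the adapter's `getattr` defaults and `FcOpenError`, purely to keep the sketch self-contained.

```python
from types import SimpleNamespace


def imu_sensor_ts_ns(msg, msg_type: str) -> int:
    """Standalone version of the proposed branch (hypothetical name)."""
    if msg_type == "RAW_IMU":
        return int(msg.time_usec) * 1_000          # microseconds -> ns
    if msg_type == "SCALED_IMU2":
        return int(msg.time_boot_ms) * 1_000_000   # milliseconds -> ns
    raise ValueError(f"unsupported msg_type={msg_type!r}")


def test_scaled_imu2_uses_time_boot_ms():
    msg = SimpleNamespace(time_boot_ms=187274)     # no time_usec field
    assert imu_sensor_ts_ns(msg, "SCALED_IMU2") == 187_274_000_000


def test_raw_imu_uses_time_usec():
    msg = SimpleNamespace(time_usec=187_274_914)   # no time_boot_ms field
    assert imu_sensor_ts_ns(msg, "RAW_IMU") == 187_274_914_000
```

Each synthetic message deliberately lacks the other type's field, so a regression to the shared `time_usec` read would fail the SCALED_IMU2 case.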

Alternative (heavier): pick a single canonical message type at construction time (parameterise the adapter with `imu_source: Literal["RAW_IMU","SCALED_IMU2"]`, auto-detected from the tlog pre-scan) and drop the non-chosen type at the dispatch site. This buys cleaner streams but doubles the test matrix.

The branching fix is simpler and preserves the existing OR-group semantic (`("RAW_IMU", "SCALED_IMU2")` in `_REQUIRED_MESSAGE_GROUPS`).

## Acceptance Criteria

| # | Criterion |
|---|-----------|
| AC-1 | `_handle_imu` reads `time_boot_ms × 1_000_000` for SCALED_IMU2 messages and `time_usec × 1000` for RAW_IMU. A unit test exercises both branches with a synthetic SimpleNamespace lacking the OTHER field. |
| AC-2 | An integration test against the Derkachi tlog (Tier-1; no Jetson hardware needed — only pymavlink + the tlog file) asserts that the IMU stream as seen by the runtime loop is strictly monotonic ts_ns. The test reads at least the first 100 IMU samples and verifies `sample[i+1].ts_ns > sample[i].ts_ns` for all i. |
| AC-3 | No regression in existing RAW_IMU-only adapter tests. |
| AC-4 | The fix is independent of AZ-848 — does not require the VioOutput contract change to land first. |
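
The strict-monotonicity check in AC-2 could be factored into a small helper like the one below. It is a sketch only: `first_non_monotonic_index` is a hypothetical name, and feeding it real samples from the tlog is left to the integration test.

```python
from itertools import islice
from typing import Iterable, Optional


def first_non_monotonic_index(
    ts_ns: Iterable[int], limit: int = 100
) -> Optional[int]:
    """Return the index i of the first sample with ts[i] <= ts[i-1]
    among the first `limit` samples, or None if strictly increasing."""
    previous = None
    for i, ts in enumerate(islice(ts_ns, limit)):
        if previous is not None and ts <= previous:
            return i
        previous = ts
    return None


# Before the fix: the first SCALED_IMU2 sample collapses to 0.
before_fix = first_non_monotonic_index([187_274_914_000, 0, 187_374_914_000])
# After the fix: both message types land on the same timeline.
after_fix = first_non_monotonic_index(
    [187_274_000_000, 187_274_914_000, 187_374_000_000]
)
```

Returning the offending index rather than a bare boolean makes the failure message point straight at the first bad sample.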

## References

- Jira: https://denyspopov.atlassian.net/browse/AZ-883
- Origin: AZ-848 investigation, 2026-05-26 cycle 3 Step 16.5 release flow
- Related: AZ-848 (the VIO contract repair; both surfaced from the same investigation but their fixes are independent)
- Tlog evidence: `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`, 8656 IMU samples (4326 RAW_IMU + 4330 SCALED_IMU2 interleaved)
@@ -3,17 +3,30 @@
**Task**: AZ-842_replay_protocol_and_orchestrator_docs
**Name**: Docs: replay_protocol.md Invariant 12 + AZ-777 Phase 3+ superseded note + orchestrator-test README (AZ-835 C6)
**Description**: Sixth and final building block of Epic AZ-835. Capture the route-driven flow in the authoritative documents so future implementers, operators, and reviewers understand what changed and why.
**Complexity**: 2 SP
**Dependencies**: AZ-841 (C5, un-xfail — SOFT; README describes test outcomes assuming C5 has landed); AZ-777 (being closed/superseded by this Epic — AZ-777 spec is updated during the AZ-777 closure step, verified by AC-6); AZ-835 (parent Epic)
**Complexity**: 3 SP (cycle-4 rescope: was 2 SP)
**Dependencies**: AZ-894 (CSV adapter — HARD; replay_protocol.md sub-section describes the new single-canonical-clock flow); AZ-895 (auto-sync deprecation — HARD; replay_protocol.md sub-section describes the tlog adapter's new audit-only role); AZ-896 (CSV format docs — SOFT; replay_protocol.md cross-links to the format spec); AZ-777 (closed/superseded by this Epic); AZ-835 (parent Epic)
**Component**: `_docs/02_document/contracts/replay/replay_protocol.md` + `_docs/02_document/architecture.md` + `tests/e2e/replay/README*.md`
**Tracker**: AZ-842 (https://denyspopov.atlassian.net/browse/AZ-842)
**Parent Epic**: AZ-835

Jira AZ-842 is the authoritative spec; this file is the in-workspace mirror.

> **Cycle-4 rescope (2026-05-26)**: dropped the AZ-841 (un-xfail) soft
> dependency — AZ-841 was deferred to backlog in cycle-4 Step 9 scope
> review (see `_docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md`).
> Expanded scope from "AZ-835 epic docs only" to also cover the cycle-4
> replay-input redesign narrative: AZ-894 (CSV-driven single-canonical-clock
> adapter), AZ-895 (tlog adapter → audit-only after auto-sync deprecation),
> AZ-896 (CSV format spec). The replay_protocol.md edits now describe BOTH
> the route-driven AZ-835 flow AND the cycle-4 CSV-driven replay path,
> which together supersede the legacy tlog+auto-sync surface.
> Complexity bumped 2 → 3 SP to cover the added cycle-4 narrative.

## Modified files

### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension
### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension + Invariant 13 (NEW, cycle-4)

**1a. Invariant 12 — route-driven flow (AZ-835)**

Extend **Invariant 12** with an AZ-835 sub-section describing:

@@ -21,6 +34,16 @@ Extend **Invariant 12** with an AZ-835 sub-section describing:
- Why route-driven supersedes the AZ-777 bbox approach (efficiency: ~100× fewer tiles; honesty: pre-commits to where the operator did fly).
- The C3 fixture's failure-handling contract (validation/terminal → re-raise; transient → retry up to 3 attempts using C11's existing backoff schedule).
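
The failure-handling contract in the last bullet might look like the sketch below. All names are assumptions: `TransientError` and `call_with_retry` are hypothetical, and the `backoff_s` values are placeholders, since C11's actual backoff schedule is not given in this document.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures."""


def call_with_retry(
    action: Callable[[], T],
    backoff_s: Sequence[float] = (0.5, 1.0),   # placeholder schedule
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures up to max_attempts; re-raise anything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except TransientError:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the transient error
            sleep(backoff_s[min(attempt - 1, len(backoff_s) - 1)])
    raise AssertionError("unreachable for max_attempts >= 1")
```

Validation and terminal errors are never caught here, so they propagate on the first attempt, which matches the "re-raise" half of the contract. Injecting `sleep` keeps tests fast.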

**1b. Invariant 13 — single canonical clock (cycle-4, AZ-894 / AZ-895 / AZ-896)**

Add a new **Invariant 13** sub-section describing:

- The single-clock model production uses (single edge device, single clock at receipt) and why two-clock surfaces (e.g. `VioOutput.emitted_at_ns` from Jetson monotonic vs. `ImuWindow.ts_end_ns` from FC-boot) produce ESKF out-of-order regressions like AZ-848.
- The CSV-driven replay path (AZ-894) — `(video, CSV)` operator input, IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column, no auto-sync.
- The CSV schema (delegate to `_docs/02_document/contracts/replay/csv_replay_format.md` produced by AZ-896 for the field-level spec).
- The tlog-replay adapter's new audit-only role (AZ-895): retained for FDR analysis and one-shot tlog→CSV export, removed from the test/demo critical path.
- Auto-sync deprecation (AZ-895): `--time-offset-ms` / `--skip-auto-sync-validation` CLI flags removed or marked deprecated with one-cycle warning.
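
One way the "deprecated with one-cycle warning" option from the last bullet could be wired up with `argparse` is sketched below. This is an assumption about the approach, not the project's CLI code: `DeprecatedFlag` and `build_parser` are hypothetical, and only `--time-offset-ms` is shown.

```python
import argparse
import warnings


class DeprecatedFlag(argparse.Action):
    """Accept a flag but emit a DeprecationWarning when it is used."""

    def __call__(self, parser, namespace, values, option_string=None):
        warnings.warn(
            f"{option_string} is deprecated and will be removed next cycle",
            DeprecationWarning,
            stacklevel=2,
        )
        setattr(namespace, self.dest, values)


def build_parser() -> argparse.ArgumentParser:
    """Minimal parser with one deprecated flag (illustrative only)."""
    parser = argparse.ArgumentParser(prog="gps-denied-replay")
    parser.add_argument("--video")
    parser.add_argument("--imu")
    parser.add_argument("--time-offset-ms", type=int, action=DeprecatedFlag)
    return parser
```

The flag still parses during the warning cycle, so existing operator scripts keep working while the removal is announced.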

### 2. `_docs/02_document/architecture.md` — satellite-provider entry extension

Append a sub-section to the existing satellite-provider entry noting that Epic AZ-835 + its C1-C5 children landed the full e2e real-flight validation path on top of AZ-777 Phase 1's wire + C11 contract adaptation. Mark AZ-777 Phase 3+ as superseded by Epic AZ-835 (pointer-only — the AZ-777 spec itself is updated in C5's wake during the AZ-777 closure step).

@@ -39,11 +62,13 @@ Either extend `tests/e2e/replay/README.md` or create a dedicated `tests/e2e/repl
| # | Criterion |
|---|-----------|
| AC-1 | `replay_protocol.md` Invariant 12 has a new AZ-835 sub-section covering the route-driven flow, the bbox-supersedure rationale, and the failure-handling contract. |
| AC-1b | `replay_protocol.md` has a new Invariant 13 (cycle-4) sub-section covering the single-canonical-clock model, the CSV-driven replay path (AZ-894), the tlog adapter's audit-only role (AZ-895), and auto-sync deprecation. Links to `csv_replay_format.md` (AZ-896). |
| AC-2 | `architecture.md` satellite-provider entry has a sub-section noting Epic AZ-835's contribution and pointing at AZ-777 Phase 3+ as superseded. |
| AC-2b | `architecture.md` replay-input section explains the cycle-4 redesign: CSV adapter primary path, tlog adapter audit-only role, removal of auto-sync. References AZ-894 / AZ-895 / AZ-896 / AZ-897. |
| AC-3 | `tests/e2e/replay/README*.md` exists and a new contributor can run the orchestrator test on Jetson using only the README's instructions (no out-of-band knowledge required). |
| AC-4 | All three docs link to the Epic (AZ-835) and to the relevant child tickets (AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-841). |
| AC-4 | All three docs link to the Epic (AZ-835), its children (AZ-836 / AZ-838 / AZ-839 / AZ-840), and the cycle-4 redesign tickets (AZ-894 / AZ-895 / AZ-896 / AZ-897). AZ-841 reference omitted (deferred to backlog). |
| AC-5 | License attribution string ("Imagery © Google") and the dev-only caveat are present in the test README. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children and at the cycle-4 redesign tickets. |

## Out of scope

@@ -0,0 +1,53 @@
# Replay: CSV-driven IMU+GPS adapter using single canonical clock

**Task**: AZ-894_csv_driven_replay_adapter
**Name**: Add a CSV-replay adapter that consumes the Derkachi-schema `data_imu.csv` (or any flight that ships with a paired CSV) and exposes IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column
**Description**: Cycle 3 surfaced AZ-848 (eskf_out_of_order on frame 3) because the current replay pipeline imports two incompatible clocks: `VioOutput.emitted_at_ns` uses Jetson process-monotonic time, while `ImuWindow.ts_end_ns` uses FC-boot-relative time (parsed from MAVLink tlog messages). The single-clock model that production uses (single edge device, single clock at receipt) is not what replay does today. The Derkachi fixture's `data_imu.csv` already contains both IMU (`SCALED_IMU2.*`) and GPS ground truth (`GLOBAL_POSITION_INT.*`) on a single canonical clock (the `Time` column, 0..489.9 s at 10 Hz, aligned 3:1 with the 30 fps video). Using the CSV directly eliminates the clock-mismatch surface entirely for the test/demo path and matches the production single-clock model.

**Complexity**: 3 SP
**Dependencies**: AZ-896 (format docs land in the same cycle but can land in either order)
**Blocks**: AZ-895 (auto-sync deprecation), AZ-897 (replay UI)
**Component**: replay_input (new adapter), c8_fc_adapter (alternate ground-truth source), cli/replay
**Tracker**: AZ-894 (https://denyspopov.atlassian.net/browse/AZ-894)
**Parent Epic**: (none — cycle-4 replay-input redesign)

## Schema

The Derkachi CSV header (19 columns):

```
timestamp(ms), Time,
SCALED_IMU2.xacc, SCALED_IMU2.yacc, SCALED_IMU2.zacc,
SCALED_IMU2.xgyro, SCALED_IMU2.ygyro, SCALED_IMU2.zgyro,
SCALED_IMU2.xmag, SCALED_IMU2.ymag, SCALED_IMU2.zmag,
GLOBAL_POSITION_INT.lat, GLOBAL_POSITION_INT.lon, GLOBAL_POSITION_INT.alt,
GLOBAL_POSITION_INT.relative_alt,
GLOBAL_POSITION_INT.vx, GLOBAL_POSITION_INT.vy, GLOBAL_POSITION_INT.vz,
GLOBAL_POSITION_INT.hdg
```

- `timestamp(ms)`: FC-boot-relative milliseconds (kept for traceability; not used by C5)
- `Time`: flight-relative seconds (canonical clock — what C5 actually uses)
- `SCALED_IMU2.*`: 10 Hz IMU stream (accel mg, gyro mrad/s, mag mGauss per ArduPilot convention)
- `GLOBAL_POSITION_INT.*`: 10 Hz GPS ground truth (lat/lon in 1e-7 deg, alt in mm, vx/vy/vz in cm/s, hdg in cdeg)
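
The unit conventions listed above translate into SI values roughly as follows. This is a sketch under stated assumptions: `parse_row` and its output keys are hypothetical, the sample row is made up rather than taken from the Derkachi flight, and only a subset of the 19 columns is converted. If the real header has a space after each comma, `csv.DictReader(..., skipinitialspace=True)` may be needed.

```python
import csv
import io

STANDARD_GRAVITY = 9.80665  # m/s^2 per g


def parse_row(row: dict) -> dict:
    """Convert one CSV row to SI units (output keys are hypothetical)."""
    return {
        "ts_ns": round(float(row["Time"]) * 1e9),                 # s -> ns
        "accel_x_mps2": float(row["SCALED_IMU2.xacc"]) * 1e-3 * STANDARD_GRAVITY,
        "gyro_x_radps": float(row["SCALED_IMU2.xgyro"]) * 1e-3,   # mrad/s -> rad/s
        "lat_deg": float(row["GLOBAL_POSITION_INT.lat"]) * 1e-7,  # degE7 -> deg
        "lon_deg": float(row["GLOBAL_POSITION_INT.lon"]) * 1e-7,
        "alt_m": float(row["GLOBAL_POSITION_INT.alt"]) * 1e-3,    # mm -> m
    }


# Made-up sample data, not taken from the Derkachi flight.
SAMPLE = (
    "Time,SCALED_IMU2.xacc,SCALED_IMU2.xgyro,"
    "GLOBAL_POSITION_INT.lat,GLOBAL_POSITION_INT.lon,GLOBAL_POSITION_INT.alt\n"
    "0.1,1000,500,500000000,300000000,150000\n"
)
rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Deriving `ts_ns` from `Time` alone is what keeps every sample on the single canonical clock; `timestamp(ms)` is deliberately ignored.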
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- **AC-1**: Adapter parses the Derkachi `data_imu.csv` end-to-end and emits 4,899 IMU samples + 4,899 GPS-ground-truth samples on a single monotonic clock anchored at row 0.
|
||||
- **AC-2**: Wired into `cli/replay.py`; `gps-denied-replay --video flight_derkachi.mp4 --imu data_imu.csv` runs without invoking `tlog_replay_adapter.py`.
|
||||
- **AC-3**: `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` passes on the Jetson e2e harness using the new path. AZ-848 cascade no longer triggers (no two-clock surface in the new path).
|
||||
- **AC-4**: `VioOutput.emitted_at_ns` is populated from the CSV's `Time` column (or the frame-derived `t = N/fps`), not `time.monotonic_ns()`, when the new adapter is in use.
|
||||
- **AC-5**: Schema mismatch (missing required column, NaN in `Time`, non-monotonic `Time`) raises a clear `ReplayInputAdapterError` at startup, not deep in the loop.
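A minimal sketch of the fail-fast validation AC-5 describes, under stated assumptions: the `ReplayInputAdapterError` class below is a local stand-in for the project's error type, the function name is hypothetical, the required-column tuple is an illustrative subset, and "monotonic" is interpreted as strictly increasing.

```python
import csv
import io
import math


class ReplayInputAdapterError(Exception):
    """Local stand-in for the error type named in AC-5."""


# Illustrative subset only; the real required set is defined by the adapter.
REQUIRED_COLUMNS = ("Time", "SCALED_IMU2.xacc", "GLOBAL_POSITION_INT.lat")


def validate_replay_csv(csv_text: str, required=REQUIRED_COLUMNS) -> list:
    """Check header and `Time` column up front; return the parsed times."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    header = reader.fieldnames or []
    missing = [name for name in required if name not in header]
    if missing:
        raise ReplayInputAdapterError(f"missing required column(s): {missing}")

    times = []
    for row_index, row in enumerate(reader):
        try:
            t = float(row["Time"])
        except (TypeError, ValueError):
            raise ReplayInputAdapterError(f"row {row_index}: Time is not a number")
        if math.isnan(t):
            raise ReplayInputAdapterError(f"row {row_index}: Time is NaN")
        if times and t <= times[-1]:
            raise ReplayInputAdapterError(f"row {row_index}: Time is not increasing")
        times.append(t)
    return times
```

Running the whole check before the replay loop starts means a bad file is reported with a row number rather than as a downstream estimator error.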
|
||||
|
||||
## Out of scope
|
||||
|
||||
- The structural AZ-848 / AZ-883 fix in the tlog adapter — those stay open as backlog.
|
||||
- UI for picking the CSV — AZ-897.
|
||||
- Other CSV schemas (PX4, generic MAVLink dumps) — future enhancement if needed.
|
||||
|
||||
## References
|
||||
|
||||
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md`
|
||||
- Bench-run evidence: `_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md`
|
||||
- Companion tickets: AZ-895 (deprecate auto-sync), AZ-896 (format docs + example CSV), AZ-897 (replay UI)
|
||||
- Supersedes (re bench-blocking): AZ-848 (VioOutput contract), AZ-883 (SCALED_IMU2 ts_ns=0)
|
||||
@@ -0,0 +1,39 @@
|
||||
# Replay: deprecate auto_sync surface; tlog adapter → audit-only
|
||||
|
||||
**Task**: AZ-895_deprecate_auto_sync_surface
|
||||
**Name**: Remove the tlog+video auto-sync infrastructure and reframe `tlog_replay_adapter.py` as audit-only, now that AZ-894 ships the CSV-driven primary path
|
||||
**Description**: User decision (2026-05-26): the test/demo replay path will accept a paired (video, CSV) input from the operator instead of auto-syncing a tlog and video. Auto-sync is unnecessary in production (single edge device, single clock by design) and over-engineered for test (the CSV already encodes the alignment).
|
||||
|
||||
**Complexity**: 2 SP
|
||||
**Dependencies**: AZ-894 (must ship first — the CSV adapter is the replacement)
|
||||
**Component**: replay_input (auto_sync.py, tlog_video_adapter.py), cli/replay, runtime_root/_replay_branch
|
||||
**Tracker**: AZ-895 (https://denyspopov.atlassian.net/browse/AZ-895)
|
||||
**Parent Epic**: (none — cycle-4 replay-input redesign)
|
||||
|
||||
## Touch list
|
||||
|
||||
- `src/gps_denied_onboard/replay_input/auto_sync.py` — delete or convert to a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")`
|
||||
- `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` — strip auto-sync invocations
|
||||
- `src/gps_denied_onboard/cli/replay.py` — remove `--time-offset-ms` / `--skip-auto-sync-validation` flags (or mark deprecated with one-cycle warning)
|
||||
- `src/gps_denied_onboard/runtime_root/_replay_branch.py` — strip auto-sync wiring
|
||||
- `tests/unit/replay_input/test_az405_auto_sync.py` — pass against the new behaviour or delete with rationale recorded in the batch report
|
||||
- `tests/e2e/replay/test_derkachi_real_tlog.py` — continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug
|
||||
- `tlog_replay_adapter.py` / `tlog_ground_truth.py` — module docstrings updated to call out the new audit-only / one-shot-export roles
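If `auto_sync.py` is kept as a stub rather than deleted, the first touch-list item could look roughly like this. The error class is a local stand-in and the function name `run_auto_sync` is hypothetical; only the message string comes from this ticket.

```python
class ReplayInputAdapterError(Exception):
    """Local stand-in for the project's replay-input error type."""


def run_auto_sync(*args, **kwargs):
    """Former auto-sync entry point, kept only to fail loudly for stale callers."""
    raise ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")
```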
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- **AC-1**: `auto_sync.py` is either deleted or made into a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")`.
|
||||
- **AC-2**: All references to `--time-offset-ms` / `--skip-auto-sync-validation` flags in the CLI are removed or marked deprecated with a one-cycle deprecation warning.
|
||||
- **AC-3**: `test_az405_auto_sync` tests either pass against the new behaviour or are deleted with rationale recorded in the batch report.
|
||||
- **AC-4**: `test_derkachi_real_tlog.py` continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug.
|
||||
- **AC-5**: Module docstrings of `tlog_replay_adapter.py` and `tlog_ground_truth.py` are updated to call out their new audit-only / one-shot-export roles.
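For the deprecate-with-warning option in AC-2, one possible shape using only `argparse` and `warnings` is sketched below. The flag types (an integer offset and a boolean switch) and the function name are assumptions; the real CLI may declare them differently.

```python
import argparse
import warnings

# Maps argparse destination name -> flag spelling, for the warning text.
DEPRECATED_FLAGS = {
    "time_offset_ms": "--time-offset-ms",
    "skip_auto_sync_validation": "--skip-auto-sync-validation",
}


def parse_replay_args(argv):
    """Parse replay arguments, warning once per deprecated flag that was supplied."""
    parser = argparse.ArgumentParser(prog="gps-denied-replay")
    parser.add_argument("--video", required=True)
    parser.add_argument("--imu", required=True)
    # Deprecated flags stay parseable for one cycle but are hidden from --help.
    parser.add_argument("--time-offset-ms", type=int, default=None, help=argparse.SUPPRESS)
    parser.add_argument("--skip-auto-sync-validation", action="store_true", help=argparse.SUPPRESS)
    args = parser.parse_args(argv)

    for dest, flag in DEPRECATED_FLAGS.items():
        value = getattr(args, dest)
        # Explicit identity checks so that an offset of 0 still counts as supplied.
        if value is not None and value is not False:
            warnings.warn(
                f"{flag} is deprecated and ignored; auto-sync was removed",
                DeprecationWarning,
                stacklevel=2,
            )
    return args
```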
|
||||
|
||||
## Out of scope
|
||||
|
||||
- AZ-848 / AZ-883 structural fix — they stay open as backlog (tlog path is still broken, just no longer the primary path).
|
||||
- New CSV export tooling for arbitrary tlogs — future ticket.
|
||||
|
||||
## References
|
||||
|
||||
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md`
|
||||
- Companion: AZ-894 (CSV adapter — must land first), AZ-896 (docs), AZ-897 (UI)
|
||||
@@ -0,0 +1,38 @@
|
||||
# Docs: replay-input format spec + downloadable example CSV
|
||||
|
||||
**Task**: AZ-896_replay_format_docs_and_example_csv
|
||||
**Name**: Author the operator-facing format spec for the (video, CSV) replay input pair, plus a minimal downloadable example CSV
|
||||
**Description**: Operators using the replay/demo path need to know the exact CSV schema the system accepts and the hard contract (video t=0 ≡ CSV row 0; video must be nadir; UAV must already be airborne at t=0), and they need a downloadable example to copy from. Today no operator-facing entry point documents any of this.
|
||||
|
||||
**Complexity**: 1 SP
|
||||
**Dependencies**: AZ-894 (the adapter that consumes the format — the doc describes what AZ-894 accepts)
|
||||
**Blocks**: AZ-897 (UI links to the docs page and serves the example CSV)
|
||||
**Component**: docs (_docs/04_release/)
|
||||
**Tracker**: AZ-896 (https://denyspopov.atlassian.net/browse/AZ-896)
|
||||
**Parent Epic**: (none — cycle-4 replay-input redesign)
|
||||
|
||||
## What
|
||||
|
||||
- Author a docs page at `_docs/04_release/replay_input_format.md` (or wherever the operator-facing docs land in cycle 4)
|
||||
- Schema table: column names, units, types, expected rates, required vs optional
|
||||
- Constraint statements up top, before the column table:
|
||||
  - Video: nadir camera; UAV already airborne at frame 0
  - CSV: row 0 timestamp == video frame 0 timestamp; `Time` column starts at 0.0; rows monotonic and uniformly spaced
|
||||
- Ship `_docs/04_release/example_data_imu.csv` — a minimal valid example (e.g., 20 rows = 2 seconds at 10 Hz)
|
||||
- Cross-link from the AZ-897 replay UI "Download example" button
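A minimal example file of the size suggested above (20 rows, 2 seconds at 10 Hz) could be generated with a sketch like the following. The sensor and GPS values are all-zero placeholders with no physical meaning, the `timestamp(ms)` values are likewise placeholders, and the header is written without spaces after the commas, which may differ from the Derkachi file.

```python
import csv
import io

COLUMNS = [
    "timestamp(ms)", "Time",
    "SCALED_IMU2.xacc", "SCALED_IMU2.yacc", "SCALED_IMU2.zacc",
    "SCALED_IMU2.xgyro", "SCALED_IMU2.ygyro", "SCALED_IMU2.zgyro",
    "SCALED_IMU2.xmag", "SCALED_IMU2.ymag", "SCALED_IMU2.zmag",
    "GLOBAL_POSITION_INT.lat", "GLOBAL_POSITION_INT.lon", "GLOBAL_POSITION_INT.alt",
    "GLOBAL_POSITION_INT.relative_alt",
    "GLOBAL_POSITION_INT.vx", "GLOBAL_POSITION_INT.vy", "GLOBAL_POSITION_INT.vz",
    "GLOBAL_POSITION_INT.hdg",
]


def build_example_csv(rows: int = 20, rate_hz: int = 10) -> str:
    """Return CSV text with the 19-column header and uniformly spaced rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for i in range(rows):
        # Derive Time from the row index so spacing stays exactly uniform.
        time_s = f"{i / rate_hz:.1f}"
        timestamp_ms = i * 1000 // rate_hz  # placeholder boot-relative time
        placeholders = [0] * (len(COLUMNS) - 2)
        writer.writerow([timestamp_ms, time_s] + placeholders)
    return buffer.getvalue()
```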
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- **AC-1**: Schema page documents all 19 columns of the Derkachi CSV with units and types.
|
||||
- **AC-2**: The three hard constraints (nadir / airborne / aligned-start) are stated up top, before the column table.
|
||||
- **AC-3**: The example CSV (≥10 rows) passes through the AZ-894 CSV adapter without errors.
|
||||
- **AC-4**: The page is reachable from the AZ-897 UI's "Download example" link.
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Multi-schema support (PX4, generic MAVLink dumps).
|
||||
|
||||
## References
|
||||
|
||||
- Companion: AZ-894 (CSV adapter), AZ-897 (UI), AZ-895 (auto-sync deprecation)
|
||||
- Source fixture: `_docs/00_problem/input_data/flight_derkachi/data_imu.csv`, README at `_docs/00_problem/input_data/flight_derkachi/README.md`
|
||||
@@ -0,0 +1,45 @@
|
||||
# Replay UI: web form for paired video + CSV input + example download
|
||||
|
||||
**Task**: AZ-897_replay_ui_web_form
|
||||
**Name**: Build the first operator-facing UI for the GPS-denied onboard system — a single-page form that uploads a paired (video, CSV) for replay
|
||||
**Description**: User decision (2026-05-26): the system offers an operator-facing UI for the test/demo replay path. The UI surfaces the hard constraints visually (nadir, airborne, aligned-start) so operators don't fail silently from a misaligned video. This is also the foundation for the deferred operator-tooling work (see `_docs/00_research/00_question_decomposition.md` lines 119, 224).
|
||||
|
||||
Tech stack per `.cursor/rules/techstackrule.mdc`: React + Tailwind CSS.
|
||||
|
||||
**Complexity**: 5 SP
|
||||
**Dependencies**: AZ-894 (backend CSV adapter), AZ-896 (format docs + example CSV that the UI serves)
|
||||
**Component**: frontend (new — first piece of operator-facing UI), backend (new HTTP endpoint that fronts `gps-denied-replay`)
|
||||
**Tracker**: AZ-897 (https://denyspopov.atlassian.net/browse/AZ-897)
|
||||
**Parent Epic**: (none — cycle-4 replay-input redesign; will likely become the first piece of a future operator-tooling epic)
|
||||
|
||||
## Shape
|
||||
|
||||
A single-page web form, served from a target to be decided during implementation (Jetson? operator workstation? containerised dev mode?). The form contains:
|
||||
|
||||
- **Video file picker**. Accept `.mp4`, `.mov`. Display constraint hint: "Nadir camera; UAV already airborne at frame 0."
|
||||
- **CSV file picker**. Accept `.csv`. Display constraint hint: "Row 0 timestamp must equal video frame 0; see format docs."
|
||||
- **"Download example CSV"** link → AZ-896's `example_data_imu.csv`.
|
||||
- **"View format docs"** link → AZ-896's `replay_input_format.md`.
|
||||
- **"Start replay"** button → POSTs (video_path, csv_path) to a backend endpoint that invokes `gps-denied-replay --video X --imu Y`.
|
||||
- **Result panel**: tail the replay subprocess output, display final verdict (PASS/FAIL + accuracy metrics).
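On the backend side, the "Start replay" step could validate the pair and build the command roughly as follows. This is a sketch only: the error class and function name are hypothetical, the HTTP framework is deliberately left out because the ticket does not choose one, and the accepted extensions are the ones listed for the two pickers.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov"}
CSV_EXTENSIONS = {".csv"}


class ReplayRequestError(ValueError):
    """Hypothetical error carrying an operator-actionable message."""


def build_replay_command(video_path: str, csv_path: str) -> list:
    """Validate the (video, CSV) pair and return the argv for the replay CLI."""
    if not video_path or not csv_path:
        raise ReplayRequestError("Select both a video file and a CSV file.")
    if Path(video_path).suffix.lower() not in VIDEO_EXTENSIONS:
        raise ReplayRequestError("Video must be a .mp4 or .mov file.")
    if Path(csv_path).suffix.lower() not in CSV_EXTENSIONS:
        raise ReplayRequestError("IMU/GPS input must be a .csv file.")
    # An argument list (not a shell string) keeps file names from being
    # interpreted by a shell when passed to subprocess.
    return ["gps-denied-replay", "--video", video_path, "--imu", csv_path]
```

The returned list can be handed to `subprocess.Popen` so the result panel can tail the process output.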
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- **AC-1**: Form renders with both pickers, both constraint hints, download/docs links, and the start button.
|
||||
- **AC-2**: The start button correctly invokes the replay pipeline against the selected files; success path returns a verdict; failure path returns the error reason from the backend.
|
||||
- **AC-3**: Form rejects mismatched file pairs with explicit, operator-actionable error messages; there must be no silent failures.
|
||||
- **AC-4**: Example-CSV download serves the file from AZ-896 with the correct content-type.
|
||||
- **AC-5**: Tests cover empty submissions, mismatched file types, backend failures, and the happy path. React Testing Library + jest for component tests; an e2e smoke test covers the full flow.
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Multi-flight management / history / library.
|
||||
- Authentication / user accounts.
|
||||
- Sector classification, pre-flight cache provisioning, mission planning (those are separate deferred items from C10 / `00_question_decomposition.md`).
|
||||
- The deploy-target decision (Jetson vs operator workstation) — to be resolved during implementation; default proposal: containerised dev mode for now.
|
||||
|
||||
## References
|
||||
|
||||
- Companion: AZ-894 (CSV adapter), AZ-896 (docs + example CSV)
|
||||
- Deferred precedent: `_docs/00_research/00_question_decomposition.md` lines 119 ("Mission-planning UX is out of scope"), 224 ("Operator-side CLI/desktop tool design deferred to Plan-phase")
|
||||
- Tech stack: React + Tailwind CSS per `.cursor/rules/techstackrule.mdc`
|
||||
@@ -0,0 +1,78 @@
|
||||
# Land `architecture_compliance_baseline.md` (cycle-3 retro #3, third try)
|
||||
|
||||
**Task**: AZ-899_architecture_compliance_baseline
|
||||
**Name**: Create `_docs/02_document/architecture_compliance_baseline.md` so cumulative reviews can emit `## Baseline Delta` rows
|
||||
**Description**: Cycle-1 retro Top-3 Improvement Action #3, repeated in cycle-3 retro Top-3 #3. The file was not created in cycle 2 or cycle 3, leaving cumulative reviews unable to quantify carried-over / resolved / newly-introduced architecture violations per cycle. Seed the baseline from `_docs/06_metrics/structure_2026-05-20.md` with `0` violations, freeze the snapshot semantics, and wire the existing-code flow's Step 2 to reference it.
|
||||
**Complexity**: 1 SP
|
||||
**Dependencies**: None (operates on existing artifact `_docs/06_metrics/structure_2026-05-20.md`)
|
||||
**Component**: documentation only — no source code change
|
||||
**Tracker**: AZ-899 (https://denyspopov.atlassian.net/browse/AZ-899)
|
||||
**Epic**: (none — cycle-4 process housekeeping)
|
||||
|
||||
## Problem
|
||||
|
||||
Cycle-3 retro § Structural Metrics:
|
||||
|
||||
> `_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.
|
||||
|
||||
Without a baseline, cumulative reviews log "`_docs/02_document/architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted". Structural regressions (new cycles in the import graph, newly-introduced violations) therefore cannot be quantified across cycles — only verified pairwise per batch.
|
||||
|
||||
## Outcome
|
||||
|
||||
- Cumulative-review reports starting from cycle-4 batch 1 emit a `## Baseline Delta` section that quantifies new vs. resolved vs. carried-over architecture violations.
|
||||
- Cycle-end retros can compare structural deltas across cycles using a single canonical baseline document instead of re-deriving from the previous cycle's snapshot.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Create `_docs/02_document/architecture_compliance_baseline.md` seeded with **0** violations.
|
||||
- Reference `_docs/06_metrics/structure_2026-05-20.md` as the source-of-truth snapshot from which the baseline was derived.
|
||||
- Document the file's update protocol: a new violation found in a cumulative review is appended (with batch ID, severity, finding ID); a resolution is recorded by marking the row `RESOLVED in batch <ID>`.
|
||||
- Document the snapshot-refresh trigger: any cycle that materially changes structure (component count, cross-component edges, new contracts) re-snapshots via `python -m gps_denied_onboard.tools.structure_snapshot` (or equivalent existing script — verify before reference).
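The three numbers a `## Baseline Delta` section reports can be derived from two sets of finding IDs, as in this sketch. The function names and the ID format are hypothetical, and it assumes that an open baseline finding which no longer appears in the review counts as resolved; the actual protocol records resolution by marking the row.

```python
def baseline_delta(baseline_open: set, review_findings: set) -> dict:
    """Count new, resolved, and carried-over findings between baseline and review."""
    return {
        "new": len(review_findings - baseline_open),
        "resolved": len(baseline_open - review_findings),
        "carried_over": len(baseline_open & review_findings),
    }


def format_delta(delta: dict) -> str:
    """Render the one-line summary used in the report section."""
    return (
        f'{delta["new"]} new, {delta["resolved"]} resolved, '
        f'{delta["carried_over"]} carried-over'
    )
```

With an empty baseline and an empty review, the summary reads "0 new, 0 resolved, 0 carried-over".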
|
||||
|
||||
### Excluded
|
||||
|
||||
- Refactoring source code to fix violations — none currently exist.
|
||||
- Adding new component scaffolding — out of scope.
|
||||
- Modifying `code-review` or `retrospective` skills — they already reference the file; the only change needed is making the referenced file exist.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Baseline file exists with 0 violations**
|
||||
Given a fresh repo checkout
|
||||
When `ls _docs/02_document/architecture_compliance_baseline.md` runs
|
||||
Then the file exists and its `## Violations` section is explicitly empty (or marked "None at baseline")
|
||||
|
||||
**AC-2: Baseline references the structural snapshot**
|
||||
Given the baseline file
|
||||
When read
|
||||
Then it includes a `## Source` section pointing at `_docs/06_metrics/structure_2026-05-20.md` and lists the structural facts (15 components, 0 import cycles, 5 contract files) that establish the "0 violations" claim
|
||||
|
||||
**AC-3: Update protocol documented**
|
||||
Given the baseline file
|
||||
When read
|
||||
Then it includes an `## Update Protocol` section describing append-on-violation, mark-resolved-on-fix, and the snapshot-refresh trigger
|
||||
|
||||
**AC-4: Cumulative-review hook verified**
|
||||
Given the baseline file in place
|
||||
When the cycle-4 first cumulative-review report is generated
|
||||
Then the report emits a `## Baseline Delta` section (even if empty: "0 new, 0 resolved, 0 carried-over")
|
||||
|
||||
## Constraints
|
||||
|
||||
- File format: Markdown, matching the style and structure of `_docs/06_metrics/structure_2026-05-20.md`.
|
||||
- No source code change permitted under this ticket — strictly documentation.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: Future violations slip past the baseline**
|
||||
- *Risk*: A cumulative review finds a violation but the reviewer forgets to append it to the baseline.
|
||||
- *Mitigation*: The `code-review` skill (referenced in cycle-3 retro Suggested Updates) should be updated separately to auto-append; this ticket only delivers the baseline file. The follow-up belongs in cycle 5 if needed.
|
||||
|
||||
## References
|
||||
|
||||
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md` § Top 3 Improvement Actions #3
|
||||
- Cycle-1 retro: `_docs/06_metrics/retro_2026-05-20.md` § Top 3 Improvement Actions #3 (original)
|
||||
- Source snapshot: `_docs/06_metrics/structure_2026-05-20.md`
|
||||
- Existing-code flow Step 2: `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
|
||||
@@ -0,0 +1,82 @@
|
||||
# Autodev: gate Step-9 entry on previous-cycle retro existence
|
||||
|
||||
**Task**: AZ-900_autodev_retro_existence_gate
|
||||
**Name**: Codify the LESSONS rule — autodev must block cycle-N+1 Step 9 entry if `retro_<YYYY-MM-DD>.md` for cycle N is absent
|
||||
**Description**: Cycle-3 retro Top-3 Improvement Action #2 and the 2026-05-26 LESSONS entry both call for codifying a Re-Entry After Completion gate that verifies the previous cycle's retro file exists before incrementing the cycle counter. The cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3 and all cycle-1 retro Top-3 actions sat invisible. This ticket codifies the gate in `.cursor/skills/autodev/flows/existing-code.md` § Re-Entry After Completion.
|
||||
**Complexity**: 1 SP
|
||||
**Dependencies**: None
|
||||
**Component**: `.cursor/skills/autodev/flows/existing-code.md` (workflow doc only)
|
||||
**Tracker**: AZ-900 (https://denyspopov.atlassian.net/browse/AZ-900)
|
||||
**Epic**: (none — cycle-4 process housekeeping)
|
||||
|
||||
## Problem
|
||||
|
||||
LESSONS 2026-05-26 [process] entry:
|
||||
|
||||
> Cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing `retro_<cycle2-date>.md`. As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close.
|
||||
|
||||
Cycle-3 retro Top-3 #2 echoes the same recommendation.
|
||||
|
||||
The fix is a one-line check in the flow file that BLOCKS Step 9 entry for cycle N+1 unless `_docs/06_metrics/retro_<YYYY-MM-DD>.md` for cycle N exists.
|
||||
|
||||
## Outcome
|
||||
|
||||
- Future cycle-N → cycle-(N+1) transitions are gated: the autodev orchestrator refuses to enter Step 9 of cycle N+1 if no retro file exists for cycle N.
|
||||
- Missing retros are surfaced at the session boundary, not 6 weeks later at the next cycle's close.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Edit `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion" to add a gate: before incrementing `cycle`, glob `_docs/06_metrics/retro_*.md` and verify a file dated after the cycle-N start exists.
|
||||
- Define the BLOCK behavior: if absent, present a Choose A/B/C block:
|
||||
  - **A)** Author the missing retro now (invoke `.cursor/skills/retrospective/SKILL.md` in cycle-end mode)
  - **B)** Stub a backfilled retro and proceed (with a leftover entry filed for proper backfill)
  - **C)** Abort and ask the user
|
||||
- Add a corresponding bullet to `.cursor/skills/autodev/state.md` § "Session Boundaries" pointing at the new gate.
|
||||
|
||||
### Excluded
|
||||
|
||||
- Retroactively writing cycle-2 retro (separate ticket if user wants it; cycle-3 retro already covers cycle-2 trend deltas where data is on disk).
|
||||
- Adding similar gates to greenfield or meta-repo flows (only `existing-code` has the cycle counter).
|
||||
- Per-step retro check inside cycles (this gate fires only at the cycle boundary).
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Flow file gate exists**
|
||||
Given `.cursor/skills/autodev/flows/existing-code.md`
|
||||
When the "Re-Entry After Completion" section is read
|
||||
Then it contains a step `Verify previous cycle's retro exists` BEFORE the cycle increment
|
||||
|
||||
**AC-2: Choose A/B/C block specified**
|
||||
Given the gate triggers (no retro file found)
|
||||
When the documented behavior is consulted
|
||||
Then it specifies the three options (A: author now, B: stub + leftover, C: abort) with the standard Choose format
|
||||
|
||||
**AC-3: state.md cross-reference**
|
||||
Given `.cursor/skills/autodev/state.md`
|
||||
When the "Session Boundaries" section is read
|
||||
Then it mentions the new retro-existence gate or links to the flow file's gate
|
||||
|
||||
**AC-4: Discovery rule**
|
||||
Given the gate
|
||||
When the file pattern is documented
|
||||
Then the glob is unambiguous: `_docs/06_metrics/retro_*.md` with a date matching cycle-N's date range; the date-range derivation is explicit (cycle N start = last `implementation_report_*_cycle{N-1}.md` date; cycle N end = today)
|
||||
|
||||
## Constraints
|
||||
|
||||
- Pure workflow doc change — no source code, no tests.
|
||||
- Must not break the existing greenfield-Done → existing-code Phase-B transition (greenfield → existing-code is a one-shot flow change with no retro requirement on first entry, since there is no previous cycle).
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: False positive on greenfield→existing-code transition**
|
||||
- *Risk*: First cycle of an existing-code flow shouldn't require a previous-cycle retro.
|
||||
- *Mitigation*: Gate condition includes `state.cycle > 1` — cycle 1 has no previous cycle.
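Although the deliverable is a workflow-doc change, the gate logic can be pinned down as a short sketch. The function names are hypothetical; the sketch assumes the cycle number passed in is the cycle about to be entered, that the date range is inclusive on both ends, and that retro files follow the `retro_<YYYY-MM-DD>.md` naming.

```python
import re
from datetime import date
from pathlib import Path

RETRO_NAME = re.compile(r"^retro_(\d{4}-\d{2}-\d{2})\.md$")


def retro_exists(metrics_dir: Path, cycle_start: date, today: date) -> bool:
    """True if a retro file dated within [cycle_start, today] is present."""
    for path in metrics_dir.glob("retro_*.md"):
        match = RETRO_NAME.match(path.name)
        if match is None:
            continue  # e.g. retro_draft.md does not count
        try:
            retro_date = date.fromisoformat(match.group(1))
        except ValueError:
            continue  # looks like a date but is not a valid one
        if cycle_start <= retro_date <= today:
            return True
    return False


def gate_blocks_step9(entering_cycle: int, metrics_dir: Path,
                      cycle_start: date, today: date) -> bool:
    """True if Step 9 entry must be blocked pending the A/B/C choice."""
    if entering_cycle <= 1:
        return False  # first cycle: there is no previous retro to require
    return not retro_exists(metrics_dir, cycle_start, today)
```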
|
||||
|
||||
## References
|
||||
|
||||
- LESSONS 2026-05-26 [process] entry: `_docs/LESSONS.md` § 2026-05-26 [process]
|
||||
- Cycle-3 retro Top-3 #2: `_docs/06_metrics/retro_2026-05-26.md`
|
||||
- Flow file: `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion"
|
||||
- State management: `.cursor/skills/autodev/state.md` § "Session Boundaries"
|
||||
@@ -0,0 +1,85 @@
|
||||
# Fix `EVIDENCE_OUT` default path — workspace-relative, not container-only
|
||||
|
||||
**Task**: AZ-901_evidence_out_default_path_fix
|
||||
**Name**: Change `e2e/runner/conftest.py:56` `EVIDENCE_OUT` default from `/e2e-results/evidence` to a workspace-relative path so Tier-1 host runs don't crash
|
||||
**Description**: Closes leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`. Cycle-3 Step 15 (Performance Test) surfaced this: the default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness; a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly hits `OSError: [Errno 30] Read-only file system: '/e2e-results'` (macOS) or `PermissionError` (Linux). Workaround today: `EVIDENCE_OUT="$(pwd)/e2e-results/..." pytest ...`. Fix: resolve a workspace-relative default when neither `--evidence-out` nor `EVIDENCE_OUT` is set.
|
||||
**Complexity**: 1 SP
|
||||
**Dependencies**: None
|
||||
**Component**: `e2e/runner/conftest.py`
|
||||
**Tracker**: AZ-901 (https://denyspopov.atlassian.net/browse/AZ-901)
|
||||
**Epic**: (none — cycle-4 process housekeeping)
|
||||
|
||||
## Problem
|
||||
|
||||
`e2e/runner/conftest.py:56`:
|
||||
|
||||
```python
default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
```
|
||||
|
||||
The default `/e2e-results/evidence` is a container-mount path. The Tier-1 Docker harness and the Tier-2 Jetson runner pass `--evidence-out` explicitly, so they are unaffected. Host-direct `python -m pytest e2e/tests/performance/` invocations (developer machine, no Docker) reach `nfr_recorder.pytest_sessionfinish`, which tries `mkdir(evidence_dir)` and crashes.
|
||||
|
||||
## Outcome
|
||||
|
||||
- Developer can run `python -m pytest e2e/tests/performance/` on a Mac/Linux workstation without setting `EVIDENCE_OUT` and without crashing.
|
||||
- Docker / Jetson runners continue to work unchanged (they pass `--evidence-out` explicitly).
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Modify `e2e/runner/conftest.py:56` to resolve a workspace-relative default when `EVIDENCE_OUT` is unset.
|
||||
  - Proposed: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`
|
||||
- Verify Docker compose files and Jetson scripts that pass `--evidence-out` still work (they should — they override the default).
|
||||
- Verify `.gitignore` ignores `e2e-results/` at repo root (probably already does — confirm before commit).
|
||||
- Delete the leftover file `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` once the fix lands and the verification AC passes.
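The proposed default can be factored into a small helper to make the path arithmetic explicit. The helper name is hypothetical; it relies on `conftest.py` living at `e2e/runner/conftest.py`, so `parents[2]` of the resolved file path is the repository root.

```python
import os
from pathlib import Path


def default_evidence_out(conftest_file: str, environ=os.environ) -> str:
    """Resolve the evidence directory: EVIDENCE_OUT if set, else repo-relative."""
    # parents[0] = e2e/runner, parents[1] = e2e, parents[2] = repository root
    repo_root = Path(conftest_file).resolve().parents[2]
    fallback = repo_root / "e2e-results" / "evidence"
    return environ.get("EVIDENCE_OUT", str(fallback))
```

Inside `conftest.py` this would be called as `default_evidence_out(__file__)`; an explicit `--evidence-out` still overrides it because the value is only an option default.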
|
||||
|
||||
### Excluded
|
||||
|
||||
- The "lazy fallback inside the recorder" alternative shape — staying with the workspace-relative-default shape for simplicity (Option 1 from the leftover file).
|
||||
- Refactoring `nfr_recorder.pytest_sessionfinish` — the writer code is fine; only the default path is wrong.
|
||||
- Adding new evidence-out related env vars or CLI flags.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Host-direct pytest works without EVIDENCE_OUT**
|
||||
Given a clean workspace on macOS or Linux
|
||||
When `python -m pytest e2e/tests/performance/ -v --tb=short` runs (no `EVIDENCE_OUT` env var, no `--evidence-out` flag)
|
||||
Then pytest exits 0, evidence is written under `<workspace_root>/e2e-results/evidence/`, and no `OSError` / `PermissionError` is raised
|
||||
|
||||
**AC-2: Docker harness unchanged**
|
||||
Given the Tier-1 Docker compose (`docker-compose.test.jetson.yml`)
|
||||
When the e2e suite runs inside the container
|
||||
Then `--evidence-out` is still passed and evidence lands at the container mount path `/e2e-results/evidence/` (no behavioral change)
|
||||
|
||||
**AC-3: Jetson harness unchanged**
|
||||
Given `scripts/run-tests-jetson.sh`
|
||||
When invoked
|
||||
Then it still passes `--evidence-out` to pytest and evidence is collected per the existing protocol
|
||||
|
||||
**AC-4: gitignore covers workspace-relative path**
|
||||
Given the fix in place
|
||||
When a host-direct run produces `<workspace_root>/e2e-results/`
|
||||
Then `git status` does NOT show `e2e-results/` as untracked (already covered by `.gitignore`, or `.gitignore` is updated as part of this ticket)
|
||||
|
||||
**AC-5: Leftover deleted**
|
||||
Given the fix lands and ACs 1–4 pass
|
||||
When `ls _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`
|
||||
Then the file does not exist
|
||||
|
||||
## Unit Tests
|
||||
|
||||
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Run `pytest e2e/tests/performance/` without env vars on host | Exit 0, evidence at `<workspace_root>/e2e-results/evidence/` |
|
||||
|
||||
## Constraints
|
||||
|
||||
- Backward-compatible — existing callers passing `--evidence-out` or setting `EVIDENCE_OUT` see no change.
|
||||
- No new dependencies; uses `pathlib.Path` which `conftest.py` already imports (verify before commit).
|
||||
|
||||
## References
|
||||
|
||||
- Leftover file: `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`
|
||||
- Cycle-3 Step 15 perf report: `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` § "Findings worth tracking" item 3
|
||||
- Conftest: `e2e/runner/conftest.py:56`
|
||||