[AZ-598] Batch 78: sitl_observer.wait_for_outbound + FT-P-01 fixture builder

Phase 1: extend sitl_observer with cursor-based `wait_for_outbound`
returning `OutboundMessage` from `outbound_messages_<fc_kind>_<host>.json`
fixtures. Three outcomes: message, TimeoutError (null entries), or
RuntimeError (missing/malformed). Fix FT-P-01 + FT-P-05 scenarios to
use `fc_kind=` kwarg.

Phase 2: FT-P-01 vertical-slice fixture builder under
`e2e/fixtures/sitl_replay_builder/`. Reuses the production
`gps-denied-replay` CLI + `ReplayInputAdapter`: encode 60 stills as
1 fps MP4 + synthetic stationary tlog (pymavlink); run replay;
project FDR outbound estimates into the schema. Avoids the
13+ cp of SUT-side frame-ingestion that a live-SITL-capture path
would have required. Live execution remains a manual operator step.

+35 new unit tests (664 total, up from 637: +27 net). K=3 cumulative
review for b76-b78 documents the offline-replay arc convergence.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 12:08:02 +03:00
parent f49d803252
commit 47ad43f913
14 changed files with 1940 additions and 8 deletions
@@ -0,0 +1,132 @@
# Batch 78 Report — FT-P-01 vertical slice (cycle 1, batch 12 of test phase)
**Batch**: 78
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-598 (5 cp) — 1 task (FT-P-01 vertical slice)
**Cycle**: 1
**Verdict**: COMPLETE — PASS (self-reviewed + cumulative-reviewed; see `reviews/batch_78_review.md` + `reviews/cumulative_76_78_review.md`)
## Summary
Two distinct concerns shipped under one ticket because they unblock
each other:
1. **Observer extension**: `sitl_observer._FdrReplayObserver.wait_for_outbound`
was missing despite being called by FT-P-01 + FT-P-05. The b78
audit caught this; the implementation adds cursor-based replay
from `outbound_messages_<fc_kind>_<host>.json` plus an
`OutboundMessage` dataclass + the two scenario kwarg fixes
(`fc_adapter=` → `fc_kind=`).
2. **FT-P-01 fixture builder** — a vertical-slice tool that produces
the two fixture files (`outbound_messages_*` + `observer_*`) for
the FT-P-01 scenario. Pivoted from the original "live SITL
docker capture" design (would have needed ~13+ cp of new SUT-side
frame-ingestion code) to a "drive `gps-denied-replay` against a
1 fps MP4 + synthetic stationary tlog" approach that reuses the
existing production `ReplayInputAdapter`. No new SUT code; one
subprocess call instead of a multi-container compose.
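
The last step of the builder flow in item 2 (project FDR outbound estimates into the fixture schema) can be sketched roughly as below. This is a minimal illustration, not the shipped code: the function names, the `"kind"` record field, and the per-entry JSON keys (`lat_deg`, `lon_deg`, `image_id`) are assumptions; only the JSONL input, the `messages` list, and the `None` → `null` convention come from this report.

```python
from __future__ import annotations

import json
from pathlib import Path


def parse_estimates(fdr_path: Path, fdr_kind: str, lat_key: str, lon_key: str):
    """Walk a JSONL file and project matching records to (lat, lon) tuples.

    Assumption: each record carries its type under a "kind" field.
    """
    estimates = []
    with Path(fdr_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("kind") != fdr_kind:
                continue  # skip unrelated FDR records
            estimates.append((float(record[lat_key]), float(record[lon_key])))
    return estimates


def write_fixture(output: Path, image_ids: list, estimates: list) -> None:
    """Write {"messages": [...]}; a None estimate becomes a JSON null."""
    if len(image_ids) != len(estimates):
        raise ValueError("image_ids and estimates must have the same length")
    messages = [
        None
        if est is None
        else {"lat_deg": est[0], "lon_deg": est[1], "image_id": image_id}
        for image_id, est in zip(image_ids, estimates)
    ]
    text = json.dumps({"messages": messages}, indent=2) + "\n"
    Path(output).write_text(text, encoding="utf-8")
```

Keeping the record kind and field keys as parameters means the projection does not hard-code the FDR layout, which matches the configurable projection described for `parse_fdr_for_outbound_estimates`.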
### Direction-correction surfaced mid-batch
During b78 scoping I incorrectly told the user that the
"upload-tlog+video" feature wasn't implemented. Discovery during
scope analysis showed `src/gps_denied_onboard/replay_input/` exists
exactly for that use case (CLI = `gps-denied-replay`, coordinator
= `ReplayInputAdapter`, auto-sync = AZ-405). I corrected the user
immediately, surfaced the direction options, and the user chose to
stay the course on b78 (FT-P-01 vertical slice). The discovery also
enabled the pivot from live-SITL-capture to "reuse the
`gps-denied-replay` CLI" — turning the impossible-in-one-batch
phase 2 into a tractable one.
### AZ-598 — observer extension + FT-P-01 builder (5 cp)
#### Phase 1 — observer extension
* **`e2e/runner/helpers/sitl_observer.py`** (extended):
  * New `OutboundMessage(lat_deg, lon_deg, image_id=None)` frozen
    dataclass.
  * `_FdrReplayObserver` unfrozen (cursor state is now meaningful);
    `_outbound_cursor: int = 0` + lazy `_outbound_messages` cache.
  * New `wait_for_outbound(timeout_s: float | None = None)` method
    with three outcomes: `OutboundMessage` / `TimeoutError` /
    `RuntimeError`.
  * New module-level `_load_outbound_messages(fc_kind, host)` helper
    that validates the entire `messages` list at first read.
* **`e2e/tests/positive/test_ft_p_01_still_image_accuracy.py`**:
`get_observer(fc_adapter=...)` → `get_observer(fc_kind=...)`.
* **`e2e/tests/positive/test_ft_p_05_sat_anchor.py`**: same kwarg
fix.
* **`e2e/_unit_tests/helpers/test_sitl_observer.py`**: +11 tests
covering cursor advance, timeout, exhaustion, missing file,
missing env, malformed schema (list/object/keys), optional
`image_id`, and cursor independence between observers.
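
The cursor semantics above can be sketched as follows. This is a simplified stand-in rather than the real `_FdrReplayObserver`: the class name `ReplayObserver`, the explicit `fixture_path` argument, and the per-entry JSON keys are assumptions for illustration, and treating an exhausted fixture as `TimeoutError` is a guess about the shipped behaviour.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class OutboundMessage:
    lat_deg: float
    lon_deg: float
    image_id: str | None = None


def load_outbound_messages(path: Path) -> list:
    """Validate the whole `messages` list up front; None marks a timeout slot."""
    try:
        doc = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        raise RuntimeError(f"missing or unreadable fixture: {path}") from exc
    if not isinstance(doc, dict) or not isinstance(doc.get("messages"), list):
        raise RuntimeError("fixture must be an object with a 'messages' list")
    messages = []
    for i, entry in enumerate(doc["messages"]):
        if entry is None:
            messages.append(None)
            continue
        if not isinstance(entry, dict) or not {"lat_deg", "lon_deg"} <= entry.keys():
            raise RuntimeError(f"messages[{i}] lacks lat_deg/lon_deg")
        messages.append(
            OutboundMessage(
                float(entry["lat_deg"]), float(entry["lon_deg"]), entry.get("image_id")
            )
        )
    return messages


@dataclass  # not frozen: the cursor is mutable per-observer state
class ReplayObserver:
    fixture_path: Path
    _cursor: int = 0
    _messages: list | None = field(default=None, repr=False)

    def wait_for_outbound(self, timeout_s: float | None = None) -> OutboundMessage:
        # timeout_s is accepted for API parity; replay never actually waits.
        if self._messages is None:  # lazy load on first call
            self._messages = load_outbound_messages(self.fixture_path)
        if self._cursor >= len(self._messages):
            raise TimeoutError("fixture exhausted")  # assumed outcome
        entry = self._messages[self._cursor]
        self._cursor += 1  # a null slot still consumes one position
        if entry is None:
            raise TimeoutError("recorded timeout (null entry)")
        return entry
```

Because each observer instance owns its cursor and cache, two observers reading the same fixture advance independently, which is the property the last unit test in the list checks.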
#### Phase 2 — FT-P-01 fixture builder
* **`e2e/fixtures/sitl_replay_builder/__init__.py`** (new): minimal
package docstring; deliberately no symbol re-exports (avoid the
`build_p01_fixtures` function/submodule name-shadow pitfall —
documented in the docstring).
* **`e2e/fixtures/sitl_replay_builder/build_p01_fixtures.py`** (new):
  * `BuilderConfig` frozen dataclass.
  * `encode_stills_to_mp4(image_paths, output, fps=1.0)`: OpenCV;
    accepts `_video_writer_factory` / `_imread` for testability.
  * `generate_stationary_tlog(output, duration_s=120, hz=200)`:
    pymavlink; writes zero-motion `RAW_IMU` + `ATTITUDE` pairs.
  * `run_gps_denied_replay(video, tlog, fdr_out, time_offset_ms=0, extra_args=...)`:
    subprocess to the production CLI; bypasses auto-sync because the
    synthetic tlog has no take-off.
  * `parse_fdr_for_outbound_estimates(fdr_path, fdr_kind=..., lat_key=..., lon_key=...)`:
    JSONL walk; configurable record-kind + field-key projection.
  * `write_outbound_messages_fixture(output, image_ids, estimates)`:
    schema writer; preserves `None` → JSON `null` for timeouts.
  * `write_observer_fixture(output)`: minimal observer config so
    `get_observer` succeeds.
  * `build_p01_fixtures(cfg, *, _runner=None, ...)`: orchestrator.
  * `_main(argv=None) -> int`: argparse CLI entry point.
* **`e2e/fixtures/sitl_replay_builder/README.md`** (new): strategy,
usage, output structure, limitations.
* **`e2e/_unit_tests/fixtures/test_sitl_replay_builder.py`** (new):
+24 tests — 3 for `encode_stills_to_mp4`, 4 for
`generate_stationary_tlog` (incl. one real-pymavlink round-trip),
3 for `run_gps_denied_replay`, 6 for
`parse_fdr_for_outbound_estimates`, 3 for
`write_outbound_messages_fixture`, 1 for
`write_observer_fixture`, 4 for end-to-end orchestration.
* **`e2e/_unit_tests/test_directory_layout.py`**: registers the
three new files in the layout invariant.
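
The name-shadow pitfall that the `__init__.py` docstring warns about can be reproduced in isolation. The sketch below builds two throwaway packages in a temp directory (the names `demo_shadowed`, `demo_plain`, and `build_thing` are made up for the demo) and shows that re-exporting a function named like its submodule replaces the submodule attribute on the package.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def make_package(root: Path, name: str, reexport: bool) -> None:
    """Create <root>/<name>/ with a submodule and function both named build_thing."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "build_thing.py").write_text("def build_thing():\n    return 'built'\n")
    init = (
        "from .build_thing import build_thing\n"
        if reexport
        else '"""No re-exports."""\n'
    )
    (pkg / "__init__.py").write_text(init)


def attribute_kind(name: str) -> str:
    """Report what `<name>.build_thing` resolves to after both imports."""
    pkg = importlib.import_module(name)
    importlib.import_module(f"{name}.build_thing")
    attr = getattr(pkg, "build_thing")
    return "module" if isinstance(attr, types.ModuleType) else "function"


root = Path(tempfile.mkdtemp())
sys.path.insert(0, str(root))
make_package(root, "demo_shadowed", reexport=True)
make_package(root, "demo_plain", reexport=False)
importlib.invalidate_caches()

print(attribute_kind("demo_shadowed"))  # function: the re-export shadows the submodule
print(attribute_kind("demo_plain"))     # module: attribute access reaches the submodule
```

In the shadowed case, anything that resolves the dotted path by attribute access (for example a patch target string) lands on the function instead of the module, which is why leaving `__init__.py` free of re-exports is the simpler choice here.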
## Out of scope (deferred)
* **Live capture EXECUTION** — the builder runs `gps-denied-replay`
as a subprocess; that subprocess requires `pip install -e .` at
repo root plus access to the input images. Not executed in this
batch; documented as a manual operator step. A future ticket can
add a CI job that runs the live capture + commits the resulting
fixtures.
* **Other scenarios** (FT-P-02 through FT-N-04) — each needs its
own fixture-builder flow (continuous video + IMU CSV replay,
blackout/spoof setup, etc.).
* **iNav adapter** — only ArduPilot supported in this batch.
* **`fc_kind` ↔ `fc_adapter` naming convergence** — kwarg-fix only;
a future cleanup ticket should converge the vocabulary.
## Test Results
* New unit tests: **35** (11 `wait_for_outbound` + 24 builder).
* Full `e2e/_unit_tests` suite: **664 passed in 137 s** (previous
cumulative: 637 → +27 net).
* No new linter errors.
* `grep "raise NotImplementedError"` under `e2e/tests/` returns
**zero** matches (b77 invariant preserved).
## State
* Spec moved: `_docs/02_tasks/todo/AZ-598_ft_p_01_vertical_slice.md`
→ `_docs/02_tasks/done/`.
* `_docs/_autodev_state.md` advanced to `last_completed_batch: 78`,
`last_cumulative_review: batches_76-78` (K=3 cumulative shipped
alongside the batch review).