[AZ-405] Replay — replay_input/ coordinator + IMU take-off auto-sync

Adds the Layer-4 cross-cutting `replay_input/` module per ADR-011:
ReplayInputAdapter converges (video, tlog) into the standard
FrameSource + FcAdapter + Clock surfaces the airborne composition
root consumes. Owns time-alignment between video frames and tlog
IMU/attitude ticks (manual via --time-offset-ms or auto via the
AZ-405 IMU-take-off detector + Farneback motion-onset detector).

Auto-sync algorithm (auto_sync.py):
- Tlog take-off detector: sustained vertical-accel excess > 0.5 g for
  >= 0.5 s + sustained attitude-rate magnitude > 1 rad/s.
- Video motion-onset detector: dense Farneback flow magnitude > 1.5 px
  sustained >= 0.5 s (deterministic per AC-10).
- compute_offset combines the two; confidence = min(tlog, video).
- validate_offset_or_fail implements the AC-9 95 % frame-window match
  validator with configurable threshold + window.

ReplayInputAdapter.open() ordering (AC-13):
1. Load tlog samples + fail-fast on missing RAW_IMU/SCALED_IMU2 or
   ATTITUDE BEFORE any video read.
2. Resolve offset (auto-sync OR manual override; manual bypasses the
   detectors entirely per AC-8).
3. Run AC-9 validator on resolved offset; raise auto-sync hard-fail
   for AC-7 (CLI exit 2 mapping).
4. Build single Clock instance per pace (TlogDerived/ASAP, Wall/REAL).
5. Construct VideoFileFrameSource and TlogReplayFcAdapter with the
   resolved offset baked in (replay protocol Invariant 8).

Structured log + FDR records on auto-sync detected / low-confidence /
AC-8 hard-fail kinds. Idempotent close (AC-12).

Tests: 25 unit tests across tests/unit/replay_input/ covering all 13
ACs (kernel-level synthetic fixtures for AC-1..AC-10; coordinator-
level OpenCV synthetic videos + faked pymavlink for AC-6..AC-13).

Contract update: replay_protocol.md v2.0.0 added fdr_client to the
ReplayInputAdapter __init__ signature (was missing in the prose; the
task spec already listed it in the allowed-imports section).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-14 09:50:51 +03:00
parent f9b4241d3a
commit 8149083cac
14 changed files with 2979 additions and 4 deletions
@@ -1,159 +0,0 @@
# Replay — `replay_input/` coordinator + auto-sync video↔tlog via IMU take-off detection
**Task**: AZ-405_replay_auto_sync
**Name**: `replay_input/` Layer-4 cross-cutting coordinator (`ReplayInputAdapter`) + auto-sync of video↔tlog timestamp offset via IMU take-off detection (AC-7 / AC-8; `--time-offset-ms` is the manual override)
**Description**: Per ADR-011, replay is a configuration of the airborne binary; the architectural integration point is the new `replay_input/` Layer-4 cross-cutting module that converges `(video, tlog)` inputs into the standard `FrameSource` + `FcAdapter` + `Clock` surfaces the composition root already consumes. This task creates the `replay_input/` module and owns the time-alignment concern inside it (auto-sync + manual offset application).
The module:
1. Hosts the `ReplayInputAdapter` class in `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` (public re-export in `__init__.py`). Constructor takes `(video_path, tlog_path, camera_calibration, target_fc_dialect, wgs_converter, pace, manual_time_offset_ms, auto_sync_config)`. `.open()` resolves the time-offset (auto-sync OR manual override), instantiates `VideoFileFrameSource` + `TlogReplayFcAdapter` + chosen `Clock` (`TlogDerivedClock` for pace=ASAP; `WallClock` for pace=REALTIME), and returns a `ReplayInputBundle(frame_source, fc_adapter, clock, resolved_time_offset_ms, auto_sync_result)` for the composition root to wire.
2. Hosts the auto-sync logic in `src/gps_denied_onboard/replay_input/auto_sync.py`:
- `detect_tlog_takeoff(tlog_path, target_fc_dialect) -> AutoSyncResult` — parses the tlog for the IMU take-off pattern (sustained vertical accel > 0.5 g for ≥ 0.5 s + change in attitude rate > 1 rad/s in the same window — typical quadcopter take-off signature); returns `(tlog_takeoff_ns, confidence)`.
- `detect_video_motion_onset(video_path, frame_rate_hz) -> AutoSyncResult` — analyses the video for motion-onset via pyramidal optical flow magnitude crossing a configurable threshold sustained for ≥ 0.5 s; returns `(video_motion_onset_ns, confidence)`.
- `compute_offset(tlog_result, video_result) -> AutoSyncOffset` — combines the two; offset = `tlog_takeoff_ns - video_motion_onset_ns` (positive offset = video starts before take-off recorded in tlog); confidence = combined.
- `validate_offset_or_fail(offset, tlog_path, video_path, frame_rate_hz, threshold_pct) -> int` — runs the AC-8 frame-window match-percentage check: for each video frame, find the nearest IMU window within ± 100 ms after applying the offset; return 0 if ≥ 95 % of frames have a match, 2 otherwise.
3. Confidence scoring: confidence is high (≥ 0.80) when both signals are well-defined and low when either is ambiguous (e.g., fixed-wing hand-launch, which produces no clear vertical-accel pulse above 0.5 g). If combined confidence < 0.80, `ReplayInputAdapter.open()` logs WARN, uses the best-guess offset, and proceeds. A non-`None` `manual_time_offset_ms` always overrides auto-detect.
4. AC-8 hard-fail: if `validate_offset_or_fail` returns 2 (either after auto-sync OR after manual override), `ReplayInputAdapter.open()` raises `ReplayInputAdapterError("auto-sync hard-fail: …")` which the shared main maps to CLI exit code 2.
The composition root's replay-mode branch (AZ-401) instantiates `ReplayInputAdapter`, calls `.open()`, and consumes the returned bundle. No replay-aware code lives outside this module + AZ-400's transport seam + AZ-401's composition-root branch.
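A minimal sketch of the offset combination described above, not the shipped `auto_sync.py`. The dataclass field names (`event_ns`, `offset_ms`) and the `low_confidence` helper are hypothetical; the sign convention and the 0.80 boundary come from the description, and `min()` as the combination rule is taken from the commit message.

```python
from dataclasses import dataclass

# WARN-and-proceed boundary from the description (assumed to be a 0..1 fraction).
LOW_CONFIDENCE_THRESHOLD = 0.80


@dataclass(frozen=True)
class AutoSyncResult:
    """Output of one detector; field names are illustrative only."""
    event_ns: int      # take-off or motion-onset timestamp in nanoseconds
    confidence: float  # 0.0 .. 1.0


@dataclass(frozen=True)
class AutoSyncOffset:
    offset_ms: float
    confidence: float

    @property
    def low_confidence(self) -> bool:
        # True when the caller should log WARN and proceed with the best guess.
        return self.confidence < LOW_CONFIDENCE_THRESHOLD


def compute_offset(tlog: AutoSyncResult, video: AutoSyncResult) -> AutoSyncOffset:
    # offset = tlog_takeoff_ns - video_motion_onset_ns, converted to milliseconds.
    offset_ms = (tlog.event_ns - video.event_ns) / 1_000_000
    # Combined confidence is the weaker of the two signals.
    combined = min(tlog.confidence, video.confidence)
    return AutoSyncOffset(offset_ms=offset_ms, confidence=combined)


result = compute_offset(
    AutoSyncResult(event_ns=12_000_000_000, confidence=0.9),
    AutoSyncResult(event_ns=7_000_000_000, confidence=0.7),
)
```

With these inputs the take-off sits 12 s into the tlog timeline and motion onset 7 s into the video timeline, so the offset is 5000 ms and the result falls into the low-confidence regime because the video detector only reached 0.7.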
**Complexity**: 5 points (unchanged from v1.0.0 — same algorithmic work; the coordinator class is a small addition since it just instantiates strategies the algorithm already needs).
**Dependencies**: AZ-402 (CLI provides the args that feed `ReplayInputAdapter`); AZ-399 (`TlogReplayFcAdapter` is instantiated by `ReplayInputAdapter.open()`); AZ-398 (`VideoFileFrameSource` + `Clock` strategies are instantiated by `ReplayInputAdapter.open()`); AZ-279 (`WgsConverter` constructor-injected); AZ-263 (`runtime_root` bootstrap); AZ-269 / AZ-270 (`Config.replay.auto_sync` sub-config); AZ-266 (logging); AZ-272 (FDR record schema for confidence + decision logging).
**Component**: replay-input (epic AZ-265 / E-DEMO-REPLAY) — module at `src/gps_denied_onboard/replay_input/`.
**Tracker**: AZ-405
**Epic**: AZ-265 (E-DEMO-REPLAY)
### Document Dependencies
- `_docs/02_document/contracts/replay/replay_protocol.md` (v2.0.0) — `ReplayInputAdapter` API; `time_offset_ms` semantics (Invariant 8).
- `_docs/02_document/architecture.md` — **ADR-011** (replay-as-configuration; ReplayInputAdapter is the architectural seam between (video, tlog) and the rest of the system) + R-DEMO-1 mitigation.
- `_docs/02_document/module-layout.md` — `shared/replay_input` cross-cutting entry.
- Epic AZ-265 description in `_docs/02_document/epics.md` — AC-7 / AC-8 / AC-9 / AC-10.
## Problem
Two problems:
1. **Without `replay_input/`** there is no module-level home for the `(video, tlog)` → `(FrameSource, FcAdapter, Clock)` convergence; the composition root would need to instantiate each strategy individually, know about auto-sync, and apply the manual override, all of which is replay-specific code leaking into `compose_root`. Per ADR-011 the composition root should see only standard `FrameSource` + `FcAdapter` + `Clock` instances after the coordinator is opened; this task creates the coordinator.
2. **Without auto-sync** the replay CLI relies on the operator passing `--time-offset-ms N` manually, which is error-prone (operators often don't have a stopwatch on the moment of take-off; the camera and FC are routinely started at different times). R-DEMO-1 is a recurring real-world concern. AC-7 / AC-8 codify the auto-sync expectation.
## Outcome
- `src/gps_denied_onboard/replay_input/__init__.py`:
- Re-exports `ReplayInputAdapter`, `ReplayInputBundle`, `AutoSyncDecision`, `AutoSyncConfig`, `ReplayInputAdapterError`.
- `src/gps_denied_onboard/replay_input/interface.py`:
- `ReplayInputBundle` frozen+slots dataclass.
- `AutoSyncDecision` frozen+slots dataclass.
- `AutoSyncConfig` frozen+slots dataclass (defaults + thresholds).
- `src/gps_denied_onboard/replay_input/tlog_video_adapter.py`:
- `ReplayInputAdapter` class with `open()` + `close()` (idempotent close).
- Inside `open()`: resolve time-offset (auto-sync OR manual) → instantiate strategies → return bundle.
- Fails fast if required tlog message types absent (R-DEMO-3); raises `ReplayInputAdapterError("tlog missing required message types: ...")`.
- `src/gps_denied_onboard/replay_input/auto_sync.py`:
- `detect_tlog_takeoff(tlog_path, target_fc_dialect) -> AutoSyncResult` — pymavlink stream-parse; sustained vertical-accel + attitude-rate detector.
- `detect_video_motion_onset(video_path, frame_rate_hz) -> AutoSyncResult` — OpenCV pyramidal optical flow.
- `compute_offset(tlog_result, video_result) -> AutoSyncOffset` — combination + confidence.
- `validate_offset_or_fail(offset, tlog_path, video_path, frame_rate_hz, threshold_pct) -> int` — AC-8 validator.
- `src/gps_denied_onboard/replay_input/tests/` — unit tests:
- `test_tlog_takeoff_detector_positive` (AC-1).
- `test_tlog_takeoff_detector_ambiguous` (AC-2).
- `test_tlog_takeoff_detector_hand_launch` (AC-3).
- `test_video_motion_onset_positive` (AC-4).
- `test_combined_offset_within_200ms` (AC-5).
- `test_combined_offset_low_confidence_warn_and_proceed` (AC-6).
- `test_ac8_validator_hard_fail` (AC-7).
- `test_manual_override_bypasses_auto_detect` (AC-8).
- `test_frame_window_match_validator_threshold` (AC-9).
- `test_confidence_score_deterministic` (AC-10).
- `test_replay_input_adapter_open_returns_bundle` (covers the coordinator wiring; AC-11 below).
- `test_replay_input_adapter_clock_strategy_pace_asap` (TlogDerivedClock).
- `test_replay_input_adapter_clock_strategy_pace_realtime` (WallClock).
- `test_replay_input_adapter_close_idempotent`.
- `test_replay_input_adapter_missing_tlog_messages_fails_fast` (R-DEMO-3).
- INFO log on auto-detect success: `kind="replay.auto_sync.detected"` with `{tlog_takeoff_ns, video_motion_onset_ns, offset_ms, tlog_confidence, video_confidence, combined_confidence}`.
- WARN log on low confidence: `kind="replay.auto_sync.low_confidence"` with the same fields + `proceeding_with_best_guess: true`.
- ERROR log on AC-8 fail: `kind="replay.auto_sync.ac8_validation_failed"` with `{frame_window_match_pct, threshold_pct: 95.0}`.
- FDR records mirror all three log kinds.
## Scope
### Included
- `replay_input/` module structure (`__init__.py`, `interface.py`, `tlog_video_adapter.py`, `auto_sync.py`, `tests/`).
- `ReplayInputAdapter` class with `open()` + `close()`.
- Tlog-takeoff detector (sustained vertical accel + attitude rate).
- Video-motion-onset detector (pyramidal optical flow).
- Combined offset computation + confidence.
- AC-8 frame-window match-percentage validator.
- Manual override (`manual_time_offset_ms is not None`) bypass path.
- Structured logging + FDR.
- All unit tests listed above.
### Excluded
- E2E test against the Derkachi fixture — owned by AZ-404 (this task ships unit tests; AZ-404 adds the integration assertion AC-7 / AC-8 / AC-9).
- The CLI argparse + entrypoint — owned by AZ-402.
- The composition root branch on `config.mode` — owned by AZ-401.
- `VideoFileFrameSource` + `Clock` strategies themselves — owned by AZ-398.
- `TlogReplayFcAdapter` itself — owned by AZ-399.
## Acceptance Criteria
**AC-1: Tlog take-off detector positive** — synthetic AP IMU trace with a clear take-off (sustained 1.2 g vertical for 1 s + 1.5 rad/s attitude rate) → `tlog_takeoff_ns` matches the synthetic onset within ± 50 ms; `confidence ≥ 0.85`.
**AC-2: Tlog take-off detector ambiguous** — synthetic IMU with low-amplitude vibration (0.3 g) but no take-off → `confidence < 0.50`.
**AC-3: Tlog take-off detector hand-launch** — synthetic IMU with abrupt 0.8 g impulse but no sustained climb → `confidence < 0.80` (in the WARN-and-proceed regime per AC-6).
**AC-4: Video motion-onset positive** — synthetic 60-frame video with first 10 frames stationary and frames 11+ moving → `video_motion_onset_ns` matches the onset of frame 11 within ± 1 frame.
**AC-5: Combined offset within ± 200 ms (epic AC-7)** — for a fixture with KNOWN ground-truth offset (e.g., constructed test case offset = 5000 ms), `compute_offset` returns within ± 200 ms of ground truth.
**AC-6: Low combined confidence WARN-and-proceed** — when `combined_confidence < 0.80`, `ReplayInputAdapter.open()` returns the bundle with the best-guess offset + WARN log; does NOT raise — verified via the unit test of the coordinator.
**AC-7: AC-8 hard-fail raises** — wire a `validate_offset_or_fail` against a deliberately-bad offset (e.g., 60 s offset on a 60 s clip — every frame would be off the tlog window); `ReplayInputAdapter.open()` raises `ReplayInputAdapterError("auto-sync hard-fail: …")` so the shared main maps to CLI exit code 2; ERROR log + FDR fired.
**AC-8: Manual override bypasses auto-detect** — `ReplayInputAdapter(manual_time_offset_ms=5000, …).open()` → `detect_*` and `compute_offset` are NOT invoked (verified via call-count assertion); the manual offset flows directly into `TlogReplayFcAdapter`. The frame-window validator (AC-9; epic AC-8) still runs, so a wildly wrong manual offset still fails fast.
**AC-9: Frame-window match-percentage validator** — for a known-good offset, the validator computes ≥ 95 % match (returns 0); for a known-bad offset, it computes < 95 % (returns 2). Threshold is configurable via `config.replay.auto_sync_match_threshold_pct` (default 95.0).
**AC-10: Confidence-score determinism** — re-run the auto-sync against the same input twice; assert confidence values match within 1e-9 (algorithmic determinism).
**AC-11: ReplayInputAdapter.open() returns a complete bundle** — `bundle = adapter.open()` returns a `ReplayInputBundle` with `isinstance(bundle.frame_source, VideoFileFrameSource)`, `isinstance(bundle.fc_adapter, TlogReplayFcAdapter)`, and `bundle.clock` matching the pace (`TlogDerivedClock` for ASAP, `WallClock` for REALTIME). The `resolved_time_offset_ms` field equals either the manual override or the auto-sync result.
**AC-12: Close is idempotent** — `adapter.open(); adapter.close(); adapter.close()` does not raise; the second close is a no-op.
**AC-13: Missing tlog messages fail fast** — open against a tlog missing `RAW_IMU` (AP) or `MSP2_RAW_IMU` (iNav); assert `ReplayInputAdapterError("tlog missing required message types: ['RAW_IMU']")` is raised inside `open()` BEFORE any video read (R-DEMO-3).
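The frame-window match check behind AC-7 and AC-9 can be sketched on plain timestamp lists. This is a simplified stand-in, not the real `validate_offset_or_fail` (which takes file paths): the function names and the pre-extracted millisecond timestamps are assumptions, while the ± 100 ms window, the 95 % threshold, and the 0 / 2 return codes come from this spec.

```python
from bisect import bisect_left
from typing import Sequence


def frame_window_match_pct(
    frame_ts_ms: Sequence[float],
    imu_ts_ms: Sequence[float],
    offset_ms: float,
    window_ms: float = 100.0,
) -> float:
    """Percentage of frames with an IMU sample within +/- window_ms after the offset."""
    if not frame_ts_ms:
        return 0.0
    imu = sorted(imu_ts_ms)
    matched = 0
    for t in frame_ts_ms:
        target = t + offset_ms  # video time mapped onto the tlog timeline
        i = bisect_left(imu, target)
        # Only the samples just before and just after `target` can be the nearest.
        neighbours = imu[max(i - 1, 0): i + 1]
        if any(abs(n - target) <= window_ms for n in neighbours):
            matched += 1
    return 100.0 * matched / len(frame_ts_ms)


def validate_offset_from_timestamps(
    frame_ts_ms: Sequence[float],
    imu_ts_ms: Sequence[float],
    offset_ms: float,
    threshold_pct: float = 95.0,
    window_ms: float = 100.0,
) -> int:
    """Return 0 when enough frames match, 2 otherwise (the CLI exit-code mapping)."""
    pct = frame_window_match_pct(frame_ts_ms, imu_ts_ms, offset_ms, window_ms)
    return 0 if pct >= threshold_pct else 2


# Synthetic data: 2 s of 30 fps video; 50 Hz IMU covering 5.0 s .. 7.0 s of tlog time.
frames = [i * 1000.0 / 30.0 for i in range(60)]
imu = [5000.0 + i * 20.0 for i in range(101)]
good = validate_offset_from_timestamps(frames, imu, offset_ms=5000.0)
bad = validate_offset_from_timestamps(frames, imu, offset_ms=60000.0)
```

With the correct 5000 ms offset every frame lands within 10 ms of an IMU sample, so `good` is 0. The 60 s offset pushes every frame far past the last IMU sample, so `bad` is 2.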
## Non-Functional Requirements
- Auto-sync startup overhead p99 ≤ 3 s (within the epic's cold-start ≤ 5 s budget combined with composition).
- Tlog-takeoff detection: full tlog scan ≤ 1 s for tlogs up to 100 MB (typical 12 min clip is ~10 MB).
- Video-motion-onset detection: scan the first 10 s of the video; ≤ 1 s on Tier-1 hardware.
## Constraints
- OpenCV (already in deps for video) is the optical flow library.
- pymavlink (already bundled per D-C8-3) is the tlog reader.
- The take-off pattern thresholds (0.5 g, 1 rad/s, 0.5 s sustained) are in `config.replay.auto_sync.takeoff_*` with documented defaults.
- The video-motion threshold is similarly configurable.
- AC-8's 95 % match threshold is configurable per `config.replay.auto_sync_match_threshold_pct`.
- `ReplayInputAdapter` is a Layer-4 module (per `module-layout.md`); it imports from Layer 1 (`frame_source` interface, `clock` interface, `_types`, `config`, `logging`, `fdr_client`, `helpers.wgs_converter`) and instantiates Layer-4 strategies (`c8_fc_adapter.tlog_replay_adapter`, `frame_source.video_file_frame_source`); it does NOT import from Layer 3 (no component-level dependencies).
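Both detectors rely on a "threshold exceeded and sustained for ≥ 0.5 s" test with configurable values. The sketch below shows that building block for a single scalar signal. The helper name `first_sustained_onset` and the `(timestamp_ns, value)` sample layout are hypothetical, and the real take-off detector additionally requires the attitude-rate condition in the same window.

```python
from typing import Optional, Sequence, Tuple


def first_sustained_onset(
    samples: Sequence[Tuple[int, float]],
    threshold: float,
    min_duration_ns: int,
) -> Optional[int]:
    """Start time of the first run where value > threshold holds for >= min_duration_ns."""
    run_start: Optional[int] = None
    for t_ns, value in samples:
        if value > threshold:
            if run_start is None:
                run_start = t_ns  # a new candidate run begins here
            if t_ns - run_start >= min_duration_ns:
                return run_start
        else:
            run_start = None  # run broken; short impulses are discarded
    return None


STEP_NS = 10_000_000  # 100 Hz synthetic IMU


def synthetic_accel_g(i: int) -> float:
    if 100 <= i < 120:
        return 0.8  # 0.2 s impulse (hand-launch-like), too short to count
    if i >= 200:
        return 1.2  # sustained climb starting at t = 2.0 s
    return 0.1      # background vibration


samples = [(i * STEP_NS, synthetic_accel_g(i)) for i in range(300)]
onset_ns = first_sustained_onset(samples, threshold=0.5, min_duration_ns=500_000_000)
```

Here the 0.2 s impulse at t = 1.0 s is rejected and `onset_ns` is the start of the sustained climb at 2.0 s. A trace that contains only the impulse yields `None`, which a caller could translate into a low confidence score.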
## Risks & Mitigation
- **R-DEMO-1 (drift / unsynchronised recordings)** — *Mitigation*: this task IS the mitigation; AC-1..AC-5 cover the positive cases; AC-6 covers the WARN-and-proceed regime; AC-7 covers the hard-fail regime.
- **R-DEMO-3 (demo footage missing required FC messages)** — *Mitigation*: AC-13 fails fast at startup with a clear message naming the missing types.
- **Risk: optical-flow false-positives on jitter-only video** — *Mitigation*: configurable threshold; sustained-for-0.5 s requirement matches the take-off semantics; AC-2 covers the ambiguous case.
- **Risk: fixed-wing hand-launch hits the WARN regime even on legitimate footage** — *Mitigation*: documented; operator can pass `--time-offset-ms` manually; AC-3 documents the expected confidence drop.
- **Risk: AC-8 95 % threshold too strict for short clips with sparse IMU** — *Mitigation*: threshold is configurable; default 95 % is calibrated for typical tlog rates (50–200 Hz IMU).
- **Risk (new): the coordinator class adds a new architectural seam that might leak `if mode == replay` plumbing into `compose_root`** — *Mitigation*: AZ-401's AC-7 (AST scan) catches this; the coordinator's API surface (open() → bundle) is designed so the composition root sees only standard interfaces past `.open()`.
## Runtime Completeness
- **Named capability**: `replay_input/` Layer-4 coordinator that converges `(video, tlog)` into the standard `FrameSource` + `FcAdapter` + `Clock` surfaces, owning time-alignment between them.
- **Production code**: real OpenCV optical flow, real pymavlink tlog scan, real confidence-scored combined offset, real AC-8 validator, real strategy instantiation, real Clock-pace selection.
- **Allowed external stubs**: test fakes only.
- **Unacceptable substitutes**: a hardcoded `time_offset_ms = 0` default (defeats R-DEMO-1 mitigation); placing the coordinator inside `cli/replay.py` (defeats the Layer-4 separation and forces the CLI to know about strategy instantiation — that belongs in the composition root branch, which itself delegates to `replay_input/`).
## Contract
Implements epic AZ-265 ACs 7 + 8; mitigates R-DEMO-1 + R-DEMO-3. Implements the `ReplayInputAdapter` surface specified in `_docs/02_document/contracts/replay/replay_protocol.md` (v2.0.0). Operationalises the `replay_input/` cross-cutting module from ADR-011.