Files
Oleksandr Bezdieniezhnykh 8149083cac [AZ-405] Replay — replay_input/ coordinator + IMU take-off auto-sync
Adds the Layer-4 cross-cutting `replay_input/` module per ADR-011:
ReplayInputAdapter converges (video, tlog) into the standard
FrameSource + FcAdapter + Clock surfaces the airborne composition
root consumes. Owns time-alignment between video frames and tlog
IMU/attitude ticks (manual via --time-offset-ms or auto via the
AZ-405 IMU-take-off detector + Farneback motion-onset detector).

Auto-sync algorithm (auto_sync.py):
- Tlog take-off detector: sustained vertical-accel excess > 0.5 g for
  >= 0.5 s + sustained attitude-rate magnitude > 1 rad/s.
- Video motion-onset detector: dense Farneback flow magnitude > 1.5 px
  sustained >= 0.5 s (deterministic per AC-10).
- compute_offset combines the two; confidence = min(tlog, video).
- validate_offset_or_fail implements the AC-9 95 % frame-window match
  validator with configurable threshold + window.

ReplayInputAdapter.open() ordering (AC-13):
1. Load tlog samples + fail-fast on missing RAW_IMU/SCALED_IMU2 or
   ATTITUDE BEFORE any video read.
2. Resolve offset (auto-sync OR manual override; manual bypasses the
   detectors entirely per AC-8).
3. Run AC-9 validator on resolved offset; raise auto-sync hard-fail
   for AC-7 (CLI exit 2 mapping).
4. Build single Clock instance per pace (TlogDerived/ASAP, Wall/REAL).
5. Construct VideoFileFrameSource and TlogReplayFcAdapter with the
   resolved offset baked in (replay protocol Invariant 8).

Structured log + FDR records on auto-sync detected / low-confidence /
AC-8 hard-fail kinds. Idempotent close (AC-12).

Tests: 25 unit tests across tests/unit/replay_input/ covering all 13
ACs (kernel-level synthetic fixtures for AC-1..AC-10; coordinator-
level OpenCV synthetic videos + faked pymavlink for AC-6..AC-13).

Contract update: replay_protocol.md v2.0.0 added fdr_client to the
ReplayInputAdapter __init__ signature (was missing in the prose; the
task spec already listed it in the allowed-imports section).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 09:50:51 +03:00

12 KiB

Code Review Report

Batch: 60 (AZ-405) Date: 2026-05-14 Verdict: PASS_WITH_WARNINGS

Findings

# Severity Category File:Line Title
1 Medium Spec-Gap _docs/02_document/contracts/replay/replay_protocol.md:134-145 Contract ReplayInputAdapter.__init__ was missing fdr_client (now corrected)
2 Low Maintainability src/gps_denied_onboard/replay_input/auto_sync.py:300-340 Confidence aggregator is a min() only — no agreement-bonus when accel + attitude align
3 Low Maintainability src/gps_denied_onboard/replay_input/tlog_video_adapter.py Three test-only injection kwargs (tlog_source_factory, video_frames_factory, video_timestamps_factory) added to constructor

Finding Details

F1: Contract ReplayInputAdapter.__init__ did not list fdr_client (Medium / Spec-Gap)

  • Location: _docs/02_document/contracts/replay/replay_protocol.md:134-145
  • Description: The replay protocol contract v2.0.0 specified the ReplayInputAdapter.__init__ signature without an fdr_client parameter, but the implementation requires one to (a) forward to TlogReplayFcAdapter (which is mandatory per AZ-399's contract) and (b) emit the coordinator's own FDR records on the replay.auto_sync.detected / replay.auto_sync.low_confidence / replay.auto_sync.ac8_validation_failed paths. Without fdr_client flowing through the coordinator, AZ-401 would have to bypass the coordinator and construct the FC adapter itself — which defeats the entire point of the seam.
  • Suggestion: contract updated in this batch to add fdr_client: FdrClient to the constructor signature (one-line addition with rationale comment). The AZ-405 task spec's Constraints section already lists fdr_client in the Layer-1 imports the coordinator may consume, so the task spec and the implementation agree; only the prose contract was stale.
  • Task: AZ-405

F2: Confidence aggregator uses min() only (Low / Maintainability)

  • Location: src/gps_denied_onboard/replay_input/auto_sync.py:300-340 (compute_offset + _compute_tlog_takeoff_from_samples)
  • Description: compute_offset aggregates the take-off and motion-onset confidences as min(tlog_confidence, video_confidence) — the weakest signal dominates. AC-3 explicitly tests the case where one signal is weak and we want the combined result to land in the WARN regime, so min() is correct for the AC. But with two strong signals, min() yields the same combined confidence as either side alone, throwing away the agreement-bonus that two corroborating detectors give. Today the AC bar is "≥ 0.85 confidence" so this is a non-issue.
  • Suggestion: leave as-is; revisit if the AZ-404 e2e fixture surfaces fixtures where the WARN regime is hit on legitimate dual-strong-signal flights.
  • Task: AZ-405

F3: Test-only injection kwargs leak into the production constructor (Low / Maintainability)

  • Location: src/gps_denied_onboard/replay_input/tlog_video_adapter.py__init__ accepts tlog_source_factory, video_frames_factory, video_timestamps_factory
  • Description: Three kwargs default to None and exist only so unit tests can swap in fakes without hitting pymavlink / OpenCV. Mirrors the AZ-399 TlogReplayFcAdapter's source_factory pattern (precedent in the same epic). Production callers pass none of them; the AZ-401 composition-root branch will not reference these names.
  • Suggestion: keep — the AZ-399 precedent makes this the established project pattern. Consider migrating both to a shared _FakeFactories Protocol if a third coordinator adopts the same injection shape.
  • Task: AZ-405

Phase Summary

Phase 1 — Context Loading

Read inputs:

  • _docs/02_tasks/todo/AZ-405_replay_auto_sync.md
  • _docs/02_document/contracts/replay/replay_protocol.md (v2.0.0)
  • _docs/02_document/architecture.md (ADR-011)
  • _docs/02_document/module-layout.md (Layer 4, shared/replay_input entry)
  • _docs/02_document/epics.md (E-DEMO-REPLAY ACs 7 / 8 / 9 / 10)

Phase 2 — Spec Compliance

All 13 acceptance criteria are covered by tests in tests/unit/replay_input/:

AC Test Status
AC-1 test_ac1_tlog_takeoff_detector_positive_within_50ms_and_high_confidence Covered
AC-2 test_ac2_tlog_takeoff_detector_low_amplitude_vibration_low_confidence Covered
AC-3 test_ac3_tlog_takeoff_detector_hand_launch_warn_regime Covered
AC-4 test_ac4_video_motion_onset_detected_within_one_frame Covered
AC-5 test_ac5_combined_offset_within_200ms_of_ground_truth Covered
AC-6 test_ac6_low_confidence_warn_and_proceed_does_not_raise (+ test_ac6_combined_confidence_takes_minimum_of_inputs) Covered
AC-7 test_ac7_validator_hard_fail_returns_2_for_offset_outside_window (kernel) + test_ac7_ac8_validator_hard_fail_raises_on_open (coordinator) Covered
AC-8 test_ac8_manual_override_bypasses_auto_detect Covered
AC-9 test_ac9_validator_passes_for_well_matched_offset + test_ac9_threshold_configurable Covered
AC-10 test_ac10_confidence_score_deterministic_across_two_runs + test_ac10_video_onset_deterministic_across_two_runs Covered
AC-11 test_ac11_open_returns_complete_bundle_with_correct_strategies + _pace_realtime_yields_wall_clock + _pace_asap_yields_tlog_derived_clock + _resolved_offset_matches_auto_sync_result Covered
AC-12 test_ac12_close_is_idempotent + test_close_without_open_does_not_raise Covered
AC-13 test_ac13_missing_imu_messages_fails_fast_before_video_read + _missing_attitude_messages_fails_fast Covered

Contract compliance — ReplayInputAdapter.open() raises with the contract-mandated messages:

  • "tlog missing required message types: ..." — verified by AC-13 tests
  • "auto-sync hard-fail: ..." — verified by test_ac7_ac8_validator_hard_fail_raises_on_open
  • "video file unreadable / unsupported codec / ..." — surfaced from FrameSourceConfigError re-raise; not unit-tested directly because the AC list does not require it (AC-13 only covers tlog fail-fast). Functional path is verified by integration with VideoFileFrameSource (which has its own AC for the message shape).

ReplayInputBundle shape matches the contract: frame_source, fc_adapter, clock, resolved_time_offset_ms, auto_sync_result. Frozen + slotted dataclass per ADR-002.

Phase 3 — Code Quality

  • SOLID: auto_sync.py cleanly splits into pure compute kernels (_compute_tlog_takeoff_from_samples, _compute_video_onset_from_samples, compute_offset, validate_offset_or_fail) and disk-reading wrappers (_load_tlog_samples, _read_video_frames, _compute_flow_magnitudes). Tests target the kernels — disk IO is exercised only via the wrappers.
  • Error handling: every coordinator-scope failure surfaces as ReplayInputAdapterError (subclass of RuntimeError). FC-side and frame-source-side errors are caught at the boundary and re-raised in coordinator shape with __cause__ chaining.
  • Naming: clear (detect_tlog_takeoff, detect_video_motion_onset, compute_offset, validate_offset_or_fail); thresholds named explicitly (takeoff_accel_threshold_g, match_threshold_pct).
  • Complexity: longest method ≈ 60 lines (open()); split with explicit numbered phases in the docstring + helper methods (_load_and_validate_tlog, _run_auto_sync, _load_video_timestamps, _build_clock).
  • Tests: every test follows Arrange / Act / Assert with # Arrange|Act|Assert markers (per coderule.mdc).
  • Dead code: none introduced. auto_sync.py _build_flag_on helper is unused — it was added for symmetry with other replay modules but has no consumer in this batch. Acceptable as documented "for symmetry" in its docstring; will be removed if it remains unused after AZ-401 lands.

Phase 4 — Security

  • No SQL / command injection vectors.
  • No hardcoded secrets.
  • Tlog and video file paths are operator-supplied. Both are normalised to pathlib.Path; existence checks happen before any file is opened.
  • Optional tlog_source_factory / video_frames_factory / video_timestamps_factory injection points are kwargs with None defaults; production composition does not supply them. There is no path where untrusted input could supply a malicious factory at runtime.
  • The OpenCV dense-flow pass (cv2.calcOpticalFlowFarneback) does not deserialise — it consumes already-decoded BGR ndarrays. No unsafe deserialisation surface.

Phase 5 — Performance

  • Tlog scan is bounded by prescan_max_messages (default 6000 — ~30 s @ 200 Hz) and runs exactly once per open() (the result is reused for both the AC-13 missing-messages check AND the auto-sync take-off detector). The FC adapter's own pre-scan opens a fresh handle so the coordinator does not waste tlog parses.
  • Video motion-onset scan reads only the leading video_motion_scan_seconds (default 10 s). Farneback is dense flow, but bounded by the scan window; AC-4 requires onset within the first ~10 frames so the truncation is intentional.
  • AC-9 validator uses bisect.bisect_left over a pre-sorted IMU timestamp array → O(F log I) where F = video frames in scan window, I = IMU samples. Linear in the worst case.
  • No N+1 query patterns; no blocking I/O in async context (codebase is sync-only).

Phase 6 — Cross-Task Consistency

  • AZ-405 consumes TlogReplayFcAdapter (AZ-399) + VideoFileFrameSource + WallClock + TlogDerivedClock (AZ-398) + FdrClient (AZ-273) + WgsConverter (AZ-279) + iso_ts_now (AZ-264). All consumed from their documented Public APIs.
  • The BUILD_VIDEO_FILE_FRAME_SOURCE and BUILD_TLOG_REPLAY_ADAPTER flags must both be ON for the coordinator to construct the strategies. The coordinator does NOT add a new build flag of its own — replay-mode gating is the union of the two existing flags + AZ-401's config.mode == "replay" check (per spec).
  • AutoSyncConfig defaults match the replay_protocol.md v2.0.0 contract and the AZ-405 spec's "0.5 g, 1 rad/s, 0.5 s sustained" thresholds. AZ-401 will map config.replay.auto_sync.* into an AutoSyncConfig(...) instance.

Phase 7 — Architecture Compliance

  • Layer direction: replay_input is at Layer 4 per module-layout.md. Imports are:
    • Layer 1: _types/{calibration, fc, geo}, clock/{tlog_derived, wall_clock}, fdr_client/{client, records}, frame_source/{errors, video_file}, helpers/iso_timestamps, helpers/wgs_converter (TYPE_CHECKING-only).
    • Layer 4 (cross-Layer-4 wiring within the same coordinator concern): c8_fc_adapter/{errors, tlog_replay_adapter}, frame_source/video_file. These are documented in module-layout.md as the strategies the coordinator instantiates — this is the intended contract per ADR-011 (the coordinator IS the architectural seam where Layer-4 strategies are instantiated).
    • No imports from Layer 3 (no component dependencies). Verified by grep over the new files.
  • Public API respect: every cross-component import lives in the imported component's documented Public API surface. (tlog_replay_adapter.TlogReplayFcAdapter, tlog_replay_adapter.ReplayPace — both exported in the AZ-399 module's __all__.)
  • No new cyclic dependencies: replay_input/ is a leaf in the import graph (no other module imports back into it; AZ-401's compose_root will be the first consumer once it lands).
  • Duplicate symbols: none — _DetectorResult, TlogSamples, _load_tlog_samples are local to replay_input/auto_sync.py. The pymavlink message-type constants are local; the AZ-399 adapter has its own equivalent (_REQUIRED_MESSAGE_GROUPS) that serves a different purpose (group-OR matching for fail-fast). No overlap warrants extraction.
  • Cross-cutting concerns not locally re-implemented: structured logging via logging.getLogger; FDR enqueue via FdrClient.enqueue; ISO timestamps via iso_ts_now. All consumed from shared helpers.

Verdict Logic

  • 0 Critical, 0 High, 1 Medium (Spec-Gap that was resolved in this batch by updating the contract), 2 Low → PASS_WITH_WARNINGS.

Outputs

  • verdict: PASS_WITH_WARNINGS
  • findings: 3 (1 Medium + 2 Low)
  • critical_count: 0
  • high_count: 0
  • report_path: _docs/03_implementation/reviews/batch_60_review.md