gps-denied-onboard/_docs/02_tasks/todo/AZ-401_replay_compose.md
Oleksandr Bezdieniezhnykh 5fe67023b2 [AZ-329] [AZ-330] [AZ-523] [AZ-524] Batch 44 atomic refactor
2026-05-13 19:42:46 +03:00


Replay — compose_replay(config) + Clock injection (R-DEMO-4)

Task: AZ-401_replay_compose

Name: compose_replay(config) -> ReplayRoot + Clock injection across C1–C5

Description: Implement compose_replay(config: Config) -> ReplayRoot at src/gps_denied_onboard/runtime_root/replay.py (alongside the existing compose_root and compose_operator). It resolves ALL strategies for the replay binary: FrameSource = VideoFileFrameSource; FcAdapter = TlogReplayFcAdapter; Sink = JsonlReplaySink; Clock = TlogDerivedClock (when pace=ASAP) or WallClock (when pace=REALTIME). ALL of C1–C5 are wired with the SAME Public API as the live compose_root (per Invariant 1: no replay-aware branches in components). NO C6/C10/C11/C12 are wired (replay reads a pre-built tile cache; there are no operator-side workflows). Configuration loading (config.yaml) and camera-calibration loading (calib.json) are handled here. The ReplayRoot dataclass holds frame_source, fc_adapter, replay_sink, clock, vio (C1), vpr (C2), rerank (C2.5), matcher (C3), refiner (C3.5), pose_estimator (C4), state_estimator (C5), and a runtime_loop() method that drives the per-frame loop documented in the contract. Build-flag check at startup: the binary refuses to run if any of BUILD_VIDEO_FILE_FRAME_SOURCE, BUILD_TLOG_REPLAY_ADAPTER, BUILD_REPLAY_SINK_JSONL is OFF; all three are mandatory for the replay binary.

Complexity: 3 points

Dependencies: AZ-398 (FrameSource + Clock); AZ-399 (TlogReplayFcAdapter); AZ-400 (JsonlReplaySink); AZ-269 / AZ-270 (config); AZ-263; AZ-266; AZ-272; AZ-390 (E-C8 FcAdapter Protocol that the tlog adapter implements); all C1–C5 epics composed at runtime via their Public APIs: AZ-254 (C1), AZ-255 (C2), AZ-256 (C2.5), AZ-257 (C3), AZ-258 (C3.5), AZ-259 (C4), AZ-260 (C5). Concrete strategy task IDs flow in through each component's composition factory, not through this composition root directly.

Component: replay-composition (epic AZ-265 / E-DEMO-REPLAY), lives in runtime_root/replay.py

Tracker: AZ-401

Epic: AZ-265 (E-DEMO-REPLAY)

Document Dependencies

  • _docs/02_document/contracts/replay/replay_protocol.md — replay composition + runtime loop body.
  • _docs/02_document/module-layout.md — runtime_root.py composition root location.
  • _docs/02_document/architecture.md — ADR-001 / ADR-002 / ADR-009.
  • _docs/02_document/contracts/c5_state/state_estimator_protocol.md — EstimatorOutput consumed by the sink.

Problem

Without this task, the replay-only strategies (FrameSource + Clock + TlogReplayFcAdapter + JsonlReplaySink) have no composition root that wires them with C1–C5; the per-frame runtime loop is undefined; the CLI has nothing to invoke. This is the integration point where replay strategies meet production components.

Outcome

  • src/gps_denied_onboard/runtime_root/replay.py:
    • ReplayPace enum (REALTIME / ASAP).
    • ReplayRoot dataclass (frozen + slots; holds all wired components).
    • compose_replay(config: Config) -> ReplayRoot.
    • ReplayRoot.runtime_loop() -> int (returns exit code; 0 on success, 2 on AC-8 sync-impossible, 1 on any other error).
  • The composition root invokes build_* factories from each component's existing factory module (no new factory APIs are in scope here; they all exist from the C1–C8 epics).
  • Build-flag check at startup: refuses to run if any mandatory replay-only flag is OFF; raises ReplayCompositionError with the OFF-flag list.
  • INFO log on startup: kind="replay.compose_root.ready" with {config_path, calib_path, pace, time_offset_ms, video_path, tlog_path, output_path}.
  • DEBUG log per loop iteration: kind="replay.loop.tick" (every 100 frames).
  • Unit tests: composition resolves + returns ReplayRoot, build-flag check rejects on missing flag, runtime_loop terminates on next_frame() -> None, runtime_loop emits one EstimatorOutput per processed frame, AC-8 sync-impossible exit code 2.

Scope

Included

  • compose_replay body.
  • ReplayRoot dataclass.
  • runtime_loop() driving the per-frame loop documented in the contract.
  • Build-flag check at startup.
  • Configuration + calibration loading (re-uses existing config loader from AZ-269/AZ-270).
  • Unit tests including build-flag rejection + frame-by-frame loop.

Excluded

  • CLI argparse + entrypoint — owned by CLI task.
  • Auto-sync IMU take-off detection — owned by AZ-405 (this task accepts time_offset_ms from config or CLI override).
  • Dockerfile + CI — owned by Docker task.
  • E2E replay fixture test — owned by E2E task.
  • C6/C10/C11/C12 wiring — explicitly NOT included (per epic scope).

Acceptance Criteria

AC-1: ReplayRoot returned with all components wired — compose_replay(valid_config) returns a ReplayRoot with non-None values for all fields (frame_source, fc_adapter, replay_sink, clock, vio, vpr, rerank, matcher, refiner, pose_estimator, state_estimator).

AC-2: Build-flag check — with BUILD_VIDEO_FILE_FRAME_SOURCE=OFF, compose_replay(...) raises ReplayCompositionError("BUILD_VIDEO_FILE_FRAME_SOURCE is OFF; replay binary requires it").
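A minimal sketch of such a check follows. How the build flags reach Python is not stated in this task, so the sketch assumes they arrive as a plain name-to-bool mapping; the function name `check_replay_build_flags`, the `off_flags` attribute, and the plural message wording are hypothetical. Only the three flag names and the single-flag message come from this document.

```python
from collections.abc import Mapping

# The three flags this task names as mandatory for the replay binary.
MANDATORY_REPLAY_FLAGS = (
    "BUILD_VIDEO_FILE_FRAME_SOURCE",
    "BUILD_TLOG_REPLAY_ADAPTER",
    "BUILD_REPLAY_SINK_JSONL",
)


class ReplayCompositionError(RuntimeError):
    """Raised when the replay binary cannot be composed."""

    def __init__(self, message: str, off_flags: tuple[str, ...] = ()) -> None:
        super().__init__(message)
        self.off_flags = off_flags  # hypothetical attribute carrying the OFF-flag list


def check_replay_build_flags(flags: Mapping[str, bool]) -> None:
    """Raise if any mandatory flag is OFF (a missing flag counts as OFF)."""
    off = tuple(name for name in MANDATORY_REPLAY_FLAGS if not flags.get(name, False))
    if not off:
        return
    verb, pronoun = ("is", "it") if len(off) == 1 else ("are", "them")
    raise ReplayCompositionError(
        f"{', '.join(off)} {verb} OFF; replay binary requires {pronoun}", off
    )
```

Collecting every OFF flag before raising lets one failed startup report the full list instead of one flag per attempt.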

AC-3: ASAP → TlogDerivedClock; REALTIME → WallClock — pace=ASAP resolves Clock = TlogDerivedClock; pace=REALTIME resolves Clock = WallClock. Verify via isinstance(replay_root.clock, ...).
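The pace-to-Clock mapping can be sketched as below. The real Clock Protocol and both clock classes come from AZ-398 and are not shown in this task, so the `now_ns` and `advance_to` methods, the `select_clock` helper, and the enum values are stand-ins assumed for illustration.

```python
import time
from enum import Enum
from typing import Protocol


class ReplayPace(Enum):
    REALTIME = "realtime"  # string values are assumed
    ASAP = "asap"


class Clock(Protocol):
    """Hypothetical minimal Clock surface; the real one is defined in AZ-398."""

    def now_ns(self) -> int: ...


class WallClock:
    """Stand-in: reports host monotonic time."""

    def now_ns(self) -> int:
        return time.monotonic_ns()


class TlogDerivedClock:
    """Stand-in: time only moves when a tlog timestamp is fed in."""

    def __init__(self) -> None:
        self._now_ns = 0

    def advance_to(self, timestamp_ns: int) -> None:
        # Never step backwards, even if log records arrive out of order.
        self._now_ns = max(self._now_ns, timestamp_ns)

    def now_ns(self) -> int:
        return self._now_ns


def select_clock(pace: ReplayPace) -> Clock:
    """ASAP replays on log time; REALTIME replays on wall time."""
    if pace is ReplayPace.ASAP:
        return TlogDerivedClock()
    if pace is ReplayPace.REALTIME:
        return WallClock()
    raise ValueError(f"unsupported pace: {pace!r}")
```

The selection happens once in the composition root, so no component needs to know which pace is active.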

AC-4: Runtime loop terminates on EOS — wire a FakeFrameSource returning 10 frames + None; call runtime_loop(); assert it returns 0 after exactly 10 frame cycles.

AC-5: One EstimatorOutput per frame — drive 10 frames; assert JsonlReplaySink.emit was called exactly 10 times with EstimatorOutput instances.
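AC-4 and AC-5 can be exercised together with a sketch like the one below. The contract's per-frame body is not reproduced in this task, so the whole C1–C5 pipeline is collapsed into a single hypothetical `process_frame` callable, and the fakes and the `EstimatorOutput` stand-in carry only what the test needs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class EstimatorOutput:
    """Stand-in for the C5 output type; the real fields are in the C5 contract."""

    frame_index: int


@dataclass
class FakeFrameSource:
    """Yields `remaining` dummy frames, then None to signal end of stream."""

    remaining: int

    def next_frame(self) -> Optional[bytes]:
        if self.remaining == 0:
            return None
        self.remaining -= 1
        return b"frame"


@dataclass
class FakeReplaySink:
    """Records every emitted output so a test can count them."""

    emitted: list = field(default_factory=list)

    def emit(self, output: Any) -> None:
        self.emitted.append(output)


def runtime_loop(
    frame_source: FakeFrameSource,
    process_frame: Callable[[int, bytes], EstimatorOutput],
    sink: FakeReplaySink,
) -> int:
    """Pull frames until the source returns None; emit one output per frame."""
    frame_index = 0
    while (frame := frame_source.next_frame()) is not None:
        sink.emit(process_frame(frame_index, frame))
        frame_index += 1
    return 0  # exit code for a clean end of stream
```

A test then wires `FakeFrameSource(10)`, runs the loop, and checks both the return value and the number of recorded outputs.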

AC-6: AC-8 sync-impossible exit code 2 — wire a tlog adapter that reports < 95 % frame-window match (auto-sync hard-fail per AC-8 of the epic); runtime_loop() returns 2.
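One way to map failures onto the three exit codes is sketched here. The exception name `ReplaySyncImpossibleError` and the helpers `require_sync` and `run_with_exit_code` are hypothetical; only the codes 0/1/2 and the 95 % threshold come from this document.

```python
from typing import Callable

EXIT_OK = 0
EXIT_ERROR = 1
EXIT_SYNC_IMPOSSIBLE = 2
MIN_MATCH_PERCENT = 95  # below this frame-window match, auto-sync hard-fails


class ReplaySyncImpossibleError(RuntimeError):
    """Hypothetical signal that video and tlog cannot be aligned."""


def require_sync(matched_frames: int, total_frames: int) -> None:
    """Raise when fewer than 95 % of frame windows matched."""
    # Integer arithmetic avoids float rounding exactly at the threshold.
    if total_frames <= 0 or matched_frames * 100 < MIN_MATCH_PERCENT * total_frames:
        raise ReplaySyncImpossibleError(
            f"only {matched_frames}/{total_frames} frame windows matched"
        )


def run_with_exit_code(loop_body: Callable[[], None]) -> int:
    """Translate the loop outcome into a process exit code."""
    try:
        loop_body()
    except ReplaySyncImpossibleError:
        return EXIT_SYNC_IMPOSSIBLE
    except Exception:
        return EXIT_ERROR
    return EXIT_OK
```

Catching the sync error before the generic `Exception` clause keeps exit code 2 distinguishable from every other failure.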

AC-7: Composition uses Public APIs only — assert that compose_replay imports ONLY __init__.py re-exports of each component (per module-layout.md Layer-3 / Layer-4 rules). CI-style check via AST scan in the unit test.

AC-8: No C6/C10/C11/C12 imports — assert that compose_replay does NOT import any symbol from components.c6_tile_cache, components.c10_provisioning, components.c11_tilemanager, components.c12_operator_orchestrator (per epic scope).
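The AST scans behind AC-7 and AC-8 could be sketched as follows, using only the standard-library `ast` module. The four forbidden package names come from AC-8; the assumption that every component lives directly under a `components` package, and the helper names, are illustrative only.

```python
import ast

# Component packages the replay composition root must not import (AC-8).
FORBIDDEN_COMPONENTS = frozenset(
    {"c6_tile_cache", "c10_provisioning", "c11_tilemanager", "c12_operator_orchestrator"}
)


def _imports(source: str) -> list[tuple[str, tuple[str, ...]]]:
    """Return (module, imported_names) for every import statement in source."""
    found: list[tuple[str, tuple[str, ...]]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.extend((alias.name, ()) for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; treat it as empty.
            found.append((node.module or "", tuple(a.name for a in node.names)))
    return found


def forbidden_imports(source: str) -> list[str]:
    """Modules whose dotted path or imported names touch a forbidden component."""
    offenders = []
    for module, names in _imports(source):
        if (set(module.split(".")) | set(names)) & FORBIDDEN_COMPONENTS:
            offenders.append(module)
    return offenders


def deep_component_imports(source: str) -> list[str]:
    """Modules that reach below components.<name>, bypassing __init__ re-exports."""
    offenders = []
    for module, _names in _imports(source):
        parts = module.split(".")
        if "components" in parts and len(parts) - parts.index("components") > 2:
            offenders.append(module)
    return offenders
```

In a unit test, the source of runtime_root/replay.py would be read from disk and both functions asserted to return an empty list.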

AC-9: Configuration + calibration loading — compose_replay(config_with_invalid_calib_path) raises ReplayCompositionError("camera-calibration not found at ...").
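A minimal sketch of the calibration-path check is shown below. The function name `load_camera_calibration`, the dict return type, and the invalid-JSON branch are assumptions; this task only specifies the "not found" error, and the actual calibration schema is not described here.

```python
import json
from pathlib import Path
from typing import Any, Union


class ReplayCompositionError(RuntimeError):
    """Raised when the replay binary cannot be composed."""


def load_camera_calibration(calib_path: Union[str, Path]) -> dict[str, Any]:
    """Load calib.json, converting a missing file into a composition error."""
    path = Path(calib_path)
    if not path.is_file():
        raise ReplayCompositionError(f"camera-calibration not found at {path}")
    try:
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as exc:
        # Assumed behaviour: a malformed file also fails composition.
        raise ReplayCompositionError(
            f"camera-calibration at {path} is not valid JSON"
        ) from exc
```

Failing at composition time keeps a bad path from surfacing later as an obscure error inside the per-frame loop.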

AC-10: Single-Clock invariant — assert that the same Clock instance is injected into all components that need one (no two distinct Clock instances per process); check via id() comparison across consumers.
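The id() comparison can be sketched like this. It assumes each clock consumer exposes its injected clock as a `clock` attribute, which this task does not confirm; `FakeClock`, `FakeConsumer`, and `distinct_clock_ids` are hypothetical test helpers.

```python
from typing import Any, Iterable


class FakeClock:
    """Stand-in clock; only its identity matters for this check."""

    def now_ns(self) -> int:
        return 0


class FakeConsumer:
    """Stand-in for any component that receives an injected clock."""

    def __init__(self, clock: Any) -> None:
        self.clock = clock  # assumed attribute name


def distinct_clock_ids(consumers: Iterable[Any]) -> set[int]:
    """Collect the identity of every consumer's clock."""
    return {id(consumer.clock) for consumer in consumers}


if __name__ == "__main__":
    shared = FakeClock()
    consumers = [FakeConsumer(shared) for _ in range(3)]
    # One identity means one Clock instance is shared by every consumer.
    assert len(distinct_clock_ids(consumers)) == 1
```

Comparing identities rather than equality catches the case where two separately constructed clocks happen to report the same time.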

Non-Functional Requirements

  • compose_replay p99 ≤ 1 s (one-time startup cost; epic NFT cold-start ≤ 5 s).
  • runtime_loop() per-frame overhead (NOT counting C1–C5 work) p99 ≤ 1 ms.
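To check a p99 budget such as the ones above, a test needs a percentile definition and a timing helper. The sketch below uses the nearest-rank percentile, which is one common choice and not something this task prescribes; both helper names are hypothetical.

```python
import time
from typing import Callable, Sequence


def percentile_nearest_rank(samples: Sequence[int], percent: int) -> int:
    """Nearest-rank percentile: smallest sample with at least percent % at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in 1..100")
    ordered = sorted(samples)
    rank = (percent * len(ordered) + 99) // 100  # integer ceil(percent * n / 100)
    return ordered[rank - 1]


def measure_ns(step: Callable[[], None], iterations: int) -> list[int]:
    """Time `step` repeatedly and return each duration in nanoseconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        step()
        durations.append(time.perf_counter_ns() - start)
    return durations
```

For the per-frame budget, `step` would run one loop iteration with no-op fakes in place of C1–C5, and the assertion would compare `percentile_nearest_rank(durations, 99)` against 1_000_000 ns.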

Constraints

  • ADR-001 / ADR-002 / ADR-009 unchanged.
  • Public API discipline (Layer-3 / Layer-4 from module-layout.md).
  • C1–C5 components MUST remain mode-agnostic (Invariant 1 enforced by AST scan in AZ-404).
  • All time-driven logic uses injected Clock (Invariant 2).
  • NO HTTP server in the replay binary (per epic scope).

Risks & Mitigation

  • R-DEMO-4 (production C1–C5 paths bake real-time-cadence assumptions). Mitigation: Clock injection (Invariant 2). Documented as an ADR amendment in the next architecture-doc cycle.
  • Risk: the composition root is the single biggest churn surface for new components. Mitigation: re-use existing per-component build_* factories; this task does NOT introduce new factory APIs.
  • Risk: builders fail in subtle ways under build-flag combinations. Mitigation: AC-2 + AC-7 + AC-8 cover the failure modes; unit-test-grade build-flag matrix on every PR.

Runtime Completeness

  • Named capability: replay-binary composition root + per-frame runtime loop.
  • Production code: real strategy resolution, real ReplayRoot dataclass, real runtime loop, real build-flag check.
  • Allowed external stubs: test fakes only (FakeFrameSource, FakeFcAdapter, FakeReplaySink) for unit tests.
  • Unacceptable substitutes: hardcoding strategies in the loop body (defeats ADR-009); embedding component-construction logic in the loop (defeats single-responsibility).

Contract

Implements _docs/02_document/contracts/replay/replay_protocol.md — replay composition + runtime loop.