Contract: Replay Mode (FrameSource + ReplaySink + Clock + replay composition)
Owner: replay (epic AZ-265 / E-DEMO-REPLAY) — strategies live inside existing components (frame_source/, c8_fc_adapter/); only the composition root and CLI are net-new top-level files.
Producer task: AZ-398 (FrameSource Protocol + VideoFileFrameSource + LiveCameraFrameSource retrofit + Clock Protocol)
Consumer tasks: AZ-399 (TlogReplayFcAdapter), AZ-400 (ReplaySink + JsonlReplaySink), AZ-401 (compose_replay + Clock injection), AZ-402 (gps-denied-replay CLI), AZ-403 (Dockerfile + CI matrix + SBOM diff), AZ-404 (E2E replay fixture test), AZ-405 (Auto-sync IMU take-off detection).
Version: 1.0.0
Status: draft
Last Updated: 2026-05-10
Module-layout home:
- src/gps_denied_onboard/frame_source/interface.py, __init__.py: FrameSource Protocol (Layer 1 cross-cutting per module-layout.md).
- src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py: TlogReplayFcAdapter (gated BUILD_TLOG_REPLAY_ADAPTER).
- src/gps_denied_onboard/components/c8_fc_adapter/replay_sink.py: ReplaySink interface + JsonlReplaySink (gated BUILD_REPLAY_SINK_JSONL).
- src/gps_denied_onboard/clock/interface.py, __init__.py: Clock Protocol.
- src/gps_denied_onboard/runtime_root/replay.py: compose_replay(config) -> ReplayRoot.
Purpose
Defines the public interfaces enabling offline replay mode per epic AZ-265: run the production C1–C5 pipeline against historical inputs (1–2 min Derkachi-style clip + matching pymavlink .tlog) so the parent-suite UI demo has end-to-end fidelity equal to a live flight. Production C1–C5 components MUST remain mode-agnostic — replay-aware logic lives ONLY in the composition root, the new strategies, and the CLI. The replay binary is a fourth Docker image (gps-denied-replay-cli) containing C1–C5 + replay strategies but NOT C6/C10/C11/C12 (no operator-side workflows; tile cache is read pre-built).
This contract defines four Protocols and the replay composition surface:
- FrameSource: the formalised cross-cutting interface for camera-frame ingestion (previously implicit). Two strategies: LiveCameraFrameSource (retrofit; existing camera plumbing renamed and put behind the Protocol) and VideoFileFrameSource (replay-only, gated BUILD_VIDEO_FILE_FRAME_SOURCE).
- Clock: the wall-clock vs. tlog-derived time abstraction (R-DEMO-4 mitigation). Two strategies: WallClock (live/research/operator) and TlogDerivedClock (replay only).
- ReplaySink: the offline EstimatorOutput consumer interface. One strategy: JsonlReplaySink (one EstimatorOutput per JSONL line; gated BUILD_REPLAY_SINK_JSONL).
- TlogReplayFcAdapter: replay-only FcAdapter strategy (per AZ-261 FcAdapter Protocol from _docs/02_document/contracts/c8_fc_adapter/fc_adapter_protocol.md); parses pymavlink .tlog and emits ImuWindow / AttitudeWindow / GpsHealth / FlightStateSignal at tlog-timestamp cadence (or wall-clock-paced per --pace). Gated BUILD_TLOG_REPLAY_ADAPTER.
The shared WgsConverter (AZ-279) is constructor-injected into the tlog adapter for tlog-GPS → local-tangent-plane conversion.
Public API
Protocol: FrameSource
    @runtime_checkable
    class FrameSource(Protocol):
        def next_frame(self) -> NavCameraFrame | None: ...   # None on end-of-stream
        def close(self) -> None: ...
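As an illustration of what satisfying this Protocol might look like, the sketch below uses a hypothetical in-memory strategy, ListFrameSource. NavCameraFrame is reduced to a single monotonic_ns field, and the shape of FrameSourceError is assumed; neither reflects the real definitions.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class NavCameraFrame:
    """Simplified stand-in; the real DTO carries image data and more."""
    monotonic_ns: int


class FrameSourceError(Exception):
    """Assumed shape of the error named in the invariants."""


@runtime_checkable
class FrameSource(Protocol):
    def next_frame(self) -> Optional[NavCameraFrame]: ...
    def close(self) -> None: ...


class ListFrameSource:
    """Hypothetical in-memory strategy, for illustration and tests only."""

    def __init__(self, frames: Iterable[NavCameraFrame]) -> None:
        self._frames: Iterator[NavCameraFrame] = iter(frames)
        self._last_ns: Optional[int] = None

    def next_frame(self) -> Optional[NavCameraFrame]:
        frame = next(self._frames, None)
        if frame is None:
            return None  # stream permanently exhausted
        if self._last_ns is not None and frame.monotonic_ns < self._last_ns:
            # Out-of-order frames are an error, never silently dropped.
            raise FrameSourceError(
                f"frame at {frame.monotonic_ns} ns precedes {self._last_ns} ns"
            )
        self._last_ns = frame.monotonic_ns
        return frame

    def close(self) -> None:
        self._frames = iter(())


source = ListFrameSource([NavCameraFrame(10), NavCameraFrame(10), NavCameraFrame(20)])
assert isinstance(source, FrameSource)  # structural check via runtime_checkable
```

Because the Protocol is structural, ListFrameSource does not inherit from FrameSource; the isinstance check only confirms that both methods are present.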
Protocol: Clock
    @runtime_checkable
    class Clock(Protocol):
        def monotonic_ns(self) -> int: ...
        def time_ns(self) -> int: ...                          # wall-clock (UTC) for log timestamps
        def sleep_until_ns(self, target_ns: int) -> None: ...  # honoured in --pace realtime; no-op in --pace asap
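The two strategies could be sketched roughly as follows. This is an assumption-laden illustration, not the contract's implementation: target_ns is taken to be on the monotonic timebase, and advance_to is a hypothetical hook through which the tlog adapter would move replay time forward.

```python
import time
from typing import Protocol, runtime_checkable


@runtime_checkable
class Clock(Protocol):
    def monotonic_ns(self) -> int: ...
    def time_ns(self) -> int: ...
    def sleep_until_ns(self, target_ns: int) -> None: ...


class WallClock:
    """Sketch of the live strategy: delegates to the OS clocks."""

    def monotonic_ns(self) -> int:
        return time.monotonic_ns()

    def time_ns(self) -> int:
        return time.time_ns()

    def sleep_until_ns(self, target_ns: int) -> None:
        # Assumes target_ns is on the monotonic timebase; loops in case
        # the OS wakes the thread slightly early.
        while (remaining_ns := target_ns - time.monotonic_ns()) > 0:
            time.sleep(remaining_ns / 1e9)


class TlogDerivedClock:
    """Sketch of the replay strategy: time moves only when the tlog does."""

    def __init__(self, start_monotonic_ns: int, start_utc_ns: int) -> None:
        self._monotonic_ns = start_monotonic_ns
        self._utc_offset_ns = start_utc_ns - start_monotonic_ns

    def advance_to(self, monotonic_ns: int) -> None:
        # Hypothetical hook for the tlog adapter; never moves backwards.
        self._monotonic_ns = max(self._monotonic_ns, monotonic_ns)

    def monotonic_ns(self) -> int:
        return self._monotonic_ns

    def time_ns(self) -> int:
        return self._monotonic_ns + self._utc_offset_ns

    def sleep_until_ns(self, target_ns: int) -> None:
        # ASAP pacing: a no-op, as the Protocol comment specifies.
        return None
```

Both classes satisfy the same Protocol structurally, so a component holding a Clock cannot tell which one it was given.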
Protocol: ReplaySink
    @runtime_checkable
    class ReplaySink(Protocol):
        def emit(self, output: EstimatorOutput) -> None: ...
        def close(self) -> None: ...
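A JSONL sink satisfying this Protocol might look like the sketch below. To stay self-contained it swaps the standard-library json module in for orjson, and the EstimatorOutput fields shown are hypothetical placeholders, not the real schema.

```python
import dataclasses
import json
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class EstimatorOutput:
    """Simplified stand-in; these field names are hypothetical."""
    monotonic_ns: int
    north_m: float
    east_m: float


class JsonlReplaySink:
    def __init__(self, path: Path) -> None:
        self._file = open(path, "w", encoding="utf-8")

    def emit(self, output: EstimatorOutput) -> None:
        # Exactly one JSON object plus a newline per call.
        # (stdlib json used here in place of orjson.)
        self._file.write(json.dumps(dataclasses.asdict(output)) + "\n")

    def close(self) -> None:
        if self._file.closed:
            return  # tolerate a second close()
        self._file.flush()
        os.fsync(self._file.fileno())  # force the data to disk before closing
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "results.jsonl"
    sink = JsonlReplaySink(out_path)
    sink.emit(EstimatorOutput(monotonic_ns=1, north_m=2.0, east_m=3.0))
    sink.emit(EstimatorOutput(monotonic_ns=2, north_m=2.5, east_m=3.5))
    sink.close()
    lines = out_path.read_text(encoding="utf-8").splitlines()
```

Making close() idempotent is a choice of this sketch; the contract does not state whether a second close() is allowed.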
Concrete: TlogReplayFcAdapter
    class TlogReplayFcAdapter(FcAdapter):
        def __init__(
            self,
            tlog_path: Path,
            target_fc_dialect: FcKind,            # ARDUPILOT_PLANE | INAV
            clock: Clock,
            wgs_converter: WgsConverter,
            time_offset_ms: int = 0,              # auto-detected by AZ-405 auto-sync task or set via --time-offset-ms
            pace: ReplayPace = ReplayPace.ASAP,   # REALTIME | ASAP
        ): ...
The TlogReplayFcAdapter implements the full FcAdapter Protocol from AZ-261. emit_external_position raises FcEmitError("replay adapter does not emit to FC") (replay is read-only on the FC side; downstream consumers use ReplaySink instead). request_source_set_switch raises SourceSetSwitchNotSupportedError. subscribe_telemetry is the primary surface — fans out IMU/attitude/GPS-health/flight-state from the tlog at the configured pace.
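Two of the behaviours described above, the read-only emit and the construction-time offset, can be illustrated with a heavily reduced sketch. The class name, the (timestamp_ns, message_type) record shape, the callback signature, and the run method are all hypothetical; the real FcAdapter Protocol from AZ-261 is not reproduced here, and adding the offset (rather than subtracting it) is an assumption.

```python
from typing import Callable, Iterable, List, Tuple


class FcEmitError(Exception):
    """Assumed shape of the error named in the contract."""


class ReadOnlyReplayAdapterSketch:
    """Hypothetical reduction of TlogReplayFcAdapter to two behaviours."""

    def __init__(
        self,
        records: Iterable[Tuple[int, str]],  # (timestamp_ns, message_type)
        time_offset_ms: int = 0,
    ) -> None:
        self._records = records
        # Fixed at construction; there is no setter for live re-tuning.
        self._offset_ns = time_offset_ms * 1_000_000
        self._subscribers: List[Callable[[int, str], None]] = []

    def subscribe_telemetry(self, callback: Callable[[int, str], None]) -> None:
        self._subscribers.append(callback)

    def emit_external_position(self, *_args, **_kwargs) -> None:
        # Replay never writes to the flight controller.
        raise FcEmitError("replay adapter does not emit to FC")

    def run(self) -> None:
        # Fan out every record with the offset applied (sign is assumed).
        for timestamp_ns, message_type in self._records:
            for callback in self._subscribers:
                callback(timestamp_ns + self._offset_ns, message_type)


received: List[Tuple[int, str]] = []
adapter = ReadOnlyReplayAdapterSketch([(1_000_000_000, "ATTITUDE")], time_offset_ms=250)
adapter.subscribe_telemetry(lambda t_ns, kind: received.append((t_ns, kind)))
adapter.run()
```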
CLI surface
    gps-denied-replay
        --video PATH
        --tlog PATH
        --output results.jsonl
        --camera-calibration calib.json
        --config config.yaml
        [--pace {realtime,asap}]    # default asap
        [--time-offset-ms N]        # overrides auto-sync
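One way this surface could map onto argparse is sketched below. Treating the five unbracketed options as required, and using None to mean "no manual offset, defer to auto-sync", are assumptions of this sketch, not statements of the contract.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gps-denied-replay")
    parser.add_argument("--video", type=Path, required=True)
    parser.add_argument("--tlog", type=Path, required=True)
    parser.add_argument("--output", type=Path, required=True)
    parser.add_argument("--camera-calibration", type=Path, required=True)
    parser.add_argument("--config", type=Path, required=True)
    parser.add_argument("--pace", choices=("realtime", "asap"), default="asap")
    # None means "no manual override; let auto-sync decide" (assumed).
    parser.add_argument("--time-offset-ms", type=int, default=None)
    return parser


args = build_parser().parse_args([
    "--video", "clip.mp4",
    "--tlog", "flight.tlog",
    "--output", "results.jsonl",
    "--camera-calibration", "calib.json",
    "--config", "config.yaml",
])
```

argparse converts the hyphenated option names to attributes such as args.camera_calibration and args.time_offset_ms.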
Composition root extension
    def compose_replay(config: Config) -> ReplayRoot: ...
ReplayRoot is a dataclass holding all wired components plus the FrameSource, TlogReplayFcAdapter, ReplaySink, and Clock chosen for the replay run. The runtime loop is:
    while True:
        frame = frame_source.next_frame()
        if frame is None:
            break
        c1 = vio.process(frame)                       # C1
        candidates = vpr.lookup(c1)                   # C2
        reranked = rerank.rerank(candidates)          # C2.5
        matched = matcher.match(reranked)             # C3
        refined = refiner.refine_if_needed(matched)   # C3.5
        pose = pose_estimator.estimate(refined)       # C4
        state.add_pose_anchor(pose)                   # C5
        state.add_vio(c1.vio_output)                  # C5
        output = state.current_estimate()
        replay_sink.emit(output)
    replay_sink.close()
The tlog adapter's subscribe_telemetry callbacks are wired to C5's add_fc_imu and to C1's IMU prior on the same threads as in the live binary.
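The loop can be exercised end to end with stub stages, as in the sketch below. Everything other than the method names taken from the loop is hypothetical: the root is a plain namespace instead of the real ReplayRoot dataclass, the stages simply echo their input, and wrapping the loop in try/finally so the sink is closed on error is a choice of this sketch, not something the contract states.

```python
from types import SimpleNamespace


def run_replay(root) -> int:
    """Drive the documented loop; returns the number of frames processed."""
    processed = 0
    try:
        while True:
            frame = root.frame_source.next_frame()
            if frame is None:
                break
            c1 = root.vio.process(frame)
            candidates = root.vpr.lookup(c1)
            reranked = root.rerank.rerank(candidates)
            matched = root.matcher.match(reranked)
            refined = root.refiner.refine_if_needed(matched)
            pose = root.pose_estimator.estimate(refined)
            root.state.add_pose_anchor(pose)
            root.state.add_vio(c1.vio_output)
            root.replay_sink.emit(root.state.current_estimate())
            processed += 1
    finally:
        root.replay_sink.close()  # sketch choice: close even on failure
    return processed


class _Echo:
    """Stub stage: any method call returns its argument unchanged."""
    def __getattr__(self, _name):
        return lambda value: value


class _State:
    def __init__(self):
        self.anchors = 0
    def add_pose_anchor(self, pose):
        self.anchors += 1
    def add_vio(self, vio_output):
        pass
    def current_estimate(self):
        return {"anchors": self.anchors}


class _ListSink:
    def __init__(self):
        self.outputs, self.closed = [], False
    def emit(self, output):
        self.outputs.append(output)
    def close(self):
        self.closed = True


frames = iter([SimpleNamespace(vio_output="vio-0"), SimpleNamespace(vio_output="vio-1")])
echo = _Echo()
root = SimpleNamespace(
    frame_source=SimpleNamespace(next_frame=lambda: next(frames, None)),
    vio=echo, vpr=echo, rerank=echo, matcher=echo, refiner=echo,
    pose_estimator=echo, state=_State(), replay_sink=_ListSink(),
)
processed = run_replay(root)
```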
Invariants
- Mode-agnostic C1–C5: production components MUST NOT contain if replay_mode: branches. Mode-specific behaviour lives in the strategy (Frame source / FC adapter / Sink / Clock). Verified by an explicit grep guard in CI.
- Single Clock per process: the composition root resolves Clock exactly once at startup. All time-driven logic (AC-5.2 fallback timer, STATUSTEXT rate-limits, key rotation logging) consumes the injected Clock via constructor, never time.monotonic_ns() directly. Verified by an AST scan in CI for direct time.monotonic_ns / time.time_ns references in components.
- Frame source ordering: next_frame() returns frames in monotonically non-decreasing monotonic_ns order. Out-of-order frames raise FrameSourceError (NOT silently dropped; replay must be deterministic).
- End-of-stream is None: next_frame() returns None ONLY when the stream is permanently exhausted. Transient I/O failures raise FrameSourceError.
- TlogReplayFcAdapter emit-only-via-sink: emit_external_position and emit_status_text raise FcEmitError("replay adapter does not emit to FC"). Downstream consumers MUST emit to ReplaySink instead.
- Pace mode honoured by Clock: pace=REALTIME → Clock.sleep_until_ns(target_ns) blocks until wall-clock catches up; pace=ASAP → no-op. The pace flag is consumed ONLY by the Clock and the tlog adapter; components see only the Clock Protocol.
- JsonlReplaySink one-line-per-emit: each emit(output) writes exactly one JSON object + newline; the file is fsync'd on close(). Schema matches EstimatorOutput (frozen dataclass serialised via dataclasses.asdict + orjson.dumps).
- Time-offset honoured: when constructed with time_offset_ms != 0, the tlog adapter shifts every emitted timestamp by that offset before passing to subscribers. time_offset_ms is set ONCE at construction (no live re-tuning).
- Build-flag gating: VideoFileFrameSource, TlogReplayFcAdapter, JsonlReplaySink MUST refuse construction when their respective BUILD_* flag is OFF (per ADR-002: replay binary has them ON; airborne / research / operator have them OFF).
- Determinism: same (video, tlog, config, time_offset_ms, pace=ASAP) input → same JSONL output within ≤ 1e-6 float drift in position fields (AC-5).
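The AST scan mentioned for the single-Clock invariant could be approximated as below. This is only a sketch of the idea: the function name is hypothetical, it catches time.monotonic_ns / time.time_ns attribute references and from-imports, and it would miss aliased imports such as import time as t.

```python
import ast
from typing import List, Tuple

FORBIDDEN = {"monotonic_ns", "time_ns"}


def find_direct_time_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line, description) for each direct use of the forbidden clocks."""
    hits: List[Tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and node.attr in FORBIDDEN
            and isinstance(node.value, ast.Name)
            and node.value.id == "time"
        ):
            hits.append((node.lineno, f"time.{node.attr}"))
        elif isinstance(node, ast.ImportFrom) and node.module == "time":
            for alias in node.names:
                if alias.name in FORBIDDEN:
                    hits.append((node.lineno, f"from time import {alias.name}"))
    return sorted(hits)


bad_component = "import time\n\ndef tick():\n    return time.monotonic_ns()\n"
good_component = "def tick(clock):\n    return clock.monotonic_ns()\n"
violations = find_direct_time_calls(bad_component)
```

A CI step could run such a function over every file under the components directory and fail the build when the result is non-empty.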
Producer / Consumer Split
| Task ID | Scope |
|---|---|
| AZ-398 (Producer) | FrameSource Protocol; Clock Protocol; VideoFileFrameSource (gated BUILD_VIDEO_FILE_FRAME_SOURCE); LiveCameraFrameSource retrofit (rename existing camera-ingest plumbing into the Protocol shape — no behaviour change); WallClock + TlogDerivedClock strategies; composition wiring in the existing compose_root/compose_operator (Clock = WallClock there). NO tlog parsing, NO sink, NO replay composition. |
| AZ-399 (Consumer 1) | TlogReplayFcAdapter: pymavlink stream-parser (DO NOT materialise; R-DEMO-2 throughput floor); maps tlog message types → FcTelemetryFrame; supports both AP and iNav dialects; subscribe_telemetry fan-out at the configured pace; respects time_offset_ms; honours Clock for pacing; fail-fast at startup if required message types absent (R-DEMO-3). |
| AZ-400 (Consumer 2) | ReplaySink Protocol + JsonlReplaySink (one JSON object per line; orjson serialiser; close() fsyncs). |
| AZ-401 (Consumer 3) | compose_replay(config) -> ReplayRoot: full strategy resolution for the replay binary; Clock strategy selection (TlogDerivedClock for ASAP, WallClock for REALTIME; documented per R-DEMO-4); FrameSource = VideoFileFrameSource; FcAdapter = TlogReplayFcAdapter; Sink = JsonlReplaySink; ALL of C1–C5 wired with the same Public API as the live binary. NO C6/C10/C11/C12. Configuration loading + camera-calibration loading. |
| AZ-402 (Consumer 4) | gps-denied-replay CLI entrypoint: argparse, config + calibration loader, runtime loop (the loop body documented in this contract above), structured-error exit codes (0=success, 2=AC-8 sync-impossible, 1=any other error). |
| AZ-403 (Consumer 5) | gps-denied-replay-cli Dockerfile (multi-stage; Python + C1–C5 + cpp/* + replay strategies; NO C6/C10/C11/C12; NO HTTP server) + GitHub Actions matrix entry + SBOM diff CI step verifying absence of excluded components per AC-4. |
| AZ-404 (Consumer 6) | E2E replay fixture test: tests/e2e/replay/test_derkachi_1min.py — runs the CLI against a 1–2 min Derkachi clip + matching tlog; asserts AC-3 (≤ 100 m for ≥ 80 % of ticks); gated by RUN_REPLAY_E2E=1 in CI. |
| AZ-405 (Consumer 7) | Auto-sync of video ↔ tlog via IMU take-off detection (AC-7 / AC-8). Take-off pattern: sustained vertical accel > 0.5 g + change in attitude rate > 1 rad/s lasting ≥ 0.5 s (typical quadcopter signature). Confidence-scored; falls back to WARN + best-guess if < 80 %; --time-offset-ms always overrides; AC-8 hard-fail (exit 2) if neither auto-detect nor manual offset produces > 95 % frame-window match. |
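The take-off pattern in AZ-405 can be read as a run-length test over IMU samples. The sketch below assumes that both thresholds must hold over one contiguous run of at least 0.5 s, that vertical acceleration is gravity-compensated, and that the attitude-rate change has already been computed per sample. ImuSample and detect_takeoff_ns are hypothetical names, and confidence scoring and the frame-window match check are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

G = 9.80665  # standard gravity, m/s^2


@dataclass(frozen=True)
class ImuSample:
    """Hypothetical reduced sample; the real ImuWindow DTO differs."""
    t_ns: int
    vertical_accel_mps2: float      # gravity-compensated, up positive (assumed)
    attitude_rate_delta_rps: float  # magnitude of attitude-rate change (assumed)


def detect_takeoff_ns(
    samples: Iterable[ImuSample],
    accel_g: float = 0.5,
    rate_rps: float = 1.0,
    min_duration_s: float = 0.5,
) -> Optional[int]:
    """Return the start time of the first run that meets both thresholds
    for at least min_duration_s, or None if no such run exists."""
    run_start: Optional[int] = None
    for s in samples:
        hit = (
            s.vertical_accel_mps2 > accel_g * G
            and s.attitude_rate_delta_rps > rate_rps
        )
        if not hit:
            run_start = None  # run broken; start over
            continue
        if run_start is None:
            run_start = s.t_ns
        if s.t_ns - run_start >= min_duration_s * 1e9:
            return run_start
    return None


STEP_NS = 10_000_000  # 100 Hz sampling, chosen for the example
quiet = [ImuSample(i * STEP_NS, 0.0, 0.0) for i in range(20)]
climb = [ImuSample(i * STEP_NS, 6.0, 1.5) for i in range(20, 80)]
takeoff_ns = detect_takeoff_ns(quiet + climb)
```

The detected take-off time would then be compared with the corresponding event in the video to derive time_offset_ms; that step is outside this sketch.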
Constraints
- @runtime_checkable on all Protocols; DTOs frozen=True, slots=True.
- Lazy-import per ADR-002 with the new BUILD_VIDEO_FILE_FRAME_SOURCE, BUILD_TLOG_REPLAY_ADAPTER, BUILD_REPLAY_SINK_JSONL flags.
- C1–C5 components MUST remain mode-agnostic (Invariant 1).
- All time-driven logic in components MUST consume the injected Clock (Invariant 2).
- No HTTP server in the replay binary (parent-suite UI shells out to the CLI; defer until subprocess shape is proven insufficient).
- pymavlink bundled unmodified per D-C8-3.
- The tlog parser MUST stream-parse; never materialise the entire tlog into memory (R-DEMO-2; multi-GB tlogs).
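Refusing construction when a build flag is OFF might be sketched as follows. How ADR-002 actually exposes the BUILD_* flags is not stated here; reading them from an environment-style mapping, the ON/OFF string values, and the BuildFlagDisabledError name are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


class BuildFlagDisabledError(RuntimeError):
    """Hypothetical error type; the contract only says construction is refused."""


def require_build_flag(name: str, env: Optional[Mapping[str, str]] = None) -> None:
    # Assumption: flags are visible as ON/OFF strings in an env-style mapping.
    env = os.environ if env is None else env
    if env.get(name, "OFF").upper() != "ON":
        raise BuildFlagDisabledError(
            f"{name} is OFF; this strategy is unavailable in this binary"
        )


class VideoFileFrameSourceSketch:
    """Hypothetical stand-in showing only the gate, not the video decoding."""

    def __init__(self, path: str, env: Optional[Mapping[str, str]] = None) -> None:
        require_build_flag("BUILD_VIDEO_FILE_FRAME_SOURCE", env)
        self.path = path


enabled = VideoFileFrameSourceSketch(
    "clip.mp4", env={"BUILD_VIDEO_FILE_FRAME_SOURCE": "ON"}
)
```

Placing the check first in the constructor means a disabled strategy fails before it touches any file or device.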
Risks / Mitigations
- R-DEMO-1 (tlog ↔ video timestamp drift / unsynchronised recordings): auto-sync via IMU take-off detection (AC-7) + --time-offset-ms manual override. Fixed-wing hand-launch fallback documented.
- R-DEMO-2 (pymavlink slow on multi-GB tlogs): stream-parse, never materialise. Throughput floor benchmarked + documented in CI.
- R-DEMO-3 (demo footage missing required FC messages): TlogReplayFcAdapter.open(...) fails fast at startup, listing missing message types and the components that need them.
- R-DEMO-4 (production C1–C5 paths bake real-time-cadence assumptions): Clock injection (Invariants 1, 2). Documented as ADR amendment in next architecture-doc cycle.
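The R-DEMO-3 fail-fast check amounts to a set difference with a descriptive error. The sketch below assumes a streaming pre-scan has already collected the set of message types present in the tlog. The function and error names are hypothetical, and the mapping of MAVLink message types to components is illustrative only.

```python
from typing import Dict, Iterable, Set


class MissingTlogMessagesError(Exception):
    """Hypothetical error type for the fail-fast check."""


def check_required_messages(
    seen: Iterable[str], required: Dict[str, Set[str]]
) -> None:
    """required maps a message type to the components that need it."""
    seen_set = set(seen)
    missing = sorted(m for m in required if m not in seen_set)
    if missing:
        details = "; ".join(
            f"{m} (needed by {', '.join(sorted(required[m]))})" for m in missing
        )
        raise MissingTlogMessagesError(
            f"tlog lacks required message types: {details}"
        )


# Illustrative mapping only; the real requirements are defined elsewhere.
REQUIRED = {"RAW_IMU": {"C1", "C5"}, "ATTITUDE": {"C5"}}
check_required_messages(["RAW_IMU", "ATTITUDE", "HEARTBEAT"], REQUIRED)  # passes
```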
Notes for the Implementer
- The LiveCameraFrameSource retrofit is a no-op restructure: the existing camera-ingest thread becomes a class implementing FrameSource. Its behaviour is unchanged. This is what allows C1 to consume FrameSource via constructor without becoming replay-aware.
- The TlogReplayFcAdapter's subscribe_telemetry fan-out runs on a dedicated thread (mirroring the live PymavlinkArdupilotAdapter decode-thread semantics). This way C1 and C5 see identical thread boundaries in live and replay.
- The Clock Protocol is the SAME interface in live and replay; only the strategy differs. This is the single Liskov-clean line that lets components consume Clock without knowing the mode.
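To make the last note concrete, the sketch below shows a hypothetical component, StatusTextRateLimiter, that takes a Clock through its constructor and never calls the time module. FakeClock is a test double invented for this example; in a live or replay binary the composition root would pass in WallClock or TlogDerivedClock.

```python
from typing import Optional


class FakeClock:
    """Test double that satisfies the Clock Protocol structurally."""

    def __init__(self) -> None:
        self.now_ns = 0

    def monotonic_ns(self) -> int:
        return self.now_ns

    def time_ns(self) -> int:
        return self.now_ns

    def sleep_until_ns(self, target_ns: int) -> None:
        self.now_ns = max(self.now_ns, target_ns)


class StatusTextRateLimiter:
    """Hypothetical component: allows at most one message per interval."""

    def __init__(self, clock, min_interval_ns: int) -> None:
        self._clock = clock  # injected; the component never imports time
        self._min_interval_ns = min_interval_ns
        self._last_ns: Optional[int] = None

    def allow(self) -> bool:
        now = self._clock.monotonic_ns()
        if self._last_ns is not None and now - self._last_ns < self._min_interval_ns:
            return False
        self._last_ns = now
        return True


clock = FakeClock()
limiter = StatusTextRateLimiter(clock, min_interval_ns=1_000_000_000)
first = limiter.allow()          # nothing sent yet, so allowed
clock.now_ns = 500_000_000
second = limiter.allow()         # only 0.5 s elapsed, so refused
clock.now_ns = 1_000_000_000
third = limiter.allow()          # a full interval elapsed, so allowed
```

The limiter behaves the same way under any Clock strategy, which is the property the note describes.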