[AZ-401] [AZ-400] Replay — compose_root replay-mode branch + transport seam

Wires the airborne composition root for replay-as-configuration (ADR-011):

- compose_root(config) branches on config.mode in {"live", "replay"}.
  Live behaviour is unchanged; replay builds ReplayInputAdapter,
  attaches JsonlReplaySink, and injects NoopMavlinkTransport.
- New private module runtime_root/_replay_branch.py holds the
  replay-only strategy graph + build-flag gate + calibration loader.
- Config gains Config.mode (Literal["live","replay"]) plus
  Config.replay sub-block with nested ReplayAutoSyncConfig that mirrors
  the AZ-405 AutoSyncConfig DTO; YAML loader + ENV map updated.
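
A minimal sketch of the ordering described in the bullets above, assuming toy stand-ins. `ToyConfig`, `build_replay_stub`, `compose_registry_stub`, `compose_root_sketch`, and the `c1_vo` slug are hypothetical names for illustration, not the project's API. It shows only that the replay-only components are built first, seeded into the shared composition step, and then merged back into the returned view:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Literal


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical stand-in for Config; only the mode field matters here."""

    mode: Literal["live", "replay"] = "live"


def build_replay_stub(config: ToyConfig) -> tuple[dict[str, Any], tuple[str, ...]]:
    """Hypothetical replay builder returning (components, construction_order)."""
    components = {"frame_source": "video-file", "mavlink_transport": "noop"}
    return components, tuple(components)


def compose_registry_stub(
    seed: dict[str, Any],
) -> tuple[dict[str, Any], tuple[str, ...]]:
    """Hypothetical registry step; factories can read the pre-seeded entries."""
    constructed = dict(seed)
    # "c1_vo" is an invented slug; it consumes frame_source if one was seeded.
    constructed["c1_vo"] = f"vo(frames={constructed.get('frame_source', 'camera')})"
    # Only the registry-built entries are returned, not the seed.
    return {"c1_vo": constructed["c1_vo"]}, ("c1_vo",)


def compose_root_sketch(config: ToyConfig) -> tuple[dict[str, Any], tuple[str, ...]]:
    if config.mode == "replay":
        replay_components, replay_order = build_replay_stub(config)
    else:
        replay_components, replay_order = {}, ()
    components, order = compose_registry_stub(replay_components)
    # Merge the seed back in so callers get one combined view.
    merged = {**replay_components, **components}
    full_order = replay_order + tuple(s for s in order if s not in replay_order)
    return merged, full_order


live_components, live_order = compose_root_sketch(ToyConfig())
replay_components, replay_order = compose_root_sketch(ToyConfig(mode="replay"))
print(live_order)
print(replay_order)
```

In this sketch the live path never touches the replay builder, which is the property the "Live behaviour is unchanged" bullet relies on.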

Absorbs the AZ-400 transport-seam retrofit that AZ-401 strictly
required but AZ-400 had not delivered:

- New MavlinkTransport Protocol (write/bytes_written/close).
- NoopMavlinkTransport (replay; build-flag gated, idempotent close,
  thread-safe byte counter).
- SerialMavlinkTransport (live; a behaviour-preserving restructure of
  the existing pymavlink byte path. Retrofitting the encoders to
  actually use it is the AZ-558 follow-up).
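
A minimal sketch of what such a seam could look like, assuming the three members named above. `MavlinkTransportSketch`, `NoopTransportSketch`, and `_burst` are hypothetical names; whether `bytes_written` is a property or a method, and the exact `write` return type, are assumptions rather than the committed signatures:

```python
from __future__ import annotations

import threading
from typing import Protocol


class MavlinkTransportSketch(Protocol):
    """Hypothetical shape of the seam: write / bytes_written / close."""

    def write(self, data: bytes) -> int: ...

    @property
    def bytes_written(self) -> int: ...

    def close(self) -> None: ...


class NoopTransportSketch:
    """Discards outbound bytes but counts them under a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._bytes_written = 0
        self._closed = False

    def write(self, data: bytes) -> int:
        # The lock keeps the counter consistent across writer threads.
        with self._lock:
            self._bytes_written += len(data)
        return len(data)

    @property
    def bytes_written(self) -> int:
        with self._lock:
            return self._bytes_written

    def close(self) -> None:
        # Idempotent: a second close only re-sets the same flag.
        with self._lock:
            self._closed = True


def _burst(transport: MavlinkTransportSketch, frames: int) -> None:
    """Write a fixed number of 10-byte dummy frames."""
    for _ in range(frames):
        transport.write(b"\xfd" * 10)


transport = NoopTransportSketch()
threads = [threading.Thread(target=_burst, args=(transport, 100)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
transport.close()
transport.close()
print(transport.bytes_written)
```

Four threads each write 100 frames of 10 bytes, so the counter ends at 4000 regardless of interleaving.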

AZ-401 AC-9 (NoopMavlinkTransport.bytes_written > 0 after C8 encoders
run) is BLOCKED on AZ-558 — the encoder routing retrofit is out of
the AZ-401 task envelope (FORBIDDEN files: pymavlink_ardupilot_adapter,
msp2_inav_adapter). AZ-558 spec, batch_61_review.md, and the test's
@pytest.mark.skip rationale all carry the deferral reason.

Tests: 22 compose_root replay-branch tests + 17 transport tests.
Full regression: 2063 passed, 86 environment-skips, 1 documented
skip (AC-9 / AZ-558), 1 pre-existing flaky perf test deselected.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-14 11:55:33 +03:00
parent 8149083cac
commit 17a0d074af
19 changed files with 2156 additions and 45 deletions
+77 -27
@@ -7,9 +7,14 @@ the component graph in dependency order.
 Per-binary entrypoints:
-* :func:`compose_root` - airborne runtime
+* :func:`compose_root` - airborne runtime; serves both ``config.mode == "live"``
+  and ``config.mode == "replay"`` per ADR-011 (replay-as-configuration)
 * :func:`compose_operator` - operator-side tooling (pre-flight, post-landing)
-* :func:`compose_replay` - replay-cli runtime (extension owned by AZ-401)
+Replay is a configuration of :func:`compose_root`, not a separate function:
+the branch on ``config.mode`` lives in :mod:`._replay_branch`. The legacy
+``compose_replay`` export was removed by AZ-401 (ADR-011 supersedes the
+v1.0.0 "replay is a sibling root" design).
 Public surface frozen by
 ``_docs/02_document/contracts/shared_config/composition_root_protocol.md`` v1.0.0.
@@ -24,6 +29,10 @@ from dataclasses import dataclass, field
 from typing import TYPE_CHECKING, Any, Final, Literal, get_args
 from gps_denied_onboard.config import Config, load_config
+from gps_denied_onboard.runtime_root._replay_branch import (
+    CompositionError,
+    build_replay_components,
+)
 from gps_denied_onboard.runtime_root.c12_factory import (
     build_flights_api_client,
 )
@@ -67,6 +76,7 @@ __all__ = [
     "EXIT_FDR_OPEN_FAILURE",
     "EXIT_GENERIC_FAILURE",
     "REQUIRED_ENV_VARS",
+    "CompositionError",
     "ConfigurationError",
     "OperatorRoot",
     "OutboundThreadAlreadyBoundError",
@@ -91,7 +101,6 @@ __all__ = [
     "clear_strategy_registries",
     "clear_strategy_registry",
     "compose_operator",
-    "compose_replay",
     "compose_root",
     "list_registered_fc_strategies",
     "list_registered_gcs_strategies",
@@ -317,8 +326,17 @@ def _compose(
     binary: str,
     allowed_tiers: frozenset[StrategyTier],
     extra_required_env: Iterable[str],
+    pre_constructed: Mapping[str, Any] | None = None,
 ) -> tuple[dict[str, Any], tuple[str, ...]]:
-    """Shared composition path used by ``compose_root`` / ``compose_operator``."""
+    """Shared composition path used by ``compose_root`` / ``compose_operator``.
+    ``pre_constructed`` lets the caller seed the ``constructed`` dict
+    before any registered factory runs — used by the replay-mode branch
+    of :func:`compose_root` to inject the cross-cutting replay
+    strategies (``frame_source``, ``fc_adapter``, ``clock``,
+    ``mavlink_transport``, ``replay_sink``) so any C1-C7 factory that
+    declares a dependency on one finds it already populated.
+    """
     _check_required_env(extra_required=extra_required_env)
     selections = _resolve_component_strategies(config, allowed_tiers)
     resolved: dict[str, _Registration] = {
@@ -326,7 +344,9 @@ def _compose(
         for slug, strategy in selections.items()
     }
     order = _topo_order(resolved.keys(), resolved)
-    constructed: dict[str, Any] = {}
+    constructed: dict[str, Any] = (
+        dict(pre_constructed) if pre_constructed is not None else {}
+    )
     for slug in order:
         registration = resolved[slug]
         try:
@@ -336,7 +356,11 @@ def _compose(
             _close_partial_instances(constructed)
             raise
     _ = binary  # documented but unused beyond labelling the returned root
-    return constructed, tuple(order)
+    # Returned components include only the registry-driven strategies — the
+    # caller is responsible for merging the pre_constructed dict back in if
+    # it wants a single combined view.
+    registry_built = {slug: constructed[slug] for slug in order}
+    return registry_built, tuple(order)
 def _close_partial_instances(instances: Mapping[str, Any]) -> None:
@@ -392,19 +416,61 @@ def _read_strategy_attr(block: Any) -> Any:
     return None
-def compose_root(config: Config) -> RuntimeRoot:
-    """Compose the airborne runtime graph (per contract v1.0.0)."""
+def compose_root(
+    config: Config,
+    *,
+    replay_components_factory: Any | None = None,
+) -> RuntimeRoot:
+    """Compose the airborne runtime graph for ``config.mode``.
+    With ``config.mode == "live"`` (the default) the function behaves
+    exactly as the pre-AZ-401 implementation — every wiring decision is
+    driven by ``config.components[slug].strategy`` against the strategy
+    registry, gated by the airborne tier.
+    With ``config.mode == "replay"`` the function additionally builds
+    the five replay-only strategies (``frame_source``, ``fc_adapter``,
+    ``clock``, ``mavlink_transport``, ``replay_sink``) per
+    :mod:`._replay_branch` and merges them into the components dict
+    BEFORE the registry-driven C1-C7+C13 strategies run, so any
+    component factory that consumes one of the five via ``constructed``
+    finds it already populated. C1-C7+C13 strategies are wired
+    identically to live mode (replay protocol Invariant 1).
+    The ``replay_components_factory`` keyword is a test-only injection
+    point — production callers omit it. Tests pass a callable returning
+    ``(components, construction_order)`` so the unit suite does not
+    have to satisfy the full OpenCV / pymavlink / FDR side-effects of
+    the real strategies.
+    """
+    extra_env = (
+        ("MAVLINK_SIGNING_KEY",)
+        if config.mode == "live"
+        else ()
+    )
+    if config.mode == "replay":
+        replay_factory = replay_components_factory or build_replay_components
+        replay_components, replay_order = replay_factory(config)
+    else:
+        replay_components = {}
+        replay_order = ()
     components, order = _compose(
         config,
         binary="airborne",
         allowed_tiers=frozenset({"airborne", "shared"}),
-        extra_required_env=("MAVLINK_SIGNING_KEY",),
+        extra_required_env=extra_env,
+        pre_constructed=replay_components,
     )
+    merged: dict[str, Any] = dict(replay_components)
+    merged.update(components)
+    full_order = tuple(replay_order) + tuple(
+        slug for slug in order if slug not in replay_order
+    )
     return RuntimeRoot(
         binary="airborne",
         profile=os.environ["GPS_DENIED_FC_PROFILE"],
-        components=components,
-        construction_order=order,
+        components=merged,
+        construction_order=full_order,
     )
@@ -424,22 +490,6 @@ def compose_operator(config: Config) -> OperatorRoot:
     )
-def compose_replay(config: Config) -> RuntimeRoot:
-    """Compose the replay-cli runtime graph. Concrete wiring is owned by AZ-401."""
-    components, order = _compose(
-        config,
-        binary="replay-cli",
-        allowed_tiers=frozenset({"airborne", "shared"}),
-        extra_required_env=(),
-    )
-    return RuntimeRoot(
-        binary="replay-cli",
-        profile=os.environ["GPS_DENIED_FC_PROFILE"],
-        components=components,
-        construction_order=order,
-    )
 @dataclass(frozen=True)
 class TakeoffResult:
     """Successful takeoff: writer is open, FC adapter is wired, components started.
@@ -0,0 +1,329 @@
+"""Replay-mode branch of :func:`compose_root` (AZ-401 / E-DEMO-REPLAY).
+Internal module. Owns the wiring that turns a ``config.mode == "replay"``
+:class:`Config` into a :class:`RuntimeRoot` whose components dict carries
+the replay-only strategies (``frame_source``, ``fc_adapter``, ``clock``,
+``mavlink_transport``, ``replay_sink``) plus whatever C1-C7+C13 strategies
+the binary's bootstrap registered against
+:data:`gps_denied_onboard.runtime_root._STRATEGY_REGISTRY`.
+Per replay protocol v2.0.0 (ADR-011): replay is a configuration of the
+single airborne composition root, not a sibling root. The branch lives
+in this module to keep ``runtime_root/__init__.py`` focused on the
+shared composition spine while still exposing exactly one
+``compose_root(config)`` entrypoint.
+Build-flag gates (per replay protocol Invariant 9):
+- ``BUILD_VIDEO_FILE_FRAME_SOURCE`` — required for the
+  :class:`VideoFileFrameSource` instance returned by the coordinator.
+- ``BUILD_TLOG_REPLAY_ADAPTER`` — required for the
+  :class:`TlogReplayFcAdapter` instance returned by the coordinator.
+- ``BUILD_REPLAY_SINK_JSONL`` — shared by the JSONL sink and the noop
+  outbound transport.
+All three default ON in the airborne binary (per ADR-011); flipping any
+OFF disables replay mode without affecting live mode.
+"""
+from __future__ import annotations
+import json
+import os
+from collections.abc import Mapping
+from pathlib import Path
+from typing import TYPE_CHECKING, Any, Final
+from gps_denied_onboard._types.calibration import CameraCalibration
+from gps_denied_onboard._types.fc import FcKind
+from gps_denied_onboard.components.c8_fc_adapter.noop_mavlink_transport import (
+    NoopMavlinkTransport,
+)
+from gps_denied_onboard.components.c8_fc_adapter.replay_sink import (
+    JsonlReplaySink,
+)
+from gps_denied_onboard.config import Config
+from gps_denied_onboard.fdr_client import make_fdr_client
+from gps_denied_onboard.helpers.wgs_converter import WgsConverter
+from gps_denied_onboard.logging import get_logger
+from gps_denied_onboard.replay_input import (
+    AutoSyncConfig,
+    ReplayInputAdapter,
+    ReplayInputBundle,
+)
+from gps_denied_onboard.replay_input.tlog_video_adapter import ReplayPace
+if TYPE_CHECKING:
+    from gps_denied_onboard.fdr_client.client import FdrClient
+__all__ = [
+    "REPLAY_BUILD_FLAGS",
+    "REPLAY_COMPONENT_KEYS",
+    "CompositionError",
+    "build_replay_components",
+]
+_LOG_KIND_READY: Final[str] = "replay.compose_root.ready"
+REPLAY_BUILD_FLAGS: Final[tuple[str, ...]] = (
+    "BUILD_VIDEO_FILE_FRAME_SOURCE",
+    "BUILD_TLOG_REPLAY_ADAPTER",
+    "BUILD_REPLAY_SINK_JSONL",
+)
+REPLAY_COMPONENT_KEYS: Final[tuple[str, ...]] = (
+    "frame_source",
+    "fc_adapter",
+    "clock",
+    "mavlink_transport",
+    "replay_sink",
+)
+class CompositionError(RuntimeError):
+    """Raised when the replay-mode branch refuses to compose a runtime.
+    Carries the human-readable reason (build-flag OFF, missing path,
+    contradictory config) so the caller can surface it in the structured
+    log + on stderr without a second introspection pass.
+    """
+def build_replay_components(
+    config: Config,
+    *,
+    fdr_client_factory: Any | None = None,
+    replay_input_adapter_factory: Any | None = None,
+    sink_factory: Any | None = None,
+    transport_factory: Any | None = None,
+) -> tuple[dict[str, Any], tuple[str, ...]]:
+    """Construct the replay-mode component dict + construction order.
+    The factories are test-only injection points. Production callers
+    (just ``compose_root``) leave them ``None`` so the real constructors
+    run; unit tests pass fakes so they don't have to satisfy the full
+    OpenCV / pymavlink / FDR side-effects of the real strategies.
+    Returns:
+        ``(components, construction_order)`` — the same shape
+        :func:`gps_denied_onboard.runtime_root._compose` returns. The
+        keys are the entries of :data:`REPLAY_COMPONENT_KEYS`; the
+        values are typed strategy instances.
+    """
+    if config.mode != "replay":
+        raise CompositionError(
+            "build_replay_components called with non-replay config "
+            f"(mode={config.mode!r})"
+        )
+    _validate_build_flags()
+    _validate_replay_paths(config)
+    fdr_factory = fdr_client_factory or make_fdr_client
+    fdr_client = fdr_factory("replay_input", config)
+    sink_fdr_client = fdr_factory("c8_fc_adapter.replay_sink", config)
+    bundle = _build_replay_input_bundle(
+        config,
+        fdr_client=fdr_client,
+        adapter_factory=replay_input_adapter_factory,
+    )
+    if sink_factory is not None:
+        sink = sink_factory(config, sink_fdr_client)
+    else:
+        sink = JsonlReplaySink(
+            output_path=Path(config.replay.output_path),
+            fdr_client=sink_fdr_client,
+        )
+    if transport_factory is not None:
+        transport = transport_factory(config)
+    else:
+        transport = NoopMavlinkTransport()
+    components: dict[str, Any] = {
+        "frame_source": bundle.frame_source,
+        "fc_adapter": bundle.fc_adapter,
+        "clock": bundle.clock,
+        "mavlink_transport": transport,
+        "replay_sink": sink,
+    }
+    _log_ready(config, bundle)
+    return components, REPLAY_COMPONENT_KEYS
+def _validate_build_flags() -> None:
+    """Refuse construction when any replay-mode ``BUILD_*`` flag is OFF."""
+    for flag_name in REPLAY_BUILD_FLAGS:
+        raw = os.environ.get(flag_name, "ON").strip().upper()
+        if raw == "OFF":
+            raise CompositionError(
+                f"{flag_name} is OFF; replay mode requires it"
+            )
+def _validate_replay_paths(config: Config) -> None:
+    """Reject empty / missing replay paths early with a precise message."""
+    if not config.replay.video_path:
+        raise CompositionError(
+            "config.replay.video_path is empty; replay mode requires a video path"
+        )
+    if not config.replay.tlog_path:
+        raise CompositionError(
+            "config.replay.tlog_path is empty; replay mode requires a tlog path"
+        )
+    if not config.replay.output_path:
+        raise CompositionError(
+            "config.replay.output_path is empty; replay mode requires an output path"
+        )
+def _build_replay_input_bundle(
+    config: Config,
+    *,
+    fdr_client: "FdrClient",
+    adapter_factory: Any | None,
+) -> ReplayInputBundle:
+    """Build the :class:`ReplayInputAdapter` and call ``open()``."""
+    pace = _resolve_pace(config.replay.pace)
+    target_fc_dialect = _resolve_fc_kind(config.replay.target_fc_dialect)
+    auto_sync = _build_auto_sync_config(config)
+    camera_calibration = _load_camera_calibration(config)
+    wgs_converter = WgsConverter()
+    if adapter_factory is not None:
+        adapter = adapter_factory(
+            config=config,
+            camera_calibration=camera_calibration,
+            target_fc_dialect=target_fc_dialect,
+            wgs_converter=wgs_converter,
+            fdr_client=fdr_client,
+            pace=pace,
+            auto_sync_config=auto_sync,
+        )
+    else:
+        adapter = ReplayInputAdapter(
+            video_path=Path(config.replay.video_path),
+            tlog_path=Path(config.replay.tlog_path),
+            camera_calibration=camera_calibration,
+            target_fc_dialect=target_fc_dialect,
+            wgs_converter=wgs_converter,
+            fdr_client=fdr_client,
+            pace=pace,
+            manual_time_offset_ms=config.replay.time_offset_ms,
+            auto_sync_config=auto_sync,
+        )
+    return adapter.open()
+def _resolve_pace(raw: str) -> ReplayPace:
+    if raw == "asap":
+        return ReplayPace.ASAP
+    if raw == "realtime":
+        return ReplayPace.REALTIME
+    raise CompositionError(
+        f"config.replay.pace={raw!r} not in ('asap', 'realtime')"
+    )
+def _resolve_fc_kind(raw: str) -> FcKind:
+    if raw == "ardupilot_plane":
+        return FcKind.ARDUPILOT_PLANE
+    if raw == "inav":
+        return FcKind.INAV
+    raise CompositionError(
+        f"config.replay.target_fc_dialect={raw!r} not in "
+        "('ardupilot_plane', 'inav')"
+    )
+def _build_auto_sync_config(config: Config) -> AutoSyncConfig:
+    block = config.replay.auto_sync
+    return AutoSyncConfig(
+        takeoff_accel_threshold_g=block.takeoff_accel_threshold_g,
+        takeoff_attitude_rate_threshold_rad_s=(
+            block.takeoff_attitude_rate_threshold_rad_s
+        ),
+        sustained_seconds=block.sustained_seconds,
+        prescan_max_messages=block.prescan_max_messages,
+        video_motion_threshold=block.video_motion_threshold,
+        video_motion_scan_seconds=block.video_motion_scan_seconds,
+        match_threshold_pct=block.match_threshold_pct,
+        match_window_ms=block.match_window_ms,
+        low_confidence_threshold=block.low_confidence_threshold,
+    )
+def _load_camera_calibration(config: Config) -> CameraCalibration:
+    """Read the camera calibration JSON into a :class:`CameraCalibration` DTO.
+    The replay binary uses the SAME calibration file the live binary
+    loads; AZ-401 does not introduce a new on-disk format.
+    """
+    import numpy as np
+    path = config.runtime.camera_calibration_path
+    if not path:
+        raise CompositionError(
+            "config.runtime.camera_calibration_path is empty; replay mode "
+            "requires a camera calibration JSON"
+        )
+    try:
+        blob = json.loads(Path(path).read_text(encoding="utf-8"))
+    except OSError as exc:
+        raise CompositionError(
+            f"failed to read camera calibration from {path!r}: {exc!r}"
+        ) from exc
+    except json.JSONDecodeError as exc:
+        raise CompositionError(
+            f"camera calibration {path!r} is not valid JSON: {exc!r}"
+        ) from exc
+    if not isinstance(blob, Mapping):
+        raise CompositionError(
+            f"camera calibration {path!r} must decode to a mapping; "
+            f"got {type(blob).__name__}"
+        )
+    intrinsics = np.asarray(blob.get("intrinsics_3x3"), dtype=np.float64)
+    if intrinsics.shape != (3, 3):
+        raise CompositionError(
+            f"camera calibration {path!r} 'intrinsics_3x3' must be 3x3; "
+            f"got shape {intrinsics.shape}"
+        )
+    distortion = np.asarray(blob.get("distortion", []), dtype=np.float64)
+    body_to_camera = np.asarray(
+        blob.get("body_to_camera_se3", np.eye(4).tolist()),
+        dtype=np.float64,
+    )
+    return CameraCalibration(
+        camera_id=str(blob.get("camera_id", "replay-camera")),
+        intrinsics_3x3=intrinsics,
+        distortion=distortion,
+        body_to_camera_se3=body_to_camera,
+        acquisition_method=str(blob.get("acquisition_method", "operator")),
+        metadata=dict(blob.get("metadata", {})),
+    )
+def _log_ready(config: Config, bundle: ReplayInputBundle) -> None:
+    log = get_logger("runtime_root.replay_branch")
+    log.info(
+        f"{_LOG_KIND_READY}: pace={config.replay.pace} "
+        f"resolved_offset_ms={bundle.resolved_time_offset_ms}",
+        extra={
+            "kind": _LOG_KIND_READY,
+            "kv": {
+                "video_path": config.replay.video_path,
+                "tlog_path": config.replay.tlog_path,
+                "output_path": config.replay.output_path,
+                "pace": config.replay.pace,
+                "resolved_offset_ms": bundle.resolved_time_offset_ms,
+                "calib_path": config.runtime.camera_calibration_path,
+                "auto_sync_used": bundle.auto_sync_result is not None,
+            },
+        },
+    )
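
The build-flag gate in `_validate_build_flags` compares each flag against the literal `OFF` after stripping whitespace and upper-casing. The sketch below restates only that comparison so its edge cases are easy to see. `disabled_flags` is a hypothetical helper that takes a plain mapping instead of reading `os.environ`; it is not part of the commit:

```python
from __future__ import annotations

from collections.abc import Mapping

REPLAY_BUILD_FLAGS = (
    "BUILD_VIDEO_FILE_FRAME_SOURCE",
    "BUILD_TLOG_REPLAY_ADAPTER",
    "BUILD_REPLAY_SINK_JSONL",
)


def disabled_flags(env: Mapping[str, str]) -> list[str]:
    """Return the flags the gate would reject (hypothetical helper)."""
    return [
        name
        for name in REPLAY_BUILD_FLAGS
        # Same test as the diff: unset means ON, and only "OFF" disables.
        if env.get(name, "ON").strip().upper() == "OFF"
    ]


unset = disabled_flags({})
padded = disabled_flags({"BUILD_TLOG_REPLAY_ADAPTER": " off "})
numeric = disabled_flags({"BUILD_TLOG_REPLAY_ADAPTER": "0"})
print(unset)    # []
print(padded)   # ['BUILD_TLOG_REPLAY_ADAPTER']
print(numeric)  # []
```

Under this comparison, values such as `0` or `false` leave the flag enabled; only a case-insensitive `OFF` trips the gate.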