mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 19:01:14 +00:00
[AZ-401] [AZ-400] Replay — compose_root replay-mode branch + transport seam
Wires the airborne composition root for replay-as-configuration (ADR-011):
- compose_root(config) branches on config.mode in {"live", "replay"}.
Live behaviour is unchanged; replay builds ReplayInputAdapter,
attaches JsonlReplaySink, and injects NoopMavlinkTransport.
- New private module runtime_root/_replay_branch.py holds the
replay-only strategy graph + build-flag gate + calibration loader.
- Config gains Config.mode (Literal["live","replay"]) plus
Config.replay sub-block with nested ReplayAutoSyncConfig that mirrors
the AZ-405 AutoSyncConfig DTO; YAML loader + ENV map updated.
Absorbs the AZ-400 transport-seam retrofit that AZ-401 strictly
required but AZ-400 had not delivered:
- New MavlinkTransport Protocol (write/bytes_written/close).
- NoopMavlinkTransport (replay; build-flag gated, idempotent close,
thread-safe byte counter).
- SerialMavlinkTransport (live, no-op restructure of existing pymavlink
byte path; encoder retrofit to actually USE it is the AZ-558
follow-up).
AZ-401 AC-9 (NoopMavlinkTransport.bytes_written > 0 after C8 encoders
run) is BLOCKED on AZ-558 — the encoder routing retrofit is out of
the AZ-401 task envelope (FORBIDDEN files: pymavlink_ardupilot_adapter,
msp2_inav_adapter). AZ-558 spec, batch_61_review.md, and the test's
@pytest.mark.skip rationale all carry the deferral reason.
Tests: 22 compose_root replay-branch tests + 17 transport tests.
Full regression: 2063 passed, 86 environment-skips, 1 documented
skip (AC-9 / AZ-558), 1 pre-existing flaky perf test deselected.
Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,124 @@
|
||||
# Code Review Report
|
||||
|
||||
**Batch**: 61 (AZ-401, with absorbed AZ-400 transport-seam retrofit)
|
||||
**Date**: 2026-05-14
|
||||
**Verdict**: PASS_WITH_WARNINGS
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | High | Spec-Gap | tests/unit/test_az401_compose_root_replay.py:526 | AC-9 (`NoopMavlinkTransport.bytes_written() > 0` after C8 encoders) is BLOCKED by AZ-399 design choice — test is `pytest.skip` with documented rationale |
|
||||
| 2 | Medium | Scope | src/gps_denied_onboard/components/c8_fc_adapter/serial_mavlink_transport.py | Live transport seam introduced as no-op restructure; existing `PymavlinkArdupilotAdapter` / `Msp2InavAdapter` encoders do NOT yet route bytes through `MavlinkTransport.write()` |
|
||||
| 3 | Low | Maintainability | src/gps_denied_onboard/runtime_root/__init__.py | New optional `replay_components_factory` kwarg (test-injection seam) added to `compose_root` |
|
||||
| 4 | Low | Style | src/gps_denied_onboard/runtime_root/_replay_branch.py | Inline `_load_camera_calibration` helper duplicates the live-mode loader pattern (one-time tax acceptable; share if a third call site appears) |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: AC-9 is BLOCKED — `NoopMavlinkTransport.bytes_written() > 0`** (High / Spec-Gap)
|
||||
|
||||
- Location: `tests/unit/test_az401_compose_root_replay.py:526` (`test_ac9_noop_transport_bytes_written_after_runtime_drive`)
|
||||
- Description: AC-9 demands that after the C8 outbound encoders run in replay mode, `NoopMavlinkTransport.bytes_written() > 0`. The intended path is: C5 emits an `EstimatorOutput` → C8 outbound encoder produces a MAVLink `GPS_INPUT` byte stream → `MavlinkTransport.write(bytes)` records the count. The blocker is at the C8 encoder layer: `TlogReplayFcAdapter` (AZ-399) raises `FcEmitError` on every `emit_external_position()` call rather than routing bytes through a transport seam, and the live-mode `PymavlinkArdupilotAdapter` / `Msp2InavAdapter` adapters call `pymavlink`'s `mavutil.mavlink_connection.mav.gps_input_send(...)` directly — they bypass `MavlinkTransport` entirely. Closing AC-9 requires retrofitting the AP / iNav / QGC encoder code paths to consume `MavlinkTransport`, which AZ-400's original spec scoped but did not deliver. The Protocol seam + both implementations (`NoopMavlinkTransport`, `SerialMavlinkTransport`) are present and unit-tested in `test_az400_mavlink_transport.py` (17 tests passing), so the architectural seam is in place.
|
||||
- Suggestion: file a follow-up task `AZ-401-followup-mavlink-transport-routing` that retrofits `PymavlinkArdupilotAdapter` and `Msp2InavAdapter` (and the Replay FC adapter) to write through `MavlinkTransport.write()` instead of calling pymavlink's `mav.*_send` helpers directly. Keep the AC-9 skip in place with the same blocker reference until that task lands. The skip's `reason` text is the spec for the follow-up.
|
||||
- Task: AZ-401 (deferred)
|
||||
|
||||
**F2: Live transport seam is a no-op restructure** (Medium / Scope)
|
||||
|
||||
- Location: `src/gps_denied_onboard/components/c8_fc_adapter/serial_mavlink_transport.py`
|
||||
- Description: `SerialMavlinkTransport` wraps a pymavlink `mavlink_connection` and forwards bytes via `connection.write(bytes)`. The class is fully implemented and unit-tested (cumulative byte counting, error wrapping for `OSError`, idempotent close, write-after-close rejection). However, the existing live encoders (`PymavlinkArdupilotAdapter`, `Msp2InavAdapter`) still call `connection.mav.gps_input_send(...)` directly — they don't construct or use a `SerialMavlinkTransport`. So the class exists, conforms to the Protocol, and is import-clean — but it is **dormant** in the live path. This is an explicit, deliberate scope reduction: AZ-401's primary goal was the replay-mode branch in `compose_root`, and the AZ-400 retrofit was absorbed only to the minimum extent the replay branch needed. The full live-side retrofit is the same follow-up task as F1.
|
||||
- Suggestion: same as F1 — track via `AZ-401-followup-mavlink-transport-routing`. The follow-up should also flip `SerialMavlinkTransport` from "constructed but never wired" to "the only path live bytes flow through".
|
||||
- Task: AZ-401 (deferred)
|
||||
|
||||
**F3: New `replay_components_factory` kwarg on `compose_root`** (Low / Maintainability)
|
||||
|
||||
- Location: `src/gps_denied_onboard/runtime_root/__init__.py` — `compose_root(config, *, replay_components_factory=None)`
|
||||
- Description: Adds an optional kwarg defaulting to `None`. When `None` (production), `compose_root` calls `_replay_branch.build_replay_components(config)`. When provided (tests), the factory is used instead. This mirrors the established pattern from `replay_input.tlog_video_adapter.ReplayInputAdapter.__init__` (`tlog_source_factory`, `video_frames_factory`, `video_timestamps_factory`) noted in batch 60 review as F3, and from `TlogReplayFcAdapter.__init__` (`source_factory`, AZ-399). Production callers (CLI entrypoint AZ-402, runtime root operator AZ-326) pass none of them.
|
||||
- Suggestion: keep — the precedent is now present in three coordinators. If a fourth adopts the same shape, migrate to a shared `_TestFactories` Protocol.
|
||||
- Task: AZ-401
|
||||
|
||||
**F4: `_load_camera_calibration` duplicates live-mode loader pattern** (Low / Style)
|
||||
|
||||
- Location: `src/gps_denied_onboard/runtime_root/_replay_branch.py` — `_load_camera_calibration(path: Path) -> CameraCalibration`
|
||||
- Description: Reads a JSON calibration file, validates the required keys, and returns a `CameraCalibration`. The live-mode binary will need the same logic when AZ-263 / AZ-326 wire it. There are currently zero other callers of this exact loader (live mode reads its calib via the operator entry point, AZ-326), so the duplication is hypothetical until a second loader is written.
|
||||
- Suggestion: keep inline. When AZ-326 / AZ-326-operator-orchestrator implements its calibration loader, factor into `gps_denied_onboard/_helpers/camera_calibration_loader.py` (Layer-2) and have both call sites reuse it.
|
||||
- Task: AZ-401
|
||||
|
||||
## Phase Summary
|
||||
|
||||
### Phase 1 — Context Loading
|
||||
|
||||
Read inputs:
|
||||
|
||||
- `_docs/02_tasks/todo/AZ-401_replay_compose.md`
|
||||
- `_docs/02_document/contracts/replay/replay_protocol.md` (v2.0.0)
|
||||
- `_docs/02_document/architecture.md` (ADR-011 — replay-as-configuration)
|
||||
- `_docs/02_document/module-layout.md` (Layer 4 + Build-Time Exclusion Map)
|
||||
- `_docs/02_document/epics.md` (E-DEMO-REPLAY ACs 1 / 5 / 9 / 11 / 12)
|
||||
- `_docs/02_document/contracts/c8_fc_adapter/fc_adapter_protocol.md` (the new `MavlinkTransport` Protocol seam)
|
||||
|
||||
### Phase 2 — Spec Compliance
|
||||
|
||||
| AC | Verdict | Test |
|
||||
|----|---------|------|
|
||||
| AC-1 (single composition root, `compose_replay` deleted) | PASS | `test_ac1_compose_replay_no_longer_exported` |
|
||||
| AC-2 (live mode unchanged) | PASS | `test_ac2_live_default_mode_returns_runtime_root_with_no_replay_keys` + `test_ac2_live_explicit_mode_unchanged` |
|
||||
| AC-3 (replay wires VideoFile / TlogReplay / Noop / JsonlReplaySink) | PASS | `test_ac3_replay_mode_wires_five_replay_strategies` |
|
||||
| AC-4 (replay rejects each `BUILD_*` flag OFF; live unaffected) | PASS | `test_ac4_replay_rejects_each_build_flag_off` (parameterized × 3) + `test_ac4_live_with_replay_flag_off_succeeds` |
|
||||
| AC-5 (pace → clock kind; same Clock instance across consumers) | PASS | `test_ac5_replay_pace_asap_uses_tlog_derived_clock` + `test_ac5_replay_pace_realtime_uses_wall_clock` + `test_ac5_clock_single_instance_id_equality` |
|
||||
| AC-6 (JSONL sink emits per tick, 10 frames → 10 lines) | PASS | `test_ac6_jsonl_sink_emits_per_tick_when_runtime_drives_outputs` |
|
||||
| AC-7 (no mode-aware imports outside runtime_root) | PASS | `test_ac7_no_component_imports_video_file_frame_source` + `test_ac7_only_runtime_root_imports_replay_strategies` |
|
||||
| AC-8 (replay branch imports only public APIs / documented deep submodules) | PASS | `test_ac8_replay_branch_imports_only_public_apis` |
|
||||
| AC-9 (NoopMavlinkTransport.bytes_written > 0) | **BLOCKED** | `test_ac9_noop_transport_bytes_written_after_runtime_drive` (skipped with documented reason — see F1) |
|
||||
| AC-10 (replay does not alter C6 cache shape) | PASS (smoke) | `test_ac10_replay_does_not_alter_c6_cache_shape` (full E2E owned by AZ-404) |
|
||||
|
||||
AZ-400 retrofit ACs (Transport Protocol + impls) covered by 17 tests in `tests/unit/c8_fc_adapter/test_az400_mavlink_transport.py`.
|
||||
|
||||
Contract verification: `_docs/02_document/contracts/replay/replay_protocol.md` v2.0.0 §Composition Root + Invariants 1, 5, 9, 11, 12 all match the implementation. `MavlinkTransport` Protocol shape (`write(bytes) -> int`, `bytes_written() -> int`, `close()`) matches the contract documentation in `c8_fc_adapter/fc_adapter_protocol.md`.
|
||||
|
||||
### Phase 3 — Code Quality
|
||||
|
||||
- **SOLID**: `MavlinkTransport` Protocol cleanly separates the "byte sink" responsibility from the encoder; `NoopMavlinkTransport` and `SerialMavlinkTransport` are LSP-substitutable. `compose_root` delegates the replay-specific composition to `_replay_branch.build_replay_components` (single responsibility — `compose_root` just routes by mode).
|
||||
- **Error handling**: every transport write goes through a lock; closed-state is checked; `OSError` from the underlying serial connection is wrapped in `MavlinkTransportError` with the original cause chained via `from exc`. `CompositionError` carries enough context (which flag is OFF, which path is empty) for an operator to diagnose without grepping source.
|
||||
- **Naming**: `_replay_branch.py` is the established convention (private module under `runtime_root/`); `build_replay_components` is a verb-form factory; `REPLAY_BUILD_FLAGS` and `REPLAY_COMPONENT_KEYS` are uppercase constants.
|
||||
- **Complexity**: longest function is `build_replay_components` at ~85 lines, all linear flow with explicit guard clauses. No cyclomatic-complexity-> 10 functions.
|
||||
- **Test quality**: every AC test asserts a meaningful behavior (not just "no error thrown"). The two AST-scan tests (AC-7, AC-8) survive across files via `ast.parse` rather than substring-grep.
|
||||
- **Dead code**: none introduced. The legacy `compose_replay` export was already deleted in a prior batch (greenfield iteration); this batch confirmed that via AC-1.
|
||||
|
||||
### Phase 4 — Security Quick-Scan
|
||||
|
||||
- No SQL strings, no shell-escapes, no `eval` / `exec` / `pickle.loads`.
|
||||
- `_load_camera_calibration` reads operator-controlled JSON and validates keys; treats missing keys as a hard error rather than silently substituting.
|
||||
- Replay paths come from operator-controlled config; no taint surface from external input.
|
||||
- No hardcoded secrets, API keys, or credentials.
|
||||
- No sensitive data logged: the `replay.compose_root.ready` log emits paths and pace, not auth keys.
|
||||
|
||||
### Phase 5 — Performance Scan
|
||||
|
||||
- `compose_root` live-mode path adds **one** `if config.mode == "replay"` check per startup — well under the 50 ms budget the task spec requires.
|
||||
- `NoopMavlinkTransport.write` acquires a `threading.Lock` per call. This matches `SerialMavlinkTransport`'s contract (the live encoders write from a single thread today, but the seam is thread-safe by construction). At an expected emit rate of ≤ 10 Hz, lock contention is irrelevant.
|
||||
- The `_validate_build_flags` helper iterates `REPLAY_BUILD_FLAGS` (length 3) and reads `os.environ` — constant-time at startup; not in any hot path.
|
||||
- `_replay_branch` does no I/O on the live-mode path (it's never imported when `config.mode == "live"` triggers the early return in `compose_root`).
|
||||
|
||||
### Phase 6 — Cross-Task Consistency
|
||||
|
||||
- The AZ-400 retrofit (transport seam) is consistent with AZ-401's replay branch: `_replay_branch.build_replay_components` returns a `mavlink_transport: NoopMavlinkTransport` slot that the (future) C8 encoder retrofit will consume.
|
||||
- `Config.replay.auto_sync` (added here) is a structural mirror of `replay_input.interface.AutoSyncConfig` (added in AZ-405 / batch 60). The two dataclasses have the same field names + defaults; `_replay_branch._build_auto_sync_config` translates between them. If they ever drift, the failure surface is `replay_input.tlog_video_adapter.ReplayInputAdapter.__init__` rejecting an unrecognised kwarg — caught at startup.
|
||||
- No conflicting patterns introduced. The build-flag-gating convention (`os.environ[FLAG] == "ON"`) matches what `JsonlReplaySink.__init__` and `NoopMavlinkTransport.__init__` already do.
|
||||
|
||||
### Phase 7 — Architecture Compliance
|
||||
|
||||
- **Layer direction**: `runtime_root` (Layer-5 cross-cutting composition) imports from `components/*` (Layer-3) and `replay_input/` (Layer-4 cross-cutting coordinator). All Layer-5 → Layer-3/4 — correct direction.
|
||||
- **Public API respect**: `_replay_branch.py` imports the noop transport + JSONL sink via deep paths (`gps_denied_onboard.components.c8_fc_adapter.noop_mavlink_transport`, `...replay_sink`). These are documented exceptions in `module-layout.md` (the strategy modules are owned by C8 but instantiated only in the composition root). The AC-8 test enforces this allowlist mechanically.
|
||||
- **No new cyclic deps**: `_replay_branch.py` is leaf-imported only by `runtime_root/__init__.py`. The import graph stays a DAG.
|
||||
- **Duplicate symbols**: `ReplayConfig` (in `config/schema.py`) and `AutoSyncConfig` (in `replay_input/interface.py`) are distinct DTOs with different responsibilities (config-schema vs. coordinator DTO); the duplication is intentional per the contracts and is mediated by `_build_auto_sync_config` in the replay branch.
|
||||
- **Cross-cutting concerns**: build-flag check is local to `_replay_branch._validate_build_flags` and to each strategy's constructor. Could be factored into a `_helpers/build_flags.py` utility once a fourth call site appears.
|
||||
|
||||
## Verdict Reasoning
|
||||
|
||||
One **High** finding (AC-9 BLOCKED) — would normally drive FAIL. Downgrade to PASS_WITH_WARNINGS reasoning:
|
||||
|
||||
- The blocker is **architectural / scope-shape**, not a regression. The Protocol seam + both implementations are present and tested. The wiring gap (encoders → transport) is a separate retrofit that AZ-400 was supposed to deliver but did not — that gap is now visible (the AC-9 skip reason) instead of hidden.
|
||||
- Closing AC-9 inside this batch would require modifying `pymavlink_ardupilot_adapter.py` and `msp2_inav_adapter.py` (FORBIDDEN per the AZ-401 task envelope — those files are owned by AZ-273 / AZ-294, not by AZ-401 or AZ-400's retrofit allowance).
|
||||
- The recommended path is the follow-up task `AZ-401-followup-mavlink-transport-routing` (see F1 / F2). The AC stays open, the skip carries the spec for the followup, and the batch ships with the seam in place.
|
||||
|
||||
The other findings are all Medium / Low and reflect deliberate scope reductions that are documented in the spec's Excluded section.
|
||||
Reference in New Issue
Block a user