Files
gps-denied-onboard/_docs/03_implementation/reviews/batch_09_review.md
T
Oleksandr Bezdieniezhnykh a61d2d3f4b [AZ-391] C8 inbound: MAVLink + MSP2 decoders + rings + bus + warm-start
Adds the C8 inbound producer side:
- TelemetryRing[T]: bounded drop-oldest ring; first-overflow INFO log
  + monotonic dropped_count.
- SubscriptionBus + SubscriptionHandle: synchronous fan-out, lock-
  released-before-callback to avoid deadlock; subscriber crash caught
  + DEBUG-logged so one bad subscriber cannot kill the decode loop.
- PymavlinkInboundDecoder: pymavlink-based AP decoder for RAW_IMU,
  SCALED_IMU2, ATTITUDE, GPS_RAW_INT, GPS2_RAW, HEARTBEAT, STATUSTEXT.
  Out-of-order drop (Invariant 7) per-kind WARN. STATUSTEXT spoofing
  sentinel promotes subsequent GPS to GpsStatus.SPOOFED within 5 s.
  AC-5.1 warm-start hint cached on first 3D+ fix; embedded into
  every FlightStateSignal.
- Msp2InavInboundDecoder: YAMSPy-based iNav polling decoder for IMU /
  attitude / GPS / flight-state. signed=False always (RESTRICT-COMM-2);
  GpsStatus.SPOOFED is unreachable on iNav.

Adds yamspy>=0.3.3 + pyserial>=3.5 to pyproject.toml.

Tests: 443 pass / 2 skip / 0 fail (+33 in batch 9).

Contract: no drift on fc_adapter_protocol.md v1.0.0; this batch
implements the inbound producer side without changing signatures.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:28:14 +03:00

10 KiB

Batch 09 — Code Review

Batch: 9 of N Task: AZ-391 (C8 inbound subscription path — MAVLink + MSP2 decoders + rings + bus + warm-start hint) Reviewer: autodev (7-phase) Verdict: PASS_WITH_INFO Date: 2026-05-11

Scope

Task Component / Concern Files touched (prod) Files touched (tests)
AZ-391 C8 inbound decoders + rings + subscription bus + warm-start hint components/c8_fc_adapter/{_telemetry_rings.py, _subscription.py, _inbound_mavlink.py, _inbound_msp2.py}, pyproject.toml (yamspy + pyserial) tests/unit/c8_fc_adapter/test_az391_inbound_subscription.py

Phase 1 — AC compliance

AC Coverage
AC-1 AP RAW_IMU decode test_ac1_ap_raw_imu_decode — frame fields match; received_at = monotonic_ns() at decode boundary asserted (non-zero).
AC-2 AP ATTITUDE decode test_ac2_ap_attitude_decode — roll/pitch/yaw forwarded.
AC-3 AP GPS_RAW_INT → GpsHealth test_ac3_ap_gps_fix_type_mapping (7 fix_type values parametrised) + test_ac3_spoofing_sentinel_promotes_to_spoofed (STATUSTEXT sentinel → SPOOFED).
AC-4 AP HEARTBEAT → FlightState test_ac4_heartbeat_to_flight_state (10 (system_status, base_mode) tuples).
AC-5 iNav MSP2 decode + SPOOFED unreachable test_ac5_inav_msp2_imu_decode, test_ac5_inav_msp2_attitude_decode, test_ac5_inav_spoofed_status_unreachable.
AC-6 Ring drop-oldest + first-overflow INFO test_ac6_ring_drop_oldest_and_logs_first_overflow (1000 → 100; dropped=900; exactly 1 INFO record).
AC-7 Multi-subscriber fan-out + cancel test_ac7_multi_subscriber_fan_out_and_cancel + test_ac7_subscriber_exception_does_not_kill_dispatch.
AC-8 Warm-start hint within 1 s test_ac8_warm_start_hint_present_after_first_gps + test_ac8_warm_start_hint_propagates_to_flight_state_signal.
AC-9 Out-of-order drop + WARN test_ac9_out_of_order_dropped_with_warn.
AC-10 Decode-error isolation test_ac10_corrupt_frame_does_not_kill_decoder.

33 new tests added in batch; 443 total in suite (was 410), 2 pre-existing skips, 0 failures.

Phase 2 — Contract drift

  • _docs/02_document/contracts/c8_fc_adapter/fc_adapter_protocol.mdunchanged at v1.0.0. This batch implements the inbound producer side of the contract surface without altering signatures. The subscribe_telemetry and current_flight_state Protocol methods will be wired into the concrete adapter classes by AZ-393 / AZ-394.

Phase 3 — Architectural compliance

  • No new public-API surface: all four new modules are prefixed _ per module-layout.md ("internal helpers MUST start with _"). The c8_fc_adapter/__init__.__all__ gate from batch 8 is still tight — verified by the smoke test from batch 8 (test_internal_modules_not_in_public_all).
  • Decoder dependency direction: decoders import only from _types, _subscription, _telemetry_rings, logging. No upward imports from runtime_root or other components.
  • Source Protocol injection: both decoders take their wire source as an injected Protocol (MAVLinkSource, MspSource), so tests use deterministic in-memory fakes and never touch a real serial port. This matches ADR-009 (interface-first DI) and the AZ-391 constraint "no production stubs".
  • Single decoder thread invariant: both run_decode_loop and run_poll_loop are designed to be the sole reader of their respective source; the rings are pushed only from that thread. The lock is held only for the bookkeeping update; the dispatch fires after the lock is released.
  • Subscriber-crash isolation: implemented at the bus level, not the decoder level — a single subscriber raising during dispatch() cannot kill the decode loop. The bus emits a DEBUG record per crash and continues with the next subscriber.
  • Synchronous fan-out (not per-subscriber thread) — matches the contract's Invariant 8 ("inbound callbacks fire on the decoder thread"). The bus does NOT enforce the 100 µs subscriber budget — slow subscribers must use a non-blocking enqueue on their own thread (documented in _subscription.py).

Phase 4 — Performance & reliability

  • Ring push is O(1) amortised: collections.deque(maxlen=N) provides O(1) bounded push; the overflow check is one length comparison + one counter increment under the ring lock.
  • First-overflow INFO log is single-shot: the _overflow_logged flag is set under the lock, so concurrent producers cannot emit duplicate INFO records.
  • Dispatch path is allocation-light: one list copy of (sub_id, callback) pairs under the lock; no extra allocations per subscriber.
  • Decode-loop resilience: recv_match exceptions caught + DEBUG log + loop continues; per-message decode exceptions caught + DEBUG log + next message processed. Subscriber-callback exceptions caught + DEBUG log + next subscriber called. Three independent containment lines.
  • NFR (1 ms avg IMU callback) verified in test_nfr_perf_imu_callback_under_1ms — typical measured cost ~50 µs on a quiet CPU. The 200 Hz sustain NFR is for the IT/PT tier and not exercised here.
  • Thread-safety smoke (test_nfr_ring_thread_safety_smoke): 5 k producer pushes vs 5 k consumer snapshots — no crash, bounded length preserved.

Phase 5 — Test quality

  • AP decode tests use stub messages with get_type() lambdas — not real pymavlink wire bytes. This is intentional: we are testing OUR decode mapping (FC field → contract enum), not pymavlink's parser. A separate IT-tier test will exercise real-wire decode against SITL.
  • iNav tests use a _FakeMspSource that returns dicts. The production iNav driver wraps yamspy.MSPy whose decode-side returns the same dict shape — read_imu() / read_attitude() / read_gps() / read_flight_state() are the only methods the decoder touches.
  • AC-3 spoofing test exercises the STATUSTEXT → 5 s window → GPS_RAW_INT promotion chain end-to-end.
  • AC-6 ring test asserts exact survivor set (list(range(900, 1000))) — a regression that dropped the wrong items would fail.
  • AC-9 out-of-order test forces a fake _last_ts_ns[kind] of now + 10 s so the next frame is deterministically "older" — no race with real time.
  • AC-10 corruption test asserts BOTH the bad-frame DEBUG log AND the subsequent good-frame dispatch — a regression that crashed the loop on bad frames would fail.

Arrange / Act / Assert pattern consistently applied.

Phase 6 — Logging & FDR coverage

  • TelemetryRing: INFO log on first overflow per ring instance (kind="c8.inbound.ring_overflow", kv={ring_kind, capacity}).
  • SubscriptionBus: DEBUG log on callback crash (kind="c8.inbound.callback_error", kv={sub_id, error}).
  • AP decoder: WARN log on out-of-order frame (kind="c8.inbound.out_of_order_frame_dropped", kv={telemetry_kind, prev_ns, this_ns}); DEBUG log on per-message decode error (kind="c8.inbound.decode_error", kv={msg_type, error}); WARN log on spoofing-sentinel STATUSTEXT (kind="c8.inbound.spoofing_sentinel_seen", kv={text, captured_at}).
  • iNav decoder: WARN log on out-of-order frame (same kind as AP path); DEBUG log on per-tick decode error.
  • No new FDR record kinds — the AC list mentions only WARN/DEBUG logs, not FDR records. Aggregate-counter FDR record per 60 s is a forward-action enhancement noted in the implementation report.

Phase 7 — Security & risk surface

  • iNav signing-key path is impossibleMsp2InavInboundDecoder constructs FcTelemetryFrame(signed=False) unconditionally. The contract's FcAdapter.open(port, signing_key) parameter for iNav is rejected at config-load by FcConfig.__post_init__ (batch 8), so the only way to construct an iNav decoder is with no signing key. Tested by test_ac5_inav_msp2_imu_decode::frame.signed is False.
  • AP signing-flag assumption: PymavlinkInboundDecoder constructs FcTelemetryFrame(signed=True) unconditionally. The actual MAVLink 2.0 signing handshake is AZ-395's responsibility; if signing is NOT enabled, the signed=True here is overly optimistic. Promotion of this to a per-message check (using the pymavlink MAVLink_message.signed attribute) is a forward action when AZ-395 lands. Documented in the informational findings.
  • Spoofing-window text matching is case-insensitive substring: "GPS spoofing" or "GPS jamming" triggers the sentinel. False positives are possible if the FC's STATUSTEXT mentions these phrases in a non-detection context — we accept this in batch 9 scope because the spec leaves the spoofing-detection trigger vague ("vendor-specific telemetry, not always present"). Promotion to an exact-message-id check is a forward action.
  • No silent error suppression: every catch path emits a DEBUG / WARN / INFO record. The bare except Exception in the decode loop is the only way to honour the "decoder MUST NOT crash on a single malformed frame" AC; the matching log record + next-frame continuation provides full observability.
  • YAMSPy is a new external dependency — 38 KiB pure-Python wheel. No native code, no I/O at import time. License: MIT (per the YAMSPy GitHub repo). Reviewed before pinning.

Informational findings (non-blocking)

  1. PymavlinkInboundDecoder sets signed=True unconditionally — the actual signing state will be ground-truthed once AZ-395 lands. The unsigned-AP edge case is theoretical (AP with MAVLink 2.0 routing but signing disabled) and is documented in the contract for a future revision.
  2. Spoofing-sentinel window is hard-coded at 5 s — promotion to FcConfig.spoofing_sentinel_window_s (or per-FC override) is a forward-action contract bump. The unit test for AC-3 uses the default 5 s window.
  3. Aggregate 60 s out-of-order counter is not implemented — the spec's risk mitigation mentions it; we kept the per-drop WARN as the AC-9 contract. The 60 s aggregate could be added in a follow-up without contract drift (the WARN→INFO escalation lives in the logger, not in the public surface).
  4. current_flight_state() is not yet wired to consume the state ring — that wiring lives in AZ-393 / AZ-394's adapter class shells. The ring is exposed read-only as decoder.state_ring so the adapter can peek the latest entry.

Verdict

PASS_WITH_INFO — all 10 ACs satisfied, all tests green, no architectural drift, zero contract changes in this batch. The four informational findings are forward actions / theoretical edge cases tied to upcoming tasks, not blockers.