Implement all 11 `sitl_observer` public surfaces as an offline
FDR-replay strategy (reads JSON fixtures under `${E2E_SITL_REPLAY_DIR}`
instead of live pymavlink/yamspy). Replace 12 per-scenario
`_harness_helpers_implemented` probes with one shared session-scoped
`sitl_replay_ready` fixture in `e2e/tests/conftest.py`.
Net: -636 LoC of duplicated scenario gating, +17 LoC shared fixture,
+38 new unit tests (596 total, up from 558). Includes K=3 cumulative
review for batches 73-75 (PASS).
Co-authored-by: Cursor <cursoragent@cursor.com>
11 KiB
Code Review Report
Batch: 75 — AZ-595 (sitl_observer FDR-replay + scenario probe cleanup) Date: 2026-05-17 Verdict: PASS
Findings
(none)
Findings Sweep
Phase 1 — Context Loading
Loaded the AZ-595 task spec, the pre-implementation sitl_observer.py
(which previously raised NotImplementedError from every surface), the
two consumers that depend on its dataclass shapes (ap_contract_evaluator,
msp_frame_observer), and the 8 scenarios that previously gated on
_harness_helpers_implemented (FT-P-01, FT-P-02, FT-P-03/14, FT-P-04,
FT-P-05, FT-P-07, FT-P-08, FT-P-09-AP, FT-P-09-iNav, FT-P-10, FT-P-11,
FT-N-01, FT-N-02, FT-N-03, FT-N-04). Cross-checked the FDR record wire
schema used by fdr_reader.iter_records (batch 74) to confirm the
single-JSON-payload format the new observer reads matches what a
fixture builder would produce.
Phase 2 — Spec Compliance
| AC | Coverage | Status |
|---|---|---|
AC-1 (replay_dir_available + replay_dir resolve E2E_SITL_REPLAY_DIR env var; absent / unset / missing-dir all surface as falsy) |
test_replay_dir_available_returns_false_when_env_missing, test_replay_dir_available_returns_false_when_env_empty, test_replay_dir_available_returns_false_when_dir_missing, test_replay_dir_available_returns_true_when_dir_exists, test_replay_dir_returns_none_when_env_missing, test_replay_dir_returns_path_when_env_set |
Covered |
AC-2 (every previously-stubbed read surface reads its dedicated JSON fixture, parses into the public dataclass, returns [] / None when fixture absent) |
read_ekf_divergence_events (4 tests), read_gps_health_samples (3 tests), read_consistency_check_events (3 tests), capture_ap_tlog (2 tests), read_ap_parameter (3 tests), observe_inav_tcp_handshake (3 tests), collect_inav_msp_frames (3 tests), query_inav_gps_state (2 tests), get_observer.read_gps_state (3 tests), get_observer.read_parameter (3 tests) |
Covered |
AC-3 (every prepare_sitl_* surface is a no-op when fixture absent and a no-op pass-through when fixture present — the runner is offline-only in batch 75) |
test_prepare_sitl_cold_boot_is_no_op_when_fixture_absent, test_prepare_sitl_cold_boot_is_no_op_with_fixture, test_prepare_sitl_no_gps_is_no_op, test_prepare_sitl_cold_boot_empty_fixture_path_raises |
Covered |
AC-4 (malformed fixture JSON surfaces as ValueError with a file pointer — never silent []) |
test_read_ekf_divergence_events_malformed_raises, test_read_gps_health_samples_malformed_raises, test_read_consistency_check_events_malformed_raises, test_capture_ap_tlog_invalid_json_raises, test_read_ap_parameter_missing_key_raises, test_observe_inav_tcp_handshake_invalid_raises, test_collect_inav_msp_frames_invalid_raises, test_query_inav_gps_state_invalid_raises, test_get_observer_invalid_payload_raises |
Covered |
AC-5 (8 scenarios that gated on _harness_helpers_implemented now consume the shared sitl_replay_ready fixture and skip cleanly when E2E_SITL_REPLAY_DIR is unset) |
conftest sitl_replay_ready fixture (1 session-scoped fixture); 12 refactored scenarios (FT-P-01/02/03/14/04/05/07/08/09-AP/09-iNav/10/11, FT-N-01/02/03/04) — local _harness_helpers_implemented + _NullSink + _NullImuEmitter definitions removed; scenarios depend on sitl_replay_ready: bool and skip with an AZ-595-referencing message |
Covered |
| AC-6 (full suite passes) | 596 passed (+38 from 558 baseline) | Covered |
Phase 3 — Code Quality
- Single responsibility:
sitl_observer.replay_dir/replay_dir_availableown env-var resolution. They are the ONLY readers ofE2E_SITL_REPLAY_DIRin the runner — every downstream surface goes through them.- Each
read_*/capture_*/observe_*/collect_*/query_*surface owns exactly one JSON fixture. The mapping<surface> ↔ <filename>is encoded in the call site, not distributed across helpers, so a fixture-builder author can grepsitl_observer.pyonce to see the full file list. _load_optional_json_listis the only path for "list of events, fixture optional"._load_required_jsonis the only path for "single dict, fixture must exist". Two helpers, two contracts._FdrReplayObserveris a frozen dataclass: the only state is the loaded payload + the fc-adapter kind + host. No mutable state, no I/O after construction.prepare_sitl_cold_boot/prepare_sitl_no_gpsare no-ops in replay mode by design. The docstring explains: live SITL parameter loading is owned by a follow-up live-mode observer, not by the FDR-replay branch.
- No suppressed errors:
- Every JSON parse path raises
ValueErrorwith the offending file path on malformed input. Noexcept Exception: pass, no2>/dev/null, no bareexcept. _load_optional_json_listchecks fixture existence + falls back to[]only when the file is genuinely absent — a present file with malformed JSON still raises. Tested by the_malformed_raisesfamily._load_required_jsonraisesFileNotFoundErroron missing fixture andValueErroron parse failure. The_invalid_raisesfamily of tests covers both branches.
- Every JSON parse path raises
- AAA comment discipline: all 38 new tests use
# Arrange / # Act / # Assert; sections omitted when the test is a single line. - No code comments narrating what code does — the module-level docstring explains the replay strategy and the runtime contract. Per-function docstrings document the fixture filename + dataclass mapping; no inline narration.
- Public boundary: the module imports only stdlib (
os,json,pathlib,dataclasses,typing). Zerofrom gps_denied_onboard ...imports. Confirmed.
Phase 4 — Security
- No new credentials, secrets, or network surface. The whole point of the FDR-replay strategy is that the runner does not touch a live SITL container in unit-test mode — every observer surface resolves to deterministic file I/O over JSON fixtures.
E2E_SITL_REPLAY_DIRenv var is read-only; the runner never writes to it. The path is resolved into aPathand joined with hard-coded filenames — no user-controlled string interpolation into a shell, noeval, nosubprocess.- No
pickle, nomarshal, noyaml.load(unsafe=True): fixtures are pure JSON parsed viajson.loads.
Phase 5 — Performance
- Every surface is O(N) over the fixture content — a JSON file
with N records. For the maximum scenario in batch 75
(
collect_inav_msp_framesfor a 60 s window at ~5 Hz) the fixture would be ≤300 frames, dominated by the JSON parse. - No I/O at module-import time.
replay_dir()resolves the env var on each call — cheap, no caching needed because scenarios only invoke it via the session-scopedsitl_replay_readyfixture. _FdrReplayObserveris a frozen dataclass cached behind the module-levelget_observerfactory. Multiple calls for the same (fc_kind, host) tuple return the same instance without re-parsing the underlying JSON.
Phase 6 — Cross-Task Consistency
- Probe pattern unified: the 12 scenarios that used to define
a local
_harness_helpers_implementedfixture (+_NullSink/_NullImuEmitterhelper classes) now consume the single session-scopedsitl_replay_readyfixture frome2e/tests/conftest.py. Removing the local probes deleted ~636 lines of duplicate gating code in exchange for ~17 lines of shared fixture — net -619 LoC across the scenario suite. - Skip-message pattern unified: every refactored scenario
now emits a skip message of the form
"FT-X-Y full scenario requires \E2E_SITL_REPLAY_DIR` to point at a prepared SITL replay fixture (AZ-595). Pure-logic ACs covered by ."Greppingsitl_replay_ready` returns exactly the 12 scenarios plus the conftest fixture — no orphaned uses, no missed scenarios. - Stale docstrings updated: the module docstrings in FT-P-01,
FT-P-02, FT-P-04 used to say "skip is keyed off
NotImplementedErrorfrom the helper imports". These were updated to reference the newsitl_replay_readyfixture and theE2E_SITL_REPLAY_DIRenv var. The FT-P-02 docstring also no longer claimsimu_replayraisesNotImplementedError(since AZ-594 landed it in batch 74). - Dataclass field names match consumers: the new
EkfDivergenceEvent,GpsHealthSample,ConsistencyCheckEvent,TcpHandshakeReport,MspFrameCapture,InavGpsState,FcGpsState,MspFrameSampledataclasses use the exact field names already referenced by the batch-72/73 evaluators (ap_contract_evaluator,msp_frame_observer, the blackout/ outlier/outage evaluators, the cold-start evaluator). No consumer required edits. - No-op
prepare_sitl_*is a deliberate semantic choice: scenarios that previously calledsitl_observer.prepare_sitl_cold_boot(FT-P-11) orsitl_observer.prepare_sitl_no_gpsstill call them, and now succeed instead of raising. The actual parameter load is recorded into the fixture by the (future) fixture builder, so the runtime call is a no-op in replay mode. The scenario logic is unchanged.
Phase 7 — Architecture Compliance
- Module placement unchanged:
sitl_observer.pywas edited in place at its existinge2e/runner/helpers/location. The new unit-test file lives ate2e/_unit_tests/helpers/test_sitl_observer.py, replacing the prior stub-only smoke test. Directory layout invariant test still passes — both paths were already registered. - No
src/gps_denied_onboardimports anywhere in the observer. Confirmed. - No new top-level dependencies: stdlib only. The runner
requirements.txtwas not touched. - Backwards-compatible scenario contract: every public
surface that scenarios previously called (
get_observer,prepare_sitl_cold_boot,prepare_sitl_no_gps,capture_ap_tlog,read_ap_parameter,observe_inav_tcp_handshake,collect_inav_msp_frames,query_inav_gps_state,read_ekf_divergence_events,read_gps_health_samples,read_consistency_check_events) retains the same name + return type. The scenarios needed no call-site changes beyond the skip-gate fixture swap.
Test Results
- New unit tests: 38 (covering
sitl_observerend-to-end — everyread_*,capture_*,observe_*,collect_*,query_*,prepare_*, plusreplay_dirandget_observer). - Full
e2e/_unit_testssuite: 596 passed in 123 s (previous cumulative: 558 → +38 net). - No new linter errors (
ReadLintsclean onsitl_observer.py,test_sitl_observer.py,conftest.py, and all 12 refactored scenario files).