# Code Review Report

**Batch**: 75 — AZ-595 (sitl_observer FDR-replay + scenario probe cleanup)
**Date**: 2026-05-17
**Verdict**: PASS

## Findings

(none)

## Findings Sweep

### Phase 1 — Context Loading

Loaded the AZ-595 task spec, the pre-implementation `sitl_observer.py`
(which previously raised `NotImplementedError` from every surface), the
two consumers that depend on its dataclass shapes (`ap_contract_evaluator`,
`msp_frame_observer`), and the 8 scenarios that previously gated on
`_harness_helpers_implemented` (FT-P-01, FT-P-02, FT-P-03/14, FT-P-04,
FT-P-05, FT-P-07, FT-P-08, FT-P-09-AP, FT-P-09-iNav, FT-P-10, FT-P-11,
FT-N-01, FT-N-02, FT-N-03, FT-N-04). Cross-checked the FDR record wire
schema used by `fdr_reader.iter_records` (batch 74) to confirm the
single-JSON-payload format the new observer reads matches what a
fixture builder would produce.

### Phase 2 — Spec Compliance

| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (`replay_dir_available` + `replay_dir` resolve `E2E_SITL_REPLAY_DIR` env var; absent / unset / missing-dir all surface as falsy) | `test_replay_dir_available_returns_false_when_env_missing`, `test_replay_dir_available_returns_false_when_env_empty`, `test_replay_dir_available_returns_false_when_dir_missing`, `test_replay_dir_available_returns_true_when_dir_exists`, `test_replay_dir_returns_none_when_env_missing`, `test_replay_dir_returns_path_when_env_set` | Covered |
| AC-2 (every previously-stubbed read surface reads its dedicated JSON fixture, parses into the public dataclass, returns `[]` / `None` when fixture absent) | `read_ekf_divergence_events` (4 tests), `read_gps_health_samples` (3 tests), `read_consistency_check_events` (3 tests), `capture_ap_tlog` (2 tests), `read_ap_parameter` (3 tests), `observe_inav_tcp_handshake` (3 tests), `collect_inav_msp_frames` (3 tests), `query_inav_gps_state` (2 tests), `get_observer.read_gps_state` (3 tests), `get_observer.read_parameter` (3 tests) | Covered |
| AC-3 (every `prepare_sitl_*` surface is a no-op when fixture absent and a no-op pass-through when fixture present — the runner is offline-only in batch 75) | `test_prepare_sitl_cold_boot_is_no_op_when_fixture_absent`, `test_prepare_sitl_cold_boot_is_no_op_with_fixture`, `test_prepare_sitl_no_gps_is_no_op`, `test_prepare_sitl_cold_boot_empty_fixture_path_raises` | Covered |
| AC-4 (malformed fixture JSON surfaces as `ValueError` with a file pointer — never silent `[]`) | `test_read_ekf_divergence_events_malformed_raises`, `test_read_gps_health_samples_malformed_raises`, `test_read_consistency_check_events_malformed_raises`, `test_capture_ap_tlog_invalid_json_raises`, `test_read_ap_parameter_missing_key_raises`, `test_observe_inav_tcp_handshake_invalid_raises`, `test_collect_inav_msp_frames_invalid_raises`, `test_query_inav_gps_state_invalid_raises`, `test_get_observer_invalid_payload_raises` | Covered |
| AC-5 (8 scenarios that gated on `_harness_helpers_implemented` now consume the shared `sitl_replay_ready` fixture and skip cleanly when `E2E_SITL_REPLAY_DIR` is unset) | conftest `sitl_replay_ready` fixture (1 session-scoped fixture); 12 refactored scenarios (FT-P-01/02/03/14/04/05/07/08/09-AP/09-iNav/10/11, FT-N-01/02/03/04) — local `_harness_helpers_implemented` + `_NullSink` + `_NullImuEmitter` definitions removed; scenarios depend on `sitl_replay_ready: bool` and skip with an AZ-595-referencing message | Covered |
| AC-6 (full suite passes) | 596 passed (+38 from 558 baseline) | Covered |

### Phase 3 — Code Quality

* **Single responsibility**:
  * `sitl_observer.replay_dir` / `replay_dir_available` own env-var
    resolution. They are the ONLY readers of `E2E_SITL_REPLAY_DIR`
    in the runner — every downstream surface goes through them.
  * Each `read_*` / `capture_*` / `observe_*` / `collect_*` /
    `query_*` surface owns exactly one JSON fixture. The mapping
    `<surface> ↔ <filename>` is encoded in the call site, not
    distributed across helpers, so a fixture-builder author can
    grep `sitl_observer.py` once to see the full file list.
  * `_load_optional_json_list` is the only path for "list of events,
    fixture optional". `_load_required_json` is the only path for
    "single dict, fixture must exist". Two helpers, two contracts.
  * `_FdrReplayObserver` is a frozen dataclass: the only state is
    the loaded payload + the fc-adapter kind + host. No mutable
    state, no I/O after construction.
  * `prepare_sitl_cold_boot` / `prepare_sitl_no_gps` are no-ops in
    replay mode by design. The docstring explains: live SITL
    parameter loading is owned by a follow-up live-mode observer,
    not by the FDR-replay branch.
* **No suppressed errors**:
  * Every JSON parse path raises `ValueError` with the offending
    file path on malformed input. No `except Exception: pass`,
    no `2>/dev/null`, no bare `except`.
  * `_load_optional_json_list` checks fixture existence + falls
    back to `[]` only when the file is genuinely absent — a present
    file with malformed JSON still raises. Tested by the
    `_malformed_raises` family.
  * `_load_required_json` raises `FileNotFoundError` on missing
    fixture and `ValueError` on parse failure. The `_invalid_raises`
    family of tests covers both branches.
* **AAA comment discipline**: all 38 new tests use
  `# Arrange / # Act / # Assert`; sections omitted when the test
  is a single line.
* **No code comments narrating what code does** — the module-level
  docstring explains the replay strategy and the runtime
  contract. Per-function docstrings document the fixture
  filename + dataclass mapping; no inline narration.
* **Public boundary**: the module imports only stdlib (`os`,
  `json`, `pathlib`, `dataclasses`, `typing`). Zero
  `from gps_denied_onboard ...` imports. Confirmed.

### Phase 4 — Security

* **No new credentials, secrets, or network surface**. The whole
  point of the FDR-replay strategy is that the runner does not
  touch a live SITL container in unit-test mode — every observer
  surface resolves to deterministic file I/O over JSON fixtures.
* **`E2E_SITL_REPLAY_DIR` env var** is read-only; the runner
  never writes to it. The path is resolved into a `Path` and
  joined with hard-coded filenames — no user-controlled string
  interpolation into a shell, no `eval`, no `subprocess`.
* **No `pickle`, no `marshal`, no `yaml.load(unsafe=True)`**:
  fixtures are pure JSON parsed via `json.loads`.

### Phase 5 — Performance

* Every surface is O(N) over the fixture content — a JSON file
  with N records. For the maximum scenario in batch 75
  (`collect_inav_msp_frames` for a 60 s window at ~5 Hz) the
  fixture would be ≤300 frames, dominated by the JSON parse.
* No I/O at module-import time. `replay_dir()` resolves the
  env var on each call — cheap, no caching needed because
  scenarios only invoke it via the session-scoped
  `sitl_replay_ready` fixture.
* `_FdrReplayObserver` is a frozen dataclass cached behind the
  module-level `get_observer` factory. Multiple calls for the
  same (fc_kind, host) tuple return the same instance without
  re-parsing the underlying JSON.

### Phase 6 — Cross-Task Consistency

* **Probe pattern unified**: the 12 scenarios that used to define
  a local `_harness_helpers_implemented` fixture (+ `_NullSink`
  / `_NullImuEmitter` helper classes) now consume the single
  session-scoped `sitl_replay_ready` fixture from
  `e2e/tests/conftest.py`. Removing the local probes deleted
  ~636 lines of duplicate gating code in exchange for ~17 lines
  of shared fixture — net -619 LoC across the scenario suite.
* **Skip-message pattern unified**: every refactored scenario
  now emits a skip message of the form
  `"FT-X-Y full scenario requires \`E2E_SITL_REPLAY_DIR\` to point at a prepared SITL replay fixture (AZ-595). Pure-logic ACs covered by <unit-test>."`
  Grepping `sitl_replay_ready` returns exactly the 12 scenarios
  plus the conftest fixture — no orphaned uses, no missed
  scenarios.
* **Stale docstrings updated**: the module docstrings in FT-P-01,
  FT-P-02, FT-P-04 used to say "skip is keyed off
  `NotImplementedError` from the helper imports". These were
  updated to reference the new `sitl_replay_ready` fixture and
  the `E2E_SITL_REPLAY_DIR` env var. The FT-P-02 docstring also
  no longer claims `imu_replay` raises `NotImplementedError`
  (since AZ-594 landed it in batch 74).
* **Dataclass field names match consumers**: the new
  `EkfDivergenceEvent`, `GpsHealthSample`, `ConsistencyCheckEvent`,
  `TcpHandshakeReport`, `MspFrameCapture`, `InavGpsState`,
  `FcGpsState`, `MspFrameSample` dataclasses use the exact
  field names already referenced by the batch-72/73 evaluators
  (`ap_contract_evaluator`, `msp_frame_observer`, the blackout/
  outlier/outage evaluators, the cold-start evaluator). No
  consumer required edits.
* **No-op `prepare_sitl_*` is a deliberate semantic choice**:
  scenarios that previously called `sitl_observer.prepare_sitl_cold_boot`
  (FT-P-11) or `sitl_observer.prepare_sitl_no_gps` still call
  them, and now succeed instead of raising. The actual parameter
  load is recorded into the fixture by the (future) fixture
  builder, so the runtime call is a no-op in replay mode. The
  scenario logic is unchanged.

### Phase 7 — Architecture Compliance

* **Module placement unchanged**: `sitl_observer.py` was edited
  in place at its existing `e2e/runner/helpers/` location. The
  new unit-test file lives at
  `e2e/_unit_tests/helpers/test_sitl_observer.py`, replacing the
  prior stub-only smoke test. Directory layout invariant test
  still passes — both paths were already registered.
* **No `src/gps_denied_onboard` imports** anywhere in the
  observer. Confirmed.
* **No new top-level dependencies**: stdlib only. The runner
  `requirements.txt` was not touched.
* **Backwards-compatible scenario contract**: every public
  surface that scenarios previously called (`get_observer`,
  `prepare_sitl_cold_boot`, `prepare_sitl_no_gps`,
  `capture_ap_tlog`, `read_ap_parameter`,
  `observe_inav_tcp_handshake`, `collect_inav_msp_frames`,
  `query_inav_gps_state`, `read_ekf_divergence_events`,
  `read_gps_health_samples`, `read_consistency_check_events`)
  retains the same name + return type. The scenarios needed no
  call-site changes beyond the skip-gate fixture swap.

## Test Results

* New unit tests: **38** (covering `sitl_observer` end-to-end —
  every `read_*`, `capture_*`, `observe_*`, `collect_*`,
  `query_*`, `prepare_*`, plus `replay_dir` and `get_observer`).
* Full `e2e/_unit_tests` suite: **596 passed in 123 s**
  (previous cumulative: 558 → +38 net).
* No new linter errors (`ReadLints` clean on
  `sitl_observer.py`, `test_sitl_observer.py`, `conftest.py`,
  and all 12 refactored scenario files).