[AZ-594] Implement core-three harness stubs (fdr_reader, frame_source_replay, imu_replay)

Replaces the NotImplementedError stubs AZ-406 reserved on three runner-
side helpers; these were stranded from any tracker ticket since
AZ-407/408 never came back to fill them. Concrete bodies:

* fdr_reader.iter_records: JSONL parser + wire-envelope validator;
  recursive *.jsonl walk; projects {schema_version, ts, producer_id,
  kind, payload} to runner-side FdrRecord with record_type/monotonic_ms
  renames; yields oldest-first.
* frame_source_replay.replay_video: OpenCV VideoCapture decode + JPEG
  re-encode; auto-detects file vs directory; injectable sleep_fn for
  unit-test pacing.
* imu_replay.ImuReplayer.replay: csv.DictReader parse; degrees->radians
  attitude conversion; tolerates scientific notation; same sleep_fn
  injection pattern.

Adds 34 unit tests (14 + 10 + 10). Full e2e unit suite: 558 passed (+31).
Existing scenario _harness_helpers_implemented probes still return False
because they also depend on sitl_observer / fc_proxy_runtime stubs that
remain pending; scenario probe cleanup is out of AZ-594 scope.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-17 08:42:12 +03:00
parent 2d6d44af5d
commit 1d260f7e41
10 changed files with 1196 additions and 76 deletions
@@ -0,0 +1,86 @@
# Batch 74 Report — Harness Stubs (cycle 1, batch 8 of test phase)
**Batch**: 74
**Date**: 2026-05-17
**Context**: Test implementation (greenfield Step 10 — Implement Tests)
**Tasks**: AZ-594 (4 cp) — 1 task (umbrella ticket for the core three harness stubs)
**Cycle**: 1
**Verdict**: COMPLETE — PASS (self-reviewed; see `reviews/batch_74_review.md`)
## Summary
A planning-gap fix-up batch. The skip-gates in batches 71-73 referenced
`AZ-441 / AZ-407 / AZ-416 leftovers` — but on inspection none of those
tickets actually owned the deferred harness stubs (`frame_source_replay`,
`imu_replay`, `fdr_reader`, `sitl_observer`, `fc_proxy_runtime`). The
stubs were AZ-406 surface reservations that AZ-407/408 never came back
to fill, and were stranded without a tracker entry.
This batch creates a single umbrella ticket (AZ-594) and ships the
core three of the five — the ones with the lowest implementation risk
and the broadest unblock surface:
* `fdr_reader.iter_records` — JSONL parser + wire-schema validator.
* `frame_source_replay.replay_video` — OpenCV-backed decode + sink emission.
* `imu_replay.ImuReplayer.replay` — CSV parser + emitter driver.
The remaining two stubs (`sitl_observer.get_observer` + `read_*`
surfaces; `fc_proxy_runtime` driver + docker wiring) need separate
tickets — they touch live pymavlink / yamspy / TCP plumbing and don't
fit in a single batch alongside the core three. Those become batch 75
candidates.
### AZ-594 — Harness stubs (core three) (4 cp)
* **`runner/helpers/fdr_reader.py`** —
`iter_records(fdr_archive_root)` recursively walks `*.jsonl` files,
validates each line against the wire envelope (`schema_version, ts,
producer_id, kind, payload`), projects onto the runner-side
`FdrRecord` dataclass (`record_type` for `kind`; `monotonic_ms`
derived from ISO 8601 `ts`), and yields records oldest-first. Raises
`FileNotFoundError` on missing root + `ValueError` with file+line
on malformed envelopes. `archive_size_bytes` body was already present.
* **`runner/helpers/frame_source_replay.py`** — `FrameSourceReplayer`
now backs both `replay_image_directory(dir)` and `replay_video(path)`.
`replay_video` auto-detects file vs directory (so AZ-408 injector
frame-directory outputs work via the same entry point). OpenCV
`VideoCapture` decodes MP4; every frame is re-JPEG-encoded so the
sink always receives JPEG bytes. Cadence pacing is parameterised via
a `sleep_fn` injection so unit tests can drop wall-clock entirely.
* **`runner/helpers/imu_replay.py`** — `ImuReplayer.replay(csv_path)`
parses `data_imu.csv` (the AZ-408 schema with possibly scientific-
notation floats), constructs typed `ImuSample`s with attitude
converted from degrees → radians, and drives `FcInboundEmitter.emit`.
Same `sleep_fn` + `realtime` injection pattern as
`frame_source_replay` for test parity.
## Scenario-probe interaction
The existing `_harness_helpers_implemented` probes in batches 71-73
pass `/tmp/non-existent` to each helper. Previously the inner
`except NotImplementedError: return False` fired; with the new bodies,
the outer `except Exception: return False` catches the
`FileNotFoundError` and still returns False. So:
* No existing scenario silently starts running its full E2E path.
* All eight skip-gated scenarios from batches 71-73 continue to skip
cleanly because they ALSO depend on `sitl_observer` /
`mavproxy_tlog_reader` (for some) / `fc_proxy_runtime` (for FT-N-04).
* Probe-cleanup is explicitly excluded from AZ-594 scope and will land
in the future batch that fills the remaining two stubs.
## Test Results
* New unit tests: 14 (fdr_reader) + 10 (frame_source_replay) + 10
(imu_replay) = **34 new tests**.
* Full `e2e/_unit_tests` suite: **558 passed in 139 s** (previous
cumulative: 527 → +31 net).
* No new linter errors.
## State
* Spec moved: `_docs/02_tasks/todo/AZ-594_harness_stubs_core_three.md`
`_docs/02_tasks/done/`.
* `_docs/_autodev_state.md` advanced to `last_completed_batch: 74`.
* Cumulative review window: `last_cumulative_review = batches_70-72`;
next K=3 cumulative review fires at the end of batch 75.
@@ -0,0 +1,144 @@
# Code Review Report
**Batch**: 74 — AZ-594 (harness stubs: core three)
**Date**: 2026-05-17
**Verdict**: PASS
## Findings
(none)
## Findings Sweep
### Phase 1 — Context Loading
Read the AZ-594 task spec, then the three stub files in their
pre-implementation state (`fdr_reader.py`, `frame_source_replay.py`,
`imu_replay.py`). Cross-checked the SUT-side `FdrRecord` wire schema
in `src/gps_denied_onboard/fdr_client/records.py` to confirm field
names (`schema_version`, `ts`, `producer_id`, `kind`, `payload`,
`extra`) and the rename strategy the runner-side `FdrRecord`
duplicates (`kind → record_type`, `ts → monotonic_ms`). Re-read the
existing scenario `_harness_helpers_implemented` probes in the batch
71-73 scenario files to confirm the post-fix probe semantics.
### Phase 2 — Spec Compliance
| AC | Coverage | Status |
|----|----------|--------|
| AC-1 (`fdr_reader.iter_records` parses every `*.jsonl` under root, yields `FdrRecord` ordered by `monotonic_ms`; raises FileNotFoundError on missing root) | `test_missing_root_raises_file_not_found`, `test_empty_root_yields_nothing`, `test_single_file_round_trip`, `test_multiple_files_are_merged_and_sorted`, `test_subdirectory_files_included`, `test_payload_passed_through`; envelope-validation tests cover the wire-schema gate (`test_missing_envelope_key_raises`, `test_non_object_line_raises`, `test_malformed_json_raises`, `test_empty_producer_id_raises`, `test_ts_iso_without_z_parses`, `test_blank_lines_are_skipped`); `archive_size_bytes` covered by `test_archive_size_bytes_sums_all_files` and `test_archive_size_bytes_missing_root_returns_zero` | Covered |
| AC-2 (`frame_source_replay.replay_video` accepts `.mp4` OR directory; decodes via OpenCV; emits via sink at cadence; raises FileNotFoundError on missing input) | `test_video_missing_path_raises_file_not_found`, `test_video_dir_delegates_to_image_directory`, `test_video_mp4_round_trip`; image-dir mode covered by `test_image_dir_missing_raises_file_not_found`, `test_image_dir_empty_returns_zero`, `test_image_dir_emits_sorted_by_name`, `test_image_dir_non_jpeg_reencoded`, `test_image_dir_skips_non_image_files`, `test_image_dir_non_realtime_does_not_sleep`, `test_image_dir_realtime_sleeps_per_frame` | Covered |
| AC-3 (`imu_replay.ImuReplayer.replay` parses CSV, constructs `ImuSample`s, calls emitter; tolerates scientific-notation floats) | `test_happy_path_emits_all_rows`, `test_scientific_notation_parses`, `test_attitude_radians_converted`, `test_realtime_sleeps_per_sample`, `test_non_realtime_does_not_sleep`, `test_empty_csv_emits_nothing`; error paths: `test_missing_csv_raises_file_not_found`, `test_rate_hz_must_be_positive`, `test_missing_required_columns_raises`, `test_row_with_unparseable_value_raises` | Covered |
| AC-4 (≥5 unit tests per body — happy + 2 error + 1 boundary) | fdr_reader: 14 tests; frame_source_replay: 10 tests; imu_replay: 10 tests | Covered (exceeds floor) |
| AC-5 (full suite passes) | 558 passed (+31 from 527 baseline) | Covered |
### Phase 3 — Code Quality
* **Single responsibility**:
* `fdr_reader` owns wire-envelope parsing + projection to the runner-side
dataclass. It does NOT validate per-kind payload keys — that's left to
the consuming evaluators, matching the AZ-272/273 contract that
documents `payload` as opaque on the runner side.
* `frame_source_replay.FrameSourceReplayer` owns frame decode + sink
emission. Cadence pacing is parameterised via `sleep_fn` for testability;
OpenCV is the only third-party import.
* `imu_replay.ImuReplayer` owns CSV parse + emitter invocation. Cadence
pacing same pattern as frame_source_replay. Row parsing is a pure
function `_parse_row` so missing/malformed columns surface as a
`ValueError` with a file + line pointer.
* **No suppressed errors**:
* `fdr_reader._parse_envelope` raises `ValueError` (with file + line)
on every malformed envelope branch; `json.JSONDecodeError` propagates
naturally for unrecoverable JSON corruption.
* `frame_source_replay._decode_and_emit_video` raises explicit
`ValueError` if OpenCV cannot open the file or fails to encode a
frame — never silently skips a frame.
* `imu_replay._parse_row` re-raises `KeyError`/`ValueError` wrapped
as a contextual `ValueError`; the original `__cause__` is preserved
via `raise ... from exc`.
* No `except Exception: pass`, `2>/dev/null`, or empty `except` blocks.
* **AAA comment discipline**: every new test uses `# Arrange / # Act /
# Assert`; sections omitted when not needed.
* **No code comments narrating what code does** — only the module-level
docstrings explain the wire-vs-runner schema rename, the OpenCV
realtime/non-realtime distinction, and the scientific-notation
tolerance rationale.
* **Public boundary**: confirmed all three modules import only
stdlib (`csv`, `json`, `math`, `time`, `pathlib`, `datetime`) +
`cv2`. No `from gps_denied_onboard ...` anywhere.
### Phase 4 — Security
* **No new credentials, secrets, or network surface**. All three
helpers are deterministic file-I/O over caller-supplied paths.
* **Wire-schema gate** in `fdr_reader._parse_envelope` is the safety
invariant — if the SUT FDR schema drifts, the runner blows up at
parse time with a file+line pointer rather than silently producing
records with default-zero fields. This is the explicit
drift-detection rationale documented in the `FdrRecord` docstring.
* **No `eval`, `exec`, `pickle`, or `subprocess`** in any of the three
modules.
### Phase 5 — Performance
* All three implementations are O(N) over their input streams
(JSONL lines, frames, CSV rows). Sorting in `iter_records` is the
one O(N log N) step — necessary for the multi-file merge-sort
guarantee, and acceptable since the typical archive is bounded by
the AZ-441 50 GB / 8 h budget.
* No file I/O at module-import time.
* `FrameSourceReplayer._decode_and_emit_video` always re-encodes
frames to JPEG (lossy double-encode if source was MP4). The
intent is so the sink always receives JPEG bytes regardless of
source format — matches the SUT's `ONBOARD_FRAME_SOURCE_PATH`
expectation of file-system-readable JPEG frames.
### Phase 6 — Cross-Task Consistency
* **Sleep injection pattern**: `frame_source_replay.FrameSourceReplayer`
and `imu_replay.ImuReplayer` both accept `sleep_fn` as a keyword
argument defaulting to `time.sleep`. Tests pass `lambda _: None`
or a recording stub. Single pattern across the two replay surfaces.
* **`realtime` vs `non-realtime` flag**: both replayers default to
`realtime=True` and skip the sleep when `False`. Consistent semantics
so that future scenarios can choose between wall-clock replay
(for live-FC tests) and fast replay (for FDR/evidence-only tests).
* **FileNotFoundError**: all three new bodies surface missing input
paths via `FileNotFoundError`, consistent with how the existing
helpers `accuracy_evaluator`, `multi_segment_evaluator`,
`mavproxy_tlog_reader`, `cold_start_evaluator` handle missing
inputs.
* **Scenario probe interaction**: the existing
`_harness_helpers_implemented` probes in batches 71-73 (FT-P-07,
FT-P-08, FT-P-10, FT-P-11, FT-N-01, FT-N-02, FT-N-03, FT-N-04)
pass `/tmp/non-existent` to each helper. Previously the inner
`except NotImplementedError: return False` fired; now the outer
`except Exception: return False` catches the new `FileNotFoundError`
and still returns False. The scenarios continue to skip — which is
correct because the OTHER probe-gated helpers (`sitl_observer`,
`mavproxy_tlog_reader` for some scenarios, `fc_proxy_runtime`) are
still pending. Probe cleanup is explicitly excluded from AZ-594
scope and will happen once the remaining harness stubs land.
### Phase 7 — Architecture Compliance
* **Module placement unchanged**: existing files at
`e2e/runner/helpers/{fdr_reader,frame_source_replay,imu_replay}.py`
were edited in place (bodies replace `NotImplementedError`); no
layout changes. New unit-test files in
`e2e/_unit_tests/helpers/`. Layout invariant test still passes —
these helpers were already listed.
* **No `src/gps_denied_onboard` imports** anywhere. Verified.
* **Existing scenario surface preserved**: `FdrRecord` dataclass
field names and types are unchanged from the AZ-406 contract.
`ImuSample` likewise. `ReplayCadence` and `FrameSink` Protocol
unchanged. Consumers in batches 71-73 stay valid.
## Test Results
* New unit tests: 14 (fdr_reader) + 10 (frame_source_replay) + 10
(imu_replay) = **34 new tests**.
* Full `e2e/_unit_tests` suite: **558 passed in 139 s** (previous
cumulative: 527 → +31 net).
* No new linter errors (`ReadLints` on all six new/modified files
reported clean).