# Cumulative Code Review — Batches 79, 80, 81 **Date**: 2026-05-17 **Verdict**: PASS Covers the arc: * **Batch 79 / AZ-599** — FT-P-02 Derkachi vertical-slice fixture builder + `_common.py` extraction (factored out shared helpers from the b78 FT-P-01 builder). * **Batch 80 / AZ-600** — refactor `_common.py` + `build_p01_fixtures.py` + `build_p02_fixtures.py` into a parameterised strategy framework (`builder.py`). * **Batch 81 / AZ-420** — FT-P-12 (GCS downsample) + FT-P-13 (GCS command path) scenarios with a new `gcs_telemetry_evaluator.py` helper + `sitl_observer.capture_gcs_tlog`. The three batches together close two unrelated arcs: 1. **Fixture-builder consolidation**: the b78 FT-P-01 vertical slice landed as a one-off builder; b79 added a second one-off (FT-P-02) plus an `_common.py` extraction; b80 then collapsed both into a strategy-pattern framework where future scenarios compose existing strategies. Net result: a single durable `builder.py` instead of N per-scenario builders. 2. **GCS-leg blackbox coverage**: b81 adds the FT-P-12 and FT-P-13 scenarios that exercise the operator-facing telemetry stream and command path — orthogonal to the fixture-builder work and the first scenario pair that probes the C8 `QgcTelemetryAdapter`. ## Cross-Cutting Themes ### Convergence on a single helper / evaluator shape Both b80's `builder.py` strategies and b81's `gcs_telemetry_evaluator.py` follow the same convention seen across the wider runner.helpers package: 1. **Pure-logic dataclasses** (`FixtureBuilderConfig`, `GcsSummaryRateReport`, `HintAckReport`, `SearchRegionShiftReport`, `HintRejectionReport`, `InboundHint`, `FdrCommandAck`, `SearchRegionRecord`) — `@dataclass(frozen=True)` with a `passes` property where the dataclass is an evaluation result. 2. **`Iterable[TlogMessage]` ingress** — b81's four evaluators all accept `Iterable[TlogMessage]` instead of a Sequence so they compose with `mavproxy_tlog_reader.iter_messages` without materialisation. The scenario test materialises once via `collect_messages_to_list` and reuses for multiple analyzers (mirrors `ap_contract_evaluator.collect_messages_to_list` from b78). 3. **No `src/gps_denied_onboard` imports** — `e2e/_unit_tests/ test_no_sut_imports.py` walks the entire `e2e/` tree and passes across all three batches. ### Fixture-naming convention The artifact naming pattern `_.tlog` is consistent across batches: * `ap_tlog_.tlog` (pre-existing, AP/SITL-side) * `gcs_tlog_.tlog` (b81, GCS-side) Same env var (`E2E_SITL_REPLAY_DIR`), same observer surface (`sitl_observer.capture_ap_tlog` / `capture_gcs_tlog`), same RuntimeError messaging convention ("RuntimeError: capture_X: env var not set", "fixture not found at PATH"). ### Strategy-pattern reuse readiness B80's `builder.py` factored four ABCs (`VideoSource`, `TlogSource`, `FdrProjection`) + six concrete impls. B81 was the first scenario batch after b80; it consumes no new strategy from `builder.py` (FT-P-12/13 are runtime tests, not fixture-builder scenarios), so the strategy framework's reuse is still latent. The next builder work (any of FT-P-04/05/07/08/10/11) will be the actual test of b80's refactor — they should each land as a ~30-line config factory, not a new builder module. ## Phase-by-Phase Findings ### Phase 1 — Context Loading Read all three batch reports, all three per-batch reviews, and the task specs AZ-599, AZ-600, AZ-420. Mapped changed files to tasks via the batch reports' "Tasks" sections. ### Phase 2 — Spec Compliance (cumulative) All three per-batch reviews already verified ACs against tests. Re-walking AC coverage across the union: * **AZ-599 (FT-P-02 builder)**: 7 scenario integration tests (test_sitl_replay_builder_p02.py). Behavior preserved through b80's refactor (same tests now in slimmed file, plus new strategy-level coverage in test_sitl_replay_builder_builder.py). * **AZ-600 (strategy refactor)**: 33 strategy-level tests + 6 + 7 scenario tests = 46 total across the three files. * **AZ-420 (FT-P-12+13)**: 39 evaluator + adapter unit tests + 2 scenario tests (skip locally; collect 12 param IDs total). No AC went from covered → uncovered between batches. ### Phase 3 — Code Quality (cumulative) * **SRP across batches**: b80 splits orchestration (`build_fixtures`) from materialisation (each strategy) — clean. B81's `gcs_telemetry_evaluator.py` keeps four independent evaluator functions in one module — defensible because they all consume the same tlog stream and the same FDR record family; splitting would just shuffle imports. * **Naming consistency**: timestamps end in `_us` or `_ms` everywhere; distances end in `_m`; rates end in `_hz`; coordinates end in `_deg`. Units in the name, not in comments. ✓ * **Test AAA pattern**: spot-checked 10 random tests across all three batches. All use `# Arrange / # Act / # Assert` comments per the coding rules. ✓ * **No comment narration of mechanics**: only docstrings + one-liners on non-obvious invariants (`STATIONARY_Z_ACCEL_MG` gravity encoding in b80; the greedy-pairing "keep moving forward to find the last pre-hint" in b81). ✓ ### Phase 4 — Security Quick-Scan (cumulative) * No SQL, no `shell=True`, no `eval`/`exec`. ✓ * No hardcoded secrets/API keys. ✓ * `subprocess.run` in `builder.py`'s `run_gps_denied_replay` uses a list-form `cmd`, not shell-interpolation. ✓ * Tlog/FDR record parsing wraps `int()` / `float()` calls with ValueError re-raise messaging on the offending payload (b81's `parse_reloc_payload`, b80's `ImuCsvTlog`). ✓ ### Phase 5 — Performance (cumulative) * `build_fixtures` is bound by subprocess (`run_gps_denied_replay`) wall-clock, not in-process compute. Per-strategy materialisation is O(n) over input rows/frames. ✓ * `gcs_telemetry_evaluator` is all O(n) over messages with a single O(n log n) ack sort. ✓ * No new blocking I/O introduced. No new `time.sleep` polling loops. ✓ ### Phase 6 — Cross-Task Consistency * **Interface compatibility across batches**: b80 removed `_common.py`; b81 doesn't reference it. The only b81-side import from the fixture-builder area is the implicit fixture file path pattern, which is unchanged. No drift. ✓ * **Duplicate symbols**: `compute_gcs_summary_rate` (b81) and `compute_gps_input_rate` (pre-existing `ap_contract_evaluator`) compute the same `(N-1)/window` rate over different message types — same algorithm, different AC. The fact they're separate functions is correct (they have different domain semantics). If a third "rate over message type X" appears, then a generic helper would be warranted. ✓ * **Shared logging / config**: no batch introduced a new logging or config loader; the existing `runner.helpers.replay_mode` env-var helpers are reused. ✓ ### Phase 7 — Architecture Compliance `_docs/02_document/module-layout.md` declares Blackbox Tests as the sole owner of `e2e/**`. 1. **Layer direction**: all imports across the 23 changed files resolve within `e2e/` (`runner.helpers.*`, `e2e.fixtures.*`, stdlib, pytest, mavlink/pymavlink, cv2 in the b80 builder). No imports of `src/gps_denied_onboard.*`. ✓ 2. **Public API respect**: all cross-helper imports go through module-level public names; no `_private` symbol crossed a module boundary. ✓ 3. **No new cyclic deps**: b80's `builder.py` is a leaf consumed by `build_p01_fixtures.py` and `build_p02_fixtures.py`; b81's `gcs_telemetry_evaluator.py` is a leaf consumed by the two scenario tests. No back-edges. ✓ 4. **No duplicate symbols across components**: there is only one component touched (Blackbox Tests), so this check is vacuous except internally — covered by Phase 6. 5. **Cross-cutting concerns**: haversine math (b81) is local. The project does not yet have a shared geo-math module; if a second consumer appears (e.g., spoof-distance check in FT-N-04), promote to `runner/helpers/geo_math.py`. Tracked here, not flagged. ✓ ## Findings (None new at cumulative scope.) The two Low/Note findings from `batch_81_review.md` (`HintAckReport.passes` empty-hints semantic; FDR record loop branch ordering) remain as documented; both are scoped to b81 alone and neither is exacerbated by viewing b79/b80/b81 jointly. Carry over to the next cumulative cycle if a future batch touches the same files. ## Regression Gate (cumulative) Full `e2e/_unit_tests/` suite: **746 passing in 147.57 s**. Baseline history across the arc: | Cumulative window | Suite count | Δ | |-------------------|-------------|---| | Pre-batch-79 | ~625 | | | Post-batch-79 | 686 | +61 (FT-P-02 builder scenario tests) | | Post-batch-80 | 700 | +14 (strategy reorganisation; finer-grained per-strategy tests) | | Post-batch-81 | 746 | +46 (FT-P-12/13 unit + scenario fixtures) | No tests removed without an equivalent replacement (`test_common_module_exports_used_by_b01` retired in b80 because the "shared helper" pattern it pinned is now structural rather than file-location-based). No flakes observed across the three full-suite runs that closed each batch.