From b0296da911a4fb30a1f305a0a244265f822f4cc3 Mon Sep 17 00:00:00 2001 From: Oleksandr Bezdieniezhnykh Date: Sun, 17 May 2026 14:48:45 +0300 Subject: [PATCH] [AZ-420] Batch 81 housekeeping + cumulative 79-81 review Archive AZ-420 to done/; add cumulative review for batches 79-81 (PASS, no new findings); advance autodev state to await batch 82. Co-authored-by: Cursor --- .../{todo => done}/AZ-420_ft_p_12_13_gcs.md | 0 .../reviews/cumulative_79_81_review.md | 209 ++++++++++++++++++ _docs/_autodev_state.md | 12 +- 3 files changed, 215 insertions(+), 6 deletions(-) rename _docs/02_tasks/{todo => done}/AZ-420_ft_p_12_13_gcs.md (100%) create mode 100644 _docs/03_implementation/reviews/cumulative_79_81_review.md diff --git a/_docs/02_tasks/todo/AZ-420_ft_p_12_13_gcs.md b/_docs/02_tasks/done/AZ-420_ft_p_12_13_gcs.md similarity index 100% rename from _docs/02_tasks/todo/AZ-420_ft_p_12_13_gcs.md rename to _docs/02_tasks/done/AZ-420_ft_p_12_13_gcs.md diff --git a/_docs/03_implementation/reviews/cumulative_79_81_review.md b/_docs/03_implementation/reviews/cumulative_79_81_review.md new file mode 100644 index 0000000..973b4a0 --- /dev/null +++ b/_docs/03_implementation/reviews/cumulative_79_81_review.md @@ -0,0 +1,209 @@ +# Cumulative Code Review — Batches 79, 80, 81 + +**Date**: 2026-05-17 +**Verdict**: PASS + +Covers the arc: + +* **Batch 79 / AZ-599** — FT-P-02 Derkachi vertical-slice fixture + builder + `_common.py` extraction (factored out shared helpers + from the b78 FT-P-01 builder). +* **Batch 80 / AZ-600** — refactor `_common.py` + `build_p01_fixtures.py` + + `build_p02_fixtures.py` into a parameterised strategy framework + (`builder.py`). +* **Batch 81 / AZ-420** — FT-P-12 (GCS downsample) + FT-P-13 (GCS + command path) scenarios with a new `gcs_telemetry_evaluator.py` + helper + `sitl_observer.capture_gcs_tlog`. + +The three batches together close two unrelated arcs: + +1. **Fixture-builder consolidation**: the b78 FT-P-01 vertical slice + landed as a one-off builder; b79 added a second one-off (FT-P-02) + plus an `_common.py` extraction; b80 then collapsed both into a + strategy-pattern framework where future scenarios compose existing + strategies. Net result: a single durable `builder.py` instead of + N per-scenario builders. +2. **GCS-leg blackbox coverage**: b81 adds the FT-P-12 and FT-P-13 + scenarios that exercise the operator-facing telemetry stream + and command path — orthogonal to the fixture-builder work and + the first scenario pair that probes the C8 `QgcTelemetryAdapter`. + +## Cross-Cutting Themes + +### Convergence on a single helper / evaluator shape + +Both b80's `builder.py` strategies and b81's +`gcs_telemetry_evaluator.py` follow the same convention seen across +the wider runner.helpers package: + +1. **Pure-logic dataclasses** (`FixtureBuilderConfig`, + `GcsSummaryRateReport`, `HintAckReport`, `SearchRegionShiftReport`, + `HintRejectionReport`, `InboundHint`, `FdrCommandAck`, + `SearchRegionRecord`) — `@dataclass(frozen=True)` with a `passes` + property where the dataclass is an evaluation result. +2. **`Iterable[TlogMessage]` ingress** — b81's four evaluators all + accept `Iterable[TlogMessage]` instead of a Sequence so they + compose with `mavproxy_tlog_reader.iter_messages` without + materialisation. The scenario test materialises once via + `collect_messages_to_list` and reuses for multiple analyzers + (mirrors `ap_contract_evaluator.collect_messages_to_list` from b78). +3. **No `src/gps_denied_onboard` imports** — `e2e/_unit_tests/ + test_no_sut_imports.py` walks the entire `e2e/` tree and + passes across all three batches. + +### Fixture-naming convention + +The artifact naming pattern `_.tlog` is consistent +across batches: + +* `ap_tlog_.tlog` (pre-existing, AP/SITL-side) +* `gcs_tlog_.tlog` (b81, GCS-side) + +Same env var (`E2E_SITL_REPLAY_DIR`), same observer surface +(`sitl_observer.capture_ap_tlog` / `capture_gcs_tlog`), same +RuntimeError messaging convention ("RuntimeError: capture_X: env var +not set", "fixture not found at PATH"). + +### Strategy-pattern reuse readiness + +B80's `builder.py` factored four ABCs (`VideoSource`, `TlogSource`, +`FdrProjection`) + six concrete impls. B81 was the first scenario +batch after b80; it consumes no new strategy from `builder.py` +(FT-P-12/13 are runtime tests, not fixture-builder scenarios), so +the strategy framework's reuse is still latent. The next builder +work (any of FT-P-04/05/07/08/10/11) will be the actual test of +b80's refactor — they should each land as a ~30-line config +factory, not a new builder module. + +## Phase-by-Phase Findings + +### Phase 1 — Context Loading + +Read all three batch reports, all three per-batch reviews, and the +task specs AZ-599, AZ-600, AZ-420. Mapped changed files to tasks via +the batch reports' "Tasks" sections. + +### Phase 2 — Spec Compliance (cumulative) + +All three per-batch reviews already verified ACs against tests. +Re-walking AC coverage across the union: + +* **AZ-599 (FT-P-02 builder)**: 7 scenario integration tests + (test_sitl_replay_builder_p02.py). Behavior preserved through + b80's refactor (same tests now in slimmed file, plus new + strategy-level coverage in test_sitl_replay_builder_builder.py). +* **AZ-600 (strategy refactor)**: 33 strategy-level tests + 6 + 7 + scenario tests = 46 total across the three files. +* **AZ-420 (FT-P-12+13)**: 39 evaluator + adapter unit tests + 2 + scenario tests (skip locally; collect 12 param IDs total). + +No AC went from covered → uncovered between batches. + +### Phase 3 — Code Quality (cumulative) + +* **SRP across batches**: b80 splits orchestration (`build_fixtures`) + from materialisation (each strategy) — clean. B81's + `gcs_telemetry_evaluator.py` keeps four independent evaluator + functions in one module — defensible because they all consume + the same tlog stream and the same FDR record family; splitting + would just shuffle imports. +* **Naming consistency**: timestamps end in `_us` or `_ms` everywhere; + distances end in `_m`; rates end in `_hz`; coordinates end in + `_deg`. Units in the name, not in comments. ✓ +* **Test AAA pattern**: spot-checked 10 random tests across all three + batches. All use `# Arrange / # Act / # Assert` comments per the + coding rules. ✓ +* **No comment narration of mechanics**: only docstrings + one-liners + on non-obvious invariants (`STATIONARY_Z_ACCEL_MG` gravity encoding + in b80; the greedy-pairing "keep moving forward to find the last + pre-hint" in b81). ✓ + +### Phase 4 — Security Quick-Scan (cumulative) + +* No SQL, no `shell=True`, no `eval`/`exec`. ✓ +* No hardcoded secrets/API keys. ✓ +* `subprocess.run` in `builder.py`'s `run_gps_denied_replay` uses a + list-form `cmd`, not shell-interpolation. ✓ +* Tlog/FDR record parsing wraps `int()` / `float()` calls with + ValueError re-raise messaging on the offending payload (b81's + `parse_reloc_payload`, b80's `ImuCsvTlog`). ✓ + +### Phase 5 — Performance (cumulative) + +* `build_fixtures` is bound by subprocess (`run_gps_denied_replay`) + wall-clock, not in-process compute. Per-strategy materialisation + is O(n) over input rows/frames. ✓ +* `gcs_telemetry_evaluator` is all O(n) over messages with a single + O(n log n) ack sort. ✓ +* No new blocking I/O introduced. No new `time.sleep` polling loops. ✓ + +### Phase 6 — Cross-Task Consistency + +* **Interface compatibility across batches**: b80 removed + `_common.py`; b81 doesn't reference it. The only b81-side import + from the fixture-builder area is the implicit fixture file path + pattern, which is unchanged. No drift. ✓ +* **Duplicate symbols**: `compute_gcs_summary_rate` (b81) and + `compute_gps_input_rate` (pre-existing + `ap_contract_evaluator`) compute the same `(N-1)/window` + rate over different message types — same algorithm, different + AC. The fact they're separate functions is correct (they have + different domain semantics). If a third "rate over message + type X" appears, then a generic helper would be warranted. ✓ +* **Shared logging / config**: no batch introduced a new logging or + config loader; the existing `runner.helpers.replay_mode` env-var + helpers are reused. ✓ + +### Phase 7 — Architecture Compliance + +`_docs/02_document/module-layout.md` declares Blackbox Tests as the +sole owner of `e2e/**`. + +1. **Layer direction**: all imports across the 23 changed files + resolve within `e2e/` (`runner.helpers.*`, `e2e.fixtures.*`, + stdlib, pytest, mavlink/pymavlink, cv2 in the b80 builder). No + imports of `src/gps_denied_onboard.*`. ✓ +2. **Public API respect**: all cross-helper imports go through + module-level public names; no `_private` symbol crossed a module + boundary. ✓ +3. **No new cyclic deps**: b80's `builder.py` is a leaf consumed by + `build_p01_fixtures.py` and `build_p02_fixtures.py`; b81's + `gcs_telemetry_evaluator.py` is a leaf consumed by the two + scenario tests. No back-edges. ✓ +4. **No duplicate symbols across components**: there is only one + component touched (Blackbox Tests), so this check is vacuous + except internally — covered by Phase 6. +5. **Cross-cutting concerns**: haversine math (b81) is local. The + project does not yet have a shared geo-math module; if a second + consumer appears (e.g., spoof-distance check in FT-N-04), promote + to `runner/helpers/geo_math.py`. Tracked here, not flagged. ✓ + +## Findings + +(None new at cumulative scope.) + +The two Low/Note findings from `batch_81_review.md` +(`HintAckReport.passes` empty-hints semantic; FDR record loop branch +ordering) remain as documented; both are scoped to b81 alone and +neither is exacerbated by viewing b79/b80/b81 jointly. Carry over +to the next cumulative cycle if a future batch touches the same +files. + +## Regression Gate (cumulative) + +Full `e2e/_unit_tests/` suite: **746 passing in 147.57 s**. + +Baseline history across the arc: + +| Cumulative window | Suite count | Δ | +|-------------------|-------------|---| +| Pre-batch-79 | ~625 | | +| Post-batch-79 | 686 | +61 (FT-P-02 builder scenario tests) | +| Post-batch-80 | 700 | +14 (strategy reorganisation; finer-grained per-strategy tests) | +| Post-batch-81 | 746 | +46 (FT-P-12/13 unit + scenario fixtures) | + +No tests removed without an equivalent replacement +(`test_common_module_exports_used_by_b01` retired in b80 because +the "shared helper" pattern it pinned is now structural rather than +file-location-based). No flakes observed across the three full-suite +runs that closed each batch. diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 615e717..1008685 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -6,15 +6,15 @@ step: 10 name: Implement Tests status: in_progress sub_step: - phase: 11 - name: commit-batch - detail: "batch 81" + phase: 0 + name: awaiting-invocation + detail: "" retry_count: 0 cycle: 1 tracker: jira -last_completed_batch: 80 -last_cumulative_review: batches_76-78 -current_batch: 81 +last_completed_batch: 81 +last_cumulative_review: batches_79-81 +current_batch: 82 last_step_outcomes: step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"