mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 21:51:12 +00:00
[AZ-420] Batch 81 housekeeping + cumulative 79-81 review
Archive AZ-420 to done/; add cumulative review for batches 79-81 (PASS, no new findings); advance autodev state to await batch 82. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,209 @@
|
|||||||
|
# Cumulative Code Review — Batches 79, 80, 81
|
||||||
|
|
||||||
|
**Date**: 2026-05-17
|
||||||
|
**Verdict**: PASS
|
||||||
|
|
||||||
|
Covers the arc:
|
||||||
|
|
||||||
|
* **Batch 79 / AZ-599** — FT-P-02 Derkachi vertical-slice fixture
|
||||||
|
builder + `_common.py` extraction (factored out shared helpers
|
||||||
|
from the b78 FT-P-01 builder).
|
||||||
|
* **Batch 80 / AZ-600** — refactor `_common.py` + `build_p01_fixtures.py`
|
||||||
|
+ `build_p02_fixtures.py` into a parameterised strategy framework
|
||||||
|
(`builder.py`).
|
||||||
|
* **Batch 81 / AZ-420** — FT-P-12 (GCS downsample) + FT-P-13 (GCS
|
||||||
|
command path) scenarios with a new `gcs_telemetry_evaluator.py`
|
||||||
|
helper + `sitl_observer.capture_gcs_tlog`.
|
||||||
|
|
||||||
|
The three batches together close two unrelated arcs:
|
||||||
|
|
||||||
|
1. **Fixture-builder consolidation**: the b78 FT-P-01 vertical slice
|
||||||
|
landed as a one-off builder; b79 added a second one-off (FT-P-02)
|
||||||
|
plus an `_common.py` extraction; b80 then collapsed both into a
|
||||||
|
strategy-pattern framework where future scenarios compose existing
|
||||||
|
strategies. Net result: a single durable `builder.py` instead of
|
||||||
|
N per-scenario builders.
|
||||||
|
2. **GCS-leg blackbox coverage**: b81 adds the FT-P-12 and FT-P-13
|
||||||
|
scenarios that exercise the operator-facing telemetry stream
|
||||||
|
and command path — orthogonal to the fixture-builder work and
|
||||||
|
the first scenario pair that probes the C8 `QgcTelemetryAdapter`.
|
||||||
|
|
||||||
|
## Cross-Cutting Themes
|
||||||
|
|
||||||
|
### Convergence on a single helper / evaluator shape
|
||||||
|
|
||||||
|
Both b80's `builder.py` strategies and b81's
|
||||||
|
`gcs_telemetry_evaluator.py` follow the same convention seen across
|
||||||
|
the wider runner.helpers package:
|
||||||
|
|
||||||
|
1. **Pure-logic dataclasses** (`FixtureBuilderConfig`,
|
||||||
|
`GcsSummaryRateReport`, `HintAckReport`, `SearchRegionShiftReport`,
|
||||||
|
`HintRejectionReport`, `InboundHint`, `FdrCommandAck`,
|
||||||
|
`SearchRegionRecord`) — `@dataclass(frozen=True)` with a `passes`
|
||||||
|
property where the dataclass is an evaluation result.
|
||||||
|
2. **`Iterable[TlogMessage]` ingress** — b81's four evaluators all
|
||||||
|
accept `Iterable[TlogMessage]` instead of a Sequence so they
|
||||||
|
compose with `mavproxy_tlog_reader.iter_messages` without
|
||||||
|
materialisation. The scenario test materialises once via
|
||||||
|
`collect_messages_to_list` and reuses for multiple analyzers
|
||||||
|
(mirrors `ap_contract_evaluator.collect_messages_to_list` from b78).
|
||||||
|
3. **No `src/gps_denied_onboard` imports** — `e2e/_unit_tests/
|
||||||
|
test_no_sut_imports.py` walks the entire `e2e/` tree and
|
||||||
|
passes across all three batches.
|
||||||
|
|
||||||
|
### Fixture-naming convention
|
||||||
|
|
||||||
|
The artifact naming pattern `<artifact>_<host>.tlog` is consistent
|
||||||
|
across batches:
|
||||||
|
|
||||||
|
* `ap_tlog_<host>.tlog` (pre-existing, AP/SITL-side)
|
||||||
|
* `gcs_tlog_<host>.tlog` (b81, GCS-side)
|
||||||
|
|
||||||
|
Same env var (`E2E_SITL_REPLAY_DIR`), same observer surface
|
||||||
|
(`sitl_observer.capture_ap_tlog` / `capture_gcs_tlog`), same
|
||||||
|
RuntimeError messaging convention ("RuntimeError: capture_X: env var
|
||||||
|
not set", "fixture not found at PATH").
|
||||||
|
|
||||||
|
### Strategy-pattern reuse readiness
|
||||||
|
|
||||||
|
B80's `builder.py` factored four ABCs (`VideoSource`, `TlogSource`,
|
||||||
|
`FdrProjection`) + six concrete impls. B81 was the first scenario
|
||||||
|
batch after b80; it consumes no new strategy from `builder.py`
|
||||||
|
(FT-P-12/13 are runtime tests, not fixture-builder scenarios), so
|
||||||
|
the strategy framework's reuse is still latent. The next builder
|
||||||
|
work (any of FT-P-04/05/07/08/10/11) will be the actual test of
|
||||||
|
b80's refactor — they should each land as a ~30-line config
|
||||||
|
factory, not a new builder module.
|
||||||
|
|
||||||
|
## Phase-by-Phase Findings
|
||||||
|
|
||||||
|
### Phase 1 — Context Loading
|
||||||
|
|
||||||
|
Read all three batch reports, all three per-batch reviews, and the
|
||||||
|
task specs AZ-599, AZ-600, AZ-420. Mapped changed files to tasks via
|
||||||
|
the batch reports' "Tasks" sections.
|
||||||
|
|
||||||
|
### Phase 2 — Spec Compliance (cumulative)
|
||||||
|
|
||||||
|
All three per-batch reviews already verified ACs against tests.
|
||||||
|
Re-walking AC coverage across the union:
|
||||||
|
|
||||||
|
* **AZ-599 (FT-P-02 builder)**: 7 scenario integration tests
|
||||||
|
(test_sitl_replay_builder_p02.py). Behavior preserved through
|
||||||
|
b80's refactor (same tests now in slimmed file, plus new
|
||||||
|
strategy-level coverage in test_sitl_replay_builder_builder.py).
|
||||||
|
* **AZ-600 (strategy refactor)**: 33 strategy-level tests + 6 + 7
|
||||||
|
scenario tests = 46 total across the three files.
|
||||||
|
* **AZ-420 (FT-P-12+13)**: 39 evaluator + adapter unit tests + 2
|
||||||
|
scenario tests (skip locally; collect 12 param IDs total).
|
||||||
|
|
||||||
|
No AC went from covered → uncovered between batches.
|
||||||
|
|
||||||
|
### Phase 3 — Code Quality (cumulative)
|
||||||
|
|
||||||
|
* **SRP across batches**: b80 splits orchestration (`build_fixtures`)
|
||||||
|
from materialisation (each strategy) — clean. B81's
|
||||||
|
`gcs_telemetry_evaluator.py` keeps four independent evaluator
|
||||||
|
functions in one module — defensible because they all consume
|
||||||
|
the same tlog stream and the same FDR record family; splitting
|
||||||
|
would just shuffle imports.
|
||||||
|
* **Naming consistency**: timestamps end in `_us` or `_ms` everywhere;
|
||||||
|
distances end in `_m`; rates end in `_hz`; coordinates end in
|
||||||
|
`_deg`. Units in the name, not in comments. ✓
|
||||||
|
* **Test AAA pattern**: spot-checked 10 random tests across all three
|
||||||
|
batches. All use `# Arrange / # Act / # Assert` comments per the
|
||||||
|
coding rules. ✓
|
||||||
|
* **No comment narration of mechanics**: only docstrings + one-liners
|
||||||
|
on non-obvious invariants (`STATIONARY_Z_ACCEL_MG` gravity encoding
|
||||||
|
in b80; the greedy-pairing "keep moving forward to find the last
|
||||||
|
pre-hint" in b81). ✓
|
||||||
|
|
||||||
|
### Phase 4 — Security Quick-Scan (cumulative)
|
||||||
|
|
||||||
|
* No SQL, no `shell=True`, no `eval`/`exec`. ✓
|
||||||
|
* No hardcoded secrets/API keys. ✓
|
||||||
|
* `subprocess.run` in `builder.py`'s `run_gps_denied_replay` uses a
|
||||||
|
list-form `cmd`, not shell-interpolation. ✓
|
||||||
|
* Tlog/FDR record parsing wraps `int()` / `float()` calls with
|
||||||
|
ValueError re-raise messaging on the offending payload (b81's
|
||||||
|
`parse_reloc_payload`, b80's `ImuCsvTlog`). ✓
|
||||||
|
|
||||||
|
### Phase 5 — Performance (cumulative)
|
||||||
|
|
||||||
|
* `build_fixtures` is bound by subprocess (`run_gps_denied_replay`)
|
||||||
|
wall-clock, not in-process compute. Per-strategy materialisation
|
||||||
|
is O(n) over input rows/frames. ✓
|
||||||
|
* `gcs_telemetry_evaluator` is all O(n) over messages with a single
|
||||||
|
O(n log n) ack sort. ✓
|
||||||
|
* No new blocking I/O introduced. No new `time.sleep` polling loops. ✓
|
||||||
|
|
||||||
|
### Phase 6 — Cross-Task Consistency
|
||||||
|
|
||||||
|
* **Interface compatibility across batches**: b80 removed
|
||||||
|
`_common.py`; b81 doesn't reference it. The only b81-side import
|
||||||
|
from the fixture-builder area is the implicit fixture file path
|
||||||
|
pattern, which is unchanged. No drift. ✓
|
||||||
|
* **Duplicate symbols**: `compute_gcs_summary_rate` (b81) and
|
||||||
|
`compute_gps_input_rate` (pre-existing
|
||||||
|
`ap_contract_evaluator`) compute the same `(N-1)/window`
|
||||||
|
rate over different message types — same algorithm, different
|
||||||
|
AC. The fact they're separate functions is correct (they have
|
||||||
|
different domain semantics). If a third "rate over message
|
||||||
|
type X" appears, then a generic helper would be warranted. ✓
|
||||||
|
* **Shared logging / config**: no batch introduced a new logging or
|
||||||
|
config loader; the existing `runner.helpers.replay_mode` env-var
|
||||||
|
helpers are reused. ✓
|
||||||
|
|
||||||
|
### Phase 7 — Architecture Compliance
|
||||||
|
|
||||||
|
`_docs/02_document/module-layout.md` declares Blackbox Tests as the
|
||||||
|
sole owner of `e2e/**`.
|
||||||
|
|
||||||
|
1. **Layer direction**: all imports across the 23 changed files
|
||||||
|
resolve within `e2e/` (`runner.helpers.*`, `e2e.fixtures.*`,
|
||||||
|
stdlib, pytest, mavlink/pymavlink, cv2 in the b80 builder). No
|
||||||
|
imports of `src/gps_denied_onboard.*`. ✓
|
||||||
|
2. **Public API respect**: all cross-helper imports go through
|
||||||
|
module-level public names; no `_private` symbol crossed a module
|
||||||
|
boundary. ✓
|
||||||
|
3. **No new cyclic deps**: b80's `builder.py` is a leaf consumed by
|
||||||
|
`build_p01_fixtures.py` and `build_p02_fixtures.py`; b81's
|
||||||
|
`gcs_telemetry_evaluator.py` is a leaf consumed by the two
|
||||||
|
scenario tests. No back-edges. ✓
|
||||||
|
4. **No duplicate symbols across components**: there is only one
|
||||||
|
component touched (Blackbox Tests), so this check is vacuous
|
||||||
|
except internally — covered by Phase 6.
|
||||||
|
5. **Cross-cutting concerns**: haversine math (b81) is local. The
|
||||||
|
project does not yet have a shared geo-math module; if a second
|
||||||
|
consumer appears (e.g., spoof-distance check in FT-N-04), promote
|
||||||
|
to `runner/helpers/geo_math.py`. Tracked here, not flagged. ✓
|
||||||
|
|
||||||
|
## Findings
|
||||||
|
|
||||||
|
(None new at cumulative scope.)
|
||||||
|
|
||||||
|
The two Low/Note findings from `batch_81_review.md`
|
||||||
|
(`HintAckReport.passes` empty-hints semantic; FDR record loop branch
|
||||||
|
ordering) remain as documented; both are scoped to b81 alone and
|
||||||
|
neither is exacerbated by viewing b79/b80/b81 jointly. Carry over
|
||||||
|
to the next cumulative cycle if a future batch touches the same
|
||||||
|
files.
|
||||||
|
|
||||||
|
## Regression Gate (cumulative)
|
||||||
|
|
||||||
|
Full `e2e/_unit_tests/` suite: **746 passing in 147.57 s**.
|
||||||
|
|
||||||
|
Baseline history across the arc:
|
||||||
|
|
||||||
|
| Cumulative window | Suite count | Δ |
|
||||||
|
|-------------------|-------------|---|
|
||||||
|
| Pre-batch-79 | ~625 | |
|
||||||
|
| Post-batch-79 | 686 | +61 (FT-P-02 builder scenario tests) |
|
||||||
|
| Post-batch-80 | 700 | +14 (strategy reorganisation; finer-grained per-strategy tests) |
|
||||||
|
| Post-batch-81 | 746 | +46 (FT-P-12/13 unit + scenario fixtures) |
|
||||||
|
|
||||||
|
No tests removed without an equivalent replacement
|
||||||
|
(`test_common_module_exports_used_by_b01` retired in b80 because
|
||||||
|
the "shared helper" pattern it pinned is now structural rather than
|
||||||
|
file-location-based). No flakes observed across the three full-suite
|
||||||
|
runs that closed each batch.
|
||||||
@@ -6,15 +6,15 @@ step: 10
|
|||||||
name: Implement Tests
|
name: Implement Tests
|
||||||
status: in_progress
|
status: in_progress
|
||||||
sub_step:
|
sub_step:
|
||||||
phase: 11
|
phase: 0
|
||||||
name: commit-batch
|
name: awaiting-invocation
|
||||||
detail: "batch 81"
|
detail: ""
|
||||||
retry_count: 0
|
retry_count: 0
|
||||||
cycle: 1
|
cycle: 1
|
||||||
tracker: jira
|
tracker: jira
|
||||||
last_completed_batch: 80
|
last_completed_batch: 81
|
||||||
last_cumulative_review: batches_76-78
|
last_cumulative_review: batches_79-81
|
||||||
current_batch: 81
|
current_batch: 82
|
||||||
|
|
||||||
last_step_outcomes:
|
last_step_outcomes:
|
||||||
step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"
|
step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"
|
||||||
|
|||||||
Reference in New Issue
Block a user