mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 05:41:13 +00:00
[autodev] Cumulative review 88-92 + canonical 85-87 path
Catches up implement skill Step 14.5 cadence (K=3 missed since batches 82-84): one review covering the 88-92 window after the previous session backfilled the missing 85-87 review at the wrong path. Renames reviews/cumulative_review_batches_85_87.md to the canonical cumulative_review_batches_85-87_cycle1_report.md so the implement skill's resumability detects it. Cumulative review 88-92 verdict: PASS_WITH_WARNINGS. - CR-F1/F2 carry-overs from 85-87 escalated (write_csv_evidence + _resolve_fixture_path duplication now in 17 files each). - CR-F3 process: batch_90/91_review.md missing on disk; batches' inline self-reviews substitute. - Phase 7 architecture clean: airborne_bootstrap.py imports all Layer-5 sibling or lower, no new cycles, public APIs respected. State: still Step 7 (Implement) sub_step 16 batch-loop. Next: batch 93 = AZ-622 (Phase D, 3cp) — fresh session recommended. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,136 @@
|
||||
# Cumulative Code Review — Batches 88-92 (Cycle 1)
|
||||
|
||||
**Window**: batches 88, 89, 90, 91, 92
|
||||
**Tasks covered**: AZ-440, AZ-441, AZ-442, AZ-443, AZ-446, AZ-619, AZ-620, AZ-621 (8 tasks, 19 complexity points)
|
||||
**Date**: 2026-05-19
|
||||
**Verdict**: PASS_WITH_WARNINGS — proceed; carry-over hygiene PBIs (CR-F1/F2 escalated) and one process gap
|
||||
**Reviewer**: autodev / implement skill Step 14.5
|
||||
**Last cumulative review**: batches 85-87 (`_docs/03_implementation/cumulative_review_batches_85-87_cycle1_report.md`)
|
||||
|
||||
## Scope
|
||||
|
||||
Union of files changed since the previous cumulative review window (commits `f25cae4..680ba29`, 129 files / +16,564 / -87 LOC). This review focuses on the production-code window (90-92) and the test-implementation tail (88-89).
|
||||
|
||||
| Batch | Theme | Tasks | New helpers | New scenarios | New unit tests | Production code |
|
||||
|-------|-------|-------|-------------|---------------|----------------|-----------------|
|
||||
| 88 | Resource limits (NFT-LIM) | AZ-440..AZ-443 (9 cp) | 4 (memory_budget, fdr_size, storage_budget, thermal_envelope) | 4 (NFT-LIM-01..04) + 1 fixture | 4 helper-unit + 1 layout | — |
|
||||
| 89 | NFR reporting refinement | AZ-446 (2 cp) | — (modifies `nfr_recorder.py`) | — (extends 2 existing) | 6 new in `test_nfr_recorder.py` | — |
|
||||
| 90 | Bootstrap Phase A | AZ-619 (2 cp) | — | — | 5 in `test_az619_*.py` | `airborne_bootstrap.py` — `build_pre_constructed`, `AIRBORNE_MAIN_PRODUCER_ID`, `_build_c13_fdr`, `_build_clock` |
|
||||
| 91 | Bootstrap Phase B | AZ-620 (3 cp) | — | — | 3 in `test_az620_*.py` (+ AZ-619 file extended with autouse stub) | `airborne_bootstrap.py` — `FAISS_BUILD_FLAG`, `_consumers_of_pre_constructed_key`, `_configured_consumers_of_pre_constructed_key`, `_build_c6_descriptor_index`, `_build_c6_tile_store` |
|
||||
| 92 | Bootstrap Phase C | AZ-621 (3 cp) | — | — | 3 in `test_az621_*.py` (+ AZ-619 and AZ-620 files extended with autouse stub) | `airborne_bootstrap.py` — `C7_AIRBORNE_BUILD_FLAGS`, `_build_c7_inference` |
|
||||
|
||||
The window crosses one structural boundary: batches 88–89 close out the **test-implementation** epoch (Step 10), batches 90–92 are the **post-rewind product-implementation cycle 1** (Step 7 after the Jetson e2e gate rewound the flow per `_docs/LESSONS.md` 2026-05-18). The bootstrap surface (`src/gps_denied_onboard/runtime_root/airborne_bootstrap.py`) is the only production file touched in the window.
|
||||
|
||||
## Cross-batch consistency (Phase 6)
|
||||
|
||||
### Test-implementation tail (88-89)
|
||||
|
||||
PASS. Batch 88's four NFT-LIM evaluator helpers follow the same 7-element shape established by batches 85-87 (frozen dataclasses, single `evaluate()` entry, `write_csv_evidence()` writer, `Sequence` parameter typing, module-docstring discipline). Batch 88's four scenario files follow the same 7-step shape (tier-skip → sitl_replay_ready-skip → `_resolve_fixture_path()` → typed parse → evaluator → CSV evidence → AC assertions). Batch 89 (AZ-446) is a purely additive API change to `nfr_recorder.py`: `record_metric()` gained kw-only `band`, `ci95_low`, `ci95_high` with default `None`, and a new `report.csv` emission via `pytest_sessionfinish` — existing call sites unaffected.
|
||||
|
||||
### Product-implementation cycle 1 start (90-92)
|
||||
|
||||
PASS — the three phases of the AZ-618 umbrella exhibit a deliberate, repeated pattern that is itself the cumulative-review story:
|
||||
|
||||
1. **Phase A (AZ-619)** introduces `build_pre_constructed(config) -> dict[str, Any]` with two foundational keys (`c13_fdr`, `clock`).
|
||||
2. **Phase B (AZ-620)** adds two new keys (`c6_descriptor_index`, `c6_tile_store`) and the first wrapped-factory error-translation: `RuntimeNotAvailableError → AirborneBootstrapError` naming the missing key + the gating `BUILD_FAISS_INDEX` flag + the configured consumer slug(s). Phase B introduces the reusable helpers `_consumers_of_pre_constructed_key()` and `_configured_consumers_of_pre_constructed_key()`.
|
||||
3. **Phase C (AZ-621)** adds one new key (`c7_inference`) by reusing Phase B's helpers and extending the build-flag surfacing to two flags (`C7_AIRBORNE_BUILD_FLAGS` as a `tuple[tuple[str, str], ...]`).
|
||||
|
||||
The pattern is converging — Phase C added zero new infrastructure helpers and only one new constant. The reusable helpers placed in Phase B (`_configured_consumers_of_pre_constructed_key`) are exactly the right shape for Phase D/E/F to extend without duplication. **This is the strongest signal of design quality in the window.**
|
||||
|
||||
#### Minor naming asymmetry (informational, no finding)
|
||||
|
||||
`FAISS_BUILD_FLAG: Final[str]` (Phase B) vs. `C7_AIRBORNE_BUILD_FLAGS: Final[tuple[tuple[str, str], ...]]` (Phase C). Phase B has one flag; Phase C has two. The container types differ because the cardinalities differ. Not a finding — but if a future phase D/E/F introduces a third multi-flag constant, the project should consider a single uniform shape, e.g. `dict[str, tuple[(str, str), ...]]` keyed by infrastructure key, to eliminate the `_BUILD_FLAG` / `_BUILD_FLAGS` reader load.
|
||||
|
||||
#### Test-isolation pattern (informational, no finding)
|
||||
|
||||
Each phase's test file adds an **autouse stub fixture** for the next phase's builders so existing AC assertions remain valid as the bootstrap grows. AZ-619's `test_phase_a_only_seeds_two_keys` was relaxed to `test_phase_a_keys_remain_present_under_az620_additivity` (subset equality), and AZ-619/AZ-620 test files were extended with `_stub_c7_inference_builder` in batch 92. This is the right pattern for an additive umbrella — keeps each AC test focused on the keys it owns without coupling to later-phase env requirements (TensorRT, FAISS, Postgres).
|
||||
|
||||
## Cross-batch findings
|
||||
|
||||
### CR-F1 — `write_csv_evidence` duplication continues to scale (Medium / Maintainability — ESCALATED)
|
||||
|
||||
**Carry-over** from `cumulative_review_batches_85-87_cycle1_report.md` CR-F1.
|
||||
|
||||
Status update: the duplication is now spread across **17 evaluator helpers** (13 from batches 85-87 + 4 NFT-LIM helpers added by batch 88). Batch 89 (AZ-446) was speculatively expected to absorb this in `nfr_recorder.py` — but as captured in `_docs/03_implementation/reviews/batch_89_review.md` F2, AZ-446's spec is reporting-only (band/CI columns on per-metric `report.csv`), and the carry-over duplication lives in evaluator helpers + scenario boilerplate outside the AZ-446 envelope.
|
||||
|
||||
**Severity escalation rationale**: still Medium, but the carry-count went from 13 → 17 in one window. Every new NFT helper will reproduce the same ~30-line CSV-emit block. If batches 93-95 (AZ-622/623/624) introduce any new evaluator helpers (they should not — the umbrella is bootstrap wiring, not new NFT scenarios), the duplication will grow again.
|
||||
|
||||
**Proposed PBI**: hygiene — **3 points**. Add `e2e/runner/helpers/csv_evidence_writer.py` exposing `write_csv_evidence(out_path: Path, header: Sequence[str], rows: Sequence[Sequence[Any]], footer: Sequence[Sequence[Any]] | None = None) -> Path` and migrate the 17 helpers incrementally. Old per-helper API may co-exist during migration. Estimate: 17 helper files + 1 new helper + 1 unit test file.
|
||||
|
||||
### CR-F2 — `_resolve_fixture_path` duplication continues to scale (Medium / Maintainability — ESCALATED)
|
||||
|
||||
**Carry-over** from `cumulative_review_batches_85-87_cycle1_report.md` CR-F2.
|
||||
|
||||
Status update: same trajectory as CR-F1 — now in **17 scenario files** (13 from 85-87 + 4 NFT-LIM scenarios from batch 88). Every scenario defines an identical `_resolve_fixture_path() -> Path` differing only in env-var name + default filename + a `sitl_observer.replay_dir()` fallback.
|
||||
|
||||
**Severity escalation rationale**: as CR-F1.
|
||||
|
||||
**Proposed PBI**: hygiene — **2 points**. Add `e2e/runner/helpers/fixture_path.py` exposing `resolve(env_var_name: str, default_filename: str, sitl_observer_fallback: Callable[[], Path] | None = None) -> Path`. Pure refactor.
|
||||
|
||||
### CR-F3 — Per-batch review reports for batches 90 and 91 are missing on disk (Low / Process)
|
||||
|
||||
`_docs/03_implementation/reviews/` contains `batch_88_review.md`, `batch_89_review.md`, `batch_92_review.md` — but **NO `batch_90_review.md` or `batch_91_review.md`**. The batch reports for 90 and 91 say "self-reviewed inline" but the implement skill's Step 9 + Step 10 expect a canonical per-batch review artifact under `reviews/batch_NN_review.md`.
|
||||
|
||||
Likely cause: the previous session ran the implementation work and the inline self-review, but did not save the review report to disk. Combined with the batch 90 report being **a retroactive backfill** (its own header states this), this points to a process gap where the implement skill's writeback was partially skipped during the AZ-619/AZ-620 batches.
|
||||
|
||||
**Severity**: Low — the work itself is sound (verdicts in the batch reports are PASS with passing tests). But absent review artifacts complicate future cumulative reviews and the eventual product-implementation completeness gate. **No remediation expected for these specific batches** (the review can be reconstructed from batch reports + commit diffs at any point); the process correction is for future batches to land their review reports as part of the same commit as the code.
|
||||
|
||||
### CR-F4 — Outstanding hygiene PBIs from prior cumulative review still open (Info)
|
||||
|
||||
For tracking:
|
||||
|
||||
| PBI | Source | Estimate | Status |
|
||||
|-----|--------|----------|--------|
|
||||
| Shared `csv_evidence_writer` helper | CR-F1 (85-87) → escalated here | 3 cp | OPEN — escalated |
|
||||
| Shared `fixture_path.resolve` helper | CR-F2 (85-87) → escalated here | 2 cp | OPEN — escalated |
|
||||
| AZ-595 fixture builder materialization | CR-F3 (85-87) | 5 cp | OPEN — blocking 12 scenarios + 4 NFT-LIM scenarios |
|
||||
| `dns-blackhole` sidecar | CR-F4 (85-87) | 3 cp | OPEN — blocks NFT-SEC-05 live capture |
|
||||
|
||||
The AZ-595 fixture-builder backlog now has **17 dependent scenarios** (13 from prior window + 4 NFT-LIM added in batch 88). The next product cycle should sequence AZ-595 ahead of any retry of the Step 11 Run-Tests gate — without those fixtures, the Tier-1 docker harness will continue to `sitl_replay_ready`-skip 17+ scenarios.
|
||||
|
||||
### CR-F5 — `airborne_bootstrap.py` size + responsibility check (Info)
|
||||
|
||||
After Phase C, `airborne_bootstrap.py` is 637 lines: registration wiring (lines 199-420), `pre_constructed` builders (lines 423-587), and `register_airborne_strategies()` (lines 590-636). Phase D will add `c3_lightglue_runtime` + `c3_feature_extractor` builders; Phase E adds `c282_ransac_filter` + c5 helpers; Phase F wires `main()`. Projected size after Phase F: ~900-1000 lines.
|
||||
|
||||
The Single-Responsibility check passes today — the file's single responsibility is "airborne-binary composition: register strategies + build infrastructure dict". Phase D/E will add two more builders following the established pattern; this stays within scope. **Phase F (`main()` integration) is the size risk**: if it embeds CLI parsing + logging setup + `register_airborne_strategies()` + `build_pre_constructed()` + `compose_root()` orchestration, the file may exceed a comfortable single-file ceiling. **No finding today**; flag for the AZ-624 task spec / batch 95 review to evaluate whether `main()` belongs in a separate `airborne_main.py` adjacent to `airborne_bootstrap.py`.
|
||||
|
||||
### CR-F6 — All test runs in the window are green (Info)
|
||||
|
||||
| Batch | Test signal |
|
||||
|-------|-------------|
|
||||
| 88 | 207 helper unit tests pass in 0.36 s; full e2e unit suite **1223 passed in 154 s** |
|
||||
| 89 | +6 tests, full e2e unit suite **1229 passed in 134 s** |
|
||||
| 90 | 5 AZ-619 tests + AZ-591 regression (7/7) — **12/12 passed** |
|
||||
| 91 | 15/15 `tests/unit/runtime_root/` passed in 1.06 s; AZ-591 + AZ-619 + AZ-620 all green |
|
||||
| 92 | 18/18 `tests/unit/runtime_root/` passed in 0.68 s; AZ-591 (7/7) + AZ-619 (5/5) + AZ-620 (3/3) + AZ-621 (3/3) all green |
|
||||
|
||||
No new ruff errors on touched files across the window. Batch 91 also opportunistically cleaned up 12 pre-existing UP037 quoted-annotation lints in `airborne_bootstrap.py` (covered by `from __future__ import annotations`).
|
||||
|
||||
## Architecture Compliance (Phase 7)
|
||||
|
||||
`_docs/02_document/architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted.
|
||||
|
||||
Per-batch Phase-7 results (verified against this review's scope):
|
||||
|
||||
- **Batch 88 (e2e/** changes)**: all new files under `e2e/runner/helpers/` + `e2e/tests/resource_limit/` + `e2e/_unit_tests/helpers/` — owned by `blackbox_tests` (`Owns: e2e/**`). No `src/gps_denied_onboard` imports. No new cycles. F3 in batch 88 was an ownership drift in the **task spec** (suggested `tests/fixtures/...`); the implementation correctly moved the fixture to `e2e/fixtures/jetson/`.
|
||||
- **Batch 89 (`nfr_recorder.py` additive change)**: purely additive kw-only parameters + new public method. No new cross-component imports.
|
||||
- **Batch 90 (`airborne_bootstrap.py` Phase A)**: imports `runtime_root.fdr_client.client.make_fdr_client`, `clock.wall_clock.WallClock`, `runtime_root.register_strategy`. All Layer-5 sibling or lower. No cycles.
|
||||
- **Batch 91 (`airborne_bootstrap.py` Phase B)**: adds imports from `runtime_root.storage_factory` (Layer-5 sibling) and `runtime_root.errors`. All allowed.
|
||||
- **Batch 92 (`airborne_bootstrap.py` Phase C)**: adds import from `runtime_root.inference_factory` (Layer-5 sibling). All allowed. Per `_docs/03_implementation/reviews/batch_92_review.md`: imported symbol `build_inference_runtime` is in `inference_factory.__all__` — Public API respected.
|
||||
|
||||
**Cross-batch duplicate-symbol scan**: no class/function/constant introduced in this window collides with another component's symbol. The closest thing to a duplicate is the per-helper `write_csv_evidence` functions (CR-F1), but those are intentional pattern repetition within one component (`blackbox_tests`), not cross-component duplication.
|
||||
|
||||
**No new cyclic dependencies** introduced. The bootstrap module's import graph remains: `airborne_bootstrap → {clock, fdr_client, runtime_root.{storage_factory, inference_factory, errors, register_strategy}, runtime_root.{matcher,pose,refiner,rerank,state,vio,vpr}_factory}`. No back-edges.
|
||||
|
||||
## Final verdict
|
||||
|
||||
**PASS_WITH_WARNINGS** — proceed to batch 93.
|
||||
|
||||
Gate logic per code-review skill: 0 Critical, 0 High, 2 Medium (CR-F1 + CR-F2 carry-overs, both escalated), 1 Low (CR-F3 process), 3 Info — verdict is PASS_WITH_WARNINGS. Per implement skill Step 14.5 gate: PASS / PASS_WITH_WARNINGS → continue to next batch (Step 14 loop).
|
||||
|
||||
## Next-step recommendations
|
||||
|
||||
1. **Open hygiene PBI for CR-F1 + CR-F2** — combined 5 cp (or two separate 3+2 cp PBIs). Should land **before** Phase D/E/F if any new NFT helpers are introduced; otherwise can defer to after AZ-624.
|
||||
2. **Batch 93** = AZ-622 (Phase D, 3 cp). Single-task batch maintains the per-phase review focus that has worked through Phases A/B/C. Alternative: AZ-622 + AZ-623 (combined 6 cp) is acceptable — both extend the same bootstrap surface and the duplicate-symbol risk is low given the convergent pattern.
|
||||
3. **Do NOT combine all three remaining phases (D+E+F = 9 cp) into one batch** — AZ-624 (Phase F) wires `main()` and triggers the AC-1..AC-5 umbrella verification + Mandatory-Tier-2 Jetson run. Keeping AZ-624 as its own batch preserves a clean rollback boundary for the Jetson gate.
|
||||
4. **Process correction (CR-F3)**: future batches in this implementation arc should land `batch_NN_review.md` to `_docs/03_implementation/reviews/` as part of the same commit as the code, not as a self-review noted only in the batch report.
|
||||
@@ -8,7 +8,7 @@ status: in_progress
|
||||
sub_step:
|
||||
phase: 16
|
||||
name: batch-loop
|
||||
detail: "batch 92 done (AZ-621 Phase C). Next: cumulative review for batches 85-92 (8 batches overdue, K=3 default missed), THEN batch 93 = AZ-622 (Phase D, 3pt) or AZ-622+AZ-623 (6pt). Fresh session recommended."
|
||||
detail: "cumulative review 88-92 PASS_WITH_WARNINGS. Next: batch 93 = AZ-622 (Phase D, 3cp)."
|
||||
retry_count: 0
|
||||
cycle: 1
|
||||
tracker: jira
|
||||
|
||||
Reference in New Issue
Block a user