[autodev] Cumulative review 88-92 + canonical 85-87 path

Catches up the implement skill's Step 14.5 cadence (K=3), missed
since batches 82-84, with one review covering the 88-92 window.
The previous session backfilled the missing 85-87 review at the
wrong path, so this commit also renames
reviews/cumulative_review_batches_85_87.md to the canonical
cumulative_review_batches_85-87_cycle1_report.md so that the
implement skill's resumability check detects it.

Cumulative review 88-92 verdict: PASS_WITH_WARNINGS.
- CR-F1/F2 carry-overs from 85-87 escalated (write_csv_evidence +
  _resolve_fixture_path duplication now in 17 files each).
- CR-F3 process: batch_90/91_review.md missing on disk; batches'
  inline self-reviews substitute.
- Phase 7 architecture clean: every airborne_bootstrap.py import is
  a Layer-5 sibling or lower; no new cycles; public APIs respected.

State: still Step 7 (Implement) sub_step 16 batch-loop. Next: batch
93 = AZ-622 (Phase D, 3cp) — fresh session recommended.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-19 08:30:08 +03:00
parent 680ba29ae6
commit eaf2f47f69
3 changed files with 137 additions and 1 deletions
@@ -1,85 +0,0 @@
# Cumulative Code Review — Batches 85-87
**Window**: batches 85 (AZ-428..AZ-431 NFT-PERF), 86 (AZ-432..AZ-435 NFT-RES), 87 (AZ-436..AZ-439 NFT-SEC)
**Total tasks**: 12 (4 + 4 + 4)
**Total complexity**: 16 + 14 + 16 = 46 points
**Per-batch verdicts**: PASS_WITH_WARNINGS / PASS_WITH_WARNINGS / PASS_WITH_WARNINGS
**Cumulative verdict**: PASS_WITH_WARNINGS — proceed; promote hygiene findings to PBIs
**Reviewer**: autodev / `implement` skill phase 8.5
**Date**: 2026-05-17
## What this window delivered
The complete NFT slice of the E-BBT (AZ-262) epic — 12 helpers + ~74 unit
test files + ~13 scenario files implementing every Performance, Resilience,
and Security NFT in the traceability matrix:
| Batch | Theme | Scenarios | Helpers | Net Unit Tests |
|-------|-------|-----------|---------|---------------|
| 85 | Performance | 4 (e2e_latency, streaming, ttff, spoof_promotion) | 4 | ~50 |
| 86 | Resilience | 4 (imu_fallback, companion_reboot, monte_carlo, escalation_ladder) | 4 | 74 |
| 87 | Security | 6 (cache_poisoning, no_egress, dns_blackhole, mavlink_signing, opencv_cve_probe, asan_fuzz) | 5 | 75 |
All 12 scenarios are fixture-consumers that skip cleanly without the SITL
replay fixture (AZ-595) being present.
## Cross-batch consistency
PASS. Every scenario in this window adopts the same 7-step shape:
1. tier/parameterization skip (where AC permits);
2. `sitl_replay_ready` skip with explicit pointer to the matching unit-test file;
3. fixture-path resolution via `_resolve_fixture_path()` helper;
4. fixture-not-found → `pytest.fail` with explicit AZ-595 production-dep pointer;
5. payload parse → typed records with shape-error `pytest.fail`;
6. evaluator call → CSV evidence + NFR records;
7. AC assertions with diagnostic messages naming the AC.
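The seven steps above can be sketched as a single scenario body. This is a minimal illustration under stated assumptions, not code from the repository: the env-var name, default filename, `LatencySample` record, `budget_ms` threshold, and the `AC-1` label are all hypothetical, and the tier skip (step 1) and CSV/NFR output (step 6) are left out for brevity.

```python
import json
import os
from dataclasses import dataclass
from pathlib import Path

import pytest

# Hypothetical names: neither the env var nor the filename comes from the repo.
FIXTURE_ENV = "NFT_EXAMPLE_FIXTURE"
DEFAULT_FIXTURE = "nft_example_replay.json"


@dataclass(frozen=True)
class LatencySample:
    frame_id: int
    latency_ms: float


def _resolve_fixture_path() -> Path:
    # Step 3: env-var override, else the default filename.
    return Path(os.environ.get(FIXTURE_ENV, DEFAULT_FIXTURE))


def run_scenario(sitl_replay_ready: bool, budget_ms: float = 400.0) -> float:
    # Step 2: skip cleanly when the SITL replay fixture is not provisioned.
    if not sitl_replay_ready:
        pytest.skip("SITL replay not ready; see the matching unit-test file")

    # Steps 3-4: resolve the path and fail loudly if the file is absent.
    path = _resolve_fixture_path()
    if not path.is_file():
        pytest.fail(f"fixture {path} not found; production dependency: AZ-595")

    # Step 5: parse into typed records; shape errors become test failures.
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
        samples = [
            LatencySample(int(s["frame_id"]), float(s["latency_ms"]))
            for s in payload["samples"]
        ]
    except (KeyError, TypeError, ValueError) as exc:
        pytest.fail(f"fixture {path} has an unexpected shape: {exc!r}")

    # Step 6: evaluator call (reduced here to a single max).
    worst = max((s.latency_ms for s in samples), default=0.0)

    # Step 7: AC assertion with a diagnostic message naming the AC.
    assert worst <= budget_ms, (
        f"AC-1 violated: worst latency {worst} ms exceeds {budget_ms} ms"
    )
    return worst
```

In a real scenario `sitl_replay_ready` would presumably arrive as a pytest fixture rather than a plain argument; it is a parameter here only so the sketch can be exercised directly.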
Every helper in this window adopts the same shape:
- frozen dataclasses for ALL records / reports;
- one `evaluate()` (or `evaluate_subcase` + `evaluate`) entry point;
- one `write_csv_evidence(out_path, report) -> Path` writer;
- `Sequence` parameter typing (Liskov-substitutable input collections);
- module docstring declaring public-boundary discipline.
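A helper following the shape listed above might look like the sketch below. The `Sample` and `Report` types, their fields, and the footer rows are hypothetical; only the `evaluate` / `write_csv_evidence(out_path, report) -> Path` entry points, the frozen dataclasses, and the `Sequence` typing are taken from the list.

```python
"""Example evaluator. Public boundary: Sample, Report, evaluate, write_csv_evidence."""
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Sample:
    name: str
    value_ms: float


@dataclass(frozen=True)
class Report:
    samples: Tuple[Sample, ...]
    worst_ms: float
    passed: bool


def evaluate(samples: Sequence[Sample], budget_ms: float) -> Report:
    # Accepting Sequence lets callers pass a list or a tuple interchangeably.
    worst = max((s.value_ms for s in samples), default=0.0)
    return Report(tuple(samples), worst, worst <= budget_ms)


def write_csv_evidence(out_path: Path, report: Report) -> Path:
    # Header row, one row per sample, then summary rows (hypothetical layout).
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["name", "value_ms"])
        for sample in report.samples:
            writer.writerow([sample.name, sample.value_ms])
        writer.writerow(["# worst_ms", report.worst_ms])
        writer.writerow(["# passed", report.passed])
    return out_path
```

Freezing the records means a report cannot be mutated between evaluation and evidence writing, so the CSV reflects what `evaluate()` returned.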
Cross-helper consistency is the strongest signal of design quality in this
window: anyone adding a future helper should be able to copy a
batch-85/86/87 evaluator and stay structurally on-pattern.
## Cross-batch findings
### CR-F1 — `write_csv_evidence` duplication continues to scale (Medium / Maintainability)
What started as a per-batch Low finding (batch-85 F4, batch-86 F1, batch-87 F1) is now spread across **13 helpers**. The duplication is no longer marginal; the per-evaluator schema variation makes a fully generic abstraction non-trivial, but a thin `csv_evidence_writer.py` helper offering `write_header_and_rows(out_path, header, rows, footer=None)` could remove ~30 lines per evaluator.
**Proposed PBI**: `AZ-???` (post-cycle hygiene) — 3 points. Replace per-evaluator CSV-writer boilerplate with shared helper. Scope: 13 evaluator files + 1 new helper + 1 unit test file. Migrates incrementally — old API can co-exist during migration.
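One possible shape for the proposed shared writer is sketched below. Only the name and parameter list `write_header_and_rows(out_path, header, rows, footer=None)` come from the finding; the `Path` return value, the single-row footer, and the row-width check are assumptions made for illustration.

```python
import csv
from pathlib import Path
from typing import Iterable, Optional, Sequence


def write_header_and_rows(
    out_path: Path,
    header: Sequence[str],
    rows: Iterable[Sequence[object]],
    footer: Optional[Sequence[object]] = None,
) -> Path:
    # Materialize and validate first so a bad row never leaves a partial file.
    materialized = [list(row) for row in rows]
    for index, row in enumerate(materialized):
        if len(row) != len(header):
            raise ValueError(
                f"row {index} has {len(row)} cells, expected {len(header)}"
            )

    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(materialized)
        if footer is not None:
            # Assumed: the footer is one free-form summary row.
            writer.writerow(footer)
    return out_path
```

Each evaluator would keep its own `write_csv_evidence(out_path, report)` as a thin wrapper that maps its report to `header`, `rows`, and `footer`, which is what would let the old and new APIs co-exist during migration.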
### CR-F2 — `_resolve_fixture_path` duplicated across 13 scenarios (Medium / Maintainability)
Carry-over of batch-85 F3, batch-86 F4, batch-87 F4. Every scenario in this window defines an identical `_resolve_fixture_path() -> Path` differing only in env-var name + default filename.
**Proposed PBI**: `AZ-???` (post-cycle hygiene) — 2 points. Add `runner.helpers.fixture_path.resolve(env_var_name, default_filename) -> Path` shared helper. Scope: 13 scenarios + 1 new helper + 1 unit test file. Pure refactor.
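A minimal sketch of the proposed `resolve(env_var_name, default_filename) -> Path` helper follows. The keyword-only `base_dir` and `env` parameters, and the rule that an empty env var falls back to the default, are assumptions added so the function can be tested without touching the real environment.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve(
    env_var_name: str,
    default_filename: str,
    *,
    base_dir: Optional[Path] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    # Assumed precedence: a non-empty env var wins, else base_dir / default.
    source = os.environ if env is None else env
    override = source.get(env_var_name, "")
    if override:
        return Path(override)
    root = Path.cwd() if base_dir is None else base_dir
    return root / default_filename
```

A scenario's private `_resolve_fixture_path()` would then shrink to a one-line call passing its own env-var name and default filename.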
### CR-F3 — Production dependency on AZ-595 fixture builder is concentrated (Low / Spec-Gap surfacing)
All 12 scenarios in this window declare a production dependency on the AZ-595 fixture builder emitting their respective replay JSON files. AZ-595 itself doesn't exist as a tracked task in the dependencies table (it's referenced in 12 scenario docstrings but has no work-item entry).
**Action**: a single new task `AZ-???` should be created to materialize the 13 fixture-JSON contracts (NFT-PERF-01..04 + NFT-RES-01..04 + NFT-SEC-01..05) into a fixture-builder module under `e2e/fixtures/sitl_replay_builder/`. Complexity estimate: 5 points (touches every fixture builder + adds 13 new JSON schemas).
### CR-F4 — DNS-blackhole sidecar is referenced but not deployed (Low / Infrastructure-Gap)
Batch-87 F2 found that NFT-SEC-05 depends on a DNS-blackhole sidecar configured per `environment.md`, but that sidecar does NOT exist in the e2e Docker compose stack. This is a Tier-1 infrastructure gap that blocks NFT-SEC-05's live-capture path.
**Proposed PBI**: `AZ-???` (e2e infrastructure) — 3 points. Add `dns-blackhole` sidecar service to `e2e/docker/docker-compose.test.yml` per `environment.md`. Scope: 1 new service entry + 1 Dockerfile + healthcheck wiring.
### CR-F5 — Cross-batch test-output gate is healthy
PASS — informational. All 215 batch-87 unit tests + 199 batch-86 unit tests + ~50 batch-85 unit tests collect and pass without errors. The complete `e2e/_unit_tests/` suite (1151 tests, ~138 s wall-clock) runs green from workspace root. The expected 12 pre-existing collection errors when running pytest from inside `e2e/` (vs workspace root) are an unrelated path-resolution quirk and not caused by this window.
## Final verdict
**PASS_WITH_WARNINGS**. Proceed to commit + tracker transition + archive.
The four actionable cross-batch findings above (CR-F1 to CR-F4; CR-F5 is
informational) should be promoted to PBIs after the next batch, or earlier
if the user prioritizes them: the F1 + F2 duplication will keep growing
with every new NFT-LIM helper in batch 88.