Retrospective — 2026-05-20
Cycle-1 retrospective for GPS-Denied Onboard. Cycle 1 spans 2026-05-11 → 2026-05-20 (Problem → Deploy → this retro). This is the first retro for the project, so there is no prior baseline. Generated by /autodev greenfield Step 17 (Retrospective, cycle-end mode).
Implementation Summary
| Metric | Value |
|---|---|
| Total tasks (done) | 165 (product + test + hygiene + refactor) |
| Total tasks (backlog) | 2 (AZ-592, AZ-593 — Tier-2 OKVIS2 / VINS-Mono validation) |
| Total tasks (todo) | 0 |
| Total batches | 97 (cycle 1) |
| Cycle duration | 9 days (2026-05-11 → 2026-05-20) |
| Avg tasks per batch | ≈ 1.7 |
| Estimated total complexity points | ≈ 565 cp (sampled: 80 × 3 cp, 50 × 5 cp, 30 × 2 cp, plus a handful of 4 cp and 0 cp umbrella) |
| Source LoC | 61,071 Python (src/) |
| Components | 15 (C1, C2, C2.5, C3, C3.5, C4, C5, C6, C7, C8, C10, C11, C12, C13 + helpers/runtime_root) |
| Binary tracks | 3 (airborne, research, operator-orchestrator) per ADR-002 + ADR-011 |
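The average and the complexity estimate in the table follow from simple arithmetic. A minimal Python sketch, assuming the "handful" of remaining tasks splits into four 4 cp tasks and one 0 cp umbrella (that split is an illustrative assumption; the table does not state it):

```python
# Worked arithmetic behind the summary table.
# Bucket sizes for 3/5/2 cp come from the table; the 4 cp / 0 cp split
# is an ASSUMPTION chosen so the task count adds up to 165.
TASKS_DONE = 165
BATCHES = 97

buckets = {
    3: 80,  # 80 tasks at 3 cp (from the table)
    5: 50,  # 50 tasks at 5 cp (from the table)
    2: 30,  # 30 tasks at 2 cp (from the table)
    4: 4,   # assumed: "a handful" of 4 cp tasks
    0: 1,   # assumed: one 0 cp umbrella task
}

# Sanity check: the buckets must cover every completed task.
assert sum(buckets.values()) == TASKS_DONE

avg_tasks_per_batch = TASKS_DONE / BATCHES
total_cp = sum(cp * count for cp, count in buckets.items())

print(f"avg tasks per batch ~ {avg_tasks_per_batch:.1f}")
print(f"estimated complexity ~ {total_cp} cp")
```

Under that assumed split the total lands within one point of the ≈ 565 cp estimate.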
Quality Metrics
Product Implementation Completeness Gate (Step 7 → Step 8)
Source: _docs/03_implementation/implementation_completeness_cycle1_report.md (revised 2026-05-19 addendum).
| Verdict | Count | Percentage |
|---|---|---|
| PASS | 114 | 98.3 % |
| BLOCKED-with-named-Tier-2-handle | 4 | 3.4 % (overlaps; 2 share one hardware artifact) |
| FAIL | 0 | 0 % |
The 4 BLOCKED items are: AZ-332 (OKVIS2 production-default VIO native binding → AZ-592 backlog), AZ-333 (VINS-Mono research-only VIO → AZ-593 backlog), AZ-624 AC-5 + AZ-687 AC-687-3 (Jetson Tier-2 evidence file _docs/03_implementation/jetson_runs/2026-05-19_az687_tier2_run.txt pending).
Code Review Verdicts (sampled across batches 50..97)
The verdict field is consistently **Verdict**: in batches ≥ 53. Earlier batches (01–22) use a different convention and are captured in the consolidated cumulative_review_batches_01-22_cycle1_report.md.
| Verdict | Approximate count (batches 53..97) | Notes |
|---|---|---|
| PASS | ~24 | Includes single-task batches with inline self-review |
| PASS_WITH_WARNINGS | ~30 | Includes 38 files matching the verdict phrase |
| FAIL | 0 | No batch failed the in-loop review across the full cycle |
| BLOCKED | 0 | At batch level; 4 at task level (see Completeness Gate above) |
Auto-fix loop never had to escalate to user intervention across cycle 1.
Findings by Severity (latest 3 cumulative review windows: 70-72, 82-87, 88-92)
| Severity | Count (rolling 3-window) | Trend |
|---|---|---|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 2 (CR-F1 csv-helper, CR-F2 fixture-path-helper — both escalated and still OPEN at end of cycle 1) | Carried over from window 85-87 → 88-92 |
| Low | ~13 (5 in batch 87, 3 in batch 85, 2 each in batches 83/84, 1 in batch 81) | Trending down across late batches |
| Info | many — process notes, no action | — |
Findings by Category (qualitative, since aggregate metrics across all 97 batches would need per-finding tagging that isn't uniform across early batches)
| Category | Notable patterns | Top files |
|---|---|---|
| Bug | Essentially none post Step 11 (Run Tests) — the iSAM2 ordering issue in AZ-625 was caught + fixed mid-batch | — |
| Spec-Gap | AZ-595 dependency surfaced late (blocks 17 NFT scenarios); AZ-687 replay-mode guard emerged from AZ-624's Jetson run | e2e/_unit_tests/helpers/, e2e/runner/helpers/ |
| Security | OpenCV CVE pin replay condition tracked since 2026-05-11 (gtsam numpy<2 ABI block); per _docs/05_security/dependency_scan.md, re-validated against the current pin with no advisory exposure | pyproject.toml |
| Performance | NFT-PERF Tier-2 baselines pending AZ-595 + AZ-592/AZ-593; cycle-1 Tier-1 probe completed 2026-05-19 | _docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md |
| Maintainability | Two helper-duplication patterns surfaced (CR-F1, CR-F2) — 5 cp combined hygiene PBI proposed | e2e/runner/helpers/csv_evidence_writer, e2e/runner/helpers/fixture_path |
| Style | Minimal: ruff runs pre-commit and in the e2e-runner; batch 91 opportunistically cleaned 12 pre-existing UP037 lints in airborne_bootstrap.py | airborne_bootstrap.py |
| Scope | One mid-cycle scope correction: AZ-589/AZ-590 closed Won't Fix after a post-mortem found the actual gap was cross-cutting (empty _STRATEGY_REGISTRY), not per-strategy C++ wiring. Re-decomposed into AZ-591 + AZ-592/AZ-593 backlog | runtime_root/__init__.py → airborne_bootstrap.py |
Structural Metrics
_docs/02_document/architecture_compliance_baseline.md does not exist — cumulative reviews could not emit ## Baseline Delta sections through cycle 1. This is a known gap (see Improvement Action #3 below).
| Metric | Value | Trend |
|---|---|---|
| Component count | 15 | First-cycle baseline |
| Source LoC (Python, src/) | 61,071 | First-cycle baseline |
| Cycles in component import graph | 0 | Healthy — no back-edges in runtime_root.airborne_bootstrap → {clock, fdr_client, runtime_root.*_factory, runtime_root.{storage,inference,errors,…}} |
| Cross-component edges | Concentrated in runtime_root/ factories (composition root by design) | Healthy: single composition seam |
| Contract files | _docs/02_document/contracts/shared_*/ populated for FDR record schema (v1.3.0), log record schema (v1.0.0), C8 transport, post-landing upload, ingest (D-PROJ-2 placeholder) | First-cycle baseline; coverage % vs public-API symbols not computed (no inventory) |
| Architecture violations net delta | n/a (no baseline) | Recommend establishing the baseline file in cycle 2 Step 6 |
A structural snapshot for future delta comparison is recorded in _docs/06_metrics/structure_2026-05-20.md (next item in this retro).
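The "0 cycles in the component import graph" figure is the kind of claim a depth-first search for back-edges can verify. The sketch below is illustrative only: the module names come from the table, but the edges between runtime_root submodules and the `find_cycle` helper are hypothetical, not the project's actual tooling.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # Back-edge: dep is already on the current path, so we found a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical edge list shaped like the composition root described above.
imports = {
    "runtime_root.airborne_bootstrap": [
        "clock", "fdr_client", "runtime_root.storage",
        "runtime_root.inference", "runtime_root.errors",
    ],
    "runtime_root.storage": ["runtime_root.errors"],
    "runtime_root.inference": ["runtime_root.errors"],
}
acyclic_result = find_cycle(imports)

# Adding a back-edge from errors to the bootstrap module introduces a cycle.
with_back_edge = dict(imports)
with_back_edge["runtime_root.errors"] = ["runtime_root.airborne_bootstrap"]
cycle_result = find_cycle(with_back_edge)

print(acyclic_result)
print(cycle_result)
```

A check like this could feed the cycle count in future structural snapshots, so the metric is reproducible rather than asserted.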
Efficiency
| Metric | Value |
|---|---|
| Blocked tasks (cycle-1 close) | 4 (all Tier-2 hardware / evidence rooted) |
| Tasks requiring fixes after review | ~5 (Medium/Low warnings landed as hygiene PBIs, not as in-cycle fixes) |
| Auto-fix loop escalations to user | 0 |
| Mid-cycle remediation post-mortems | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 1 (Step 11 Run Tests → Step 7 Implement, for AZ-618 cross-cutting umbrella with 12 builder signatures; surfaced as 2026-05-18 lessons entry already in LESSONS.md) |
| Batch with most findings | Batch 87 (PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low) — late-cycle e2e helper polish |
| Out-of-loop process leftover | D-CROSS-CVE-1 OpenCV pin deferred on upstream gtsam numpy<2 ABI; re-checked at every /autodev invocation per leftover protocol |
Blocker Analysis
| Blocker Type | Count | Prevention (carries to cycle 2) |
|---|---|---|
| Tier-2 hardware/evidence (Jetson Orin) | 4 | Allocate Tier-2 Jetson access for AZ-595 + AZ-624 AC-5 re-run + AZ-687 AC-687-3; AZ-592/AZ-593 unblocked only when CI build env + DBoW2 vocab + upstream choice are decided. |
| Upstream library pin (D-CROSS-CVE-1) | 1 | Out-of-band — replay condition is gtsam numpy-2 wheels (or alternate SE(3) backend). Re-check is debounced ≤ 2 h per leftover entry; no action needed in dev cycle. |
| Mid-cycle scope correction | 1 | The fix already informed the 2026-05-18 LESSONS entry. Cross-cutting registry/factory state should be probed before classifying a per-task FAIL. |
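The prevention note for the scope correction amounts to a guard: inspect shared registry state before fanning a failure out into per-task remediation. A minimal sketch, assuming a dict-like registry; `plan_remediation` and the task IDs are hypothetical placeholders:

```python
from typing import Dict, List


def plan_remediation(failed_tasks: List[str], registry: Dict[str, object]) -> List[str]:
    """Propose remediation items for tasks that failed the completeness gate."""
    if failed_tasks and not registry:
        # An empty shared registry means the gap is cross-cutting:
        # one task that populates it covers every dependent failure.
        covered = ", ".join(failed_tasks)
        return [f"cross-cutting: populate registry (covers {covered})"]
    # The registry is populated, so each failure is assumed to be task-local.
    return [f"per-task: remediate {task}" for task in failed_tasks]


empty_registry_plan = plan_remediation(["TASK-A", "TASK-B"], registry={})
populated_plan = plan_remediation(["TASK-A", "TASK-B"], registry={"strategy_x": object()})

print(empty_registry_plan)
print(populated_plan)
```

With an empty registry the two failures collapse into a single cross-cutting item, which is the outcome the AZ-591 re-decomposition reached after the fact.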
Trend Comparison
Previous retrospective: N/A — first retro.
Cycle 1 establishes the baseline. From cycle 2 onward, this section will compare against:
| Metric | Current (cycle-1 baseline) | Target (cycle-2) |
|---|---|---|
| Code-review pass rate (PASS / total) | ≈ 44 % PASS, ≈ 55 % PASS_WITH_WARNINGS, 0 % FAIL | Maintain 0 % FAIL; lift PASS share by landing CR-F1 + CR-F2 hygiene PBIs early |
| Avg findings per batch (Medium + Low) | ~0.2 (≈ 16 total Medium/Low across 97 batches, dominated by helper-duplication carryovers) | ≤ 0.15 |
| Blocked tasks (cycle close) | 4 (all Tier-2 rooted) | ≤ 2; AZ-624 AC-5 + AZ-687 AC-687-3 close once the Tier-2 run lands |
| Mid-cycle remediation post-mortems | 1 | 0 |
| Structural baseline file present | No (gap) | Yes |
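The baseline percentages can be reproduced from the approximate verdict counts reported earlier. A short sketch of that arithmetic, using the ~24 / ~30 / 0 verdict counts and the ≈ 16 Medium/Low findings from this retro (all approximate inputs, so the outputs are approximate too):

```python
import math

# Approximate inputs taken from the tables in this retro.
pass_count = 24        # PASS verdicts (approximate)
warn_count = 30        # PASS_WITH_WARNINGS verdicts (approximate)
fail_count = 0
medium_low_findings = 16
batches = 97

reviewed = pass_count + warn_count + fail_count
pass_share = 100 * pass_count / reviewed
warn_share = 100 * warn_count / reviewed
findings_per_batch = medium_low_findings / batches

# Largest finding count that would still meet the cycle-2 target of <= 0.15
# per batch, if cycle 2 had the same number of batches (an assumption).
max_findings_at_target = math.floor(0.15 * batches)

print(f"PASS share ~ {pass_share:.1f} %")
print(f"PASS_WITH_WARNINGS share ~ {warn_share:.1f} %")
print(f"Medium+Low findings per batch ~ {findings_per_batch:.2f}")
print(f"findings budget at target: {max_findings_at_target}")
```

The results are roughly consistent with the rounded values in the table; the per-batch figure sits nearer 0.17 than 0.2, so the ≤ 0.15 target is a modest reduction.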
Top 3 Improvement Actions
- Land the two open hygiene PBIs (CR-F1 + CR-F2) before any new NFT helper expansion in cycle 2.
  - Impact: drops the only carried-over Medium findings; consolidates csv_evidence_writer + fixture_path.resolve into one canonical helper module under e2e/runner/helpers/; unblocks future NFT additions without duplication drift.
  - Effort: low (combined 5 cp; both already pre-decomposed in cumulative_review_batches_88-92_cycle1_report.md).
- Sequence AZ-595 (SITL observer + FDR replay fixture) as the first product task of cycle 2.
  - Impact: closes 17 NFT scenarios in one PBI, namely every NFT-PERF / NFT-RES / NFT-SEC scenario that currently sitl_replay_ready-skips on the Tier-1 docker harness. Also unblocks the Tier-2 evidence files for AZ-624 AC-5 + AZ-687 AC-687-3 once Jetson hardware is available.
  - Effort: medium (5 cp, in scope of a single batch; depends only on already-done tasks).
- Create _docs/02_document/architecture_compliance_baseline.md as a Step 6 (Decompose) prerequisite.
  - Impact: cumulative reviews can emit ## Baseline Delta sections from cycle 2 onward, quantifying architecture violations carried over / resolved / newly introduced per cycle. Without it, structural regressions are invisible to the retro process.
  - Effort: low (small file: scrape ADR list + initial violation count = 0 for cycle 2; the structural snapshot in this retro can seed the baseline). Suggest landing it in cycle 2 Step 6 as a precondition.
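The third action is essentially an idempotent "create if missing" check. A minimal sketch of what such a precondition could look like; the `ensure_baseline` helper and the file body are hypothetical, and only the path and the initial violation count of 0 come from this retro:

```python
import tempfile
from pathlib import Path

# Path taken from this retro; everything else here is illustrative.
BASELINE_REL = Path("_docs/02_document/architecture_compliance_baseline.md")


def ensure_baseline(repo_root: Path) -> bool:
    """Create the baseline file if it is missing. Returns True when it was created."""
    target = repo_root / BASELINE_REL
    if target.exists():
        return False  # never overwrite an existing baseline
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(
        "# Architecture Compliance Baseline\n\n"
        "| Metric | Value |\n"
        "|---|---|\n"
        "| Known violations | 0 |\n",  # assumed layout; initial count is 0
        encoding="utf-8",
    )
    return True


# Demonstrate against a throwaway directory instead of a real checkout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created_first = ensure_baseline(root)
    created_second = ensure_baseline(root)
    baseline_text = (root / BASELINE_REL).read_text(encoding="utf-8")

print(created_first, created_second)
```

Returning a boolean lets the caller log whether the baseline was seeded in this run or already present.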
Suggested Rule/Skill Updates
| File | Change | Rationale |
|---|---|---|
| .cursor/skills/decompose/SKILL.md (Step 6 prerequisites) | Add a prerequisite check that _docs/02_document/architecture_compliance_baseline.md exists; create it with 0 baseline violations if missing | Closes the gap that cumulative reviews flagged repeatedly across cycle 1 ("architecture_compliance_baseline.md does NOT exist → no Baseline Delta section emitted") |
| .cursor/skills/implement/SKILL.md (Step 15 Completeness Gate) | Before classifying any per-task FAIL, run a workspace grep for cross-cutting state the task depends on (e.g. central registries, factory dispatch tables); if the actual gap is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks | AZ-589 + AZ-590 → AZ-591 post-mortem; saves a wasted remediation-task round-trip |
| .cursor/skills/decompose/SKILL.md (or a new sub-step) | Identify the fixture-builder dependency surface explicitly during test-task decomposition: if N test tasks share a single un-built fixture, schedule the fixture builder ahead of the dependent tasks as a P0 prerequisite, not as a peer | AZ-595 surfaced as a late-cycle 17-scenario blocker; it would have been a 1-task P0 if decompose had cross-referenced fixture references in each test spec |
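The fixture-builder rule in the last row is a dependency-ordering problem: any fixture that is not yet built becomes a predecessor of every test task that references it. A sketch using the standard-library `graphlib` module (Python 3.9+); the `schedule` helper, task IDs, and fixture names are hypothetical:

```python
from graphlib import TopologicalSorter


def schedule(task_fixtures: dict[str, set[str]], built: set[str]) -> list[str]:
    """Order work so every un-built fixture precedes the test tasks that use it."""
    sorter: TopologicalSorter = TopologicalSorter()
    for task, fixtures in task_fixtures.items():
        # Each missing fixture becomes a "build:<name>" node that must run first.
        missing = {f"build:{name}" for name in fixtures if name not in built}
        sorter.add(task, *missing)
    return list(sorter.static_order())


# Hypothetical test tasks that all reference one un-built replay fixture.
task_fixtures = {
    "NFT-PERF-01": {"fdr_replay"},
    "NFT-RES-01": {"fdr_replay"},
    "NFT-SEC-01": {"fdr_replay", "tls_certs"},
}
order = schedule(task_fixtures, built={"tls_certs"})
print(order)
```

Because all three tasks depend on the same missing fixture, its single builder node sorts ahead of them, which is the "P0 prerequisite, not a peer" placement the rule asks for.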
Process Leftovers (out of band)
_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md: still OPEN; replay condition unchanged (gtsam numpy<2). Re-checked at the start of every /autodev invocation; last check 2026-05-20T05:51 UTC+3.
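One reading of "re-checked at every invocation, debounced ≤ 2 h" is that an invocation skips the re-check when the previous one ran less than two hours ago. A minimal sketch under that assumed interpretation; `should_recheck` is a hypothetical helper, not part of the leftover protocol as documented:

```python
from datetime import datetime, timedelta, timezone

DEBOUNCE = timedelta(hours=2)  # window taken from the blocker table; semantics assumed


def should_recheck(last_check: datetime, now: datetime,
                   window: timedelta = DEBOUNCE) -> bool:
    """Return True when at least `window` has elapsed since the last re-check."""
    return now - last_check >= window


# Last check recorded in this retro: 2026-05-20T05:51 UTC+3.
utc_plus_3 = timezone(timedelta(hours=3))
last_check = datetime(2026, 5, 20, 5, 51, tzinfo=utc_plus_3)

soon_after = should_recheck(last_check, last_check + timedelta(minutes=30))
much_later = should_recheck(last_check, last_check + timedelta(hours=3))
print(soon_after, much_later)
```

Using timezone-aware datetimes keeps the comparison correct even if invocations are logged in different offsets.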
End of cycle-1 retrospective.