Files
Oleksandr Bezdieniezhnykh bf13549b32
ci/woodpecker/push/02-build-push Pipeline failed
[autodev] Update configuration and documentation for cycle-1
- Enhanced `.env.example` with detailed CMake build flags and replay-mode strategy flags for development and CI environments.
- Updated `.gitignore` to include a new deploy rollback bookmark.
- Revised `_docs/_autodev_state.md` to reflect the current task status and steps.
- Added new lessons to `_docs/LESSONS.md` regarding testing and architectural improvements.
- Documented changes in `_docs/02_document/deployment/ci_cd_pipeline.md` to reflect the relaxed OpenCV version pin.
- Updated test data documentation in `_docs/02_document/tests/test-data.md` to clarify fixture usage and paths.

This commit continues the cycle-1 documentation sync and addresses various configuration updates for improved clarity and functionality.
2026-05-20 08:05:35 +03:00

148 lines
11 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Retrospective — 2026-05-20
> Cycle-1 retrospective for GPS-Denied Onboard. Cycle 1 spans
> 2026-05-11 → 2026-05-20 (Problem → Deploy → this retro). This is the
> **first** retro for the project — no prior baseline. Generated by
> `/autodev` greenfield Step 17 (Retrospective, cycle-end mode).
## Implementation Summary
| Metric | Value |
|--------|-------|
| Total tasks (done) | 165 (product + test + hygiene + refactor) |
| Total tasks (backlog) | 2 (AZ-592, AZ-593 — Tier-2 OKVIS2 / VINS-Mono validation) |
| Total tasks (todo) | 0 |
| Total batches | 97 (cycle 1) |
| Cycle duration | 9 days (2026-05-11 → 2026-05-20) |
| Avg tasks per batch | ≈ 1.7 |
| Estimated total complexity points | ≈ 565 cp (sampled: 80 × 3 cp, 50 × 5 cp, 30 × 2 cp, plus a handful of 4 cp and 0 cp umbrella) |
| Source LoC | 61,071 Python (src/) |
| Components | 15 (C1, C2, C2.5, C3, C3.5, C4, C5, C6, C7, C8, C10, C11, C12, C13 + helpers/runtime_root) |
| Binary tracks | 3 (airborne, research, operator-orchestrator) per ADR-002 + ADR-011 |
## Quality Metrics
### Product Implementation Completeness Gate (Step 7 → Step 8)
Source: `_docs/03_implementation/implementation_completeness_cycle1_report.md` (revised 2026-05-19 addendum).
| Verdict | Count | Percentage |
|---------|-------|------------|
| PASS | 114 | 98.3 % |
| BLOCKED-with-named-Tier-2-handle | 4 | 3.4 % (overlaps; 2 share one hardware artifact) |
| FAIL | 0 | 0 % |
The 4 BLOCKED items are: **AZ-332** (OKVIS2 production-default VIO native binding → AZ-592 backlog), **AZ-333** (VINS-Mono research-only VIO → AZ-593 backlog), **AZ-624 AC-5** + **AZ-687 AC-687-3** (Jetson Tier-2 evidence file `_docs/03_implementation/jetson_runs/2026-05-19_az687_tier2_run.txt` pending).
### Code Review Verdicts (sampled across batches 50..97)
The verdict field is consistently `**Verdict**:` in batches ≥ 53. Earlier batches (0122) use a different convention and are captured in the consolidated `cumulative_review_batches_01-22_cycle1_report.md`.
| Verdict | Approximate count (batches 53..97) | Notes |
|---------|------------------------------------|-------|
| PASS | ~24 | Includes single-task batches with inline self-review |
| PASS_WITH_WARNINGS | ~30 | Includes 38 files matching the verdict phrase |
| FAIL | 0 | No batch failed the in-loop review across the full cycle |
| BLOCKED | 0 (at batch level; 4 at task level — see Completeness Gate above) | |
Auto-fix loop never had to escalate to user intervention across cycle 1.
### Findings by Severity (latest 3 cumulative review windows: 70-72, 82-87, 88-92)
| Severity | Count (rolling 3-window) | Trend |
|----------|---------------------------|-------|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 2 (CR-F1 csv-helper, CR-F2 fixture-path-helper — both escalated and still OPEN at end of cycle 1) | Carried over from window 85-87 → 88-92 |
| Low | ~5 (5 in batch 87, 3 in batch 85, 2 each in batches 83/84, 1 in batch 81) | Trending down across late batches |
| Info | many — process notes, no action | — |
### Findings by Category (qualitative, since aggregate metrics across all 97 batches would need per-finding tagging that isn't uniform across early batches)
| Category | Notable patterns | Top files |
|----------|------------------|-----------|
| Bug | Essentially none post Step 11 (Run Tests) — the iSAM2 ordering issue in AZ-625 was caught + fixed mid-batch | — |
| Spec-Gap | AZ-595 dependency surfaced late (blocks 17 NFT scenarios); AZ-687 replay-mode guard emerged from AZ-624's Jetson run | `e2e/_unit_tests/helpers/`, `e2e/runner/helpers/` |
| Security | OpenCV CVE pin replay condition tracked since 2026-05-11 (gtsam numpy<2 ABI block); per `_docs/05_security/dependency_scan.md` re-validated against current pin — no advisory exposure | `pyproject.toml` |
| Performance | NFT-PERF Tier-2 baselines pending AZ-595 + AZ-592/AZ-593; cycle-1 Tier-1 probe completed 2026-05-19 | `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` |
| Maintainability | Two helper-duplication patterns surfaced (CR-F1, CR-F2) — 5 cp combined hygiene PBI proposed | `e2e/runner/helpers/csv_evidence_writer`, `e2e/runner/helpers/fixture_path` |
| Style | Minimal — `ruff` is run pre-commit + in the e2e-runner; batch 91 opportunistically cleaned 12 pre-existing UP037 lints in `airborne_bootstrap.py` | `airborne_bootstrap.py` |
| Scope | One mid-cycle scope correction — AZ-589/AZ-590 closed Won't Fix after post-mortem found the actual gap was cross-cutting (empty `_STRATEGY_REGISTRY`), not per-strategy C++ wiring. Re-decomposed into AZ-591 + AZ-592/AZ-593 backlog | `runtime_root/__init__.py``airborne_bootstrap.py` |
## Structural Metrics
`_docs/02_document/architecture_compliance_baseline.md` does **not** exist — cumulative reviews could not emit `## Baseline Delta` sections through cycle 1. This is a known gap (see Improvement Action #3 below).
| Metric | Value | Trend |
|--------|-------|-------|
| Component count | 15 | First-cycle baseline |
| Source LoC (Python, src/) | 61,071 | First-cycle baseline |
| Cycles in component import graph | 0 | Healthy — no back-edges in `runtime_root.airborne_bootstrap → {clock, fdr_client, runtime_root.*_factory, runtime_root.{storage,inference,errors,…}}` |
| Cross-component edges | Concentrated in `runtime_root/` factories (composition root by design) | Healthy — single composition seam |
| Contract files | `_docs/02_document/contracts/shared_*/` populated for FDR record schema (v1.3.0), log record schema (v1.0.0), C8 transport, post-landing upload, ingest (D-PROJ-2 placeholder) | First-cycle baseline; coverage % vs public-API symbols not computed (no inventory) |
| Architecture violations net delta | n/a (no baseline) | Recommend establishing the baseline file in cycle 2 Step 6 |
A structural snapshot for future delta comparison is recorded in `_docs/06_metrics/structure_2026-05-20.md` (next item in this retro).
## Efficiency
| Metric | Value |
|--------|-------|
| Blocked tasks (cycle-1 close) | 4 (all Tier-2 hardware / evidence rooted) |
| Tasks requiring fixes after review | ~5 (Medium/Low warnings landed as hygiene PBIs, not as in-cycle fixes) |
| Auto-fix loop escalations to user | 0 |
| Mid-cycle remediation post-mortems | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 1 (Step 11 Run Tests → Step 7 Implement, for AZ-618 cross-cutting umbrella with 12 builder signatures; surfaced as 2026-05-18 lessons entry already in LESSONS.md) |
| Batch with most findings | Batch 87 (`PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low`) — late-cycle e2e helper polish |
| Out-of-loop process leftover | D-CROSS-CVE-1 OpenCV pin deferred on upstream `gtsam` numpy<2 ABI; re-checked at every `/autodev` invocation per leftover protocol |
### Blocker Analysis
| Blocker Type | Count | Prevention (carries to cycle 2) |
|--------------|-------|----------------------------------|
| Tier-2 hardware/evidence (Jetson Orin) | 4 | Allocate Tier-2 Jetson access for AZ-595 + AZ-624 AC-5 re-run + AZ-687 AC-687-3; AZ-592/AZ-593 unblocked only when CI build env + DBoW2 vocab + upstream choice are decided. |
| Upstream library pin (D-CROSS-CVE-1) | 1 | Out-of-band — replay condition is `gtsam` numpy-2 wheels (or alternate SE(3) backend). Re-check is debounced ≤ 2 h per leftover entry; no action needed in dev cycle. |
| Mid-cycle scope correction | 1 | The fix already informed the 2026-05-18 LESSONS entry. Cross-cutting registry/factory state should be probed before classifying a per-task FAIL. |
## Trend Comparison
*Previous retrospective: N/A — first retro.*
Cycle 1 establishes the baseline. From cycle 2 onward, this section will compare against:
| Metric | Current (cycle-1 baseline) | Target (cycle-2) |
|--------|-----------------------------|-------------------|
| Code-review pass rate (PASS / total) | ≈ 44 % PASS, ≈ 55 % PASS_WITH_WARNINGS, 0 % FAIL | Maintain 0 % FAIL; lift PASS share by landing CR-F1 + CR-F2 hygiene PBIs early |
| Avg findings per batch (Medium + Low) | ~0.2 (≈ 16 total Medium/Low across 97 batches, dominated by helper-duplication carryovers) | ≤ 0.15 |
| Blocked tasks (cycle close) | 4 (all Tier-2 rooted) | ≤ 2; AZ-624 AC-5 + AZ-687 AC-687-3 close once the Tier-2 run lands |
| Mid-cycle remediation post-mortems | 1 | 0 |
| Structural baseline file present | No (gap) | Yes |
## Top 3 Improvement Actions
1. **Land the two open hygiene PBIs (CR-F1 + CR-F2) before any new NFT helper expansion in cycle 2.**
- Impact: drops the only carried-over Medium findings; consolidates `csv_evidence_writer` + `fixture_path.resolve` into one canonical helper module under `e2e/runner/helpers/`; unblocks future NFT additions without duplication drift.
- Effort: low (combined 5 cp; both already pre-decomposed in `cumulative_review_batches_88-92_cycle1_report.md`).
2. **Sequence AZ-595 (SITL observer + FDR replay fixture) as the first product task of cycle 2.**
- Impact: closes 17 NFT scenarios in one PBI — every NFT-PERF / NFT-RES / NFT-SEC scenario that currently `sitl_replay_ready`-skips on the Tier-1 docker harness. Also unblocks the Tier-2 evidence files for AZ-624 AC-5 + AZ-687 AC-687-3 once Jetson hardware is available.
- Effort: medium (5 cp, in scope of a single batch; depends only on already-done tasks).
3. **Create `_docs/02_document/architecture_compliance_baseline.md` as a Step 6 (Decompose) prerequisite.**
- Impact: cumulative reviews can emit `## Baseline Delta` sections from cycle 2 onward, quantifying architecture violations carried over / resolved / newly introduced per cycle. Without it, structural regressions are invisible to the retro process.
- Effort: low (small file: scrape ADR list + initial violation count = 0 for cycle 2; the structural snapshot in this retro can seed the baseline). Suggest landing it in cycle 2 Step 6 as a precondition.
## Suggested Rule/Skill Updates
| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/decompose/SKILL.md` (Step 6 prerequisites) | Add a prerequisite check that `_docs/02_document/architecture_compliance_baseline.md` exists; create it with `0` baseline violations if missing | Closes the gap that cumulative reviews flagged repeatedly across cycle 1 ("`architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted") |
| `.cursor/skills/implement/SKILL.md` (Step 15 Completeness Gate) | Before classifying any per-task FAIL, run a workspace grep for cross-cutting state the task depends on (e.g. central registries, factory dispatch tables); if the actual gap is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks | AZ-589 + AZ-590 → AZ-591 post-mortem; saves a wasted remediation-task round-trip |
| `.cursor/skills/decompose/SKILL.md` (or a new sub-step) | Identify the **fixture-builder dependency surface** explicitly during test-task decomposition: if N test tasks share a single un-built fixture, schedule the fixture builder ahead of the dependent tasks as a P0 prerequisite, not as a peer | AZ-595 surfaced as a late-cycle 17-scenario blocker — would have been a 1-task P0 if decompose had cross-referenced fixture references in each test spec |
## Process Leftovers (out of band)
- **`_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`** — still OPEN; replay condition unchanged (gtsam numpy<2). Re-checked at start of every `/autodev` invocation; last check 2026-05-20T05:51 UTC+3.
End of cycle-1 retrospective.