Oleksandr Bezdieniezhnykh bf13549b32
[autodev] Update configuration and documentation for cycle-1
- Enhanced `.env.example` with detailed CMake build flags and replay-mode strategy flags for development and CI environments.
- Updated `.gitignore` to include a new deploy rollback bookmark.
- Revised `_docs/_autodev_state.md` to reflect the current task status and steps.
- Added new lessons to `_docs/LESSONS.md` regarding testing and architectural improvements.
- Documented changes in `_docs/02_document/deployment/ci_cd_pipeline.md` to reflect the relaxed OpenCV version pin.
- Updated test data documentation in `_docs/02_document/tests/test-data.md` to clarify fixture usage and paths.

This commit continues the cycle-1 documentation sync and addresses various configuration updates for improved clarity and functionality.
2026-05-20 08:05:35 +03:00


Retrospective — 2026-05-20

Cycle-1 retrospective for GPS-Denied Onboard. Cycle 1 spans 2026-05-11 → 2026-05-20 (Problem → Deploy → this retro). This is the first retro for the project — no prior baseline. Generated by /autodev greenfield Step 17 (Retrospective, cycle-end mode).

Implementation Summary

| Metric | Value |
| --- | --- |
| Total tasks (done) | 165 (product + test + hygiene + refactor) |
| Total tasks (backlog) | 2 (AZ-592, AZ-593: Tier-2 OKVIS2 / VINS-Mono validation) |
| Total tasks (todo) | 0 |
| Total batches | 97 (cycle 1) |
| Cycle duration | 9 days (2026-05-11 → 2026-05-20) |
| Avg tasks per batch | ≈ 1.7 |
| Estimated total complexity points | ≈ 565 cp (sampled: 80 × 3 cp, 50 × 5 cp, 30 × 2 cp, plus a handful of 4 cp and 0 cp umbrella) |
| Source LoC | 61,071 Python (src/) |
| Components | 15 (C1, C2, C2.5, C3, C3.5, C4, C5, C6, C7, C8, C10, C11, C12, C13 + helpers/runtime_root) |
| Binary tracks | 3 (airborne, research, operator-orchestrator) per ADR-002 + ADR-011 |
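
The two derived figures in this table (average tasks per batch and the estimated complexity total) can be reproduced from the raw counts. The following is a minimal sketch, not the tooling that generated the retro. The 80/50/30 counts come from the table; the number of 4 cp and 0 cp tasks is an assumption (the table only says "a handful"), chosen here so that the task total matches 165.

```python
# Raw counts taken from the Implementation Summary table.
TASKS_DONE = 165
BATCHES = 97

# Sampled complexity distribution: {points per task: number of tasks}.
# The 80 / 50 / 30 counts are quoted in the table. The 4 cp and 0 cp
# counts are ASSUMED for illustration so the task total equals TASKS_DONE.
SAMPLED_DISTRIBUTION = {3: 80, 5: 50, 2: 30, 4: 4, 0: 1}


def avg_tasks_per_batch(tasks: int, batches: int) -> float:
    """Mean number of tasks closed per batch."""
    return tasks / batches


def total_complexity(distribution: dict[int, int]) -> int:
    """Sum of points-per-task times task count over the distribution."""
    return sum(points * count for points, count in distribution.items())


if __name__ == "__main__":
    print(f"avg tasks per batch ≈ {avg_tasks_per_batch(TASKS_DONE, BATCHES):.1f}")
    print(f"estimated complexity ≈ {total_complexity(SAMPLED_DISTRIBUTION)} cp")
```

Under these assumed counts the estimate lands within a point or two of the ≈ 565 cp quoted above; a different reading of "a handful" shifts it by a few points.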

Quality Metrics

Product Implementation Completeness Gate (Step 7 → Step 8)

Source: _docs/03_implementation/implementation_completeness_cycle1_report.md (revised 2026-05-19 addendum).

| Verdict | Count | Percentage |
| --- | --- | --- |
| PASS | 114 | 98.3 % |
| BLOCKED-with-named-Tier-2-handle | 4 | 3.4 % (overlaps; 2 share one hardware artifact) |
| FAIL | 0 | 0 % |

The 4 BLOCKED items are: AZ-332 (OKVIS2 production-default VIO native binding → AZ-592 backlog), AZ-333 (VINS-Mono research-only VIO → AZ-593 backlog), AZ-624 AC-5 + AZ-687 AC-687-3 (Jetson Tier-2 evidence file _docs/03_implementation/jetson_runs/2026-05-19_az687_tier2_run.txt pending).

Code Review Verdicts (sampled across batches 50..97)

The verdict field is consistently **Verdict**: in batches ≥ 53. Earlier batches (01-22) use a different convention and are captured in the consolidated cumulative_review_batches_01-22_cycle1_report.md.

| Verdict | Approximate count (batches 53..97) | Notes |
| --- | --- | --- |
| PASS | ~24 | Includes single-task batches with inline self-review |
| PASS_WITH_WARNINGS | ~30 | Includes 38 files matching the verdict phrase |
| FAIL | 0 | No batch failed the in-loop review across the full cycle |
| BLOCKED | 0 | At batch level; 4 at task level (see Completeness Gate above) |

The auto-fix loop never escalated to user intervention during cycle 1.

Findings by Severity (latest 3 cumulative review windows: 70-72, 82-87, 88-92)

| Severity | Count (rolling 3-window) | Trend |
| --- | --- | --- |
| Critical | 0 | |
| High | 0 | |
| Medium | 2 (CR-F1 csv-helper, CR-F2 fixture-path-helper; both escalated and still OPEN at end of cycle 1) | Carried over from window 85-87 → 88-92 |
| Low | ~5 (5 in batch 87, 3 in batch 85, 2 each in batches 83/84, 1 in batch 81) | Trending down across late batches |
| Info | many | Process notes, no action |

Findings by Category (qualitative, since aggregate metrics across all 97 batches would need per-finding tagging that isn't uniform across early batches)

| Category | Notable patterns | Top files |
| --- | --- | --- |
| Bug | Essentially none post Step 11 (Run Tests); the iSAM2 ordering issue in AZ-625 was caught + fixed mid-batch | |
| Spec-Gap | AZ-595 dependency surfaced late (blocks 17 NFT scenarios); AZ-687 replay-mode guard emerged from AZ-624's Jetson run | e2e/_unit_tests/helpers/, e2e/runner/helpers/ |
| Security | OpenCV CVE pin replay condition tracked since 2026-05-11 (gtsam numpy<2 ABI block); per _docs/05_security/dependency_scan.md re-validated against current pin, no advisory exposure | pyproject.toml |
| Performance | NFT-PERF Tier-2 baselines pending AZ-595 + AZ-592/AZ-593; cycle-1 Tier-1 probe completed 2026-05-19 | _docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md |
| Maintainability | Two helper-duplication patterns surfaced (CR-F1, CR-F2); 5 cp combined hygiene PBI proposed | e2e/runner/helpers/csv_evidence_writer, e2e/runner/helpers/fixture_path |
| Style | Minimal: ruff is run pre-commit + in the e2e-runner; batch 91 opportunistically cleaned 12 pre-existing UP037 lints in airborne_bootstrap.py | airborne_bootstrap.py |
| Scope | One mid-cycle scope correction: AZ-589/AZ-590 closed Won't Fix after post-mortem found the actual gap was cross-cutting (empty _STRATEGY_REGISTRY), not per-strategy C++ wiring. Re-decomposed into AZ-591 + AZ-592/AZ-593 backlog | runtime_root/__init__.py, airborne_bootstrap.py |

Structural Metrics

_docs/02_document/architecture_compliance_baseline.md does not exist — cumulative reviews could not emit ## Baseline Delta sections through cycle 1. This is a known gap (see Improvement Action #3 below).

| Metric | Value | Trend |
| --- | --- | --- |
| Component count | 15 | First-cycle baseline |
| Source LoC (Python, src/) | 61,071 | First-cycle baseline |
| Cycles in component import graph | 0 | Healthy: no back-edges in runtime_root.airborne_bootstrap → {clock, fdr_client, runtime_root.*_factory, runtime_root.{storage,inference,errors,…}} |
| Cross-component edges | Concentrated in runtime_root/ factories (composition root by design) | Healthy: single composition seam |
| Contract files | _docs/02_document/contracts/shared_*/ populated for FDR record schema (v1.3.0), log record schema (v1.0.0), C8 transport, post-landing upload, ingest (D-PROJ-2 placeholder) | First-cycle baseline; coverage % vs public-API symbols not computed (no inventory) |
| Architecture violations net delta | n/a (no baseline) | Recommend establishing the baseline file in cycle 2 Step 6 |

A structural snapshot for future delta comparison is recorded in _docs/06_metrics/structure_2026-05-20.md (next item in this retro).

Efficiency

| Metric | Value |
| --- | --- |
| Blocked tasks (cycle-1 close) | 4 (all Tier-2 hardware / evidence rooted) |
| Tasks requiring fixes after review | ~5 (Medium/Low warnings landed as hygiene PBIs, not as in-cycle fixes) |
| Auto-fix loop escalations to user | 0 |
| Mid-cycle remediation post-mortems | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 1 (Step 11 Run Tests → Step 7 Implement, for AZ-618 cross-cutting umbrella with 12 builder signatures; surfaced as 2026-05-18 lessons entry already in LESSONS.md) |
| Batch with most findings | Batch 87 (PASS_WITH_WARNINGS: 0 Critical, 0 High, 0 Medium, 5 Low); late-cycle e2e helper polish |
| Out-of-loop process leftover | D-CROSS-CVE-1 OpenCV pin deferred on upstream gtsam numpy<2 ABI; re-checked at every /autodev invocation per leftover protocol |

Blocker Analysis

| Blocker Type | Count | Prevention (carries to cycle 2) |
| --- | --- | --- |
| Tier-2 hardware/evidence (Jetson Orin) | 4 | Allocate Tier-2 Jetson access for AZ-595 + AZ-624 AC-5 re-run + AZ-687 AC-687-3; AZ-592/AZ-593 unblocked only when CI build env + DBoW2 vocab + upstream choice are decided. |
| Upstream library pin (D-CROSS-CVE-1) | 1 | Out-of-band: replay condition is gtsam numpy-2 wheels (or alternate SE(3) backend). Re-check is debounced ≤ 2 h per leftover entry; no action needed in dev cycle. |
| Mid-cycle scope correction | 1 | The fix already informed the 2026-05-18 LESSONS entry. Cross-cutting registry/factory state should be probed before classifying a per-task FAIL. |
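
The "probe cross-cutting state before classifying a per-task FAIL" rule can be sketched as a small triage step. This is an illustration only: the `FailedTask` record, the `triage` function, and the registry snapshot format are all hypothetical and not part of the project. Only the task IDs AZ-589/AZ-590 and the `_STRATEGY_REGISTRY` name come from the post-mortem described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedTask:
    """HYPOTHETICAL record of a task that failed its completeness check."""
    task_id: str
    depends_on: tuple[str, ...]  # names of cross-cutting registries the task reads


def triage(
    failed: list[FailedTask],
    registries: dict[str, dict],
) -> tuple[dict[str, list[str]], list[str]]:
    """Split failures into cross-cutting gaps and genuine per-task FAILs.

    A failure is attributed to a registry when that registry is empty
    (a registry missing from the snapshot is treated as empty). Returns
    (cross_cutting, per_task): cross_cutting maps registry name to the
    affected task ids; per_task lists the remaining failed task ids.
    """
    cross_cutting: dict[str, list[str]] = {}
    per_task: list[str] = []
    for task in failed:
        empty = [name for name in task.depends_on if not registries.get(name)]
        if empty:
            for name in empty:
                cross_cutting.setdefault(name, []).append(task.task_id)
        else:
            per_task.append(task.task_id)
    return cross_cutting, per_task


if __name__ == "__main__":
    snapshot = {"_STRATEGY_REGISTRY": {}}  # empty registry, as in the post-mortem
    failures = [
        FailedTask("AZ-589", ("_STRATEGY_REGISTRY",)),
        FailedTask("AZ-590", ("_STRATEGY_REGISTRY",)),
    ]
    print(triage(failures, snapshot))
```

With this grouping, the two failures collapse into a single cross-cutting finding against the registry, which corresponds to the one-task re-decomposition (AZ-591) rather than two per-task remediation tasks.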

Trend Comparison

Previous retrospective: N/A — first retro.

Cycle 1 establishes the baseline. From cycle 2 onward, this section will compare against:

| Metric | Current (cycle-1 baseline) | Target (cycle-2) |
| --- | --- | --- |
| Code-review pass rate (PASS / total) | ≈ 44 % PASS, ≈ 55 % PASS_WITH_WARNINGS, 0 % FAIL | Maintain 0 % FAIL; lift PASS share by landing CR-F1 + CR-F2 hygiene PBIs early |
| Avg findings per batch (Medium + Low) | ~0.2 (≈ 16 total Medium/Low across 97 batches, dominated by helper-duplication carryovers) | ≤ 0.15 |
| Blocked tasks (cycle close) | 4 (all Tier-2 rooted) | ≤ 2; AZ-624 AC-5 + AZ-687 AC-687-3 close once the Tier-2 run lands |
| Mid-cycle remediation post-mortems | 1 | 0 |
| Structural baseline file present | No (gap) | Yes |

Top 3 Improvement Actions

  1. Land the two open hygiene PBIs (CR-F1 + CR-F2) before any new NFT helper expansion in cycle 2.

    • Impact: drops the only carried-over Medium findings; consolidates csv_evidence_writer + fixture_path.resolve into one canonical helper module under e2e/runner/helpers/; unblocks future NFT additions without duplication drift.
    • Effort: low (combined 5 cp; both already pre-decomposed in cumulative_review_batches_88-92_cycle1_report.md).
  2. Sequence AZ-595 (SITL observer + FDR replay fixture) as the first product task of cycle 2.

    • Impact: closes 17 NFT scenarios in one PBI — every NFT-PERF / NFT-RES / NFT-SEC scenario that currently sitl_replay_ready-skips on the Tier-1 docker harness. Also unblocks the Tier-2 evidence files for AZ-624 AC-5 + AZ-687 AC-687-3 once Jetson hardware is available.
    • Effort: medium (5 cp, in scope of a single batch; depends only on already-done tasks).
  3. Create _docs/02_document/architecture_compliance_baseline.md as a Step 6 (Decompose) prerequisite.

    • Impact: cumulative reviews can emit ## Baseline Delta sections from cycle 2 onward, quantifying architecture violations carried over / resolved / newly introduced per cycle. Without it, structural regressions are invisible to the retro process.
    • Effort: low (small file: scrape ADR list + initial violation count = 0 for cycle 2; the structural snapshot in this retro can seed the baseline). Suggest landing it in cycle 2 Step 6 as a precondition.
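
Action 3 amounts to an idempotent "create if missing" check. A minimal sketch of such a prerequisite step follows, assuming a plain-text seed with a zero violation count; the seed wording and the `ensure_baseline` helper are hypothetical, and only the file path comes from this retro.

```python
from pathlib import Path

# Path named in this retro; resolved relative to the repository root.
BASELINE_PATH = Path("_docs/02_document/architecture_compliance_baseline.md")

# ASSUMED seed content: the real baseline format is not defined yet.
SEED_CONTENT = (
    "# Architecture Compliance Baseline\n"
    "\n"
    "Baseline violations: 0\n"
)


def ensure_baseline(path: Path, seed: str = SEED_CONTENT) -> bool:
    """Create the baseline file if it is missing.

    Returns True when the file was created and False when it already
    existed; an existing file is never overwritten.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(seed, encoding="utf-8")
    return True
```

Calling `ensure_baseline(repo_root / BASELINE_PATH)` at the start of Step 6 would make the check safe to repeat on every cycle, since a second call leaves the file untouched.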

Suggested Rule/Skill Updates

| File | Change | Rationale |
| --- | --- | --- |
| .cursor/skills/decompose/SKILL.md (Step 6 prerequisites) | Add a prerequisite check that _docs/02_document/architecture_compliance_baseline.md exists; create it with 0 baseline violations if missing | Closes the gap that cumulative reviews flagged repeatedly across cycle 1 ("architecture_compliance_baseline.md does NOT exist → no Baseline Delta section emitted") |
| .cursor/skills/implement/SKILL.md (Step 15 Completeness Gate) | Before classifying any per-task FAIL, run a workspace grep for cross-cutting state the task depends on (e.g. central registries, factory dispatch tables); if the actual gap is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks | AZ-589 + AZ-590 → AZ-591 post-mortem; saves a wasted remediation-task round-trip |
| .cursor/skills/decompose/SKILL.md (or a new sub-step) | Identify the fixture-builder dependency surface explicitly during test-task decomposition: if N test tasks share a single un-built fixture, schedule the fixture builder ahead of the dependent tasks as a P0 prerequisite, not as a peer | AZ-595 surfaced as a late-cycle 17-scenario blocker; would have been a 1-task P0 if decompose had cross-referenced fixture references in each test spec |
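
The fixture-builder rule in the last row is a dependency-ordering problem. The sketch below shows one way decompose could express it, using the standard-library graphlib module. The function name, the input shapes, the `sitl_replay` fixture name, and the NFT task IDs are hypothetical; only AZ-595 as the builder of the SITL/FDR replay fixture comes from this retro.

```python
from graphlib import TopologicalSorter


def schedule_with_fixture_builders(
    test_fixtures: dict[str, set[str]],
    fixture_builders: dict[str, str],
    built: set[str],
) -> list[str]:
    """Order tasks so every un-built fixture's builder runs before its users.

    test_fixtures maps a test task id to the fixtures it references,
    fixture_builders maps a fixture name to the task that builds it, and
    built holds fixtures that already exist. Raises KeyError if a test
    references an un-built fixture with no known builder.
    """
    graph: dict[str, set[str]] = {}
    for task, fixtures in test_fixtures.items():
        # Each test task depends on the builders of its missing fixtures.
        graph[task] = {fixture_builders[f] for f in fixtures if f not in built}
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    order = schedule_with_fixture_builders(
        test_fixtures={
            "NFT-PERF-01": {"sitl_replay"},  # hypothetical task ids
            "NFT-RES-01": {"sitl_replay"},
        },
        fixture_builders={"sitl_replay": "AZ-595"},
        built=set(),
    )
    print(order)
```

In this ordering the shared builder always precedes its dependents, which is the "P0 prerequisite, not a peer" outcome the rule asks for.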

Process Leftovers (out of band)

  • _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md — still OPEN; replay condition unchanged (gtsam numpy<2). Re-checked at start of every /autodev invocation; last check 2026-05-20T05:51 UTC+3.

End of cycle-1 retrospective.