azaion/gps-denied-onboard

mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 08:21:13 +00:00

Oleksandr Bezdieniezhnykh bf13549b32

[autodev] Update configuration and documentation for cycle-1

- Enhanced `.env.example` with detailed CMake build flags and replay-mode strategy flags for development and CI environments.
- Updated `.gitignore` to include a new deploy rollback bookmark.
- Revised `_docs/_autodev_state.md` to reflect the current task status and steps.
- Added new lessons to `_docs/LESSONS.md` regarding testing and architectural improvements.
- Documented changes in `_docs/02_document/deployment/ci_cd_pipeline.md` to reflect the relaxed OpenCV version pin.
- Updated test data documentation in `_docs/02_document/tests/test-data.md` to clarify fixture usage and paths.

This commit continues the cycle-1 documentation sync and addresses various configuration updates for improved clarity and functionality.

2026-05-20 08:05:35 +03:00

Retrospective — 2026-05-20

Cycle-1 retrospective for GPS-Denied Onboard. Cycle 1 spans 2026-05-11 → 2026-05-20 (Problem → Deploy → this retro). This is the first retro for the project, so there is no prior baseline. Generated by `/autodev` greenfield Step 17 (Retrospective, cycle-end mode).

Implementation Summary

| Metric | Value |
| --- | --- |
| Total tasks (done) | 165 (product + test + hygiene + refactor) |
| Total tasks (backlog) | 2 (AZ-592, AZ-593: Tier-2 OKVIS2 / VINS-Mono validation) |
| Total tasks (todo) | 0 |
| Total batches | 97 (cycle 1) |
| Cycle duration | 9 days (2026-05-11 → 2026-05-20) |
| Avg tasks per batch | ≈ 1.7 |
| Estimated total complexity points | ≈ 565 cp (sampled: 80 × 3 cp, 50 × 5 cp, 30 × 2 cp, plus a handful of 4 cp tasks and 0 cp umbrella tasks) |
| Source LoC | 61,071 Python (`src/`) |
| Components | 15 (C1, C2, C2.5, C3, C3.5, C4, C5, C6, C7, C8, C10, C11, C12, C13 + helpers/runtime_root) |
| Binary tracks | 3 (airborne, research, operator-orchestrator) per ADR-002 + ADR-011 |
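
The derived figures in this summary can be re-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the table; the constant and function names are illustrative, and the "handful" of 4 cp tasks is left out because the retro does not enumerate it.

```python
# Figures copied from the Implementation Summary table.
TASKS_DONE = 165
BATCHES = 97

# Sampled complexity buckets as {points per task: approximate task count}.
# The "handful" of 4 cp tasks is not enumerated in the retro, so it is omitted here.
SAMPLED_BUCKETS = {3: 80, 5: 50, 2: 30}


def avg_tasks_per_batch(tasks: int, batches: int) -> float:
    """Mean number of tasks landed per batch."""
    return tasks / batches


def sampled_points(buckets: dict[int, int]) -> int:
    """Sum of points * count over the enumerated buckets."""
    return sum(points * count for points, count in buckets.items())


print(f"avg tasks per batch: {avg_tasks_per_batch(TASKS_DONE, BATCHES):.2f}")
print(f"enumerated complexity points: {sampled_points(SAMPLED_BUCKETS)}")
```

The enumerated buckets account for 550 cp; the remainder of the ≈ 565 cp estimate would come from the 4 cp tasks that are not itemized.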

Quality Metrics

Product Implementation Completeness Gate (Step 7 → Step 8)

Source: `_docs/03_implementation/implementation_completeness_cycle1_report.md` (revised 2026-05-19 addendum).

| Verdict | Count | Percentage |
| --- | --- | --- |
| PASS | 114 | 98.3 % |
| BLOCKED-with-named-Tier-2-handle | 4 | 3.4 % (overlaps; 2 share one hardware artifact) |
| FAIL | 0 | 0 % |

The 4 BLOCKED items are: AZ-332 (OKVIS2 production-default VIO native binding, tracked as backlog item AZ-592), AZ-333 (VINS-Mono research-only VIO, tracked as backlog item AZ-593), and AZ-624 AC-5 plus AZ-687 AC-687-3 (both pending the Jetson Tier-2 evidence file `_docs/03_implementation/jetson_runs/2026-05-19_az687_tier2_run.txt`).

Code Review Verdicts (sampled across batches 50..97)

The verdict field is consistently `**Verdict**:` in batches ≥ 53. Earlier batches (01–22) use a different convention and are captured in the consolidated `cumulative_review_batches_01-22_cycle1_report.md`.

| Verdict | Approximate count (batches 53..97) | Notes |
| --- | --- | --- |
| PASS | ~24 | Includes single-task batches with inline self-review |
| PASS_WITH_WARNINGS | ~30 | Includes 38 files matching the verdict phrase |
| FAIL | 0 | No batch failed the in-loop review across the full cycle |
| BLOCKED | 0 | Batch level only; 4 at task level (see Completeness Gate above) |

The auto-fix loop never had to escalate to user intervention during cycle 1.
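
A verdict tally like the one above can be sketched as a small regex pass over the batch report texts. This is a minimal sketch, assuming each report carries a line that starts with `**Verdict**:` followed by an upper-case verdict token; only the `**Verdict**:` prefix is confirmed by this retro, and `tally_verdicts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed line shape: "**Verdict**: PASS_WITH_WARNINGS ..." (token format is an assumption).
VERDICT_RE = re.compile(r"^\*\*Verdict\*\*:\s*([A-Z_]+)", re.MULTILINE)


def tally_verdicts(reports: list[str]) -> Counter:
    """Count the first verdict found in each report; reports without one count as MISSING."""
    counts: Counter = Counter()
    for text in reports:
        match = VERDICT_RE.search(text)
        counts[match.group(1) if match else "MISSING"] += 1
    return counts


# Hand-written sample texts, not real batch reports.
sample_reports = [
    "# Batch A\n**Verdict**: PASS\n",
    "# Batch B\n**Verdict**: PASS_WITH_WARNINGS, 5 Low\n",
    "# Batch C (older convention, no verdict line)\n",
]
print(tally_verdicts(sample_reports))
```

Reports that follow the older convention of batches 01–22 would land in `MISSING`, which makes the convention gap visible instead of silently undercounting.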

Findings by Severity (latest 3 cumulative review windows: 70-72, 82-87, 88-92)

| Severity | Count (rolling 3-window) | Trend |
| --- | --- | --- |
| Critical | 0 | n/a |
| High | 0 | n/a |
| Medium | 2 (CR-F1 csv-helper, CR-F2 fixture-path-helper; both escalated and still OPEN at end of cycle 1) | Carried over from window 85-87 → 88-92 |
| Low | ~13 (5 in batch 87, 3 in batch 85, 2 each in batches 83/84, 1 in batch 81) | Trending down across late batches |
| Info | Many (process notes, no action) | n/a |

Findings by Category (qualitative, since aggregate metrics across all 97 batches would need per-finding tagging that isn't uniform across early batches)

| Category | Notable patterns | Top files |
| --- | --- | --- |
| Bug | Essentially none post Step 11 (Run Tests); the iSAM2 ordering issue in AZ-625 was caught and fixed mid-batch | n/a |
| Spec-Gap | AZ-595 dependency surfaced late (blocks 17 NFT scenarios); AZ-687 replay-mode guard emerged from AZ-624's Jetson run | `e2e/_unit_tests/helpers/`, `e2e/runner/helpers/` |
| Security | OpenCV CVE pin replay condition tracked since 2026-05-11 (gtsam numpy<2 ABI block); per `_docs/05_security/dependency_scan.md`, re-validated against the current pin with no advisory exposure | `pyproject.toml` |
| Performance | NFT-PERF Tier-2 baselines pending AZ-595 + AZ-592/AZ-593; cycle-1 Tier-1 probe completed 2026-05-19 | `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` |
| Maintainability | Two helper-duplication patterns surfaced (CR-F1, CR-F2); a combined 5 cp hygiene PBI is proposed | `e2e/runner/helpers/csv_evidence_writer`, `e2e/runner/helpers/fixture_path` |
| Style | Minimal: `ruff` runs pre-commit and in the e2e-runner; batch 91 opportunistically cleaned 12 pre-existing UP037 lints in `airborne_bootstrap.py` | `airborne_bootstrap.py` |
| Scope | One mid-cycle scope correction: AZ-589/AZ-590 closed Won't Fix after a post-mortem found the actual gap was cross-cutting (empty `_STRATEGY_REGISTRY`), not per-strategy C++ wiring. Re-decomposed into AZ-591 + AZ-592/AZ-593 backlog | `runtime_root/__init__.py` → `airborne_bootstrap.py` |

Structural Metrics

`_docs/02_document/architecture_compliance_baseline.md` does not exist, so cumulative reviews could not emit `## Baseline Delta` sections at any point in cycle 1. This is a known gap (see Improvement Action #3 below).

| Metric | Value | Trend |
| --- | --- | --- |
| Component count | 15 | First-cycle baseline |
| Source LoC (Python, `src/`) | 61,071 | First-cycle baseline |
| Cycles in component import graph | 0 | Healthy: no back-edges in `runtime_root.airborne_bootstrap → {clock, fdr_client, runtime_root.*_factory, runtime_root.{storage,inference,errors,…}}` |
| Cross-component edges | Concentrated in `runtime_root/` factories (composition root by design) | Healthy: single composition seam |
| Contract files | `_docs/02_document/contracts/shared_*/` populated for FDR record schema (v1.3.0), log record schema (v1.0.0), C8 transport, post-landing upload, ingest (D-PROJ-2 placeholder) | First-cycle baseline; coverage % vs public-API symbols not computed (no inventory) |
| Architecture violations net delta | n/a (no baseline) | Recommend establishing the baseline file in cycle 2 Step 6 |

A structural snapshot for future delta comparison is recorded in `_docs/06_metrics/structure_2026-05-20.md` (next item in this retro).
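
A "0 cycles" claim for the import graph can be re-verified with a depth-first search that looks for back-edges. The following is a minimal sketch, not the tooling used in this cycle; the module names come from the table above, but the edges between them are hand-written for illustration.

```python
from __future__ import annotations


def find_cycle(graph: dict[str, list[str]]) -> list[str] | None:
    """Return one import cycle as a node path (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored
    colour = {node: WHITE for node in graph}
    path: list[str] = []

    def visit(node: str) -> list[str] | None:
        colour[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            state = colour.get(dep, WHITE)
            if state == GREY:  # back-edge: dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Illustrative edges only; the real import graph is not reproduced in this retro.
imports = {
    "runtime_root.airborne_bootstrap": ["clock", "fdr_client", "runtime_root.storage"],
    "fdr_client": ["clock"],
    "clock": [],
    "runtime_root.storage": [],
}
print(find_cycle(imports))
```

Because the composition root only points outward to leaf modules, the search finds no back-edge; adding an import from `clock` back to `runtime_root.airborne_bootstrap` would make it return that loop.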

Efficiency

| Metric | Value |
| --- | --- |
| Blocked tasks (cycle-1 close) | 4 (all Tier-2 hardware / evidence rooted) |
| Tasks requiring fixes after review | ~5 (Medium/Low warnings landed as hygiene PBIs, not as in-cycle fixes) |
| Auto-fix loop escalations to user | 0 |
| Mid-cycle remediation post-mortems | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 1 (Step 11 Run Tests → Step 7 Implement, for AZ-618 cross-cutting umbrella with 12 builder signatures; surfaced as the 2026-05-18 lessons entry already in `LESSONS.md`) |
| Batch with most findings | Batch 87 (`PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low`), late-cycle e2e helper polish |
| Out-of-loop process leftover | D-CROSS-CVE-1 OpenCV pin deferred on upstream `gtsam` numpy<2 ABI; re-checked at every `/autodev` invocation per leftover protocol |

Blocker Analysis

| Blocker Type | Count | Prevention (carries to cycle 2) |
| --- | --- | --- |
| Tier-2 hardware/evidence (Jetson Orin) | 4 | Allocate Tier-2 Jetson access for AZ-595 + AZ-624 AC-5 re-run + AZ-687 AC-687-3; AZ-592/AZ-593 are unblocked only when the CI build env, DBoW2 vocab, and upstream choice are decided. |
| Upstream library pin (D-CROSS-CVE-1) | 1 | Out-of-band: the replay condition is `gtsam` numpy-2 wheels (or an alternate SE(3) backend). Re-check is debounced ≤ 2 h per leftover entry; no action needed in the dev cycle. |
| Mid-cycle scope correction | 1 | The fix already informed the 2026-05-18 LESSONS entry. Cross-cutting registry/factory state should be probed before classifying a per-task FAIL. |
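
The "probe cross-cutting state first" prevention can be sketched as a small classification step that runs before any per-task FAIL is recorded. Everything in this sketch is hypothetical: the function name, the task ids, and the strategy names are illustrative, and the registry is a plain dict standing in for something like `_STRATEGY_REGISTRY`.

```python
def classify_failures(
    registry: dict[str, object],
    required: dict[str, str],
) -> dict[str, list[str]]:
    """Split failing tasks into one cross-cutting gap vs. genuine per-task gaps.

    `required` maps a task id to the registry key that task depends on.
    """
    if not registry:
        # An empty registry fails every dependent task for the same reason,
        # so it should be reported as a single cross-cutting gap.
        return {"cross_cutting": sorted(required), "per_task": []}
    missing = sorted(task for task, key in required.items() if key not in registry)
    return {"cross_cutting": [], "per_task": missing}


# Hypothetical task ids and strategy names, for illustration only.
needs = {"TASK-1": "strategy_a", "TASK-2": "strategy_b"}
print(classify_failures({}, needs))
print(classify_failures({"strategy_a": object()}, needs))
```

With an empty registry both tasks collapse into one cross-cutting finding, which is the shape of the AZ-589/AZ-590 → AZ-591 correction; once the registry has entries, only the task whose key is actually absent is flagged.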

Trend Comparison

Previous retrospective: N/A — first retro.

Cycle 1 establishes the baseline. From cycle 2 onward, this section will compare against:

| Metric | Current (cycle-1 baseline) | Target (cycle-2) |
| --- | --- | --- |
| Code-review pass rate (PASS / total) | ≈ 44 % PASS, ≈ 55 % PASS_WITH_WARNINGS, 0 % FAIL | Maintain 0 % FAIL; lift PASS share by landing CR-F1 + CR-F2 hygiene PBIs early |
| Avg findings per batch (Medium + Low) | ≈ 0.16 (≈ 16 total Medium/Low across 97 batches, dominated by helper-duplication carryovers) | ≤ 0.15 |
| Blocked tasks (cycle close) | 4 (all Tier-2 rooted) | ≤ 2; AZ-624 AC-5 + AZ-687 AC-687-3 close once the Tier-2 run lands |
| Mid-cycle remediation post-mortems | 1 | 0 |
| Structural baseline file present | No (gap) | Yes |
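
The baseline percentages in this table follow from the approximate verdict counts and finding totals reported earlier in the retro. A minimal sketch of that arithmetic, with illustrative function names:

```python
# Approximate counts taken from the Code Review Verdicts table.
VERDICT_COUNTS = {"PASS": 24, "PASS_WITH_WARNINGS": 30, "FAIL": 0}

# Approximate totals quoted in the Trend Comparison table.
MEDIUM_LOW_FINDINGS = 16
BATCHES = 97


def verdict_shares(counts: dict[str, int]) -> dict[str, float]:
    """Percentage share of each verdict, rounded to one decimal place."""
    total = sum(counts.values())
    return {verdict: round(100 * n / total, 1) for verdict, n in counts.items()}


def findings_per_batch(findings: int, batches: int) -> float:
    """Average number of findings per batch, rounded to two decimal places."""
    return round(findings / batches, 2)


print(verdict_shares(VERDICT_COUNTS))
print(findings_per_batch(MEDIUM_LOW_FINDINGS, BATCHES))
```

Re-running the same two functions on cycle-2 counts gives directly comparable numbers for the target column.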

Top 3 Improvement Actions

1. Land the two open hygiene PBIs (CR-F1 + CR-F2) before any new NFT helper expansion in cycle 2.
   - Impact: drops the only carried-over Medium findings; consolidates `csv_evidence_writer` + `fixture_path.resolve` into one canonical helper module under `e2e/runner/helpers/`; unblocks future NFT additions without duplication drift.
   - Effort: low (combined 5 cp; both already pre-decomposed in `cumulative_review_batches_88-92_cycle1_report.md`).
2. Sequence AZ-595 (SITL observer + FDR replay fixture) as the first product task of cycle 2.
   - Impact: closes 17 NFT scenarios in one PBI, namely every NFT-PERF / NFT-RES / NFT-SEC scenario that currently `sitl_replay_ready`-skips on the Tier-1 docker harness. Also unblocks the Tier-2 evidence files for AZ-624 AC-5 + AZ-687 AC-687-3 once Jetson hardware is available.
   - Effort: medium (5 cp, in scope of a single batch; depends only on already-done tasks).
3. Create `_docs/02_document/architecture_compliance_baseline.md` as a Step 6 (Decompose) prerequisite.
   - Impact: cumulative reviews can emit `## Baseline Delta` sections from cycle 2 onward, quantifying architecture violations carried over / resolved / newly introduced per cycle. Without it, structural regressions are invisible to the retro process.
   - Effort: low (small file: scrape ADR list + initial violation count = 0 for cycle 2; the structural snapshot in this retro can seed the baseline). Suggest landing it in cycle 2 Step 6 as a precondition.
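
The third action amounts to a "create if missing" check. The sketch below shows one way that check could look; the file path is the one named in this retro, while the function name and the file's contents (heading, `Baseline violations: 0` line, ADR list layout) are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Path named in this retro, relative to the repository root.
BASELINE_RELATIVE = Path("_docs/02_document/architecture_compliance_baseline.md")


def ensure_baseline(root: Path, adr_ids: list[str]) -> bool:
    """Create the baseline file with zero violations if it is missing.

    Returns True when the file was created, False when it already existed.
    The file layout written here is an assumed placeholder format.
    """
    target = root / BASELINE_RELATIVE
    if target.exists():
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    lines = ["# Architecture compliance baseline", "", "Baseline violations: 0", "", "ADRs:"]
    lines += [f"- {adr}" for adr in adr_ids]
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


# Demo against a throwaway directory so no real repository is touched.
with tempfile.TemporaryDirectory() as tmp:
    print(ensure_baseline(Path(tmp), ["ADR-002", "ADR-011"]))  # first call creates the file
    print(ensure_baseline(Path(tmp), ["ADR-002", "ADR-011"]))  # second call is a no-op
```

Making the check idempotent lets it run at the start of every Decompose step without overwriting a baseline that later cycles have already updated.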

Suggested Rule/Skill Updates

| File | Change | Rationale |
| --- | --- | --- |
| `.cursor/skills/decompose/SKILL.md` (Step 6 prerequisites) | Add a prerequisite check that `_docs/02_document/architecture_compliance_baseline.md` exists; create it with `0` baseline violations if missing | Closes the gap that cumulative reviews flagged repeatedly across cycle 1 ("`architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted") |
| `.cursor/skills/implement/SKILL.md` (Step 15 Completeness Gate) | Before classifying any per-task FAIL, run a workspace grep for cross-cutting state the task depends on (e.g. central registries, factory dispatch tables); if the actual gap is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks | AZ-589 + AZ-590 → AZ-591 post-mortem; saves a wasted remediation-task round-trip |
| `.cursor/skills/decompose/SKILL.md` (or a new sub-step) | Identify the fixture-builder dependency surface explicitly during test-task decomposition: if N test tasks share a single un-built fixture, schedule the fixture builder ahead of the dependent tasks as a P0 prerequisite, not as a peer | AZ-595 surfaced as a late-cycle 17-scenario blocker; it would have been a 1-task P0 if decompose had cross-referenced the fixture references in each test spec |
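
The fixture-dependency check proposed in the last row can be sketched as a grouping pass over the test specs. This is a minimal sketch under stated assumptions: the function name, the scenario ids, and the fixture names are all hypothetical, and the mapping from test task to fixtures is assumed to have been extracted from the specs beforehand.

```python
from collections import defaultdict


def p0_fixture_candidates(
    task_fixtures: dict[str, set[str]],
    built: set[str],
    min_dependents: int = 2,
) -> dict[str, list[str]]:
    """Map each un-built fixture to the tasks waiting on it.

    Only fixtures with at least `min_dependents` dependent tasks are returned,
    since those are the ones worth scheduling ahead as a P0 prerequisite.
    """
    dependents: dict[str, list[str]] = defaultdict(list)
    for task, fixtures in task_fixtures.items():
        for fixture in fixtures:
            if fixture not in built:
                dependents[fixture].append(task)
    return {
        fixture: sorted(tasks)
        for fixture, tasks in dependents.items()
        if len(tasks) >= min_dependents
    }


# Hypothetical scenario ids and fixture names, for illustration only.
specs = {
    "SCENARIO-1": {"replay_fixture", "csv_fixture"},
    "SCENARIO-2": {"replay_fixture"},
    "SCENARIO-3": {"replay_fixture", "other_fixture"},
}
print(p0_fixture_candidates(specs, built={"csv_fixture"}))
```

A fixture that shows up here with many dependents is the pattern AZ-595 exposed late in cycle 1; surfacing it during decomposition turns a multi-scenario blocker into a single early task.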

Process Leftovers (out of band)

`_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`: still OPEN; replay condition unchanged (`gtsam` numpy<2). Re-checked at the start of every `/autodev` invocation; last check 2026-05-20T05:51 UTC+3.

End of cycle-1 retrospective.
