Oleksandr Bezdieniezhnykh 940066bee2 chore: WIP pre-implement
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 17:09:13 +03:00

Retrospective — 2026-05-26 (Cycle 3)

Cycle-3 retrospective for GPS-Denied Onboard. Cycle 3 spans 2026-05-21 → 2026-05-26 (post-cycle-2 → Step 17 Retrospective). Generated by /autodev existing-code Step 17 (Retrospective, cycle-end mode). Prior retro: retro_2026-05-20.md (cycle 1).

Process gap: no cycle-2 retro was filed. Cycle 2 transitioned straight from Step 11 into cycle-3 work, and the autodev session boundary between cycles 2 and 3 ran without invoking Step 17. This retro partially covers cycle-2 trend deltas where the data is still available on disk, and explicitly flags the missing retro as an Improvement Action below.

Implementation Summary

Cycle 3 scope (2026-05-21 → 2026-05-26)

| Metric | Value |
| --- | --- |
| Tickets closed in cycle 3 (_docs/02_tasks/done/AZ-83{6..9}*, AZ-84{0,5,6,7}*) | 7 (AZ-836, AZ-838, AZ-839, AZ-840, AZ-845, AZ-846, AZ-847) |
| Tickets touched but split off (deferred to cycle 4) | 2 (AZ-848: 5 SP; AZ-883: 2 SP; both surfaced during this cycle's release flow) |
| Tickets in todo/ at cycle-3 close (open work) | 1 (AZ-848, the deferred one; AZ-883 mirror also written) |
| Cycle 3 batches (batch_*_cycle3_report.md) | 6 (104, 106, 107, 108, 108b, 109); batch 105 is reserved/missing; 108b is a same-day follow-up to 108 |
| Cycle 3 src delta | 1 commit (fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint); +43 / -36 LoC across 4 files in _types/, c11_tile_manager/, replay_input/ |
| Cycle duration | ~6 days (2026-05-21 first cycle-3 batch → 2026-05-26 retro) |
| Avg tasks per batch | 7 tickets ÷ 6 batches ≈ 1.2 tasks/batch |
| Estimated total complexity points | ~22 SP delivered (3 + 3 + 5 + 3 + 2 + 2 + 4 estimated across AZ-836/838/839/840/845/846/847); plus AZ-844 closeout work (3 SP); deferred 7 SP (AZ-848 5 + AZ-883 2) |
| Carry-over from cycle 1's Top 3 Improvement Actions | 1/3 at best (#1 partial / unclear; #2 and #3 not done; see "Trend Comparison" below) |

Cumulative (cycle 1 + 2 + 3)

| Metric | Value (this retro) | Cycle-1 retro |
| --- | --- | --- |
| Total tickets closed (lifetime) | ~175 (cycle 1: 165 + cycle 2: ~3-5 + cycle 3: 7) | 165 |
| Total batches (lifetime) | 109 (cycle 1: 97; cycle 2: 5; cycle 3: 6 + 1 inter-cycle batch 109 numbering) | 97 |
| Source LoC, src/ Python | 61,071 (unchanged vs cycle-1; cycle-3 delta is a refactor, not a feature; cycle-2 src delta also small per Step 11 report) | 61,071 |
| Components | 15 (unchanged) | 15 |
| Binary tracks | 3 (airborne, research, operator-orchestrator) | 3 |

Quality Metrics

Code Review Verdicts (cycle-3 batches)

| Batch | Ticket | Verdict | Notes |
| --- | --- | --- | --- |
| 104 | AZ-777 Phase 1 | PASS_WITH_WARNINGS | 3 findings (1 Medium); AZ-777 Phase 1 closed |
| 106 | AZ-836 (TlogRouteExtractor) | PASS | Single-task batch; 10 ACs all PASS |
| 107 | AZ-838 (SatelliteProviderRouteClient + seed_route CLI) | PASS_WITH_WARNINGS | C2, Epic AZ-835 |
| 108 | AZ-839 (operator_pre_flight_setup real fixture) | PASS_WITH_WARNINGS | C3, Epic AZ-835 |
| 108b | AZ-839 follow-up (fix C3 fixture path mismatch) | PASS | Single-finding fix; no new findings |
| 109 | AZ-840 (e2e orchestrator test) | PASS_WITH_WARNINGS | C4, Epic AZ-835; 17 unit tests; 3 SP per spec |

Verdict distribution (cycle-3 only):

| Verdict | Count | % of cycle-3 batches |
| --- | --- | --- |
| PASS | 2 | 33.3 % |
| PASS_WITH_WARNINGS | 4 | 66.7 % |
| FAIL | 0 | 0 % |
| BLOCKED | 0 | 0 % |
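The distribution above can be recomputed from the per-batch verdicts. The following is a minimal sketch, not part of the autodev tooling; the dictionary is transcribed by hand from the batch table, and the function name `verdict_distribution` is a hypothetical helper.

```python
from collections import Counter

# Verdicts transcribed from the cycle-3 batch table (batch id -> verdict).
CYCLE3_VERDICTS = {
    "104": "PASS_WITH_WARNINGS",
    "106": "PASS",
    "107": "PASS_WITH_WARNINGS",
    "108": "PASS_WITH_WARNINGS",
    "108b": "PASS",
    "109": "PASS_WITH_WARNINGS",
}

VERDICT_ORDER = ("PASS", "PASS_WITH_WARNINGS", "FAIL", "BLOCKED")


def verdict_distribution(verdicts):
    """Return {verdict: (count, percent)}; percent is rounded to one decimal."""
    if not verdicts:
        raise ValueError("no batches to summarise")
    counts = Counter(verdicts.values())
    total = len(verdicts)
    return {
        name: (counts[name], round(100.0 * counts[name] / total, 1))
        for name in VERDICT_ORDER
    }


if __name__ == "__main__":
    for name, (count, pct) in verdict_distribution(CYCLE3_VERDICTS).items():
        print(f"{name}: {count} ({pct} %)")
```

Keeping the per-batch verdicts as data makes the percentages reproducible when a follow-up batch such as 108b is added late.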

Auto-fix loop did not escalate to user intervention across cycle 3.

Cycle 3 — Findings (qualitative; no aggregated severity table in batch reports)

The 6 cycle-3 batches did NOT use a | Critical | High | Medium | Low | table convention (grep found zero matches). Findings appear in inline ## Code review sections only. Per-severity breakdown:

| Severity | Cycle 3 count | Trend vs cycle 1 |
| --- | --- | --- |
| Critical | 0 | maintained (0 in cycle 1 too) |
| High | 0 | maintained (0 in cycle 1 too) |
| Medium | 1 (batch 104, AZ-777 Phase 1) | dropped: cycle 1 carried 2 (CR-F1, CR-F2); see Trend Comparison |
| Low | ~3 (informal counts across PASS_WITH_WARNINGS batches; not enumerated in tables) | ~5 → ~3 (trend down) |

Quality Gates Late in the Cycle (Steps 11–16.5)

The interesting findings of cycle 3 did NOT come from in-batch code review — they came from the autodev quality-gate steps:

| Step | Surface | Outcome |
| --- | --- | --- |
| 11 Run Tests (Jetson e2e) | AZ-848: eskf_filter_divergence at frame 3 in test_derkachi_1min.py | 4 deterministic failures; root cause re-diagnosed 2026-05-26 as VioOutput.emitted_at_ns clock-source mismatch (NOT IMU-vs-IMU as initially hypothesised). Split AZ-883 for a secondary latent bug (_handle_imu SCALED_IMU2 ts_ns=0). |
| 14 Security Audit | Resumed prior 2026-05-19 audit; verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 5 Medium, 17 Low; same as cycle 1) | No new vulnerabilities introduced by cycle-3 refactor; existing OpenCV CVE pin replay condition unchanged. |
| 15 Performance Test | NFRs 4/4 Unverified on Tier-1 (same as cycle 1 + 2); pure-logic evaluator unit tests 70/70 PASS | Surfaced EVIDENCE_OUT default-path bug (/e2e-results is container-only; breaks Tier-1 host runs) → leftover _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md filed; perf report perf_2026-05-26_cycle3-tier1-probe.md written. |
| 16 Deploy | Resumed from cycle-1 greenfield artifacts; no cycle-3 deltas required | Deploy artifacts all present (compose files, scripts/, env templates); operator workstation deploy is the production target for operator-orchestrator. |
| 16.5 Release | First-ever release; ran bench-test on jetson-e2e lab Jetson | Verdict: Released. Failure profile byte-identical to Step 11 (4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed); no NEW cycle-3-scope regressions. AZ-848 / AZ-883 explicitly carried forward to cycle 4. |

Structural Metrics

_docs/02_document/architecture_compliance_baseline.md still does not exist — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.

Delta vs structure_2026-05-20.md:

| Metric | Cycle 1 close | Cycle 3 close | Delta |
| --- | --- | --- | --- |
| Component count | 15 | 15 | 0 |
| Source LoC, src/ Python | 61,071 | 61,071 (+7 net from fd52cc9; RouteSpec relocation is effectively net-neutral) | ~0 |
| Cycles in component import graph | 0 | 0 (verified: cycle-3 commit only relocates a type, no new edges) | 0 (healthy) |
| Cross-component edges, count | Concentrated in runtime_root/ factories | Same | 0 |
| Contract files | 5 | 5 (no new contracts in cycle 3; refactor cycle) | 0 |
| architecture_compliance_baseline.md present | No | No (carried over gap) | +0 (still missing) |
| New Architecture violations this cycle | n/a (no baseline) | 0 (none flagged in cumulative reviews) | n/a |
| Public-API symbol contract coverage % | not computed | not computed | n/a |

A fresh structural snapshot for this retro is not produced — the structure is unchanged from cycle 1 (verified via the 7 LoC delta and 0 new components). structure_2026-05-20.md remains the current authoritative snapshot. The next cycle that materially changes structure (e.g., AZ-848 contract repair adds a new field to VioOutput; cycle-4 C1 work) should re-snapshot.

Efficiency

| Metric | Cycle 3 value | Cycle 1 value |
| --- | --- | --- |
| Blocked tasks at cycle close (Tier-2 hardware or otherwise) | 1 in todo/ (AZ-848 deferred) + 1 mirror (AZ-883); both filed in this retro session, NOT blockers for cycle close | 4 (all Tier-2 hardware rooted) |
| Tasks requiring fixes after review | 1 (batch 108b is a same-day fix follow-up to 108 for a fixture path mismatch; minor) | ~5 |
| Auto-fix loop escalations to user | 0 | 0 |
| Mid-cycle remediation post-mortems | 0 | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 0 | 1 (Step 11 → Step 7 for AZ-618) |
| Mid-cycle ticket splits (NEW: surfaced + split during quality-gate step) | 1 (AZ-848 → split AZ-883 during release-flow investigation) | 0 |
| Process leftovers opened this cycle | 1 (2026-05-26_evidence_out_default_path.md) | 1 (D-CROSS-CVE-1; still open) |
| Process leftovers closed this cycle | 0 | 0 |

Blocker Analysis

| Blocker Type | Count (cycle 3) | Prevention (carries to cycle 4) |
| --- | --- | --- |
| Jetson tlog-replay path broken at frame 3 (AZ-848) | 1 | Cycle 4 first product task; primary AC: VioOutput.emitted_at_ns contract repaired so add_vio and add_fc_imu share the FC-boot timebase. |
| _handle_imu SCALED_IMU2 latent bug (AZ-883) | 1 | Cycle 4; independent of AZ-848; 2 SP. |
| EVIDENCE_OUT default path container-only | 1 | Leftover at _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md; cycle-4 quick win (15 min). |
| OpenCV CVE pin replay condition (D-CROSS-CVE-1) | 1 (carried from cycle 1) | Out-of-band; re-check at every /autodev invocation; unchanged across cycles 1-3. |
| Tier-2 hardware/evidence (AZ-595 fixtures, AZ-592/AZ-593 VIO native bindings) | 0 (cycle 3 did not need them; cycle 1 had 4 of these) | Re-emerge in cycle 4 if AZ-595 SITL fixture is sequenced. |
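To illustrate why a clock-source mismatch of the AZ-848 kind can break a filter within a few frames, here is a small sketch. It assumes the mismatch is between a long-running host clock and the FC-boot timebase, which the retro's reproduction notes suggest but do not spell out. The function `plausible_same_timebase`, the 5-second threshold, and all timestamp values are hypothetical and are not taken from the project code or the AZ-848 logs.

```python
NS_PER_S = 1_000_000_000


def plausible_same_timebase(last_ts_ns: int, new_ts_ns: int,
                            max_gap_s: float = 5.0) -> bool:
    """Heuristic guard: consecutive measurements fed to one filter should be
    at most a few seconds apart if they were stamped by the same clock."""
    return abs(new_ts_ns - last_ts_ns) <= max_gap_s * NS_PER_S


# Illustrative numbers only.
fc_imu_ts_ns = 12 * NS_PER_S                      # FC booted 12 s ago
host_vio_ts_ns = 3 * 24 * 3600 * NS_PER_S         # host clock with 3 days of uptime
repaired_vio_ts_ns = fc_imu_ts_ns + 40_000_000    # VIO restamped in FC-boot time, 40 ms later

if __name__ == "__main__":
    # A filter that propagates over the gap between these two timestamps
    # would integrate across roughly three days in one step.
    print(plausible_same_timebase(fc_imu_ts_ns, host_vio_ts_ns))
    print(plausible_same_timebase(fc_imu_ts_ns, repaired_vio_ts_ns))
```

A guard like this only detects the symptom; the prevention listed in the table is the contract repair itself, so that both inputs are stamped in the FC-boot timebase.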

Trend Comparison

Previous retro: retro_2026-05-20.md (cycle 1 close).

Cycle-1 Top 3 Improvement Actions — fulfillment status

| # | Action | Status at cycle-3 close | Evidence |
| --- | --- | --- | --- |
| 1 | Land CR-F1 + CR-F2 hygiene PBIs before any new NFT helper expansion in cycle 2 | Partial / unclear: no batch report for CR-F1 / CR-F2 specifically in cycle 2 batches (98-102); but cycle-3 batches do not surface duplicated csv_evidence_writer / fixture_path helpers, suggesting silent absorption or the work is yet to land | Cycle-2 batches 98-102, cycle-3 batches 104-109: no new Medium-severity helper-duplication findings |
| 2 | Sequence AZ-595 as first product task of cycle 2 | Not done: AZ-595 still listed as backlog item in cycle-1 retro language; no cycle-2 batch references AZ-595; the 17 NFT scenarios likely still skip on sitl_replay_ready | Glob _docs/02_tasks/done/AZ-595*: file absent from done/ |
| 3 | Create architecture_compliance_baseline.md as Step 6 prerequisite | Not done: file still missing at cycle-3 close (verified via glob) | _docs/02_document/architecture_compliance_baseline.md does not exist |

Net assessment: cycle-1 retro's Top 3 actions were largely not delivered. The cycle-2-retro skip is the proximate cause — without a cycle-2 retro to surface non-delivery, the actions sat invisible.

Metric Comparison

| Metric | Cycle 1 baseline | Cycle 3 close | Target (cycle 4) |
| --- | --- | --- | --- |
| Code-review verdict mix | ~44 % PASS / ~55 % PASS_WITH_WARNINGS / 0 % FAIL | 33 % PASS / 67 % PASS_WITH_WARNINGS / 0 % FAIL | Maintain 0 % FAIL; lift PASS to ≥50 % via AZ-848 fix landing cleanly (a single-finding batch tends to be PASS) |
| Avg findings per batch (Medium + Low) | ~0.2 | ~0.7 (one Medium in batch 104 + ~3 Lows across 4 PASS_WITH_WARNINGS = ~4 ÷ 6) | ≤ 0.5 |
| Mid-cycle remediation post-mortems | 1 | 0 | 0 |
| Mid-cycle ticket splits | 0 | 1 (AZ-848 → AZ-883); good (correct discipline; not bad churn) | maintain (split discipline) |
| Structural baseline file present | No | No (gap carried 2 cycles) | Yes: drop it into cycle 4 Step 6 |
| Cycle-N retro filed at cycle-N close | Yes | No for cycle 2; yes for cycle 3 | Yes: fix the autodev orchestrator gap |

Top 3 Improvement Actions (cycle 4)

  1. Land the AZ-848 fix as cycle-4 first product task; bench-verify on Jetson before merging.

    • Impact: unblocks the Jetson e2e tlog-replay path that's been broken since cycle 2 (the AZ-776 xfail removal). Required for any real airborne release. Carries an explicit verification protocol: long-uptime Jetson + freshly-booted FC reproduces deterministically.
    • Effort: 5 SP (per the revised spec). The fix touches the C1 VioOutput.emitted_at_ns contract and every C1 strategy that fills the field; well-scoped.
    • Pair with: AZ-883 (2 SP, _handle_imu SCALED_IMU2 ts_ns=0) — independent fix but same investigation surface.
  2. File a cycle-2 retro retroactively + add an autodev sanity check that flags missing retros.

    • Impact: cycle-1 retro's Top-3 actions all sat invisible because no cycle-2 retro re-surfaced them. The autodev orchestrator's Step 17 should refuse to enter Step 9 cycle-N+1 if retro_*.md for cycle N is absent. Catches future retro skips at the next session boundary, not 6 weeks later.
    • Effort: small (1 SP for the autodev state check; +2 SP to write the catch-up cycle-2 retro from artifacts already on disk).
  3. Land architecture_compliance_baseline.md as cycle-4 Step-6 prerequisite (third try).

    • Impact: same rationale as cycle-1 retro Improvement Action #3 — cumulative reviews still cannot emit ## Baseline Delta sections; structural regressions remain invisible across cycles.
    • Effort: ~1 SP (small file; seed from structure_2026-05-20.md with 0 violations baseline). The right insertion point is cycle 4's decompose phase; if decompose runs without it, fail-fast and create.
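The missing-retro check from action 2 could look roughly like the sketch below. It assumes retros are named retro_<YYYY-MM-DD>.md in a single metrics directory and that a retro "covers" a cycle when it is dated on or after that cycle's start. The function names and the way the cycle start date is supplied are hypothetical, not the autodev orchestrator's actual state model.

```python
import re
from datetime import date
from pathlib import Path

RETRO_NAME = re.compile(r"^retro_(\d{4}-\d{2}-\d{2})\.md$")


def retro_dates(metrics_dir: Path) -> list[date]:
    """Collect the dates encoded in retro_<YYYY-MM-DD>.md file names."""
    found = []
    for path in metrics_dir.glob("retro_*.md"):
        match = RETRO_NAME.match(path.name)
        if not match:
            continue
        try:
            found.append(date.fromisoformat(match.group(1)))
        except ValueError:
            continue  # name looks like a date but is not a valid one
    return sorted(found)


def can_enter_next_cycle(metrics_dir: Path, closing_cycle_start: date) -> bool:
    """Allow the Step 17 -> Step 9 chain only if some retro is dated on or
    after the start of the cycle that is being closed."""
    return any(d >= closing_cycle_start for d in retro_dates(metrics_dir))
```

With only retro_2026-05-20.md on disk, a cycle that started on 2026-05-21 would be refused, which is the gap this retro describes for cycle 2.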

Suggested Rule / Skill Updates

| File | Change | Rationale |
| --- | --- | --- |
| .cursor/skills/implement/SKILL.md (batch self-review or test sub-step) | Add a check: if the batch removes @pytest.mark.xfail decorators from any test, the same batch MUST include a green test execution against the actual hardware tier the test targets (or explicit tier-2-only skip documentation if hardware is unavailable in the batch session). Block PASS verdict without this evidence. | AZ-848 root cause: AZ-776 removed @xfail from AC-1/2/5/6 in cycle 2 with "AC-7 stating tests run on Jetson after this task → All five pass". The Jetson run was never performed. Predates the 2026-05 meta-rule.mdc "Real Results, Not Simulated Ones", but the implement skill's own self-review should also enforce it. |
| .cursor/skills/autodev/state.md or flows/existing-code.md (Re-Entry section) | When auto-chaining from Step 17 (Retrospective) to Step 9 (New Task) with cycle: state.cycle + 1, FIRST verify that _docs/06_metrics/retro_<YYYY-MM-DD>.md exists for the previous cycle. If absent, BLOCK and surface the gap. | Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3. Cycle-1 retro's Top-3 actions sat invisible as a result. |
| .cursor/skills/release/SKILL.md Phase 2 strategy table | Add an explicit row: bench-test: bench-rig verification on real hardware via test compose (docker-compose.test.jetson.yml style); not a production deploy; collapses Phases 3+4 into one harness run; Phase 5 explicitly N/A; allowed for first-release / refactor-only cycles. | Cycle-3 release used this strategy ad-hoc; the skill's existing table forced a "manual" classification that doesn't quite fit. |
| .cursor/skills/release/SKILL.md Phase 1 rollback-readiness | When .previous-tags.env does NOT exist AND no release/* git tag exists, treat this as "first release" and accept docker compose down as the rollback path. Do NOT block on absent rollback target. | First-time release was a Phase 1 blocking gate per the current strict reading; cycle 3's bench-test release had to navigate it inline. |
| .cursor/skills/test-spec/SKILL.md (cycle-update mode) | When the cycle-update task list includes a ticket that touches a Protocol / dataclass / contract field semantics (e.g., VioOutput.emitted_at_ns), the test-spec sync MUST flag downstream consumers explicitly (e.g., C5 ESKF + C13 FDR both read emitted_at_ns). | AZ-848 affected C1 contract semantics; downstream C5 and C13 each read the field. The test-spec sync didn't flag this in cycle 2 when AZ-776 changed adjacent code. |
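The xfail-removal check proposed for the implement skill can be approximated by scanning a batch's unified diff. This is a rough sketch under stated assumptions: the diff is plain `git diff` text, decorators are written as `@pytest.mark.xfail`, and the evidence flag is supplied by the caller. The helper names `removed_xfail_count` and `pass_verdict_allowed` are hypothetical.

```python
import re

XFAIL = re.compile(r"@pytest\.mark\.xfail\b")


def removed_xfail_count(unified_diff: str) -> int:
    """Net number of @pytest.mark.xfail decorator lines removed by a diff.

    Lines that are removed and re-added (for example a moved test) cancel out.
    """
    removed = added = 0
    for line in unified_diff.splitlines():
        if line.startswith(("---", "+++")):
            continue  # file headers, not content changes
        if line.startswith("-") and XFAIL.search(line):
            removed += 1
        elif line.startswith("+") and XFAIL.search(line):
            added += 1
    return max(removed - added, 0)


def pass_verdict_allowed(unified_diff: str, has_hardware_evidence: bool) -> bool:
    """Block a PASS verdict when xfail markers were dropped without a recorded
    run on the target hardware tier (or documented tier-2-only skip)."""
    return removed_xfail_count(unified_diff) == 0 or has_hardware_evidence


SAMPLE_DIFF = """\
--- a/tests/test_example.py
+++ b/tests/test_example.py
@@ -1,4 +1,3 @@
-@pytest.mark.xfail(reason="needs Jetson")
 def test_replay():
     assert run_replay()
"""
```

The sample diff and its file path are invented for illustration. A text scan like this misses aliased imports or `xfail` applied through `pytest.param`, so it is a first-line check rather than a complete one.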

Process Leftovers (open at snapshot)

  • _docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md — OPEN; gtsam numpy<2 ABI replay condition unchanged. Last check: 2026-05-26 in this session.
  • _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md — OPEN (NEW this cycle); EVIDENCE_OUT default path is container-only; Tier-1 host runs need explicit override; workaround documented; 1 SP fix queued for cycle 4.
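One possible shape for the EVIDENCE_OUT fix is sketched below. The leftover only states that the default /e2e-results is container-only and that Tier-1 host runs need an explicit override; the host-relative fallback directory and the function name `resolve_evidence_out` are assumptions made for illustration, not the queued fix.

```python
import os
from pathlib import Path

CONTAINER_DEFAULT = Path("/e2e-results")


def resolve_evidence_out(env=None, host_fallback=Path("e2e-results")) -> Path:
    """Pick the evidence directory.

    Order: explicit EVIDENCE_OUT override, then the container default if it
    exists and is writable, then a host-relative fallback (an assumption).
    """
    env = os.environ if env is None else env
    override = env.get("EVIDENCE_OUT")
    if override:
        return Path(override)
    if CONTAINER_DEFAULT.is_dir() and os.access(CONTAINER_DEFAULT, os.W_OK):
        return CONTAINER_DEFAULT
    return host_fallback
```

Passing the environment as a parameter keeps the function testable without mutating `os.environ`; for example, `resolve_evidence_out({"EVIDENCE_OUT": "/tmp/evidence"})` returns `Path("/tmp/evidence")`.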

End of cycle-3 retrospective.