Retrospective — 2026-05-26 (Cycle 3)
Cycle-3 retrospective for GPS-Denied Onboard. Cycle 3 spans 2026-05-21 → 2026-05-26 (post-cycle-2 → Step 17 Retrospective). Generated by `/autodev` existing-code Step 17 (Retrospective, cycle-end mode). Prior retro: `retro_2026-05-20.md` (cycle 1). Process gap: no cycle-2 retro was filed. Cycle 2 transitioned straight from Step 11 into cycle-3 work, and the autodev session boundary between cycles 2 and 3 ran without invoking Step 17. This retro partially covers cycle-2 trend deltas where the data is still available on disk, and explicitly flags the missing retro as an Improvement Action below.
Implementation Summary
Cycle 3 scope (2026-05-21 → 2026-05-26)
| Metric | Value |
|---|---|
| Tickets closed in cycle 3 (`_docs/02_tasks/done/AZ-83{6..9}*`, `AZ-84{0,5,6,7}*`) | 7 (AZ-836, AZ-838, AZ-839, AZ-840, AZ-845, AZ-846, AZ-847) |
| Tickets touched but split off (deferred to cycle 4) | 2 (AZ-848 — 5 SP, AZ-883 — 2 SP; both surfaced during this cycle's release flow) |
| Tickets in `todo/` at cycle-3 close (open work) | 1 (AZ-848, the deferred one; AZ-883 mirror also written) |
| Cycle 3 batches (`batch_*_cycle3_report.md`) | 6 (104, 106, 107, 108, 108b, 109); batch 105 is reserved/missing; 108b is a same-day follow-up to 108 |
| Cycle 3 src delta | 1 commit (fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint); +43 −36 LoC across 4 files in _types/, c11_tile_manager/, replay_input/ |
| Cycle duration | ~6 days (2026-05-21 first cycle-3 batch → 2026-05-26 retro) |
| Avg tasks per batch | 7 tickets ÷ 6 batches ≈ 1.2 tasks/batch |
| Estimated total complexity points | ~22 SP delivered (3 + 3 + 5 + 3 + 2 + 2 + 4 estimated across AZ-836/838/839/840/845/846/847); plus AZ-844 closeout work (3 SP); deferred 7 SP (AZ-848 5 + AZ-883 2) |
| Carry-over from cycle 1's Top 3 Improvement Actions | 1/3 fulfilled (see "Trend Comparison" below) |
Cumulative (cycle 1 + 2 + 3)
| Metric | Value (this retro) | Cycle-1 retro |
|---|---|---|
| Total tickets closed (lifetime) | ~175 (cycle 1: 165 + cycle 2: ~3-5 + cycle 3: 7) | 165 |
| Total batches (lifetime) | 109 (cycle 1: 97; cycle 2: 5; cycle 3: 6 + 1 inter-cycle batch 109 numbering) | 97 |
| Source LoC, `src/` Python | 61,071 (unchanged vs cycle 1; cycle-3 delta is a refactor, not a feature; cycle-2 src delta also small per Step 11 report) | 61,071 |
| Components | 15 (unchanged) | 15 |
| Binary tracks | 3 (airborne, research, operator-orchestrator) | 3 |
Quality Metrics
Code Review Verdicts (cycle-3 batches)
| Batch | Ticket | Verdict | Notes |
|---|---|---|---|
| 104 | AZ-777 Phase 1 | PASS_WITH_WARNINGS | 3 findings (1 Medium); AZ-777 Phase 1 closed |
| 106 | AZ-836 (TlogRouteExtractor) | PASS | Single-task batch; 10 ACs all PASS |
| 107 | AZ-838 (SatelliteProviderRouteClient + seed_route CLI) | PASS_WITH_WARNINGS | C2 — Epic AZ-835 |
| 108 | AZ-839 (operator_pre_flight_setup real fixture) | PASS_WITH_WARNINGS | C3 — Epic AZ-835 |
| 108b | AZ-839 follow-up (fix C3 fixture path mismatch) | PASS | Single-finding fix; no new findings |
| 109 | AZ-840 (e2e orchestrator test) | PASS_WITH_WARNINGS | C4 — Epic AZ-835; 17 unit tests; 3 SP per spec |
Verdict distribution (cycle-3 only):
| Verdict | Count | % of cycle-3 batches |
|---|---|---|
| PASS | 2 | 33.3 % |
| PASS_WITH_WARNINGS | 4 | 66.7 % |
| FAIL | 0 | 0 % |
| BLOCKED | 0 | 0 % |
Auto-fix loop did not escalate to user intervention across cycle 3.
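
The distribution above follows directly from the per-batch verdicts. As a minimal sketch (the `VERDICTS` mapping and the `verdict_distribution` helper are illustrative names, not part of the autodev tooling), the percentages can be recomputed like this:

```python
from collections import Counter

# Verdicts copied from the cycle-3 review table (batch -> verdict).
VERDICTS = {
    "104": "PASS_WITH_WARNINGS",
    "106": "PASS",
    "107": "PASS_WITH_WARNINGS",
    "108": "PASS_WITH_WARNINGS",
    "108b": "PASS",
    "109": "PASS_WITH_WARNINGS",
}


def verdict_distribution(verdicts):
    """Return {verdict: (count, percent)}, percent rounded to one decimal."""
    counts = Counter(verdicts.values())
    total = len(verdicts)
    return {v: (n, round(100.0 * n / total, 1)) for v, n in counts.items()}


if __name__ == "__main__":
    for verdict, (count, pct) in sorted(verdict_distribution(VERDICTS).items()):
        print(f"{verdict}: {count} ({pct} %)")
```

Keeping the raw mapping next to the summary makes it cheap to re-derive the numbers if a batch verdict is later amended.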
Cycle 3 — Findings (qualitative; no aggregated severity table in batch reports)
The 6 cycle-3 batch reports did NOT use a `| Critical | High | Medium | Low |` table convention (grep found zero matches). Findings appear in inline `## Code review` sections only. Per-severity breakdown:
| Severity | Cycle 3 count | Trend vs cycle 1 |
|---|---|---|
| Critical | 0 | maintained — 0 in cycle 1 too |
| High | 0 | maintained — 0 in cycle 1 too |
| Medium | 1 (batch 104, AZ-777 Phase 1) | dropped — cycle 1 carried 2 (CR-F1, CR-F2) — see Trend Comparison |
| Low | ~3 (informal counts across PASS_WITH_WARNINGS batches; not enumerated in tables) | ~5 → ~3 (trend down) |
Quality Gates Late in the Cycle (Steps 11–16.5)
The interesting findings of cycle 3 did NOT come from in-batch code review — they came from the autodev quality-gate steps:
| Step | Surface | Outcome |
|---|---|---|
| 11 Run Tests (Jetson e2e) | AZ-848: `eskf_filter_divergence` at frame 3 in `test_derkachi_1min.py` | 4 deterministic failures; root cause re-diagnosed 2026-05-26 as `VioOutput.emitted_at_ns` clock-source mismatch (NOT IMU-vs-IMU as initially hypothesised). Split AZ-883 for a secondary latent bug (`_handle_imu` SCALED_IMU2 `ts_ns=0`). |
| 14 Security Audit | Resumed prior 2026-05-19 audit; verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 5 Medium, 17 Low — same as cycle 1) | No new vulnerabilities introduced by cycle-3 refactor; existing OpenCV CVE pin replay condition unchanged. |
| 15 Performance Test | NFRs 4/4 Unverified on Tier-1 (same as cycle 1 + 2); pure-logic evaluator unit tests 70/70 PASS | Surfaced EVIDENCE_OUT default-path bug (/e2e-results is container-only; breaks Tier-1 host runs) → leftover _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md filed; perf report perf_2026-05-26_cycle3-tier1-probe.md written. |
| 16 Deploy | Resumed from cycle-1 greenfield artifacts; no cycle-3 deltas required | Deploy artifacts all present (compose files, scripts/, env templates); operator workstation deploy is the production target for operator-orchestrator. |
| 16.5 Release | First-ever release; ran bench-test on jetson-e2e lab Jetson | Verdict: Released. Failure profile byte-identical to Step 11 (4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed); no NEW cycle-3-scope regressions. AZ-848 / AZ-883 explicitly carried forward to cycle 4. |
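
The `EVIDENCE_OUT` finding from Step 15 comes down to a default that only exists inside the container. One way such a default could be made host-safe is sketched below. This is an assumption about a possible fix, not the fix queued for cycle 4: `resolve_evidence_out` and the `./e2e-results` host fallback are hypothetical names, and only `EVIDENCE_OUT` and `/e2e-results` come from the finding itself.

```python
import os
from pathlib import Path

# Container-only default named in the Step 15 finding.
CONTAINER_DEFAULT = Path("/e2e-results")


def resolve_evidence_out(env=None, host_fallback=None):
    """Pick an evidence directory that works in the container and on a host.

    Order of preference:
      1. an explicit EVIDENCE_OUT value,
      2. the container default, if it exists and is writable,
      3. a host-side fallback (hypothetical: ./e2e-results).
    The function only resolves a path; it does not create directories.
    """
    env = os.environ if env is None else env
    explicit = env.get("EVIDENCE_OUT")
    if explicit:
        return Path(explicit)
    if CONTAINER_DEFAULT.is_dir() and os.access(CONTAINER_DEFAULT, os.W_OK):
        return CONTAINER_DEFAULT
    if host_fallback is not None:
        return Path(host_fallback)
    return Path.cwd() / "e2e-results"
```

Passing `env` explicitly keeps the helper testable without mutating the process environment.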
Structural Metrics
_docs/02_document/architecture_compliance_baseline.md still does not exist — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.
Delta vs structure_2026-05-20.md:
| Metric | Cycle 1 close | Cycle 3 close | Delta |
|---|---|---|---|
| Component count | 15 | 15 | 0 |
| Source LoC, `src/` Python | 61,071 | 61,071 (+7 net from fd52cc9; RouteSpec relocation is net-neutral) | ~0 |
| Cycles in component import graph | 0 | 0 (verified — cycle-3 commit only relocates a type, no new edges) | 0 (healthy) |
| Cross-component edges, count | Concentrated in `runtime_root/` factories | Same | 0 |
| Contract files | 5 | 5 (no new contracts in cycle 3 — refactor cycle) | 0 |
| `architecture_compliance_baseline.md` present | No | No (carried over gap) | +0, still missing |
| New Architecture violations this cycle | n/a (no baseline) | 0 (none flagged in cumulative reviews) | n/a |
| Public-API symbol contract coverage % | not computed | not computed | n/a |
A fresh structural snapshot for this retro is not produced — the structure is unchanged from cycle 1 (verified via the 7 LoC delta and 0 new components). structure_2026-05-20.md remains the current authoritative snapshot. The next cycle that materially changes structure (e.g., AZ-848 contract repair adds a new field to VioOutput; cycle-4 C1 work) should re-snapshot.
Efficiency
| Metric | Cycle 3 value | Cycle 1 value |
|---|---|---|
| Blocked tasks at cycle close (Tier-2 hardware or otherwise) | 1 in todo/ (AZ-848 deferred) + 1 mirror (AZ-883) — both filed in this retro session, NOT blockers for cycle close | 4 (all Tier-2 hardware rooted) |
| Tasks requiring fixes after review | 1 (batch 108b is a same-day fix follow-up to 108 for a fixture path mismatch — minor) | ~5 |
| Auto-fix loop escalations to user | 0 | 0 |
| Mid-cycle remediation post-mortems | 0 | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 0 | 1 (Step 11 → Step 7 for AZ-618) |
| Mid-cycle ticket splits (NEW: surfaced + split during quality-gate step) | 1 (AZ-848 → split AZ-883 during release-flow investigation) | 0 |
| Process leftovers opened this cycle | 1 (`2026-05-26_evidence_out_default_path.md`) | 1 (D-CROSS-CVE-1, still open) |
| Process leftovers closed this cycle | 0 | 0 |
Blocker Analysis
| Blocker Type | Count (cycle 3) | Prevention (carries to cycle 4) |
|---|---|---|
| Jetson tlog-replay path broken at frame 3 (AZ-848) | 1 | Cycle 4 first product task; primary AC: VioOutput.emitted_at_ns contract repaired so add_vio and add_fc_imu share the FC-boot timebase. |
| `_handle_imu` SCALED_IMU2 latent bug (AZ-883) | 1 | Cycle 4; independent of AZ-848; 2 SP. |
| `EVIDENCE_OUT` default path container-only | 1 | Leftover at `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`; cycle-4 quick win (15 min). |
| OpenCV CVE pin replay condition (D-CROSS-CVE-1) | 1 (carried from cycle 1) | Out-of-band; re-check at every /autodev invocation; unchanged across cycles 1-3. |
| Tier-2 hardware/evidence (AZ-595 fixtures, AZ-592/AZ-593 VIO native bindings) | 0 (cycle 3 did not need them; cycle 1 had 4 of these) | Re-emerge in cycle 4 if AZ-595 SITL fixture is sequenced. |
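
To illustrate why a clock-source mismatch like the one behind AZ-848 matters, here is a minimal sketch of mapping one timebase onto another. It assumes, purely for illustration, that one stream is stamped from a host clock and the other from FC boot time; the `ClockOffset` helper and the numbers are hypothetical and do not describe the actual `VioOutput` contract or the planned fix.

```python
from dataclasses import dataclass

NS_PER_S = 1_000_000_000


@dataclass(frozen=True)
class ClockOffset:
    """Hypothetical helper: fc_boot_ns = host_ns + offset_ns."""

    offset_ns: int

    @classmethod
    def from_pair(cls, host_ns: int, fc_boot_ns: int) -> "ClockOffset":
        # One simultaneous (host, FC) pair is enough for a sketch; a real
        # estimator would filter many pairs to absorb link latency.
        return cls(offset_ns=fc_boot_ns - host_ns)

    def to_fc_boot(self, host_ns: int) -> int:
        """Convert a host-clock timestamp into the FC-boot timebase."""
        return host_ns + self.offset_ns


# Long-uptime host (30 days) next to a flight controller booted 5 s ago.
host_now = 30 * 24 * 3600 * NS_PER_S
fc_now = 5 * NS_PER_S
offset = ClockOffset.from_pair(host_now, fc_now)

# A sample stamped on the host clock 20 ms after the reference pair.
vio_host_ns = host_now + 20_000_000

# Comparing the raw host stamp with an FC stamp mixes two clocks.
raw_gap_s = (vio_host_ns - fc_now) / NS_PER_S
# After conversion both stamps live in the FC-boot timebase.
aligned_gap_s = (offset.to_fc_boot(vio_host_ns) - fc_now) / NS_PER_S
```

With mixed clocks the apparent gap is on the order of the host uptime rather than 20 ms, which matches the report's observation that a long-uptime Jetson paired with a freshly booted FC reproduces the failure deterministically.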
Trend Comparison
Previous retro: retro_2026-05-20.md (cycle 1 close).
Cycle-1 Top 3 Improvement Actions — fulfillment status
| # | Action | Status at cycle-3 close | Evidence |
|---|---|---|---|
| 1 | Land CR-F1 + CR-F2 hygiene PBIs before any new NFT helper expansion in cycle 2 | Partial / unclear: no batch report for CR-F1 / CR-F2 specifically in cycle 2 batches (98-102); but cycle-3 batches do not surface duplicated `csv_evidence_writer` / `fixture_path` helpers, suggesting silent absorption or the work is yet to land | Cycle-2 batches 98-102, cycle-3 batches 104-109: no new Medium-severity helper-duplication findings |
| 2 | Sequence AZ-595 as first product task of cycle 2 | Not done: AZ-595 still listed as backlog item in cycle-1 retro language; no cycle-2 batch references AZ-595; the 17 NFT scenarios likely still skip on `sitl_replay_ready` | Glob `_docs/02_tasks/done/AZ-595*`: file absent from `done/` |
| 3 | Create `architecture_compliance_baseline.md` as Step 6 prerequisite | Not done: file still missing at cycle-3 close (verified via glob) | `_docs/02_document/architecture_compliance_baseline.md` does not exist |
Net assessment: cycle-1 retro's Top 3 actions were largely not delivered. The cycle-2-retro skip is the proximate cause — without a cycle-2 retro to surface non-delivery, the actions sat invisible.
Metric Comparison
| Metric | Cycle 1 baseline | Cycle 3 close | Target (cycle 4) |
|---|---|---|---|
| Code-review verdict mix | ~44 % PASS / ~55 % PASS_WITH_WARNINGS / 0 % FAIL | 33 % PASS / 67 % PASS_WITH_WARNINGS / 0 % FAIL | Maintain 0 % FAIL; lift PASS to ≥50 % via AZ-848 fix landing cleanly (a single-finding-batch tends to be PASS) |
| Avg findings per batch (Medium + Low) | ~0.2 | ~0.7 (one Medium in batch 104 + ~3 Lows across 4 PASS_WITH_WARNINGS = ~4 ÷ 6) | ≤ 0.5 |
| Mid-cycle remediation post-mortems | 1 | 0 | 0 |
| Mid-cycle ticket splits | 0 | 1 (AZ-848 → AZ-883) — good (correct discipline; not bad churn) | maintain (split discipline) |
| Structural baseline file present | No | No (gap carried 2 cycles) | Yes — drop it into cycle 4 Step 6 |
| Cycle-N retro filed at cycle-N close | Yes | No for cycle 2; yes for cycle 3 | Yes — fix the autodev orchestrator gap |
Top 3 Improvement Actions (cycle 4)
- Land the AZ-848 fix as cycle-4 first product task; bench-verify on Jetson before merging.
  - Impact: unblocks the Jetson e2e tlog-replay path that's been broken since cycle 2 (the AZ-776 xfail removal). Required for any real airborne release. Carries an explicit verification protocol: long-uptime Jetson + freshly-booted FC reproduces deterministically.
  - Effort: 5 SP (per the revised spec). The fix touches the C1 `VioOutput.emitted_at_ns` contract and every C1 strategy that fills the field; well-scoped.
  - Pair with: AZ-883 (2 SP, `_handle_imu` SCALED_IMU2 `ts_ns=0`); independent fix but same investigation surface.
- File a cycle-2 retro retroactively + add an autodev sanity check that flags missing retros.
  - Impact: cycle-1 retro's Top-3 actions all sat invisible because no cycle-2 retro re-surfaced them. The autodev orchestrator's Step 17 should refuse to enter Step 9 cycle-N+1 if `retro_*.md` for cycle N is absent. Catches future retro skips at the next session boundary, not 6 weeks later.
  - Effort: small (1 SP for the autodev state check; +2 SP to write the catch-up cycle-2 retro from artifacts already on disk).
- Land `architecture_compliance_baseline.md` as cycle-4 Step-6 prerequisite (third try).
  - Impact: same rationale as cycle-1 retro Improvement Action #3: cumulative reviews still cannot emit `## Baseline Delta` sections; structural regressions remain invisible across cycles.
  - Effort: ~1 SP (small file; seed from `structure_2026-05-20.md` with 0 violations baseline). The right insertion point is cycle 4's decompose phase; if decompose runs without it, fail-fast and create.
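
The missing-retro check in the second action could be as small as the sketch below. It rests on one loudly stated assumption: exactly one `retro_*.md` file is filed per completed cycle, so a file count can stand in for a per-cycle lookup (the dated filenames carry no cycle number). The function names are hypothetical and are not taken from the autodev skill.

```python
from pathlib import Path


def missing_retro_count(metrics_dir, completed_cycle):
    """Return how many retro files are missing for cycles 1..completed_cycle.

    Assumption: one retro_*.md per completed cycle, so counting files is a
    usable proxy for "does cycle N have a retro".
    """
    found = len(list(Path(metrics_dir).glob("retro_*.md")))
    return max(0, completed_cycle - found)


def assert_retro_filed(metrics_dir, completed_cycle):
    """Raise instead of silently advancing to Step 9 of the next cycle."""
    missing = missing_retro_count(metrics_dir, completed_cycle)
    if missing:
        raise RuntimeError(
            f"{missing} retro file(s) missing under {metrics_dir}; "
            f"file them before starting cycle {completed_cycle + 1}"
        )
```

Under that assumption, a metrics directory holding only `retro_2026-05-20.md` would make `assert_retro_filed(..., 2)` raise, which is the cycle-2 skip this retro describes.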
Suggested Rule / Skill Updates
| File | Change | Rationale |
|---|---|---|
| `.cursor/skills/implement/SKILL.md` (batch self-review or test sub-step) | Add a check: if the batch removes `@pytest.mark.xfail` decorators from any test, the same batch MUST include a green test execution against the actual hardware tier the test targets (or explicit tier-2-only skip documentation if hardware is unavailable in the batch session). Block PASS verdict without this evidence. | AZ-848 root cause: AZ-776 removed `@xfail` from AC-1/2/5/6 in cycle 2 with "AC-7 stating tests run on Jetson after this task → All five pass". The Jetson run was never performed. Predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" rule, but the implement skill's own self-review should also enforce it. |
| `.cursor/skills/autodev/state.md` or `flows/existing-code.md` (Re-Entry section) | When auto-chaining from Step 17 (Retrospective) to Step 9 (New Task) with `cycle: state.cycle + 1`, FIRST verify that `_docs/06_metrics/retro_<YYYY-MM-DD>.md` exists for the previous cycle. If absent, BLOCK and surface the gap. | Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3. Cycle-1 retro's Top-3 actions sat invisible as a result. |
| `.cursor/skills/release/SKILL.md` Phase 2 strategy table | Add an explicit `bench-test` row: bench-rig verification on real hardware via test compose (`docker-compose.test.jetson.yml` style); not a production deploy; collapses Phases 3+4 into one harness run; Phase 5 explicitly N/A; allowed for first-release / refactor-only cycles. | Cycle-3 release used this strategy ad-hoc; the skill's existing table forced a "manual" classification that doesn't quite fit. |
| `.cursor/skills/release/SKILL.md` Phase 1 rollback-readiness | When `.previous-tags.env` does NOT exist AND no `release/*` git tag exists, treat this as "first release" and accept `docker compose down` as the rollback path. Do NOT block on absent rollback target. | First-time release was a Phase 1 blocking gate per the current strict reading; cycle 3's bench-test release had to navigate it inline. |
| `.cursor/skills/test-spec/SKILL.md` (cycle-update mode) | When the cycle-update task list includes a ticket that touches a Protocol / dataclass / contract field semantics (e.g., `VioOutput.emitted_at_ns`), the test-spec sync MUST flag downstream consumers explicitly (e.g., C5 ESKF + C13 FDR both read `emitted_at_ns`). | AZ-848 affected C1 contract semantics; downstream C5 and C13 each read the field. The test-spec sync didn't flag this in cycle 2 when AZ-776 changed adjacent code. |
Process Leftovers (open at snapshot)
- `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`: OPEN; gtsam numpy<2 ABI replay condition unchanged. Last check: 2026-05-26 in this session.
- `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`: OPEN (NEW this cycle); `EVIDENCE_OUT` default path is container-only; Tier-1 host runs need explicit override; workaround documented; 1 SP fix queued for cycle 4.
End of cycle-3 retrospective.