gps-denied-onboard/_docs/06_metrics/retro_2026-05-26.md
Oleksandr Bezdieniezhnykh 940066bee2 chore: WIP pre-implement
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 17:09:13 +03:00

# Retrospective — 2026-05-26 (Cycle 3)
> Cycle-3 retrospective for GPS-Denied Onboard. Cycle 3 spans
> 2026-05-21 → 2026-05-26 (post-cycle-2 → Step 17 Retrospective).
> Generated by `/autodev` existing-code Step 17 (Retrospective,
> cycle-end mode). Prior retro: `retro_2026-05-20.md` (cycle 1).
> **Process gap**: no cycle-2 retro was filed — cycle 2 transitioned
> straight from Step 11 into cycle-3 work; the autodev session boundary
> between cycles 2 and 3 ran without invoking Step 17. This retro
> partially covers cycle-2 trend deltas where the data is still
> available on disk, and explicitly flags the missing retro as an
> Improvement Action below.
## Implementation Summary
### Cycle 3 scope (2026-05-21 → 2026-05-26)
| Metric | Value |
|--------|-------|
| Tickets closed in cycle 3 (`_docs/02_tasks/done/AZ-83{6..9}*`, `AZ-84{0,5,6,7}*`) | 7 (AZ-836, AZ-838, AZ-839, AZ-840, AZ-845, AZ-846, AZ-847) |
| Tickets touched but split off (deferred to cycle 4) | 2 (AZ-848 — 5 SP, AZ-883 — 2 SP; both surfaced during this cycle's release flow) |
| Tickets in `todo/` at cycle-3 close (open work) | 1 (AZ-848 — the deferred one; AZ-883 mirror also written) |
| Cycle 3 batches (`batch_*_cycle3_report.md`) | 6 (104, 106, 107, 108, 108b, 109) — batch 105 is reserved/missing; 108b is a same-day follow-up to 108 |
| Cycle 3 src delta | 1 commit (`fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`); +43 / −36 LoC (net +7) across 4 files in `_types/`, `c11_tile_manager/`, `replay_input/` |
| Cycle duration | ~6 days (2026-05-21 first cycle-3 batch → 2026-05-26 retro) |
| Avg tasks per batch | 7 tickets ÷ 6 batches ≈ 1.2 tasks/batch |
| Estimated total complexity points | ~22 SP delivered (3 + 3 + 5 + 3 + 2 + 2 + 4 estimated across AZ-836/838/839/840/845/846/847); plus AZ-844 closeout work (3 SP); deferred 7 SP (AZ-848 5 + AZ-883 2) |
| Carry-over from cycle 1's Top 3 Improvement Actions | 0/3 fully delivered: 1 partial or unclear, 2 not done (see "Trend Comparison" below) |
### Cumulative (cycle 1 + 2 + 3)
| Metric | Value (this retro) | Cycle-1 retro |
|--------|---------------------|----------------|
| Total tickets closed (lifetime) | ~175 (cycle 1: 165 + cycle 2: ~3-5 + cycle 3: 7) | 165 |
| Total batches (lifetime) | 109 (cycle 1: 97; cycle 2: 5; cycle 3: 6 + 1 inter-cycle batch 109 numbering) | 97 |
| Source LoC, `src/` Python | 61,071 (unchanged vs cycle-1; cycle-3 delta is a refactor, not a feature; cycle-2 src delta also small per Step 11 report) | 61,071 |
| Components | 15 (unchanged) | 15 |
| Binary tracks | 3 (airborne, research, operator-orchestrator) | 3 |
## Quality Metrics
### Code Review Verdicts (cycle-3 batches)
| Batch | Ticket | Verdict | Notes |
|-------|--------|---------|-------|
| 104 | AZ-777 Phase 1 | PASS_WITH_WARNINGS | 3 findings (1 Medium); AZ-777 Phase 1 closed |
| 106 | AZ-836 (TlogRouteExtractor) | **PASS** | Single-task batch; 10 ACs all PASS |
| 107 | AZ-838 (SatelliteProviderRouteClient + seed_route CLI) | PASS_WITH_WARNINGS | C2 — Epic AZ-835 |
| 108 | AZ-839 (operator_pre_flight_setup real fixture) | PASS_WITH_WARNINGS | C3 — Epic AZ-835 |
| 108b | AZ-839 follow-up (fix C3 fixture path mismatch) | **PASS** | Single-finding fix; no new findings |
| 109 | AZ-840 (e2e orchestrator test) | PASS_WITH_WARNINGS | C4 — Epic AZ-835; 17 unit tests; 3 SP per spec |
Verdict distribution (cycle-3 only):
| Verdict | Count | % of cycle-3 batches |
|---------|------:|----------------------:|
| PASS | 2 | 33.3 % |
| PASS_WITH_WARNINGS | 4 | 66.7 % |
| FAIL | 0 | 0 % |
| BLOCKED | 0 | 0 % |
Auto-fix loop did not escalate to user intervention across cycle 3.
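The percentages above follow directly from the per-batch verdicts. A minimal Python sketch of the tally, with the verdicts transcribed from the cycle-3 table; the `verdict_mix` helper is illustrative and not part of the repo:

```python
from collections import Counter

# Verdicts transcribed from the cycle-3 code-review table above.
CYCLE3_VERDICTS = {
    "104": "PASS_WITH_WARNINGS",
    "106": "PASS",
    "107": "PASS_WITH_WARNINGS",
    "108": "PASS_WITH_WARNINGS",
    "108b": "PASS",
    "109": "PASS_WITH_WARNINGS",
}


def verdict_mix(verdicts: dict[str, str]) -> dict[str, float]:
    """Return each verdict's share of batches as a percentage (0.1 precision)."""
    counts = Counter(verdicts.values())
    total = len(verdicts)
    return {verdict: round(100 * n / total, 1) for verdict, n in counts.items()}


if __name__ == "__main__":
    for verdict, pct in sorted(verdict_mix(CYCLE3_VERDICTS).items()):
        print(f"{verdict}: {pct} %")
```

Verdicts that never occur (FAIL, BLOCKED) are simply absent from the result, which corresponds to the 0 % rows in the table.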
### Cycle 3 — Findings (qualitative; no aggregated severity table in batch reports)
The 6 cycle-3 batches did NOT use a `| Critical | High | Medium | Low |` table convention (grep found zero matches). Findings appear in inline `## Code review` sections only. Per-severity breakdown:
| Severity | Cycle 3 count | Trend vs cycle 1 |
|----------|---------------:|-------------------|
| Critical | 0 | maintained — 0 in cycle 1 too |
| High | 0 | maintained — 0 in cycle 1 too |
| Medium | 1 (batch 104, AZ-777 Phase 1) | dropped — cycle 1 carried 2 (CR-F1, CR-F2) — see Trend Comparison |
| Low | ~3 (informal counts across PASS_WITH_WARNINGS batches; not enumerated in tables) | ~5 → ~3 (trend down) |
### Quality Gates Late in the Cycle (Steps 11–16.5)
The interesting findings of cycle 3 did NOT come from in-batch code review — they came from the autodev quality-gate steps:
| Step | Surface | Outcome |
|------|---------|---------|
| 11 Run Tests (Jetson e2e) | AZ-848 — `eskf_filter_divergence` at frame 3 in `test_derkachi_1min.py` | 4 deterministic failures; root cause re-diagnosed 2026-05-26 as `VioOutput.emitted_at_ns` clock-source mismatch (NOT IMU-vs-IMU as initially hypothesised). Split AZ-883 for a secondary latent bug (`_handle_imu` SCALED_IMU2 ts_ns=0). |
| 14 Security Audit | Resumed prior 2026-05-19 audit; verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 5 Medium, 17 Low — same as cycle 1) | No new vulnerabilities introduced by cycle-3 refactor; existing OpenCV CVE pin replay condition unchanged. |
| 15 Performance Test | NFRs 4/4 **Unverified** on Tier-1 (same as cycle 1 + 2); pure-logic evaluator unit tests 70/70 PASS | Surfaced `EVIDENCE_OUT` default-path bug (`/e2e-results` is container-only; breaks Tier-1 host runs) → leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` filed; perf report `perf_2026-05-26_cycle3-tier1-probe.md` written. |
| 16 Deploy | Resumed from cycle-1 greenfield artifacts; no cycle-3 deltas required | Deploy artifacts all present (compose files, scripts/, env templates); operator workstation deploy is the production target for `operator-orchestrator`. |
| 16.5 Release | First-ever release; ran bench-test on `jetson-e2e` lab Jetson | Verdict: **Released**. Failure profile byte-identical to Step 11 (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed`); no NEW cycle-3-scope regressions. AZ-848 / AZ-883 explicitly carried forward to cycle 4. |
## Structural Metrics
`_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.
Delta vs `structure_2026-05-20.md`:
| Metric | Cycle 1 close | Cycle 3 close | Delta |
|--------|----------------|----------------|-------|
| Component count | 15 | 15 | 0 |
| Source LoC, `src/` Python | 61,071 | 61,071 (+7 net from `fd52cc9` — RouteSpec relocation is net-neutral) | ~0 |
| Cycles in component import graph | 0 | 0 (verified — cycle-3 commit only relocates a type, no new edges) | 0 (healthy) |
| Cross-component edges, count | Concentrated in `runtime_root/` factories | Same | 0 |
| Contract files | 5 | 5 (no new contracts in cycle 3 — refactor cycle) | 0 |
| `architecture_compliance_baseline.md` present | No | **No (carried over gap)** | +0 — *still missing* |
| New Architecture violations this cycle | n/a (no baseline) | 0 (none flagged in cumulative reviews) | n/a |
| Public-API symbol contract coverage % | not computed | not computed | n/a |
A fresh structural snapshot for this retro is **not produced** — the structure is unchanged from cycle 1 (verified via the 7 LoC delta and 0 new components). `structure_2026-05-20.md` remains the current authoritative snapshot. The next cycle that materially changes structure (e.g., AZ-848 contract repair adds a new field to `VioOutput`; cycle-4 C1 work) should re-snapshot.
## Efficiency
| Metric | Cycle 3 value | Cycle 1 value |
|--------|---------------:|---------------:|
| Blocked tasks at cycle close (Tier-2 hardware or otherwise) | 1 in todo/ (AZ-848 deferred) + 1 mirror (AZ-883) — both filed in this retro session, NOT blockers for cycle close | 4 (all Tier-2 hardware rooted) |
| Tasks requiring fixes after review | 1 (batch 108b is a same-day fix follow-up to 108 for a fixture path mismatch — minor) | ~5 |
| Auto-fix loop escalations to user | 0 | 0 |
| Mid-cycle remediation post-mortems | 0 | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 0 | 1 (Step 11 → Step 7 for AZ-618) |
| Mid-cycle ticket splits (NEW: surfaced + split during quality-gate step) | 1 (AZ-848 → split AZ-883 during release-flow investigation) | 0 |
| Process leftovers opened this cycle | 1 (`2026-05-26_evidence_out_default_path.md`) | 1 (D-CROSS-CVE-1 — still open) |
| Process leftovers closed this cycle | 0 | 0 |
### Blocker Analysis
| Blocker Type | Count (cycle 3) | Prevention (carries to cycle 4) |
|--------------|------------------|------------------------------------|
| Jetson tlog-replay path broken at frame 3 (AZ-848) | 1 | Cycle 4 first product task; primary AC: `VioOutput.emitted_at_ns` contract repaired so `add_vio` and `add_fc_imu` share the FC-boot timebase. |
| `_handle_imu` SCALED_IMU2 latent bug (AZ-883) | 1 | Cycle 4; independent of AZ-848; 2 SP. |
| `EVIDENCE_OUT` default path container-only | 1 | Leftover at `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`; cycle-4 quick win (15 min). |
| OpenCV CVE pin replay condition (D-CROSS-CVE-1) | 1 (carried from cycle 1) | Out-of-band; re-check at every `/autodev` invocation; unchanged across cycles 1-3. |
| Tier-2 hardware/evidence (AZ-595 fixtures, AZ-592/AZ-593 VIO native bindings) | 0 (cycle 3 did not need them; cycle 1 had 4 of these) | Re-emerge in cycle 4 if AZ-595 SITL fixture is sequenced. |
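The AZ-848 row requires `add_vio` and `add_fc_imu` to share the FC-boot timebase. As a hedged illustration of why a clock-source mismatch is easy to detect once it is looked for, the sketch below compares two nanosecond timestamps and fails fast when they are implausibly far apart. `check_shared_timebase`, the 5 s tolerance, and the sample numbers are assumptions for illustration only; this is not the project's `VioOutput` contract or the planned fix.

```python
NS_PER_S = 1_000_000_000


class TimebaseMismatchError(ValueError):
    """Raised when two timestamps cannot plausibly come from the same clock."""


def check_shared_timebase(
    vio_emitted_at_ns: int,
    imu_ts_ns: int,
    max_skew_s: float = 5.0,  # assumed tolerance, not a project value
) -> int:
    """Return the absolute skew in ns, or raise if it exceeds max_skew_s."""
    skew_ns = abs(vio_emitted_at_ns - imu_ts_ns)
    if skew_ns > max_skew_s * NS_PER_S:
        raise TimebaseMismatchError(
            f"VIO/IMU timestamps differ by {skew_ns / NS_PER_S:.1f} s; "
            "they are probably stamped from different clocks"
        )
    return skew_ns


if __name__ == "__main__":
    # Illustrative numbers only: a host that has been up for 3 days
    # versus a flight controller that booted 12 s ago.
    host_uptime_ns = 3 * 24 * 3600 * NS_PER_S
    fc_boot_ns = 12 * NS_PER_S
    try:
        check_shared_timebase(host_uptime_ns, fc_boot_ns)
    except TimebaseMismatchError as exc:
        print(f"mismatch detected: {exc}")
```

A guard of this shape would only surface the symptom; the contract repair itself still has to decide which clock stamps `emitted_at_ns`.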
## Trend Comparison
Previous retro: `retro_2026-05-20.md` (cycle 1 close).
### Cycle-1 Top 3 Improvement Actions — fulfillment status
| # | Action | Status at cycle-3 close | Evidence |
|---|--------|-------------------------|----------|
| 1 | Land CR-F1 + CR-F2 hygiene PBIs before any new NFT helper expansion in cycle 2 | **Partial / unclear** — no batch report for CR-F1 / CR-F2 specifically in cycle 2 batches (98-102); but cycle-3 batches do not surface duplicated `csv_evidence_writer` / `fixture_path` helpers, suggesting silent absorption or the work is yet to land | Cycle-2 batches 98-102, cycle-3 batches 104-109 — no new Medium-severity helper-duplication findings |
| 2 | Sequence AZ-595 as first product task of cycle 2 | **Not done** — AZ-595 still listed as backlog item in cycle-1 retro language; no cycle-2 batch references AZ-595; the 17 NFT scenarios likely still skip on `sitl_replay_ready` | Glob `_docs/02_tasks/done/AZ-595*` — file absent from `done/` |
| 3 | Create `architecture_compliance_baseline.md` as Step 6 prerequisite | **Not done** — file still missing at cycle-3 close (verified via glob) | `_docs/02_document/architecture_compliance_baseline.md` does not exist |
**Net assessment**: cycle-1 retro's Top 3 actions were largely not delivered. The cycle-2-retro skip is the proximate cause — without a cycle-2 retro to surface non-delivery, the actions sat invisible.
### Metric Comparison
| Metric | Cycle 1 baseline | Cycle 3 close | Target (cycle 4) |
|--------|-------------------|----------------|-------------------|
| Code-review verdict mix | ~44 % PASS / ~55 % PASS_WITH_WARNINGS / 0 % FAIL | 33 % PASS / 67 % PASS_WITH_WARNINGS / 0 % FAIL | Maintain 0 % FAIL; lift PASS to ≥50 % via AZ-848 fix landing cleanly (a single-finding-batch tends to be PASS) |
| Avg findings per batch (Medium + Low) | ~0.2 | ~0.7 (one Medium in batch 104 + ~3 Lows across 4 PASS_WITH_WARNINGS = ~4 ÷ 6) | ≤ 0.5 |
| Mid-cycle remediation post-mortems | 1 | 0 | 0 |
| Mid-cycle ticket splits | 0 | 1 (AZ-848 → AZ-883) — *good* (correct discipline; not bad churn) | maintain (split discipline) |
| Structural baseline file present | No | **No (gap carried 2 cycles)** | Yes — drop it into cycle 4 Step 6 |
| Cycle-N retro filed at cycle-N close | Yes | **No for cycle 2; yes for cycle 3** | Yes — fix the autodev orchestrator gap |
## Top 3 Improvement Actions (cycle 4)
1. **Land the AZ-848 fix as cycle-4 first product task; bench-verify on Jetson before merging.**
- Impact: unblocks the Jetson e2e tlog-replay path that's been broken since cycle 2 (the AZ-776 xfail removal). Required for any real airborne release. Carries an explicit verification protocol: long-uptime Jetson + freshly-booted FC reproduces deterministically.
- Effort: 5 SP (per the revised spec). The fix touches the C1 `VioOutput.emitted_at_ns` contract and every C1 strategy that fills the field; well-scoped.
- Pair with: AZ-883 (2 SP, `_handle_imu` SCALED_IMU2 ts_ns=0) — independent fix but same investigation surface.
2. **File a cycle-2 retro retroactively + add an autodev sanity check that flags missing retros.**
- Impact: cycle-1 retro's Top-3 actions all sat invisible because no cycle-2 retro re-surfaced them. The autodev orchestrator should refuse to enter Step 9 of cycle N+1 if `retro_*.md` for cycle N is absent. This catches future retro skips at the next session boundary, not 6 weeks later.
- Effort: small (1 SP for the autodev state check; +2 SP to write the catch-up cycle-2 retro from artifacts already on disk).
3. **Land `architecture_compliance_baseline.md` as cycle-4 Step-6 prerequisite (third try).**
- Impact: same rationale as cycle-1 retro Improvement Action #3 — cumulative reviews still cannot emit `## Baseline Delta` sections; structural regressions remain invisible across cycles.
- Effort: ~1 SP (small file; seed from `structure_2026-05-20.md` with 0 violations baseline). The right insertion point is cycle 4's decompose phase; if decompose runs without it, fail-fast and create.
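The state check proposed in Action 2 could look roughly like the sketch below. It assumes the orchestrator knows the previous cycle's start and end dates (the retro does not say how autodev state records them) and treats a retro as belonging to a cycle when the date in its file name falls inside that range. The function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path

RETRO_NAME = re.compile(r"^retro_(\d{4})-(\d{2})-(\d{2})\.md$")


def retro_dates(metrics_dir: Path) -> list[date]:
    """Collect the dates encoded in retro_<YYYY-MM-DD>.md file names."""
    found = []
    for path in metrics_dir.glob("retro_*.md"):
        match = RETRO_NAME.match(path.name)
        if not match:
            continue
        try:
            found.append(date(*map(int, match.groups())))
        except ValueError:
            continue  # name looks like a date but is not a valid one
    return sorted(found)


def has_retro_for_cycle(metrics_dir: Path, cycle_start: date, cycle_end: date) -> bool:
    """True if at least one retro is dated inside the cycle's date range."""
    return any(cycle_start <= d <= cycle_end for d in retro_dates(metrics_dir))


def assert_previous_retro_filed(
    metrics_dir: Path, cycle_start: date, cycle_end: date
) -> None:
    """Block the transition to the next cycle when the retro is missing."""
    if not has_retro_for_cycle(metrics_dir, cycle_start, cycle_end):
        raise RuntimeError(
            f"No retro_*.md dated {cycle_start}..{cycle_end} in {metrics_dir}; "
            "file the retrospective before starting the next cycle"
        )
```

Keying on the file-name date keeps the check independent of the retro's contents; if cycles can close on the same day, a cycle number in the file name would be a more robust key.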
## Suggested Rule / Skill Updates
| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/implement/SKILL.md` (batch self-review or test sub-step) | Add a check: **if the batch removes `@pytest.mark.xfail` decorators from any test**, the same batch MUST include a green test execution against the actual hardware tier the test targets (or explicit `tier-2-only` skip documentation if hardware is unavailable in the batch session). Block PASS verdict without this evidence. | AZ-848 root cause: AZ-776 removed `@xfail` from AC-1/2/5/6 in cycle 2 with "AC-7 stating tests run on Jetson after this task → All five pass". The Jetson run was never performed. Predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" — but the implement skill's own self-review should also enforce. |
| `.cursor/skills/autodev/state.md` or `flows/existing-code.md` (Re-Entry section) | When auto-chaining from Step 17 (Retrospective) to Step 9 (New Task) with `cycle: state.cycle + 1`, FIRST verify that `_docs/06_metrics/retro_<YYYY-MM-DD>.md` exists for the previous cycle. If absent, BLOCK and surface the gap. | Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3. Cycle-1 retro's Top-3 actions sat invisible as a result. |
| `.cursor/skills/release/SKILL.md` Phase 2 strategy table | Add an explicit row: `bench-test` — bench-rig verification on real hardware via test compose (`docker-compose.test.jetson.yml` style); not a production deploy; collapses Phases 3+4 into one harness run; Phase 5 explicitly N/A; allowed for first-release / refactor-only cycles. | Cycle-3 release used this strategy ad-hoc; the skill's existing table forced a "manual" classification that doesn't quite fit. |
| `.cursor/skills/release/SKILL.md` Phase 1 rollback-readiness | When `.previous-tags.env` does NOT exist AND no `release/*` git tag exists, treat this as "first release" and accept `docker compose down` as the rollback path. Do NOT block on absent rollback target. | First-time release was a Phase 1 blocking gate per the current strict reading; cycle 3's bench-test release had to navigate it inline. |
| `.cursor/skills/test-spec/SKILL.md` (cycle-update mode) | When the cycle-update task list includes a ticket that touches a Protocol / dataclass / contract field semantics (e.g., `VioOutput.emitted_at_ns`), the test-spec sync MUST flag downstream consumers explicitly (e.g., C5 ESKF + C13 FDR both read `emitted_at_ns`). | AZ-848 affected C1 contract semantics; downstream C5 and C13 each read the field. The test-spec sync didn't flag this in cycle 2 when AZ-776 changed adjacent code. |
## Process Leftovers (open at snapshot)
- `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — OPEN; gtsam numpy<2 ABI replay condition unchanged. Last check: 2026-05-26 in this session.
- `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` — OPEN (NEW this cycle); `EVIDENCE_OUT` default path is container-only; Tier-1 host runs need explicit override; workaround documented; 1 SP fix queued for cycle 4.
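For the `EVIDENCE_OUT` leftover, one possible shape of the fix is a resolver that prefers an explicit override, uses the container path only when it is actually writable, and otherwise falls back to a host-local directory. This is a sketch under assumptions: the `resolve_evidence_out` name and the relative `e2e-results` fallback are hypothetical, and the leftover file may propose something different.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CONTAINER_DEFAULT = Path("/e2e-results")  # container-only default named in the leftover


def resolve_evidence_out(
    env: Optional[Mapping[str, str]] = None,
    host_fallback: Path = Path("e2e-results"),  # assumed fallback, relative to cwd
) -> Path:
    """Pick an evidence directory that works both in-container and on a host."""
    env = os.environ if env is None else env
    override = env.get("EVIDENCE_OUT")
    if override:
        return Path(override)  # an explicit setting always wins
    if CONTAINER_DEFAULT.is_dir() and os.access(CONTAINER_DEFAULT, os.W_OK):
        return CONTAINER_DEFAULT  # running inside the test container
    return host_fallback  # Tier-1 host run without the container mount


if __name__ == "__main__":
    out_dir = resolve_evidence_out()
    out_dir.mkdir(parents=True, exist_ok=True)
    print(f"writing evidence to {out_dir.resolve()}")
```

Passing the environment as a parameter keeps the resolver testable without mutating `os.environ`.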
End of cycle-3 retrospective.