Oleksandr Bezdieniezhnykh 940066bee2 chore: WIP pre-implement
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 17:09:13 +03:00

LESSONS

Append-only ledger of lessons learned during the project. New entries go at the top. Each entry is one short bullet + a one-sentence "what changed".

Ring buffer: trim to the last 15 entries. Categories: estimation · architecture · testing · dependencies · tooling · process.


2026-05-26 — [testing] Removing @pytest.mark.xfail must be paired with a same-batch run on the actual hardware tier the test targets

Trigger: AZ-848 root cause re-diagnosis (2026-05-26). In cycle 2, commit 8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled removed @xfail decorators from AC-1/AC-2/AC-5/AC-6 in test_derkachi_1min.py with AC-7 in the spec stating "tests run on Jetson after this task → All five pass". The Jetson run was never executed before AZ-776 closed. The latent C1 contract bug (VioOutput.emitted_at_ns uses monotonic_ns instead of FC-boot-relative timestamps) was therefore not detected until cycle-3 Step 11 — three weeks later. AZ-848 is 5 SP and now blocks all real airborne work in cycle 4.

What changed: the batch self-review in .cursor/skills/implement/SKILL.md should add a check: if the batch removes any @pytest.mark.xfail decorator, the same batch MUST include a green test execution against the test's target tier (or explicit tier-2-only skip documentation if the hardware is unavailable in the batch session). Block the PASS verdict without this evidence. The AZ-776 miss predates the 2026-05 meta-rule.mdc "Real Results, Not Simulated Ones" rule, but the implement skill's own gate should enforce the same requirement.

Source: _docs/06_metrics/retro_2026-05-26.md

2026-05-26 — [process] Autodev must block Step-N+1 entry if the previous cycle's retro file is missing

Trigger: cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing retro_<cycle2-date>.md. As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close — including architecture_compliance_baseline.md (action #3) which is now in its third cycle of being un-delivered.

What changed: .cursor/skills/autodev/state.md Re-Entry After Completion (or flows/existing-code.md) should verify that _docs/06_metrics/retro_<YYYY-MM-DD>.md exists for the previous cycle (state.cycle) before incrementing the cycle counter and entering Step 9 of cycle N+1. If absent, BLOCK and surface the gap with an A/B/C choice: (A) author the missing retro now, (B) stub a backfilled retro and proceed, (C) abort and ask the user.
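The gate described above could be sketched roughly as follows. This assumes the previous cycle's close date is already known to the caller; the function names and the returned message format are hypothetical, and only the _docs/06_metrics/retro_<YYYY-MM-DD>.md path pattern comes from this ledger.

```python
from datetime import date
from pathlib import Path


def retro_path(repo_root: Path, cycle_close: date) -> Path:
    """Build the expected retro path, e.g. _docs/06_metrics/retro_2026-05-26.md."""
    return repo_root / "_docs" / "06_metrics" / f"retro_{cycle_close.isoformat()}.md"


def may_enter_next_cycle(repo_root: Path, previous_cycle_close: date) -> tuple[bool, str]:
    """Return (allowed, message); allowed is False when the retro file is missing."""
    path = retro_path(repo_root, previous_cycle_close)
    if path.is_file():
        return True, f"retro found: {path.name}"
    return False, (
        f"BLOCK: {path.name} is missing. Choose: "
        "(A) author the missing retro now, "
        "(B) stub a backfilled retro and proceed, "
        "(C) abort and ask the user."
    )
```

The check runs before the cycle counter is incremented, so a missing retro can never be skipped by auto-chaining.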

Source: _docs/06_metrics/retro_2026-05-26.md

2026-05-26 — [tooling] When investigating bug X reveals a separate latent bug Y, file Y as a new ticket immediately — do not fold Y's scope into X

Trigger: AZ-848 evidence-based investigation (2026-05-26) used a pymavlink probe against the Derkachi tlog to verify the original "IMU-vs-IMU clock mismatch" hypothesis. The probe REFUTED the original hypothesis (both RAW_IMU and SCALED_IMU2 share the FC-boot timebase) and SIMULTANEOUSLY surfaced a separate latent bug — c8_fc_adapter._handle_imu mis-reads SCALED_IMU2.time_boot_ms as time_usec, defaulting to 0 for ~half of all IMU samples. Both bugs are real and orthogonal in their fix paths. The decision was to split — AZ-883 (2 SP) gets its own ticket, AZ-848 (5 SP) keeps its tightly-scoped contract repair.
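To illustrate the field mix-up behind AZ-883: in the MAVLink common message set, RAW_IMU carries time_usec (microseconds) while the SCALED_IMU family carries time_boot_ms (milliseconds since boot). The sketch below reads the field that matches the message type and converts both to nanoseconds. It is not the c8_fc_adapter code; the function name is hypothetical, and treating RAW_IMU.time_usec as FC-boot-relative relies on the probe result above for this tlog, not on a general guarantee.

```python
from types import SimpleNamespace


def imu_time_boot_ns(msg) -> int:
    """Return an IMU message timestamp in nanoseconds, by message type.

    `msg` is anything exposing get_type() plus the timestamp field,
    as pymavlink message objects do.
    """
    msg_type = msg.get_type()
    if msg_type == "RAW_IMU":
        return int(msg.time_usec) * 1_000  # microseconds -> nanoseconds
    if msg_type in ("SCALED_IMU", "SCALED_IMU2", "SCALED_IMU3"):
        return int(msg.time_boot_ms) * 1_000_000  # milliseconds -> nanoseconds
    raise ValueError(f"not an IMU message: {msg_type}")


# Stand-in objects for illustration only; real messages come from pymavlink.
raw = SimpleNamespace(get_type=lambda: "RAW_IMU", time_usec=1_500_000)
scaled = SimpleNamespace(get_type=lambda: "SCALED_IMU2", time_boot_ms=1_500)
assert imu_time_boot_ns(raw) == imu_time_boot_ns(scaled) == 1_500_000_000
```

Raising on an unknown type, instead of defaulting to 0, keeps a missing timestamp field from silently producing zero-stamped samples.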

What changed: when a deep investigation surfaces a second latent issue that's orthogonal to the primary bug, file the second issue as its own ticket in the same session (with full evidence + reproduction protocol), then resume the primary investigation. Resist the temptation to fold the second issue into the primary ticket's scope "for convenience" — it inflates SP estimates and couples fix landings unnecessarily.

Source: _docs/06_metrics/retro_2026-05-26.md

2026-05-20 — [testing] Two-tier test policy retired — all tests run on Jetson only

Trigger: a /test-run invocation on the workstation Tier-1 Docker stack uncovered eight categorically distinct, sequential bugs in the supposedly-supported workstation path (Dockerfile COPY ordering before editable install, base-image pip too old for gtsam pre-release wheels, runtime stage missing the python3 metapackage that python3 -m venv symlinks against, missing libgl1 / libglib2.0-0 for cv2 import, missing runtime_root/__main__.py shim, lazy import that never registered the c6_tile_cache config block, and a BUILD_FAISS_INDEX env flag gap in docker-compose.test.jetson.yml). None of these had been hit before because no one had actually executed the workstation Docker stack end-to-end since it was authored; the colocated Jetson Woodpecker agent was the only test environment that ever ran. Maintaining the divergent x86 path was costing engineering time and producing only false-negative signal, never honest test coverage.

What changed: the two-tier execution profile is retired in favour of a Jetson-only policy. Source of truth: _docs/02_document/tests/environment.md (active-policy banner at top + superseding "Decision (2026-05-20)" in § Test Execution). CI policy updated in _docs/04_deploy/ci_cd_pipeline.md and _docs/02_document/deployment/ci_cd_pipeline.md. Local-development entry point: scripts/run-tests-jetson.sh against the configured jetson-e2e SSH alias. The general rule: if you have one environment that matches production and one that doesn't, don't maintain both — maintain the one that matches.

2026-05-20 — [process] Before classifying a per-task FAIL, probe cross-cutting state the task depends on (registries, factories, baselines)

Trigger: cycle-1 Step 7 Product Implementation Completeness Gate originally classified AZ-332 + AZ-333 as FAIL and proposed two per-strategy remediation tasks (AZ-589 + AZ-590). Post-mortem found the actual gap was the empty central _STRATEGY_REGISTRY — a cross-cutting concern that should have produced one task (AZ-591), not two. AZ-589 + AZ-590 closed Won't Fix.

What changed: completeness gates should now run a workspace grep for cross-cutting registry / factory state the task depends on before classifying a per-task FAIL. If the actual root cause is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks. Captured in _docs/06_metrics/retro_2026-05-20.md § Suggested Rule/Skill Updates.
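One way to approximate the workspace grep is a plain substring scan over the source tree. This is a sketch under assumptions: the function name is hypothetical, and a real gate may use ripgrep or the editor's search instead.

```python
from pathlib import Path


def find_references(root: Path, symbol: str, pattern: str = "*.py") -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each line containing symbol."""
    hits: list[tuple[str, int, str]] = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable or non-text files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

If the only hit for a symbol such as _STRATEGY_REGISTRY is its empty definition, with no registration sites anywhere, the root cause is probably cross-cutting and warrants a single task.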

Source: _docs/06_metrics/retro_2026-05-20.md

2026-05-20 — [testing] If N test specs share a single un-built fixture, schedule the fixture builder as a P0 prerequisite during decompose

Trigger: cycle-1 ended with 17 NFT scenarios sitl_replay_ready-skipping on the Tier-1 docker harness because AZ-595 (SITL observer + FDR replay fixture builder) was decomposed as a peer task and slipped to the end of the cycle. Cumulative review window 88-92 surfaced this as a 5 cp PBI that now blocks the cycle-2 Step 11 retry.

What changed: decompose/SKILL.md should identify the fixture-builder dependency surface explicitly during test-task decomposition. If N test tasks share one un-built fixture, the fixture builder is a P0 prerequisite and is scheduled ahead of the dependent tasks, not as a peer. Captured in _docs/06_metrics/retro_2026-05-20.md § Suggested Rule/Skill Updates.
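The scheduling rule can be sketched as a small dependency computation, assuming each test task declares the fixtures it needs and each fixture names its builder task. The data shapes, function names, and the min_dependents threshold are hypothetical; ordering uses the standard library's graphlib.

```python
from collections import defaultdict
from graphlib import TopologicalSorter


def shared_fixture_builders(
    needs: dict[str, set[str]], builders: dict[str, str], min_dependents: int = 2
) -> set[str]:
    """Return builder tasks whose fixture is needed by at least min_dependents test tasks."""
    dependents: dict[str, set[str]] = defaultdict(set)
    for task, fixtures in needs.items():
        for fixture in fixtures:
            dependents[fixture].add(task)
    return {
        builders[fixture]
        for fixture, tasks in dependents.items()
        if fixture in builders and len(tasks) >= min_dependents
    }


def schedule(needs: dict[str, set[str]], builders: dict[str, str]) -> list[str]:
    """Order tasks so every fixture builder precedes the test tasks that depend on it."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {
        task: {builders[f] for f in fixtures if f in builders}
        for task, fixtures in needs.items()
    }
    return list(TopologicalSorter(graph).static_order())
```

With 17 scenarios pointing at one fixture, shared_fixture_builders flags the builder as P0 and schedule places it ahead of all of them, which is the ordering the lesson asks decompose to produce.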

Source: _docs/06_metrics/retro_2026-05-20.md

2026-05-20 — [architecture] Land _docs/02_document/architecture_compliance_baseline.md as a Step 6 (Decompose) prerequisite so cumulative reviews can emit Baseline Delta sections

Trigger: every cumulative review across cycle 1 logged "_docs/02_document/architecture_compliance_baseline.md does NOT exist → no Baseline Delta section emitted". Structural regressions (new cycles in the import graph, newly-introduced architecture violations) therefore could not be quantified across cycle 1 — only verified pairwise per batch.

What changed: cycle 2 Step 6 (Decompose) should create the baseline file with 0 violations seeded from the structural snapshot at _docs/06_metrics/structure_2026-05-20.md. From cycle 2 onward, ## Baseline Delta rows quantify carried-over / resolved / newly-introduced violations per cycle. Captured in _docs/06_metrics/retro_2026-05-20.md § Top 3 Improvement Actions #3.
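The three Baseline Delta buckets reduce to set operations once each violation has a stable identifier. A minimal sketch, assuming violations are represented as strings; the identifier format and function name are hypothetical.

```python
def baseline_delta(baseline: set[str], current: set[str]) -> dict[str, list[str]]:
    """Classify violations against the baseline recorded in the previous cycle."""
    return {
        "carried_over": sorted(baseline & current),      # in both snapshots
        "resolved": sorted(baseline - current),          # gone since the baseline
        "newly_introduced": sorted(current - baseline),  # not in the baseline
    }
```

With a baseline seeded at 0 violations, the first cycle reports everything it finds as newly introduced, and later cycles can show carried-over and resolved rows.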

Source: _docs/06_metrics/retro_2026-05-20.md

2026-05-18 — When autodev rewinds N → 7 (or any earlier step) mid-session, treat the handoff as a session boundary

Trigger: In Step 11 (Run Tests) cycle 1, the Jetson e2e gate routed the flow back to Step 7 (Implement) for AZ-618 (cross-cutting 5pt task with 12 infrastructure deps). The user repeatedly chose to continue in the same conversation. I rewound state cleanly (task spec + autodev state) but, on attempting to enter the implement skill's batch loop in the SAME conversation, found that even just investigating the 12 builder signatures consumed enough context to reach the Caution zone — writing the implementation would have hit truncation mid-batch.

What changed: When the autodev rewinds the flow to an EARLIER step in the same conversation (Step 11 → Step 7, Step 11 → Step 9, etc.), treat the rewind itself as a session boundary, regardless of whether the flow file's Auto-Chain Rules table marks it as one. Save the bootstrap artifacts (task spec, state, dependencies-table refresh), commit them, then ask for a fresh conversation. The rewind already cost real tool calls; the destination step's batch loop deserves clean context. Document the rewind reason in sub_step.detail so re-entry is one-line clear.

2026-05-17 — Always call getTransitionsForJiraIssue before transitionJiraIssue

Trigger: In batch 87 (autodev step 10), I transitioned AZ-436..AZ-439 with transition.id="31", assuming from stale memory that 31 meant "In Progress". Read-back showed all four had moved to Done instead (id 31 in this workflow = Done; In Progress = 21, In Testing = 32, To Do = 11). The mistake was caught by the tracker rule's mandatory read-back gate, fixed by re-transitioning to 21, and confirmed via a second read-back.

What changed: Treat the transition ID as workflow-specific, not memorizable across sessions. Always query getTransitionsForJiraIssue first on the actual target issue (or one in the same project/workflow) and select the transition by name ("In Progress" / "In Testing" / "Done" / "To Do") — never by hard-coded numeric id. This is true even when you "remember" the IDs from a prior batch this same day, because the agent has no guarantee the workflow definition is stable.
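Selecting by name can be sketched as a lookup over the transitions returned for the target issue. This assumes the result of getTransitionsForJiraIssue can be reduced to a list of dicts with "id" and "name" keys, as in the Jira REST transitions payload; the helper name is hypothetical.

```python
def transition_id_by_name(transitions: list[dict], name: str) -> str:
    """Return the id of the single transition whose name matches, case-insensitively."""
    wanted = name.casefold()
    matches = [t["id"] for t in transitions if str(t.get("name", "")).casefold() == wanted]
    if len(matches) != 1:
        available = sorted(str(t.get("name", "")) for t in transitions)
        raise LookupError(f"expected exactly one transition named {name!r}; available: {available}")
    return matches[0]


# Example using the ids observed in this workflow; they must not be hard-coded.
observed = [
    {"id": "11", "name": "To Do"},
    {"id": "21", "name": "In Progress"},
    {"id": "31", "name": "Done"},
    {"id": "32", "name": "In Testing"},
]
assert transition_id_by_name(observed, "In Progress") == "21"
```

Raising when the name is missing or ambiguous keeps the caller from falling back to a remembered numeric id.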