Closes cycle 4 (AZ-512 admin class inline edit). Steps 16-17 artifacts: - deploy_cycle4_report.md: ui/ dev pushed (09449bd..8737491, 4 commits, fast-forward); stage/main and admin/ dev deferred at the push-scope gate (option A; same as cycle 3). AZ-513 admin/ implementation + deploy gate stays open as the cross-workspace prerequisite. - retro_2026-05-13_cycle4.md: PASS_WITH_WARNINGS verdict carries; 243 PASS / 13 SKIP / 0 FAIL; bundle 291 332 B (+757 B / +0.26%); net architecture delta 0; user-action backlog 7 -> 9 (rate decelerating from +4 to +2); first cycle where the user explicitly overrode a spec-conservative default (AZ-512 Option B). - structure_2026-05-13_cycle4.md: identity-copy snapshot; no new components, no new gates, no new barrels, no new wire-contract assertions, no new architecture findings. - LESSONS.md: top-3 cycle-4 lessons appended (testing/testing/process), ring buffer at 12 of 15. - _autodev_state.md: cycle 4 closed, cycle 5 entered awaiting New Task. Jira AZ-512: In Testing -> Done with cycle-4 closing comment. Co-authored-by: Cursor <cursoragent@cursor.com>
20 KiB
Retrospective — 2026-05-13 (Phase B Cycle 4)
Mode: cycle-end (autodev existing-code Step 17)
Scope: Phase B, cycle 4 (state.cycle = 4)
Epic / theme: AZ-509 (UI workspace cycle 3 epic, continued) — single carry-over task AZ-512 reactivated under user-authorized Option B after cycle 3's Cross-Workspace Verification BLOCKING gate
Cycle duration: 1 batch (batch 16) over 1 working day (2026-05-13)
Previous retro: _docs/06_metrics/retro_2026-05-13_cycle3.md (cycle 3, same calendar day)
Implementation Summary
| Metric | Value | Δ vs cycle 3 |
|---|---|---|
| Tasks attempted | 1 (AZ-512) | −2 |
| Tasks delivered | 1 (AZ-512) | −1 (cycle 3 shipped 2, deferred 1) |
| Tasks deferred at spec gate | 0 (the only deferral was the cycle-3 carry, already reactivated this cycle) | −1 |
| Total batches | 1 (batch 16) | −2 |
| Total complexity points planned | 3 | −6 |
| Total complexity points delivered | 3 | −3 |
| Avg tasks per batch | 1 | 0 |
| Avg complexity per (completed) batch | 3 | 0 |
| Source files mutated | 5 production + test + 1 component-doc + 5 cross-cutting docs | n/a (different shape from cycle 3) |
| Cycle shape | Single-task reactivation cycle — user explicitly overrode the cycle-3 conservative-path default | new pattern |
Sources: batch_16_cycle4_report.md, implementation_report_admin_class_edit_cycle4.md, deploy_cycle4_report.md, security_report_cycle4_delta.md, perf_2026-05-13_cycle4.md, structure_2026-05-13_cycle4.md.
Quality Metrics
Code Review Results
| Verdict | Count | Percentage | Δ vs cycle 3 |
|---|---|---|---|
| PASS (inline self-review per batch report) | 1 (batch 16) | 100 % | +1 (cycle 3 had 2 PASS) |
| PASS_WITH_WARNINGS | 0 | 0 % | 0 |
| FAIL | 0 | 0 % | 0 |
| (no review — deferred at gate) | 0 | 0 % | −1 |
Note: batch 16 used inline self-review (3-point single-task batch). A formal /code-review skill run is scheduled for batch 18 (cumulative-review cadence is every K=3 batches).
Findings by Severity (code review only)
| Severity | Count | Δ vs cycle 3 |
|---|---|---|
| Critical | 0 | 0 |
| High | 0 | 0 |
| Medium | 0 | 0 |
| Low | 0 | 0 |
Findings by Category (code review)
All zero (cycle 3 was also all-zero). No new pattern.
Security-Audit Findings (Step 14 — cycle 4 delta against cycle 3)
| Status change | Count | Notable IDs |
|---|---|---|
| Closed | 0 | — |
| Newly introduced (LOW) | 1 | F-SAST-CY4-1 — lost-update / mid-air-collision on PATCH /api/admin/classes/{id} (by design per AZ-512 spec; promotes to a UI ticket only when AZ-513 lands and the backend's concurrency model is known) |
| Carried forward unchanged | 12 | F-SAST-1 (HIGH, git-history), F-SAST-CY3-1 (LOW, test-only barrel export), F-SAST-2/3/4, F-INF-1..5 |
Security verdict trajectory: cycle 3 PASS_WITH_WARNINGS → cycle 4 PASS_WITH_WARNINGS (unchanged). bun audit re-run clean. No OWASP category status flipped.
Structural Metrics
Source: _docs/06_metrics/structure_2026-05-13_cycle4.md (this cycle), compared against structure_2026-05-13.md (cycle 3 close).
| Metric | Cycle 2 close | Cycle 3 close | Cycle 4 close | Δ vs cycle 3 |
|---|---|---|---|---|
| Component count | 12 | 12 | 12 | 0 |
| Public-API barrels | 11 / 11 | 11 / 11 | 11 / 11 | 0 |
| STC-ARCH-01 carve-out exemptions | 1 | 0 | 0 | 0 (held at zero) |
| Commit-time static gates | 33 / 33 PASS | 33 / 33 PASS | 33 / 33 PASS | 0 |
| Architecture cycles | 0 | 0 | 0 | 0 |
| Architecture findings open (baseline F1–F9) | 7 of 9 | 6 of 9 | 6 of 9 | 0 |
| Newly introduced architecture violations | 0 | 0 | 0 | 0 |
| Net architecture delta this cycle | 0 | −1 | 0 | reverted to net-zero |
Wire-contract assertions (endpoints.test.ts) |
36 | 37 | 37 | 0 (AZ-512 reused endpoints.admin.class(id)) |
| Fast-profile suite | 229 PASS / 13 SKIP / 0 FAIL | 231 PASS / 13 SKIP / 0 FAIL | 243 PASS / 13 SKIP / 0 FAIL | +12 PASS, 0 SKIP |
| Bundle (gzipped initial JS) | 290 465 B | 290 575 B | 291 332 B | +757 B (+0.26 %; ~13.89 % budget) |
Auto-lesson triggers (per skill Step 1)
- Net Architecture delta > 0? No — delta is 0.
- Structural metric regression > 20 %? No — every structural metric held; test count +5.2% (improvement); bundle +0.26% (well within budget).
- Contract coverage % decreased? No — wire-contract assertions held at 37.
- New finding category emerged? No — security audit ran in delta mode; categories stable.
Zero auto-lesson triggers fired. Manual lessons (3 picked) appear in §LESSONS Append below.
Efficiency
| Metric | Value | Δ vs cycle 3 |
|---|---|---|
| Blocked tasks (cycle-internal) | 0 | 0 |
| Tasks deferred to backlog at spec gate | 0 | −1 (the cycle-3 deferral was the one reactivated here) |
| Cross-workspace prerequisite tickets filed | 0 | −1 (AZ-513 already filed in cycle 3) |
| Pre-existing bugs surfaced as side observations | 1 (MSW /api/admin/users paginated vs AdminPage.tsx flat-array expectation) |
0 |
| Tasks pending external user action (cycle-4 close) | 9 | +2 vs cycle 3's 7 |
| Tasks requiring fixes after review | 0 | 0 |
| Batch with most findings | none — 0 findings cycle-wide | n/a |
| Auto-fix loops invoked | 0 | 0 |
| Stuck-agent incidents | 0 | 0 |
| Unplanned implementation-time test stabilization loops | 1 — selector-target fix in destructive_ux.test.tsx after the ✎ button was inserted before × |
−3 (cycle 3 had 4 for AZ-510's module-scoped state ripple) |
Blocker Analysis
| Blocker Type | Count | Prevention |
|---|---|---|
| Pre-existing bug surfaced during test writing | 1 | New cycle-4 lesson: when a new test mounts the full container component, run it once without defensive fixture overrides and let the natural crashes surface latent fixture-vs-source drift, then either fix or document — never silently work around. See Improvement Action #3. |
| Selector regression in adjacent test from new affordance | 1 | New cycle-4 lesson: adding a new control to a DOM row that already holds existing controls requires auditing the test corpus for selectors like querySelector('button') or getByRole('button') without disambiguation. See Improvement Action #2. |
| Cycle-3 deferred deploy items (carry) | 3 (D-CY3-STAGE/MAIN/ADMIN-PUSH) | Still not actioned. Cycle 4 added 3 more deploy-deferred items (D-CY4-STAGE/MAIN/ADMIN-PUSH). Compounding. |
| Cross-workspace deploy gate (carry from cycles 2 and 3) | 4 (L-AZ-498-DEPLOY, L-AZ-499-OWM-REVOKE, L-AZ-501-GOOGLE-REVOKE, L-AZ-512-ADMIN-PREREQ — last one re-opened cycle 4) | Same as cycle-3 retro Action #3 — drain mechanism still not implemented. |
User-action backlog at cycle close
| Category | Count | Items |
|---|---|---|
| Manual third-party console action | 2 | L-AZ-499-OWM-REVOKE, L-AZ-501-GOOGLE-REVOKE (carry from cycle 2) |
| Cross-workspace deploy gate (satellite-provider) | 1 | L-AZ-498-DEPLOY (carry from cycle 2) |
| Cross-workspace prerequisite ticket awaiting sibling-team work | 1 | AZ-513 implementation on admin/ (re-opened cycle 4 under user-authorized Option B) |
| Cycle deploy-push pending | 5 | D-CY3-STAGE, D-CY3-MAIN, D-CY3-ADMIN-PUSH (carry); D-CY4-STAGE, D-CY4-MAIN, D-CY4-AZ513-IMPL — note the cycle-4 AZ-513 deploy gate is the same item as the cross-workspace prereq above when counted only once (de-dup) |
| Total (de-duplicated) | 9 | (cycle 1 close: 0 → cycle 2 close: 3 → cycle 3 close: 7 → cycle 4 close: 9) |
Trajectory continues: 0 → 3 → 7 → 9. Net growth +2 this cycle (cycle 4 added D-CY4-STAGE + D-CY4-MAIN; AZ-513 re-opened as
cross-workspace prerequisite; D-CY4-ADMIN-PUSH was carried-not-added because the user kept the same dev-only push scope as cycle 3). Cycle-3 retro Improvement Action #3 (track backlog as first-class metric) is now being applied — but the drain mechanism (step-0 sweep that closes items, not just notices them) is still pending. Backlog growth is decelerating (+3, +4, +2); even so, the gap between accumulated and drained remains the dominant signal.
User-decision points (cycle 4 only)
- Cycle-4 entry: user explicitly overrode the cycle-3 conservative-path default for AZ-512 ("implement 512, 513 would be implemented in minutes. You can write mocks for backend data anyway for testing."). The spec was updated to record this as user-authorized Option B; the leftover entry was re-opened with the Option-B rationale. This is the first cross-cycle override of a spec-conservative default in cycles 1-4.
- Step 13 → Steps 14+15 gate: user chose D (run both Security Audit AND Performance Test) — first time across cycles 1-4 that BOTH optional gates ran inline. Cycle 3 also ran both via auto-chain but Step 15 emitted no separate report; cycle 4 produced standalone
perf_2026-05-13_cycle4.mdfor the first time. - Cycle-4 Step 16 deploy gate: user chose A (push to ui/ dev only) — same option as cycle 3. Stage / main / admin/ dev push deferred.
Trend Comparison
| Trend | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | Direction |
|---|---|---|---|---|---|
| Code review pass rate (formally-reviewed batches) | 100 % | 50 % | 100 % | 100 % (self-review) | held |
| Test count (cumulative this cycle delta) | +46 | +20 | +2 | +12 | rebounded from cycle-3 low |
| Static gate count | +2 | +2 | 0 | 0 | held |
| Architecture findings open (baseline) | 7 (−2) | 7 (0) | 6 (−1) | 6 (0) | held flat |
| STC-ARCH-01 exemptions | 1 | 1 | 0 | 0 | held at zero |
| Wire-contract assertions | 36 | 36 | 37 (+1) | 37 (0) | held |
| Pending USER actions at cycle close | 0 | 3 | 7 | 9 | ⬆ still growing (rate decelerating) |
| Tasks deferred to backlog at spec gate | 0 | 0 | 1 | 0 | reverted (the cycle-3 deferral was the one reactivated) |
| Cycles where user overrode a spec-conservative default | 0 | 0 | 0 | 1 (AZ-512 Option B) | new pattern |
| Bundle (gzipped initial JS, B) | — | 290 465 | 290 575 (+110) | 291 332 (+757) | growing in line with feature delta; far within budget |
Cycle 4 is the first single-task reactivation cycle (vs cycle 3's three-task fresh cycle). The cycle-3 retro called out that the AZ-512 gate worked as designed; cycle 4 confirms the other half of the design: a user-authorized override path can flow through the entire 9→17 step sequence without regressions, while preserving the deploy gate. Both halves of the gate design are now field-validated.
Top 3 Improvement Actions
-
Codify a "pre-existing-bug surface lifting" routine — observe-then-document, never silently work-around. While writing
tests/admin_class_edit.test.tsx, I discovered the/api/admin/usersMSW handler's paginated response vsAdminPage.tsx's flat-array expectation by hitting ausers.map is not a functionrender crash. The route taken was: document in batch_16_cycle4_report.md "Pre-existing bug noted", apply a local workaround (stubUsersAsPlainArray()inbeforeEach), and recommend filing a separate UI-workspace ticket. This was the right tactical move, but the systematic routine is missing — there's no checklist anywhere that says "when a new test mounts a container component, run it once with default fixtures only, name any crashes, and decide explicitly fix-now vs document-and-defer." Without that routine, future cycles will keep accumulating quiet local workarounds and the side-observed bug list grows without a tracking artifact.- Impact: medium — the failure mode (silent test-fixture overrides masking real source bugs) is the test-side analog of "client-side validation only" — looks green, but tested against a fake. Two distinct cycles (3 and 4) already surfaced one bug each through this route.
- Effort: low — add a section "Pre-existing-bug surfacing during test writing" to
_docs/02_tasks/_templates/module_scoped_state_introduction.md(created in cycle 3) and to the implementation skill's batch-report template; require the batch report to either list "Pre-existing bug noted" entries or affirm "None observed; ran with default fixtures only".
-
Audit test selectors that pick "the button" / "the link" / "the input" without disambiguation, before adding a new affordance to an existing DOM region. The cycle-4 ✎ edit button was inserted into the same
<td>that holds the×delete button.tests/destructive_ux.test.tsxhad three call sites usingfirstRow.querySelector('button')— each silently rebound to the new ✎ button instead of the old×button, and the tests would have shipped green-but-meaningless if not caught by the test run. The fix was 1-line per site (Array.from(...).find(b => b.textContent === '×')). The deeper lesson is that the failure mode is invisible at code-review time — a code reviewer reading the source diff has no view into which test selectors will resolve to which DOM element, only the test run reveals it. The cheap structural prevention: before adding a new control to a DOM region, grep the test corpus forquerySelector('button|input|a')/getByRole('button')without name/text disambiguation, in the same file / sibling files, and add disambiguating selectors in the affordance batch.- Impact: medium — saves one stabilization loop per affordance addition; the cost of NOT catching it is silent test-meaning-drift in destructive-UX assertions, which is exactly the kind of bug Finding B4 (cycle 1) was filed for.
- Effort: low — add a 3-bullet checklist to the implement skill's "Adjacent hygiene" rules: (a) before inserting a new button/input into an existing row/region, grep for non-disambiguated selectors targeting that region; (b) update them in the same commit; (c) if you can't make them disambiguated without changing the source DOM, prefer giving the new control a stable
data-testidover rewriting test selectors.
-
Add a "user-action backlog drain rate" to the retrospective metric set. Cycle 3 retro added "user-action backlog at cycle close" as an absolute count. Cycle 4 now has two consecutive data points (7 → 9). The signal in absolute count is being applied — but the signal that matters for process-shape is the drain rate: how many items got closed this cycle vs how many got added? Cycle 4: 1 item state-transitioned (L-AZ-512-ADMIN-PREREQ moved from "deferred awaiting AZ-513" to "re-opened under Option B"; technically still open), +2 net new (D-CY4-STAGE, D-CY4-MAIN). So drain = 0, add = 2, net = +2. Tracking drain explicitly will make the drain-mechanism conversation concrete — today the retro just says "backlog is +2, drain mechanism still pending" with no metric to optimize.
- Impact: medium — operationalizes cycle-3 Action #3. Makes the drain-mechanism design (which is presumably a step-0 sweep that closes items, not just notices them) measurable from the first cycle it runs.
- Effort: low — extend
.cursor/skills/retrospective/SKILL.mdStep 1 metric collection with a "User-action backlog drain rate" subsection (count of items added this cycle vs items closed this cycle vs net change vs absolute close-count), and add to the retrospective-report template.
Suggested Rule / Skill Updates
| File | Change | Rationale |
|---|---|---|
.cursor/skills/implement/SKILL.md (Adjacent Hygiene section) |
Add the 3-bullet "test selector audit" checklist for inserting a new control into an existing DOM region. | §Top 3 Improvement Action #2. |
.cursor/skills/implement/templates/batch_report.md (if it exists) or _docs/02_tasks/_templates/module_scoped_state_introduction.md |
Add "Pre-existing-bug surfacing during test writing" subsection requirement: batch report must explicitly list observed pre-existing bugs OR affirm "None observed; ran with default fixtures only". | §Top 3 Improvement Action #1. |
.cursor/skills/retrospective/SKILL.md (Step 1 metrics) |
Add "User-action backlog drain rate" metric: items added this cycle / items closed this cycle / net delta / absolute close-count; track alongside the absolute count introduced in cycle 3. | §Top 3 Improvement Action #3. |
.cursor/skills/retrospective/templates/retrospective-report.md |
Add a "User-action backlog drain rate" sub-table alongside the absolute table under Efficiency. | §Top 3 Improvement Action #3. |
_docs/LESSONS.md (top) |
Append the 3 lessons in §LESSONS Append below; trim to ≤ 15 entries. | Skill Step 4. |
Notes — Step 16 outcome
Step 16 (Deploy) ran in real-cutover mode (option A) for the second consecutive cycle. Push scope was ui/ dev only (4 commits, fast-forward 09449bd..8737491). The cycle-3 closure commit eef3bdf (which had been locally ahead since cycle 3's push) shipped this cycle alongside cycle-4's three commits. Stage / main / admin/ dev pushes were deferred at the push-scope sub-gate (user chose option A — ui/ dev only).
- Devices will not auto-pull cycle-3 + cycle-4 changes until
dev → stage → maincompletes (D-CY4-STAGE, D-CY4-MAIN). - AZ-513 task spec still sits locally on
admin/dev— admin/ team cannot pick it up until D-CY3-ADMIN-PUSH lands (now carried into cycle 5). - No Dockerfile /
.woodpecker// nginx / env changes in cycle 4 — verified inline by the security audit (Step 14 enumerated the changed-file set as 6 source/test + 5 doc files only). - The deployed ui/ dev build will surface
admin.classes.updateFailedon real edits until AZ-513 ships in admin/ — by design under the user-authorized Option B path.
These items add to the user-action backlog; see §Efficiency → User-action backlog table.
LESSONS Append (top 3, single-sentence, tagged)
-
[testing] When inserting a new control (button, input, link) into an existing DOM row or region that already holds other controls, audit the test corpus before the commit for non-disambiguated selectors targeting that region (
querySelector('button'),getByRole('button')withoutname/text, indexedquerySelectorAll('button')[0]) and either update them with disambiguating text/role/name in the same affordance commit or give the new control a stabledata-testid— otherwise the new control silently rebinds existing assertions to the wrong element and the tests ship green-but-meaningless, exactly as cycle 4'sdestructive_ux.test.tsxdid when the AZ-512 ✎ button became the new first button in the class-row action cell. -
[testing] When a new test mounts a container component end-to-end, run it once with the project's default test fixtures only (no per-test override) and explicitly name any natural crashes ("
users.map is not a function") in the batch report as "Pre-existing bug noted" — never silently apply a local fixture workaround without recording the latent drift, because each silent workaround hides a source-vs-fixture mismatch that future authors will re-encounter as a "mysterious test setup", and cycle 4'stests/admin_class_edit.test.tsxwas the second cycle to surface one through this route. -
[process] When the user explicitly overrides a spec-conservative cycle-defer decision (the AZ-512 Option B authorization: "implement now, write mocks for backend"), the autodev MUST preserve every downstream gate that the conservative path would have enforced — re-record the override rationale in the leftover entry, keep the cross-workspace deploy gate visible at Step 16, mark the carried tickets distinctly from cycle-internal carries, and surface the override as a first-class retrospective trend ("Cycles where user overrode a spec-conservative default") — so the operating cost of the override stays measurable and the user's downstream visibility is unchanged from the conservative path.