Files
ui/_docs/06_metrics/retro_2026-05-13_cycle4.md
Oleksandr Bezdieniezhnykh eb1e8a8581 [AZ-512] Cycle 4 closure: deploy + retro + lessons + state
Closes cycle 4 (AZ-512 admin class inline edit).

Steps 16-17 artifacts:
- deploy_cycle4_report.md: ui/ dev pushed (09449bd..8737491, 4 commits,
  fast-forward); stage/main and admin/ dev deferred at the push-scope
  gate (option A; same as cycle 3). AZ-513 admin/ implementation +
  deploy gate stays open as the cross-workspace prerequisite.
- retro_2026-05-13_cycle4.md: PASS_WITH_WARNINGS verdict carries;
  243 PASS / 13 SKIP / 0 FAIL; bundle 291 332 B (+757 B / +0.26%);
  net architecture delta 0; user-action backlog 7 -> 9 (rate
  decelerating from +4 to +2); first cycle where the user explicitly
  overrode a spec-conservative default (AZ-512 Option B).
- structure_2026-05-13_cycle4.md: identity-copy snapshot; no new
  components, no new gates, no new barrels, no new wire-contract
  assertions, no new architecture findings.
- LESSONS.md: top-3 cycle-4 lessons appended (testing/testing/process),
  ring buffer at 12 of 15.
- _autodev_state.md: cycle 4 closed, cycle 5 entered awaiting New Task.

Jira AZ-512: In Testing -> Done with cycle-4 closing comment.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 04:56:42 +03:00

193 lines
20 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Retrospective — 2026-05-13 (Phase B Cycle 4)
**Mode**: cycle-end (autodev existing-code Step 17)
**Scope**: Phase B, cycle 4 (`state.cycle = 4`)
**Epic / theme**: AZ-509 (UI workspace cycle 3 epic, continued) — single carry-over task AZ-512 reactivated under user-authorized Option B after cycle 3's Cross-Workspace Verification BLOCKING gate
**Cycle duration**: 1 batch (batch 16) over 1 working day (2026-05-13)
**Previous retro**: `_docs/06_metrics/retro_2026-05-13_cycle3.md` (cycle 3, same calendar day)
## Implementation Summary
| Metric | Value | Δ vs cycle 3 |
|--------|-------|--------------|
| Tasks attempted | 1 (AZ-512) | 2 |
| Tasks delivered | 1 (AZ-512) | 1 (cycle 3 shipped 2, deferred 1) |
| Tasks deferred at spec gate | 0 (the only deferral was the cycle-3 carry, already reactivated this cycle) | 1 |
| Total batches | 1 (batch 16) | 2 |
| Total complexity points planned | 3 | 6 |
| Total complexity points delivered | 3 | 3 |
| Avg tasks per batch | 1 | 0 |
| Avg complexity per (completed) batch | 3 | 0 |
| Source files mutated | 5 production + test + 1 component-doc + 5 cross-cutting docs | n/a (different shape from cycle 3) |
| Cycle shape | Single-task reactivation cycle — user explicitly overrode the cycle-3 conservative-path default | new pattern |
Sources: `batch_16_cycle4_report.md`, `implementation_report_admin_class_edit_cycle4.md`, `deploy_cycle4_report.md`, `security_report_cycle4_delta.md`, `perf_2026-05-13_cycle4.md`, `structure_2026-05-13_cycle4.md`.
## Quality Metrics
### Code Review Results
| Verdict | Count | Percentage | Δ vs cycle 3 |
|---------|-------|-----------|--------------|
| PASS (inline self-review per batch report) | 1 (batch 16) | 100 % | +1 (cycle 3 had 2 PASS) |
| PASS_WITH_WARNINGS | 0 | 0 % | 0 |
| FAIL | 0 | 0 % | 0 |
| (no review — deferred at gate) | 0 | 0 % | 1 |
Note: batch 16 used inline self-review (3-point single-task batch). A formal `/code-review` skill run is scheduled for batch 18 (cumulative-review cadence is every K=3 batches).
### Findings by Severity (code review only)
| Severity | Count | Δ vs cycle 3 |
|----------|-------|--------------|
| Critical | 0 | 0 |
| High | 0 | 0 |
| Medium | 0 | 0 |
| Low | 0 | 0 |
### Findings by Category (code review)
All zero (cycle 3 was also all-zero). No new pattern.
### Security-Audit Findings (Step 14 — cycle 4 delta against cycle 3)
| Status change | Count | Notable IDs |
|---------------|-------|-------------|
| Closed | 0 | — |
| Newly introduced (LOW) | 1 | F-SAST-CY4-1 — lost-update / mid-air-collision on `PATCH /api/admin/classes/{id}` (by design per AZ-512 spec; promotes to a UI ticket only when AZ-513 lands and the backend's concurrency model is known) |
| Carried forward unchanged | 12 | F-SAST-1 (HIGH, git-history), F-SAST-CY3-1 (LOW, test-only barrel export), F-SAST-2/3/4, F-INF-1..5 |
**Security verdict trajectory**: cycle 3 PASS_WITH_WARNINGS → cycle 4 **PASS_WITH_WARNINGS** (unchanged). `bun audit` re-run clean. No OWASP category status flipped.
## Structural Metrics
Source: `_docs/06_metrics/structure_2026-05-13_cycle4.md` (this cycle), compared against `structure_2026-05-13.md` (cycle 3 close).
| Metric | Cycle 2 close | Cycle 3 close | Cycle 4 close | Δ vs cycle 3 |
|--------|---------------|---------------|---------------|--------------|
| Component count | 12 | 12 | 12 | 0 |
| Public-API barrels | 11 / 11 | 11 / 11 | 11 / 11 | 0 |
| STC-ARCH-01 carve-out exemptions | 1 | 0 | 0 | 0 (held at zero) |
| Commit-time static gates | 33 / 33 PASS | 33 / 33 PASS | 33 / 33 PASS | 0 |
| Architecture cycles | 0 | 0 | 0 | 0 |
| Architecture findings open (baseline F1F9) | 7 of 9 | 6 of 9 | 6 of 9 | 0 |
| Newly introduced architecture violations | 0 | 0 | 0 | 0 |
| Net architecture delta this cycle | 0 | 1 | **0** | reverted to net-zero |
| Wire-contract assertions (`endpoints.test.ts`) | 36 | 37 | 37 | 0 (AZ-512 reused `endpoints.admin.class(id)`) |
| Fast-profile suite | 229 PASS / 13 SKIP / 0 FAIL | 231 PASS / 13 SKIP / 0 FAIL | **243 PASS / 13 SKIP / 0 FAIL** | **+12 PASS**, 0 SKIP |
| Bundle (gzipped initial JS) | 290 465 B | 290 575 B | **291 332 B** | +757 B (+0.26 %; ~13.89 % budget) |
### Auto-lesson triggers (per skill Step 1)
- Net Architecture delta > 0? **No** — delta is 0.
- Structural metric regression > 20 %? **No** — every structural metric held; test count +5.2% (improvement); bundle +0.26% (well within budget).
- Contract coverage % decreased? **No** — wire-contract assertions held at 37.
- New finding category emerged? **No** — security audit ran in delta mode; categories stable.
**Zero auto-lesson triggers fired.** Manual lessons (3 picked) appear in §LESSONS Append below.
## Efficiency
| Metric | Value | Δ vs cycle 3 |
|--------|-------|--------------|
| Blocked tasks (cycle-internal) | 0 | 0 |
| Tasks deferred to backlog at spec gate | 0 | 1 (the cycle-3 deferral was the one reactivated here) |
| Cross-workspace prerequisite tickets filed | 0 | 1 (AZ-513 already filed in cycle 3) |
| Pre-existing bugs surfaced as side observations | 1 (MSW `/api/admin/users` paginated vs `AdminPage.tsx` flat-array expectation) | 0 |
| Tasks pending external user action (cycle-4 close) | **9** | +2 vs cycle 3's 7 |
| Tasks requiring fixes after review | 0 | 0 |
| Batch with most findings | none — 0 findings cycle-wide | n/a |
| Auto-fix loops invoked | 0 | 0 |
| Stuck-agent incidents | 0 | 0 |
| Unplanned implementation-time test stabilization loops | 1 — selector-target fix in `destructive_ux.test.tsx` after the ✎ button was inserted before `×` | 3 (cycle 3 had 4 for AZ-510's module-scoped state ripple) |
### Blocker Analysis
| Blocker Type | Count | Prevention |
|--------------|-------|-----------|
| Pre-existing bug surfaced during test writing | 1 | New cycle-4 lesson: when a new test mounts the full container component, run it once *without* defensive fixture overrides and let the natural crashes surface latent fixture-vs-source drift, then either fix or document — never silently work around. See Improvement Action #3. |
| Selector regression in adjacent test from new affordance | 1 | New cycle-4 lesson: adding a new control to a DOM row that already holds existing controls requires auditing the test corpus for selectors like `querySelector('button')` or `getByRole('button')` without disambiguation. See Improvement Action #2. |
| Cycle-3 deferred deploy items (carry) | 3 (D-CY3-STAGE/MAIN/ADMIN-PUSH) | Still not actioned. Cycle 4 added 3 more deploy-deferred items (D-CY4-STAGE/MAIN/ADMIN-PUSH). Compounding. |
| Cross-workspace deploy gate (carry from cycles 2 and 3) | 4 (L-AZ-498-DEPLOY, L-AZ-499-OWM-REVOKE, L-AZ-501-GOOGLE-REVOKE, L-AZ-512-ADMIN-PREREQ — last one re-opened cycle 4) | Same as cycle-3 retro Action #3 — drain mechanism still not implemented. |
### User-action backlog at cycle close
| Category | Count | Items |
|----------|-------|-------|
| Manual third-party console action | 2 | L-AZ-499-OWM-REVOKE, L-AZ-501-GOOGLE-REVOKE (carry from cycle 2) |
| Cross-workspace deploy gate (satellite-provider) | 1 | L-AZ-498-DEPLOY (carry from cycle 2) |
| Cross-workspace prerequisite ticket awaiting sibling-team work | 1 | AZ-513 implementation on `admin/` (re-opened cycle 4 under user-authorized Option B) |
| Cycle deploy-push pending | 5 | D-CY3-STAGE, D-CY3-MAIN, D-CY3-ADMIN-PUSH (carry); D-CY4-STAGE, D-CY4-MAIN, D-CY4-AZ513-IMPL — note the cycle-4 AZ-513 deploy gate is the same item as the cross-workspace prereq above when counted only once (de-dup) |
| **Total (de-duplicated)** | **9** | (cycle 1 close: 0 → cycle 2 close: 3 → cycle 3 close: 7 → cycle 4 close: **9**) |
> Trajectory continues: 0 → 3 → 7 → **9**. Net growth +2 this cycle (cycle 4 added D-CY4-STAGE + D-CY4-MAIN; AZ-513 re-opened as `cross-workspace prerequisite`; D-CY4-ADMIN-PUSH was carried-not-added because the user kept the same dev-only push scope as cycle 3). Cycle-3 retro Improvement Action #3 (track backlog as first-class metric) is now being applied — but the drain mechanism (step-0 sweep that closes items, not just notices them) is still pending. **Backlog growth is decelerating** (+3, +4, **+2**); even so, the gap between accumulated and drained remains the dominant signal.
### User-decision points (cycle 4 only)
- Cycle-4 entry: user **explicitly overrode** the cycle-3 conservative-path default for AZ-512 ("implement 512, 513 would be implemented in minutes. You can write mocks for backend data anyway for testing."). The spec was updated to record this as user-authorized Option B; the leftover entry was re-opened with the Option-B rationale. This is the first cross-cycle override of a spec-conservative default in cycles 1-4.
- Step 13 → Steps 14+15 gate: user chose **D** (run both Security Audit AND Performance Test) — first time across cycles 1-4 that BOTH optional gates ran inline. Cycle 3 also ran both via auto-chain but Step 15 emitted no separate report; cycle 4 produced standalone `perf_2026-05-13_cycle4.md` for the first time.
- Cycle-4 Step 16 deploy gate: user chose **A** (push to ui/ dev only) — same option as cycle 3. Stage / main / admin/ dev push deferred.
## Trend Comparison
| Trend | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | Direction |
|-------|---------|---------|---------|---------|-----------|
| Code review pass rate (formally-reviewed batches) | 100 % | 50 % | 100 % | 100 % (self-review) | held |
| Test count (cumulative this cycle delta) | +46 | +20 | +2 | **+12** | rebounded from cycle-3 low |
| Static gate count | +2 | +2 | 0 | **0** | held |
| Architecture findings open (baseline) | 7 (2) | 7 (0) | 6 (1) | **6 (0)** | held flat |
| STC-ARCH-01 exemptions | 1 | 1 | 0 | **0** | held at zero |
| Wire-contract assertions | 36 | 36 | 37 (+1) | **37 (0)** | held |
| Pending USER actions at cycle close | 0 | 3 | 7 | **9** | ⬆ still growing (rate decelerating) |
| Tasks deferred to backlog at spec gate | 0 | 0 | 1 | **0** | reverted (the cycle-3 deferral was the one reactivated) |
| Cycles where user overrode a spec-conservative default | 0 | 0 | 0 | **1** (AZ-512 Option B) | new pattern |
| Bundle (gzipped initial JS, B) | — | 290 465 | 290 575 (+110) | **291 332 (+757)** | growing in line with feature delta; far within budget |
Cycle 4 is the first single-task reactivation cycle (vs cycle 3's three-task fresh cycle). The cycle-3 retro called out that the AZ-512 gate worked as designed; cycle 4 confirms the *other half* of the design: a user-authorized override path can flow through the entire 9→17 step sequence without regressions, while preserving the deploy gate. Both halves of the gate design are now field-validated.
## Top 3 Improvement Actions
1. **Codify a "pre-existing-bug surface lifting" routine — observe-then-document, never silently work-around.**
While writing `tests/admin_class_edit.test.tsx`, I discovered the `/api/admin/users` MSW handler's paginated response vs `AdminPage.tsx`'s flat-array expectation by hitting a `users.map is not a function` render crash. The route taken was: document in batch_16_cycle4_report.md "Pre-existing bug noted", apply a local workaround (`stubUsersAsPlainArray()` in `beforeEach`), and recommend filing a separate UI-workspace ticket. This was the right tactical move, but the **systematic routine** is missing — there's no checklist anywhere that says "when a new test mounts a container component, run it once with default fixtures only, name any crashes, and decide explicitly fix-now vs document-and-defer." Without that routine, future cycles will keep accumulating quiet local workarounds and the side-observed bug list grows without a tracking artifact.
- Impact: medium — the failure mode (silent test-fixture overrides masking real source bugs) is the test-side analog of "client-side validation only" — looks green, but tested against a fake. Two distinct cycles (3 and 4) already surfaced one bug each through this route.
- Effort: low — add a section "Pre-existing-bug surfacing during test writing" to `_docs/02_tasks/_templates/module_scoped_state_introduction.md` (created in cycle 3) and to the implementation skill's batch-report template; require the batch report to either list "Pre-existing bug noted" entries or affirm "None observed; ran with default fixtures only".
2. **Audit test selectors that pick "the button" / "the link" / "the input" without disambiguation, before adding a new affordance to an existing DOM region.**
The cycle-4 ✎ edit button was inserted into the same `<td>` that holds the `×` delete button. `tests/destructive_ux.test.tsx` had three call sites using `firstRow.querySelector('button')` — each silently rebound to the new ✎ button instead of the old `×` button, and the tests would have shipped green-but-meaningless if not caught by the test run. The fix was 1-line per site (`Array.from(...).find(b => b.textContent === '×')`). The deeper lesson is that the failure mode is **invisible at code-review time** — a code reviewer reading the source diff has no view into which test selectors will resolve to which DOM element, only the test run reveals it. The cheap structural prevention: before adding a new control to a DOM region, grep the test corpus for `querySelector('button|input|a')` / `getByRole('button')` without name/text disambiguation, in the same file / sibling files, and add disambiguating selectors *in the affordance batch*.
- Impact: medium — saves one stabilization loop per affordance addition; the cost of NOT catching it is silent test-meaning-drift in destructive-UX assertions, which is exactly the kind of bug Finding B4 (cycle 1) was filed for.
- Effort: low — add a 3-bullet checklist to the implement skill's "Adjacent hygiene" rules: (a) before inserting a new button/input into an existing row/region, grep for non-disambiguated selectors targeting that region; (b) update them in the same commit; (c) if you can't make them disambiguated without changing the source DOM, prefer giving the new control a stable `data-testid` over rewriting test selectors.
3. **Add a "user-action backlog drain rate" to the retrospective metric set.**
Cycle 3 retro added "user-action backlog at cycle close" as an absolute count. Cycle 4 now has two consecutive data points (7 → 9). The signal in absolute count is being applied — but the signal that matters for process-shape is the **drain rate**: how many items got closed *this* cycle vs how many got added? Cycle 4: 1 item state-transitioned (L-AZ-512-ADMIN-PREREQ moved from "deferred awaiting AZ-513" to "re-opened under Option B"; technically still open), +2 net new (D-CY4-STAGE, D-CY4-MAIN). So drain = 0, add = 2, net = +2. Tracking drain explicitly will make the drain-mechanism conversation concrete — today the retro just says "backlog is +2, drain mechanism still pending" with no metric to optimize.
- Impact: medium — operationalizes cycle-3 Action #3. Makes the drain-mechanism design (which is presumably a step-0 sweep that closes items, not just notices them) measurable from the first cycle it runs.
- Effort: low — extend `.cursor/skills/retrospective/SKILL.md` Step 1 metric collection with a "User-action backlog drain rate" subsection (count of items added this cycle vs items closed this cycle vs net change vs absolute close-count), and add to the retrospective-report template.
## Suggested Rule / Skill Updates
| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/implement/SKILL.md` (Adjacent Hygiene section) | Add the 3-bullet "test selector audit" checklist for inserting a new control into an existing DOM region. | §Top 3 Improvement Action #2. |
| `.cursor/skills/implement/templates/batch_report.md` (if it exists) or `_docs/02_tasks/_templates/module_scoped_state_introduction.md` | Add "Pre-existing-bug surfacing during test writing" subsection requirement: batch report must explicitly list observed pre-existing bugs OR affirm "None observed; ran with default fixtures only". | §Top 3 Improvement Action #1. |
| `.cursor/skills/retrospective/SKILL.md` (Step 1 metrics) | Add **"User-action backlog drain rate"** metric: items added this cycle / items closed this cycle / net delta / absolute close-count; track alongside the absolute count introduced in cycle 3. | §Top 3 Improvement Action #3. |
| `.cursor/skills/retrospective/templates/retrospective-report.md` | Add a "User-action backlog drain rate" sub-table alongside the absolute table under Efficiency. | §Top 3 Improvement Action #3. |
| `_docs/LESSONS.md` (top) | Append the 3 lessons in §LESSONS Append below; trim to ≤ 15 entries. | Skill Step 4. |
## Notes — Step 16 outcome
Step 16 (Deploy) ran in **real-cutover mode (option A)** for the second consecutive cycle. Push scope was ui/ `dev` only (4 commits, fast-forward `09449bd..8737491`). The cycle-3 closure commit `eef3bdf` (which had been locally ahead since cycle 3's push) shipped this cycle alongside cycle-4's three commits. Stage / main / admin/ `dev` pushes were deferred at the push-scope sub-gate (user chose option A — ui/ dev only).
- Devices will not auto-pull cycle-3 + cycle-4 changes until `dev → stage → main` completes (D-CY4-STAGE, D-CY4-MAIN).
- AZ-513 task spec still sits locally on `admin/` `dev` — admin/ team cannot pick it up until D-CY3-ADMIN-PUSH lands (now carried into cycle 5).
- No Dockerfile / `.woodpecker/` / nginx / env changes in cycle 4 — verified inline by the security audit (Step 14 enumerated the changed-file set as 6 source/test + 5 doc files only).
- The deployed ui/ dev build will surface `admin.classes.updateFailed` on real edits until AZ-513 ships in admin/ — by design under the user-authorized Option B path.
These items add to the user-action backlog; see §Efficiency → User-action backlog table.
## LESSONS Append (top 3, single-sentence, tagged)
1. **[testing]** When inserting a new control (button, input, link) into an existing DOM row or region that already holds other controls, audit the test corpus *before* the commit for non-disambiguated selectors targeting that region (`querySelector('button')`, `getByRole('button')` without `name`/`text`, indexed `querySelectorAll('button')[0]`) and either update them with disambiguating text/role/name in the same affordance commit or give the new control a stable `data-testid` — otherwise the new control silently rebinds existing assertions to the wrong element and the tests ship green-but-meaningless, exactly as cycle 4's `destructive_ux.test.tsx` did when the AZ-512 ✎ button became the new first button in the class-row action cell.
2. **[testing]** When a new test mounts a container component end-to-end, run it once with the project's default test fixtures only (no per-test override) and explicitly name any natural crashes ("`users.map is not a function`") in the batch report as "Pre-existing bug noted" — never silently apply a local fixture workaround without recording the latent drift, because each silent workaround hides a source-vs-fixture mismatch that future authors will re-encounter as a "mysterious test setup", and cycle 4's `tests/admin_class_edit.test.tsx` was the second cycle to surface one through this route.
3. **[process]** When the user explicitly overrides a spec-conservative cycle-defer decision (the AZ-512 Option B authorization: "implement now, write mocks for backend"), the autodev MUST preserve every downstream gate that the conservative path would have enforced — re-record the override rationale in the leftover entry, keep the cross-workspace deploy gate visible at Step 16, mark the carried tickets distinctly from cycle-internal carries, and surface the override as a first-class retrospective trend ("Cycles where user overrode a spec-conservative default") — so the operating cost of the override stays measurable and the user's downstream visibility is unchanged from the conservative path.