Cycle-3 retrospective:
- 6 tasks (AZ-491..AZ-496), 5 batches, 18 SP delivered.
- 100% code review pass rate (5/5 PASS_WITH_WARNINGS, 0 FAIL).
- 0 Critical/High/Medium review findings; 7 distinct Low.
- Security audit PASS_WITH_WARNINGS: 0 new Medium, 3 Low (all
test-only or operator-CLI), 2 Informational, 1 False Positive.
- Net Architecture delta: **-3** (F-AUTH-2 + D1 + D3 RESOLVED;
only new findings are Low test-side surfaces). First
net-negative cycle on record.
- 5 of 6 tasks completed first attempt (no post-review fix
commits). All 3 cycle-2 retro actions translated to closed
work (AZ-491 from Action 1, AZ-492 from Action 2, AZ-493
from Action 3).
Top 3 cycle-4 improvement actions surfaced:
1. Execute the perf harness to capture PT-07/PT-08 baseline.
2. Bump TestSupport JWT pins 7.0.3 → 7.1.2+ (D4 NU1902 cleanup).
3. Add `workspace:` tag to cross-repo ACs in task-spec writing
and render them separately in the traceability matrix.
3 new ring-buffer lessons appended to _docs/LESSONS.md:
- [process] Option-B forcing functions for cross-team blockers.
- [process] ACs prescribing a measurement should also prescribe
the collection path.
- [process] Cross-repo-write ACs need workspace tags.
Structural snapshot at structure_2026-05-12_cycle3.md records the
new SatelliteProvider.TestSupport project (+2 ProjectReference edges
into it; no production-layer dependents) and the AZ-496 package
bumps (8.0.21 → 8.0.25).
Cycle 3 COMPLETE. State advanced to Step 9 (New Task) for cycle 4
per existing-code flow Re-Entry After Completion.
Co-authored-by: Cursor <cursoragent@cursor.com>
Retrospective — Cycle 3 (2026-05-12)
Tasks: AZ-491 (consolidate JWT test-mint helpers, 3 SP) + AZ-492 (perf harness PT-07 + PT-08 + JWT-attach, 5 SP) + AZ-493 (integration test DB reset hook, 3 SP) + AZ-494 (JWT iss/aud validation, 3 SP, Option B) + AZ-495 (doc folder convention, 2 SP) + AZ-496 (bump AspNetCore 8.0.25, 2 SP)
Mode: cycle-end (autodev Step 17)
Previous retro: retro_2026-05-11_cycle2.md
1. Implementation Metrics
| Metric | Cycle 3 | Δ vs cycle 2 |
|---|---|---|
| Tasks implemented | 6 (AZ-491, AZ-492, AZ-493, AZ-494, AZ-495, AZ-496) | +4 |
| Batches executed | 5 | +3 |
| Avg tasks / batch | 1.2 | +0.2 (cycle 2 = 1.0 tasks/batch, over only 2 batches) |
| Total complexity delivered | 18 SP | +8 (cycle 2 = 10 SP) |
| Avg complexity / batch | 3.6 SP | -1.4 |
| Tasks at-or-below 5 SP cap | 6 of 6 (100%) | +5 (cycle 2 = 1 of 2) |
| Tasks above cap | 0 | -1 (cycle 2 had AZ-488 at 8 SP) |
| Cumulative reviews | 2 (after batches 01-03, after batches 04-05) | +2 (cycle 2 had 0) |
Sequencing: batches ordered to respect Cycle-2 retro's Action 1 (Action 1 → AZ-491 first) so subsequent perf and DB-reset work could consume the consolidated factory. AZ-494 was sequenced LAST because it depended on AZ-491's TestSupport + AZ-492's perf-harness env-var pass-through.
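The per-batch averages and their deltas follow directly from the task, batch, and SP totals. A minimal sketch of that arithmetic, using only totals stated in this retro (cycle 2: 2 tasks, 2 batches, 10 SP; cycle 3: 6 tasks, 5 batches, 18 SP); the `CycleTotals` type and `delta` helper are hypothetical names introduced for illustration, not part of the repository:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CycleTotals:
    """Raw per-cycle totals; the averages are derived from them."""
    tasks: int
    batches: int
    story_points: int

    @property
    def tasks_per_batch(self) -> float:
        return self.tasks / self.batches

    @property
    def sp_per_batch(self) -> float:
        return self.story_points / self.batches


# Totals as stated in this retro.
CYCLE_2 = CycleTotals(tasks=2, batches=2, story_points=10)
CYCLE_3 = CycleTotals(tasks=6, batches=5, story_points=18)


def delta(current: float, previous: float) -> float:
    """Signed change versus the previous cycle, rounded to one decimal."""
    return round(current - previous, 1)


# delta(CYCLE_3.tasks_per_batch, CYCLE_2.tasks_per_batch) -> 0.2
# delta(CYCLE_3.sp_per_batch, CYCLE_2.sp_per_batch)       -> -1.4
```

Deriving the averages from the raw totals, rather than typing them into the table by hand, keeps the Δ column from drifting out of sync with the counts.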
2. Quality Metrics
| Metric | Cycle 3 | Δ vs cycle 2 |
|---|---|---|
| Code review pass rate | 5/5 = 100% (all PASS_WITH_WARNINGS) | unchanged |
| Code review findings — Critical | 0 | unchanged |
| Code review findings — High | 0 | unchanged |
| Code review findings — Medium | 0 | unchanged |
| Code review findings — Low | 7 distinct (batch-01 F1 + F2 + batch-02 L1 + batch-03 F1 + batch-04 L1 + L2 + batch-05 L1) | +2 |
| Code review FAIL count | 0 | unchanged |
| Cumulative review findings | 4 Low (carry-overs from per-batch) + 1 operational gate (AZ-494 prod iss/aud) | new metric |
| Security audit verdict | PASS_WITH_WARNINGS | unchanged |
| Security findings introduced by cycle 3 | 0 new Medium, 3 new Low (D4 test-only, F-DBR-2 test-only, F-PERF-1 operator-CLI), 2 Informational (F-AUTH-3 test-runner log, F-AUTH-4 DEV-ONLY by design), 1 False Positive (F-DBR-1) | better (cycle 2 had 2 new Medium + 4 new Low + 1 Info) |
| Security findings RESOLVED by cycle 3 | F-AUTH-2 (Medium) by AZ-494; D1 (Medium) + D3 (Low) by AZ-496 | first cycle with explicit resolved-finding count |
3. Structural Metrics (snapshot: structure_2026-05-12_cycle3.md)
| Metric | Cycle 3 | Δ vs cycle 2 |
|---|---|---|
| .NET projects (csproj) | 9 | +1 (new: SatelliteProvider.TestSupport) |
| Cross-project edges (ProjectReference) | grew by 2 (TestSupport ← Tests, TestSupport ← IntegrationTests) | +2 |
| Cycles in project graph | 0 | unchanged |
| Public API symbols in TestSupport | 3 (JwtTokenFactory, IntegrationTestResetGuard, internal helpers) | new |
| New Architecture violations | 0 | unchanged |
| Resolved Architecture violations | F-AUTH-2 (auth contract gap), D1+D3 (supply-chain drift), PT-07/PT-08 deferred-since-cycle-2 | best metric of cycle — first net-negative architecture delta cycle |
| Net Architecture delta | -3 (new minus resolved: 0 new, 3 resolved) | improvement |
| Contract coverage % | unchanged (no new public API surfaces this cycle) | n/a |
A snapshot was written to _docs/06_metrics/structure_2026-05-12_cycle3.md (see Step 1 self-verification).
4. Efficiency Metrics
| Metric | Cycle 3 | Δ vs cycle 2 |
|---|---|---|
| Blocked tasks (during implementation) | 1 of 6 — AZ-494 blocked on cross-team input (admin-team iss/aud values), resolved by user choosing Option B (plumbing only, fail-fast at deploy) | new pattern |
| Tasks completed first attempt (no post-review fix commits) | 5 of 6; only AZ-491 had a follow-on batch-02 fix (a `fix:` commit) for the env-var save/restore pattern between iss/aud overrides; the rest landed clean | best cycle on record (cycle 1 = 0 of 1; cycle 2 = 0 of 2) |
| Tasks requiring multiple post-code-review fix commits | 0 | -2 (cycle 2 had 2 of 2) |
| Most-findings batch | batch 04 (AZ-492, 2 Low: L1 proxy measurement + L2 dupe JPEG factory) | batch 02 in cycle 2 |
| Cumulative-review-only findings | 0 — every cumulative-review item was already raised at the per-batch level | new metric; the cumulative review acted as confirmation, not discovery |
| Step-15 (Perf Test) execution | SKIPPED (user skipped the gate question; recorded as leftover) | unchanged vs cycle 2 (also skipped) |
| Step-14 (Security Audit): net findings improvement | +1 (3 Resolved, 3 new Low, all test-only or operator-CLI) | net-zero in cycle 2 |
5. Patterns Identified
Pattern 1 — Action items from prior retros directly drove the cycle scope and DID resolve
Cycle 2 retro identified three Top-3 actions:
- Action 1 (consolidate JWT test-mint helpers) → drove AZ-491. Verdict: resolved. The Phase-6 code-review rule added during AZ-491 review (the AZ-491 batch produced both the consolidation AND the rule that prevents recurrence) immediately fired on AZ-492 batch-04's L2 finding (dupe JPEG factory), proving the rule is doing useful work.
- Action 2 (perf harness work as real task) → drove AZ-492. Verdict: resolved. PT-07 + PT-08 now runnable; the leftover from cycle 2 was deleted.
- Action 3 (DB-state reset) → drove AZ-493. Verdict: resolved. Wall-clock workaround removed; two-guard model unit-tested.
This is the first cycle where retro actions translated directly to next-cycle work AND closed. Pattern is healthy. Keep doing it.
Pattern 2 — One AC (AZ-494 AC-7) requires cross-repo work; workspace-boundary rule fired correctly
AZ-494 AC-7 says "Suite contract reflects reality" — i.e. update suite/_docs/10_auth.md. That file lives in a different workspace. The autodev correctly did NOT write across workspaces; instead, the batch report + cumulative review + deploy report all flagged the cross-repo write as deferred operational work.
Insight: ACs that require cross-repo writes should be split — one AC per workspace — and tagged with the workspace they target. Otherwise an AC tagged "deferred" on technicality looks like incomplete work in the traceability matrix when it's actually the right outcome.
Pattern 3 — Spec-vs-reality drift on measurement / sentinel paths repeated twice
- AZ-492 L1: spec said "per-item gate cost < 50 ms" without specifying the measurement path; harness produced a derived proxy because direct measurement requires server-side instrumentation outside AZ-492's scope.
- AZ-493 F1: spec prescribed "DB name contains `_test`" as Guard 2; reality uses `Database=satelliteprovider` and a rename requires user confirmation; the implementation substituted a Host allowlist as an equivalent guard.
Both cases were caught and documented during code review, but the underlying pattern is the same: ACs that prescribe a specific measurement or sentinel mechanism should also prescribe (or explicitly defer) the path for collecting / enforcing it, so the implementation has a clear bound between "follow the spec" and "substitute an equivalent".
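To make the AZ-493 substitution concrete, here is a minimal sketch of a host-allowlist guard over a `Key=Value;Key=Value` connection string. It is a language-neutral illustration, not the code in `IntegrationTestResetGuard`; the function names are hypothetical, and the allowlist contents are supplied by the caller because the retro does not state them:

```python
def parse_connection_string(conn: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs into a dict with lower-cased keys."""
    pairs = (part.split("=", 1) for part in conn.split(";") if "=" in part)
    return {key.strip().lower(): value.strip() for key, value in pairs}


def host_is_allowlisted(conn: str, allowed_hosts: frozenset[str]) -> bool:
    """Permit a destructive reset only when Host is explicitly allowlisted.

    `allowed_hosts` is expected to hold lower-case host names. A connection
    string without a Host key is rejected rather than treated as local.
    """
    host = parse_connection_string(conn).get("host", "")
    return host.lower() in allowed_hosts
```

The sketch treats a missing `Host` key as a refusal, so the guard fails closed: a reset runs only against a host that someone deliberately listed.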
Pattern 4 — Option-B forcing functions (fail-fast in prod) elegantly handle cross-team blockers
AZ-494 was blocked on the admin team's iss/aud values. Rather than freeze the work, Option B (user choice) shipped the validation code with empty appsettings.json values, guaranteeing prod fails at startup until the real values are supplied. This:
- Unblocks cycle 3 implementation immediately.
- Forces the cross-team conversation to happen before prod deploy, not after.
- Leaves no silently-broken state ("validation off in prod") possible.
This is a generalizable pattern: for cross-team blockers, prefer a fail-fast scaffold over a defer-the-whole-task decision. Worth adding as a lesson.
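The shape of an Option-B forcing function can be sketched in a few lines. This is a language-neutral illustration under stated assumptions, not the AZ-494 implementation: the `issuer` and `audience` keys, the `require_auth_settings` function, and the exception type are all hypothetical names.

```python
class MissingAuthConfigError(RuntimeError):
    """Raised at startup when required token-validation settings are empty."""


def require_auth_settings(settings: dict[str, str]) -> tuple[str, str]:
    """Return (issuer, audience), or raise before the service starts serving.

    Empty or whitespace-only values count as missing, so a config file that
    ships with blank placeholders cannot silently disable validation.
    """
    missing = [key for key in ("issuer", "audience")
               if not (settings.get(key) or "").strip()]
    if missing:
        raise MissingAuthConfigError(
            "refusing to start: empty auth setting(s): " + ", ".join(missing)
        )
    return settings["issuer"], settings["audience"]
```

Calling a check like this once during startup, before any request handling is wired up, is what turns "values not yet supplied" into a visible deploy failure instead of a running service with validation effectively off.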
Pattern 5 — NU1902 warning surfaced AT build, was triaged inline, accepted as cycle-3 D4
The new TestSupport project's System.IdentityModel.Tokens.Jwt 7.0.3 pin triggered NU1902 (moderate severity) at every restore — visible in the build log 9 times. The batch-02 review (AZ-491) recorded this; the cycle-3 security audit recorded it again as D4; the deploy report carries it as a follow-up.
The path from "compiler warning during test runs" → "tracked security finding with a remediation owner" worked correctly. Worth keeping as a positive example: build-side warnings should ride into the security audit, not be silenced by the agent.
Pattern 6 — dotnet format whitespace --verify-no-changes ran transparently; no format-drift commits
Format drift caused two extra fix commits in cycle 2 (CS0104 fix, JwtFactory net8.0 fix). Cycle 3 had ZERO format-drift fixes — dotnet format ran as the first step of run-tests.sh, was visible in the log, and all 5 batches passed it without intervention. The earlier hardening (cycle 2 added format check to the test runner) pays off here.
6. Comparison vs. previous retro
| Metric | Cycle 1 | Cycle 2 | Cycle 3 |
|---|---|---|---|
| Tasks implemented | 1 | 2 | 6 |
| Batches | 1 | 2 | 5 |
| Critical/High review findings | 0 | 0 | 0 |
| New Medium review findings | 0 | 0 | 0 |
| New Low review findings | 3 | 6 (5 distinct) | 7 |
| Code review pass rate | 100% (1/1) | 100% (2/2) | 100% (5/5) |
| Tasks completed first attempt | 0 of 1 | 0 of 2 | 5 of 6 |
| New Medium security findings | 2 | 2 | 0 |
| Resolved security findings | 0 | 0 | 3 (F-AUTH-2, D1, D3) |
| Net Architecture delta | n/a (baseline) | +0 | -3 |
| Step-15 (Perf) executed | N/A | SKIPPED | SKIPPED |
| Step-15 leftover present at retro | N/A | YES (PT-07) | YES (cycle-3 perf-harness execution) |
Did the cycle-2 actions land?
- Cycle 2 Action 1 (consolidate JWT mint helpers) — landed as AZ-491. Phase-6 code-review rule added simultaneously. Rule fired in AZ-492 batch-04 (L2 dupe JPEG factory) as designed. Verdict: full implementation + recurrence prevention.
- Cycle 2 Action 2 (perf harness as real feature): landed as AZ-492. PT-07 + PT-08 are now runnable. Cycle-2 leftover file deleted. Verdict: full implementation. (Execution is a separate cycle-3 leftover, see Action 1 below.)
- Cycle 2 Action 3 (DB reset between runs) — landed as AZ-493. AZ-488 wall-clock workaround removed; two-guard model in place. Verdict: full implementation.
This is the first cycle in which all prior retro actions translated to closed work. Cycle 1 actions had mixed outcomes; cycle 2 actions were the input to cycle 3.
7. Top 3 Improvement Actions (ranked by impact)
Action 1 — Execute the cycle-3 perf harness against the deployed dev image to convert the cycle-3 perf-execution leftover into PT-07/PT-08 baseline numbers
Why this is the highest impact: PT-07 and PT-08 are NOW runnable (AZ-492 closed cycle 2's biggest backlog item). The harness is sitting unexercised because Step 15 was skipped. The next time anyone makes a change that touches the region read or upload paths, the perf gate must compare against something — and right now there is no recent baseline.
Action: at the start of cycle 4 (or as a one-off "ops" task), run ./scripts/run-performance-tests.sh against the deployed dev-tier image. Record the PT-01..PT-06 / PT-07 / PT-08 numbers in _docs/06_metrics/perf_<YYYY-MM-DD>.md (mirrors perf_2026-05-11_cycle1_az484.md). Use those numbers as the new baseline.
Cost: ~30 minutes (script run + record + delete leftover file). Possibly tracked as a 2-SP cycle-4 PBI rather than ad-hoc.
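Once a baseline exists, comparing later runs against it is mechanical. A minimal sketch of that comparison, assuming per-test latencies keyed by test ID (for example `PT-07`, `PT-08`); the `regressions` function and the 10% default tolerance are assumptions made for illustration, not values taken from the perf gate:

```python
def regressions(baseline: dict[str, float],
                current: dict[str, float],
                tolerance: float = 0.10) -> list[str]:
    """Return IDs of tests whose value exceeds the baseline by more than `tolerance`.

    Tests that have no baseline entry are reported too, because a run
    cannot be gated against a number that was never recorded.
    """
    flagged = []
    for test_id, value in sorted(current.items()):
        reference = baseline.get(test_id)
        if reference is None or value > reference * (1 + tolerance):
            flagged.append(test_id)
    return flagged
```

Reporting tests with no baseline as failures mirrors the situation described above: until PT-07/PT-08 numbers are recorded, every comparison for them is unanswerable.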
Action 2 — Bump System.IdentityModel.Tokens.Jwt 7.0.3 → 7.1.2+ (or to 8.0.x) in SatelliteProvider.TestSupport to clear the NU1902 build noise
Why: 9 NU1902 hits in every test build log degrade the signal-to-noise ratio of CI output and add a low-grade "is this PR adding new vulnerabilities?" cognitive load during review. The fix is one csproj edit; the trade-off is just verifying test compat with the newer version. Test-only finding, NOT production-reachable, but cleaning it up is cheap.
Action: 2 SP PBI in cycle 4. Recommended pin: 7.1.2 to stay on the 7.x line that JwtBearer 8.0.25 transitively depends on. Verify all JwtTokenFactory unit + integration tests still pass.
Action 3 — Make "ACs requiring cross-repo writes" structurally visible during task decomposition
Why: AZ-494 AC-7 was correctly deferred to a cross-repo write, but the traceability matrix shows AC-7 as ◐ deferred, which looks like missed work. Future cycles will accumulate more cross-repo ACs as the satellite-provider's contract surface grows; without structure, the traceability matrix gradually fills with ◐ markers that mean a different thing each time.
Action:
- Add a `workspace:` field to task spec ACs (where applicable): `workspace: satellite-provider` (default) or `workspace: suite`.
- In `new-task/SKILL.md` Step 6 (AC writing), prompt: "if this AC requires changes outside this workspace, tag with `workspace: <name>` and confirm with user that a cross-repo PBI will be filed".
- In the traceability matrix, render cross-workspace ACs in a separate section ("Cross-Workspace Follow-ups") rather than mixed with in-workspace ACs.
Cost: 2 SP — small skill rule addition + traceability template update + one-time backfill of AZ-494 AC-7.
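The tagging and the separate rendering can be modelled with a very small data structure. A minimal sketch, assuming each AC carries a `workspace` value that defaults to `satellite-provider`; the `AcceptanceCriterion` type and `split_by_workspace` helper are hypothetical and do not reflect the actual task-spec or traceability-matrix format:

```python
from dataclasses import dataclass

# Default workspace named in the action above.
DEFAULT_WORKSPACE = "satellite-provider"


@dataclass(frozen=True)
class AcceptanceCriterion:
    task: str
    ac_id: str
    workspace: str = DEFAULT_WORKSPACE


def split_by_workspace(acs, home=DEFAULT_WORKSPACE):
    """Partition ACs into (in-workspace, cross-workspace follow-ups)."""
    local = [ac for ac in acs if ac.workspace == home]
    follow_ups = [ac for ac in acs if ac.workspace != home]
    return local, follow_ups
```

With a split like this, an AC such as AZ-494 AC-7 tagged `workspace: suite` would land in the "Cross-Workspace Follow-ups" section instead of appearing as ◐ deferred among the in-workspace ACs.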
8. Recommended Rule / Skill updates
- Add to `coderule.mdc`: "When a task's success depends on cross-team input that may not be available in-cycle, prefer an Option-B forcing function (ship the validation/scaffolding with prod-empty config that fails-fast at deploy) over deferring the entire task. The fail-fast contract makes the cross-team conversation impossible to skip." (Justified by Pattern 4.)
- Add to `coderule.mdc` (or new `task-spec.mdc`): "ACs that prescribe a specific measurement or sentinel mechanism (e.g. 'per-item latency < 50ms', 'guard fires if DB name contains _test') should also prescribe (or explicitly defer) the path for collecting / enforcing it, so the implementation has a clear bound between 'follow the spec' and 'substitute an equivalent.'" (Justified by Pattern 3.)
- Update `.cursor/skills/new-task/SKILL.md` (Step 6): introduce optional `workspace:` field on each AC; warn when an AC implies cross-repo work. (Justified by Action 3.)
- Keep the Phase-6 dupe-helper rule added to `.cursor/skills/code-review/SKILL.md` during AZ-491 review; it fired correctly in AZ-492 batch-04 and is a low-cost / high-signal check. No change needed.
9. Decision items carried over (operator)
- Admin team iss/aud confirmation (deploy R1 from cycle-3 deploy report): required before promoting beyond `dev`. Tracked in `deploy_cycle3.md`.
- Cross-repo doc (deploy R2 / Pattern 2 / Action 3): `suite/_docs/10_auth.md` paragraph addition. Tracked in `deploy_cycle3.md`.
- Cycle-1 hardening backlog (S1, S2, S4, I1, I3, I5): pre-public-network items, unchanged. Tracked in cycle-1 retro + `security_report.md`.
10. What this retro says about process maturity
Cycle 3 is the first cycle that:
- Closes more Medium-or-above security findings than it opens (2 resolved, 0 new; net Architecture delta -3).
- Lands all prior-retro action items.
- Has every task at-or-below the 5-SP cap.
- Achieves a >80% first-attempt-completion rate (5/6 = 83%).
- Has zero format-drift fix commits.
- Has zero code-review FAIL verdicts.
The process is converging. The next areas of friction are (a) cross-repo coordination (Action 3), (b) following through on the perf-harness execution that the harness work itself enabled (Action 1), and (c) ongoing test-side supply-chain hygiene (Action 2). None of these are systemic; all are concrete cycle-4 PBI candidates.