Closes the AZ-484 cycle: - retro_2026-05-11.md: 5 patterns identified (code-review-PASS does not imply runtime PASS; spec-authorship under-specifies wire format / test sites; NFR test-spec entries decoupled from runner scripts; pre-existing module doc staleness; pre-existing security Mediums now visible). Top-3 actions ranked by impact, with target rule/skill files and owners. - structure_2026-05-11.md: baseline structural snapshot for future retro deltas (6 components, 12 ProjectReference edges, 0 cycles in import graph, 0 net architecture violations, 1 frozen contract, ~14% contract coverage). - LESSONS.md: header rewritten to describe the two-layer format (deep lessons + 15-entry ring buffer); appended 3 new ring-buffer entries (testing/process/estimation) sourced from this retro. - _autodev_state.md: cycle 2 starting at Step 9 New Task. Co-authored-by: Cursor <cursoragent@cursor.com>
10 KiB
Retrospective — Cycle 1 (AZ-484, 2026-05-11)
Cycle: 1 (Phase B feature cycle, single task) Tasks: AZ-484 — Multi-source tile storage schema Mode: cycle-end (autodev Step 17) Previous retro: none — this is the baseline retro for Phase B; no trend comparison possible.
1. Implementation Metrics
| Metric | Value |
|---|---|
| Tasks implemented | 1 (AZ-484) |
| Batches | 1 (#25 — numbering continues from Phase A) |
| Avg tasks / batch | 1.0 |
| Complexity points delivered | 5 SP (within the user-rule preferred 2–5 SP band) |
| Avg complexity / batch | 5.0 |
| Source files modified (production) | 4 |
| Source files added (production) | 3 (migration 013, TileSource.cs, TileSourceConverter.cs) |
| Test files modified / added | 4 modified, 1 added (TileSourceConverterTests) |
| Doc files updated in this cycle (Steps 12–17) | 13 modified, 8 added |
| Migrations added | 1 (013_AddTileSourceAndCapturedAt.sql — transactional) |
2. Quality Metrics
| Metric | Value |
|---|---|
| Code review pass rate | 1/1 = 100% (PASS) |
| Code review findings — Critical | 0 |
| Code review findings — High | 0 |
| Code review findings — Medium | 0 |
| Code review findings — Low | 0 |
| Code review FAIL count | 0 |
| Auto-fix attempts during code review | 0 |
| Post-code-review test-failure iterations | 2 (see §5 Pattern 1) |
| Security audit verdict | PASS_WITH_WARNINGS |
| Security findings introduced by AZ-484 | 0 |
| Security findings pre-existing surfaced this cycle | 5 Medium + 5 Low |
3. Structural Metrics (snapshot: structure_2026-05-11.md)
| Metric | Value | Δ vs cycle 0 |
|---|---|---|
| Components | 6 | 0 |
| ProjectReference edges | 12 | 0 |
| Cycles in import graph | 0 | 0 (still clean DAG) |
| Avg ProjectReferences / component | 2.0 | 0 |
| Architecture violations newly introduced | 0 | — |
| Architecture violations resolved | 0 | — |
| Net architecture delta | 0 | — (good) |
| Frozen contracts | 1 | +1 (tile-storage v1.0.0) |
| Contract coverage of public-API surfaces | ~14% | +14 pp (was 0%) |
| Common/Enums import sites (DataAccess + TileDownloader) | 7 | +2 (AZ-484-introduced consumers) |
4. Efficiency Metrics
| Metric | Value |
|---|---|
| Blocked task count | 0 |
| Tasks completed first attempt (post-review) | 0 of 1 — AZ-484 required 2 fix iterations during Step 11 |
| Most-findings batch | n/a — single-batch cycle |
| Spec accuracy ("test site count" estimate) | Spec said ~12 sites incl. ~3 in RegionServiceTests; actual = 6 sites, 0 in RegionServiceTests |
| Spec accuracy ("storage approach") | Spec proposed EnumStringTypeHandler reuse; actual implementation pivoted twice (TypeHandler → string + Converter) due to Dapper bypass (L-001) |
| Performance gate scenarios run | 0 of 7 (all Unverified — non-blocking; PT-07 harness deferred) |
5. Patterns Identified
Pattern 1 — Code review PASS does not imply runtime PASS
The cycle's code-review verdict was a clean PASS with zero findings. Yet Step 11 (Run Tests) found two real production bugs that required code changes in two further commits:
column "updated_at" does not exist— design bug inGetTilesByRegionAsync's outerORDER BYreferring to a column the innerDISTINCT ONhad aliased away. Caught by integration test on attempt 1.- Dapper enum bypass (L-001) — caught by integration test on attempt 2; required architectural pivot (
TileSourceTypeHandler→TileEntity.Source as string+TileSourceConverter).
The static review caught neither because both required runtime evidence. The current workflow handles this correctly (Step 11 is the gate, and it did its job), but the metric "code review PASS rate = 100%" should NOT be read as "first-attempt success rate = 100%". A truer headline metric is "first-attempt-runtime-pass rate" — which for cycle 1 was 0/1 = 0%.
This isn't a process failure — it's a normal cost of database/library integration work. But the metric framing matters for future trend analysis.
Pattern 2 — Spec authorship under-specifies wire-format and ORM mechanics
The task spec assumed EnumStringTypeHandler<T> reuse without checking what wire format the existing handler emits. The test-site-count estimate assumed RegionServiceTests referenced TileEntity (it does not). Both came from spec authorship pattern-matching against neighboring code without grepping.
This was caught and corrected in ## Pre-Implementation Audit (Risk 3 mitigation in the batch report) — the audit listed every new TileEntity site before edits and surfaced the spec inaccuracy. The audit step is doing real work; the lesson is that future task specs touching enum persistence or test sites must include a grep evidence block, not estimates.
Pattern 3 — NFR test-spec entries decoupled from runner-script updates
PT-07 was added to _docs/02_document/tests/performance-tests.md and traceability-matrix.md during Step 12 (Test-Spec Sync), but the harness in scripts/run-performance-tests.sh was not extended at the same time. This left PT-07 as documentation-only — and the perf gate at Step 15 had to record it as Unverified and file a leftover.
The risk: NFRs accumulate as Unverified across cycles, eroding gate confidence. Test-spec edits and runner-script edits should be one step, not two.
Pattern 4 — Pre-existing module documentation staleness
Cycle 1's Update Docs step refreshed AZ-484-touched module docs (8 modules + 2 components). It did NOT address pre-existing staleness elsewhere — e.g., tests_unit.md still says the project has only "a dummy test placeholder" despite 213 unit tests existing. The autodev's per-cycle doc step is correctly scoped to cycle-touched files (per the ripple log analysis), but the staleness from earlier cycles is not getting picked up.
Pattern 5 — Pre-existing Medium-severity security items now visible
The Security Audit (Step 14) — first-ever audit pass for this codebase — surfaced 5 Medium findings (Postgres default creds, .env not in dockerignore, hardcoded GMaps key, no rate limiter, Microsoft.AspNetCore.OpenApi 8.0.21 CVE) that pre-date AZ-484. None are AZ-484 regressions. Without explicit hardening tickets, they are at risk of being forgotten.
6. Comparison vs. previous retros
None. This is the baseline.
7. Top 3 Improvement Actions (ranked by impact)
Action 1 — Add a "DB-roundtrip test" requirement for every new persisted enum
Why: AZ-484's two days of debugging (1 design bug + 1 architectural pivot) traced back to L-001. A 3-line integration test that inserts a row with each enum value and reads it back through Dapper would have caught the Dapper bypass on the first run, before any architectural work was wasted.
Concrete action:
- Update
coderule.mdc§"Test environment" with an explicit rule: "Any enum that is persisted through Dapper or any ORM that uses IL-emitted deserialization (Dapper, Entity Framework's value converters in some configs) MUST have an integration test that round-trips every enum value through a real database row. Unit-testing the type handler in isolation does not prove read-side behavior." - Add an explicit checklist row in
.cursor/skills/test-spec/SKILL.md(or wherever the test-spec checklist lives): "If task introduces a persisted enum: include an integration round-trip test in the test-spec deliverable."
Owner: Author of next task spec involving an enum + DB column. Estimated impact: prevents recurrence of L-001 — saves a similar 2–3 fix iterations on the next enum task.
Action 2 — Couple NFR additions in test-spec with runner-script updates in the same step
Why: PT-07 is Unverified for cycle 1 because the spec entry and the harness work were split. Future cycles will accumulate more Unverified NFRs unless the workflow forces both to land together.
Concrete action:
- Update
.cursor/skills/test-spec/SKILL.md(or its Phase B variant): "When adding a row to_docs/02_document/tests/performance-tests.md, thescripts/run-performance-tests.shscenario must be added in the same commit. If the harness work cannot be done in this cycle, the NFR must be recorded as 'Deferred — harness work tracked in ' in the test-spec, not as an active scenario." - Make the autodev Step 15 (Performance Test) explicitly check that every active scenario has a runner implementation; treat a missing runner as a hard fail of the test-spec gate, not a soft fail of the perf gate.
Owner: Whoever runs Step 12 (Test-Spec Sync) next cycle.
Estimated impact: keeps the perf gate honest. Prevents Unverified from becoming the default state.
Action 3 — Schedule a periodic "doc-base-rate refresh" to clean up multi-cycle staleness
Why: Per-cycle doc updates only catch ripples from the cycle's changes. They do not address staleness that accumulated before the autodev workflow was introduced (e.g., tests_unit.md still describes the codebase as having a single dummy test). This staleness erodes trust in the documentation set.
Concrete action:
- Add a low-priority recurring task
AZ-XXX Periodic doc-base-rate refreshto the backlog. Cadence: every 5 cycles or every quarter, whichever is sooner. - Scope of the refresh: walk every file under
_docs/02_document/modules/and_docs/02_document/components/, compare against current source, fix any contradictions. Do NOT add new content — only fix what's wrong. - Time-box: 2 hours per refresh task.
Owner: To be assigned via the regular planning loop. Estimated impact: keeps documentation trust above the "would-believe-without-checking" threshold.
8. Recommended Rule / Skill updates
| Target file | Change |
|---|---|
.cursor/rules/coderule.mdc |
Add the persisted-enum DB-roundtrip rule from Action 1 |
.cursor/skills/test-spec/SKILL.md |
Add the NFR-plus-runner-coupling rule from Action 2 + the persisted-enum checklist row from Action 1 |
.cursor/skills/autodev/SKILL.md (Step 15) |
Add the "every active perf scenario must have a runner" gate |
9. Self-verification
- All cycle 1 implementation artifacts parsed (1 batch report, 1 review, 1 implementation report, 1 completeness report, 1 deploy report)
- All metric categories computed
- Structural snapshot written (
structure_2026-05-11.md) - No previous retro to compare against — explicitly noted
- Top 3 actions are concrete and have an owner + estimated impact
- Suggested rule/skill updates are file-specific