# Retrospective — 2026-05-13 (Cycle 1, end of cycle)

**Mode**: cycle-end

**Cycle**: 1

**Window**: 2026-04-16 (Phase A baseline) → 2026-05-13 (Phase B feature cycle complete + Deploy)

**Previous retro**: N/A — first retrospective

## Implementation Summary

| Metric | Phase A (baseline) | Phase B (cycle 1) | Total |
|--------|-------------------:|------------------:|------:|
| Total tasks | 7 | 4 | **11** |
| Total batches | 4 | 2 | **6** |
| Total complexity points | 29 | 11 | **40** |
| Avg tasks per batch | 1.75 | 2.0 | 1.83 |
| Avg complexity per batch | 7.25 | 5.5 | 6.67 |
| Tasks per task spec | — | — | 1 |

Per-task complexity (Phase B): AZ-513 (3) + AZ-196 (2) + AZ-183 (3, reverted) + AZ-197 (3) = 11 points.

## Quality Metrics

### Code Review Results

| Verdict | Count | % |
|---------|------:|--:|
| PASS | 5 | 83% |
| PASS_WITH_WARNINGS | 1 | 17% |
| FAIL | 0 | 0% |

### Findings by Severity (code review only — security audit findings counted separately below)

| Severity | Count | Source |
|----------|------:|--------|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 1 | batch_05 F1 (race on sequential serial) |
| Low | 3 | batch_05 F2/F3/F4 (uniqueness, key rotation, default empty key) |

### Findings by Category

| Category | Count | Top Files |
|----------|------:|-----------|
| Bug | 1 | `Azaion.Services/UserService.cs` (RegisterDevice) |
| Maintainability | 3 | `Azaion.Services/ResourceUpdateService.cs` (×2), `Azaion.AdminApi/appsettings.json` |
| Spec-Gap | 0 | — |
| Security | 0 *(code review)* / 13 *(security audit)* | — |
| Performance | 0 | — |
| Style | 0 | — |
| Scope | 0 | — |

### Security Audit (out-of-band, post-implementation)

| Severity | Count | Status at end of cycle |
|----------|------:|------------------------|
| Critical | 0 | — |
| High | 3 | F-1 closed (OTA reverted), F-3 closed (UNIQUE INDEX), D-1 closed (Newtonsoft 13.0.4); 1 pre-existing (F-2 path traversal) deferred to AZ-516 |
| Medium | 5 | 0 closed in audit; recorded as AZ-517..AZ-520 |
| Low | 5 | 0 closed; recorded as AZ-521 (bundle) |

> The audit found 1 **regression** introduced by cycle-1 work: F-1 (`/get-update` exposed plaintext encryption keys, AZ-183). Fix: full revert of AZ-183. F-3 was an amplification of a pre-existing race (`RegisterDevice` not having a UNIQUE INDEX); the audit closed it by adding `env/db/06_users_email_unique.sql` and consolidating `RegisterDevice` to delegate row insertion to `RegisterUser`.

### Performance Test

| Verdict | NFT thresholds met | Coverage gaps |
|---------|--------------------|---------------|
| PASS | 2/2 (NFT-PERF-01 login p95=33 ms vs 500 ms; NFT-PERF-04 user-list p95=152 ms vs 1000 ms) | NFT-PERF-02/03 obsolete (OTA reverted); no `/classes` perf coverage yet |

### Deploy Audit (this step)

| Drift | Severity | Resolved this cycle | Carried forward |
|-------|---------:|--------------------:|----------------:|
| A — host pulls `:latest`, CI never produces it | Medium | yes | — |
| B — no secret manager | Medium | yes (sops + age) | — |
| C — container runs as root | Medium | yes (`USER app`) | — |
| D — stale `.woodpecker/build-arm.yml` reference | Low | yes (doc + actual files audited) | — |
| E — perf script run-on-demand | Low | spec'd; auto-gating deferred | I |
| F — no vulnerable-dep gate | Low | yes (deps-audit step) | — |
| G — unused `docker.test/Dockerfile` | Low | yes (deleted) | — |
| H — TCP-only healthcheck in test compose | Low | yes (curl /health/live) | — |
| I — no coverage threshold | Low | — | yes |
| J — manual DB migrations | Low | — | yes |
| K — no metrics / tracing implemented | Medium | spec only | yes |
| L — no central log aggregator | Low | — | yes |
| M — no tracing exporter | Low | — | yes |
| N — no zero-downtime deploy | Medium | — | yes |
| O — no remote SSH wrapper | Low | — | yes |

**7 resolved this cycle, 8 carried forward.**

## Efficiency Metrics

| Metric | Value | Notes |
|--------|------:|-------|
| Blocked tasks | 0 | — |
| Tasks requiring fixes after review | 0 | All findings deferred or descoped, none required cycle re-entry |
| Auto-fix attempts triggered | 0 | Across all 6 batches |
| Stuck agents | 0 | — |
| Reverts after main code shipped | 1 | **AZ-183** — same-day revert after security audit finding F-1 |
| Skipped tests with documented reason | 1 | AZ-195 AC-1 (DB recovery test needs Docker socket access) |
| Test pass rate (E2E suite, end of Step 7) | 44/44 | After Dockerfile + healthcheck changes |

### Blocker Analysis

No blockers, but two notable mid-cycle pivots:

| Event | Type | Prevention idea |
|-------|------|------------------|
| User clarified mid-implement (2026-05-13) that the Loader is architecturally retired → AZ-197 was rescoped from cross-workspace to admin-only | Spec ambiguity discovered late | Add an "implicit assumptions" review gate to `new-task` Step 5 (Acceptance Criteria) that explicitly asks: which other workspaces does this touch? Are they still active? |
| Security audit found AZ-183 ships plaintext encryption keys → entire feature reverted same day | Threat model gap not caught at planning | Add a lightweight "what new authenticated endpoints / persistence does this introduce?" prompt to `new-task` Step 5; route any non-zero answer through a 5-minute threat-model check before complexity is finalized |

## Structural Snapshot

This is the first retro, so no delta computation. Snapshot persisted to `_docs/06_metrics/structure_2026-05-13.md` (placeholder — module-layout.md has 5 conceptual sub-components but only **one** ownership boundary in the registry, so cross-component edge counting is degenerate for this workspace).

| Metric | Value | Source |
|--------|------:|--------|
| Components (registry) | 1 (`Admin API`) | `_docs/02_document/module-layout.md` |
| Conceptual sub-components | 5 | same |
| csproj projects | 5 | `Azaion.AdminApi.sln` (4 prod + 1 e2e) |
| Cycles in module graph | 0 | inspection (single deployable, no cross-component edges in the registry) |
| New Architecture violations this cycle | 0 | no `cumulative_review_batches_*.md` exists; verified by inspection of batch reviews — no Architecture-category findings |
| Resolved Architecture violations | 0 | — |
| Net Architecture delta | 0 | — |
| Public-API contract files (`_docs/02_document/contracts/`) | 0 | folder absent |
| Contract coverage % | n/a | n/a |

> Contract files are not part of this project's documentation set today. If future cycles introduce them (e.g., as part of a UI ↔ admin contract test effort), this section will start carrying real coverage numbers.

## Trend Comparison

| Metric | Previous | Current | Change |
|--------|----------|--------:|--------|
| Pass rate | n/a | 83% (5/6) | n/a |
| Avg findings per batch | n/a | 0.67 | n/a |
| Reverts | n/a | 1 | n/a |
| Carried-forward operational drifts | n/a | 8 | n/a |

## Top 3 Improvement Actions

1. **Add a security threat-model micro-step to `new-task` Step 5 (Acceptance Criteria)**
   - **What**: Two extra lines on every task spec: "New authenticated endpoints introduced: [list]" and "New persistent data introduced: [list]". If either is non-empty, the next sub-step is a 5-minute threat-model check (data flow, secrets exposure, replay surface). Output recorded in the task spec under `## Threat Model Notes`.
   - **Impact**: catches the AZ-183-style "endpoint exposes plaintext key" class of regression at planning time, before the 3-pt budget is committed. Saves at least one cycle of implement → security-audit → revert per occurrence.
   - **Effort**: low (skill text edit + template addition).
2. **Adopt the `_cycleN_` batch-report naming convention starting cycle 2**
   - **What**: Rename forward only. Every new batch report and code-review file in cycle 2+ uses `batch_NN_cycleM_report.md` and `batch_NN_cycleM_review.md`. Cycle-1 files stay as `batch_NN_report.md` for history. Update the `implement` skill's report-filename template.
   - **Impact**: prevents silent overwrite of cycle-1 batch reports when cycle 2's `batch_07` lands (it would currently collide with `batch_07_report.md` if that name were used). Already documented in the existing-code flow Step 10; this enforces it.
   - **Effort**: low (one edit in `.cursor/skills/implement/`).
3. **File the 8 carried-forward deploy drifts as Jira tickets in cycle 2 backlog**
   - **What**: I, J, K, L, M, N, O are real backlog items (coverage gates, automated migrations, metrics + tracing, central logs, exporter, zero-downtime deploy, remote SSH wrapper). The eighth carried-forward drift, E (deferred perf auto-gating), is marked in the Deploy Audit table as carried under I. All of them currently live only as references in `_docs/04_deploy/*.md`. Promote them to AZ-tickets with story points.
   - **Impact**: makes operational debt visible alongside feature work; protects against silent erosion of the deploy plan over multiple cycles.
   - **Effort**: medium (≈ 30 min of ticket creation + sizing).

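The naming rule in action 2 can be sketched as a small helper. This is a minimal illustration, not the `implement` skill's actual template: the function names `batch_filename` and `parse_batch_filename` are hypothetical, and both the two-digit zero padding and the cycle-1 review name `batch_NN_review.md` are assumptions inferred from names such as `batch_05` and `batch_07_report.md`.

```python
import re
from typing import Tuple

# Assumed patterns: cycle-1 files carry no cycle marker, cycle 2+ files do.
CYCLE1_PATTERN = re.compile(r"^batch_(\d{2})_(report|review)\.md$")
CYCLE_N_PATTERN = re.compile(r"^batch_(\d{2})_cycle(\d+)_(report|review)\.md$")


def batch_filename(batch: int, cycle: int, kind: str = "report") -> str:
    """Build a batch report/review filename (hypothetical helper)."""
    if kind not in ("report", "review"):
        raise ValueError(f"unknown kind: {kind!r}")
    if batch < 1 or cycle < 1:
        raise ValueError("batch and cycle must be positive")
    if cycle == 1:
        # Cycle-1 files keep their historical names.
        return f"batch_{batch:02d}_{kind}.md"
    return f"batch_{batch:02d}_cycle{cycle}_{kind}.md"


def parse_batch_filename(name: str) -> Tuple[int, int, str]:
    """Return (batch, cycle, kind); an unmarked name is treated as cycle 1."""
    match = CYCLE_N_PATTERN.match(name)
    if match:
        return int(match.group(1)), int(match.group(2)), match.group(3)
    match = CYCLE1_PATTERN.match(name)
    if match:
        return int(match.group(1)), 1, match.group(2)
    raise ValueError(f"not a batch report/review filename: {name!r}")
```

Because an unmarked name parses as cycle 1, `batch_07_report.md` and `batch_07_cycle2_report.md` map to different (batch, cycle) pairs and can coexist in the same folder.
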
## Suggested Rule / Skill Updates

| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/new-task/SKILL.md` | Add Step 5.5 — "Threat-Model Micro-Check" with the two prompts above | AZ-183 revert (cycle 1) |
| `.cursor/skills/implement/SKILL.md` | Update batch-report filename template to `batch_NN_cycleM_report.md` (and review file analogously) | Naming-collision risk on cycle 2 |
| `.cursor/rules/coderule.mdc` | Add bullet: "Do not reuse retired numeric error codes (gaps are intentional)" | Batch 6 deletes codes 40 and 45 from `ExceptionEnum` — needs a rule so cycle 2 reviewers know not to fill the gap |
| `_docs/04_deploy/`-derived backlog | New AZ-* tickets for drifts I, J, K, L, M, N, O | Top action 3 above |

## Notes

- **First retrospective.** No prior baseline; cycle 2 will be the first one with delta numbers.
- **Cycle health**: green. 0 FAIL verdicts, 0 stuck agents, 0 auto-fix attempts, 44/44 E2E tests pass after Step 7's code edits. The single revert (AZ-183) was caught by the next-step security audit and resolved before deploy — the system worked, but the goal of the threat-model micro-check is to catch it one step earlier.
- **Operator burden after this cycle**: the 8 carried-forward drifts represent ≈ 22 story points of follow-up infrastructure work (rough sizing — to be confirmed when filed as tickets per Top Action 3).

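The averages in the Implementation Summary and the values in the Trend Comparison follow directly from the raw counts. A minimal sketch for recomputing them, and for producing the deltas that cycle 2 will need, might look like the following. The names `PhaseCounts`, `per_batch`, and `delta` are hypothetical and not part of any existing tooling; the input numbers are copied from the tables in this retro.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhaseCounts:
    tasks: int
    batches: int
    complexity: int


def per_batch(value: int, batches: int) -> float:
    """Average per batch, rounded to two decimals as in the tables."""
    return round(value / batches, 2)


def delta(previous: Optional[float], current: float) -> str:
    """Signed change versus the previous retro, or "n/a" without a baseline."""
    if previous is None:
        return "n/a"
    return f"{current - previous:+.2f}"


# Raw counts from the Implementation Summary.
phase_a = PhaseCounts(tasks=7, batches=4, complexity=29)
phase_b = PhaseCounts(tasks=4, batches=2, complexity=11)
total = PhaseCounts(
    tasks=phase_a.tasks + phase_b.tasks,
    batches=phase_a.batches + phase_b.batches,
    complexity=phase_a.complexity + phase_b.complexity,
)

# Code review: 5 PASS verdicts out of 6 batches; 1 Medium + 3 Low findings.
pass_rate_pct = round(100 * 5 / total.batches)
avg_findings = per_batch(1 + 3, total.batches)

print(per_batch(total.tasks, total.batches))       # avg tasks per batch
print(per_batch(total.complexity, total.batches))  # avg complexity per batch
print(pass_rate_pct, avg_findings, delta(None, avg_findings))
```

With no previous retro, `delta` returns "n/a" for every metric, matching the Change column above; from cycle 2 onward the same call yields a signed difference.
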