# Retrospective — 2026-05-13 (Cycle 1, end of cycle)

**Mode**: cycle-end
**Cycle**: 1
**Window**: 2026-04-16 (Phase A baseline) → 2026-05-13 (Phase B feature cycle complete + Deploy)
**Previous retro**: N/A — first retrospective

## Implementation Summary

| Metric | Phase A (baseline) | Phase B (cycle 1) | Total |
|--------|-------------------:|------------------:|------:|
| Total tasks | 7 | 4 | **11** |
| Total batches | 4 | 2 | **6** |
| Total complexity points | 29 | 11 | **40** |
| Avg tasks per batch | 1.75 | 2.0 | 1.83 |
| Avg complexity per batch | 7.25 | 5.5 | 6.67 |
| Tasks per task spec | — | — | 1 |

Per-task complexity (Phase B): AZ-513 (3) + AZ-196 (2) + AZ-183 (3, reverted) + AZ-197 (3) = 11 points.

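The per-batch averages in the table can be re-derived from the raw totals. The sketch below is only a sanity check, not part of the project's tooling; the dictionary layout and the `per_batch` helper are illustrative names, and the input numbers are copied from the table above.

```python
# Phase totals copied from the Implementation Summary table.
phases = {
    "Phase A": {"tasks": 7, "batches": 4, "points": 29},
    "Phase B": {"tasks": 4, "batches": 2, "points": 11},
}


def per_batch(tasks, batches, points):
    """Return (avg tasks per batch, avg complexity per batch), rounded to 2 places."""
    return round(tasks / batches, 2), round(points / batches, 2)


# The "Total" column is computed from summed totals,
# not by averaging the two per-phase averages.
totals = {
    key: sum(phase[key] for phase in phases.values())
    for key in ("tasks", "batches", "points")
}

phase_a = per_batch(**phases["Phase A"])
phase_b = per_batch(**phases["Phase B"])
overall = per_batch(**totals)
```

Because Phase A has twice as many batches as Phase B, the overall figures are weighted toward Phase A, which is why 1.83 is not the midpoint of 1.75 and 2.0.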

## Quality Metrics

### Code Review Results

| Verdict | Count | % |
|---------|------:|--:|
| PASS | 5 | 83% |
| PASS_WITH_WARNINGS | 1 | 17% |
| FAIL | 0 | 0% |

### Findings by Severity (code review only — security audit findings counted separately below)

| Severity | Count | Source |
|----------|------:|--------|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 1 | batch_05 F1 (race on sequential serial) |
| Low | 3 | batch_05 F2/F3/F4 (uniqueness, key rotation, default empty key) |

### Findings by Category

| Category | Count | Top Files |
|----------|------:|-----------|
| Bug | 1 | `Azaion.Services/UserService.cs` (RegisterDevice) |
| Maintainability | 3 | `Azaion.Services/ResourceUpdateService.cs` (×2), `Azaion.AdminApi/appsettings.json` |
| Spec-Gap | 0 | — |
| Security | 0 *(code review)* / 13 *(security audit)* | — |
| Performance | 0 | — |
| Style | 0 | — |
| Scope | 0 | — |

### Security Audit (out-of-band, post-implementation)

| Severity | Count | Status at end of cycle |
|----------|------:|------------------------|
| Critical | 0 | — |
| High | 3 | F-1 closed (OTA reverted), F-3 closed (UNIQUE INDEX), D-1 closed (Newtonsoft 13.0.4); 1 pre-existing (F-2 path traversal) deferred to AZ-516 |
| Medium | 5 | 0 closed in audit; recorded as AZ-517..AZ-520 |
| Low | 5 | 0 closed; recorded as AZ-521 (bundle) |

> The audit found one **regression** introduced by cycle-1 work: F-1 (`/get-update` exposed plaintext encryption keys, AZ-183). The fix was a full revert of AZ-183. F-3 amplified a pre-existing race (`RegisterDevice` inserted rows with no UNIQUE INDEX to reject duplicates); the audit closed it by adding `env/db/06_users_email_unique.sql` and consolidating `RegisterDevice` so that it delegates row insertion to `RegisterUser`.

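The F-3 fix pattern (a unique index plus a single insertion path) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the table layout, the index name, the way a device identity is derived, and the use of in-memory SQLite. The project's actual schema, database engine, and C# implementation are not shown in this document.

```python
import sqlite3

# Hypothetical schema; the real users table is not shown in this retrospective.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("CREATE UNIQUE INDEX users_email_unique ON users (email)")


def register_user(conn, email):
    """Insert a user row; return False if the unique index rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


def register_device(conn, serial):
    """Hypothetical device path: derive an identity, then delegate the insert."""
    return register_user(conn, f"device-{serial}@example.invalid")


first = register_device(conn, "0001")
second = register_device(conn, "0001")  # same serial again, e.g. a racing request
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The point of the design is that uniqueness is enforced by the database rather than by a check-then-insert sequence in application code, so two concurrent registrations cannot both succeed, and there is only one code path that writes the row.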

### Performance Test

| Verdict | NFT thresholds met | Coverage gaps |
|---------|--------------------|---------------|
| PASS | 2/2 (NFT-PERF-01 login p95=33 ms vs 500 ms; NFT-PERF-04 user-list p95=152 ms vs 1000 ms) | NFT-PERF-02/03 obsolete (OTA reverted); no `/classes` perf coverage yet |

### Deploy Audit (this step)

| Drift | Severity | Resolved this cycle | Carried forward |
|-------|---------:|--------------------:|----------------:|
| A — host pulls `:latest`, CI never produces it | Medium | yes | — |
| B — no secret manager | Medium | yes (sops + age) | — |
| C — container runs as root | Medium | yes (`USER app`) | — |
| D — stale `.woodpecker/build-arm.yml` reference | Low | yes (doc + actual files audited) | — |
| E — perf script run-on-demand | Low | spec'd; auto-gating deferred | I |
| F — no vulnerable-dep gate | Low | yes (deps-audit step) | — |
| G — unused `docker.test/Dockerfile` | Low | yes (deleted) | — |
| H — TCP-only healthcheck in test compose | Low | yes (curl /health/live) | — |
| I — no coverage threshold | Low | — | yes |
| J — manual DB migrations | Low | — | yes |
| K — no metrics / tracing implemented | Medium | spec only | yes |
| L — no central log aggregator | Low | — | yes |
| M — no tracing exporter | Low | — | yes |
| N — no zero-downtime deploy | Medium | — | yes |
| O — no remote SSH wrapper | Low | — | yes |

**7 resolved this cycle, 8 carried forward.**

## Efficiency Metrics

| Metric | Value | Notes |
|--------|------:|-------|
| Blocked tasks | 0 | — |
| Tasks requiring fixes after review | 0 | All findings deferred or descoped, none required cycle re-entry |
| Auto-fix attempts triggered | 0 | Across all 6 batches |
| Stuck agents | 0 | — |
| Reverts after main code shipped | 1 | **AZ-183** — same-day revert after security audit finding F-1 |
| Skipped tests with documented reason | 1 | AZ-195 AC-1 (DB recovery test needs Docker socket access) |
| Test pass rate (E2E suite, end of Step 7) | 44/44 | After Dockerfile + healthcheck changes |

### Blocker Analysis

No blockers, but two notable mid-cycle pivots:

| Event | Type | Prevention idea |
|-------|------|------------------|
| User clarified mid-implement (2026-05-13) that the Loader is architecturally retired → AZ-197 was rescoped from cross-workspace to admin-only | Spec ambiguity discovered late | Add an "implicit assumptions" review gate to `new-task` Step 5 (Acceptance Criteria) that explicitly asks: which other workspaces does this touch? Are they still active? |
| Security audit found AZ-183 ships plaintext encryption keys → entire feature reverted same day | Threat model gap not caught at planning | Add a lightweight "what new authenticated endpoints / persistence does this introduce?" prompt to `new-task` Step 5; route any non-zero answer through a 5-minute threat-model check before complexity is finalized |

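The "route any non-zero answer" rule in the second row could be sketched as a small gate over the task-spec text. This is a hypothetical illustration only: the prompt wording is taken from this retrospective, but the literal `[...]` list syntax, the function names, and the choice to treat a missing line as "check required" are assumptions, not the `new-task` skill's actual behaviour.

```python
import re

# Prompt labels as worded in this retrospective; the bracketed-list format is assumed.
PROMPTS = (
    "New authenticated endpoints introduced",
    "New persistent data introduced",
)


def parse_list(spec_text, prompt):
    """Return the items after `prompt: [a, b]`, or None if the line is missing."""
    pattern = rf"^{re.escape(prompt)}:\s*\[(.*?)\]\s*$"
    match = re.search(pattern, spec_text, re.MULTILINE)
    if match is None:
        return None
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


def needs_threat_model_check(spec_text):
    """True if either prompt is unanswered or lists at least one item."""
    answers = [parse_list(spec_text, prompt) for prompt in PROMPTS]
    # Assumption: an unanswered prompt is treated conservatively as "check required".
    return any(answer is None or len(answer) > 0 for answer in answers)
```

For example, a spec containing `New authenticated endpoints introduced: [/get-update]` would be routed to the check, while a spec with both lists empty would not.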

## Structural Snapshot

This is the first retro, so there is no delta to compute. The snapshot is persisted to `_docs/06_metrics/structure_2026-05-13.md` as a placeholder: `module-layout.md` has 5 conceptual sub-components but only **one** ownership boundary in the registry, so cross-component edge counting is degenerate for this workspace.

| Metric | Value | Source |
|--------|------:|--------|
| Components (registry) | 1 (`Admin API`) | `_docs/02_document/module-layout.md` |
| Conceptual sub-components | 5 | same |
| csproj projects | 5 | `Azaion.AdminApi.sln` (4 prod + 1 e2e) |
| Cycles in module graph | 0 | inspection (single deployable, no cross-component edges in the registry) |
| New Architecture violations this cycle | 0 | no `cumulative_review_batches_*.md` exists; verified by inspection of batch reviews — no Architecture-category findings |
| Resolved Architecture violations | 0 | — |
| Net Architecture delta | 0 | — |
| Public-API contract files (`_docs/02_document/contracts/`) | 0 | folder absent |
| Contract coverage % | n/a | n/a |

> Contract files are not part of this project's documentation set today. If future cycles introduce them (e.g., as part of a UI ↔ admin contract test effort), this section will start carrying real coverage numbers.

## Trend Comparison

| Metric | Previous | Current | Change |
|--------|----------|--------:|--------|
| Pass rate | n/a | 83% (5/6) | n/a |
| Avg findings per batch | n/a | 0.67 | n/a |
| Reverts | n/a | 1 | n/a |
| Carried-forward operational drifts | n/a | 8 | n/a |

## Top 3 Improvement Actions

1. **Add a security threat-model micro-step to `new-task` Step 5 (Acceptance Criteria)**
   - **What**: Two extra lines on every task spec — "New authenticated endpoints introduced: [list]" and "New persistent data introduced: [list]". If either is non-empty, the next sub-step is a 5-minute threat-model check (data flow, secrets exposure, replay surface). Output recorded in the task spec under `## Threat Model Notes`.
   - **Impact**: catches the AZ-183-style "endpoint exposes plaintext key" class of regression at planning time, before the 3-pt budget is committed. Saves at least one cycle of implement → security-audit → revert per occurrence.
   - **Effort**: low (skill text edit + template addition).

2. **Adopt the `_cycleN_` batch-report naming convention starting cycle 2**
   - **What**: Rename forward — every new batch report and code-review file in cycle 2+ uses `batch_NN_cycleM_report.md` and `batch_NN_cycleM_review.md`. Cycle-1 files stay as `batch_NN_report.md` for history. Update the `implement` skill's report-filename template.
   - **Impact**: prevents silent overwrite of cycle-1 batch reports when cycle 2's `batch_07` lands (it would currently collide with `batch_07_report.md` if that name were used). Already documented in the existing-code flow Step 10 — this enforces it.
   - **Effort**: low (one edit in `.cursor/skills/implement/`).

3. **File the 8 carried-forward deploy drifts as Jira tickets in cycle 2 backlog**
   - **What**: Drifts I, J, K, L, M, N, and O are real backlog items (coverage gates, automated migrations, metrics + tracing, central logs, exporter, zero-downtime deploy, remote SSH wrapper). The eighth carried-forward drift, E (perf auto-gating), is listed against I in the Deploy Audit table. All of them currently live only as references in `_docs/04_deploy/*.md`. Promote them to AZ-tickets with story points.
   - **Impact**: makes operational debt visible alongside feature work; protects against silent erosion of the deploy plan over multiple cycles.
   - **Effort**: medium (≈ 30 min of ticket creation + sizing).

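The naming rule in action 2 can be sketched as a small filename builder. This is a hypothetical helper, not the `implement` skill's actual template; the function name and its `cycle=None` convention for legacy cycle-1 names are assumptions, and the two-digit zero padding is inferred from names such as `batch_05` and `batch_07` used in this document.

```python
def batch_filename(batch, kind, cycle=None):
    """Build a batch report or review filename.

    cycle=None yields the legacy cycle-1 form (batch_NN_<kind>.md);
    any other value yields the cycle-qualified form (batch_NN_cycleM_<kind>.md).
    """
    if kind not in ("report", "review"):
        raise ValueError(f"unknown kind: {kind}")
    if cycle is None:
        return f"batch_{batch:02d}_{kind}.md"
    return f"batch_{batch:02d}_cycle{cycle}_{kind}.md"


legacy_name = batch_filename(7, "report")
cycle2_name = batch_filename(7, "report", cycle=2)
```

Because the cycle number is part of the name, a cycle-2 `batch_07` report can never resolve to the same path as a cycle-1 file with the same batch number.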

## Suggested Rule / Skill Updates

| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/new-task/SKILL.md` | Add Step 5.5 — "Threat-Model Micro-Check" with the two prompts above | AZ-183 revert (cycle 1) |
| `.cursor/skills/implement/SKILL.md` | Update batch-report filename template to `batch_NN_cycleM_report.md` (and review file analogously) | Naming-collision risk on cycle 2 |
| `.cursor/rules/coderule.mdc` | Add bullet: "Do not reuse retired numeric error codes (gaps are intentional)" | Batch 6 deletes codes 40 and 45 from `ExceptionEnum` — needs a rule so cycle 2 reviewers know not to fill the gap |
| `_docs/04_deploy/`-derived backlog | New AZ-* tickets for drifts I, J, K, L, M, N, O | Top action 3 above |

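The "do not reuse retired codes" rule in the third row could also be checked mechanically. The sketch below is an assumption-laden illustration: only the retired values 40 and 45 come from this document, while the function name and the sample "existing" codes in the usage note are invented. The real `ExceptionEnum` is C# and is not reproduced here.

```python
# Retired in Batch 6 according to the table above; gaps are intentional.
RETIRED_CODES = frozenset({40, 45})


def conflicting_codes(existing, proposed):
    """Return proposed codes that collide with retired or already-used values."""
    taken = set(existing) | RETIRED_CODES
    return sorted(code for code in proposed if code in taken)
```

With hypothetical existing codes `{10, 20, 30}`, proposing `[40, 46]` would flag `40` as a conflict and accept `46`.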

## Notes

- **First retrospective.** No prior baseline; cycle 2 will be the first one with delta numbers.
- **Cycle health**: green. 0 FAIL verdicts, 0 stuck agents, 0 auto-fix attempts, 44/44 E2E tests pass after Step 7's code edits. The single revert (AZ-183) was caught by the next-step security audit and resolved before deploy — the system worked, but the goal of the threat-model micro-check is to catch it one step earlier.
- **Operator burden after this cycle**: the 8 carried-forward drifts represent ≈ 22 story points of follow-up infrastructure work (rough sizing — to be confirmed when filed as tickets per Top Action 3).