Oleksandr Bezdieniezhnykh 3a925b9b0f
refactor: remove obsolete resource download and installer endpoints
- Deleted the `POST /resources/get/{dataFolder?}` and `GET /resources/get-installer` endpoints as part of the architectural shift towards simplified resource management.
- Removed associated methods and configurations, including `ResourcesService.GetEncryptedResource`, `ResourcesService.GetInstaller`, and related properties in `ResourcesConfig`.
- Cleaned up environment variables and configuration files to reflect the removal of installer-related settings.
- Eliminated the `GetResourceRequest` DTO and its validator, along with the `WrongResourceName` error code.
- Updated documentation to clarify the changes in resource handling and the retirement of per-user file encryption.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 04:17:55 +03:00


Retrospective — 2026-05-13 (Cycle 1, end of cycle)

Mode: cycle-end
Cycle: 1
Window: 2026-04-16 (Phase A baseline) → 2026-05-13 (Phase B feature cycle complete + Deploy)
Previous retro: N/A (first retrospective)

Implementation Summary

| Metric | Phase A (baseline) | Phase B (cycle 1) | Total |
|---|---|---|---|
| Total tasks | 7 | 4 | 11 |
| Total batches | 4 | 2 | 6 |
| Total complexity points | 29 | 11 | 40 |
| Avg tasks per batch | 1.75 | 2.0 | 1.83 |
| Avg complexity per batch | 7.25 | 5.5 | 6.67 |

Tasks per task spec: 1

Per-task complexity (Phase B): AZ-513 (3) + AZ-196 (2) + AZ-183 (3, reverted) + AZ-197 (3) = 11 points.
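
The per-batch averages in the summary follow directly from the totals. A minimal Python sketch of that arithmetic, using only figures quoted in this report (the variable and function names are illustrative and not part of the project's tooling):

```python
# Totals copied from the Implementation Summary.
phases = {
    "Phase A": {"tasks": 7, "batches": 4, "points": 29},
    "Phase B": {"tasks": 4, "batches": 2, "points": 11},
}

def per_batch(tasks, batches, points):
    """Return (avg tasks per batch, avg complexity per batch), rounded to 2 decimals."""
    return round(tasks / batches, 2), round(points / batches, 2)

# Sum each counter across both phases to get the Total column.
total = {key: sum(p[key] for p in phases.values()) for key in ("tasks", "batches", "points")}

summary = {name: per_batch(**vals) for name, vals in phases.items()}
summary["Total"] = per_batch(**total)

# Phase B points are the sum of the per-task estimates listed above.
phase_b_points = {"AZ-513": 3, "AZ-196": 2, "AZ-183": 3, "AZ-197": 3}
assert sum(phase_b_points.values()) == phases["Phase B"]["points"]
```

Note that the reverted AZ-183 still counts toward the 11 points, since the effort was spent even though the feature did not ship.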

Quality Metrics

Code Review Results

| Verdict | Count | % |
|---|---|---|
| PASS | 5 | 83% |
| PASS_WITH_WARNINGS | 1 | 17% |
| FAIL | 0 | 0% |

Findings by Severity (code review only — security audit findings counted separately below)

| Severity | Count | Source |
|---|---|---|
| Critical | 0 | |
| High | 0 | |
| Medium | 1 | batch_05 F1 (race on sequential serial) |
| Low | 3 | batch_05 F2/F3/F4 (uniqueness, key rotation, default empty key) |

Findings by Category

| Category | Count | Top Files |
|---|---|---|
| Bug | 1 | Azaion.Services/UserService.cs (RegisterDevice) |
| Maintainability | 3 | Azaion.Services/ResourceUpdateService.cs (×2), Azaion.AdminApi/appsettings.json |
| Spec-Gap | 0 | |
| Security | 0 (code review) / 13 (security audit) | |
| Performance | 0 | |
| Style | 0 | |
| Scope | 0 | |

Security Audit (out-of-band, post-implementation)

| Severity | Count | Status at end of cycle |
|---|---|---|
| Critical | 0 | |
| High | 3 | F-1 closed (OTA reverted), F-3 closed (UNIQUE INDEX), D-1 closed (Newtonsoft 13.0.4); 1 pre-existing (F-2 path traversal) deferred to AZ-516 |
| Medium | 5 | 0 closed in audit; recorded as AZ-517..AZ-520 |
| Low | 5 | 0 closed; recorded as AZ-521 (bundle) |

The audit found one regression introduced by cycle-1 work: F-1 (the /get-update endpoint from AZ-183 exposed plaintext encryption keys). The fix was a full revert of AZ-183. F-3 amplified a pre-existing race (RegisterDevice inserted rows with no UNIQUE INDEX behind it); the audit closed it by adding env/db/06_users_email_unique.sql and consolidating RegisterDevice so that it delegates row insertion to RegisterUser.
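
The F-3 fix pattern, letting the database enforce uniqueness instead of relying on a check-then-insert in application code, can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 with an in-memory database as a stand-in (the report does not name the project's database), and the table layout, index name, and register_user helper are assumptions, not the contents of env/db/06_users_email_unique.sql or the real RegisterUser.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# Hypothetical equivalent of the migration: uniqueness is enforced by the database.
conn.execute("CREATE UNIQUE INDEX users_email_unique ON users (email)")

def register_user(conn, email):
    """Insert a user row; return True if inserted, False if the email already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # A repeated or concurrent registration lost the race; no duplicate row is created.
        return False

first = register_user(conn, "device-001@example.com")
second = register_user(conn, "device-001@example.com")
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

With the index in place, the second insert fails at the database level, so there is no window between "check" and "insert" for two callers to slip through.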

Performance Test

| Verdict | NFT thresholds met | Coverage gaps |
|---|---|---|
| PASS | 2/2 (NFT-PERF-01 login p95=33 ms vs 500 ms; NFT-PERF-04 user-list p95=152 ms vs 1000 ms) | NFT-PERF-02/03 obsolete (OTA reverted); no /classes perf coverage yet |
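
A threshold check of this kind reduces to computing a percentile over latency samples and comparing it to a limit. A rough sketch, assuming a nearest-rank p95 and made-up sample data; the project's perf script and its percentile method are not shown in this report, so everything here except the 500 ms threshold is hypothetical:

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies in ms."""
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def check(name, samples_ms, threshold_ms):
    """Return (name, observed p95, "PASS" or "FAIL") for one NFT threshold."""
    observed = p95(samples_ms)
    return name, observed, "PASS" if observed <= threshold_ms else "FAIL"

# Made-up samples for illustration only: 19 fast requests and one slow outlier.
login_samples = list(range(10, 29)) + [900]
result = check("NFT-PERF-01", login_samples, threshold_ms=500)
```

A p95 gate tolerates a small share of outliers by design: the single 900 ms sample above does not fail the check, while a sustained slowdown would.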

Deploy Audit (this step)

| Drift | Severity | Resolved this cycle | Carried forward |
|---|---|---|---|
| A: host pulls :latest, CI never produces it | Medium | yes | |
| B: no secret manager | Medium | yes (sops + age) | |
| C: container runs as root | Medium | yes (USER app) | |
| D: stale .woodpecker/build-arm.yml reference | Low | yes (doc + actual files audited) | |
| E: perf script run-on-demand | Low | spec'd; auto-gating deferred | I |
| F: no vulnerable-dep gate | Low | yes (deps-audit step) | |
| G: unused docker.test/Dockerfile | Low | yes (deleted) | |
| H: TCP-only healthcheck in test compose | Low | yes (curl /health/live) | |
| I: no coverage threshold | Low | | yes |
| J: manual DB migrations | Low | | yes |
| K: no metrics / tracing implemented | Medium | spec only | yes |
| L: no central log aggregator | Low | | yes |
| M: no tracing exporter | Low | | yes |
| N: no zero-downtime deploy | Medium | | yes |
| O: no remote SSH wrapper | Low | | yes |

7 resolved this cycle, 8 carried forward.

Efficiency Metrics

| Metric | Value | Notes |
|---|---|---|
| Blocked tasks | 0 | |
| Tasks requiring fixes after review | 0 | All findings deferred or descoped, none required cycle re-entry |
| Auto-fix attempts triggered | 0 | Across all 6 batches |
| Stuck agents | 0 | |
| Reverts after main code shipped | 1 | AZ-183: same-day revert after security audit finding F-1 |
| Skipped tests with documented reason | 1 | AZ-195 AC-1 (DB recovery test needs Docker socket access) |
| Test pass rate (E2E suite, end of Step 7) | 44/44 | After Dockerfile + healthcheck changes |

Blocker Analysis

No blockers, but two notable mid-cycle pivots:

| Event | Type | Prevention idea |
|---|---|---|
| User clarified mid-implement (2026-05-13) that the Loader is architecturally retired → AZ-197 was rescoped from cross-workspace to admin-only | Spec ambiguity discovered late | Add an "implicit assumptions" review gate to new-task Step 5 (Acceptance Criteria) that explicitly asks: which other workspaces does this touch? Are they still active? |
| Security audit found AZ-183 ships plaintext encryption keys → entire feature reverted same day | Threat model gap not caught at planning | Add a lightweight "what new authenticated endpoints / persistence does this introduce?" prompt to new-task Step 5; route any non-zero answer through a 5-minute threat-model check before complexity is finalized |

Structural Snapshot

This is the first retro, so there is no delta to compute. The snapshot is persisted to _docs/06_metrics/structure_2026-05-13.md as a placeholder: module-layout.md has 5 conceptual sub-components but only one ownership boundary in the registry, so cross-component edge counting is degenerate for this workspace.

| Metric | Value | Source |
|---|---|---|
| Components (registry) | 1 (Admin API) | _docs/02_document/module-layout.md |
| Conceptual sub-components | 5 | same |
| csproj projects | 5 | Azaion.AdminApi.sln (4 prod + 1 e2e) |
| Cycles in module graph | 0 | inspection (single deployable, no cross-component edges in the registry) |
| New Architecture violations this cycle | 0 | no cumulative_review_batches_*.md exists; verified by inspection of batch reviews, which contain no Architecture-category findings |
| Resolved Architecture violations | 0 | |
| Net Architecture delta | 0 | |
| Public-API contract files (_docs/02_document/contracts/) | 0 | folder absent |
| Contract coverage % | n/a | n/a |

Contract files are not part of this project's documentation set today. If future cycles introduce them (e.g., as part of a UI ↔ admin contract test effort), this section will start carrying real coverage numbers.

Trend Comparison

| Metric | Previous | Current | Change |
|---|---|---|---|
| Pass rate | n/a | 83% (5/6) | n/a |
| Avg findings per batch | n/a | 0.67 | n/a |
| Reverts | n/a | 1 | n/a |
| Carried-forward operational drifts | n/a | 8 | n/a |

Top 3 Improvement Actions

  1. Add a security threat-model micro-step to new-task Step 5 (Acceptance Criteria)

    • What: Two extra lines on every task spec — "New authenticated endpoints introduced: [list]" and "New persistent data introduced: [list]". If either is non-empty, the next sub-step is a 5-minute threat-model check (data flow, secrets exposure, replay surface). Output recorded in the task spec under ## Threat Model Notes.
    • Impact: catches the AZ-183-style "endpoint exposes plaintext key" class of regression at planning time, before the 3-pt budget is committed. Saves at least one cycle of implement → security-audit → revert per occurrence.
    • Effort: low (skill text edit + template addition).
  2. Adopt the _cycleN_ batch-report naming convention starting cycle 2

    • What: Rename forward — every new batch report and code-review file in cycle 2+ uses batch_NN_cycleM_report.md and batch_NN_cycleM_review.md. Cycle-1 files stay as batch_NN_report.md for history. Update the implement skill's report-filename template.
    • Impact: prevents silent overwrite of cycle-1 batch reports when cycle 2's batch_07 lands (would currently collide with batch_07_report.md if that name was used). Already documented in the existing-code flow Step 10 — this enforces it.
    • Effort: low (one edit in .cursor/skills/implement/).
  3. File the 8 carried-forward deploy drifts as Jira tickets in cycle 2 backlog

    • What: I, J, K, L, M, N, O are real backlog items (coverage gates, automated migrations, metrics + tracing, central logs, exporter, zero-downtime deploy, remote SSH wrapper); the eighth carried-forward drift, E's deferred auto-gating, is listed under I in the Deploy Audit table. They currently live only as references in _docs/04_deploy/*.md. Promote them to AZ-tickets with story points.
    • Impact: makes operational debt visible alongside feature work; protects against silent erosion of the deploy plan over multiple cycles.
    • Effort: medium (≈ 30 min of ticket creation + sizing).
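
The trigger rule from action 1 is simple enough to state as code. A sketch, assuming the two new spec fields are captured as lists; the TaskSpec class and its field names are hypothetical and not part of the new-task skill, and listing /get-update under AZ-183 is an assumption about how that spec would have been filled in:

```python
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    """Hypothetical subset of a task spec relevant to the micro-check."""
    task_id: str
    new_authenticated_endpoints: list = field(default_factory=list)
    new_persistent_data: list = field(default_factory=list)

def needs_threat_model(spec):
    """True if either prompt has a non-empty answer, per the rule in action 1."""
    return bool(spec.new_authenticated_endpoints or spec.new_persistent_data)

# AZ-183 introduced /get-update (audit finding F-1), so it would have been routed to the check.
az_183 = TaskSpec("AZ-183", new_authenticated_endpoints=["/get-update"])
# Made-up task that adds no endpoints and no persistent data.
docs_only = TaskSpec("EXAMPLE-1")
```

Keeping the rule this mechanical means the 5-minute check is only paid for by tasks that actually add attack surface.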

Suggested Rule / Skill Updates

| File | Change | Rationale |
|---|---|---|
| .cursor/skills/new-task/SKILL.md | Add Step 5.5, "Threat-Model Micro-Check", with the two prompts above | AZ-183 revert (cycle 1) |
| .cursor/skills/implement/SKILL.md | Update batch-report filename template to batch_NN_cycleM_report.md (and review file analogously) | Naming-collision risk on cycle 2 |
| .cursor/rules/coderule.mdc | Add bullet: "Do not reuse retired numeric error codes (gaps are intentional)" | Batch 6 deletes codes 40 and 45 from ExceptionEnum; a rule is needed so cycle 2 reviewers know not to fill the gap |
| _docs/04_deploy/-derived backlog | New AZ-* tickets for drifts I, J, K, L, M, N, O | Top action 3 above |
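
The filename convention proposed for the implement skill can be illustrated with a small helper. This is a sketch under assumptions: the function names are invented, the batch number is taken to be zero-padded to two digits (NN), and the cycle number is taken to be unpadded (M); the actual template in .cursor/skills/implement/ may differ.

```python
import re

def batch_filename(batch, cycle, kind="report"):
    """Build a cycle-qualified batch file name, e.g. batch_07_cycle2_report.md."""
    if kind not in ("report", "review"):
        raise ValueError(f"unknown kind: {kind!r}")
    return f"batch_{batch:02d}_cycle{cycle}_{kind}.md"

# Cycle-1 files keep the legacy form; cycle 2+ files carry the cycle number.
LEGACY = re.compile(r"^batch_(\d{2})_(report|review)\.md$")
CYCLED = re.compile(r"^batch_(\d{2})_cycle(\d+)_(report|review)\.md$")

def parse(name):
    """Return (batch, cycle, kind); legacy names are treated as cycle 1."""
    if m := CYCLED.match(name):
        return int(m.group(1)), int(m.group(2)), m.group(3)
    if m := LEGACY.match(name):
        return int(m.group(1)), 1, m.group(2)
    raise ValueError(f"not a batch file name: {name!r}")
```

Because legacy names parse as cycle 1, cycle-1 files can stay untouched while any tooling that reads both forms still gets a (batch, cycle, kind) triple.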

Notes

  • First retrospective. No prior baseline; cycle 2 will be the first one with delta numbers.
  • Cycle health: green. 0 FAIL verdicts, 0 stuck agents, 0 auto-fix attempts, 44/44 E2E tests pass after Step 7's code edits. The single revert (AZ-183) was caught by the next-step security audit and resolved before deploy — the system worked, but the goal of the threat-model micro-check is to catch it one step earlier.
  • Operator burden after this cycle: the 8 carried-forward drifts represent ≈ 22 story points of follow-up infrastructure work (rough sizing — to be confirmed when filed as tickets per Top Action 3).