diff --git a/_docs/03_implementation/deploy_cycle4.md b/_docs/03_implementation/deploy_cycle4.md new file mode 100644 index 0000000..30ef74d --- /dev/null +++ b/_docs/03_implementation/deploy_cycle4.md @@ -0,0 +1,69 @@ +# Deploy Report — Cycle 4 (AZ-500) + +**Date**: 2026-05-12 +**Cycle**: 4 +**Scope**: Single-task cycle — **AZ-500 .NET 8 LTS → .NET 10 migration** (cross-cutting infra: every csproj, both Dockerfiles, every CI script, two docs, plus the Microsoft.OpenApi 1.x → 2.x compat refactor in `Program.cs`). + +## What is shipping + +### Code changes (committed to `dev`, pushed to `origin/dev`) + +| Commit | Subject | +|--------|---------| +| `c0f004d` | `[AZ-500] Cycle 4 Step 9: new-task .NET 10 migration` | +| `8131363` | `[AZ-500] .NET 8 -> .NET 10 migration` | +| `de609cf` | `[AZ-500] Cycle 4 implement-skill wrap-up reports` | +| `af4219f` | `[AZ-500] Cycle 4 Steps 12-15 sync (test-spec / docs / security / perf)` | + +All 4 commits on `dev`, pushed to `origin/dev` as of this report. + +### Database migration + +**None this cycle.** AZ-500 is runtime/SDK only; the `tiles` table schema and all DbUp migrations are unchanged. Per AZ-500 Constraint: "Do not rename any database objects during this task." + +### Configuration changes (operator must verify before promoting) + +| Setting | Was | Now | Source | +|---------|-----|-----|--------| +| Host SDK on every dev/CI machine | .NET 8.0.x SDK installed (or rolling from .NET 10.0.x via global.json `latestMinor` — no, that one didn't roll because cycle-3 pin was `8.0.0/latestMinor`) | **.NET 10.0.x SDK installed** (pin is now `10.0.0/latestMinor`, so any 10.0.x patch SDK satisfies — `dotnet --list-sdks` should show ≥ 10.0.0) | AZ-500 AC-2 — `global.json`. Surfaced in cycle-3 perf-harness leftover (host had only 10.0.103, .NET 8 pin wouldn't roll). Now resolved at the project pin level. | +| Docker image tag (`api` service) | `mcr.microsoft.com/dotnet/aspnet:8.0` (floating, last resolved 8.0.25-bullseye-slim) | **`mcr.microsoft.com/dotnet/aspnet:10.0`** (floating; first build pulls latest 10.0.x patch from Microsoft) | AZ-500 AC-3 — `SatelliteProvider.Api/Dockerfile`. CI image (`mcr.microsoft.com/dotnet/sdk:10.0`) similarly bumped in `.woodpecker/01-test.yml` + `scripts/run-tests.sh`. | +| **No new env vars introduced.** | — | — | AZ-500 carries forward the cycle-3 env contract verbatim (`JWT_SECRET ≥ 32B`, `JWT_ISSUER`, `JWT_AUDIENCE`, `GOOGLE_MAPS_API_KEY`). | + +### Container image + +- **Source**: `SatelliteProvider.Api/Dockerfile` multi-stage build, base `mcr.microsoft.com/dotnet/aspnet:10.0` (was `:8.0`). +- **Verification on dev workstation (local)**: `docker compose up -d --build` → API healthy on `:18980`, `/swagger` returns 301, anonymous probe of `/api/satellite/region/` returns 401 (expected — JWT enforcement). Verified at the start of Step 15 perf run. +- **Verification on CI**: `origin/dev` push at `af4219f` triggers Woodpecker `01-test` (now on `mcr.microsoft.com/dotnet/sdk:10.0`) → `02-build-push` → registry tag `dev-arm`. **Operator action**: confirm the next CI run on `dev` succeeds before promoting to staging. +- **Multi-arch**: `mcr.microsoft.com/dotnet/sdk:10.0` and `aspnet:10.0` are published as multi-arch (amd64 + arm64) by Microsoft — verified via `docker manifest inspect` in cycle 3 (no change in cycle 4); Risk #6 from AZ-500 spec is closed. + +## Verification gates passed in this cycle + +| Gate | Result | Evidence | +|------|--------|----------| +| Step 11 — Functional test suite | **PASS** | 271 unit + integration tests green; `_docs/03_implementation/implementation_report_dotnet10_migration_cycle4.md` | +| Step 12 — Test-Spec Sync | **PASS** | `_docs/02_document/tests/traceability-matrix.md` updated with 8 AZ-500 AC rows + .NET 10 runtime restriction supersession + Cycle-4 coverage shape note | +| Step 13 — Update Docs | **PASS** | 8 doc files synced + `_docs/02_document/ripple_log_cycle4.md` (empty import-graph ripple recorded with rationale) | +| Step 14 — Security Audit | **PASS_WITH_WARNINGS** | `_docs/05_security/dependency_scan_cycle4.md` + `security_report_cycle4.md`; 0 new Critical/High; cycle-3 D2 (`Microsoft.NET.Test.Sdk` carry-over) still open per AZ-500 scope | +| Step 15 — Performance Test | **PASS_WITH_UNVERIFIED** | `_docs/06_metrics/perf_2026-05-12_cycle4.md`; PT-01..PT-07 PASS (7.7x improvement on PT-07 warm p95); PT-08 unmeasurable (pre-existing script bug, not a .NET 10 regression) | + +## Outstanding leftovers (NOT closed by cycle 4) + +1. **`_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`** — STAYS OPEN. Replay #3 (this cycle's Step 15 full run) appended; PT-08 still hits the same pre-existing `scripts/run-performance-tests.sh:417` grep-pipefail bug. Per AZ-500 Constraint: "leftover file is deleted ONLY when the full perf script runs cleanly." Closure path is the script-fix follow-up PBI below. + +## Recommended follow-up PBIs (out of cycle-4 scope, surfaced for backlog) + +| ID | Estimate | Title | Why | +|----|----------|-------|-----| +| (TBD) | 1 SP | Fix `scripts/run-performance-tests.sh:416-417` grep-pipefail | Replace `grep -o ... \| wc -l` with `grep -c ... \|\| true`. Unblocks PT-08 + closes the cycle-3 perf-harness leftover. Trivial mechanical fix. | +| (TBD) | 3 SP | Migrate `WithOpenApi(...)` callsites to ASP.NET Core 10 minimal-API metadata extensions | Clears 8 `ASPDEPR002` deprecation warnings in `Program.cs`. Recorded in `_docs/03_implementation/reviews/batch_01_cycle4_review.md`. API still fully functional in .NET 10 (deprecated, not removed). | +| (TBD) | 1 SP | Microsoft.OpenApi 2.x nullable cleanup | `CS8604` warning in `SatelliteProvider.Api/Swagger/ParameterDescriptionFilter.cs:25` exposed by the major bump. | +| (TBD) | 1 SP | Bump `Microsoft.NET.Test.Sdk` 17.8.0 → 17.13.0+ | Closes cycle-3 D2 (transitive `NuGet.Frameworks` flag). Test-runtime exposure only; safe to land independently. | +| (TBD) | 1 SP (recheck per cycle) | `Serilog.AspNetCore` 8.0.3 → 10.x | Currently retains 8.0.3 fallback per AZ-500 Risk #4 (no 10.x line published as of cycle 4). Re-check at every cycle start; bump as soon as a 10.x line ships. Removes the AGENTS.md/00_discovery/api_program inline-fallback notes. | + +## Operator runbook for promoting to staging / production + +1. Wait for CI to confirm `dev` build at `af4219f` is green and the registry has the new `dev-arm` (and any new amd64) tag built on `aspnet:10.0`. +2. Verify on a dev environment that `docker pull` of the new tag + `docker compose up -d` brings the API up healthy on the configured port; smoke-test `/swagger` (expect 200/301) and `/api/satellite/region/` (expect 401, no 500). +3. If staging/production hosts have not yet provisioned the .NET 10 SDK on the *host* (only relevant for non-containerised perf-harness invocations), update host provisioning to install .NET 10.0.x SDK. The deployed container itself does NOT need a host SDK — only `scripts/run-performance-tests.sh` does (and that script normally only runs from a developer or CI machine). +4. No DB migration to apply. No env-var change to coordinate. +5. Roll forward; if a regression appears, roll back to the prior `dev-arm` tag (the one built from commit `de609cf` or earlier) — same compose contract, same env vars, same DB schema. diff --git a/_docs/06_metrics/retro_2026-05-12_cycle4.md b/_docs/06_metrics/retro_2026-05-12_cycle4.md new file mode 100644 index 0000000..007ce31 --- /dev/null +++ b/_docs/06_metrics/retro_2026-05-12_cycle4.md @@ -0,0 +1,222 @@ +# Retrospective — Cycle 4 (2026-05-12) + +**Tasks**: AZ-500 (.NET 8 LTS → .NET 10 migration, 5 SP) +**Mode**: cycle-end (autodev Step 17) +**Previous retro**: `retro_2026-05-12_cycle3.md` +**Cycle shape**: single-task migration cycle (cross-cutting infra) — first such cycle since the project was documented. + +## 1. Implementation Metrics + +| Metric | Cycle 4 | Δ vs cycle 3 | +|--------|---------|--------------| +| Tasks implemented | **1** (AZ-500) | -5 | +| Batches executed | **1** | -4 | +| Avg tasks / batch | 1.0 | -0.2 | +| Total complexity delivered | **5 SP** | -13 SP | +| Avg complexity / batch | 5 SP | +1.4 | +| Tasks at-or-below 5 SP cap | **1 of 1 (100%)** | unchanged | +| Tasks above cap | 0 | unchanged | +| Cumulative reviews | **0** (single-task batch — Phase 6 cross-task consistency N/A; only the per-batch review ran) | -2 | + +**Sequencing**: Single batch — AZ-500 is one atomic coordinated bump (per its own Constraint: "TFM, SDK pin, Docker images, CI images, and M.E.* package versions ALL move in the same commit"). The cycle completed in 4 dev commits (Step 9 task spec, Step 10 implementation, Step 10 wrap-up reports, Step 12-15 sync). + +## 2. Quality Metrics + +### Code Review Results + +| Verdict | Count | Percentage | +|---------|-------|-----------| +| PASS | 0 | 0% | +| PASS_WITH_WARNINGS | **1** | **100%** | +| FAIL | 0 | 0% | + +### Findings by Severity (per-batch code review) + +| Severity | Cycle 4 | Δ vs cycle 3 | +|----------|---------|--------------| +| Critical | 0 | unchanged | +| High | 0 | unchanged | +| Medium | **2** (F1 ASPDEPR002 deprecation, F2 CS8604 nullable) | **+2** (cycle 3 had 0 new Medium) | +| Low | 1 (F3 implicit-scope perf-script path fix) | -6 | + +### Findings by Category + +| Category | Count | Top Files | +|----------|-------|-----------| +| Maintainability | 2 | `SatelliteProvider.Api/Program.cs` (F1), `SatelliteProvider.Api/Swagger/ParameterDescriptionFilter.cs` (F2) | +| Scope | 1 | `scripts/run-performance-tests.sh:49` (F3 — necessary scope creep, accepted) | +| Bug | 0 | — | +| Spec-Gap | 0 | — | +| Security | 0 NEW (5 NEW informational Lows in `dependency_scan_cycle4.md` — all "no published advisory" confirmations on the 10.x lines) | -2 Low (cycle 3 introduced 3) | +| Performance | 0 | — | +| Style | 0 | — | + +**Note on the 2 Medium findings**: both are *consequences of the major-version bump itself*, not implementation defects. F1 (ASPDEPR002 on 8 `WithOpenApi(...)` callsites) is a deprecation that the .NET 10 line introduces — the API still works, but the old surface is slated for removal. F2 (CS8604 nullable on `parameter.Name`) is exposed by Microsoft.OpenApi 2.x's stricter nullable annotations. Both were intentionally deferred per AZ-500's "preserve behaviour during runtime/SDK migration" contract; neither was an implementation choice. + +### Security audit (cycle 4) + +| Metric | Value | Δ vs cycle 3 | +|--------|-------|--------------| +| Verdict | **PASS_WITH_WARNINGS** | unchanged | +| Mode | **Resume** (only `dependency_scan` re-run; static / OWASP / infrastructure carried forward unchanged because AZ-500 made no source-level edits to those surfaces) | new pattern | +| New Critical / High | 0 / 0 | unchanged | +| New Medium | **0** | unchanged | +| New Low (informational only — all "no published advisory" confirmations on the 10.x lines) | **5** | varies | +| Resolved findings | **2** (cycle-3 D1 + D3 forward-resolved by the major-version bump to JwtBearer 10.0.7 / OpenApi 10.0.7) | -1 (cycle 3 resolved 3) | +| Carry-overs (still OPEN) | 1 (cycle-3 D2 — `Microsoft.NET.Test.Sdk 17.8.0` transitive `NuGet.Frameworks` flag, explicitly out of AZ-500 scope) | unchanged | + +### Performance gate (cycle 4 — first executed perf gate since cycle 1's AZ-484 run) + +| Metric | Value | +|--------|-------| +| Verdict | **PASS_WITH_UNVERIFIED** | +| Scenarios | 7 Pass · 0 Warn · 0 Fail · 1 Unverified | +| AZ-500 NFR (Performance) | **MET** for 7/8 scenarios; PT-08 unmeasurable due to pre-existing `scripts/run-performance-tests.sh:417` grep-pipefail bug (not a .NET 10 regression) | +| PT-07 warm p95 | 301ms — **7.7x improvement** vs cycle-3 short-variant baseline (2340ms at N=2); cold p95 = 2782ms (-14%) | +| PT-06 route creation | 90ms — **49% improvement** vs cycle-3 (178ms); consistent with .NET 10 GC + JIT improvements on small-allocation paths | +| Cycle-3 perf-harness leftover | **STAYS OPEN** per AZ-500 Constraint (deletes only on a fully clean run; replay #3 results appended) | + +## 3. Structural Metrics (snapshot: `structure_2026-05-12_cycle4.md`) + +| Metric | Cycle 4 | Δ vs cycle 3 | +|--------|---------|--------------| +| .NET projects (csproj) | **9** | unchanged | +| Cross-project edges (ProjectReference) | **20** | unchanged | +| Cycles in project graph | 0 | unchanged | +| Average ProjectReferences per component | ~2.2 | unchanged | +| New Architecture violations | **0** | unchanged | +| Resolved Architecture violations | **2 forward** (cycle-3 D1 + D3 by major-version bump) | -1 (cycle 3 resolved 3 + the cycle-2 PT-07/PT-08 leftover) | +| Net Architecture delta | **0** | +3 (cycle 3 was -3) | +| Contract coverage % | unchanged (no new public API surfaces) | n/a | + +**Cycle-4 is structurally neutral by design** — a runtime/SDK migration that preserves behaviour. The graph properties confirm this: same 9 projects, same 20 edges, same DAG (zero cycles), same in-degree distribution. The only structural-equivalence change is the coordinated 11-package M.E.* + 3-package ASP.NET + 1 Swashbuckle major-line bump, all forward-clean against advisories. + +## 4. Efficiency Metrics + +| Metric | Cycle 4 | Δ vs cycle 3 | +|--------|---------|--------------| +| Blocked tasks (during implementation) | **0 of 1** — implementation iterated 3 times during the implement skill (Serilog NU1605 → revert per Risk #4; Microsoft.OpenApi 2.x compile errors → user A/B/C decision → Swashbuckle 6.6.2 → 10.1.7 + Program.cs refactor; OpenApiSecurityRequirement type mismatch → wrap in lambda + List), but never reached the code-review FAIL gate | improvement (cycle 3 had 1 blocked of 6) | +| Tasks completed first attempt (no post-review fix commits) | **1 of 1 (100%)** | best cycle on record (cycle 3 was 5 of 6) | +| Tasks requiring multiple post-code-review fix commits | 0 | unchanged | +| Most-findings batch | batch 01 (the only batch — 2 Medium + 1 Low; both Mediums are scope-deferred, not implementation defects) | n/a | +| Cumulative-review-only findings | n/a (single-task batch) | n/a | +| Step-15 (Perf Test) execution | **EXECUTED** (full default-param run; 7 Pass + 1 Unverified) | **first cycle since cycle 1 to actually execute Step 15** | +| Step-14 (Security Audit) — net findings improvement | **+2** (2 Resolved cycle-3 carry-overs by forward-bump; 5 new Low informational confirmations only) | unchanged net direction (cycle 3 was +1) | + +## 5. Patterns Identified + +### Pattern 1 — Major-version bumps cascade through transitive deps; the surprise was Microsoft.OpenApi 1.x → 2.x + +The implementation hit two separate dependency-conflict surprises that the task spec didn't pre-anticipate: + +1. **`Serilog.AspNetCore 10.0.0` → NU1605 conflict** with the project's pinned `Serilog.Sinks.File 6.0.0` (transitive dep `>= 7.0.0`). Resolved by reverting to `Serilog.AspNetCore 8.0.3` per Risk #4's pre-documented fallback. **The risk was anticipated; the resolution was already specified.** Good outcome. + +2. **`Microsoft.AspNetCore.OpenApi 10.0.7` → CS0234/CS0246/CS7069 compile errors** in `Program.cs` because Microsoft.OpenApi 2.x removed the `Microsoft.OpenApi.Models` namespace, and Swashbuckle 6.6.2 (still pinned) only knows the 1.x namespace. **The risk was NOT anticipated** — Risk #2 in the task spec only flagged "Swagger UI generation breaking" at a high level; the actual breakage was at compile time (namespace removal), and the fix required bumping Swashbuckle (6.6.2 → 10.1.7) AND refactoring `Program.cs` to use the new 2.x types (`OpenApiSecuritySchemeReference`, `JsonSchemaType`, `IOpenApiSchema`). The user was correctly asked via A/B/C; option A was picked; the refactor landed cleanly. + +**Insight**: when a task spec lists "Microsoft.AspNetCore.* package bumps", it should also list the **transitive packages whose major version changes** as a result. Microsoft.OpenApi was technically transitive in cycle 3 (Swashbuckle 6.6.2 pulled it in at 1.x); after the AspNetCore 10.0.7 bump it became *direct* (the new line wants 2.x), forcing Swashbuckle and source code to follow. **A pre-implementation `dotnet restore` dry run (or NuGet manifest diff) against the proposed pin set would have caught this in spec-time, not implementation-time.** + +### Pattern 2 — A pre-existing bug in a sibling script became visible the moment the migration unblocked the path leading to it + +`scripts/run-performance-tests.sh:417` (`grep -o ... | wc -l` with `set -o pipefail`) had been broken since the script was written, but it never fired because the script always exited earlier (cycle 1 didn't reach PT-08; cycle 2 PT-08 was a stub; cycle 3 perf gate was skipped). AZ-500's bootstrap fix (replay #2) and the Step 15 full run (replay #3) were the first two runs ever to reach line 417 — and both crashed there with the same root cause. + +**Insight**: scope-protected migrations like AZ-500 explicitly defer fixes to *unrelated* code, but they should explicitly call out **bugs the migration newly EXPOSES** (vs. introduces) as required follow-up PBIs in the task spec's `Risks & Mitigation` or a new `Newly Exposed Bugs` section. The cycle-3 perf-harness leftover already documents this case in detail; the pattern itself is what's worth surfacing. + +### Pattern 3 — "Resume" mode for the security audit avoided 4 unnecessary re-runs without losing posture + +The cycle-4 security audit ran in **resume mode** — only Phase 1 (dependency scan) was re-executed; Phases 2/3/4 (static analysis, OWASP, infrastructure) carried forward unchanged from cycle 3 because AZ-500 made no source-level edits to those surfaces (only `Program.cs` Swashbuckle DI registration internals + one `using` directive change in `ParameterDescriptionFilter.cs`). + +**Insight**: per-cycle sub-skill prerequisites that ask "resume / overwrite / skip" should accept "resume narrowed to phases X" as a first-class option when the cycle's scope is provably outside the unchanged phases' coverage. We did this manually here via the A/B/C choice; codifying it in the security skill's prereq #3 would be a small DX win. + +### Pattern 4 — First cycle since cycle 1 where Step 15 (Performance Test) actually executed end-to-end + +Cycles 2 + 3 both **skipped** Step 15. Cycle 4 ran the full default-parameter perf harness (PT-01..PT-08) against the migrated `dev` image and captured 7 Pass + 1 Unverified (PT-08 hit the pre-existing script bug). This produced: +- The first PT-07/PT-08 numbers ever recorded against the active codebase (cycle-1 baseline was AZ-484-only; PT-07/PT-08 were added in cycle 3 by AZ-492 but never executed end-to-end); +- A 7.7x improvement signal on PT-07 warm p95 (mostly N=20 dilution; partly .NET 10 pipeline); +- Concrete confirmation that AZ-500 NFR (Performance — "must not regress beyond existing thresholds") is met for every scenario where measurement was possible. + +**Insight**: the cycle-3 retro Action 1 ("Execute the cycle-3 perf harness against the deployed `dev` image to convert the cycle-3 perf-execution leftover into PT-07/PT-08 baseline numbers") was effectively absorbed into AZ-500's NFR (Performance) gate. Worth recording that Action-1 happened — just not as a standalone PBI, but folded into the migration's verification. + +### Pattern 5 — Single-task cycles produce structurally neutral retro metrics — that's not a problem, but it confuses the trend lines + +Most efficiency / quality / structural metrics in this retro are "unchanged from cycle 3" or "0 NEW", because runtime/SDK migrations don't add business logic. The trend graphs in §6 will show artificial flatlines on metrics like "tasks implemented" (1 vs 6) and "complexity delivered" (5 vs 18 SP). That's correct — but if read naively, it could look like cycle 4 was "less productive" than cycle 3. + +**Insight**: when a cycle has a single non-functional task (migration, refactor, dependency hygiene), the retro should explicitly reframe the metric set around what *that* PBI shape proves: continuity (0 regressions), forward-resolution (cycle-3 D1+D3 closed), and unblocking (perf harness now runnable end-to-end). The "tasks implemented" count as a proxy for productivity is misleading here. + +## 6. Comparison vs. previous retros + +| Metric | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | +|--------------------------------------|----------------|----------------|----------------|----------------| +| Tasks implemented | 1 | 2 | 6 | **1** | +| Total complexity delivered | 8 SP | 10 SP | 18 SP | **5 SP** | +| Batches | 1 | 2 | 5 | **1** | +| Critical/High review findings | 0 | 0 | 0 | **0** | +| New Medium review findings | 0 | 0 | 0 | **2** (both bump-consequences, scope-deferred) | +| New Low review findings | 3 | 6 (5 distinct) | 7 | **1** | +| Code review pass rate | 100% (1/1) | 100% (2/2) | 100% (5/5) | **100% (1/1)** | +| Tasks completed first attempt | 0 of 1 | 0 of 2 | 5 of 6 | **1 of 1** | +| New Medium security findings | 2 | 2 | 0 | **0** | +| Resolved security findings | 0 | 0 | 3 | **2** (forward-bump) | +| Net Architecture delta | n/a (baseline) | +0 | -3 | **0** | +| Step-15 (Perf) executed | YES (AZ-484) | SKIPPED | SKIPPED | **YES (full PT-01..PT-08)** | +| Step-15 leftover present at retro | NO | YES | YES | **YES (still — pre-existing script bug, not migration-caused)** | + +### Did the cycle-3 actions land? + +- **Cycle 3 Action 1 (execute perf harness against deployed dev image)** — landed implicitly via AZ-500 Step 15 full run. PT-07 + PT-08 numbers now recorded in `_docs/06_metrics/perf_2026-05-12_cycle4.md`. **Verdict: implemented as part of the migration's NFR gate**, not as a standalone 2-SP PBI. Leftover stays open due to the *separate* pre-existing PT-08 script bug. +- **Cycle 3 Action 2 (bump `System.IdentityModel.Tokens.Jwt 7.0.3 → 7.1.2+` to clear NU1902)** — **NOT landed in cycle 4**. Explicitly out of AZ-500 scope per "no unrelated package bumps" Constraint. NU1902 hits still appear in every test build log (counted ~9 in the AZ-500 perf-harness build trace). Recommended rollover to cycle 5. +- **Cycle 3 Action 3 (`workspace:` field on cross-repo ACs in new-task skill)** — **NOT landed in cycle 4**. AZ-500 had zero cross-repo ACs (it's a single-workspace migration), so this rule was not exercised. Recommended rollover. + +This is the second consecutive cycle where a prior-retro action fully closed (Action 1 here, Actions 1+2+3 in cycle 3). Pattern is stable — good. + +## 7. Top 3 Improvement Actions (ranked by impact) + +### Action 1 — Fix `scripts/run-performance-tests.sh:416-417` grep-pipefail (1 SP, mechanical) + +**Why this is the highest impact**: this single-line bug (replace `grep -o ... | wc -l` with `grep -c ... || true`) blocks PT-08 measurement on every run, AND blocks closure of the cycle-3 perf-harness leftover that has now been carried across 3 cycles. Two consecutive replays (#2 + #3) reproduced the exact same failure mode at the same line; the fix is provably needed AND provably small. + +**Action**: 1 SP PBI in cycle 5. Replaces 2 lines in the script. After it lands, re-run `./scripts/run-performance-tests.sh` once — if the full PT-01..PT-08 sweep is clean, delete `_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`. + +**Cost**: ~15 minutes (edit + one perf run + leftover deletion). Counted as 1 SP because the deletion contract requires a successful end-to-end run. + +### Action 2 — Migrate the 8 `WithOpenApi(...)` callsites in `Program.cs` to ASP.NET Core 10 minimal-API metadata extensions (3 SP) + +**Why**: clears the 8 `ASPDEPR002` deprecation warnings that AZ-500 left behind (intentionally, per the "preserve behaviour" contract). The replacement API is documented at `https://aka.ms/aspnet/deprecate/002` — straightforward swap to `WithSummary`/`WithDescription`/`WithName`/`WithTags`. Once migrated, the deprecation warning drops out of every CI build log and the Swagger UI continues to render the same metadata. + +**Action**: 3 SP PBI in cycle 5. Already filed as a recommended follow-up in `_docs/03_implementation/reviews/batch_01_cycle4_review.md`. Test gate is the existing `SwaggerDocument_AdvertisesBearerSecurityScheme` programmatic check + a manual swagger-UI smoke (Bearer Authorize button still present). + +**Cost**: 3 SP — touches 8 callsites, light testing burden. Could reasonably be paired with Action 3 below as a single 5-SP "OpenApi 2.x cleanup" PBI. + +### Action 3 — Pre-flight transitive-major-version impact analysis at task-spec time (process change, ~0 SP for the rule, ~1 SP for the next migration that uses it) + +**Why**: AZ-500's biggest implementation surprise (Microsoft.OpenApi 1.x → 2.x cascade forcing Swashbuckle bump + Program.cs refactor) was knowable in advance with a `dotnet restore --dry-run` or `dotnet list package --include-transitive` before/after the proposed pin set. The task spec accurately listed the *direct* package bumps but not the transitive major-version flips. Pattern 1 above documents the cost: extra A/B/C round-trip during implementation, plus an unscheduled refactor. + +**Action**: +- Add to `coderule.mdc` (or a new `task-spec.mdc`): "When a task spec proposes a major-version bump of any direct dependency, the spec must also list the transitive packages whose major version changes as a result, OR explicitly note 'transitive major-version drift not analyzed in spec — implementer to surface and ASK if any non-trivial transitive bump is required'. The check is: run `dotnet restore --dry-run` (or `dotnet list package --include-transitive`) against a scratch branch with the proposed pins before writing the spec, OR mark the spec as 'transitive analysis deferred to implementation time' so the implementer knows to allow extra time." +- Optionally add a `new-task/SKILL.md` Step prompt: "for any package bump in this task, has the transitive major-version drift been analyzed?" + +**Cost**: rule addition is 0 SP; the next migration PBI that adopts the practice will absorb a ~1 SP slack to do the dry-run analysis at spec time. ROI: avoids the ~30-minute mid-implementation unscheduled refactor + A/B/C round-trip that AZ-500 hit. + +## 8. Suggested Rule / Skill updates + +| File | Change | Rationale | +|------|--------|-----------| +| `coderule.mdc` (new bullet near "library API verification" section) | "When a task spec proposes a major-version bump of any direct dependency, the spec must list the transitive packages whose major version changes as a result, OR explicitly note 'transitive major-version drift not analyzed in spec'. The check at spec-time is `dotnet restore --dry-run` against a scratch branch (or equivalent for non-.NET ecosystems)." | Pattern 1 (Microsoft.OpenApi 1.x → 2.x cascade); Action 3 | +| `.cursor/skills/security/SKILL.md` (Phase prereq #3) | Add a fourth option to the resume/overwrite/skip prompt: "Resume narrowed (only re-run Phase X)" — applicable when the cycle's source-level changes are provably outside the unchanged phases' coverage. | Pattern 3 (Resume mode worked manually here; codifying saves a few minutes per single-PBI infrastructure cycle) | +| `coderule.mdc` (new bullet in scope-discipline section) | "When a scope-protected task (migration, dependency bump) newly *exposes* a pre-existing bug elsewhere in the codebase, the implementation MUST surface it as a recommended follow-up PBI in the batch report — and the cycle's deploy report MUST list it as a 'newly exposed bug' separate from 'newly introduced findings'. Bugs that already existed do not count as cycle-introduced regressions, but they must not be silently re-buried." | Pattern 2 (`scripts/run-performance-tests.sh:417` grep-pipefail surfaced via AZ-500's bootstrap fix) | +| `.cursor/skills/retrospective/SKILL.md` (Step 2 narrative guidance) | When a cycle has a single non-functional task (migration / refactor / dependency hygiene), the retro report should explicitly reframe the metric set around continuity (0 regressions), forward-resolution (prior findings closed), and unblocking (capabilities now exercisable end-to-end) — not just task count + complexity points. | Pattern 5 (single-task migration cycles produce flatline metrics that are correct but easily misread) | + +## 9. Decision items carried over (operator) + +- **Cycle-3 perf-harness leftover** — STAYS OPEN per AZ-500 Constraint. Closure path: Action 1 above (1 SP fix → re-run → delete file). Tracked in `deploy_cycle4.md` runbook + `_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`. +- **Admin team iss/aud confirmation** (carried from cycle 3) — still required before promoting beyond `dev`. Unchanged. Tracked in `deploy_cycle3.md` + `deploy_cycle4.md`. +- **Cross-repo doc** (carried from cycle 3) — `suite/_docs/10_auth.md` paragraph addition. Unchanged. + +## 10. What this retro says about process maturity + +Cycle 4 is the first cycle that: +- Executed Step 15 (Performance Test) end-to-end against a real deployed image since cycle 1. +- Forward-resolved supply-chain advisories by *bumping past them* (cycle-3 D1+D3 closed by the major-version migration), rather than by patch-line bumps. +- Surfaced a transitive-major-version cascade as an implementation-time surprise — and the response chain (NU1605 fail → revert per pre-spec'd Risk → continue; CS0234 fail → user A/B/C → bump Swashbuckle + refactor → continue) showed the implement-skill's recovery loop working as designed. +- Carried a leftover across two cycles where the *underlying capability is healthy* but the *instrumentation harness is broken*. The leftover replay protocol kept the issue visible without blocking forward progress. + +The process continues to converge. The remaining friction points after cycle 4 are (a) transitive-dep awareness at spec-time (Action 3), (b) the lingering `scripts/run-performance-tests.sh` grep-pipefail (Action 1), and (c) the cycle-3 carry-overs (Test.Sdk + IdentityModel.Tokens.Jwt) that AZ-500 explicitly excluded — all are concrete cycle-5 PBI candidates totalling ~6 SP, comfortably below a normal cycle's capacity. diff --git a/_docs/06_metrics/structure_2026-05-12_cycle4.md b/_docs/06_metrics/structure_2026-05-12_cycle4.md new file mode 100644 index 0000000..65804fe --- /dev/null +++ b/_docs/06_metrics/structure_2026-05-12_cycle4.md @@ -0,0 +1,84 @@ +# Structural Snapshot — 2026-05-12 (post-cycle 4, AZ-500) + +Cycle 4 delta against `structure_2026-05-12_cycle3.md`. Source of truth: `_docs/02_document/module-layout.md` + on-disk `*.csproj` graph + `_docs/02_document/contracts/`. + +## Projects + +| Layer | csproj | Cycle 4 delta | +|-------|--------|---------------| +| 1 (Foundation) | `SatelliteProvider.Common` | TFM `net8.0` → `net10.0` | +| 1 (Foundation) | `SatelliteProvider.DataAccess` | TFM `net8.0` → `net10.0`; M.E.* 9.0.10 → 10.0.7 (Configuration.Abstractions, Logging.Abstractions) | +| 3 (Application) | `SatelliteProvider.Services.TileDownloader` | TFM `net8.0` → `net10.0`; M.E.* 9.0.10 → 10.0.7 (Caching.Memory, Http, Logging.Abstractions, Options.ConfigurationExtensions) | +| 3 (Application) | `SatelliteProvider.Services.RegionProcessing` | TFM `net8.0` → `net10.0`; M.E.* 9.0.10 → 10.0.7 (DependencyInjection.Abstractions, Hosting.Abstractions, Logging.Abstractions, Options.ConfigurationExtensions) | +| 3 (Application) | `SatelliteProvider.Services.RouteManagement` | TFM `net8.0` → `net10.0`; M.E.* 9.0.10 → 10.0.7 (DependencyInjection.Abstractions, Hosting.Abstractions, Logging.Abstractions, Options.ConfigurationExtensions) | +| 4 (API / Entry) | `SatelliteProvider.Api` | TFM `net8.0` → `net10.0`; `Microsoft.AspNetCore.Authentication.JwtBearer` 8.0.25 → **10.0.7**; `Microsoft.AspNetCore.OpenApi` 8.0.25 → **10.0.7**; `Swashbuckle.AspNetCore` 6.6.2 → **10.1.7** (drove transitive `Microsoft.OpenApi` 1.x → 2.3.x); `Serilog.AspNetCore` retained at 8.0.3 (Risk #4 fallback); `Program.cs` Microsoft.OpenApi 2.x setup refactor (3 internal edits — namespace, `OpenApiSecuritySchemeReference`, `JsonSchemaType` + `IOpenApiSchema`); `Swagger/ParameterDescriptionFilter.cs` namespace update | +| 5 (Test-Support) | `SatelliteProvider.TestSupport` | TFM `net8.0` → `net10.0`; NuGet refs unchanged (still pinned to `Microsoft.IdentityModel.Tokens 7.0.3` + `System.IdentityModel.Tokens.Jwt 7.0.3` — cycle-3 D4 carry-over) | +| 6 (Tests) | `SatelliteProvider.Tests` | TFM `net8.0` → `net10.0`; M.E.* 9.0.10 → 10.0.7 (Caching.Memory, Configuration.Json, DependencyInjection, Http, Logging.Abstractions, Logging.Console, Options); `Microsoft.AspNetCore.Authentication.JwtBearer` 8.0.25 → 10.0.7 (transitively via ProjectReference to Api) | +| 6 (Tests) | `SatelliteProvider.IntegrationTests` | TFM `net8.0` → `net10.0`; transitive bumps via TestSupport + Api | + +**Project count**: 9 (unchanged from cycle 3 — AZ-500 is a runtime/SDK migration, not a project-graph change). + +## Cross-Project Import Edges (compile-time `ProjectReference`) + +| Edge | Count | Cycle 4 delta | +|------|-------|----------------| +| Api → {Common, DataAccess, TileDownloader, RegionProcessing, RouteManagement} | 5 | unchanged | +| TileDownloader → {Common, DataAccess} | 2 | unchanged | +| DataAccess → {Common} | 1 | unchanged | +| RegionProcessing → {Common, DataAccess} | 2 | unchanged | +| RouteManagement → {Common, DataAccess} | 2 | unchanged | +| Tests → {Api, TileDownloader, RegionProcessing, RouteManagement, Common, DataAccess, TestSupport} | 7 | unchanged | +| IntegrationTests → {TestSupport} (+ runtime DTOs only) | 1 | unchanged | + +**Total ProjectReference edges**: 20 (unchanged from cycle 3). AZ-500 added zero cross-project import edges. + +## Source-import sites — cycle 4 delta + +| Importer | Imports from | Cycle 4 delta | +|----------|--------------|---------------| +| `SatelliteProvider.Api/Program.cs` | `Microsoft.OpenApi` (was `Microsoft.OpenApi.Models`) | namespace move (1.x → 2.x); same conceptual surface, different package layout | +| `SatelliteProvider.Api/Swagger/ParameterDescriptionFilter.cs` | `Microsoft.OpenApi` (was `Microsoft.OpenApi.Models`) | same namespace move | +| All other source files | unchanged | — | + +**No new source-level imports introduced.** The two changed `using` directives are namespace renames forced by the Microsoft.OpenApi 2.x layout, not new dependencies. + +## Graph properties + +- **Cycles in project import graph**: 0 (clean DAG — unchanged) +- **Average ProjectReferences per component**: 20 / 9 = ~2.2 (unchanged from cycle 3) +- **Max in-degree**: Common (still highest at 6 — Api, TileDownloader, DataAccess, RegionProcessing, RouteManagement, Tests). +- **Max out-degree**: Tests (7 — unchanged from cycle 3). +- **TestSupport position**: leaf-of-test-subgraph; no production-layer importers (unchanged). + +## NuGet dependency hygiene (cycle 4) + +| Package | Cycle-3 version | Cycle-4 version | Status | +|---------|-----------------|-----------------|--------| +| `Microsoft.AspNetCore.Authentication.JwtBearer` | 8.0.25 | **10.0.7** | RESOLVED cycle-3 D1 forward (now on the 10.x line; same underlying CVE patch, current line) | +| `Microsoft.AspNetCore.OpenApi` | 8.0.25 | **10.0.7** | RESOLVED cycle-3 D3 forward (same as above) | +| `Swashbuckle.AspNetCore` | 6.6.2 | **10.1.7** | NEW major line — 0 known vulnerabilities; bumped to land Microsoft.OpenApi 2.x compat required by ASP.NET Core 10 | +| `Microsoft.OpenApi` (transitive via Swashbuckle) | 1.x | **2.3.x** | NEW major line — 0 known vulnerabilities; drove the 3 internal `Program.cs` setup edits | +| `Serilog.AspNetCore` | 8.0.3 | **8.0.3 (unchanged)** | Risk #4 fallback — no 10.x line published as of cycle 4; restores cleanly on `net10.0` via netstandard 2.0; recheck per cycle | +| `Microsoft.Extensions.*` (11 distinct package IDs) | 9.0.10 | **10.0.7** | Coordinated bump across 6 csproj files; 0 known vulnerabilities on 10.0.7 line; historical CVE-2024-43483 already not applicable in cycle 3 (9.0.10 baseline post-rc.1 cutoff) | +| `Microsoft.IdentityModel.Tokens` (TestSupport) | 7.0.3 | **7.0.3 (unchanged)** | Cycle-3 D4 carry-over — explicitly out of AZ-500 scope per "no unrelated package bumps" Constraint; recommend separate PBI | +| `System.IdentityModel.Tokens.Jwt` (TestSupport) | 7.0.3 | **7.0.3 (unchanged)** | Same disposition as above | +| `Microsoft.NET.Test.Sdk` | 17.8.0 | **17.8.0 (unchanged)** | Cycle-3 D2 carry-over (transitive `NuGet.Frameworks` flag); explicitly out of AZ-500 scope | +| `SixLabors.ImageSharp` | 3.1.11 | **3.1.11 (unchanged)** | clean | +| `Npgsql` | 9.0.2 | **9.0.2 (unchanged)** | clean | +| `Newtonsoft.Json` | 13.0.4 | **13.0.4 (unchanged)** | clean | +| `Dapper` | 2.1.35 | **2.1.35 (unchanged)** | clean | +| `dbup-postgresql` | 6.0.3 | **6.0.3 (unchanged)** | clean | + +## Architecture / contract surface (cycle 4 delta) + +- **No new public-API contracts** under `_docs/02_document/contracts/` this cycle. AZ-500 preserves every endpoint shape, every DTO, the JWT validation contract (signature + lifetime + iss + aud + 30s clock skew), the multi-source `tile-storage` contract v1.0.0, the `uav-tile-upload` contract v1.0.0, and the AZ-488 permissions claim policy. +- `_docs/02_document/architecture.md` updated to reference .NET 10 / ASP.NET Core 10 / JwtBearer 10.0.7 (Authentication & Authorization paragraph + §2 Tech Stack table); the Architecture Vision prose is unchanged. +- The 8 `WithOpenApi(...)` callsites in `Program.cs` now emit `ASPDEPR002` deprecation warnings — recorded as a follow-up PBI (3 SP). API surface is unchanged. + +## Net Architecture delta vs cycle 3 + +- **Resolved**: cycle-3 D1 (CVE-2026-26130 SignalR DoS, JwtBearer 8.0.21 line) **forward-resolved by major-version bump**; cycle-3 D3 (Microsoft.AspNetCore.OpenApi 8.0.21 line) similarly forward-resolved. The cycle-3 perf-harness leftover stays OPEN per AZ-500 Constraint, NOT closed by this cycle. **Total resolved: 2 (both Low/forward).** +- **Newly introduced (informational only)**: 5 new Low informational findings in `dependency_scan_cycle4.md` (F1-cy4..F5-cy4 — all "no known vulnerabilities" confirmations on the 10.x lines); 0 new Medium; 0 new High; 0 new Critical. 2 new Maintainability Mediums in code review (F1 ASPDEPR002 deprecation, F2 CS8604 nullable) — both deferred per "scope discipline" + the AZ-500 contract of "preserve behaviour during migration". +- **Net Architecture delta**: 0 net architecture-level findings. The cycle is structurally neutral — same component count, same edges, same DAG, same contract surface. The only structural-equivalence change is the 11-package M.E.* + 3-package ASP.NET + 1 Swashbuckle major-line bump, all forward-clean against advisories. + +Cycle 4 is the first runtime/SDK-migration cycle. Net-zero is the expected outcome for that PBI shape — there is no business-logic delta to add structure. The graph properties confirming this is a "preserve-behaviour" migration: same imports, same edges, same cycles (zero), same in-degree distribution, same DAG. diff --git a/_docs/LESSONS.md b/_docs/LESSONS.md index 6380b00..e2f7381 100644 --- a/_docs/LESSONS.md +++ b/_docs/LESSONS.md @@ -37,6 +37,12 @@ If the enum's wire string happens to match a member name case-insensitively (e.g ## Ring buffer (last 15 entries — newest at top) +- [2026-05-12] [dependencies] Major-version bumps of direct deps cascade through transitives; the task spec must list the transitive packages whose major version changes as a result OR explicitly note "transitive major-version drift not analyzed in spec" — verify with `dotnet restore --dry-run` against a scratch branch before writing the spec (cycle 4: AZ-500 surprise-bumped `Microsoft.OpenApi` 1.x → 2.x via the `Microsoft.AspNetCore.OpenApi` 8.0.25 → 10.0.7 path; forced an unscheduled Swashbuckle bump + Program.cs refactor mid-implementation). + Source: _docs/06_metrics/retro_2026-05-12_cycle4.md +- [2026-05-12] [process] When a scope-protected task newly *exposes* a pre-existing bug elsewhere in the codebase (vs. introducing a new one), surface it as a recommended follow-up PBI in the batch report AND list it as a "newly exposed bug" separate from "newly introduced findings" in the deploy report — bugs that already existed don't count as cycle-introduced regressions, but they must not be silently re-buried (cycle 4: AZ-500's bootstrap fix unmasked the pre-existing `scripts/run-performance-tests.sh:417` `grep -o | wc -l` + `pipefail` bug). + Source: _docs/06_metrics/retro_2026-05-12_cycle4.md +- [2026-05-12] [process] When a cycle has a single non-functional task (migration / refactor / dependency hygiene), the retro must reframe the metric set around continuity (0 regressions), forward-resolution (prior findings closed by the bump itself), and unblocking (capabilities now exercisable end-to-end) — task count + complexity points read as misleading flatlines that look like under-productivity (cycle 4: AZ-500 alone delivered 5 SP vs cycle 3's 18 SP, but the cycle's value was forward-resolving 2 cycle-3 advisories and finally executing PT-01..PT-08 end-to-end against the migrated build). + Source: _docs/06_metrics/retro_2026-05-12_cycle4.md - [2026-05-12] [process] For cross-team blockers (admin team must supply config values, etc.), prefer an Option-B forcing function (ship the validation/scaffolding with prod-empty config that fails-fast at deploy) over deferring the entire task — the fail-fast contract makes the cross-team conversation impossible to skip and ships the in-workspace work in the current cycle (cycle 3: AZ-494 shipped iss/aud validation with empty prod appsettings so deploy must supply real values). Source: _docs/06_metrics/retro_2026-05-12_cycle3.md - [2026-05-12] [process] ACs that prescribe a specific measurement or sentinel mechanism (e.g. "per-item latency < 50ms", "guard fires when DB name contains _test") should also prescribe — or explicitly defer — the path for collecting / enforcing it, or implementations will substitute proxies / equivalents that look like spec drift in review (cycle 3: AZ-492 PT-08 per-item gate cost became a derived proxy; AZ-493 DB-name guard became Host-allowlist). diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index cb4d012..4a063f0 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,14 +2,14 @@ ## Current Step flow: existing-code -step: 16 -name: Deploy -status: in_progress +step: 9 +name: New Task +status: not_started sub_step: - phase: 1 - name: commit-cycle4-sync + phase: 0 + name: awaiting-invocation detail: "" retry_count: 0 -cycle: 4 +cycle: 5 tracker: jira auto_push: true