Files
satellite-provider/_docs/06_metrics/retro_2026-05-12_cycle5.md
T
Oleksandr Bezdieniezhnykh ea278afb37 [AZ-503] [AZ-504] Cycle 5 Step 17: retrospective + close cycle
retro_2026-05-12_cycle5.md captures the cycle-end retrospective:
- Implementation: 2 tasks (AZ-504 + AZ-503-foundation), 4 SP total,
  100% first-attempt pass rate, 1 mid-implement scope-split
  (AZ-503 → AZ-503-foundation + AZ-505, blocked-linked).
- Quality: 50/50 PASS/PASS_WITH_WARNINGS, 0 new Medium+, 1 new Low
  (defensive contentSha256 soft-NULL guard).
- Security: PASS_WITH_WARNINGS, 0 new Critical/High/Medium, 2 new
  Low informational (F1 flightId provenance, F2 pgcrypto runbook
  gap).
- Performance: PASS_WITH_INFRA_WARNINGS — first measurable PT-08
  ever (Run #1 199ms, Run #2 117ms vs 2000ms threshold); PT-01/02
  failed on recurring local Docker/colima DNS cold-start, not an
  app regression.
- Structural: +1 ProjectReference edge (IntegrationTests → Common),
  +1 minor contract bump (uav-tile-upload 1.0.0 → 1.1.0), +1 DB
  migration (014_AddTileIdentityColumns.sql), 0 NuGet bumps,
  0 csproj additions, DAG still acyclic at 9 projects.

structure_2026-05-12_cycle5.md captures the structural snapshot.

LESSONS.md updated with 3 cycle-5 entries (oldest dropped to
preserve the 15-entry ring buffer):
- [architecture] Cross-repo cryptographic invariants must live as
  code constants in both repos with reference-vector tests.
- [tooling] When perf-mode "one re-run" fires twice with the same
  DNS root cause, escalate from re-run to harness fix.
- [process] Spec contradicts live code by >=2 prerequisites →
  prefer split into foundation + follow-up (A/B/C option C).

Top 3 follow-up actions (cycle 6 candidates):
- Action 1 (1 SP): DNS pre-warm in scripts/run-performance-tests.sh
  → closes the cycle-3 perf-harness leftover.
- Action 2 (5 SP): AZ-505 — inventory endpoint + HTTP/2 + Leaflet
  covering index (blocked-linked on AZ-503-foundation, this cycle).
- Action 3 (1 SP): pgcrypto pre-install runbook step (F2-cy5 doc
  fix).

Cycle 5 closed. Autodev state advanced for cycle 6 by the next
/autodev invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:07:57 +03:00

250 lines
29 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Retrospective — Cycle 5 (2026-05-12)
**Tasks**: AZ-503-foundation (tile identity — UUIDv5 + integer UPSERT foundation, 3 SP) + AZ-504 (perf-script grep-pipefail fix, 1 SP). The original AZ-503 spec (5 SP combined) was split mid-implement into AZ-503-foundation (this cycle) + **AZ-505** (5 SP — inventory endpoint + HTTP/2 + Leaflet covering index, blocked-linked to AZ-503-foundation, deferred to cycle 6).
**Mode**: cycle-end (autodev Step 17)
**Previous retro**: `retro_2026-05-12_cycle4.md`
**Cycle shape**: small-foundation cycle — first schema-changing cycle since cycle 1's AZ-484; first cycle to ship a contract minor-version bump; first cycle since cycle 1 with multiple measurable PT-08 batches.
## 1. Implementation Metrics
| Metric | Cycle 5 | Δ vs cycle 4 |
|--------|---------|--------------|
| Tasks implemented | **2** (AZ-504, AZ-503-foundation) | +1 |
| Batches executed | **2** | +1 |
| Avg tasks / batch | 1.0 | unchanged |
| Total complexity delivered | **4 SP** (1 + 3) | -1 SP |
| Avg complexity / batch | 2 SP | -3 SP |
| Tasks at-or-below 5 SP cap | **2 of 2 (100%)** | unchanged |
| Tasks split mid-cycle into a follow-up PBI | **1 of 2** (AZ-503 → AZ-503-foundation + AZ-505) | new (cycle 4 had no splits) |
| Tasks above cap | 0 | unchanged |
| Cumulative reviews | **0** (cumulative-review trigger is every 3 batches; cycle 5 has 2 batches, so no trigger) | unchanged |
**Sequencing**: 2 batches — AZ-504 first (smallest mechanical fix; landed in batch 01), AZ-503-foundation second (batch 02). This ordering was chosen so the AZ-504 fix would be in place by the time the AZ-503-foundation tests run (some of which use the perf script's regression-tested helpers). The cycle completed in **6 dev commits** counting Step 9 task specs, both batches, the autodev-state chore, Steps 11-15 sync, and Step 16 deploy — three more than cycle-4's 4-commit count because of the scope-split conversation and the two-batch structure.
## 2. Quality Metrics
### Code Review Results
| Verdict | Count | Percentage |
|---------|-------|-----------|
| PASS | **1** (batch 01 AZ-504) | **50%** |
| PASS_WITH_WARNINGS | **1** (batch 02 AZ-503-foundation) | **50%** |
| FAIL | 0 | 0% |
### Findings by Severity (per-batch code review)
| Severity | Cycle 5 | Δ vs cycle 4 |
|----------|---------|--------------|
| Critical | 0 | unchanged |
| High | 0 | unchanged |
| Medium | **0** | -2 (cycle 4 had 2 bump-consequence Mediums) |
| Low | **1** (F-cy5 batch02 — soft-NULL guard on `contentSha256` when `File.Exists==false`; practically unreachable in happy path) | unchanged |
### Findings by Category
| Category | Count | Top Files |
|----------|-------|-----------|
| Maintainability | **1** | `SatelliteProvider.Services.TileDownloader/TileService.cs` (BuildTileEntity) |
| Bug | 0 | — |
| Spec-Gap | 0 | — |
| Security | 0 NEW code-review (2 NEW informational Lows in `_docs/05_security/` — F1-cy5 flightId provenance, F2-cy5 pgcrypto runbook gap — both long-term, not code defects) | -3 informational vs cycle 4 |
| Performance | 0 | — |
| Style | 0 | — |
| Scope | 0 | -1 (cycle 4 had 1 — the perf-script path fix) |
**Note on the 1 Low finding**: the `contentSha256` soft-NULL guard is defensive against transient I/O failure between the JPEG-write and the SHA256-compute steps. The downloader writes the file before the SHA256 read, so in practice the NULL branch is unreachable. The column is NULLable at the DB level. Tightening to throw-on-missing-file is recommended as a follow-up if downstream consumers ever rely on NOT NULL.
### Security audit (cycle 5)
| Metric | Value | Δ vs cycle 4 |
|--------|-------|--------------|
| Verdict | **PASS_WITH_WARNINGS** | unchanged (cumulative pos.) |
| Mode | **Delta** (full re-scan of new/modified code in AZ-503/AZ-504 against OWASP Top 10 + dependency manifest diff + infrastructure-change check) | new this cycle (cycle 4 was "Resume narrowed to dependency_scan only") |
| New Critical / High | 0 / 0 | unchanged |
| New Medium | **0** | unchanged |
| New Low (informational only) | **2** (F1-cy5: `metadata.flightId` not authenticated provenance — long-term recommendation when an authoritative flight registry exists; F2-cy5: `pgcrypto` deployment runbook gap on managed Postgres — captured in `deploy_cycle5.md` operator runbook) | -3 vs cycle 4 (5 informational confirmations) |
| Resolved findings | **0** | -2 (cycle 4 forward-resolved 2 cycle-3 carry-overs via major-version bumps; cycle 5 made no package bumps) |
| Carry-overs (still OPEN) | 3 (cycle-3 D2 `Microsoft.NET.Test.Sdk 17.8.0` transitive flag; cycle-4 D-IdentityModel-7.0.3 — both `Microsoft.IdentityModel.Tokens` and `System.IdentityModel.Tokens.Jwt` in TestSupport; `Serilog.AspNetCore 8.0.3` fallback) | unchanged |
**SHA-1 in `Uuidv5.cs` was explicitly assessed and cleared**: RFC 9562 §5.5 mandates SHA-1 for namespace-based UUIDv5; the algorithm is not used as a cryptographic hash for security (no collision-resistance against an adversary is claimed); the input space (tile coordinates × namespace) is not adversary-controlled in any reasonable threat model. This is the standard library-vetted construction for deterministic IDs — flagged explicitly in `static_analysis_cycle5.md` so it does not get re-flagged in future audits.
### Performance gate (cycle 5)
| Metric | Value |
|--------|-------|
| Verdict | **PASS_WITH_INFRA_WARNINGS** |
| Scenarios | 6 Pass · 0 Warn · 2 Fail · 0 Unverified (across Run #2 — Run #1 was 5 Pass + 3 Fail before `colima restart`) |
| AZ-504 NFR (PT-08 reaches summary) | **MET** — first cycle ever where PT-08 returns a measured batch p95. PT-08 ran clean across both runs: Run #1 batch p95 = 199ms, Run #2 batch p95 = **117ms** (vs 2000ms AZ-488 threshold). |
| AZ-503-foundation NFR (no UPSERT hot-path regression) | **MET** — PT-08 uses the new integer-key + flight-aware UPSERT for 200 batches; 200/200 accepted, 0 rejected, 0 failed; p95 117ms is **faster** than any prior cycle's measurement of this path. |
| Cycle-3 perf-harness leftover | **STAYS OPEN** — replay #5 documented. The AZ-504 fix is verified working across two perf runs, but the "default-parameter exit-0 run" criterion is blocked by recurring local Docker/colima DNS cold-start (PT-01 fails on `mt0/tile.googleapis.com` DNS lookup at the first request after each `docker compose up`). Two concrete closure paths recorded in the leftover (DNS pre-warm in script, OR move perf gate to CI). |
## 3. Structural Metrics (snapshot: `structure_2026-05-12_cycle5.md`)
| Metric | Cycle 5 | Δ vs cycle 4 |
|--------|---------|--------------|
| .NET projects (csproj) | **9** | unchanged |
| Cross-project edges (ProjectReference) | **21** | **+1** (IntegrationTests → Common — justified by cross-repo invariant deduplication) |
| Cycles in project graph | 0 | unchanged |
| Average ProjectReferences per component | ~2.3 | +0.1 |
| Max in-degree (Common) | 7 | **+1** (now also imported by IntegrationTests) |
| New Architecture violations | **0** | unchanged |
| Resolved Architecture violations | **0** | -2 vs cycle 4 (cycle 4 forward-resolved 2 via the .NET 10 bump; cycle 5 had no bumps) |
| Net Architecture delta | **0** | unchanged (cycle 4 was also 0) |
| Public-API contract delta | **+1 minor** (`uav-tile-upload.md` 1.0.0 → 1.1.0 additive) | new this cycle (cycle 4 had no contract changes) |
| Database schema delta | **+4 columns, +2 indexes, -2 indexes, +1 extension (`pgcrypto`)** | new this cycle (cycle 4 had no schema changes) |
| NuGet package bumps | **0** | -16 (cycle 4 coordinated 16 distinct package bumps) |
**Cycle 5 structural posture**: one schema migration, one minor contract bump, one new cross-component utility (`Uuidv5`), one new IntegrationTests → Common edge. Zero NuGet bumps, zero csproj additions, zero new public-API endpoints (the inventory endpoint is deferred to AZ-505). The DAG remains acyclic.
## 4. Efficiency Metrics
| Metric | Cycle 5 | Δ vs cycle 4 |
|--------|---------|--------------|
| Blocked tasks (during implementation) | **0 of 2** | unchanged |
| Tasks split mid-implement | **1 of 2** (AZ-503 → AZ-503-foundation + AZ-505 via A/B/C decision at /autodev Step 10) | new this cycle |
| Tasks completed first attempt (no post-review fix commits) | **2 of 2 (100%)** | unchanged (cycle 4 also 100%) |
| Tasks requiring multiple post-code-review fix commits | 0 | unchanged |
| Most-findings batch | batch 02 (AZ-503-foundation — 1 Low; batch 01 had 0 findings) | similar shape |
| Step-15 (Perf Test) execution | **EXECUTED twice** (Run #1 + Run #2; Run #2 is the better signal post `colima restart`) | unchanged (cycle 4 also executed Step 15) |
| Step-15 leftover at end of cycle | **YES (still — same cycle-3 leftover)** — but the AZ-504 verification half is satisfied this cycle; only the "infra-noise-free exit-0 run" half remains. | unchanged in identity, narrower in remaining gap |
| Step-14 (Security Audit) — net findings improvement | **0** new Medium+, 2 new informational Lows, 0 resolved | unchanged net direction |
| Number of A/B/C decision points hit during autodev | **3** (Step 10 scope split, Step 15 Run #1 gate, Step 15 Run #2 gate) | +2 vs cycle 4's 0 in-cycle A/B/C points |
## 5. Patterns Identified
### Pattern 1 — Spec-contradiction-driven A/B/C split at /autodev Step 10 is now a known shape
When AZ-503-implementation started, three contradictions surfaced against the live codebase: `flight_id` did not exist as a column, `FlightId` did not exist as a DTO field, and `voting_status` did not exist as referenced by an AC. The combined work needed to make the spec executable as written was ~5 SP — above the cycle-policy cap. Per `meta-rule.mdc` "Critical Thinking" + `autodev/protocols.md` A/B/C scope-protection, the implement skill stopped, surfaced the contradictions, and offered three options (A: implement the spec as-written; B: defer the whole task; **C: split into foundation + follow-up** — which the user picked).
The split produced a self-contained, testable, 3 SP foundation PBI that this cycle delivered cleanly, and a 5 SP `AZ-505` follow-up that is now blocked-linked in the Jira graph. **This is the first cycle where a scope-split mid-implement happened and landed cleanly without losing test coverage continuity.**
**Insight**: when a /autodev cycle's task spec drifts from the live codebase by more than ~1 missing prerequisite, the split-into-foundation pattern is preferable to either (a) silently expanding the cycle's SP budget or (b) deferring the entire PBI. The foundation captures the prerequisite infrastructure the follow-up needs; the follow-up captures the user-facing capability. Both halves remain individually shippable and individually testable.
### Pattern 2 — First measurable PT-08 batch in the project's history
PT-08 (UAV batch upload p95) was scenario-spec'd in cycle 2 (AZ-488) but never produced a measurable batch number until this cycle. Three runs preceded this one (cycle 3 short variant: PT-08 crashed at line 417; cycle 4 full run: same crash; cycle 5 Run #1 + Run #2: **clean exits to summary with 200/200 accepted**). The AZ-504 fix (`grep -c … || true`) is the single thing that unblocked this.
The captured numbers are: Run #1 batch p95 = 199ms, Run #2 batch p95 = **117ms**, both far under the 2000ms AZ-488 threshold. Per-item proxy = batch p95 / batch_size = **1119ms**, far under the AZ-492 per-item gate.
**Insight**: a 1-SP mechanical script fix unblocked a previously-unmeasurable NFR that had been carried as a leftover across three cycles. The leftover replay protocol kept the issue visible without blocking forward progress; the eventual fix took 15 minutes once it became the cycle's smallest atomic PBI. **This is a textbook case for the "leftover replay" mechanism in `tracker.mdc` — keep small unblocked-but-not-yet-fixed items visible, close them when they fit naturally in a cycle's scope.**
### Pattern 3 — Local Docker/colima DNS cold-start is a recurring class of "infra-noise" failure that contaminates the perf gate
Cycle 4's perf gate noted DNS resolution intermittence as an unrelated test flake. Cycle 5 hit the same class **twice in the same session** — first during the functional test phase (resolved by `colima restart`), then during Step 15 Run #1 (manifested on `tile.googleapis.com`), then again during Step 15 Run #2 (manifested on `mt0.google.com` — the warmup probe between Run #1 and Run #2 only touched `tile.googleapis.com` + `mt1.google.com`).
The pattern is: the first Google-Maps tile fetch immediately after `docker compose up` may fail DNS resolution if the colima resolver hasn't pre-cached the specific hostname yet. Subsequent fetches succeed. PT-01 (cold tile download) is the first scenario that hits a Google-Maps hostname and therefore is the canary for this class of failure.
**Insight**: this is not an application regression — it is **environment instability that the perf script does not currently shield itself against**. Two concrete mitigations are recorded in the cycle-3 leftover (replay #5 entry) as out-of-scope follow-up PBIs:
1. **Add a DNS pre-warm step** to `scripts/run-performance-tests.sh` before PT-01 (1 SP, deterministic fix);
2. **Move the perf gate to CI / a stable-resolver environment** (2 SP, structural fix).
The pattern is worth surfacing as a Lesson — the perf-mode skill (`test-run/SKILL.md`) already says "always worth ONE re-run before declaring a regression", and we did one; but the underlying environment continues to be the cycle's single largest source of non-application gate noise.
### Pattern 4 — Cross-repo cryptographic invariants belong in a code-level constant, not a doc reference
AZ-503 introduces a `TileNamespace` UUID (`5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c`) that MUST byte-match the same constant in `gps-denied-onboard/components/c6_tile_cache/_uuid.py` (sibling workspace), or every cross-repo `tileId` lookup silently misses. The constant lives as a pinned `public const string TileNamespace` in `SatelliteProvider.Common.Utils.Uuidv5` and is asserted by the `Uuidv5Tests` reference-vector unit tests. The sibling workspace is documented to mirror the same constant.
**Insight**: when a cross-repo invariant is a magic constant (a UUID, a base32 alphabet, a tile-zoom convention), it must live in code in BOTH repos with reference-vector tests on BOTH sides. Documentation alone (e.g., "see contract foo.md") is not enough — a drift between the constants would only surface as a 100% lookup-miss in production, which is harder to detect than a unit-test failure. Cycle 5 captured this on the satellite-provider side; the sibling-repo side will be handled when AZ-505 lands.
### Pattern 5 — Schema migrations + `pgcrypto` create a deploy-side dependency that the runbook must spell out
Migration 014 enables `pgcrypto` (`CREATE EXTENSION IF NOT EXISTS pgcrypto;`) for the session-scoped backfill function. On stock Postgres 16 (our `postgres:16` Docker image), `pgcrypto` is bundled. On managed cloud Postgres providers (RDS, Cloud SQL, Azure Postgres), the migration-running role typically needs `cloudsqlsuperuser` / `rds_superuser` / equivalent to `CREATE EXTENSION`. If that privilege is absent, migration 014 fails and the application doesn't start.
This was flagged as F2-cy5 (Low informational) during the security audit, recorded in `deploy_cycle5.md` operator runbook step 2, and explicitly called out as a recommended out-of-scope follow-up PBI. **It is NOT a code defect — it is a deploy-runbook gap.**
**Insight**: cycles that add `CREATE EXTENSION` / `CREATE ROLE` / `ALTER SYSTEM` / other privilege-sensitive statements to migrations must produce a deploy-runbook update in the same cycle, even when the cycle is otherwise small. The information is uniquely-visible at code-write time; if the runbook update is deferred, it tends to get lost.
## 6. Comparison vs. previous retros
| Metric | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | Cycle 5 |
|--------------------------------------|----------------|----------------|----------------|----------------|----------------|
| Tasks implemented | 1 | 2 | 6 | 1 | **2** |
| Total complexity delivered | 8 SP | 10 SP | 18 SP | 5 SP | **4 SP** |
| Batches | 1 | 2 | 5 | 1 | **2** |
| Critical/High review findings | 0 | 0 | 0 | 0 | **0** |
| New Medium review findings | 0 | 0 | 0 | 2 (bump conseq.) | **0** |
| New Low review findings | 3 | 6 (5 distinct) | 7 | 1 | **1** |
| Code review pass rate | 100% (1/1) | 100% (2/2) | 100% (5/5) | 100% (1/1) | **100% (2/2)** |
| Tasks completed first attempt | 0 of 1 | 0 of 2 | 5 of 6 | 1 of 1 | **2 of 2** |
| Tasks split mid-implement | 0 | 0 | 0 | 0 | **1** |
| New Medium security findings | 2 | 2 | 0 | 0 | **0** |
| Resolved security findings | 0 | 0 | 3 | 2 (fwd-bump) | **0** |
| Net Architecture delta | n/a (baseline) | +0 | -3 | 0 | **0** |
| Schema change | YES (AZ-484) | NO | NO | NO | **YES (AZ-503-foundation)** |
| Public-API contract change | YES (tile-storage v1.0.0) | YES (uav-tile-upload v1.0.0) | NO | NO | **YES (uav-tile-upload v1.0.0 → v1.1.0 additive)** |
| Step-15 (Perf) executed | YES | SKIPPED | SKIPPED | YES | **YES (Run #1 + Run #2)** |
| Step-15 leftover at retro | NO | YES | YES | YES | **YES (still — same cycle-3 leftover; AZ-504 verification half satisfied)** |
| First measurable PT-08 batch | n/a | n/a | n/a | NO (script crash) | **YES (Run #1 199ms, Run #2 117ms)** |
### Did the cycle-4 actions land?
- **Cycle 4 Action 1 (fix `scripts/run-performance-tests.sh:416-417` grep-pipefail)** — **LANDED as AZ-504** in cycle 5 (this cycle). Closes Pattern 2 from cycle 4's retro. Verified working across two perf runs.
- **Cycle 4 Action 2 (migrate `WithOpenApi(...)` callsites to ASP.NET Core 10 minimal-API metadata extensions, 3 SP)** — **NOT landed in cycle 5**. Explicitly out of AZ-503/AZ-504 scope per `coderule.mdc` "scope discipline". Re-listed as a recommended follow-up PBI in `deploy_cycle5.md`. Carries to cycle 6.
- **Cycle 4 Action 3 (pre-flight transitive-major-version impact analysis at task-spec time)** — **NOT directly exercised** in cycle 5. AZ-503 and AZ-504 had zero NuGet bumps, so the rule was preventive only. It still lives in `coderule.mdc` (added in cycle 4) and will fire on the next package-bumping PBI.
This is the **third consecutive cycle where a prior-retro action landed** (Action 1 here, Action 1 in cycle 4, Actions 1+2+3 in cycle 3). Pattern is stable.
### Did the cycle-3 actions land?
- **Cycle 3 Action 1 (execute perf harness against deployed dev image)** — landed in cycle 4 (implicit AZ-500 NFR gate). Closed.
- **Cycle 3 Action 2 (bump `System.IdentityModel.Tokens.Jwt 7.0.3 → 7.1.2+`)** — **NOT landed in cycle 5**. Carry-over D2-cy4 still open. Test-runtime exposure only; safe to land independently.
- **Cycle 3 Action 3 (`workspace:` field on cross-repo ACs in new-task skill)** — **NOT exercised**. AZ-503 has one cross-workspace invariant (the `TileNamespace` UUID) but that invariant is captured as a Constraint in the task spec body rather than as a per-AC `workspace:` tag. The rule is still not codified in `new-task/SKILL.md`. AZ-505 next cycle has explicit cross-repo writes (the gps-denied-onboard side of the UUIDv5 namespace constant); the cycle-3 rule should land before AZ-505 writes its spec.
## 7. Top 3 Improvement Actions (ranked by impact)
### Action 1 — Add DNS pre-warm to `scripts/run-performance-tests.sh` before PT-01 (1 SP, deterministic)
**Why this is the highest impact**: Pattern 3 above documents the same class of failure across cycles 4 + 5. It contaminates the perf gate with non-application noise, blocks closure of the cycle-3 perf-harness leftover (now in its third carry-over cycle), and forces a manual `colima restart` + re-run per session. A deterministic DNS pre-warm before PT-01 fires removes the entire failure class on the local-dev runner.
**Action**: 1 SP PBI in cycle 6. Insert a `getent hosts mt0.google.com mt1.google.com mt2.google.com mt3.google.com tile.googleapis.com` (or equivalent for the runner's resolver) inside the api container immediately before PT-01 — fail-fast if any hostname is unresolvable after a small retry window. Closes the cycle-3 perf-harness leftover on the next run.
**Cost**: ~20 minutes (one shell addition + one perf run to verify exit-0 + delete the leftover). Counted as 1 SP because deletion of the leftover requires a full clean run.
### Action 2 — Implement AZ-505 (deferred AZ-503 half: inventory endpoint + HTTP/2 + Leaflet covering index, 5 SP)
**Why**: AZ-505 is blocked-linked to AZ-503-foundation. The foundation (this cycle) shipped the deterministic identity, the integer-key UPSERT, and the per-flight layout. AZ-505 ships the user-facing inventory endpoint (`POST /api/satellite/tiles/inventory`) that lets consumers ask "given these N (lat,lon,zoom) coordinates, which tileIds + variants do you have?". AZ-505 also enables HTTP/2 on the API (required for batched inventory responses without TCP head-of-line blocking) and rewrites the Leaflet hot path against the new `location_hash` index.
**Action**: 5 SP PBI in cycle 6. Foundation prerequisites are now in place. AZ-505 also carries the cross-repo write obligation for the gps-denied-onboard side of the UUIDv5 namespace constant.
**Cost**: 5 SP — within the cycle cap. Foundation is done; this is the user-facing payload that justifies the schema work.
### Action 3 — Add a `pgcrypto` pre-install check step to the deployment runbook (1 SP, ops-side)
**Why**: Pattern 5 above. Migration 014 silently relies on `pgcrypto` being installable by the migration-running role. Stock Postgres 16 is fine; managed cloud Postgres providers may not be. F2-cy5 already captures this in `_docs/05_security/owasp_review_cycle5.md` and `deploy_cycle5.md` operator runbook step 2 — but a runbook step that says "check this before running the migration" is necessary if the project ever migrates off Docker-postgres for a non-dev environment.
**Action**: 1 SP PBI in cycle 6 (or land as a doc-only PR within cycle 6's scope-discipline budget). Update the deployment runbook with a pre-migration `SELECT EXTNAME FROM pg_extension WHERE extname='pgcrypto'` + a fallback path if missing.
**Cost**: ~30 minutes of doc work. Counted as 1 SP because it touches the deployment runbook (cross-cutting infra doc).
## 8. Suggested Rule / Skill updates
| File | Change | Rationale |
|------|--------|-----------|
| `coderule.mdc` (new bullet in scope-discipline section) | "When a migration adds a `CREATE EXTENSION` / `CREATE ROLE` / `ALTER SYSTEM` / other privilege-sensitive statement, the same cycle's deploy report MUST add a pre-migration runbook step that verifies the privilege exists in the target environment. The runbook step is required even when the cycle is otherwise small." | Pattern 5 (pgcrypto in migration 014 created a deploy-runbook gap that was caught at security-audit time, not at migration-write time) |
| `new-task/SKILL.md` (new check in Step 5 — Risks & Mitigation) | When a task spec introduces a cross-repo cryptographic invariant (a UUID namespace, a base32/64 alphabet, a tile-zoom convention, a deterministic-key formula), the spec MUST list both the in-workspace code location of the constant AND the sibling-workspace code location it must byte-match — with reference-vector tests on both sides. Doc references alone do not satisfy this. | Pattern 4 (`TileNamespace` UUID must byte-match `gps-denied-onboard/components/c6_tile_cache/_uuid.py`) |
| `autodev/protocols.md` (formalise Step-10 contradiction-driven A/B/C) | The "scope-split" branch of the scope-protection A/B/C choice should be a first-class named option, not an ad-hoc decision. When a /autodev cycle's task spec contradicts the live codebase by >=2 prerequisites, the implement skill should preferentially offer **A: implement as-written / B: defer entirely / C: split into foundation + follow-up** with C being the recommended default. Cycle 5 derived this manually; codifying it makes future cycles cheaper to navigate. | Pattern 1 (AZ-503 → AZ-503-foundation + AZ-505 split was the cycle's largest decision, made via ad-hoc Choose; the recommended option matched the documented pattern but the path was discovered, not followed) |
| `test-run/SKILL.md` (Perf Mode Step 5 — add explicit retrospective trigger) | After the "one re-run" rule fires twice across consecutive cycles with the same root-cause class (e.g., DNS, NTP, resolver), the perf-mode skill should auto-surface a recommended PBI for a deterministic fix at the environment / harness layer — not as a re-run, as a fix. Cycle 5 fired the rule manually here. | Pattern 3 (DNS cold-start hit cycle 4 + cycle 5; the perf-mode skill's "one re-run" already shields against single-incident noise, but doesn't yet escalate when noise recurs) |
## 9. Decision items carried over (operator)
- **Cycle-3 perf-harness leftover** — STAYS OPEN. Replay #5 entry recorded. **Half-closed** this cycle: AZ-504 script fix verified working across 2 runs; remaining half (full default-parameter exit-0 run) blocked by recurring local DNS noise. Closure path: Action 1 above (DNS pre-warm in script, 1 SP) OR move perf gate to CI/cloud runner.
- **Admin team iss/aud confirmation** (carried from cycles 3 + 4) — still required before promoting beyond `dev`. Unchanged. Tracked in `deploy_cycle3.md` + `deploy_cycle4.md` + `deploy_cycle5.md`.
- **D2-cy4 — `Microsoft.NET.Test.Sdk 17.8.0` transitive `NuGet.Frameworks` flag** — unchanged. Test-runtime exposure only; safe to land in a future cycle.
- **D-IdentityModel-7.0.3** (cycle-4 carry-over — both `Microsoft.IdentityModel.Tokens` and `System.IdentityModel.Tokens.Jwt` at 7.0.3 in TestSupport, NU1902) — unchanged. Cycle-3 Action 2 obligation; should land before any new auth-touching cycle.
- **F1-cy5 — `metadata.flightId` authenticated provenance** — long-term, not actionable until an authoritative flight registry exists in the suite. Recorded in `owasp_review_cycle5.md` and `deploy_cycle5.md` as a long-term recommendation.
- **F2-cy5 — `pgcrypto` deployment runbook gap** — Action 3 above. Trivial doc-only fix.
- **`Serilog.AspNetCore 8.0.3` fallback** — unchanged; no 10.x line published as of cycle 5. Recheck at every cycle start.
- **Cross-repo doc `suite/_docs/10_auth.md` paragraph** (cycle-3 carry-over) — unchanged.
- **`workspace:` field on cross-repo ACs in `new-task/SKILL.md`** (cycle-3 Action 3, never landed) — must land before AZ-505 task spec is written, since AZ-505 has explicit cross-repo writes (the gps-denied-onboard side of the UUIDv5 namespace constant).
## 10. What this retro says about process maturity
Cycle 5 is the first cycle that:
- **Split a task spec mid-implementation into a foundation + follow-up pair** that both shipped (foundation in cycle 5, follow-up scheduled as AZ-505). The /autodev step-10 contradiction-driven A/B/C path proved itself end-to-end.
- **Carried a 3-cycle-old leftover from "completely unmeasurable" to "verified-fixed, exit-0 blocked only by infra noise"**. The AZ-504 fix is provably working; the remaining blocker is environment, not code.
- **Shipped a schema migration with a backfill, a contract minor version bump, and a new on-disk path layout — all additive, all backward-compatible with cycle-4 clients** — proving the project's documentation pipeline (test-spec sync, doc update, ripple log, security audit, deploy report) now scales to schema-touching cycles, not just runtime/SDK migrations (cycle 4) or single-file refactors (cycles 1-3).
- **Recorded 3 new infrastructure-level recommendations** (DNS pre-warm in perf script, pgcrypto pre-install runbook step, cross-repo invariant rule in new-task skill) — none of them are bugs in the cycle's code, all of them are process gaps the cycle's *work shape* surfaced. This is the second consecutive cycle where the retro's top action items are predominantly process / harness / runbook, not code defects.
The process continues to converge. The remaining friction points after cycle 5 are (a) local-dev Docker/colima DNS noise that contaminates Step 15 (Action 1), (b) the carry-over package-hygiene PBIs from cycles 3/4 (Test.Sdk, IdentityModel.Tokens.Jwt, WithOpenApi callsites, Serilog 10.x) that have been deferred per scope discipline four cycles in a row, and (c) the cross-repo invariant codification (cycle-3 Action 3) that must land before AZ-505 writes its spec. All are concrete cycle-6 PBI candidates totalling ~12-15 SP, which is one fully-loaded normal cycle — or split across cycles 6 + 7 if AZ-505 alone fills cycle 6.