mirror of https://github.com/azaion/satellite-provider.git synced 2026-06-21 07:01:15 +00:00

Oleksandr Bezdieniezhnykh ea278afb37 [AZ-503] [AZ-504] Cycle 5 Step 17: retrospective + close cycle

retro_2026-05-12_cycle5.md captures the cycle-end retrospective:
- Implementation: 2 tasks (AZ-504 + AZ-503-foundation), 4 SP total,
  100% first-attempt pass rate, 1 mid-implement scope-split
  (AZ-503 → AZ-503-foundation + AZ-505, blocked-linked).
- Quality: 50/50 PASS/PASS_WITH_WARNINGS, 0 new Medium+, 1 new Low
  (defensive contentSha256 soft-NULL guard).
- Security: PASS_WITH_WARNINGS, 0 new Critical/High/Medium, 2 new
  Low informational (F1 flightId provenance, F2 pgcrypto runbook
  gap).
- Performance: PASS_WITH_INFRA_WARNINGS — first measurable PT-08
  ever (Run #1 199ms, Run #2 117ms vs 2000ms threshold); PT-01/02
  failed on recurring local Docker/colima DNS cold-start, not an
  app regression.
- Structural: +1 ProjectReference edge (IntegrationTests → Common),
  +1 minor contract bump (uav-tile-upload 1.0.0 → 1.1.0), +1 DB
  migration (014_AddTileIdentityColumns.sql), 0 NuGet bumps,
  0 csproj additions, DAG still acyclic at 9 projects.

structure_2026-05-12_cycle5.md captures the structural snapshot.

LESSONS.md updated with 3 cycle-5 entries (oldest dropped to
preserve the 15-entry ring buffer):
- [architecture] Cross-repo cryptographic invariants must live as
  code constants in both repos with reference-vector tests.
- [tooling] When perf-mode "one re-run" fires twice with the same
  DNS root cause, escalate from re-run to harness fix.
- [process] Spec contradicts live code by >=2 prerequisites →
  prefer split into foundation + follow-up (A/B/C option C).

Top 3 follow-up actions (cycle 6 candidates):
- Action 1 (1 SP): DNS pre-warm in scripts/run-performance-tests.sh
  → closes the cycle-3 perf-harness leftover.
- Action 2 (5 SP): AZ-505 — inventory endpoint + HTTP/2 + Leaflet
  covering index (blocked-linked on AZ-503-foundation, this cycle).
- Action 3 (1 SP): pgcrypto pre-install runbook step (F2-cy5 doc
  fix).

Cycle 5 closed. Autodev state advanced for cycle 6 by the next
/autodev invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-12 18:07:57 +03:00

Retrospective — Cycle 5 (2026-05-12)

Tasks: AZ-503-foundation (tile identity: UUIDv5 + integer UPSERT foundation, 3 SP) + AZ-504 (perf-script grep-pipefail fix, 1 SP). The original AZ-503 spec (5 SP combined) was split mid-implement into AZ-503-foundation (this cycle) + AZ-505 (5 SP; inventory endpoint + HTTP/2 + Leaflet covering index, blocked-linked to AZ-503-foundation, deferred to cycle 6).

Mode: cycle-end (autodev Step 17)

Previous retro: retro_2026-05-12_cycle4.md

Cycle shape: small-foundation cycle. First schema-changing cycle since cycle 1's AZ-484; first cycle to ship a contract minor-version bump; first cycle with measurable PT-08 batches (two clean runs).

1. Implementation Metrics

Metric	Cycle 5	Δ vs cycle 4
Tasks implemented	2 (AZ-504, AZ-503-foundation)	+1
Batches executed	2	+1
Avg tasks / batch	1.0	unchanged
Total complexity delivered	4 SP (1 + 3)	-1 SP
Avg complexity / batch	2 SP	-3 SP
Tasks at-or-below 5 SP cap	2 of 2 (100%)	unchanged
Tasks split mid-cycle into a follow-up PBI	1 of 2 (AZ-503 → AZ-503-foundation + AZ-505)	new (cycle 4 had no splits)
Tasks above cap	0	unchanged
Cumulative reviews	0 (cumulative-review trigger is every 3 batches; cycle 5 has 2 batches, so no trigger)	unchanged

Sequencing: 2 batches. AZ-504 went first (smallest mechanical fix; landed in batch 01) and AZ-503-foundation second (batch 02). This ordering put the AZ-504 fix in place before the AZ-503-foundation tests ran (some of which use the perf script's regression-tested helpers). The cycle completed in 6 dev commits (Step 9 task specs, both batches, the autodev-state chore, Steps 11-15 sync, and Step 16 deploy), two more than cycle 4's 4-commit count, because of the scope-split conversation and the two-batch structure.

2. Quality Metrics

Code Review Results

Verdict	Count	Percentage
PASS	1 (batch 01 AZ-504)	50%
PASS_WITH_WARNINGS	1 (batch 02 AZ-503-foundation)	50%
FAIL	0	0%

Findings by Severity (per-batch code review)

Severity	Cycle 5	Δ vs cycle 4
Critical	0	unchanged
High	0	unchanged
Medium	0	-2 (cycle 4 had 2 bump-consequence Mediums)
Low	1 (F-cy5 batch02 — soft-NULL guard on `contentSha256` when `File.Exists==false`; practically unreachable in happy path)	unchanged

Findings by Category

Category	Count	Top Files
Maintainability	1	`SatelliteProvider.Services.TileDownloader/TileService.cs` (BuildTileEntity)
Bug	0	—
Spec-Gap	0	—
Security	0 NEW code-review (2 NEW informational Lows in `_docs/05_security/` — F1-cy5 flightId provenance, F2-cy5 pgcrypto runbook gap — both long-term, not code defects)	-3 informational vs cycle 4
Performance	0	—
Style	0	—
Scope	0	-1 (cycle 4 had 1 — the perf-script path fix)

Note on the 1 Low finding: the contentSha256 soft-NULL guard is defensive against transient I/O failure between the JPEG-write and the SHA256-compute steps. The downloader writes the file before the SHA256 read, so in practice the NULL branch is unreachable. The column is NULLable at the DB level. Tightening to throw-on-missing-file is recommended as a follow-up if downstream consumers ever rely on NOT NULL.

Security audit (cycle 5)

Metric	Value	Δ vs cycle 4
Verdict	PASS_WITH_WARNINGS	unchanged (cumulative pos.)
Mode	Delta (full re-scan of new/modified code in AZ-503/AZ-504 against OWASP Top 10 + dependency manifest diff + infrastructure-change check)	new this cycle (cycle 4 was "Resume narrowed to dependency_scan only")
New Critical / High	0 / 0	unchanged
New Medium	0	unchanged
New Low (informational only)	2 (F1-cy5: `metadata.flightId` not authenticated provenance — long-term recommendation when an authoritative flight registry exists; F2-cy5: `pgcrypto` deployment runbook gap on managed Postgres — captured in `deploy_cycle5.md` operator runbook)	-3 vs cycle 4 (5 informational confirmations)
Resolved findings	0	-2 (cycle 4 forward-resolved 2 cycle-3 carry-overs via major-version bumps; cycle 5 made no package bumps)
Carry-overs (still OPEN)	3 (cycle-3 D2 `Microsoft.NET.Test.Sdk 17.8.0` transitive flag; cycle-4 D-IdentityModel-7.0.3 — both `Microsoft.IdentityModel.Tokens` and `System.IdentityModel.Tokens.Jwt` in TestSupport; `Serilog.AspNetCore 8.0.3` fallback)	unchanged

SHA-1 in Uuidv5.cs was explicitly assessed and cleared: RFC 9562 §5.5 mandates SHA-1 for namespace-based UUIDv5; the algorithm is not used as a cryptographic hash for security (no collision-resistance against an adversary is claimed); the input space (tile coordinates × namespace) is not adversary-controlled in any reasonable threat model. This is the standard library-vetted construction for deterministic IDs — flagged explicitly in static_analysis_cycle5.md so it does not get re-flagged in future audits.

Performance gate (cycle 5)

Metric	Value
Verdict	PASS_WITH_INFRA_WARNINGS
Scenarios	6 Pass · 0 Warn · 2 Fail · 0 Unverified (across Run #2 — Run #1 was 5 Pass + 3 Fail before `colima restart`)
AZ-504 NFR (PT-08 reaches summary)	MET — first cycle ever where PT-08 returns a measured batch p95. PT-08 ran clean across both runs: Run #1 batch p95 = 199ms, Run #2 batch p95 = 117ms (vs 2000ms AZ-488 threshold).
AZ-503-foundation NFR (no UPSERT hot-path regression)	MET — PT-08 uses the new integer-key + flight-aware UPSERT for 200 batches; 200/200 accepted, 0 rejected, 0 failed; p95 117ms is faster than any prior cycle's measurement of this path.
Cycle-3 perf-harness leftover	STAYS OPEN — replay #5 documented. The AZ-504 fix is verified working across two perf runs, but the "default-parameter exit-0 run" criterion is blocked by recurring local Docker/colima DNS cold-start (PT-01 fails on `mt0/tile.googleapis.com` DNS lookup at the first request after each `docker compose up`). Two concrete closure paths recorded in the leftover (DNS pre-warm in script, OR move perf gate to CI).

3. Structural Metrics (snapshot: `structure_2026-05-12_cycle5.md`)

Metric	Cycle 5	Δ vs cycle 4
.NET projects (csproj)	9	unchanged
Cross-project edges (ProjectReference)	21	+1 (IntegrationTests → Common — justified by cross-repo invariant deduplication)
Cycles in project graph	0	unchanged
Average ProjectReferences per component	~2.3	+0.1
Max in-degree (Common)	7	+1 (now also imported by IntegrationTests)
New Architecture violations	0	unchanged
Resolved Architecture violations	0	-2 vs cycle 4 (cycle 4 forward-resolved 2 via the .NET 10 bump; cycle 5 had no bumps)
Net Architecture delta	0	unchanged (cycle 4 was also 0)
Public-API contract delta	+1 minor (`uav-tile-upload.md` 1.0.0 → 1.1.0 additive)	new this cycle (cycle 4 had no contract changes)
Database schema delta	+4 columns, +2 indexes, -2 indexes, +1 extension (`pgcrypto`)	new this cycle (cycle 4 had no schema changes)
NuGet package bumps	0	-16 (cycle 4 coordinated 16 distinct package bumps)

Cycle 5 structural posture: one schema migration, one minor contract bump, one new cross-component utility (Uuidv5), one new IntegrationTests → Common edge. Zero NuGet bumps, zero csproj additions, zero new public-API endpoints (the inventory endpoint is deferred to AZ-505). The DAG remains acyclic.

4. Efficiency Metrics

Metric	Cycle 5	Δ vs cycle 4
Blocked tasks (during implementation)	0 of 2	unchanged
Tasks split mid-implement	1 of 2 (AZ-503 → AZ-503-foundation + AZ-505 via A/B/C decision at /autodev Step 10)	new this cycle
Tasks completed first attempt (no post-review fix commits)	2 of 2 (100%)	unchanged (cycle 4 also 100%)
Tasks requiring multiple post-code-review fix commits	0	unchanged
Most-findings batch	batch 02 (AZ-503-foundation — 1 Low; batch 01 had 0 findings)	similar shape
Step-15 (Perf Test) execution	EXECUTED twice (Run #1 + Run #2; Run #2 is the better signal post `colima restart`)	unchanged (cycle 4 also executed Step 15)
Step-15 leftover at end of cycle	YES (still — same cycle-3 leftover) — but the AZ-504 verification half is satisfied this cycle; only the "infra-noise-free exit-0 run" half remains.	unchanged in identity, narrower in remaining gap
Step-14 (Security Audit) — net findings improvement	0 new Medium+, 2 new informational Lows, 0 resolved	unchanged net direction
Number of A/B/C decision points hit during autodev	3 (Step 10 scope split, Step 15 Run #1 gate, Step 15 Run #2 gate)	+3 vs cycle 4's 0 in-cycle A/B/C points

5. Patterns Identified

Pattern 1 — Spec-contradiction-driven A/B/C split at /autodev Step 10 is now a known shape

When AZ-503-implementation started, three contradictions surfaced against the live codebase: flight_id did not exist as a column, FlightId did not exist as a DTO field, and voting_status did not exist as referenced by an AC. The combined work needed to make the spec executable as written was ~5 SP — above the cycle-policy cap. Per meta-rule.mdc "Critical Thinking" + autodev/protocols.md A/B/C scope-protection, the implement skill stopped, surfaced the contradictions, and offered three options (A: implement the spec as-written; B: defer the whole task; C: split into foundation + follow-up — which the user picked).

The split produced a self-contained, testable, 3 SP foundation PBI that this cycle delivered cleanly, and a 5 SP AZ-505 follow-up that is now blocked-linked in the Jira graph. This is the first cycle where a scope-split mid-implement happened and landed cleanly without losing test coverage continuity.

Insight: when a /autodev cycle's task spec drifts from the live codebase by more than ~1 missing prerequisite, the split-into-foundation pattern is preferable to either (a) silently expanding the cycle's SP budget or (b) deferring the entire PBI. The foundation captures the prerequisite infrastructure the follow-up needs; the follow-up captures the user-facing capability. Both halves remain individually shippable and individually testable.

Pattern 2 — First measurable PT-08 batch in the project's history

PT-08 (UAV batch upload p95) was scenario-spec'd in cycle 2 (AZ-488) but never produced a measurable batch number until this cycle. Both earlier attempts crashed before the summary (cycle 3 short variant: PT-08 crashed at line 417; cycle 4 full run: same crash); cycle 5 Run #1 and Run #2 both exited cleanly to the summary with 200/200 accepted. The AZ-504 fix (grep -c … || true) is the single change that unblocked this.

The captured numbers are: Run #1 batch p95 = 199ms, Run #2 batch p95 = 117ms, both far under the 2000ms AZ-488 threshold. Per-item proxy = batch p95 / batch_size = 11–19ms, far under the AZ-492 per-item gate.

Insight: a 1-SP mechanical script fix unblocked a previously-unmeasurable NFR that had been carried as a leftover across three cycles. The leftover replay protocol kept the issue visible without blocking forward progress; the eventual fix took 15 minutes once it became the cycle's smallest atomic PBI. This is a textbook case for the "leftover replay" mechanism in tracker.mdc — keep small unblocked-but-not-yet-fixed items visible, close them when they fit naturally in a cycle's scope.
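The failure mode behind the AZ-504 fix can be reproduced in isolation. The following is a minimal sketch, not the actual contents of scripts/run-performance-tests.sh: the log file and the `accepted` / `rejected` patterns are hypothetical. It relies only on documented behaviour: `grep -c` prints `0` and exits with status 1 when nothing matches, which aborts a script running under `set -e` unless the status is neutralised.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a perf-run log; the real script's inputs differ.
log_file="$(mktemp)"
printf 'accepted\naccepted\n' > "$log_file"

# grep -c prints the count but exits 1 when the count is 0.
# Without "|| true" the zero-match case would abort the script here.
rejected="$(grep -c 'rejected' "$log_file" || true)"
accepted="$(grep -c 'accepted' "$log_file" || true)"

echo "accepted=${accepted} rejected=${rejected}"
rm -f "$log_file"
```

Using `|| true` rather than `|| echo 0` matters in this sketch: grep has already printed `0`, so a second `echo 0` would put two lines into the captured value.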

Pattern 3 — Local Docker/colima DNS cold-start is a recurring class of "infra-noise" failure that contaminates the perf gate

Cycle 4's perf gate noted DNS resolution intermittence as an unrelated test flake. Cycle 5 hit the same class three times in the same session: first during the functional test phase (resolved by colima restart), then during Step 15 Run #1 (manifested on tile.googleapis.com), then again during Step 15 Run #2 (manifested on mt0.google.com; the warmup probe between Run #1 and Run #2 only touched tile.googleapis.com + mt1.google.com).

The pattern is: the first Google-Maps tile fetch immediately after docker compose up may fail DNS resolution if the colima resolver hasn't pre-cached the specific hostname yet. Subsequent fetches succeed. PT-01 (cold tile download) is the first scenario that hits a Google-Maps hostname and therefore is the canary for this class of failure.

Insight: this is not an application regression — it is environment instability that the perf script does not currently shield itself against. Two concrete mitigations are recorded in the cycle-3 leftover (replay #5 entry) as out-of-scope follow-up PBIs:

Add a DNS pre-warm step to scripts/run-performance-tests.sh before PT-01 (1 SP, deterministic fix);
Move the perf gate to CI / a stable-resolver environment (2 SP, structural fix).

The pattern is worth surfacing as a Lesson — the perf-mode skill (test-run/SKILL.md) already says "always worth ONE re-run before declaring a regression", and we did one; but the underlying environment continues to be the cycle's single largest source of non-application gate noise.

Pattern 4 — Cross-repo cryptographic invariants belong in a code-level constant, not a doc reference

AZ-503 introduces a TileNamespace UUID (5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c) that MUST byte-match the same constant in gps-denied-onboard/components/c6_tile_cache/_uuid.py (sibling workspace), or every cross-repo tileId lookup silently misses. The constant lives as a pinned public const string TileNamespace in SatelliteProvider.Common.Utils.Uuidv5 and is asserted by the Uuidv5Tests reference-vector unit tests. The sibling workspace is documented to mirror the same constant.

Insight: when a cross-repo invariant is a magic constant (a UUID, a base32 alphabet, a tile-zoom convention), it must live in code in BOTH repos with reference-vector tests on BOTH sides. Documentation alone (e.g., "see contract foo.md") is not enough — a drift between the constants would only surface as a 100% lookup-miss in production, which is harder to detect than a unit-test failure. Cycle 5 captured this on the satellite-provider side; the sibling-repo side will be handled when AZ-505 lands.
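As a complement to the reference-vector tests, a textual drift check between the two repos could run in CI. This is only a sketch under stated assumptions: `first_uuid` and `check_namespace_drift` are hypothetical helpers, the caller supplies both file paths, and each file is assumed to contain the namespace as its first UUID literal. Case is normalised because UUID parsers generally accept either case.

```shell
#!/usr/bin/env bash
set -euo pipefail

UUID_RE='[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}'

# Print the first UUID literal found in a file, lowercased; fail if none exists.
first_uuid() {
  local found
  found="$(grep -oE "$UUID_RE" "$1" | head -n 1 | tr 'A-F' 'a-f' || true)"
  [ -n "$found" ] || { echo "no UUID literal in $1" >&2; return 1; }
  printf '%s\n' "$found"
}

# Compare the namespace literal in two source files (hypothetical helper).
check_namespace_drift() {
  local a b
  a="$(first_uuid "$1")" || return 1
  b="$(first_uuid "$2")" || return 1
  if [ "$a" != "$b" ]; then
    echo "TileNamespace drift: $1 has $a, $2 has $b" >&2
    return 1
  fi
  echo "TileNamespace matches: $a"
}

# Example call (paths are assumptions about the checkout layout):
# check_namespace_drift path/to/Uuidv5.cs \
#   ../gps-denied-onboard/components/c6_tile_cache/_uuid.py
```

A check like this only catches edits to the literal itself. Differences in how each side builds the UUIDv5 name string would still need the reference-vector tests on both sides.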

Pattern 5 — Schema migrations + `pgcrypto` create a deploy-side dependency that the runbook must spell out

Migration 014 enables pgcrypto (CREATE EXTENSION IF NOT EXISTS pgcrypto;) for the session-scoped backfill function. On stock Postgres 16 (our postgres:16 Docker image), pgcrypto is bundled. On managed cloud Postgres providers (RDS, Cloud SQL, Azure Postgres), the migration-running role typically needs cloudsqlsuperuser / rds_superuser / equivalent to CREATE EXTENSION. If that privilege is absent, migration 014 fails and the application doesn't start.

This was flagged as F2-cy5 (Low informational) during the security audit, recorded in deploy_cycle5.md operator runbook step 2, and explicitly called out as a recommended out-of-scope follow-up PBI. It is NOT a code defect — it is a deploy-runbook gap.

Insight: cycles that add CREATE EXTENSION / CREATE ROLE / ALTER SYSTEM / other privilege-sensitive statements to migrations must produce a deploy-runbook update in the same cycle, even when the cycle is otherwise small. The information is most visible at code-write time; if the runbook update is deferred, it tends to get lost.

6. Comparison vs. previous retros

Metric	Cycle 1	Cycle 2	Cycle 3	Cycle 4	Cycle 5
Tasks implemented	1	2	6	1	2
Total complexity delivered	8 SP	10 SP	18 SP	5 SP	4 SP
Batches	1	2	5	1	2
Critical/High review findings	0	0	0	0	0
New Medium review findings	0	0	0	2 (bump conseq.)	0
New Low review findings	3	6 (5 distinct)	7	1	1
Code review pass rate	100% (1/1)	100% (2/2)	100% (5/5)	100% (1/1)	100% (2/2)
Tasks completed first attempt	0 of 1	0 of 2	5 of 6	1 of 1	2 of 2
Tasks split mid-implement	0	0	0	0	1
New Medium security findings	2	2	0	0	0
Resolved security findings	0	0	3	2 (fwd-bump)	0
Net Architecture delta	n/a (baseline)	+0	-3	0	0
Schema change	YES (AZ-484)	NO	NO	NO	YES (AZ-503-foundation)
Public-API contract change	YES (tile-storage v1.0.0)	YES (uav-tile-upload v1.0.0)	NO	NO	YES (uav-tile-upload v1.0.0 → v1.1.0 additive)
Step-15 (Perf) executed	YES	SKIPPED	SKIPPED	YES	YES (Run #1 + Run #2)
Step-15 leftover at retro	NO	YES	YES	YES	YES (still — same cycle-3 leftover; AZ-504 verification half satisfied)
First measurable PT-08 batch	n/a	n/a	n/a	NO (script crash)	YES (Run #1 199ms, Run #2 117ms)

Did the cycle-4 actions land?

Cycle 4 Action 1 (fix scripts/run-performance-tests.sh:416-417 grep-pipefail) — LANDED as AZ-504 in cycle 5 (this cycle). Closes Pattern 2 from cycle 4's retro. Verified working across two perf runs.
Cycle 4 Action 2 (migrate WithOpenApi(...) callsites to ASP.NET Core 10 minimal-API metadata extensions, 3 SP) — NOT landed in cycle 5. Explicitly out of AZ-503/AZ-504 scope per coderule.mdc "scope discipline". Re-listed as a recommended follow-up PBI in deploy_cycle5.md. Carries to cycle 6.
Cycle 4 Action 3 (pre-flight transitive-major-version impact analysis at task-spec time) — NOT directly exercised in cycle 5. AZ-503 and AZ-504 had zero NuGet bumps, so the rule was preventive only. It still lives in coderule.mdc (added in cycle 4) and will fire on the next package-bumping PBI.

This is the third consecutive cycle where a prior-retro action landed (Action 1 here, Action 1 in cycle 4, Actions 1+2+3 in cycle 3). Pattern is stable.

Did the cycle-3 actions land?

Cycle 3 Action 1 (execute perf harness against deployed dev image) — landed in cycle 4 (implicit AZ-500 NFR gate). Closed.
Cycle 3 Action 2 (bump System.IdentityModel.Tokens.Jwt 7.0.3 → 7.1.2+) — NOT landed in cycle 5. Carry-over D-IdentityModel-7.0.3 still open. Test-runtime exposure only; safe to land independently.
Cycle 3 Action 3 (workspace: field on cross-repo ACs in new-task skill) — NOT exercised. AZ-503 has one cross-workspace invariant (the TileNamespace UUID) but that invariant is captured as a Constraint in the task spec body rather than as a per-AC workspace: tag. The rule is still not codified in new-task/SKILL.md. AZ-505 next cycle has explicit cross-repo writes (the gps-denied-onboard side of the UUIDv5 namespace constant); the cycle-3 rule should land before AZ-505 writes its spec.

7. Top 3 Improvement Actions (ranked by impact)

Action 1 — Add DNS pre-warm to `scripts/run-performance-tests.sh` before PT-01 (1 SP, deterministic)

Why this is the highest impact: Pattern 3 above documents the same class of failure across cycles 4 + 5. It contaminates the perf gate with non-application noise, blocks closure of the cycle-3 perf-harness leftover (now in its third carry-over cycle), and forces a manual colima restart + re-run per session. A deterministic DNS pre-warm before PT-01 fires removes the entire failure class on the local-dev runner.

Action: 1 SP PBI in cycle 6. Insert a getent hosts mt0.google.com mt1.google.com mt2.google.com mt3.google.com tile.googleapis.com (or equivalent for the runner's resolver) inside the api container immediately before PT-01 — fail-fast if any hostname is unresolvable after a small retry window. Closes the cycle-3 perf-harness leftover on the next run.

Cost: ~20 minutes (one shell addition + one perf run to verify exit-0 + delete the leftover). Counted as 1 SP because deletion of the leftover requires a full clean run.
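One possible shape for the pre-warm step is sketched below. It is not the landed implementation: `prewarm_dns`, the `RESOLVER` override, and the retry values are all hypothetical. `getent hosts` is the default resolver command, as proposed in the action; runners without `getent` would substitute an equivalent.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Resolver command; defaults to `getent hosts`, overridable for other runners.
RESOLVER="${RESOLVER:-getent hosts}"

# prewarm_dns ATTEMPTS DELAY_SECONDS HOST...
# Resolves each host, retrying up to ATTEMPTS times; fails fast on the first
# host that never resolves.
prewarm_dns() {
  local attempts="$1" delay="$2" host try
  shift 2
  for host in "$@"; do
    try=1
    until $RESOLVER "$host" >/dev/null 2>&1; do
      if [ "$try" -ge "$attempts" ]; then
        echo "DNS pre-warm failed: ${host}" >&2
        return 1
      fi
      try=$((try + 1))
      sleep "$delay"
    done
  done
}

# Example call before PT-01 (retry values are illustrative):
# prewarm_dns 5 2 mt0.google.com mt1.google.com mt2.google.com \
#   mt3.google.com tile.googleapis.com
```

Because the lookups have to warm the resolver that the tile fetches actually use, the call would need to run in the same network context as the api container, as the action states.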

Action 2 — Implement AZ-505 (deferred AZ-503 half: inventory endpoint + HTTP/2 + Leaflet covering index, 5 SP)

Why: AZ-505 is blocked-linked to AZ-503-foundation. The foundation (this cycle) shipped the deterministic identity, the integer-key UPSERT, and the per-flight layout. AZ-505 ships the user-facing inventory endpoint (POST /api/satellite/tiles/inventory) that lets consumers ask "given these N (lat,lon,zoom) coordinates, which tileIds + variants do you have?". AZ-505 also enables HTTP/2 on the API (required for batched inventory responses without TCP head-of-line blocking) and rewrites the Leaflet hot path against the new location_hash index.

Action: 5 SP PBI in cycle 6. Foundation prerequisites are now in place. AZ-505 also carries the cross-repo write obligation for the gps-denied-onboard side of the UUIDv5 namespace constant.

Cost: 5 SP — within the cycle cap. Foundation is done; this is the user-facing payload that justifies the schema work.

Action 3 — Add a `pgcrypto` pre-install check step to the deployment runbook (1 SP, ops-side)

Why: Pattern 5 above. Migration 014 silently relies on pgcrypto being installable by the migration-running role. Stock Postgres 16 is fine; managed cloud Postgres providers may not be. F2-cy5 already captures this in _docs/05_security/owasp_review_cycle5.md and deploy_cycle5.md operator runbook step 2 — but a runbook step that says "check this before running the migration" is necessary if the project ever migrates off Docker-postgres for a non-dev environment.

Action: 1 SP PBI in cycle 6 (or land as a doc-only PR within cycle 6's scope-discipline budget). Update the deployment runbook with a pre-migration SELECT extname FROM pg_extension WHERE extname = 'pgcrypto' check (pg_extension lists only extensions already installed in the current database), plus a fallback path if the extension is missing.

Cost: ~30 minutes of doc work. Counted as 1 SP because it touches the deployment runbook (cross-cutting infra doc).
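A runbook pre-check along these lines could look like the sketch below. `check_pgcrypto` and the `PSQL` override are hypothetical. The second query against `pg_available_extensions` is an addition not mentioned in the audit notes; it is included on the assumption that the runbook should distinguish "installed" from "installable". Connection settings are assumed to come from the standard PG* environment variables.

```shell
#!/usr/bin/env bash
set -euo pipefail

# psql binary; overridable so the check can be stubbed or pointed elsewhere.
PSQL="${PSQL:-psql}"

# Hypothetical pre-migration gate: succeeds if pgcrypto is installed or
# at least available for installation on the target server.
check_pgcrypto() {
  local installed available
  installed="$($PSQL -X -A -t -c "SELECT count(*) FROM pg_extension WHERE extname = 'pgcrypto'")"
  if [ "$installed" = "1" ]; then
    echo "pgcrypto already installed"
    return 0
  fi
  available="$($PSQL -X -A -t -c "SELECT count(*) FROM pg_available_extensions WHERE name = 'pgcrypto'")"
  if [ "$available" = "1" ]; then
    echo "pgcrypto available; migration role still needs CREATE EXTENSION privilege"
    return 0
  fi
  echo "pgcrypto not available on this server" >&2
  return 1
}
```

The "available" branch does not prove the migration role may run CREATE EXTENSION; that privilege check stays a manual runbook item in this sketch.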

8. Suggested Rule / Skill updates

File	Change	Rationale
`coderule.mdc` (new bullet in scope-discipline section)	"When a migration adds a `CREATE EXTENSION` / `CREATE ROLE` / `ALTER SYSTEM` / other privilege-sensitive statement, the same cycle's deploy report MUST add a pre-migration runbook step that verifies the privilege exists in the target environment. The runbook step is required even when the cycle is otherwise small."	Pattern 5 (pgcrypto in migration 014 created a deploy-runbook gap that was caught at security-audit time, not at migration-write time)
`new-task/SKILL.md` (new check in Step 5 — Risks & Mitigation)	When a task spec introduces a cross-repo cryptographic invariant (a UUID namespace, a base32/64 alphabet, a tile-zoom convention, a deterministic-key formula), the spec MUST list both the in-workspace code location of the constant AND the sibling-workspace code location it must byte-match — with reference-vector tests on both sides. Doc references alone do not satisfy this.	Pattern 4 (`TileNamespace` UUID must byte-match `gps-denied-onboard/components/c6_tile_cache/_uuid.py`)
`autodev/protocols.md` (formalise Step-10 contradiction-driven A/B/C)	The "scope-split" branch of the scope-protection A/B/C choice should be a first-class named option, not an ad-hoc decision. When a /autodev cycle's task spec contradicts the live codebase by >=2 prerequisites, the implement skill should preferentially offer A: implement as-written / B: defer entirely / C: split into foundation + follow-up with C being the recommended default. Cycle 5 derived this manually; codifying it makes future cycles cheaper to navigate.	Pattern 1 (AZ-503 → AZ-503-foundation + AZ-505 split was the cycle's largest decision, made via ad-hoc Choose; the recommended option matched the documented pattern but the path was discovered, not followed)
`test-run/SKILL.md` (Perf Mode Step 5 — add explicit retrospective trigger)	After the "one re-run" rule fires twice across consecutive cycles with the same root-cause class (e.g., DNS, NTP, resolver), the perf-mode skill should auto-surface a recommended PBI for a deterministic fix at the environment / harness layer — not as a re-run, as a fix. Cycle 5 fired the rule manually here.	Pattern 3 (DNS cold-start hit cycle 4 + cycle 5; the perf-mode skill's "one re-run" already shields against single-incident noise, but doesn't yet escalate when noise recurs)

9. Decision items carried over (operator)

Cycle-3 perf-harness leftover — STAYS OPEN. Replay #5 entry recorded. Half-closed this cycle: AZ-504 script fix verified working across 2 runs; remaining half (full default-parameter exit-0 run) blocked by recurring local DNS noise. Closure path: Action 1 above (DNS pre-warm in script, 1 SP) OR move perf gate to CI/cloud runner.
Admin team iss/aud confirmation (carried from cycles 3 + 4) — still required before promoting beyond dev. Unchanged. Tracked in deploy_cycle3.md + deploy_cycle4.md + deploy_cycle5.md.
D2-cy4 — Microsoft.NET.Test.Sdk 17.8.0 transitive NuGet.Frameworks flag — unchanged. Test-runtime exposure only; safe to land in a future cycle.
D-IdentityModel-7.0.3 (cycle-4 carry-over — both Microsoft.IdentityModel.Tokens and System.IdentityModel.Tokens.Jwt at 7.0.3 in TestSupport, NU1902) — unchanged. Cycle-3 Action 2 obligation; should land before any new auth-touching cycle.
F1-cy5 — metadata.flightId authenticated provenance — long-term, not actionable until an authoritative flight registry exists in the suite. Recorded in owasp_review_cycle5.md and deploy_cycle5.md as a long-term recommendation.
F2-cy5 — pgcrypto deployment runbook gap — Action 3 above. Trivial doc-only fix.
Serilog.AspNetCore 8.0.3 fallback — unchanged; no 10.x line published as of cycle 5. Recheck at every cycle start.
Cross-repo doc suite/_docs/10_auth.md paragraph (cycle-3 carry-over) — unchanged.
workspace: field on cross-repo ACs in new-task/SKILL.md (cycle-3 Action 3, never landed) — must land before AZ-505 task spec is written, since AZ-505 has explicit cross-repo writes (the gps-denied-onboard side of the UUIDv5 namespace constant).

10. What this retro says about process maturity

Cycle 5 is the first cycle that:

Split a task spec mid-implementation into a foundation + follow-up pair that both shipped (foundation in cycle 5, follow-up scheduled as AZ-505). The /autodev step-10 contradiction-driven A/B/C path proved itself end-to-end.
Carried a 3-cycle-old leftover from "completely unmeasurable" to "verified-fixed, exit-0 blocked only by infra noise". The AZ-504 fix is provably working; the remaining blocker is environment, not code.
Shipped a schema migration with a backfill, a contract minor version bump, and a new on-disk path layout — all additive, all backward-compatible with cycle-4 clients — proving the project's documentation pipeline (test-spec sync, doc update, ripple log, security audit, deploy report) now scales to schema-touching cycles, not just runtime/SDK migrations (cycle 4) or single-file refactors (cycles 1-3).
Recorded 3 new infrastructure-level recommendations (DNS pre-warm in perf script, pgcrypto pre-install runbook step, cross-repo invariant rule in new-task skill) — none of them are bugs in the cycle's code, all of them are process gaps the cycle's work shape surfaced. This is the second consecutive cycle where the retro's top action items are predominantly process / harness / runbook, not code defects.

The process continues to converge. The remaining friction points after cycle 5 are (a) local-dev Docker/colima DNS noise that contaminates Step 15 (Action 1), (b) the carry-over package-hygiene PBIs from cycles 3/4 (Test.Sdk, IdentityModel.Tokens.Jwt, WithOpenApi callsites, Serilog 10.x) that have been deferred per scope discipline four cycles in a row, and (c) the cross-repo invariant codification (cycle-3 Action 3) that must land before AZ-505 writes its spec. All are concrete cycle-6 PBI candidates totalling ~12-15 SP, which is one fully-loaded normal cycle — or split across cycles 6 + 7 if AZ-505 alone fills cycle 6.
