[AZ-503] [AZ-504] Cycle 5 Steps 11-15 sync

Wrap up cycle 5 verification + documentation:
- Steps 10/11 wrap-up reports (implementation_completeness +
  implementation_report) for the AZ-503-foundation + AZ-504 batch.
- Step 12 test-spec sync: AZ-503-foundation/AZ-504 ACs appended;
  AZ-505 deferred ACs recorded.
- Step 13 update-docs: architecture, data-model, glossary, module-
  layout, uav-tile-upload contract (v1.1.0), DataAccess + Services
  + Tests module docs synced; new common_uuidv5.md module doc.
- Step 14 security audit: PASS_WITH_WARNINGS; 0 new Critical/High;
  2 new Low informational (F1 flightId provenance, F2 pgcrypto
  deploy gap).
- Step 15 performance test: PASS_WITH_INFRA_WARNINGS; PT-08
  passed twice (AZ-504 fix verified); PT-01/02 failed due to
  recurring local Docker/colima DNS cold-start (not an app
  regression). Cycle-3 perf-harness leftover stays OPEN with
  replay #5 documented.
- Autodev state moved to Step 16 (Deploy).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 18:01:27 +03:00
parent c646aa93e2
commit 61612044fb
27 changed files with 1075 additions and 50 deletions
@@ -130,3 +130,30 @@ PBI opened: **AZ-504 — "Perf script: fix grep | wc -l pipefail crash in PT-08"
The "open the PBI" half of the Replay obligation is now done. The "full perf run is green" half remains outstanding — this leftover stays open until AZ-504 lands AND a default-parameter `./scripts/run-performance-tests.sh` (`PERF_REPEAT_COUNT=20 PERF_UAV_BATCH_SIZE=10`) exits 0 against an api built from `dev`.
Next-cycle /autodev should NOT attempt replay #5 (open another PBI) — AZ-504 is the canonical replay vehicle. The next replay action is implementing AZ-504 itself (cycle 5 Step 10).
## Replay attempt #5 — 2026-05-12T14:34Z / 14:50Z (cycle 5 Step 15 Performance Test gate, post-AZ-504 landed)
AZ-504 landed in cycle 5 (Steps 10-12). User picked A at the Step 15 (Performance Test) gate. Two full default-parameter runs of `./scripts/run-performance-tests.sh` (`PERF_REPEAT_COUNT=20 PERF_UAV_BATCH_SIZE=10`) executed against `docker compose up -d --build`. Full report in `_docs/06_metrics/perf_2026-05-12_cycle5.md`.
| | Run #1 (14:34Z, no prep) | Run #2 (14:50Z, post `colima restart`) |
|---|---|---|
| Exit code | 1 | 1 |
| PT-08 (AZ-504 fix) | **PASS** 199ms p95 | **PASS** 117ms p95 |
| PT-01 (cold tile) | FAIL HTTP 500 `tile.googleapis.com` DNS | FAIL HTTP 500 `mt0.google.com` DNS |
| PT-02 (cached tile) | FAIL HTTP 500 (cascade of PT-01) | FAIL 1060ms (cascade of PT-01) |
| PT-03..PT-07 | mostly PASS once DNS warmed mid-run | all PASS |
**AZ-504 verification: MET across both runs.** PT-08 reaches summary cleanly for the first time across all 5 replay attempts in this leftover. The `grep -c … || true` pipefail fix in `scripts/run-performance-tests.sh:416-417` works as designed.
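For readers unfamiliar with this failure class, the sketch below illustrates why a `grep | wc -l` count can abort a script running under `set -euo pipefail`, and how a `grep -c … || true` style count avoids it. It is a minimal illustration, not the actual lines from `scripts/run-performance-tests.sh`: the helper name `count_matches`, the log contents, and the patterns are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the real script): count lines matching a pattern.
count_matches() {
  local pattern="$1" file="$2"
  # grep -c prints the match count (0 when nothing matches) but exits 1 on
  # zero matches. `|| true` stops errexit/pipefail from aborting the script.
  grep -c -- "$pattern" "$file" || true
}

# Illustrative log with no failing requests in it.
log_file="$(mktemp)"
printf 'status=200\nstatus=200\n' > "$log_file"

# The crashing form would be:
#   failures=$(grep 'status=500' "$log_file" | wc -l)
# With pipefail, grep's exit status 1 (no match) becomes the pipeline status,
# and errexit then terminates the script at the assignment.
failures="$(count_matches 'status=500' "$log_file")"
successes="$(count_matches 'status=200' "$log_file")"

echo "failures=$failures successes=$successes"
rm -f "$log_file"
```

The zero-match case is exactly the healthy case for a failure counter, which is presumably why the crash only surfaced once the scenario itself was passing.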
**AZ-503-foundation regression check: PASS.** PT-08 p95 = 117ms (vs 2000ms threshold; vs cycle-4 ad-hoc 99ms single-batch; vs Run #1 199ms). The new integer-only, flight-aware UPSERT path is faster, not slower.
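As a rough illustration of the comparison above, a p95 can be derived from per-request latencies and checked against the 2000ms threshold as sketched here. This assumes a nearest-rank percentile; the source does not state which method the harness uses. The `p95` function and the sample latencies are made up for the example and are not taken from the cycle 5 report.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads one integer latency (ms) per line on stdin and
# prints the nearest-rank 95th percentile, i.e. the value at ceil(0.95 * n).
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      idx = int((NR * 95 + 99) / 100)   # integer ceil(NR * 0.95)
      print v[idx]
    }'
}

THRESHOLD_MS=2000   # threshold quoted in the text above

# Made-up sample values, for illustration only.
latencies_ms=(120 95 117 101 99 130 110 105 98 102)
p95_ms="$(printf '%s\n' "${latencies_ms[@]}" | p95)"

if (( p95_ms <= THRESHOLD_MS )); then
  verdict="PASS"
else
  verdict="FAIL"
fi
echo "p95=${p95_ms}ms threshold=${THRESHOLD_MS}ms verdict=${verdict}"
```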
**Why this leftover STAYS OPEN despite AZ-504 landing**: the deletion criterion is "the full perf script runs cleanly" / "exit 0". Run #2 exited 1 because of a recurring intermittent Docker/colima DNS cold-start bug — the first Google Maps hostname touched by PT-01 after each `docker compose up` is uncached in colima's resolver, so PT-01 returns HTTP 500. After ~1 retry / a few seconds, all `mt0..mt3.google.com` + `tile.googleapis.com` are warm and every subsequent scenario succeeds. This is **infrastructure noise, not application regression** and not an AZ-504 script bug.
**Two consecutive runs are enough**. Per `meta-rule.mdc`'s "long investigation retrospective" trigger, chasing this with Run #3 / #4 / restarting colima again would be a rabbit-hole. The perf-mode skill (`test-run/SKILL.md` §Perf Mode → Step 5) is explicit: "always worth **one** re-run before declaring a regression" — we did one.
**Recommended out-of-scope follow-ups to actually close this leftover** (estimated 1 SP each, do NOT open in cycle 5 — that violates scope discipline):
1. **Add DNS pre-warmup to `scripts/run-performance-tests.sh`** before PT-01. Inside the api container or via `docker compose exec api`, run `getent hosts mt0.google.com mt1.google.com mt2.google.com mt3.google.com tile.googleapis.com` once. This deterministically removes the cold-DNS class of PT-01 / PT-02 failures.
2. **Run perf in CI / cloud** with a stable resolver — the harness is portable, only the orchestration layer changes.
Either follow-up, when implemented, will produce an exit-0 default-parameter run and let this leftover be deleted. Until then, this leftover stays open with the AZ-504 verification half satisfied and the green-exit-0 half blocked by infra (not the script, not the application).
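Follow-up 1 could look roughly like the sketch below: resolve each tile hostname once, with a small retry budget, before PT-01 starts. This is an assumption-laden draft, not a tested patch. The function name `warm_dns`, the `RESOLVE_CMD` override hook, and the retry count are hypothetical; the compose service name `api` and the hostnames come from the text above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hostnames listed in follow-up 1.
TILE_HOSTS=(mt0.google.com mt1.google.com mt2.google.com mt3.google.com tile.googleapis.com)

# Assumption: the compose service is named `api` and has `getent` available.
# RESOLVE_CMD is a hypothetical hook so the lookup can be swapped out.
RESOLVE_CMD="${RESOLVE_CMD:-docker compose exec -T api getent hosts}"

# Resolve every host, retrying each up to $1 times (default 3).
# Returns 1 if any host never resolved, 0 otherwise.
warm_dns() {
  local attempts="${1:-3}" host try failed=0
  for host in "${TILE_HOSTS[@]}"; do
    try=1
    # Unquoted on purpose so the command string splits into words.
    until $RESOLVE_CMD "$host" >/dev/null 2>&1; do
      if (( try >= attempts )); then
        echo "warm_dns: $host still unresolved after $attempts attempts" >&2
        failed=1
        break
      fi
      try=$(( try + 1 ))
      sleep 1
    done
  done
  return "$failed"
}
```

Calling `warm_dns 3 || echo "DNS warm-up incomplete" >&2` just before PT-01 would keep a failed warm-up from aborting the run while still leaving a trace in the log. Whether a failed warm-up should instead fail fast is a design choice left to whoever picks up the follow-up.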