Oleksandr Bezdieniezhnykh 0e05fc519a [AZ-503] [AZ-504] Cycle 5 Step 16 deploy report
deploy_cycle5.md captures everything operators need to promote
cycle 5 beyond dev:

- Code shipped: AZ-503-foundation (deterministic UUIDv5 tile
  identity, integer-only flight-aware UPSERT, per-flight on-disk
  paths) + AZ-504 (perf script grep-pipefail fix).
- NEW database migration 014_AddTileIdentityColumns.sql adds
  flight_id, location_hash, content_sha256, legacy_id; enables
  pgcrypto; swaps the AZ-484 float index for the new
  idx_tiles_unique_identity integer index. Idempotent under
  DbUp's journal.
- NEW contract version uav-tile-upload.md 1.0.0 → 1.1.0 (adds
  optional flightId; derived tileId in response).
- NEW per-flight on-disk path layout for UAV tiles (additive;
  legacy paths preserved).
- No env-var changes. Container image base unchanged from cycle 4.
- Verification gates passed: PASS (Step 11), PASS (Steps 12+13),
  PASS_WITH_WARNINGS (Step 14), PASS_WITH_INFRA_WARNINGS (Step 15).
- Cycle-3 perf-harness leftover stays OPEN with two clean follow-up
  paths recorded (DNS pre-warm in script, OR move perf gate to CI).
- Operator runbook includes pgcrypto pre-install check for managed
  Postgres providers.

Autodev state advanced to Step 17 (Retrospective).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:01:49 +03:00

# Deploy Report — Cycle 5 (AZ-503-foundation + AZ-504)
**Date**: 2026-05-12
**Cycle**: 5
**Scope**: Two-task cycle —
1. **AZ-503-foundation**: deterministic tile identity (UUIDv5 namespace + content SHA-256), integer-only flight-aware UPSERT, per-flight on-disk paths, DB migration `014_AddTileIdentityColumns.sql` (adds `flight_id`, `location_hash`, `content_sha256`, `legacy_id`; enables `pgcrypto` extension; supersedes the AZ-484 float-based unique index with `idx_tiles_unique_identity`).
2. **AZ-504**: the pre-existing pipefail crash at `scripts/run-performance-tests.sh:416-417` (`grep -o … | wc -l`) is repaired with `grep -c … || true`, which unblocks PT-08 batch summarisation.
The larger AZ-503 scope (`POST /api/satellite/tiles/inventory` endpoint, HTTP/2 enablement, Leaflet covering index) was split into **AZ-505 (next cycle)** at Step 9 to keep cycle 5 within the 25 SP rule.
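
As an illustration of what "deterministic tile identity" means, here is a minimal Python sketch (the service itself is C#). The namespace value, the key string format, and the function name are assumptions made for this example; the real derivation lives in `SatelliteProvider.Common.Utils.Uuidv5` and may differ in every one of those details.

```python
import hashlib
import uuid
from typing import Optional, Tuple

# Hypothetical namespace: the project's real UUIDv5 namespace is not stated in this report.
TILE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example.invalid/tiles")


def derive_tile_identity(
    zoom: int,
    tile_x: int,
    tile_y: int,
    source: str,
    flight_id: Optional[uuid.UUID],
    content: bytes,
) -> Tuple[uuid.UUID, bytes]:
    """Return (tile_id, content_sha256) for one tile.

    The same integer key always yields the same UUIDv5, so re-uploading a tile
    maps to the same row instead of minting a new random id.
    """
    # Assumed key layout; only integers and fixed strings, no floats.
    key = f"{zoom}/{tile_x}/{tile_y}/{source}/{flight_id or 'none'}"
    tile_id = uuid.uuid5(TILE_NAMESPACE, key)
    # 32-byte digest of the tile bytes, matching the width of `content_sha256`.
    content_sha256 = hashlib.sha256(content).digest()
    return tile_id, content_sha256
```

Calling the function twice with the same arguments returns the same `tile_id`; changing only `flight_id` returns a different one, which is the property the flight-aware UPSERT relies on.
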
## What is shipping
### Code changes (committed to `dev`)
| Commit | Subject |
|--------|---------|
| `8e509b5` | `[AZ-503] [AZ-504] cycle 5 new-task: tile identity + perf-script-fix` |
| `ab437a1` | `[AZ-504] Fix grep \| wc -l pipefail crash in PT-08 batch counting` |
| `f619749` | `chore: update autodev state after AZ-504 batch 1` |
| `c646aa9` | `[AZ-503] Tile identity → UUIDv5 + integer UPSERT (foundation)` |
| _pending this commit_ | `[AZ-503] [AZ-504] Cycle 5 Steps 12-15 sync (test-spec / docs / security / perf)` |
| _pending this commit_ | `[AZ-503] [AZ-504] Cycle 5 Step 16 deploy report` |
The four landed commits are on `dev` but NOT YET pushed to `origin/dev` as of this report. Operator runbook step 1 below covers the push.
### Database migration (NEW — operator must coordinate)
**Migration `014_AddTileIdentityColumns.sql`** lands automatically on container startup via the existing DbUp runner (`SatelliteProvider.DataAccess/DatabaseMigrator.cs`). Idempotent — re-running is a no-op.
Schema changes on the `tiles` table:
| Column | Type | Nullability | Purpose |
|--------|------|-------------|---------|
| `flight_id` | `uuid` | NULLable | UAV-source tiles only; null for `google_maps`. Distinguishes per-flight uploads at the UPSERT level. |
| `location_hash` | `bytea` (16 B, MD5 of integer key) | NOT NULL (backfilled deterministically for legacy rows) | Used by `idx_tiles_location_hash` for fast lookups. |
| `content_sha256` | `bytea` (32 B) | NULLable | SHA-256 of tile bytes for content-integrity / future dedup. Populated on new writes; NULL for pre-migration rows. |
| `legacy_id` | `uuid` | NULLable | Captures the pre-AZ-503 random `id` value so external references survive the move to deterministic UUIDv5. NULL for new rows. |
Index changes:
| Change | Index | Notes |
|--------|-------|-------|
| **DROPPED** | `idx_tiles_unique_location` (AZ-484 float-based) | Superseded by the integer-key index below. |
| **CREATED** | `idx_tiles_unique_identity` UNIQUE on `(zoom_level, tile_size_meters, tile_x_int, tile_y_int, source, COALESCE(flight_id, '00000000-0000-0000-0000-000000000000'))` | Resolves UPSERT conflicts. Float-rounding ambiguity from AZ-484 is gone. |
| **CREATED** | `idx_tiles_location_hash` on `location_hash` | Future inventory-endpoint lookups (AZ-505). |
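
The effect of the `COALESCE(flight_id, '00000000-0000-0000-0000-000000000000')` term in `idx_tiles_unique_identity` can be modelled in a few lines of Python. This is only an in-memory sketch of the uniqueness rule, not the application's UPSERT; the helper names and the dict standing in for the `tiles` table are hypothetical.

```python
import uuid
from typing import Dict, Optional, Tuple

# Nil UUID, the stand-in the index uses for a NULL flight_id.
NIL_UUID = uuid.UUID(int=0)


def identity_key(
    zoom_level: int,
    tile_size_meters: int,
    tile_x_int: int,
    tile_y_int: int,
    source: str,
    flight_id: Optional[uuid.UUID],
) -> Tuple:
    """Build the tuple that the unique index compares."""
    effective_flight = flight_id if flight_id is not None else NIL_UUID
    return (zoom_level, tile_size_meters, tile_x_int, tile_y_int, source, effective_flight)


def upsert(table: Dict[Tuple, bytes], key: Tuple, payload: bytes) -> str:
    """Insert a row, or overwrite the row that already holds this key."""
    outcome = "updated" if key in table else "inserted"
    table[key] = payload
    return outcome
```

Under this model, two uploads without a `flightId` for the same cell share one slot, while uploads from different flights occupy separate rows.
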
Extension changes:
- **`pgcrypto`** is enabled by `CREATE EXTENSION IF NOT EXISTS pgcrypto;` at the top of migration 014. It is used during the migration only, for the deterministic backfill of `location_hash` on existing rows. After the backfill, the application code does not call pgcrypto functions; hashes are computed in C# via `System.Security.Cryptography.SHA256` and `SatelliteProvider.Common.Utils.Uuidv5`. **Pre-deploy ops check**: on managed Postgres providers (RDS, Cloud SQL, Azure Postgres), confirm that `pgcrypto` is on the list of extensions the provider's admin role (`rds_superuser` / `cloudsqlsuperuser`) may install. On stock Postgres 16 (our `docker-compose.yml` uses `postgres:16`), it is bundled. See F2-cy5 in `_docs/05_security/owasp_review_cycle5.md`.
Backward compatibility:
- **Reads** of legacy rows continue to work — they have NULL `flight_id`, `content_sha256`, `legacy_id`. The new index treats NULL `flight_id` as `'00000000-...-0000'` via `COALESCE`, so legacy `(google_maps, …)` rows are still uniquely keyed.
- **Writes** of new google-maps tiles continue to work unchanged — the AZ-503-foundation change preserved the existing producer path; only the UAV path was made flight-aware.
- **No rename of any existing column or table** — the change is purely additive + index swap. Per `coderule.mdc`: "Do not rename any databases or tables or table columns without confirmation."
### Configuration changes (operator must verify before promoting)
| Setting | Was | Now | Source |
|---------|-----|-----|--------|
| **No new env vars introduced.** | — | — | Cycle 5 carries forward the cycle-4 env contract verbatim (`JWT_SECRET ≥ 32B`, `JWT_ISSUER`, `JWT_AUDIENCE`, `GOOGLE_MAPS_API_KEY`). |
| Postgres extension | (none required) | **`pgcrypto` must be installable** by the migration-running role (typically the app's DB owner). Stock Postgres 16: pre-bundled. Managed cloud Postgres: verify per provider docs. | AZ-503 migration `014_AddTileIdentityColumns.sql` (line 1). |
| On-disk path layout for UAV tiles | `./tiles/uav/{zoom}/{x}/{y}.jpg` (legacy) | **`./tiles/uav/{flightId or 'none'}/{zoom}/{x}/{y}.jpg`** — flight-aware sub-directory | `SatelliteProvider.Services/Handlers/UavTileUploadHandler.cs` + `_docs/02_document/contracts/api/uav-tile-upload.md` v1.1.0. **Operator note**: existing legacy UAV tiles on disk under `./tiles/uav/{zoom}/...` are NOT moved — only new uploads use the per-flight tree. No backfill of files is performed (intentional — see AZ-503 ripple log). |
| Container image (`api` service) | `mcr.microsoft.com/dotnet/aspnet:10.0` (cycle-4 baseline) | **unchanged** (`mcr.microsoft.com/dotnet/aspnet:10.0`) | No Dockerfile, no `.woodpecker/*.yml`, no `scripts/run-tests.sh` changes this cycle. |
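
The two on-disk layouts in the table above can be compared with a small Python sketch. The function names are hypothetical, and rendering the flight UUID in its default lowercase hyphenated form is an assumption; the authoritative logic is in `UavTileUploadHandler.cs`.

```python
import uuid
from pathlib import PurePosixPath
from typing import Optional


def uav_tile_path(root: str, zoom: int, x: int, y: int,
                  flight_id: Optional[uuid.UUID]) -> PurePosixPath:
    """Per-flight layout used for new uploads: uav/{flightId or 'none'}/{zoom}/{x}/{y}.jpg."""
    flight_dir = str(flight_id) if flight_id is not None else "none"
    return PurePosixPath(root) / "uav" / flight_dir / str(zoom) / str(x) / f"{y}.jpg"


def legacy_uav_tile_path(root: str, zoom: int, x: int, y: int) -> PurePosixPath:
    """Pre-cycle-5 layout; existing files stay here and are not moved."""
    return PurePosixPath(root) / "uav" / str(zoom) / str(x) / f"{y}.jpg"
```

Because the first directory under `uav/` is either a UUID or the literal `none` in the new layout and a zoom level in the legacy one, both trees can sit under the same root.
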
### Contract changes (consumer-visible)
| Contract | Version | Change | Action for consumers |
|----------|---------|--------|----------------------|
| `POST /api/satellite/tiles/uav` (`uav-tile-upload.md`) | **1.0.0 → 1.1.0** | Adds optional `metadata.flightId: uuid?` field on each tile item. Adds derived `tileId` (deterministic UUIDv5) to the response. | **Additive only**: existing clients that don't send `flightId` continue to work — they get the `flight_id=null` UPSERT slot (same as cycle 4 behaviour). Clients ingesting tiles from multiple flights into the same lat/lon/zoom cell SHOULD start sending `flightId` to avoid cross-flight collisions. |
| (no other contract changed) | — | — | — |
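
To show why the 1.1.0 change is additive, the following Python sketch reads the optional `metadata.flightId` from one tile item and falls back to `None` when it is absent. Only the `metadata.flightId` field name comes from the contract; the dict shape of the item and the helper name are assumptions for illustration.

```python
import uuid
from typing import Any, Dict, Optional


def parse_flight_id(tile_item: Dict[str, Any]) -> Optional[uuid.UUID]:
    """Return the flight id of a tile item, or None for clients that do not send one."""
    metadata = tile_item.get("metadata") or {}
    raw = metadata.get("flightId")
    # uuid.UUID raises ValueError on malformed input, which a caller could map to a 400.
    return uuid.UUID(raw) if raw is not None else None
```

A 1.0.0 client that omits the field gets `None`, which corresponds to the `flight_id=null` UPSERT slot described in the table.
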
### Container image
- **Source**: `SatelliteProvider.Api/Dockerfile` multi-stage build, base `mcr.microsoft.com/dotnet/aspnet:10.0`, **unchanged from cycle 4**.
- **Verification on dev workstation (local)**: `docker compose up -d --build` succeeded twice this cycle (functional test run + perf Run #2). API healthy on `:18980`. Migration 014 ran cleanly the first time; second `up` correctly reported "No new scripts need to be executed" via DbUp's journal. Verified at the start of Step 11 (functional tests) and Step 15 (performance Run #2).
- **Verification on CI**: pending — the Step-12/13/14/15 sync commit + this deploy report commit have not yet been pushed. Operator action: after push, confirm the next Woodpecker `01-test` + `02-build-push` runs on `dev` succeed before promoting.
- **Multi-arch**: unchanged from cycle 4 (Microsoft publishes `aspnet:10.0` as a multi-arch image).
## Verification gates passed in this cycle
| Gate | Result | Evidence |
|------|--------|----------|
| Step 11 — Functional test suite | **PASS** | All unit + integration tests green after a `colima restart` mid-run for an unrelated transient DNS hiccup. `_docs/03_implementation/implementation_report_tile_identity_uuidv5_cycle5.md` |
| Step 12 — Test-Spec Sync | **PASS** | `_docs/02_document/tests/traceability-matrix.md` and `blackbox-tests.md` updated with AZ-503 (foundation) + AZ-504 ACs; AZ-503-full (now AZ-505) deferred ACs are recorded as "Deferred to AZ-505". |
| Step 13 — Update Docs | **PASS** | 15 doc files synced + 1 new module doc (`common_uuidv5.md`) + `_docs/02_document/ripple_log_cycle5.md`. Architecture, data-model, glossary, module-layout, UAV tile-upload contract (v1.1.0), and DataAccess + Services + Tests module docs all reflect AZ-503-foundation. |
| Step 14 — Security Audit | **PASS_WITH_WARNINGS** | `_docs/05_security/security_report_cycle5.md`; 0 new Critical/High; 0 new Medium; 2 new Low informational findings (F1-cy5 `metadata.flightId` provenance — long-term recommendation; F2-cy5 `pgcrypto` deploy-runbook gap — captured above in this report). Cycle-4 D2-cy4 (`Microsoft.NET.Test.Sdk` transitive flag) still open per scope. |
| Step 15 — Performance Test | **PASS_WITH_INFRA_WARNINGS** | `_docs/06_metrics/perf_2026-05-12_cycle5.md`. PT-03..PT-08 PASS across two runs; PT-08 (the AZ-504 fix target) PASSED both runs with 200/200 batches accepted and p95 = 117ms (vs 2000ms threshold). PT-01 / PT-02 FAILed both runs due to a recurring local-dev Docker/colima DNS cold-start bug (`Name or service not known` on `mt0/tile.googleapis.com` at the first request after every `docker compose up`) — **not an application regression**, reclassified as "Unverified — infrastructure noise" in the trend track. AZ-503 hot path (PT-08 UPSERT) is **faster** than cycle-4 baselines (117ms vs 199ms vs the unmeasurable cycle-3/4 batch), not slower. |
## Outstanding leftovers (NOT closed by cycle 5)
1. **`_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`** — STAYS OPEN. Replay #5 entry appended this cycle. The AZ-504 half of the closure obligation (the `grep -c … || true` script fix) is verified working across two perf runs; the remaining half (a fully-green exit-0 default-parameter perf run) is blocked by the local-dev Docker/colima DNS cold-start bug captured in `perf_2026-05-12_cycle5.md`. Closure path is one of the recommended follow-up PBIs below.
## Recommended follow-up PBIs (out of cycle-5 scope, surfaced for backlog)
| ID | Estimate | Title | Why |
|----|----------|-------|-----|
| **AZ-505** | 5 SP | Tile identity full: inventory endpoint + HTTP/2 + Leaflet index | The deferred-from-AZ-503 half. Foundation (this cycle) is the prerequisite. Already spec'd at `_docs/02_tasks/todo/AZ-505_*.md` if Step 9 produced one; otherwise file at cycle-6 New Task. |
| (TBD) | 1 SP | Perf script DNS pre-warm before PT-01 | Add `docker compose exec api getent hosts mt0..mt3.google.com tile.googleapis.com` (or equivalent) before PT-01 fires in `scripts/run-performance-tests.sh`. Deterministically removes the cold-DNS class of PT-01/PT-02 failures. **Closes the cycle-3 perf-harness leftover on the next local perf run.** Trivial mechanical fix. |
| (TBD) | 2 SP | Move perf gate to CI / cloud runner | Stable resolver, eliminates local-dev DNS flake entirely. The harness itself is portable; only the orchestration layer changes. Complementary to (or alternative to) the DNS pre-warm PBI. |
| (TBD) | 1 SP | Deployment runbook: pgcrypto pre-install step | Adds the F2-cy5 finding to the operator runbook: "For managed Postgres (RDS / Cloud SQL / Azure Postgres), verify `pgcrypto` is installable by the migration-running role before deploying AZ-503". Stock Postgres 16 is unaffected. |
| (TBD) | 2 SP (recheck per cycle) | Authenticated provenance for `metadata.flightId` | F1-cy5 long-term recommendation. When/if an authoritative flight registry is introduced, validate that the JWT-bound caller owns the claimed flight before persisting. Not actionable until that registry exists. |
| (TBD) | 1 SP | Bump `Microsoft.NET.Test.Sdk` 17.8.0 → 17.13.0+ | Carry-over D2-cy4 (transitive `NuGet.Frameworks` flag). Test-runtime exposure only; safe to land independently. **Unchanged from cycle 4.** |
| (TBD) | 3 SP | Migrate `WithOpenApi(...)` callsites to ASP.NET Core 10 minimal-API metadata extensions | Carry-over from cycle 4 (`ASPDEPR002` warnings). API still fully functional; deprecation, not removal. **Unchanged from cycle 4.** |
| (TBD) | 1 SP (recheck per cycle) | `Serilog.AspNetCore` 8.0.3 → 10.x | Carry-over from cycle 4. Re-check each cycle; bump as soon as a 10.x line ships. **Unchanged from cycle 4 — no 10.x line published as of cycle 5.** |
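
The DNS pre-warm PBI amounts to a resolve-with-retry loop that runs before PT-01. The Python sketch below shows that logic only; the function name, retry counts, and injectable `resolve`/`sleep` parameters are hypothetical, and to warm the resolver the API actually uses it would have to run inside the `api` container, as the `docker compose exec api getent hosts` suggestion does.

```python
import socket
import time
from typing import Callable, Iterable, List


def prewarm_dns(
    hosts: Iterable[str],
    attempts: int = 5,
    delay_s: float = 1.0,
    resolve: Callable = socket.getaddrinfo,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Try to resolve each host, retrying on failure; return hosts that never resolved."""
    unresolved = []
    for host in hosts:
        for attempt in range(attempts):
            try:
                resolve(host, 443)
                break  # resolved, move on to the next host
            except socket.gaierror:
                if attempt < attempts - 1:
                    sleep(delay_s)
        else:
            # The retry loop finished without a successful resolution.
            unresolved.append(host)
    return unresolved
```

A perf script could call `prewarm_dns(["mt0.google.com", "tile.googleapis.com"])` and abort early with a clear message if the returned list is non-empty, instead of letting PT-01 fail on `Name or service not known`.
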
## Operator runbook for promoting to staging / production
1. **Push** the cycle-5 sync commits + this deploy report to `origin/dev`. Confirm Woodpecker `01-test` runs green on `dev`.
2. **Verify pgcrypto availability on the target Postgres**:
   - Stock Postgres 16 / `postgres:16` Docker image: pre-bundled, no action.
   - Managed cloud Postgres: confirm `pgcrypto` is installable by the migration-running role per the provider docs. If not, install it manually before container startup (or escalate per provider).
3. **Deploy** the new `dev-arm` (and amd64) image. On container startup, DbUp applies migration `014_AddTileIdentityColumns.sql` once. Backfill of `location_hash` for legacy rows runs inside the migration and is deterministic (so re-running the migration against a non-empty DB is idempotent — DbUp's journal prevents it anyway).
4. **Smoke-test**: `/swagger` (expect 200/301), `/api/satellite/region/<random>` (expect 401, JWT enforcement), and a single `POST /api/satellite/tiles/uav` upload with a freshly-minted JWT — expect a `tileId` in the response and a per-flight file under `./tiles/uav/{flightId or 'none'}/`.
5. **Verify** the new index landed: `SELECT indexname FROM pg_indexes WHERE tablename='tiles' AND indexname='idx_tiles_unique_identity';` should return one row, and `idx_tiles_unique_location` should NO LONGER exist on the same table.
6. **No env-var change to coordinate.** Cycle 5 doesn't introduce any new app config.
7. **Rollback** plan: if a regression appears post-deploy, the rollback target is the prior `dev-arm` tag (built from commit `e31f592` or earlier, the cycle-4 close commit). Migration 014 is forward-only: on rollback, the new columns + index stay (they are additive) and the app code simply ignores them.
8. **Outstanding ops-side gap (long-standing, NOT new in cycle 5)**: admin team `iss/aud` confirmation before promoting beyond `dev`. Unchanged from cycle 3 / 4 runbooks.
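
The index check in step 5 can be turned into a pass/fail helper. The Python sketch below assumes the operator has already fetched every index name on `tiles` (for example with `SELECT indexname FROM pg_indexes WHERE tablename='tiles';`, a broader query than the one in step 5) and passes them in as strings; the function name is hypothetical.

```python
from typing import Iterable, List


def check_tile_indexes(index_names: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means migration 014's index swap landed."""
    names = set(index_names)
    problems = []
    if "idx_tiles_unique_identity" not in names:
        problems.append("missing idx_tiles_unique_identity")
    if "idx_tiles_unique_location" in names:
        problems.append("stale idx_tiles_unique_location still present")
    return problems
```

An empty result covers both halves of step 5: the new unique index exists and the AZ-484 float-based index is gone.
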
## Differences vs. cycle 4 deploy
- **NEW**: a database migration (`014_AddTileIdentityColumns.sql`) — cycle 4 had no schema change. Adds the `pgcrypto` extension prerequisite.
- **NEW**: a contract version bump (`uav-tile-upload.md` 1.0.0 → 1.1.0) — cycle 4 had no contract change.
- **NEW**: a per-flight on-disk path layout change for UAV tiles (additive — legacy paths still exist; only new uploads use the per-flight tree).
- **UNCHANGED**: container image base (`aspnet:10.0`), CI image (`sdk:10.0`), all env vars, all multi-arch tags, all carry-over follow-up PBIs from cycle 4 (re-listed above).
- **CLEARER**: the perf gate this cycle has direct evidence that the AZ-503 UPSERT hot path doesn't regress (PT-08 200/200 batches, p95 117ms — first measurable PT-08 in the project's history thanks to AZ-504).