satellite-provider/_docs/03_implementation/deploy_cycle6.md
Oleksandr Bezdieniezhnykh ba3bdb1918 [AZ-505] Cycle 6 Steps 15-16 perf + deploy report
Step 15 (Performance Test): 8/8 PT scenarios PASS in a single
default-parameter run (exit 0). Adapts scripts/run-performance-tests.sh
for the new TLS+ALPN dev listener via CURL_OPTS=(--cacert ./certs/api.crt).
Report at _docs/06_metrics/perf_2026-05-12_cycle6.md. The clean exit-0
satisfies the cycle-3 perf-harness leftover deletion criterion that
carried across cycles 3-5; leftover file deleted.

Step 16 (Deploy): _docs/03_implementation/deploy_cycle6.md captures the
shipping payload (inventory endpoint, HTTP/2 TLS+ALPN, tiles_leaflet_path
covering index, migration 015), the dev-cert plumbing for local-docker +
integration-tests parity, the production-TLS topology note (terminate at
ingress; never promote the dev cert), and the operator runbook for
promoting cycle-6 past dev.

NU1902 / CA2227 / ASPDEPR002 / Serilog-10.x re-listed as carry-overs
unchanged; admin-team iss/aud confirmation unchanged.

State advanced to Step 17 (Retrospective).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 23:02:00 +03:00


Deploy Report — Cycle 6 (AZ-505)

Date: 2026-05-12
Cycle: 6
Scope: One-task cycle (AZ-505): Tile inventory endpoint (POST /api/satellite/tiles/inventory) + HTTP/2 enablement on the dev listener (TLS+ALPN) + Leaflet covering index (tiles_leaflet_path).

AZ-505 ships the consumer-facing payload of the AZ-503 tile-identity epic that was intentionally split out at the end of cycle 5. With this cycle, the AZ-503 epic's external surface is feature-complete; the onboard TileDownloader (sibling repo gps-denied-onboard AZ-316) can flip c11.use_bulk_list_endpoint=true once cycle 6 is deployed to its target environment.

What is shipping

Code changes (committed to dev)

| Commit | Subject |
| --- | --- |
| aa1a1bf | chore: open cycle 6 — state advanced to Step 9 (New Task) |
| 3c7cd4e | chore: update autodev state to Step 10 (Implement) and refine task details for AZ-505 |
| 909f69c | [AZ-505] Tile inventory endpoint + HTTP/2 + Leaflet covering index |
| da40534 | chore: advance autodev state to Step 11 (Run Tests) after AZ-505 batch 1 |
| c74a233 | [AZ-505] AC-5 fix: enable TLS for HTTP/2 via ALPN |
| 5d84d28 | [AZ-505] Test-spec sync + task-mode doc updates for cycle 6 |
| pending (this commit) | [AZ-505] Cycle 6 Step 15 perf + Step 16 deploy report |

All commits are on dev but NOT YET pushed to origin/dev as of this report. Operator runbook step 1 below covers the push.

Database migration (NEW — automatic on container startup)

Migration 015_AddTilesLeafletPathIndex.sql lands automatically on container startup via the existing DbUp runner. Idempotent — re-running is a no-op.

Index changes on the tiles table:

| Change | Index | Notes |
| --- | --- | --- |
| CREATED | tiles_leaflet_path on (location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source) | Covering index for the Leaflet hot path (GET /tiles/{z}/{x}/{y}). Makes the dominant query an Index Only Scan (heap fetches ≤ 1 on a freshly VACUUM ANALYZE-d table). |
| DROPPED | idx_tiles_location_hash (cycle 5, migration 014) | Superseded: the new covering index has the same leading column location_hash. The drop is in the same migration as the create; net index count on tiles is unchanged. |

Lock window: the migration runs CREATE INDEX (not CONCURRENTLY — DbUp's single-script transaction model is incompatible with CONCURRENTLY's no-transaction requirement). Expected wall time on a populated production-sized tiles table is acceptable (a few seconds to ~1 minute depending on row count); the migration header documents this trade-off and the upgrade path if a larger table necessitates a manual concurrent rebuild. AZ-505 Risk 1 + Risk 2 cover the trade-offs.

pgcrypto: still required, still installed automatically by migration 014 from cycle 5. Cycle 6 does not introduce any new extension dependency.

Backward compatibility:

  • Reads of legacy rows continue to work — the rewired GetByTileCoordinatesAsync filters on location_hash (deterministic UUIDv5 of {z}/{x}/{y}), which is NOT NULL for all rows after cycle 5's backfill. Behaviour is byte-identical to the cycle-5 query for any row whose location_hash matches.
  • Writes unchanged — the cycle-6 PBI does not modify any producer path.
  • No rename of any existing column or table. Cycle 6 is index-only on the schema side.
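
The deterministic hashing mentioned in the first bullet can be illustrated with a minimal Python sketch. The report only says that location_hash is a UUIDv5 of {z}/{x}/{y}; the namespace UUID and the exact name string used by the service are not stated here, so `ASSUMED_NAMESPACE` and the `"z/x/y"` formatting below are assumptions for illustration only and will not necessarily reproduce the service's stored values.

```python
import uuid

# ASSUMPTION: the report does not say which UUID namespace the service uses.
# NAMESPACE_URL is a stand-in so the sketch runs; substitute the real one.
ASSUMED_NAMESPACE = uuid.NAMESPACE_URL


def location_hash(z: int, x: int, y: int,
                  namespace: uuid.UUID = ASSUMED_NAMESPACE) -> uuid.UUID:
    """Derive a deterministic UUIDv5 from a slippy-map key formatted as "z/x/y"."""
    return uuid.uuid5(namespace, f"{z}/{x}/{y}")


# The same coordinates always map to the same hash, which is what lets a
# read path filter on location_hash instead of on the three coordinates.
first = location_hash(18, 158485, 91707)
again = location_hash(18, 158485, 91707)
other = location_hash(18, 158485, 91708)
print(first == again, first != other, first.version)  # True True 5
```

Because UUIDv5 is a pure function of namespace and name, rows backfilled in cycle 5 and lookups issued in cycle 6 agree on the hash as long as both sides use the same namespace and string format.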

Configuration changes (operator must verify before promoting)

| Setting | Was | Now | Source |
| --- | --- | --- | --- |
| No new env vars introduced. | | | Cycle 6 carries forward the cycle-5 env contract verbatim (JWT_SECRET ≥ 32B, JWT_ISSUER, JWT_AUDIENCE, GOOGLE_MAPS_API_KEY). |
| Dev/test listener protocol | http://+:8080 (HTTP/1.1 only) | https://+:8080 with Http1AndHttp2 and ALPN | SatelliteProvider.Api/Program.cs + docker-compose.yml (ASPNETCORE_URLS, ASPNETCORE_Kestrel__Certificates__Default__Path=/app/certs/api.pfx, __Password=satellite-dev-cert). Dev/test only: production deploys terminate TLS at the ingress (cluster-managed cert) and forward plaintext HTTP/2 over the cluster network to the api pod's listener; the dev-cert plumbing below is for local-docker + integration-tests parity. |
| Dev cert artifacts | (none) | ./certs/api.pfx (server) + ./certs/api.crt (public CA), generated idempotently by the scripts/run-tests.sh ensure_dev_cert block using openssl inside an alpine container | scripts/run-tests.sh + .gitignore (the certs/ directory is git-ignored; never commit the PFX). Operator note: the dev cert is for local development and the integration-tests container only; staging/prod must NEVER reuse it. The integration-tests container mounts api.crt into /usr/local/share/ca-certificates/ and runs update-ca-certificates in its entrypoint so HttpClient trusts the dev cert with no per-test handler tweaks. |
| Container image (api service) | mcr.microsoft.com/dotnet/aspnet:10.0 (cycle-5 baseline) | unchanged (mcr.microsoft.com/dotnet/aspnet:10.0) | No Dockerfile, no .woodpecker/*.yml changes this cycle. |
| Perf harness | http://localhost:18980 default | https://localhost:18980 default; CURL_OPTS=(--cacert ./certs/api.crt) when the dev cert is present, else falls through to system CA store | scripts/run-performance-tests.sh. Override via PERF_CURL_OPTS (e.g. -k --silent) when running against a staging cert. |

Contract changes (consumer-visible)

| Contract | Version | Change | Action for consumers |
| --- | --- | --- | --- |
| POST /api/satellite/tiles/inventory (tile-inventory.md) | NEW (1.0.0) | New endpoint. Body shape XOR tiles[] (Form A: integer {z,x,y}) OR locationHashes[] (Form B: hex-encoded UUIDv5). Returns one entry per request entry in input order, with present/absent shaping. MaxEntriesPerRequest = 5000. | Sibling repo onboarding: gps-denied-onboard AZ-316 can flip its config flag c11.use_bulk_list_endpoint=true once this is deployed. Until flipped, the onboard TileDownloader falls back to per-tile lookup as it does today. |
| tile-storage.md (data-access contract) | 1.0.0 → 2.0.0 (joint freeze AZ-503-foundation + AZ-505) | Major bump promotes the Leaflet read path to use location_hash as the index-driving column. Architecture.md had named AZ-505 as the cycle that closes this freeze since cycle 5. | Internal: data-access layer consumers (TileService, RegionService, RouteService, region/route processing services) read through ITileRepository; no API change is visible to them. |
| Dev listener: http://api:8080 (HTTP/1.1) → https://api:8080 (HTTP/1.1 + HTTP/2 via ALPN) | n/a | Dev/test affordance, not a production contract. | Programmatic clients pointing at the dev compose stack must trust ./certs/api.crt (mount + update-ca-certificates) or pass -k/--insecure. Browser clients: certificate trust prompt the first time, then HTTP/2-capable browsers will negotiate h2 automatically. Production unaffected: ingress controls TLS termination there. |
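
A consumer can pre-check an inventory request body before sending it. The sketch below is a hedged, client-side illustration of the rules listed for the new endpoint (exactly one of tiles[] or locationHashes[], at most 5000 entries). The function name `classify_inventory_body` is hypothetical, the per-tile field names (zoomLevel, x, y) are taken from the smoke-test body in the operator runbook, and how the server itself reports violations is not described in this report.

```python
from typing import Any

MAX_ENTRIES_PER_REQUEST = 5000  # limit stated in the contract table


def classify_inventory_body(body: dict[str, Any]) -> str:
    """Return "A" for a tiles[] body or "B" for a locationHashes[] body.

    Raises ValueError when the body breaks a rule stated in the contract
    table. Hypothetical client-side helper, not the server's validator.
    """
    has_tiles = "tiles" in body
    has_hashes = "locationHashes" in body
    if has_tiles == has_hashes:  # both present or both missing
        raise ValueError("send exactly one of tiles[] or locationHashes[]")

    entries = body["tiles"] if has_tiles else body["locationHashes"]
    if not isinstance(entries, list):
        raise ValueError("entries must be a JSON array")
    if len(entries) > MAX_ENTRIES_PER_REQUEST:
        raise ValueError(f"at most {MAX_ENTRIES_PER_REQUEST} entries per request")

    if has_tiles:
        for tile in entries:
            # Field names follow the runbook smoke-test body.
            if not isinstance(tile, dict) or not all(
                isinstance(tile.get(key), int) for key in ("zoomLevel", "x", "y")
            ):
                raise ValueError("Form A entries need integer zoomLevel, x, y")
        return "A"
    return "B"


print(classify_inventory_body(
    {"tiles": [{"zoomLevel": 18, "x": 158485, "y": 91707}]}
))  # A
```

Since the endpoint returns one entry per request entry in input order, a caller that passes this check can pair response entries with request entries by position.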

Container image

  • Source: SatelliteProvider.Api/Dockerfile multi-stage build, base mcr.microsoft.com/dotnet/aspnet:10.0, unchanged from cycle 5.
  • New mount in docker-compose.yml: ./certs/api.pfx:/app/certs/api.pfx:ro (dev/test only — the dev cert is generated by scripts/run-tests.sh and gitignored).
  • New mount in docker-compose.tests.yml: ./certs/api.crt:/usr/local/share/ca-certificates/satellite-provider-dev.crt:ro + entrypoint update-ca-certificates so HttpClient trusts the dev cert.
  • Verification on dev workstation (local): docker compose up -d --build succeeded multiple times this cycle (functional test runs + perf run). API healthy on https://localhost:18980 (swagger 200; anonymous POST /api/satellite/tiles/inventory returns 401). Migration 015 ran cleanly on a dev-baseline DB; re-runs are journal-skipped by DbUp.
  • Verification on CI: pending — the Step-12/13/15 sync commit + this deploy report commit have not yet been pushed. Operator action: after push, confirm the next Woodpecker 01-test + 02-build-push runs on dev succeed before promoting. Note that the 01-test runner builds the dev cert in-CI via the scripts/run-tests.sh ensure_dev_cert block; no new CI secret is required.
  • Multi-arch: unchanged from cycle 5 (aspnet:10.0 is multi-arch by Microsoft).

Verification gates passed in this cycle

| Gate | Result | Evidence |
| --- | --- | --- |
| Step 11: Functional test suite | PASS | All unit + integration tests green after the AC-5 TLS fix and three follow-up test-data fixes (Http2MultiplexingTests slippy coords, DateTime.Kind=UtcUnspecified on raw Npgsql seed paths, MigrationTests accepts either idx_tiles_location_hash OR tiles_leaflet_path). _docs/03_implementation/implementation_report_tile_inventory_cycle6.md + _docs/03_implementation/implementation_completeness_cycle6_report.md. |
| Step 12: Test-Spec Sync | PASS | _docs/02_document/tests/traceability-matrix.md rewires AZ-503 deferrals onto AZ-505 ACs; blackbox-tests.md BT-23..BT-26 + performance-tests.md PT-09 cover the cycle-6 ACs/NFRs. |
| Step 13: Update Docs | PASS | Architecture, module-layout, glossary, data_model, contract artifacts (tile-inventory.md v1.0.0 + tile-storage.md v2.0.0), module docs (api_program.md, common_dtos.md, common_interfaces.md, services_tile_service.md, dataaccess_migrator.md, dataaccess_tile_repository.md), system-flows (F7 Leaflet Tile Serving + F8 Tile Inventory Bulk Lookup), _docs/02_document/ripple_log_cycle6.md. |
| Step 14: Security Audit | SKIPPED | User skipped the optional gate. No _docs/05_security/security_report_cycle6.md produced. Cycle 5 carry-overs (pgcrypto ops gap recorded in cycle 5 deploy report; Microsoft.IdentityModel NU1902 7.0.3 still pinned) are unchanged. The new TLS dev affordance is dev/test only: staging/prod still terminate TLS at ingress, so the dev cert is not in the production trust chain. |
| Step 15: Performance Test | PASS | _docs/06_metrics/perf_2026-05-12_cycle6.md. 8/8 scenarios PASS (PT-01..PT-08), exit 0, single default-parameter run, no infra noise. PT-08 batch p95 = 544ms (vs 2000ms threshold; vs cycle-5 117ms; the increase is per-curl TLS handshake overhead on the host-loopback measurement leg, not application latency). AZ-505 NFR-1 (inventory p95 ≤ 200ms at coords≤500) verified inline by TileInventoryTests.PerformanceBudget_AC4 against a seeded 1000-row table: observed median 58ms, p95 well under threshold. AZ-505 NFR-2 (HTTP/2 multiplexing, single TLS connection, 8 concurrent tile reads) verified inline by Http2MultiplexingTests with HttpVersion == 2.0 asserted on every response and cumulative wall time under 5s. Cycle-3 perf-harness leftover CLOSED by this exit-0 run. |
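
For readers comparing the p95 figures against their thresholds, here is a minimal sketch of a p95 budget check. The report does not say how the perf harness or the inline tests compute p95; the nearest-rank method below is an assumption, the function names are hypothetical, and the sample latencies are made-up placeholders rather than measurements from this cycle. Only the 2000 ms threshold comes from the Step 15 row.

```python
def p95_nearest_rank(samples_ms: list[float]) -> float:
    """95th percentile by the nearest-rank method (assumed, not the harness's)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed in integer arithmetic to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples_ms: list[float], threshold_ms: float) -> bool:
    """True when the p95 of the samples does not exceed the threshold."""
    return p95_nearest_rank(samples_ms) <= threshold_ms


# Placeholder latencies (NOT real measurements): 19 fast calls, 1 slow call.
illustrative = [100.0] * 19 + [900.0]
print(p95_nearest_rank(illustrative))      # 100.0
print(within_budget(illustrative, 2000))   # True
```

With nearest-rank, a single slow outlier among 20 samples does not move the p95, which is one reason a p95 gate tolerates an occasional slow TLS handshake better than a maximum-latency gate would.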

Outstanding leftovers (status this cycle)

  • _docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md: CLOSED this cycle. The deletion criterion ("default-parameter ./scripts/run-performance-tests.sh exits 0 against an api built from dev") is satisfied by the Step 15 run in this cycle. File deleted in the same commit as this deploy report.
  • No other open leftovers as of cycle 6.

Follow-up PBIs

| ID | Estimate | Title | Why |
| --- | --- | --- | --- |
| (TBD) | 1 SP | Deployment runbook: ingress TLS termination + HTTP/2 forwarding | Cycle 6 introduces the first HTTP/2-enabled endpoint. For production deployments behind an ingress (Traefik, Nginx, AWS ALB, etc.), document the expected topology: TLS terminates at ingress with a cluster-managed cert; cluster-internal traffic to the api pod uses cleartext HTTP/2 (h2c) inside the cluster network. The dev cert plumbing (./certs/) is dev/test only and must NEVER reach a non-dev environment. Trivial doc-only fix; folds into the next deploy-runbook update. |
| (TBD) | 1 SP | _docs/02_document/contracts/data-access/tile-storage.md consumer audit | The contract bumped 1.0.0 → 2.0.0 in this cycle. Audit sibling repos for any consumer pinning the v1 row shape; flag breaking-change consumers before promotion past dev. |
| (TBD) | 3 SP (recheck per cycle) | Bump Microsoft.IdentityModel.Tokens / System.IdentityModel.Tokens.Jwt 7.0.3 → 7.1.2+ | Carry-over from cycles 3-5 (NU1902 moderate severity advisory). Test-runtime + production runtime exposure; safe to land independently as a dependency-only PR. Unchanged from cycle 5. |
| (TBD) | 1 SP | Bump Microsoft.NET.Test.Sdk 17.8.0 → 17.13.0+ | Carry-over D2-cy4 (transitive NuGet.Frameworks flag). Test-runtime exposure only. Unchanged from cycles 4 + 5. |
| (TBD) | 3 SP | Migrate WithOpenApi(...) callsites to ASP.NET Core 10 minimal-API metadata extensions | Carry-over from cycles 4 + 5 (ASPDEPR002 warnings). API still fully functional; deprecation, not removal. Unchanged from cycles 4 + 5. |
| (TBD) | 1 SP (recheck per cycle) | Serilog.AspNetCore 8.0.3 → 10.x | Carry-over from cycles 4 + 5. Re-check each cycle; bump as soon as a 10.x line ships compatible with Serilog.Sinks.File ≥ 7.0.0 in this project's dep graph. Unchanged from cycle 5: no 10.x line published as of cycle 6. |
| (TBD) | 2 SP | Inventory endpoint estimatedBytes field | Deferred per AZ-505 Outcome bullet 1: only land when production profiling shows the per-row stat() cost is justified. |
| (TBD) | 5 SP | HTTP/3 / QUIC dev listener | Deferred per AZ-505 Excluded list. Adds UDP plumbing to dev compose and ALPN h3 advertisement; production payoff depends on consumer mix. |

Operator runbook for promoting to staging / production

  1. Push the cycle-6 sync commits + this deploy report to origin/dev. Confirm Woodpecker 01-test runs green on dev (the dev cert is regenerated in-CI by scripts/run-tests.sh; no new CI secret is required).
  2. Production TLS topology check (see follow-up PBI above for the runbook formalisation):
    • Production deploys MUST terminate TLS at the ingress with a cluster-managed cert; the dev cert at ./certs/api.pfx is NEVER promoted to a non-dev environment (it is gitignored and regenerated on demand).
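    • Cluster-internal traffic from the ingress to the api pod uses cleartext HTTP/2 (h2c). Kestrel's Http1AndHttp2 listener negotiates HTTP/2 through ALPN only when TLS is configured (dev/test). On a cleartext listener (no certificate present, Endpoints__Default__Url=http://+:8080), Kestrel's documented behaviour for Http1AndHttp2 is to fall back to HTTP/1.1; cleartext HTTP/2 requires an Http2-only endpoint that the ingress reaches with prior knowledge. Confirm the production manifest sets the URL form and protocol setting appropriate to the cluster's terminal-TLS model.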
  3. Verify migration 015 readiness on the target Postgres:
    • pgcrypto (already required since cycle 5): no new action.
    • Migration 015 runs a single transactional CREATE INDEX. On a small/medium tiles table the lock window is acceptable. If the target table is large (≥ 10M rows), schedule the deploy in a low-traffic window OR pre-create the index manually with CREATE INDEX CONCURRENTLY matching migration 015's column list and INCLUDE clause, then let DbUp's journal mark the migration as applied via the manual route.
  4. Deploy the new dev-arm (and amd64) image. On container startup DbUp applies migration 015_AddTilesLeafletPathIndex.sql once. Re-runs are journal-skipped.
  5. Smoke-test (production):
    • /swagger (expect 200/301), /api/satellite/region/<random> (expect 401, JWT enforcement) — unchanged from cycle 5.
    • POST /api/satellite/tiles/inventory with a freshly-minted JWT, body {"tiles":[{"zoomLevel":18,"x":158485,"y":91707}]} — expect 200 with one entry whose present field reflects whether that tile exists in the target environment.
    • Cycle-5 smoke (POST /api/satellite/tiles/uav) unchanged.
  6. Verify the new index landed: SELECT indexname FROM pg_indexes WHERE tablename='tiles' AND indexname='tiles_leaflet_path'; should return one row, and idx_tiles_location_hash should NO LONGER exist on the same table.
  7. Verify HTTP/2 negotiation against the production ingress (one-off, not a regression test): curl --http2 -sv https://<prod-host>/api/satellite/region/<id> should log * Using HTTP2 and a Bearer-rejected 401. If the ingress is HTTP/1.1-only, request the ops team enable HTTP/2 on it for tile-read performance — the api side is already speaking it.
  8. No env-var change to coordinate. Cycle 6 doesn't introduce any new app config.
  9. Rollback plan: if a regression appears post-deploy, the rollback target is the prior dev-arm tag (built from commit ea278af, the cycle-5 close commit, or earlier). Migration 015 is forward-only, so on rollback the new tiles_leaflet_path index stays (it is additive and used only by reads). The dropped idx_tiles_location_hash would need to be re-created manually only if a future migration ever expects it; no current migration does, because its only consumer was the cycle-5 to cycle-6 transition, which is now complete.
  10. Outstanding ops-side gap (long-standing, NOT new in cycle 6): admin team iss/aud confirmation before promoting beyond dev. Unchanged from cycles 3 / 4 / 5 runbooks.
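
The HTTP/2 negotiation check in step 7 can also be done without curl. The sketch below uses only the Python standard library to report which protocol a TLS endpoint selects through ALPN. The helper names `alpn_context` and `negotiated_protocol` are hypothetical. Passing `cafile` makes the client trust only that file, which suits the dev cert at ./certs/api.crt; whether the dev cert's subject names match the host you connect to is an assumption this report does not confirm.

```python
import socket
import ssl
from typing import Optional


def alpn_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Client TLS context that offers h2 first, then http/1.1, via ALPN.

    With cafile=None the system CA store is used; with a path, only that
    CA file is trusted (for example the dev cert's public certificate).
    """
    context = ssl.create_default_context(cafile=cafile)
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host: str, port: int = 443,
                        cafile: Optional[str] = None,
                        timeout: float = 5.0) -> Optional[str]:
    """Open a TLS connection and return the ALPN protocol the server chose.

    Returns "h2", "http/1.1", or None when the server did not select any
    protocol through ALPN.
    """
    context = alpn_context(cafile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

For example, `negotiated_protocol("localhost", 18980, cafile="./certs/api.crt")` against the dev compose stack should return "h2" if the listener behaves as this report describes. A result of "http/1.1" against a production host would suggest the ingress is not offering HTTP/2.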

Differences vs. cycle 5 deploy

  • NEW: a public-API endpoint (POST /api/satellite/tiles/inventory) — cycle 5 added no public endpoints, only modified UAV upload semantics.
  • NEW: a data-access contract major bump (tile-storage.md 1.0.0 → 2.0.0) — cycle 5 only bumped the UAV upload contract.
  • NEW: HTTP/2 negotiation on the dev/test listener via TLS+ALPN; dev cert plumbing in compose + tests + perf script.
  • NEW: a database migration (015_AddTilesLeafletPathIndex.sql) — index-only, additive + dropping the cycle-5 idx_tiles_location_hash whose role the new index fully subsumes.
  • NEW (for the project, not for the cycle's primary scope): perf script now defaults to HTTPS + dev-cert trust; documented PERF_CURL_OPTS override.
  • UNCHANGED: container image base (aspnet:10.0), CI image (sdk:10.0), all env vars, all multi-arch tags, the cycle-4-and-earlier carry-over follow-up PBIs.
  • CLOSED: the cycle-3 perf-harness leftover. Cycle 6's clean exit-0 perf run satisfies the deletion criterion that has been carried across cycles 3 → 4 → 5.
  • CLEARER: the AZ-503 epic's external surface is now complete (inventory endpoint + leaflet covering index + HTTP/2 multiplex). Onboard TileDownloader (sibling repo) can flip c11.use_bulk_list_endpoint=true once this is in its target environment.