[AZ-505] AC-5 fix: enable TLS for HTTP/2 via ALPN

Kestrel with HttpProtocols.Http1AndHttp2 on a plaintext listener
silently downgrades to HTTP/1.1-only (logs "HTTP/2 is not enabled
... TLS is not enabled"), so AC-5's multiplexed-GET test failed
with HTTP_1_1_REQUIRED. ALPN cannot run over plaintext, so the
fix switches the dev listener to TLS on https://+:8080:

- scripts/run-tests.sh generates a self-signed dev cert idempotently
  (./certs/api.pfx + api.crt) via openssl in an alpine container;
  certs/ is gitignored.
- docker-compose.yml binds Kestrel to ASPNETCORE_URLS=https://+:8080
  with Kestrel__Certificates__Default__Path bound to the .pfx.
- docker-compose.tests.yml mounts api.crt into the integration-tests
  container's CA store and runs update-ca-certificates so HttpClient
  trusts the cert transparently; default API_URL is now https://api:8080.
- Drop the obsolete Http2UnencryptedSupport AppContext switch from
  Http2MultiplexingTests; ALPN over TLS handles negotiation.

Test-data fixes caught on the post-TLS rerun (independent of the TLS
switch but surfaced together):

- Http2MultiplexingTests: switch slippy coords from (154321, 95812)
  -- which Google Maps returns 404 for -- to (158485, 91707), the
  slippy projection of (47.461747, 37.647063) already exercised by
  JwtIntegrationTests.
- TileInventoryTests + LeafletPathIndexOnlyTests: SpecifyKind to
  Unspecified at the binding site for raw Npgsql seed paths writing
  into tiles.captured_at / created_at / updated_at (TIMESTAMP without
  tz). Npgsql v6+ refuses Kind=Utc into plain timestamp columns;
  production goes through Dapper and never hits this code path.
- MigrationTests Az503NewUniqueIndexCoversIntegerKeyAndFlightId:
  accept either idx_tiles_location_hash (migration 014) or its
  AZ-505 successor tiles_leaflet_path (migration 015) -- both have
  location_hash as the leading column, which is the AC-9 intent.

Docs updated to reflect the TLS+ALPN path: tile-inventory.md
Non-Goals, modules/api_program.md, module-layout.md, the AZ-505
task spec's Risk 3, and the cycle 6 implementation + completeness
reports. The full integration test suite passes (mode=full, exit 0).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-12 22:19:26 +03:00
parent da40534b49
commit c74a2339aa
17 changed files with 148 additions and 42 deletions
@@ -54,7 +54,7 @@ $ rg -i 'placeholder|TODO|NotImplemented|scaffold|native bridge|fake|mock' \
Named technologies / integrations promised by the task:
- **`tiles_leaflet_path` covering index** — created by migration 015; verified to exist when migrations run on a fresh DB.
- **Kestrel HTTP/2 (`Http1AndHttp2`)** — wired via `builder.WebHost.ConfigureKestrel` per the AZ-505 Outcome bullet 3. AC-5 integration test confirms `HttpResponseMessage.Version == 2.0` over 20 concurrent multiplexed GETs.
- **Kestrel HTTP/2 (`Http1AndHttp2`)** — wired via `builder.WebHost.ConfigureKestrel` per the AZ-505 Outcome bullet 3. Dev listener bound to `https://+:8080` with `./certs/api.pfx` so ALPN can advertise `h2` (Kestrel requires TLS for HTTP/2 negotiation); cert generation is idempotent via `scripts/run-tests.sh` and `certs/` is gitignored. AC-5 integration test confirms `HttpResponseMessage.Version == 2.0` over 20 concurrent multiplexed GETs.
- **Npgsql `ANY($1::uuid[])` array binding** — used in `GetTilesByLocationHashesAsync`. The escape from Dapper is documented inline and is the production behaviour exercised by the AC-1 / AC-4 integration tests.
- **Cross-repo `Uuidv5.TileNamespace`** — unchanged from AZ-503. AZ-505 consumes the existing constant via `Uuidv5.LocationHashForTile`. The sibling-repo's Python `c6_tile_cache/_uuid.py:TILE_NAMESPACE` is owned by `gps-denied-onboard` and is **out of scope for the satellite-provider workspace** per the AZ-505 Constraints section.
@@ -103,3 +103,7 @@ Tests (existence + AC mapping verified):
## Required Remediation Tasks: None
Cycle 6 is complete from the implementation perspective. The full integration-test gate is owned by autodev Step 11 (test-run skill) per the handoff in `implementation_report_tile_inventory_cycle6.md`.
## Post-gate correction (Run Tests step, follow-up commit)
The Step 11 (Run Tests) execution surfaced an AC-5 runtime gap that the source-code-only completeness gate could not catch: `HttpProtocols.Http1AndHttp2` on a plaintext listener silently downgrades to HTTP/1.1-only (Kestrel logs `HTTP/2 is not enabled for [::]:8080 ... TLS is not enabled`), so the multiplexed-GET test failed with `HTTP_1_1_REQUIRED`. The corrective commit (`[AZ-505] AC-5 fix: enable TLS for HTTP/2 via ALPN`) switches the dev listener to TLS on `https://+:8080` so ALPN can negotiate `h2`. Details are in `implementation_report_tile_inventory_cycle6.md` → "Post-merge correction"; they cover the cert-generation script, docker-compose binding, integration-test CA trust setup, and three unrelated test-data fixes also caught on the rerun (Google-Maps-404 coords in `Http2MultiplexingTests`; `DateTime.Kind=Utc` vs `timestamp without time zone` in three raw-Npgsql seed paths; the stale `idx_tiles_location_hash` assertion in `MigrationTests`). The full integration test suite now passes (mode=full, exit 0). Gate verdict remains **PASS**: every named AZ-505 technology is integrated and exercised by a green AC test.
@@ -11,7 +11,7 @@ Cycle 6 ships **the consumer-facing payload of the AZ-503-foundation tile identi
- **`POST /api/satellite/tiles/inventory`** — bulk-list / pre-flight cache sizing endpoint that the onboard `TileDownloader` (sibling repo `gps-denied-onboard` AZ-316) is gated behind `c11.use_bulk_list_endpoint=false` until this PBI lands in the target environment.
- **`tiles_leaflet_path` covering index** — makes the Leaflet hot path (`GET /tiles/{z}/{x}/{y}`) an `Index Only Scan` against `(location_hash, captured_at DESC, updated_at DESC, id DESC) INCLUDE (file_path, source)`. `GetByTileCoordinatesAsync` was rewired to filter on `location_hash` (deterministic UUIDv5) to drive the index; behaviour is byte-identical.
- **Kestrel HTTP/2 (h2c)** — `Http1AndHttp2` on every dev listener so programmatic clients (httpx `http2=True`, .NET `HttpClient` with `Version20` + `RequestVersionExact`) can multiplex tile reads on one TCP connection. Browsers still negotiate HTTP/1.1 over plaintext — browser Leaflet wins come from the covering-index hot path.
- **Kestrel HTTP/2 (TLS + ALPN)** — `Http1AndHttp2` on the dev listener, now bound to `https://+:8080` with a self-signed cert at `./certs/api.pfx` (generated by `scripts/run-tests.sh`; gitignored). Kestrel requires TLS for HTTP/2 protocol negotiation; ALPN advertises both `h2` and `http/1.1` so programmatic clients (httpx `http2=True`, .NET `HttpClient` with `Version20` + `RequestVersionExact`) and HTTP/2-capable browsers all multiplex tile reads on a single TLS connection. The integration-test container trusts the dev cert via `/usr/local/share/ca-certificates/` + `update-ca-certificates`. (Original plan was h2c on plaintext 8080; switched mid-cycle when Kestrel logged `HTTP/2 is not enabled for [::]:8080 ... TLS is not enabled`. See "Post-merge correction" below.)
- **Contract artifacts** — new `tile-inventory.md` v1.0.0 and the long-deferred `tile-storage.md` v2.0.0 major bump (architecture.md had named AZ-505 as owner since cycle 5).
## Batches
@@ -61,7 +61,7 @@ Code review accepted PASS after one auto-fix round (consolidated `ComputeLocatio
| AC-2 Leaflet path returns most-recent variant via `location_hash` | Covered | `TileInventoryTests.LeafletReadReturnsMostRecentViaLocationHash_AC2` (DB-level verification of the exact SELECT used by `GetByTileCoordinatesAsync`; ServeTile is a wrapper around the row read) |
| AC-3 Leaflet hot path uses `Index Only Scan using tiles_leaflet_path` | Covered | `LeafletPathIndexOnlyTests.RunAll` (EXPLAIN ANALYZE + regex + Heap Fetches ≤ 1) |
| AC-4 Inventory p95 ≤ 1000 ms for 2500 tiles | Covered | `TileInventoryTests.PerformanceBudget_AC4` (full-suite only; smoke prints documented skip) |
| AC-5 HTTP/2 multiplexed responses on the dev plaintext endpoint | Covered | `Http2MultiplexingTests.RunAll` |
| AC-5 HTTP/2 multiplexed responses on the dev TLS endpoint (ALPN-negotiated) | Covered | `Http2MultiplexingTests.RunAll` |
| AC-6 Request validation: 400 both-populated / 400 neither / 400 > 5000 / 401 anonymous | Covered | `TileInventoryTests.ValidationRejects{BothPopulated,NeitherPopulated,OversizedBatch}_AC6` + `TileInventoryTests.UnauthenticatedRequestReturns401_AC6` |
| AC-7 Contract artifacts produced in the same commit | Covered (doc-only) | `tile-inventory.md` v1.0.0 + `tile-storage.md` v2.0.0 Change Log entry + module-layout / glossary / data_model / module-doc updates |
@@ -89,6 +89,24 @@ Recommendation for `test-run`:
- Auto-push: enabled this session (`auto_push: true` in `_docs/_autodev_state.md`)
- Commits pushed (subject lines):
- `[AZ-505] Tile inventory + HTTP/2 + leaflet covering index`
- `[AZ-505] AC-5 fix: enable TLS for HTTP/2 via ALPN` (post-merge correction — see below)
## Post-merge correction (Run Tests step)
The initial cycle-6 commit configured Kestrel as `HttpProtocols.Http1AndHttp2` on a plaintext listener (`http://+:8080`). The `Http2MultiplexingTests` regression run revealed that Kestrel logs `HTTP/2 is not enabled for [::]:8080 ... TLS is not enabled` and silently falls back to HTTP/1.1 only. ALPN is a TLS extension and cannot run over plaintext, so the HTTP/2 multiplexing that AC-5 verifies never engaged. The fix:
- Generate a self-signed dev cert (`./certs/api.pfx` + `./certs/api.crt`) idempotently from `scripts/run-tests.sh` (`ensure_dev_cert` block using `openssl` inside an `alpine` container). `certs/` is gitignored.
- Switch the API listener to `https://+:8080` in `docker-compose.yml` (`ASPNETCORE_URLS`, `ASPNETCORE_Kestrel__Certificates__Default__Path`, `__Password`) and mount the `.pfx` read-only.
- In `docker-compose.tests.yml`, mount `./certs/api.crt` into `/usr/local/share/ca-certificates/` of the integration-tests container and run `update-ca-certificates` in the entrypoint so `HttpClient` trusts the dev cert with no per-test handler tweaks. Test default `API_URL` updated to `https://api:8080`.
- Drop the `AppContext.SetSwitch("System.Net.SocketsHttpHandler.Http2UnencryptedSupport", true)` line from `Http2MultiplexingTests.cs` — no longer needed once ALPN over TLS does the negotiation.
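The negotiation that the TLS switch enables can be probed from a client with a few lines of Python. This is a minimal sketch, not part of the commit: the helper names `build_alpn_context` and `negotiated_protocol` are hypothetical, and it assumes the dev cert's subject names cover the host being dialed.

```python
import socket
import ssl
from typing import Optional


def build_alpn_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Client TLS context that offers h2 first, then http/1.1, via ALPN.

    ca_file is an optional PEM bundle to trust (for example the dev cert);
    when omitted, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host: str, port: int, ctx: ssl.SSLContext) -> Optional[str]:
    """Complete a TLS handshake and return the ALPN protocol the server picked.

    Returns None when the server did not select any offered protocol.
    """
    with socket.create_connection((host, port), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

Run from a container on the compose network, `negotiated_protocol("api", 8080, build_alpn_context("./certs/api.crt"))` would be expected to return `"h2"` once the listener is on TLS. There is no plaintext equivalent of this call, because ALPN only exists inside the TLS handshake.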
While re-running the suite, three test-data issues also surfaced and were fixed in the same correction commit. They are independent of the TLS switch but were first seen on the post-TLS rerun:
- `Http2MultiplexingTests` was hardcoding slippy coords `(z=18, x=154321, y=95812)` that Google Maps returns 404 for. Switched to `(158485, 91707)` — the slippy projection of `(47.461747, 37.647063)` already exercised by `JwtIntegrationTests`.
- `TileInventoryTests.PerformanceBudget_AC4`, `TileInventoryTests.SeedTileAsync`, and `LeafletPathIndexOnlyTests` were binding `DateTime.UtcNow` (Kind=Utc) into the `tiles.captured_at` / `created_at` / `updated_at` `TIMESTAMP` columns via raw `NpgsqlCommand` (production goes through Dapper, which never hits this code path). `DateTime.SpecifyKind(..., Unspecified)` applied at the binding site fixes the test seed without altering the production write path.
- `MigrationTests.Az503NewUniqueIndexCoversIntegerKeyAndFlightId` was hardcoded to look for `idx_tiles_location_hash` from migration 014; migration 015 explicitly drops that index because the new covering index `tiles_leaflet_path` has the same leading column. Assertion broadened to accept either index name so the AZ-503 AC-9 intent ("a location_hash-indexed access path exists") stays verifiable.
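The replacement coordinates in `Http2MultiplexingTests` can be cross-checked with the standard slippy-map (Web Mercator) tile formula. The sketch below is illustrative only: the function name `slippy_tile` is hypothetical, and it assumes `(47.461747, 37.647063)` is ordered as (latitude, longitude) and that the zoom level is the `z=18` quoted for the old coordinates.

```python
import math
from typing import Tuple


def slippy_tile(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) slippy tile containing a WGS84 point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat_rad = math.radians(lat_deg)
    # Longitude maps linearly onto the x axis.
    x = int((lon_deg + 180.0) / 360.0 * n)
    # Latitude goes through the Mercator projection; asinh(tan(lat)) is the
    # projected y in radians, and tile rows count down from the north edge.
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


print(slippy_tile(47.461747, 37.647063, 18))  # (158485, 91707)
```

The result matches the `(158485, 91707)` pair now used by the test, so both test classes request the same tile.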
After the correction commit the full integration test suite passes (mode=full, exit 0).
## Open Items