[AZ-1124] Add PT-10 gRPC stream perf scenario

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-27 09:51:14 +00:00 · 2026-06-26 11:26:14 +03:00
parent a0449f79d0
commit 7dac986996
15 changed files with 598 additions and 11 deletions
@@ -262,6 +262,13 @@ Step 9 cycle 8b: folded into cycle 8 as step 1 (AZ-812). Section retained in dep
 Step 9 cycle 9: 2 tasks created (AZ-1074 = 5 pts, AZ-1075 = 3 pts) — total 8 pts. gRPC TileStream for route-based progressive tile delivery.
 Step 9 cycle 10: 1 task created (AZ-1113 = 2 pts) — REST 400 error message sanitization (F-AZ795-1/2, F-AZ810-1). Child of AZ-795.
 Step 9 cycle 11: 1 task created (AZ-1123 = 1 pt) — document `docker-compose.perf.yml` host-port conflict playbook (cycle 10 retro action).
+Step 9 cycle 12: 1 task created (AZ-1124 = 3 pts) — PT-10 gRPC `DeliverRouteTiles` stream perf scenario (cycle 9–11 retro carry-over).
+
+### Step 9 cycle 12 (PT-10 gRPC stream perf — AZ-1124)
+
+| Task | Depends On | Points | Status |
+|------|-----------|--------|--------|
+| AZ-1124 PT-10 gRPC stream perf scenario | AZ-1074, AZ-1075, AZ-492 | 3 | Todo |

 ## Coverage Verification

@@ -0,0 +1,142 @@
+# PT-10: gRPC DeliverRouteTiles stream performance scenario
+
+**Task**: AZ-1124_pt10_grpc_stream_perf
+**Name**: PT-10 gRPC stream perf scenario
+**Description**: Add a runnable PT-10 performance scenario that measures latency of the existing `DeliverRouteTiles` server-streaming RPC against a live API.
+**Complexity**: 3 points
+**Dependencies**: AZ-1074, AZ-1075, AZ-492
+**Component**: Test infrastructure (`scripts/run-performance-tests.sh`, `SatelliteProvider.IntegrationTests`) + perf NFR coverage (`_docs/02_document/tests/performance-tests.md`)
+**Tracker**: AZ-1124
+**Epic**: AZ-115
+
+### Document Dependencies
+
+- `_docs/02_document/contracts/c11_tilemanager/tile_provision_grpc.md` (consumer — wire contract for `DeliverRouteTiles`)
+- `_docs/02_document/tests/blackbox-tests.md` § BT-32 (functional baseline; PT-10 adds NFR evidence)
+
+### ADR Compliance
+
+> Implements ADR-013: gRPC `RouteTileDelivery` transport alongside REST for progressive tile delivery.
+
+## Problem
+
+Cycle 9 shipped `RouteTileDelivery.DeliverRouteTiles` (AZ-1074) with functional integration coverage (AZ-1075 / BT-32). The performance gate still runs PT-01..PT-08 (REST-only). Step 15 can pass while the gRPC streaming surface remains **Unverified** for latency — a gap called out in cycle 9–11 retrospectives and LESSONS.md (`[testing]` entry 2026-06-25).
+
+Operators and autodev Step 15 need measurable evidence for:
+
+- **Time to first tile batch** on a cold path (may include a Google Maps download)
+- **Total stream duration** until `DeliveryComplete`
+- **Slow-consumer smoke** — stream completes without corruption when the client deliberately delays between events (backpressure sanity check per AZ-1074 AC-4)
+
+## Outcome
+
+- `scripts/run-performance-tests.sh` runs PT-10 against the **real** gRPC `DeliverRouteTiles` endpoint (TLS + JWT metadata), not a mock.
+- PT-10 reports `first_batch_ms` and `total_stream_ms` over `PERF_REPEAT_COUNT` iterations (default 20), plus an optional single-iteration slow-consumer pass/fail.
+- `_docs/02_document/tests/performance-tests.md` documents PT-10 triggers, thresholds, and pass criteria.
+- `_docs/02_document/tests/traceability-matrix.md` marks gRPC stream perf as covered (no longer Unverified).
+- gRPC perf gap from cycle 9–11 retros is closed.
+
+## Scope
+
+### Included
+
+- Add `SatelliteProvider.IntegrationTests` perf bootstrap subcommand (e.g. `--run-pt10`) invoked from `scripts/run-performance-tests.sh`, following the AZ-492 pattern (`--mint-only`, `--gen-uav-fixture`). The subcommand:
+  - Mints or accepts JWT via existing `PerfBootstrap` / `JwtTokenFactory` surface
+  - Opens a gRPC channel to `API_URL` (same TLS trust as REST perf probes)
+  - Calls `DeliverRouteTiles` with the standard 2-waypoint / 500m / zoom-18 fixture (`GrpcTestHelpers.BuildValidRequest` coordinates)
+  - Records wall-clock time to the **first `RouteTileEvent` with `Batch` payload** and to **stream close** (`DeliveryComplete` or `DeliveryError`)
+  - Supports `PERF_REPEAT_COUNT` cold iterations; prints p50/p95 summary to stdout for the shell script to gate
+  - Optional slow-consumer mode (`PERF_PT10_SLOW_MS` delay in milliseconds between stream events, default 50) on a single iteration; asserts that the stream completes with ≥1 tile and no `DeliveryError`
+- Add PT-10 section to `scripts/run-performance-tests.sh` with non-zero exit on harness failure (same contract as PT-07/PT-08).
+- Add PT-10 spec block to `_docs/02_document/tests/performance-tests.md` with documented thresholds:
+  - Cold `p95(first_batch_ms) ≤ 30000` (aligns with PT-01 cold tile budget — includes GM round-trip)
+  - Cold `p95(total_stream_ms) ≤ 120000` (2-point 500m corridor at zoom 18 on dev hardware)
+  - Slow-consumer sub-check: completes without error (no fixed ms threshold)
+- Update `traceability-matrix.md` — gRPC streaming NFR row references PT-10.
+- Update script header comment from "PT-01..PT-08" to include PT-10.
+
+### Excluded
+
+- Production code changes to `RouteTileDeliveryGrpcService`, orchestrator, or proto (endpoint already exists).
+- New gRPC RPCs or contract version bumps.
+- Load testing with concurrent streams (k6 / multiple clients) — single-client sequential repeats only.
+- Promoting PT-09 inventory inline test into the shell harness (separate follow-up).
+- Setting production SLOs — thresholds are dev-harness budgets; Step 15 retains its A/B/C gate on threshold failures.
+
+## Acceptance Criteria
+
+**AC-1: PT-10 exercises real gRPC stream**
+Given a running API with TLS and valid `JWT_SECRET`
+When `scripts/run-performance-tests.sh` runs PT-10
+Then the probe calls `RouteTileDelivery.DeliverRouteTiles` over gRPC with `authorization: Bearer` metadata, AND receives at least one `Batch` event before `DeliveryComplete` on every successful iteration.
+
+**AC-2: Latency metrics reported**
+Given `PERF_REPEAT_COUNT` cold iterations (default 20)
+When PT-10 completes
+Then the harness prints `first_batch_ms` and `total_stream_ms` per iteration plus aggregate p50/p95 for both metrics.
+
+**AC-3: Threshold gate**
+Given the documented dev-hardware budgets in `performance-tests.md`
+When PT-10 p95 values are compared
+Then `p95(first_batch_ms) ≤ 30000` AND `p95(total_stream_ms) ≤ 120000`, OR the script exits non-zero with an explicit threshold failure message (Step 15 may still offer override).
+
+**AC-4: Slow-consumer smoke**
+Given `PERF_PT10_SLOW_MS` > 0 (default 50)
+When PT-10 runs the slow-consumer sub-check once per script invocation
+Then the stream completes with `DeliveryComplete`, at least one tile in collected batches, AND no `DeliveryError`.
+
+**AC-5: Spec and traceability updated**
+Given the post-task repository state
+When `performance-tests.md` and `traceability-matrix.md` are read
+Then PT-10 is documented as **Implemented** with scenario ID PT-10, AND the gRPC stream perf row is no longer `Unverified`.
+
+**AC-6: No duplicated JWT / gRPC bootstrap**
+Given existing `PerfBootstrap`, `GrpcTestHelpers`, and `JwtTokenFactory`
+When PT-10 mints tokens and opens channels
+Then it reuses those surfaces — no third JWT mint implementation and no parallel gRPC client factory in the shell script.
+
+## Non-Functional Requirements
+
+**Performance**
+- The thresholds above are dev-compose budgets on an 8-core x86 baseline; document the revision path if the hardware changes.
+
+**Reliability**
+- Script exits non-zero if gRPC channel cannot be established, JWT is missing, or zero successful iterations complete.
+- Fail loudly with RPC status code and detail on `DeliveryError` or non-OK `RpcException`.
+
+**Compatibility**
+- Bash 4+; gRPC probe runs via `dotnet` IntegrationTests DLL (already built for perf bootstrap).
+- Works with `docker-compose.perf.yml` overlay and `API_URL=https://localhost:18980` default.
+
+## Unit Tests
+
+| AC Ref | What to Test | Required Outcome |
+|--------|-------------|-----------------|
+| AC-6 | Grep / static review | No new `JwtSecurityToken` constructors in perf script; PT-10 logic lives in IntegrationTests |
+
+## Blackbox Tests
+
+| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
+|--------|------------------------|-------------|-------------------|----------------|
+| AC-1..AC-4 | API + DB running via compose; valid JWT | Run full `scripts/run-performance-tests.sh` including PT-10 | Real gRPC stream; metrics printed; thresholds evaluated | PT-10 NFR |
+| AC-1 | Same stack | Run IntegrationTests `--run-pt10` in isolation | Exit 0; JSON or tabular metrics on stdout | — |
+
+## Constraints
+
+- Must not regress PT-01..PT-08 or `scripts/run-tests.sh`.
+- Must not commit minted tokens or disable TLS verification in tracked files.
+- Threshold failures are reported by the script; autodev Step 15 handles operator override — this task does not weaken gates silently.
+
+## Risks & Mitigation
+
+**Risk 1: Cold-path variance from Google Maps**
+- *Risk*: First-batch p95 spikes on network blips look like regressions.
+- *Mitigation*: Document that PT-10 cold path includes GM download; Step 15 override path remains available; record actuals in `_docs/06_metrics/perf_<date>.md`.
+
+**Risk 2: TLS / HTTP2 setup differs from integration tests**
+- *Risk*: The perf script's `API_URL` may not match the in-compose `https://api:8080` used by integration tests.
+- *Mitigation*: Reuse `GrpcTestHelpers.CreateChannel` cert handling; document required `API_URL` and `certs/api.crt` in perf script header (same as REST perf).
+
+**Risk 3: Threshold too tight on laptop hardware**
+- *Risk*: p95 gates fail on underpowered dev machines.
+- *Mitigation*: Thresholds match PT-01/PT-03 family (generous dev budgets); tune in `_docs/06_metrics` after first green run if needed.
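
The p95 gate described in AC-3 could be sketched in Bash roughly as follows. This is a minimal illustration under stated assumptions, not the harness's actual code: `percentile` and `gate_p95` are hypothetical helper names, the nearest-rank percentile definition is an assumption (the spec does not say which definition PT-10 uses), and samples are assumed to be whole-millisecond integers.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): nearest-rank percentile
# over integer millisecond samples.
# Usage: percentile <p> <sample>...
percentile() {
  local p="$1"; shift
  local n="$#"
  # No samples means there is nothing to summarise.
  (( n > 0 )) || return 1
  # Nearest-rank: rank = ceil(p * n / 100), clamped to at least 1.
  local rank=$(( (p * n + 99) / 100 ))
  (( rank >= 1 )) || rank=1
  # Sort numerically and print the sample at that 1-based rank.
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

# Hypothetical gate: returns non-zero when p95 exceeds the budget or when
# there are no samples, and prints an explicit failure message to stderr.
# Usage: gate_p95 <metric_name> <budget_ms> <sample>...
gate_p95() {
  local name="$1" budget="$2"; shift 2
  local p95
  p95="$(percentile 95 "$@")" || {
    echo "PT-10 FAIL: no samples for ${name}" >&2
    return 1
  }
  if (( p95 > budget )); then
    echo "PT-10 THRESHOLD FAIL: p95(${name})=${p95} > ${budget}" >&2
    return 1
  fi
  echo "PT-10 OK: p95(${name})=${p95} <= ${budget}"
}
```

With the per-iteration values collected into an array (here a hypothetical `first_batch_samples`), a call such as `gate_p95 first_batch_ms 30000 "${first_batch_samples[@]}" || exit 1` would give the non-zero exit and explicit threshold message that AC-3 asks for. Nearest-rank was chosen only because it needs no floating-point arithmetic in Bash; an interpolating definition would yield slightly different p95 values.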