[AZ-1124] Add PT-10 gRPC stream perf scenario
ci/woodpecker/push/01-test Pipeline failed
ci/woodpecker/push/02-build-push unknown status

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-06-26 11:26:14 +03:00
parent a0449f79d0
commit 7dac986996
15 changed files with 598 additions and 11 deletions
@@ -262,6 +262,13 @@ Step 9 cycle 8b: folded into cycle 8 as step 1 (AZ-812). Section retained in dep
Step 9 cycle 9: 2 tasks created (AZ-1074 = 5 pts, AZ-1075 = 3 pts) — total 8 pts. gRPC TileStream for route-based progressive tile delivery.
Step 9 cycle 10: 1 task created (AZ-1113 = 2 pts) — REST 400 error message sanitization (F-AZ795-1/2, F-AZ810-1). Child of AZ-795.
Step 9 cycle 11: 1 task created (AZ-1123 = 1 pt) — document `docker-compose.perf.yml` host-port conflict playbook (cycle 10 retro action).
Step 9 cycle 12: 1 task created (AZ-1124 = 3 pts) — PT-10 gRPC `DeliverRouteTiles` stream perf scenario (cycle 9–11 retro carry-over).
### Step 9 cycle 12 (PT-10 gRPC stream perf — AZ-1124)
| Task | Depends On | Points | Status |
|------|-----------|--------|--------|
| AZ-1124 PT-10 gRPC stream perf scenario | AZ-1074, AZ-1075, AZ-492 | 3 | Todo |
## Coverage Verification
@@ -0,0 +1,142 @@
# PT-10: gRPC DeliverRouteTiles stream performance scenario
**Task**: AZ-1124_pt10_grpc_stream_perf
**Name**: PT-10 gRPC stream perf scenario
**Description**: Add a runnable PT-10 performance scenario that measures latency of the existing `DeliverRouteTiles` server-streaming RPC against a live API.
**Complexity**: 3 points
**Dependencies**: AZ-1074, AZ-1075, AZ-492
**Component**: Test infrastructure (`scripts/run-performance-tests.sh`, `SatelliteProvider.IntegrationTests`) + perf NFR coverage (`_docs/02_document/tests/performance-tests.md`)
**Tracker**: AZ-1124
**Epic**: AZ-115
### Document Dependencies
- `_docs/02_document/contracts/c11_tilemanager/tile_provision_grpc.md` (consumer — wire contract for `DeliverRouteTiles`)
- `_docs/02_document/tests/blackbox-tests.md` § BT-32 (functional baseline; PT-10 adds NFR evidence)
### ADR Compliance
> Implements ADR-013: gRPC `RouteTileDelivery` transport alongside REST for progressive tile delivery.
## Problem
Cycle 9 shipped `RouteTileDelivery.DeliverRouteTiles` (AZ-1074) with functional integration coverage (AZ-1075 / BT-32). The performance gate still runs PT-01..PT-08 (REST-only). Step 15 can pass while the gRPC streaming surface remains **Unverified** for latency, a gap called out in cycle 9–11 retrospectives and LESSONS.md (`[testing]` entry 2026-06-25).
Operators and autodev Step 15 need measurable evidence for:
- **Time-to-first tile batch** on a cold path (may include Google Maps download)
- **Total stream duration** until `DeliveryComplete`
- **Slow-consumer smoke** — stream completes without corruption when the client deliberately delays between events (backpressure sanity check per AZ-1074 AC-4)
## Outcome
- `scripts/run-performance-tests.sh` runs PT-10 against the **real** gRPC `DeliverRouteTiles` endpoint (TLS + JWT metadata), not a mock.
- PT-10 reports `first_batch_ms`, `total_stream_ms`, and optional slow-consumer pass/fail over `PERF_REPEAT_COUNT` iterations (default 20).
- `_docs/02_document/tests/performance-tests.md` documents PT-10 triggers, thresholds, and pass criteria.
- `_docs/02_document/tests/traceability-matrix.md` marks gRPC stream perf as covered (no longer Unverified).
- The gRPC perf gap from cycle 9–11 retros is closed.
## Scope
### Included
- Add `SatelliteProvider.IntegrationTests` perf bootstrap subcommand (e.g. `--run-pt10`) invoked from `scripts/run-performance-tests.sh`, following the AZ-492 pattern (`--mint-only`, `--gen-uav-fixture`). The subcommand:
- Mints or accepts JWT via existing `PerfBootstrap` / `JwtTokenFactory` surface
- Opens a gRPC channel to `API_URL` (same TLS trust as REST perf probes)
- Calls `DeliverRouteTiles` with the standard 2-waypoint / 500m / zoom-18 fixture (`GrpcTestHelpers.BuildValidRequest` coordinates)
- Records wall-clock time to the **first `RouteTileEvent` with a `Batch` payload** and to **stream close** (`DeliveryComplete` or `DeliveryError`)
- Supports `PERF_REPEAT_COUNT` cold iterations; prints p50/p95 summary to stdout for the shell script to gate
- Optional slow-consumer mode (`PERF_PT10_SLOW_MS`, the delay in milliseconds between stream events, default 50) on a single iteration; asserts that the stream completes with ≥1 tile and no `DeliveryError`
- Add PT-10 section to `scripts/run-performance-tests.sh` with non-zero exit on harness failure (same contract as PT-07/PT-08).
- Add PT-10 spec block to `_docs/02_document/tests/performance-tests.md` with documented thresholds:
- Cold `p95(first_batch_ms) ≤ 30000` (aligns with PT-01 cold tile budget — includes GM round-trip)
- Cold `p95(total_stream_ms) ≤ 120000` (2-point 500m corridor at zoom 18 on dev hardware)
- Slow-consumer sub-check: completes without error (no fixed ms threshold)
- Update `traceability-matrix.md` — gRPC streaming NFR row references PT-10.
- Update script header comment from "PT-01..PT-08" to include PT-10.
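
The environment knobs named in the scope list could be resolved at the top of the PT-10 section roughly as follows. This is a minimal sketch, not the script's actual contents: the variable names and defaults come from this spec, while the `require_uint` helper and the log line format are assumptions added for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Defaults documented in this spec: 20 cold iterations, 50 ms slow-consumer
# delay, API reachable through the docker-compose.perf.yml overlay.
: "${PERF_REPEAT_COUNT:=20}"
: "${PERF_PT10_SLOW_MS:=50}"
: "${API_URL:=https://localhost:18980}"

# require_uint NAME VALUE (hypothetical helper): reject anything that is not
# a non-negative integer so a typo fails before any gRPC call is made.
require_uint() {
  local name="$1" value="$2"
  if [[ ! "$value" =~ ^[0-9]+$ ]]; then
    echo "PT-10: $name must be a non-negative integer, got '$value'" >&2
    return 1
  fi
}

require_uint PERF_REPEAT_COUNT "$PERF_REPEAT_COUNT"
require_uint PERF_PT10_SLOW_MS "$PERF_PT10_SLOW_MS"
echo "PT-10 config: repeats=$PERF_REPEAT_COUNT slow_ms=$PERF_PT10_SLOW_MS api=$API_URL"
```

Validating early keeps a malformed override from surfacing later as a confusing probe failure.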
### Excluded
- Production code changes to `RouteTileDeliveryGrpcService`, orchestrator, or proto (endpoint already exists).
- New gRPC RPCs or contract version bumps.
- Load testing with concurrent streams (k6 / multiple clients) — single-client sequential repeats only.
- Promoting PT-09 inventory inline test into the shell harness (separate follow-up).
- Setting production SLOs — thresholds are dev-harness budgets; Step 15 retains its A/B/C gate on threshold failures.
## Acceptance Criteria
**AC-1: PT-10 exercises real gRPC stream**
Given a running API with TLS and valid `JWT_SECRET`
When `scripts/run-performance-tests.sh` runs PT-10
Then the probe calls `RouteTileDelivery.DeliverRouteTiles` over gRPC with `authorization: Bearer` metadata, AND receives at least one `Batch` event before `DeliveryComplete` on every successful iteration.
**AC-2: Latency metrics reported**
Given `PERF_REPEAT_COUNT` cold iterations (default 20)
When PT-10 completes
Then the harness prints `first_batch_ms` and `total_stream_ms` per iteration plus aggregate p50/p95 for both metrics.
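
The p50/p95 aggregation in AC-2 can be illustrated with a nearest-rank percentile over integer millisecond samples. The spec does not say which percentile definition the harness uses, so nearest-rank is an assumption here, and both the `percentile` function name and the sample values are made up for the example (they are not measurements).

```bash
# percentile P V1 V2 ...: nearest-rank percentile of integer samples.
# Assumption: samples are whole milliseconds and P is an integer in 1..100.
percentile() {
  local p="$1"
  shift
  local n="$#"
  if (( n == 0 )); then
    echo "percentile: no samples" >&2
    return 1
  fi
  # Nearest-rank: rank = ceil(P * n / 100), computed in integer arithmetic.
  local rank=$(( (p * n + 99) / 100 ))
  if (( rank < 1 )); then
    rank=1
  fi
  # Sort numerically and print the line at the computed rank.
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

# Illustrative values only.
first_batch_ms=(1200 950 1810 1020 990)
echo "first_batch_ms p50=$(percentile 50 "${first_batch_ms[@]}") p95=$(percentile 95 "${first_batch_ms[@]}")"
# prints: first_batch_ms p50=1020 p95=1810
```

With the default of 20 iterations, nearest-rank p95 selects the 19th smallest sample, so a single outlier does not by itself fail the gate.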
**AC-3: Threshold gate**
Given the documented dev-hardware budgets in `performance-tests.md`
When PT-10 p95 values are compared
Then `p95(first_batch_ms) ≤ 30000` AND `p95(total_stream_ms) ≤ 120000`, OR the script exits non-zero with an explicit threshold failure message (Step 15 may still offer override).
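
One way the shell side of this gate might look is sketched below. The budgets (30000 ms and 120000 ms) are the ones documented in this spec; the function names `check_threshold` and `pt10_gate` and the message wording are hypothetical.

```bash
# check_threshold NAME ACTUAL_MS BUDGET_MS (hypothetical helper):
# succeed when ACTUAL_MS is within budget, otherwise print an explicit
# threshold failure message to stderr and return non-zero.
check_threshold() {
  local name="$1" actual="$2" budget="$3"
  if (( actual > budget )); then
    echo "PT-10 THRESHOLD FAILURE: p95($name)=${actual}ms exceeds budget ${budget}ms" >&2
    return 1
  fi
  echo "PT-10 OK: p95($name)=${actual}ms <= ${budget}ms"
}

# pt10_gate FIRST_BATCH_P95 TOTAL_STREAM_P95: evaluate both budgets so the
# operator sees every violation, then report a single combined status.
pt10_gate() {
  local first_p95="$1" total_p95="$2" failed=0
  check_threshold first_batch_ms "$first_p95" 30000 || failed=1
  check_threshold total_stream_ms "$total_p95" 120000 || failed=1
  return "$failed"
}
```

Evaluating both thresholds before returning, rather than stopping at the first failure, gives Step 15 the full picture when deciding on an override.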
**AC-4: Slow-consumer smoke**
Given `PERF_PT10_SLOW_MS` > 0 (default 50)
When PT-10 runs the slow-consumer sub-check once per script invocation
Then the stream completes with `DeliveryComplete`, at least one tile in collected batches, AND no `DeliveryError`.
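
The pass criteria in AC-4 can be illustrated independently of gRPC. The real probe is expected to live in the IntegrationTests subcommand; the sketch below only models the decision logic over a hypothetical line-per-event text format (`Batch <tile_count>`, `DeliveryComplete`, `DeliveryError`) read from stdin, and it assumes a `sleep` that accepts fractional seconds.

```bash
# slow_consumer_check DELAY_MS (hypothetical helper): consume events from
# stdin, pausing DELAY_MS after each one to mimic a deliberately slow client.
# Passes only when DeliveryComplete was seen, at least one tile was
# collected, and no DeliveryError appeared.
slow_consumer_check() {
  local delay_ms="$1" tiles=0 completed=0 kind count delay_s
  # Convert milliseconds to a fractional-seconds string, e.g. 50 -> 0.050.
  delay_s=$(printf '%d.%03d' $(( delay_ms / 1000 )) $(( delay_ms % 1000 )))
  while read -r kind count; do
    case "$kind" in
      Batch) tiles=$(( tiles + ${count:-0} )) ;;
      DeliveryComplete) completed=1 ;;
      DeliveryError)
        echo "PT-10 slow-consumer: DeliveryError received" >&2
        return 1
        ;;
    esac
    sleep "$delay_s"
  done
  if (( completed == 1 && tiles >= 1 )); then
    echo "PT-10 slow-consumer: pass (tiles=$tiles)"
  else
    echo "PT-10 slow-consumer: fail (completed=$completed tiles=$tiles)" >&2
    return 1
  fi
}
```

For example, `printf 'Batch 4\nBatch 2\nDeliveryComplete\n' | slow_consumer_check 50` reports a pass with six tiles, while a stream that ends without `DeliveryComplete` is reported as a failure.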
**AC-5: Spec and traceability updated**
Given the post-task repository state
When `performance-tests.md` and `traceability-matrix.md` are read
Then PT-10 is documented as **Implemented** with scenario ID PT-10, AND the gRPC stream perf row is no longer `Unverified`.
**AC-6: No duplicated JWT / gRPC bootstrap**
Given existing `PerfBootstrap`, `GrpcTestHelpers`, and `JwtTokenFactory`
When PT-10 mints tokens and opens channels
Then it reuses those surfaces — no third JWT mint implementation and no parallel gRPC client factory in the shell script.
## Non-Functional Requirements
**Performance**
- Thresholds above are dev-compose budgets on 8-core x86 baseline; document revision path if hardware changes.
**Reliability**
- Script exits non-zero if gRPC channel cannot be established, JWT is missing, or zero successful iterations complete.
- Fail loudly with RPC status code and detail on `DeliveryError` or non-OK `RpcException`.
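
The reliability rules above could translate into small guard functions in the shell script. This is a sketch under stated assumptions: `pt10_require_jwt` and `pt10_require_successes` are hypothetical names, and how the token and the success count reach the script is not specified here.

```bash
# pt10_require_jwt TOKEN (hypothetical helper): fail when no token is given.
pt10_require_jwt() {
  if [[ -z "${1:-}" ]]; then
    echo "PT-10 FAILURE: JWT is missing" >&2
    return 1
  fi
}

# pt10_require_successes OK_COUNT TOTAL_COUNT (hypothetical helper):
# fail when no iteration completed successfully.
pt10_require_successes() {
  local ok="$1" total="$2"
  if (( ok == 0 )); then
    echo "PT-10 FAILURE: 0 of $total iterations completed successfully" >&2
    return 1
  fi
  echo "PT-10: $ok of $total iterations succeeded"
}
```

Called with `|| exit 1` from the PT-10 section, either guard turns a silent gap into a non-zero exit, matching the PT-07/PT-08 contract.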
**Compatibility**
- Bash 4+; gRPC probe runs via `dotnet` IntegrationTests DLL (already built for perf bootstrap).
- Works with `docker-compose.perf.yml` overlay and `API_URL=https://localhost:18980` default.
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-6 | Grep / static review | No new `JwtSecurityToken` constructors in perf script; PT-10 logic lives in IntegrationTests |
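
The grep-based review for AC-6 might be automated along these lines. The function name `check_no_jwt_mint` is hypothetical, and the check is deliberately coarse: it flags any mention of `JwtSecurityToken` in the given file, not only constructor calls.

```bash
# check_no_jwt_mint FILE (hypothetical helper): fail if FILE mentions
# JwtSecurityToken, which would suggest a duplicate JWT mint outside the
# existing PerfBootstrap / JwtTokenFactory surface.
check_no_jwt_mint() {
  local file="$1"
  # grep -n prints matching lines with line numbers to aid review.
  if grep -n 'JwtSecurityToken' "$file"; then
    echo "AC-6 violation: $file references JwtSecurityToken" >&2
    return 1
  fi
  echo "AC-6 OK: no JwtSecurityToken in $file"
}
```

A typical invocation would be `check_no_jwt_mint scripts/run-performance-tests.sh`, run from the repository root.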
## Blackbox Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1..AC-4 | API + DB running via compose; valid JWT | Run full `scripts/run-performance-tests.sh` including PT-10 | Real gRPC stream; metrics printed; thresholds evaluated | PT-10 NFR |
| AC-1 | Same stack | Run IntegrationTests `--run-pt10` in isolation | Exit 0; JSON or tabular metrics on stdout | — |
## Constraints
- Must not regress PT-01..PT-08 or `scripts/run-tests.sh`.
- Must not commit minted tokens or disable TLS verification in tracked files.
- Threshold failures are reported by the script; autodev Step 15 handles operator override — this task does not weaken gates silently.
## Risks & Mitigation
**Risk 1: Cold-path variance from Google Maps**
- *Risk*: First-batch p95 spikes on network blips look like regressions.
- *Mitigation*: Document that PT-10 cold path includes GM download; Step 15 override path remains available; record actuals in `_docs/06_metrics/perf_<date>.md`.
**Risk 2: TLS / HTTP2 setup differs from integration tests**
- *Risk*: Perf script `API_URL` may not match in-compose `https://api:8080` used by integration tests.
- *Mitigation*: Reuse `GrpcTestHelpers.CreateChannel` cert handling; document required `API_URL` and `certs/api.crt` in perf script header (same as REST perf).
**Risk 3: Threshold too tight on laptop hardware**
- *Risk*: p95 gates fail on underpowered dev machines.
- *Mitigation*: Thresholds match PT-01/PT-03 family (generous dev budgets); tune in `_docs/06_metrics` after first green run if needed.