mirror of
https://github.com/azaion/satellite-provider.git
synced 2026-06-21 19:01:15 +00:00
[AZ-492] Cycle 3 batch 4: perf harness PT-07 + PT-08 + JWT-attach

Drains all three deferred perf-harness items in one batch:

- PT-01..PT-06 now carry Authorization: Bearer minted via the canonical SatelliteProvider.TestSupport.JwtTokenFactory (AZ-491); no third copy of JWT logic in the shell.
- PT-07 implemented as cold + warm dual-pass distribution (N=20 each), reports p50/p95 for both passes and fails if warm p95 >= cold p95.
- PT-08 implemented as 20-batch upload distribution with batch p95 gated at the AZ-488 2000 ms target; per-item gate cost reported as derived proxy (batch_p95 / batch_size).

New SatelliteProvider.IntegrationTests/PerfBootstrap.cs adds two CLI short-circuit subcommands (--mint-only and --gen-uav-fixture <path>) invoked by the shell so the perf script never inlines the JWT or JPEG-fixture logic. The dispatch sits at the top of Program.cs Main and runs before any HTTP / DB / readiness setup.

performance-tests.md PT-07 + PT-08 flip from Deferred to Implemented. traceability-matrix.md PT-07 + PT-08 rows move from recorded to covered (PT-08 partial due to per-item proxy; flagged Low in batch-4 review).

_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md deleted; the leftovers directory is now empty. Closes cycle-2 retro Action 2; LESSONS.md [process] rule about Deferred NFRs remains in force as a guardrail.

Also includes the previously-uncommitted cumulative review report for cycle-3 batches 01-03 (generated at the end of batch 3 but not staged).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,131 @@
# Performance harness: PT-07 + PT-08 + JWT-attach in run-performance-tests.sh

**Task**: AZ-492_perf_harness_pt07_pt08_jwt_attach
**Name**: Perf harness drains all 3 deferred items
**Description**: Promote the deferred PT-07 (route-tile fetch warm cache) and PT-08 (UAV tile batch upload latency) NFRs into actual runnable scenarios in `scripts/run-performance-tests.sh`, AND fix the script so PT-01..PT-06 stop returning 401 against the post-AZ-487 build by attaching a Bearer token to every request.
**Complexity**: 3 points
**Dependencies**: AZ-487 (Bearer-token attach depends on the JWT_SECRET / token-mint surface introduced in AZ-487); soft dependency on AZ-491 (Option B token-mint reuse)
**Component**: Test infrastructure (`scripts/run-performance-tests.sh`) + perf NFR coverage (`_docs/02_document/tests/performance-tests.md`)
**Tracker**: AZ-492
**Epic**: none (cycle-3 test-infrastructure hardening)

## Problem

The performance test gate has now been 0-of-N for two cycles running, and the perf-side rot is actively masking real regressions:

1. **Cycle 1 (AZ-484)**: PT-07 (route-tile fetch warm cache) was added to `performance-tests.md` and the traceability matrix, but the runner-script implementation was not. Recorded as `Deferred — harness work tracked in _docs/_process_leftovers/2026-05-11_perf-pt07-harness.md`. Step 15 Performance Test gate marked all scenarios `Unverified`. Cycle 1 retrospective Action 2 introduced the "Deferred-status NFRs are allowed at most once" rule (LESSONS.md `[process]`).

2. **Cycle 2 (AZ-488)**: PT-08 (UAV tile batch upload latency) was added to `performance-tests.md`, again as Deferred under the cycle-1-sanctioned escape hatch. The leftover file was updated with a PT-08 follow-on instruction.

3. **Cycle 2 (AZ-487 side effect)**: `scripts/run-performance-tests.sh` does not attach an `Authorization: Bearer …` header to its outbound requests. After AZ-487 applied `RequireAuthorization()` to every endpoint, every PT-01..PT-06 call returns 401. The Step 15 Performance Test gate at cycle 2 had to be skipped because of this script rot. Recorded as a third item in `_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md`.

Cycle 2 retrospective Improvement Action 2 (`_docs/06_metrics/retro_2026-05-11_cycle2.md` § 7) promoted "schedule PT-07 + PT-08 + JWT-attach as actual feature work" to a top-3 action. Per the cycle-2 LESSONS.md `[process]` rule, any new Deferred-status NFR after this point requires this PBI to land first.

## Outcome

- `scripts/run-performance-tests.sh` mints (or reads from `.env`) a valid HS256 JWT signed with `JWT_SECRET` and attaches `Authorization: Bearer <token>` to every probe request.
- PT-01..PT-06 return real HTTP 200 responses with measurements written to `_docs/06_metrics/perf_<date>.md`, not 401 and not `Unverified`.
- PT-07 (route-tile fetch warm cache) is implemented as a runnable scenario in the script. Its row in `performance-tests.md` moves from `Status: Deferred` to `Status: Implemented` with the measurement target documented.
- PT-08 (UAV tile batch upload latency) is implemented as a runnable scenario in the script. Same status transition.
- The leftover file `_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md` is deleted (or empty enough to delete) after this PBI lands, with all three items resolved.
- A regression test in CI runs the perf script in smoke mode (single iteration per scenario) to keep the script honest going forward.

## Scope

### Included

- Add Bearer-token attach logic to `scripts/run-performance-tests.sh`. Two viable shapes (implementer's choice):
  - **Option A**: script accepts `PERF_JWT_TOKEN` env var (operator pre-mints) and attaches via `curl -H "Authorization: Bearer $PERF_JWT_TOKEN"`.
  - **Option B**: script invokes a small `dotnet run --project SatelliteProvider.TestSupport` (or equivalent) to mint the token from `JWT_SECRET` on the fly, then attaches it. Reuses the consolidated factory from `01_consolidate_jwt_test_helpers` if that PBI ships first.
- Implement PT-07 scenario per its existing spec in `_docs/02_document/tests/performance-tests.md` (cold + warm region request, measure `tile_lookup_ms` vs `total_ms`).
- Implement PT-08 scenario per its existing spec (UAV batch upload of N tiles, measure end-to-end latency + per-item gate cost).
- Update `performance-tests.md`: move PT-07 and PT-08 from `Status: Deferred` to `Status: Implemented`; document measurement targets and acceptable variance.
- Update `_docs/02_document/tests/traceability-matrix.md`: refresh PT-07 + PT-08 coverage notes from `Unverified` to actual scenario IDs.
- Add a CI smoke run of the perf script (or document why none, e.g., it needs a full DB seed; in that case, the gate at Step 15 stays manual but the script is verified per cycle by autodev).
- Delete `_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md` (or trim to only any genuinely-future items not addressed here).
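
A minimal sketch of the Option A shape, assuming the operator exports `PERF_JWT_TOKEN` before the run. Only `PERF_JWT_TOKEN` and the `curl -H` form come from this spec; the function names `auth_header` and `probe`, the `PERF_BASE_URL` variable, and its default value are hypothetical and chosen for illustration only.

```bash
#!/usr/bin/env bash
# Sketch of Option A: attach a pre-minted token to every probe request.
# auth_header, probe, and PERF_BASE_URL are hypothetical names.

# Print the Authorization header, or fail loudly if no token was supplied.
auth_header() {
  if [ -z "${PERF_JWT_TOKEN:-}" ]; then
    echo "ERROR: PERF_JWT_TOKEN is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s' "$PERF_JWT_TOKEN"
}

# Issue one authenticated GET and print "<http_status> <total_seconds>".
probe() {
  local path="$1" header
  header="$(auth_header)" || return 1
  curl -sS -o /dev/null -H "$header" \
    -w '%{http_code} %{time_total}\n' \
    "${PERF_BASE_URL:-http://localhost:8080}${path}"
}
```

Routing every request through one helper keeps the token out of the individual scenario bodies, so a later switch to Option B would only change how `PERF_JWT_TOKEN` gets its value.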

### Excluded

- Any new perf scenarios beyond PT-07 + PT-08 + the 401 fix.
- Any change to production code; this is harness-only work.
- Any threshold-based PASS/FAIL gating in autodev Step 15. The existing skill handles threshold comparison; this PBI only makes the scenarios runnable.
- Replacing `curl`-based probes with a structured load tool (k6, JMeter, etc.). That is a separate decision.

## Acceptance Criteria

**AC-1: PT-01..PT-06 no longer 401**
Given the post-AZ-487 build is running with a valid `JWT_SECRET`
When `scripts/run-performance-tests.sh` runs the existing PT-01..PT-06 scenarios
Then every probe request receives an HTTP 2xx response (or the scenario's documented expected status), AND zero 401 responses appear in the perf log output.
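
The "zero 401 responses" half of AC-1 could be checked mechanically along these lines. This is a sketch only: the log format (one `<scenario> <http_status> <seconds>` line per probe) and the function name `assert_no_401` are assumptions, not something the existing script is known to produce.

```bash
#!/usr/bin/env bash
# Sketch: fail if any probe line in the perf log carries status 401.
# Assumed (hypothetical) log format: "<scenario> <http_status> <seconds>".

assert_no_401() {
  local log="$1" count
  # Count lines whose second field is 401; "n + 0" prints 0 when none match.
  count="$(awk '$2 == 401 { n++ } END { print n + 0 }' "$log")"
  if [ "$count" -gt 0 ]; then
    echo "FAIL: $count probe(s) returned 401 (missing or invalid Bearer token?)" >&2
    return 1
  fi
  echo "OK: no 401 responses in $log"
}
```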

**AC-2: PT-07 runs to completion**
Given a clean tile cache state
When the script runs PT-07 (cold then warm region request)
Then `tile_lookup_ms` and `total_ms` are measured for both cold and warm requests, AND the warm `tile_lookup_ms` is documented as less than the cold value (no specific threshold is required; the difference only has to be measurable).
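
When PT-07 is run as repeated cold and warm passes, the samples need to be summarized before the warm and cold values can be compared. The sketch below uses the nearest-rank percentile definition; the function names `percentile` and `warm_faster_than_cold` are hypothetical, and the choice of nearest-rank over interpolation is an assumption rather than something this spec prescribes.

```bash
#!/usr/bin/env bash
# Sketch: summarize latency samples and compare warm against cold.
# percentile and warm_faster_than_cold are hypothetical helper names.

# percentile <p>: read one numeric sample per line on stdin and print the
# nearest-rank p-th percentile (p is an integer from 1 to 100).
percentile() {
  local p="$1"
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # integer ceil(p * NR / 100)
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# warm_faster_than_cold <cold_ms> <warm_ms>: succeed only if warm < cold.
warm_faster_than_cold() {
  awk -v c="$1" -v w="$2" 'BEGIN { exit !(w < c) }'
}
```

With 20 samples per pass, `percentile 50` picks the 10th sorted value and `percentile 95` picks the 19th.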

**AC-3: PT-08 runs to completion**
Given a valid JWT token with the `GPS` permission
When the script runs PT-08 (UAV batch upload of N tiles; N is parameterized, default 10)
Then the script reports per-item gate cost and end-to-end batch latency, AND all N items return either `accepted` or a documented reject reason (no `STORAGE_FAILURE` from harness misconfiguration).
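
If the harness can only time the whole batch, per-item gate cost can be reported as a derived proxy: batch latency divided by batch size. A small sketch, assuming latencies are in milliseconds; the function name `per_item_cost_ms` is hypothetical, and the proxy hides any per-item variation inside the batch.

```bash
#!/usr/bin/env bash
# Sketch: derive a per-item cost proxy from one end-to-end batch latency.
# per_item_cost_ms is a hypothetical helper name.

per_item_cost_ms() {
  local batch_ms="$1" batch_size="$2"
  if [ "$batch_size" -le 0 ]; then
    echo "ERROR: batch size must be positive" >&2
    return 1
  fi
  # LC_ALL=C keeps "." as the decimal separator regardless of locale.
  LC_ALL=C awk -v b="$batch_ms" -v n="$batch_size" \
    'BEGIN { printf "%.1f\n", b / n }'
}
```

For example, a 1840 ms batch of 10 tiles yields a proxy of 184.0 ms per item.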

**AC-4: Spec status reflects implementation**
Given the post-PBI repository state
When `_docs/02_document/tests/performance-tests.md` is read
Then PT-07 and PT-08 are marked `Status: Implemented` (or equivalent active state), AND the section is reformatted to no longer claim `harness work tracked in <leftover>`.

**AC-5: Leftover drained**
Given `_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md`
When this PBI lands
Then the file is either deleted or trimmed to genuinely-future items only (and the cycle-1 + cycle-2 follow-on instructions removed because they are now resolved).

**AC-6: Token-mint surface reused, not duplicated**
Given the consolidated JWT factory exists (after `01_consolidate_jwt_test_helpers`) OR only the existing per-project mint helpers exist
When the perf script mints its token
Then it reuses the canonical surface from `01_consolidate_jwt_test_helpers` (if that PBI has landed) rather than inlining a third mint implementation in the shell script.

## Non-Functional Requirements

**Reliability**
- Script must fail loudly (non-zero exit code, clear message) if `JWT_SECRET` is unset or shorter than 32 bytes, the same contract as `scripts/run-tests.sh`.
- Script must not silently skip a scenario; if a scenario cannot run, exit non-zero with an explicit reason.

**Compatibility**
- Bash 4+ compatible (the script already uses bash; do not introduce Python or other runtime dependencies for token minting unless they leverage an existing project, as in Option B above).
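
The two Reliability rules above could take roughly this form in the script. This is a sketch under assumptions: `require_jwt_secret` and `scenario_cannot_run` are hypothetical names, and the exact messages used by `scripts/run-tests.sh` are not reproduced here.

```bash
#!/usr/bin/env bash
# Sketch: fail loudly on a missing or short JWT_SECRET, and never skip a
# scenario silently. Both function names are hypothetical.

require_jwt_secret() {
  local bytes
  if [ -z "${JWT_SECRET:-}" ]; then
    echo "ERROR: JWT_SECRET is not set" >&2
    return 1
  fi
  # wc -c counts bytes rather than characters; the arithmetic expansion
  # strips any padding some wc implementations print.
  bytes=$(( $(printf '%s' "$JWT_SECRET" | wc -c) ))
  if [ "$bytes" -lt 32 ]; then
    echo "ERROR: JWT_SECRET is $bytes bytes; at least 32 are required" >&2
    return 1
  fi
}

# scenario_cannot_run <scenario> <reason>: abort with an explicit reason.
scenario_cannot_run() {
  echo "ERROR: $1 cannot run: $2" >&2
  exit 1
}
```

A script would typically call `require_jwt_secret || exit 1` before the first probe, so a misconfigured environment stops the run instead of producing a page of 401s.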

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | The Bearer-attach logic itself (if a helper function is introduced) | Returns a valid `Authorization: Bearer <token>` header value when invoked with a known secret |
| AC-6 | Token mint reuses canonical surface (after `01_consolidate_jwt_test_helpers` lands) | A grep across `scripts/` and test projects shows no new `new JwtSecurityToken(` constructor calls |
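
The AC-6 grep could be wrapped in a small check such as the one below. It is a sketch: the function name `check_no_inline_mint` is hypothetical, and the caller is assumed to pass only the directories that must stay free of direct constructor calls (that is, not the project that hosts the canonical factory itself).

```bash
#!/usr/bin/env bash
# Sketch: fail if any file under the given paths constructs a
# JwtSecurityToken directly. check_no_inline_mint is a hypothetical name.

check_no_inline_mint() {
  local hits
  # -r recurse, -n show line numbers, -F treat the pattern as a fixed string.
  hits="$(grep -rnF 'new JwtSecurityToken(' "$@" 2>/dev/null || true)"
  if [ -n "$hits" ]; then
    echo "FAIL: inline JWT mint found:" >&2
    echo "$hits" >&2
    return 1
  fi
}
```

A call such as `check_no_inline_mint scripts/` would then exit non-zero as soon as someone pastes mint logic into the shell layer.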

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | API container running with auth; valid `JWT_SECRET` exported | Run `scripts/run-performance-tests.sh` PT-01..PT-06 | All requests return non-401 status codes | Reliability |
| AC-2 | Clean tile cache; valid Bearer token attached | Run PT-07 cold + warm region request | Cold and warm `tile_lookup_ms` reported separately; warm < cold (no threshold required) | none |
| AC-3 | Valid Bearer token with `GPS` permission attached | Run PT-08 UAV batch upload of 10 tiles | All 10 items return `accepted` or documented reject; per-item gate cost reported | none |
| AC-4 | Post-PBI repo state | Read `performance-tests.md` | PT-07 + PT-08 status `Implemented`; no `Deferred — harness work` language | none |

## Constraints

- Must not regress the existing `scripts/run-tests.sh` flow.
- Must not bake a Bearer token into git-tracked files. The token is minted at runtime or read from a git-ignored env var.
- If the consolidated test-support library from `01_consolidate_jwt_test_helpers` exists, this PBI MUST reuse it for token minting; no third copy of the mint logic.

## Risks & Mitigation

**Risk 1: Dependency on `01_consolidate_jwt_test_helpers`**
- *Risk*: This PBI is cleaner if Option B (mint via `SatelliteProvider.TestSupport`) is taken AND that project exists. If `01_consolidate_jwt_test_helpers` chose its own Option B (no new library), Option B here is unavailable.
- *Mitigation*: Implementer picks between Option A (pre-minted token via env var) and Option B based on the state of `01_consolidate_jwt_test_helpers` at start of work. Option A always works regardless.

**Risk 2: Perf measurements are unstable on dev hardware**
- *Risk*: PT-07 + PT-08 are timing-sensitive. Running on a developer laptop will produce noisy results that look like regressions.
- *Mitigation*: The script reports measurements but does NOT gate on them. Autodev Step 15 already has an A/B/C gate on threshold failures handled by the test-run skill in perf mode. This PBI's scope is "make scenarios runnable", not "set thresholds".

**Risk 3: Token expiry mid-test**
- *Risk*: A 1-hour-lifetime token may expire during a long perf run.
- *Mitigation*: Mint with a generous lifetime (e.g., 4h) when starting the script. Document the lifetime choice in the script header.
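
As a complement to a generous lifetime, the script could verify up front that the token it holds will outlive the planned run. The sketch below only reads the `exp` claim from the JWT payload; it does not verify the signature and does not mint anything. The function names `jwt_exp` and `token_outlives` are hypothetical, and `base64 -d` is assumed to be available (as in GNU coreutils).

```bash
#!/usr/bin/env bash
# Sketch: read the "exp" claim of a JWT and check the remaining lifetime.
# jwt_exp and token_outlives are hypothetical names. No signature check.

# jwt_exp <token>: print the numeric exp claim (seconds since the epoch).
jwt_exp() {
  local payload pad
  # Take the middle segment and map base64url characters back to base64.
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # base64url omits padding; restore it so base64 -d accepts the input.
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d |
    sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# token_outlives <token> <seconds>: succeed if the token stays valid that long.
token_outlives() {
  local exp now
  exp="$(jwt_exp "$1")"
  if [ -z "$exp" ]; then
    echo "ERROR: token has no exp claim" >&2
    return 1
  fi
  now="$(date +%s)"
  [ $(( exp - now )) -ge "$2" ]
}
```

A run expected to take up to an hour could start with `token_outlives "$PERF_JWT_TOKEN" 3600 || exit 1`.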

**Risk 4: CI smoke run becomes a maintenance burden**
- *Risk*: A CI-side perf smoke run that runs on every commit may be flaky and become a source of noise.
- *Mitigation*: Document explicitly whether a CI smoke run is added. If added, run only on `dev` push (not every PR commit) and only with a tight per-scenario timeout to catch script-rot, not perf regressions.
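
A tight per-scenario timeout could be enforced with the coreutils `timeout` command, roughly as sketched here. The wrapper name `run_scenario_smoke` is hypothetical. Because `timeout` launches an external command, the scenario has to be invocable as a command (for example a script file) rather than as a shell function.

```bash
#!/usr/bin/env bash
# Sketch: run one scenario command under a time limit and report why it
# failed. run_scenario_smoke is a hypothetical name.

# run_scenario_smoke <name> <limit_seconds> <command> [args...]
run_scenario_smoke() {
  local name="$1" limit="$2" status=0
  shift 2
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    # GNU timeout exits with 124 when the limit is hit.
    echo "FAIL: $name exceeded ${limit}s" >&2
  elif [ "$status" -ne 0 ]; then
    echo "FAIL: $name exited with status $status" >&2
  fi
  return "$status"
}
```

Distinguishing the timeout status from an ordinary failure keeps the CI log focused on script rot (hangs, broken setup) rather than on latency numbers.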