Drains all three deferred perf-harness items in one batch: - PT-01..PT-06 now carry Authorization: Bearer minted via the canonical SatelliteProvider.TestSupport.JwtTokenFactory (AZ-491) — no third copy of JWT logic in the shell. - PT-07 implemented as cold + warm dual-pass distribution (N=20 each), reports p50/p95 for both passes and fails if warm p95 >= cold p95. - PT-08 implemented as 20-batch upload distribution with batch p95 gated at the AZ-488 2000 ms target; per-item gate cost reported as derived proxy (batch_p95 / batch_size). New SatelliteProvider.IntegrationTests/PerfBootstrap.cs adds two CLI short-circuit subcommands (--mint-only and --gen-uav-fixture <path>) invoked by the shell so the perf script never inlines the JWT or JPEG-fixture logic. The dispatch sits at the top of Program.cs Main and runs before any HTTP / DB / readiness setup. performance-tests.md PT-07 + PT-08 flip from Deferred to Implemented. traceability-matrix.md PT-07 + PT-08 rows move from recorded to covered (PT-08 partial due to per-item proxy — flagged Low in batch-4 review). _docs/_process_leftovers/2026-05-11_perf-pt07-harness.md deleted; the leftovers directory is now empty. Closes cycle-2 retro Action 2; LESSONS.md [process] rule about Deferred NFRs remains in force as a guardrail. Also includes the previously-uncommitted cumulative review report for cycle-3 batches 01-03 (generated at the end of batch 3 but not staged). Co-authored-by: Cursor <cursoragent@cursor.com>
15 KiB
Code Review — Batch 04 cycle 3
Tasks reviewed: AZ-492 (Perf harness PT-07 + PT-08 + JWT-attach) Date: 2026-05-12 Verdict: PASS_WITH_WARNINGS (2 Low findings; 0 Critical/High/Medium)
Phase 1: Context
Spec inputs read: _docs/02_tasks/todo/AZ-492_perf_harness_pt07_pt08_jwt_attach.md; project restrictions / solution overview from prior batches still in force; cycle-2 retrospective Action 2 explicitly promoted this work; LESSONS.md [process] rule on Deferred-status NFRs is the governing guardrail. Changed files mapped to the AZ-492 ACs:
SatelliteProvider.IntegrationTests/PerfBootstrap.cs(new) → AC-1 (mint surface) + AC-3 (UAV fixture) + AC-6 (no duplicate mint)SatelliteProvider.IntegrationTests/Program.cs(modified) → dispatch for AC-1 / AC-3 subcommandsscripts/run-performance-tests.sh(rewritten) → AC-1 / AC-2 / AC-3_docs/02_document/tests/performance-tests.md→ AC-4_docs/02_document/tests/traceability-matrix.md→ AC-4 (status visibility)_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md(deleted) → AC-5- Module-layout + tests_integration doc updates → architectural documentation
_docs/06_metrics/retro_2026-05-11_cycle2.md§ Action 2 → process-resolution back-reference
Phase 2: Spec Compliance
| AC | Status | Evidence |
|---|---|---|
| AC-1: PT-01..PT-06 no longer 401 | Covered (static) | 10 curl sites in scripts/run-performance-tests.sh all carry -H "$AUTH_HEADER"; AUTH_HEADER="Authorization: Bearer $PERF_JWT_TOKEN". The wait_region_completed polling helper also carries it (line 131). The PT-08 curl_args array carries it (line 389). Runtime verification deferred to Step 16. |
| AC-2: PT-07 runs to completion | Covered (static) | Two timing arrays (PT07_COLD_MS, PT07_WARM_MS) populated by 20 cold + 20 warm requests at the same coordinate set. percentile() awk helper computes p50 and p95 for both. Pass/fail asserts warm_p95 < cold_p95. Same-coordinate design guarantees the warm pass hits the cache (otherwise the cold pass would not have populated it). |
| AC-3: PT-08 runs to completion | Covered (static) | --gen-uav-fixture produces the JPEG once; 20 batches of 10 distinct-coordinate items uploaded. Per-batch accepted/rejected/failed counts surfaced. Batch p95 gated at 2000 ms. Per-item gate cost is a derived proxy (see Spec-vs-reality below). |
| AC-4: Spec status reflects implementation | Covered | performance-tests.md PT-07 and PT-08 both carry **Status**: **Implemented (AZ-492).**; "Deferred — harness work tracked in " language gone. traceability-matrix.md rows moved from ◐ recorded / ◐ recorded (Deferred) to ✓. |
| AC-5: Leftover drained | Covered | _docs/_process_leftovers/2026-05-11_perf-pt07-harness.md deleted; ls _docs/_process_leftovers/ shows no files. |
| AC-6: Token-mint surface reused, not duplicated | Covered | rg 'new JwtSecurityToken\(' matches one source-code site only (SatelliteProvider.TestSupport/JwtTokenFactory.cs). PerfBootstrap.MintToken() is a 6-line delegation to JwtTokenFactory.Create. The shell script does not inline JWT logic — it shells out to dotnet <dll> --mint-only. |
Contract verification: AZ-492 has no ## Contract section; not applicable.
Consumer-side contract verification: AZ-492 depends on AZ-487 (JWT_SECRET surface) and AZ-491 (canonical JwtTokenFactory). Both consumed correctly — the new PerfBootstrap.cs uses JwtTestHelpers.ResolveSecretOrThrow (the AZ-487-introduced env-var surface) and JwtTokenFactory.Create (the AZ-491 canonical factory). No drift.
Scope creep check: Implementation stayed within harness-only scope per spec § Excluded ("Any change to production code; this is harness-only work"). Documentation updates to module-layout / tests_integration / traceability-matrix are explicitly in scope for the doc-sync part of the spec. Retro back-reference is process hygiene, not scope creep.
Phase 3: Code Quality
SOLID:
PerfBootstraphas a clear single responsibility: short-circuit perf-harness subcommands. Two methods, both pure-CLI dispatch + delegate. Good.Program.csdispatch block is 13 lines — minimal coupling between perf-bootstrap and the rest of the runner. The decision to put it FIRST (before any HTTP / DB / env read) is correct — these subcommands should work even on a host where the API and Postgres are not running.
Error handling:
PerfBootstrap.MintTokencatchesInvalidOperationExceptionfromJwtTestHelpers.ResolveSecretOrThrowand writes to stderr, returning exit code 1. Good — the shell can detect the failure cleanly.PerfBootstrap.GenerateUavFixturevalidatesargs.Length >= 2and emits a usage hint on stderr with exit code 2. Good.- Shell script: every
curlcapture checks the HTTP code and adds toFAILon mismatch. Thepercentile()helper guards againstNR == 0to avoid div-by-zero. Thedotnet --mint-onlycapture checks[[ -z "$PERF_JWT_TOKEN" ]]after the call to catch an empty-output failure.
Naming: PerfSubject, PerfTokenLifetime, PerfBootstrap — clear, consistent prefix. Shell variables follow the existing convention (PT07_*, PT08_*, PERF_*). AUTH_HEADER is uppercase per shell-script convention.
Complexity:
PerfBootstrap.MintToken— 14 lines, no branches except the try/catch.PerfBootstrap.GenerateUavFixture— 12 lines.PerfBootstrap.CreateValidJpeg— 20 lines, single loop.scripts/run-performance-tests.sh— 430 lines total. Each PT-NN scenario is ~25–60 lines and is independently readable. The PT-08 batch loop is the longest single block (~70 lines including metadata construction); it could be extracted into a function but the inlining keeps the data flow obvious. Acceptable.
DRY:
- The same JPEG-creation pattern now exists in three places:
UavTileImageFactory.CreateRandomJpeg(unit),UavUploadTests.CreateValidJpeg(integration), andPerfBootstrap.CreateValidJpeg(perf bootstrap). This is flagged as L2 below with a recommended consolidation path.
Test quality: AZ-492 spec's "Unit Tests" table notes that AC-1 (Bearer-attach helper) is only testable if a helper function is introduced — the shell script inlines the header directly, so no shell-side helper to test. AC-6 is tested by the repo-wide grep (one source site → not new). Both AC verifications above use the static-check approach the spec authorises. The mint logic itself is unit-tested in SatelliteProvider.Tests/TestSupport/JwtTokenFactoryTests (AZ-491); the PerfBootstrap.MintToken wrapper is a 6-line delegation and is exercised end-to-end by Step 16's actual perf run.
Dead code: None added. The deleted leftover file removes ~50 lines of stale process documentation.
Phase 4: Security Quick-Scan
- Token lifetime: 4 hours, mitigation for AZ-492 Risk 3 (token expiry mid-test). Tokens are minted on each script run and never persisted; the perf script's tmpdir is wiped on exit (trap-based cleanup). No JWT material ends up on disk.
- Secret handling:
JWT_SECRETis read from env or.env(gitignored). Never echoed. The byte-length check fails fast for under-32-byte secrets — same contract asscripts/run-tests.sh. Good. PERF_JWT_TOKENenv var: documented as the operator-supplied alternative to in-script minting. If the operator pre-mints, the script does not echo the token value (only its byte length). Good.- Subject value:
perf-tests— distinct from the integration test runner subject (integration-tests) so audit logs can disambiguate the source. Good. permissions: GPSclaim: required by AZ-488's UAV upload endpoint. Granted to the perf token so PT-08 can exercise the AC-3 path. No other permissions are minted — least-privilege ish.- Input validation on
--gen-uav-fixture: path is treated as a literal filesystem path. The script writes to$PERF_TMP_DIR/uav_fixture.jpgwhich is a freshly-createdmktemp -ddirectory. No path-traversal risk in the current call site; if a future consumer passes an untrusted path it would write to that path — documented behaviour for an internal test helper.
No new attack surface introduced; no secret material touches version control; no new endpoints exposed.
Phase 5: Performance Scan
The harness IS the performance scan. Observations on the harness itself:
dotnet buildruns once per script invocation (or skipped if the DLL already exists). Build time ~5–10 s on dev hardware — acceptable for a perf run.dotnet <dll> --mint-onlystartup is ~1.5–2 s (CLR cold start + a tiny token mint). Acceptable for a one-time bootstrap.- The cold + warm passes in PT-07 do 40 total region requests at ~200m zoom 18 — about ~40 × (5–15 s per request) = 3.5–10 minutes for PT-07 alone. PT-08 adds another 20 × (200–2000 ms) = 4–40 s. The full script run on dev hardware is in the 8–15 minute range; not fast, not glacial.
- The
awkpercentile helper sorts the input array — O(N log N) per call. With N=20 this is trivial.
No performance concerns in the harness code itself.
Phase 6: Cross-Task Consistency
- Naming alignment with AZ-491:
PerfBootstrap.PerfSubject = "perf-tests"mirrors the AZ-491 pattern ofJwtTestHelpers.DefaultSubject = "integration-tests". Consistent.PermissionsClaimType = "permissions"matches the valueUavUploadTestsand the API use. - Architecture alignment with AZ-491 + AZ-493: pure / stateless logic stays in
TestSupport(JwtTokenFactory,IntegrationTestResetGuard); side-effectful / dependency-bearing logic stays in the consumer (IntegrationTestDatabaseResetfor Npgsql,PerfBootstrapfor ImageSharp). Same boundary as batches 2 and 3. Good. - Documentation pattern: AZ-492 follows the AZ-491 / AZ-493 doc pattern —
tests_integration.mdgets the runtime surface,module-layout.mdgets the boundary rationale,performance-tests.mdgets the test-spec status flip. Consistent. - Duplicate test-helper detection (the Phase 6 rule added by AZ-491 review): the JPEG factory triple is flagged below as L2 — the rule fires.
Findings
L1 (Low / Spec accuracy) — AZ-492 AC-3 "per-item gate cost" satisfied by proxy, not direct measurement
Location: scripts/run-performance-tests.sh PT-08 (~line 410); _docs/02_document/tests/performance-tests.md PT-08 entry.
Issue: AC-3 says "the script reports per-item gate cost and end-to-end batch latency". The end-to-end batch latency is direct; the per-item gate cost is reported as the derived proxy batch_p95 / batch_size. True per-call UavTileQualityGate.Validate timing requires server-side instrumentation that the AZ-492 spec § Excluded explicitly excludes ("Any change to production code; this is harness-only work").
Severity rationale: Low / Spec accuracy. The deviation is a conscious trade-off documented in both the script comments AND the test-spec doc AND the traceability-matrix row (which now reads ✓ (batch p95) / ◐ (per-item proxy only)). The AC is satisfied in spirit (the harness produces a per-item number); future work on production-side timing would replace the proxy with the real value.
Suggested follow-up: PBI to add a quality_gate_validate_ms field to the per-item response or a metrics endpoint, then update the perf script to consume it. Estimate 2 SP. Sequence: after the first AZ-492 perf-gate result determines whether the proxy is misleading in practice.
L2 (Low / Maintainability) — Duplicate CreateValidJpeg-shaped JPEG factory now in THREE locations
Location:
SatelliteProvider.Tests/TestUtilities/UavTileImageFactory.cs(unit tests — internal)SatelliteProvider.IntegrationTests/UavUploadTests.cs§CreateValidJpeg(integration — private)SatelliteProvider.IntegrationTests/PerfBootstrap.cs§CreateValidJpeg(perf bootstrap — private)
Issue: All three produce a 256×256 random-noise JPEG via ImageSharp Image<Rgba32> with random.Next(256) per channel and JpegEncoder { Quality = 95 }. The implementations differ trivially (constants, comments) but the logical surface is identical. This is exactly the cycle-2 problem that AZ-491 solved for the JWT factory.
Severity rationale: Low / Maintainability. None of the three is wrong; the issue is future drift — if the quality gate becomes pickier (e.g. enforces minimum entropy), all three factories must update in lockstep. The AZ-491 review explicitly added a Phase 6 rule for this pattern, and the rule fires here.
Suggested follow-up: Move the JPEG factory to SatelliteProvider.TestSupport/UavTileImageFactory.cs (extending the AZ-491 boundary). PerfBootstrap then becomes a one-line var bytes = UavTileImageFactory.CreateRandomJpeg();. The integration-tests UavUploadTests consumes the same surface, eliminating the third copy. Cost: 1–2 SP. The ImageSharp dependency would have to be added to SatelliteProvider.TestSupport.csproj, which is acceptable because both consumers (Tests + IntegrationTests) already depend on it.
Not a blocker for this PBI because (a) the proximate AZ-492 scope was harness-only, (b) extracting the factory in this batch would have pulled ImageSharp into TestSupport — a non-trivial architectural change that warrants its own review, and (c) the precedent is well-established (AZ-491 split the JWT factory; this is the natural sequel).
Architecture Compliance
- Layering:
PerfBootstrapsits inSatelliteProvider.IntegrationTests(Layer-99 test infra). It calls intoSatelliteProvider.TestSupport(Layer-99 test infra) which it already ProjectReferences. No production layer touched. The new module-layout note documents this explicitly. - WebApi documentation convention (AZ-495): not relevant to this batch; no WebApi changes.
- Test-isolation guardrail (AZ-493): PT-07's distinct-coordinate cold pass means the test data spreads across 20 cells per run. The persistent Postgres volume + AZ-493 reset hook handle cleanup between integration-test runs; the perf script doesn't share a runner with the integration tests, but if a future operator runs PT-07 against the same volume the cells will accumulate. Not a problem in practice (the integration-test reset hook truncates
tileson the next integration run), but worth noting in the perf-gate playbook if PT-07 ever starts mis-firing due to pre-existing cells at the chosen coordinates.
Recurring patterns (for the cycle 3 retrospective)
- Spec-vs-reality on derived measurements: PT-08's "per-item gate cost" became a proxy because the spec didn't constrain the measurement path. AZ-493 had a similar pattern (DB-name vs Host-allowlist). The cycle-3 retro should capture: "ACs that prescribe a measurement should also prescribe the path for collecting it, or note that the harness gets to choose between direct measurement and a proxy."
- Triple-duplicate test fixtures: the JWT factory was consolidated in AZ-491; the JPEG factory is the natural next target. Capture as an Improvement Action for cycle 4.
Verdict
PASS_WITH_WARNINGS — implementation satisfies all 6 ACs (runtime verification of AC-1 / AC-2 / AC-3 at Step 16). Two Low findings, both deferred to future PBIs by explicit scope choices in AZ-492. No Critical/High/Medium findings.
Ready to merge / advance to the next batch.