annotations/_docs/02_document/tests/performance-tests.md
Oleksandr Bezdieniezhnykh 03f879206e docs+src: complete Steps 1-3 outcomes + auth re-sync baseline
This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:19:05 +03:00

# Performance Tests
> **Calibration note**: no contracted SLAs exist anywhere in the codebase or `acceptance_criteria.md`. The thresholds below are **inferred starting points** anchored to the documented system properties. Step 15 (Performance Test) of the autodev existing-code flow will tune them against real targets. A test that fails the threshold is a *signal*, not a release-blocker, until the targets are contracted.
### NFT-PERF-LATENCY-01: Annotation create — p95 latency, small image
**Summary**: Sequential `POST /annotations` with a small frame stays under a per-call threshold at p95.
**Traces to**: implicit NFR; documented gap on AC-N-* (no contracted target)
**Metric**: end-to-end response latency in ms (consumer wall-clock from request start to body close).
**Preconditions**:
- SUT freshly started; warmup loop of 10 sequential calls discarded.
- Clean state; clean outbox; RabbitMQ stream consumer not connected (writes fan out via channel + outbox only).
- Single in-process consumer (no concurrent load).
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Warmup: 10× `POST /annotations` with `image_small.jpg` | discarded |
| 2 | Measure: 50× `POST /annotations` with `image_small.jpg`, sequential, single consumer | record latency per call |
| 3 | Compute p50, p95, p99 | summary stats |
**Pass criteria**: p95 ≤ 1500ms, p99 ≤ 3000ms (single-instance dev DB, no concurrent load).
**Duration**: ~2 minutes.
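
The warmup, measure, and summarize steps above can be sketched as a small harness. This is a minimal sketch, not the project's test runner: `measure_sequential`, `summarize`, and `passes` are hypothetical names, the nearest-rank percentile definition is an assumption (the spec does not say which percentile method to use), and `call` stands in for whatever issues the real `POST /annotations` request.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile (assumed method; the spec does not name one)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_sequential(call: Callable[[], None], warmup: int, measured: int) -> List[float]:
    """Run `call` sequentially; discard warmup timings, return latencies in ms."""
    for _ in range(warmup):
        call()  # step 1: warmup calls are not timed
    latencies_ms = []
    for _ in range(measured):
        start = time.perf_counter()
        call()  # step 2: one request, start to body close
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Step 3: p50, p95, p99 over the measured calls."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}


def passes(summary: Dict[str, float],
           p95_limit_ms: float = 1500.0,
           p99_limit_ms: float = 3000.0) -> bool:
    """Apply the pass criteria; defaults are the NFT-PERF-LATENCY-01 thresholds."""
    return summary["p95"] <= p95_limit_ms and summary["p99"] <= p99_limit_ms
```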
---
### NFT-PERF-LATENCY-02: Annotation create — large image
**Summary**: Same shape as NFT-PERF-LATENCY-01, but with a 7 MB image (`image_large.JPG`).
**Traces to**: same as NFT-PERF-LATENCY-01.
**Metric**: end-to-end response latency in ms.
**Preconditions**: same as NFT-PERF-LATENCY-01.
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Warmup: 5× `POST /annotations` with `image_large.JPG` | discarded |
| 2 | Measure: 20× `POST /annotations` with `image_large.JPG`, sequential | record latency per call |
| 3 | Compute p50, p95, p99 | summary stats |
**Pass criteria**: p95 ≤ 5000ms, p99 ≤ 8000ms.
**Duration**: ~2 minutes.
---
### NFT-PERF-THROUGHPUT-01: Annotation create — sustained writes
**Summary**: 5-minute sustained `POST /annotations` traffic at 5 RPS does not degrade response latency.
**Metric**: response latency over time + total successful responses.
**Preconditions**: SUT warm; clean state; clean outbox; RabbitMQ broker reachable.
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Warmup: 30s at 5 RPS with `image_small.jpg` | discarded |
| 2 | Measure: 5 minutes at 5 RPS, 1 consumer | record per-second latency p50/p95 |
| 3 | Compare windows | p95 in last minute ≤ 1.5× p95 in first minute |
**Pass criteria**: 0 HTTP 5xx; p95 latency in last minute ≤ 1.5× p95 in first minute.
**Duration**: ~6 minutes.
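
The window comparison in step 3 can be sketched as follows. This is an illustration under stated assumptions: the sample shape `(seconds since measurement start, latency in ms, HTTP status)` and the function name `evaluate_throughput` are hypothetical, and p95 uses a nearest-rank definition that the spec does not prescribe.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical sample shape: (seconds since measurement start, latency ms, HTTP status)
Sample = Tuple[float, float, int]


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile (assumed method)."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(95 * len(ordered) / 100)) - 1]


def evaluate_throughput(samples: List[Sample],
                        run_seconds: float = 300.0,
                        window_seconds: float = 60.0,
                        max_ratio: float = 1.5) -> Dict[str, object]:
    """Compare p95 of the last window against the first and count HTTP 5xx."""
    first = [lat for t, lat, _ in samples if t < window_seconds]
    last = [lat for t, lat, _ in samples if t >= run_seconds - window_seconds]
    if not first or not last:
        raise ValueError("need samples in both the first and the last window")
    server_errors = sum(1 for _, _, status in samples if 500 <= status <= 599)
    first_p95, last_p95 = p95(first), p95(last)
    return {
        "server_errors": server_errors,
        "first_p95": first_p95,
        "last_p95": last_p95,
        # Pass criteria: 0 HTTP 5xx and last-minute p95 <= 1.5x first-minute p95.
        "passed": server_errors == 0 and last_p95 <= max_ratio * first_p95,
    }
```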
---
### NFT-PERF-OUTBOX-DRAIN-01: FailsafeProducer drain rate
**Summary**: Under sustained writes, the outbox queue depth stays bounded.
**Traces to**: AC-N-03
**Metric**: `SELECT COUNT(*) FROM annotations_queue_records` sampled every 5s during the run.
**Preconditions**: NFT-PERF-THROUGHPUT-01 running; RabbitMQ broker reachable; no stream consumer back-pressure.
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | While -THROUGHPUT-01 is running, sample queue depth every 5s for the full duration | record samples |
| 2 | Compute max queue depth + average drain interval | summary stats |
**Pass criteria**: max queue depth ≤ 100 rows; depth at end-of-run ≤ depth at start-of-run + 10.
**Duration**: 5 minutes (overlaid on -THROUGHPUT-01).
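
The pass criteria above reduce to two checks over the sampled depths. A minimal sketch, assuming the samples are already collected as a list of integers in sampling order; `evaluate_outbox` is a hypothetical name, and the "average drain interval" from step 2 is left out because the spec does not define how it is computed.

```python
from typing import Dict, List


def evaluate_outbox(depth_samples: List[int],
                    max_depth: int = 100,
                    end_slack: int = 10) -> Dict[str, object]:
    """Check queue-depth samples (one per 5 s) against the bounded-depth criteria."""
    if not depth_samples:
        raise ValueError("no depth samples")
    start, end = depth_samples[0], depth_samples[-1]
    peak = max(depth_samples)
    return {
        "start_depth": start,
        "end_depth": end,
        "max_depth": peak,
        # max depth <= 100 rows, and end-of-run depth <= start-of-run depth + 10
        "passed": peak <= max_depth and end <= start + end_slack,
    }
```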
---
### NFT-PERF-SSE-FANOUT-01: SSE delivery latency under modest fan-out
**Summary**: 10 simultaneous SSE subscribers receive every event for their mission within the latency budget.
**Traces to**: AC-F-10
**Metric**: per-subscriber event-arrival latency (consumer wall-clock from `POST /annotations` returning to SSE event arrival).
**Preconditions**: SUT warm; clean state.
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Open 10 SSE connections to `/annotations/events?missionId=<m>` | all 10 alive |
| 2 | `POST /annotations` once for mission `<m>` | record post-return timestamp |
| 3 | Each subscriber records its event-arrival timestamp | per-subscriber latency |
| 4 | Compute max latency across the 10 subscribers | summary |
**Pass criteria**: every subscriber receives the event; max latency ≤ 1000ms.
**Duration**: 30s.
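
Steps 3 and 4 can be sketched as a pure evaluation over recorded timestamps. The names here are hypothetical: `evaluate_fanout` takes the post-return timestamp and a mapping from subscriber id to event-arrival timestamp (both in seconds on the same clock), with `None` for a subscriber that never saw the event. Opening the SSE connections themselves is outside this sketch.

```python
from typing import Dict, List, Optional


def evaluate_fanout(post_return_ts: float,
                    arrivals: Dict[str, Optional[float]],
                    expected_subscribers: int = 10,
                    budget_ms: float = 1000.0) -> Dict[str, object]:
    """Compute per-subscriber latency and apply the fan-out pass criteria."""
    missing: List[str] = [sid for sid, ts in arrivals.items() if ts is None]
    # Latency can be negative if the event reaches a subscriber before the
    # POST response has returned; that still counts as within budget.
    latencies_ms = {
        sid: (ts - post_return_ts) * 1000.0
        for sid, ts in arrivals.items()
        if ts is not None
    }
    max_latency_ms = max(latencies_ms.values()) if latencies_ms else None
    passed = (
        len(arrivals) == expected_subscribers
        and not missing
        and max_latency_ms is not None
        and max_latency_ms <= budget_ms
    )
    return {
        "latencies_ms": latencies_ms,
        "missing": missing,
        "max_latency_ms": max_latency_ms,
        "passed": passed,
    }
```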
---
### NFT-PERF-LIST-01: Annotation listing on populated DB
**Summary**: `GET /annotations?limit=100` against a DB with 10,000 rows responds within budget.
**Metric**: end-to-end response latency.
**Preconditions**: DB pre-seeded with 10,000 annotations + 50,000 detections (use `dataseed` to insert via direct SQL, bypassing the public API for population speed — the test still queries via the public API).
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Warmup: 5× `GET /annotations?limit=100&offset=0` | discarded |
| 2 | Measure: 20× `GET /annotations?limit=100&offset=<random 0..9000>` | record per-call latency |
| 3 | Compute p95 | summary |
**Pass criteria**: p95 ≤ 1000ms (read-only path; index `ix_annotations_created_date` should keep it fast).
**Duration**: ~1 minute.
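
The randomized offsets in step 2 can be generated up front so that a failing run is reproducible. A minimal sketch: `listing_urls` and its `seed` parameter are hypothetical, and the paths are relative because the spec does not state the SUT base URL.

```python
import random
from typing import List, Optional
from urllib.parse import urlencode


def listing_urls(count: int = 20,
                 limit: int = 100,
                 max_offset: int = 9000,
                 seed: Optional[int] = None) -> List[str]:
    """Build `count` listing paths with a uniformly random offset in 0..max_offset."""
    rng = random.Random(seed)  # a fixed seed replays the same offsets
    return [
        "/annotations?" + urlencode({"limit": limit, "offset": rng.randint(0, max_offset)})
        for _ in range(count)
    ]
```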
---
### NFT-PERF-DATASET-01: Dataset class distribution at scale
**Summary**: `GET /dataset/class-distribution` against the populated DB responds within budget.
**Metric**: end-to-end latency.
**Preconditions**: same populated DB as NFT-PERF-LIST-01.
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Warmup: 3× `GET /dataset/class-distribution` | discarded |
| 2 | Measure: 10× `GET /dataset/class-distribution` | record per-call latency |
| 3 | Compute p95 | summary |
**Pass criteria**: p95 ≤ 2000ms.
**Duration**: ~30s.