Files
annotations/_docs/02_document/tests/test-data.md
T
Oleksandr Bezdieniezhnykh 03f879206e docs+src: complete Steps 1-3 outcomes + auth re-sync baseline
This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:19:05 +03:00

103 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Test Data Management
## Seed Data Sets
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|----------|-------------|---------------|-----------|---------|
| `tokens-test` | 3 ES256 access tokens minted on demand by the runner: `ann-token` (claim `ANN`), `dataset-token` (`DATASET`), `adm-token` (`ADM`). All carry `iss=$JWT_ISSUER`, `aud=$JWT_AUDIENCE`, `exp=now+5m`, and a deterministic `sub` GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12 | The harness runs a **mock JWKS issuer** (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at `JWT_JWKS_URL`. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; key pair regenerates on `docker compose down -v` |
| `mission-test` | One canonical waypoint id `00000000-0000-0000-0000-000000000aaa` used as `WaypointId` / `MissionId` in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit — no FK enforcement; the GUID is just a column value. | N/A |
| `classes-baseline` | The 19 detection classes seeded by `DatabaseMigrator` (ids 018, names per `data_parameters.md`). | F7-001 (catalog read), F1-* (class_num references) | Auto, by the SUT's boot-time migrator. | N/A — schema-managed |
| `clean-state` | Empty `annotations`, `media`, `detection`, `annotations_queue_records` tables at the start of each test class. | every test class that asserts on count / depth | xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only). | Fixture's `Dispose()` truncates again |
## Data Isolation Strategy
- **Per-class truncation** — each xUnit test class declares an `IClassFixture<CleanStateFixture>` that truncates the four mutable tables before the first test in the class and again after the last.
- **Per-test token** — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
- **Per-test mission id** — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh `WaypointId` GUID per test so concurrent test runs don't leak events into each other.
- **Per-test stream consumer** — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset `next` (current end of stream). They consume only messages produced after the test starts.
- **Filesystem isolation** — `annotations-images`, `annotations-videos`, `annotations-deleted` volumes are recreated by `docker compose down -v` between full runs. Per-test cleanup removes only files the test wrote (matching `<id>` patterns).
## Input Data Mapping
| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| `image_small.jpg` | `<fixtures>/image_small.jpg` | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-* |
| `image_dense01.jpg` | `<fixtures>/image_dense01.jpg` | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
| `image_dense02.jpg` | `<fixtures>/image_dense02.jpg` | larger dense frame (~2.8 MB) | F5-002 |
| `image_different_types.jpg` | `<fixtures>/image_different_types.jpg` | multi-class scene (900×1600) | F8-002 (class filter) |
| `image_empty_scene.jpg` | `<fixtures>/image_empty_scene.jpg` | 1920×1080 empty scene | F1-003 (zero detections), NFT-PERF-* warmup |
| `image_large.JPG` | `<fixtures>/image_large.JPG` | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
| `video_short01.mp4` | `<fixtures>/video_short01.mp4` | ~150 MB video | F1-006 (video annotation), F1-007 |
| `video_short02.mp4` | `<fixtures>/video_short02.mp4` | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |
`<fixtures>` resolves to `/fixtures` inside the test runner / SUT container, bound to `../detections/_docs/00_problem/input_data/` per `_docs/00_problem/input_data/fixtures.md`.
## Synthetic request payloads
JSON request bodies for `POST /annotations`, `PUT /annotations/{id}`, `POST /dataset/status/bulk`, and the auth flows live under `_docs/00_problem/input_data/requests/`. Each test references a request file by id (`F1_001_request.json`). Class numbers in detections come from the seeded `detection_classes` (ids 018); coordinates are normalized 0..1 floats.
## Expected Results Mapping
(Full table is `_docs/00_problem/input_data/expected_results/results_report.md` — 44 rows. Selected entries here for cross-reference.)
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
|-----------------|------------|-----------------|-------------------|-----------|--------|
| FT-P-01 (=F1-001) | `image_small.jpg` + `F1_001_request.json` | HTTP 200 + `AnnotationDto`; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1` | exact, schema_match, regex | N/A | `expected_results/F1_001_response.json` |
| FT-P-02 (=F1-002) | Same input, second POST | Same `id` as FT-P-01; no duplicate row | exact | N/A | inline |
| FT-P-04 (=F1-004) | `image_dense01.jpg` + `F1_004_request.json` | HTTP 200; `detections.length == 5`; YOLO label file with 5 lines | exact, file_content | N/A | `expected_results/F1_004_response.json` |
| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with `operation == "Created"`, `latency ≤ 1000ms` | exact, threshold_max | ± 200ms | inline |
| FT-N-04 (=F1-N-004) | F1-001 with no `Authorization` header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
| NFT-PERF-LATENCY-01 | `image_small.jpg` × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
| NFT-SEC-01 | F1-001 with JWT signed by **wrong** key | HTTP 401 | exact | N/A | inline |
| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
## External Dependency Mocks
| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| RabbitMQ Stream broker | Real `rabbitmq:3.13-management` with the streams plugin | Docker service in `e2e-net` | Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app && start_app` |
| Postgres | Real `postgres:13` | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic `detections[]` payloads in `requests/`. |
| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to `http://annotations:8080`. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |
## Data Validation Rules
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| `image_bytes` (POST /annotations) | non-null, non-empty byte array | empty array `[]`, missing field | HTTP 400/422; error envelope |
| `mediaType` (POST /annotations) | enum `Image=10` or `Video=20` | `5`, `100`, missing | HTTP 400/422; error envelope |
| `detections[].class_num` | int, no range validator today | `-1`, `999` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `detections[].centerX/Y/width/height` | float, no range validator today | `1.5`, `-0.1`, `NaN` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `Authorization` header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with `alg` pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, `alg=HS256` forgery | HTTP 401; error envelope |
| Caller policy | `ANN`, `DATASET`, or `ADM` per endpoint | mismatched policy | HTTP 403; error envelope |
| `WaypointId` (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |
## Runtime-generated test data
Two scenario groups consume **synthetic test data generated by the runner at execution time** rather than static files on disk. This is intentional and explicitly allowed by `templates/expected-results.md` ("Test data may be generated programmatically — note this in test-data.md"):
| Scenario | Generated data | How |
|----------|----------------|-----|
| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 `annotations` rows + 50,000 `detection` rows in the test DB | `dataseed` job runs a parameterised SQL script that bulk-inserts rows with `media_id` referencing 100 distinct seeded media rows; uses `CROSS JOIN generate_series` for speed. Cleared by `clean-state` truncation between test classes. |
The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
## Bearer token harness
Annotations is verifier-only — there is no `/auth/login` to call from a test. The harness reproduces the production model in miniature:
1. **Key pair** — a fresh ES256 key pair is generated when the test stack starts (`docker compose up`). The private key is mounted into the runner container; the public key is mounted into a tiny **mock issuer** sidecar that serves `/.well-known/jwks.json` over HTTP **inside the docker-compose network**.
2. **JWKS URL configuration** — the SUT is started with `JWT_ISSUER=https://e2e-issuer.test`, `JWT_AUDIENCE=annotations-e2e`, and `JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json`. The HTTPS-only constraint of `HttpDocumentRetriever { RequireHttps = true }` is relaxed for tests by either (a) overriding `RequireHttps=false` via test-only configuration, or (b) running a TLS-terminating proxy in front of the issuer. Option (a) is preferred for simplicity; the relaxation is gated on `ASPNETCORE_ENVIRONMENT=E2ETest` and never applied in production builds. (This is the testability item flagged in `architecture.md` Open Risks §6.)
3. **Token minting** — the runner exposes a per-test helper `mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?)` that builds an ES256 JWT from the in-process private key with the configured `iss`/`aud`, `exp = now + 5m`, a per-role deterministic `sub` GUID, and the requested policy claim. `overrides` lets a test produce expired / wrong-iss / wrong-aud / forged-`alg=HS256` variants for the security suite.
4. **No persisted users** — there is no `users` table in this service. Each test mints exactly the token it needs.
## Notes for the runner
- **Boot order**: `postgres``rabbitmq``e2e-issuer` (mock JWKS) → `annotations` (waits for postgres, rabbitmq, and a successful JWKS fetch) → `dataseed``e2e-runner`.
- **Fresh-state vs. carry-over**: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
- **Stream consumption**: every test that reads from `azaion-annotations` records the offset before the test acts, then consumes from `start_offset = recorded_offset + 1` to ignore historical messages.
- **Conditional probes**: tests that depend on SUT behavior decisions (e.g., specific 4xx code on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, then asserts that branch consistently within the test class. Mismatch on a subsequent run flags as a behavior-drift test failure.