
# Test Data Management

## Seed Data Sets

| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|----------|-------------|---------------|-----------|---------|
| `tokens-test` | 3 ES256 access tokens minted on demand by the runner: `ann-token` (claim `ANN`), `dataset-token` (`DATASET`), `adm-token` (`ADM`). All carry `iss=$JWT_ISSUER`, `aud=$JWT_AUDIENCE`, `exp=now+5m`, and a deterministic `sub` GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12 | The harness runs a **mock JWKS issuer** (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at `JWT_JWKS_URL`. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; key pair regenerates on `docker compose down -v` |
| `mission-test` | One canonical waypoint id `00000000-0000-0000-0000-000000000aaa` used as `WaypointId` / `MissionId` in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit — no FK enforcement; the GUID is just a column value. | N/A |
| `classes-baseline` | The 19 detection classes seeded by `DatabaseMigrator` (ids 0–18, names per `data_parameters.md`). | F7-001 (catalog read), F1-* (class_num references) | Auto, by the SUT's boot-time migrator. | N/A — schema-managed |
| `clean-state` | Empty `annotations`, `media`, `detection`, `annotations_queue_records` tables at the start of each test class. | every test class that asserts on count / depth | xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only). | Fixture's `Dispose()` truncates again |

## Data Isolation Strategy

- **Per-class truncation** — each xUnit test class declares an `IClassFixture<CleanStateFixture>` that truncates the four mutable tables before the first test in the class and again after the last.
- **Per-test token** — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
- **Per-test mission id** — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh `WaypointId` GUID per test so concurrent test runs don't leak events into each other.
- **Per-test stream consumer** — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset `next` (current end of stream). They consume only messages produced after the test starts.
- **Filesystem isolation** — `annotations-images`, `annotations-videos`, `annotations-deleted` volumes are recreated by `docker compose down -v` between full runs. Per-test cleanup removes only files the test wrote (matching `<id>` patterns).
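
The isolation rules above can be sketched as three small helpers. This is a minimal illustration, not the suite's actual fixture code: the function names and the `e2e-<test>-<suffix>` consumer naming scheme are assumptions, while the table list and the `TRUNCATE` options come from the `clean-state` row.

```python
import uuid

# Table names are taken from the `clean-state` data set.
MUTABLE_TABLES = ("annotations", "media", "detection", "annotations_queue_records")


def truncate_statement(tables=MUTABLE_TABLES):
    """Build the single TRUNCATE statement a class fixture would execute."""
    return f"TRUNCATE {', '.join(tables)} RESTART IDENTITY CASCADE;"


def fresh_waypoint_id():
    """Random GUID, so concurrent tests never share an SSE fan-out key."""
    return str(uuid.uuid4())


def fresh_consumer_name(test_id):
    """Hypothetical naming scheme: the test id plus a short random suffix."""
    return f"e2e-{test_id}-{uuid.uuid4().hex[:8]}"
```

A fixture would run `truncate_statement()` once before the first test in a class and once after the last, while each test calls the two `fresh_*` helpers for itself.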

## Input Data Mapping

| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| `image_small.jpg` | `<fixtures>/image_small.jpg` | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-* |
| `image_dense01.jpg` | `<fixtures>/image_dense01.jpg` | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
| `image_dense02.jpg` | `<fixtures>/image_dense02.jpg` | larger dense frame (~2.8 MB) | F5-002 |
| `image_different_types.jpg` | `<fixtures>/image_different_types.jpg` | multi-class scene (900×1600) | F8-002 (class filter) |
| `image_empty_scene.jpg` | `<fixtures>/image_empty_scene.jpg` | 1920×1080 empty scene | F1-003 (zero detections), NFT-PERF-* warmup |
| `image_large.JPG` | `<fixtures>/image_large.JPG` | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
| `video_short01.mp4` | `<fixtures>/video_short01.mp4` | ~150 MB video | F1-006 (video annotation), F1-007 |
| `video_short02.mp4` | `<fixtures>/video_short02.mp4` | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |

`<fixtures>` resolves to `/fixtures` inside the test runner / SUT container, bound to `../detections/_docs/00_problem/input_data/` per `_docs/00_problem/input_data/fixtures.md`.

## Synthetic request payloads

JSON request bodies for `POST /annotations`, `PUT /annotations/{id}`, `POST /dataset/status/bulk`, and the auth flows live under `_docs/00_problem/input_data/requests/`. Each test references a request file by id (`F1_001_request.json`). Class numbers in detections come from the seeded `detection_classes` (ids 0–18); coordinates are normalized 0..1 floats.
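
When hand-authoring happy-path payloads, a small builder can guard against values that fall outside the seeded class range or the normalized coordinate range. The sketch below is illustrative only: the helper name is hypothetical, and the field names follow the ones used in the validation table in this document rather than a confirmed request schema.

```python
CLASS_IDS = range(0, 19)  # ids 0-18, as seeded by the migrator


def make_detection(class_num, center_x, center_y, width, height):
    """Return one detection entry, rejecting values a happy-path fixture should not contain."""
    if class_num not in CLASS_IDS:
        raise ValueError(f"class_num {class_num} is outside the seeded range 0-18")
    for name, value in (("centerX", center_x), ("centerY", center_y),
                        ("width", width), ("height", height)):
        # NaN fails every comparison, so it is rejected here as well.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is not a normalized 0..1 float")
    return {"class_num": class_num, "centerX": center_x, "centerY": center_y,
            "width": width, "height": height}
```

Negative-path request files (for example `class_num = 999`) would bypass this guard on purpose, since those tests exist to exercise the SUT's own validation.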

## Expected Results Mapping

(The full table, 44 rows, is in `_docs/00_problem/input_data/expected_results/results_report.md`; selected entries are reproduced here for cross-reference.)

| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
|-----------------|------------|-----------------|-------------------|-----------|--------|
| FT-P-01 (=F1-001) | `image_small.jpg` + `F1_001_request.json` | HTTP 200 + `AnnotationDto`; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1` | exact, schema_match, regex | N/A | `expected_results/F1_001_response.json` |
| FT-P-02 (=F1-002) | Same input, second POST | Same `id` as FT-P-01; no duplicate row | exact | N/A | inline |
| FT-P-04 (=F1-004) | `image_dense01.jpg` + `F1_004_request.json` | HTTP 200; `detections.length == 5`; YOLO label file with 5 lines | exact, file_content | N/A | `expected_results/F1_004_response.json` |
| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with `operation == "Created"`, `latency ≤ 1000ms` | exact, threshold_max | ± 200ms | inline |
| FT-N-04 (=F1-N-004) | F1-001 with no `Authorization` header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
| NFT-PERF-LATENCY-01 | `image_small.jpg` × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
| NFT-SEC-01 | F1-001 with JWT signed by **wrong** key | HTTP 401 | exact | N/A | inline |
| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
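
The `threshold_max` comparison for a percentile row such as NFT-PERF-LATENCY-01 can be sketched as follows. The nearest-rank percentile definition is an assumption; the source does not say which percentile method the runner uses, and both function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def within_threshold_max(value, limit):
    """threshold_max passes when the measured value does not exceed the limit."""
    return value <= limit
```

For 50 sequential calls, the p95 under this definition is the 48th-smallest latency, so `within_threshold_max(percentile_nearest_rank(latencies, 95), 1500)` would express the assertion.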

## External Dependency Mocks

| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| RabbitMQ Stream broker | Real `rabbitmq:3.13-management` with the streams plugin | Docker service in `e2e-net` | Real broker; resilience tests (NFT-RES-01..03) stop and restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app` followed by `docker exec rabbitmq rabbitmqctl start_app` |
| Postgres | Real `postgres:13` | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic `detections[]` payloads in `requests/`. |
| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to `http://annotations:8080`. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |

## Data Validation Rules

| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| `image_bytes` (POST /annotations) | non-null, non-empty byte array | empty array `[]`, missing field | HTTP 400/422; error envelope |
| `mediaType` (POST /annotations) | enum `Image=10` or `Video=20` | `5`, `100`, missing | HTTP 400/422; error envelope |
| `detections[].class_num` | int, no range validator today | `-1`, `999` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `detections[].centerX/Y/width/height` | float, no range validator today | `1.5`, `-0.1`, `NaN` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `Authorization` header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with `alg` pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, `alg=HS256` forgery | HTTP 401; error envelope |
| Caller policy | `ANN`, `DATASET`, or `ADM` per endpoint | mismatched policy | HTTP 403; error envelope |
| `WaypointId` (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |

## Runtime-generated test data

Two scenario groups consume **synthetic test data generated by the runner at execution time** rather than static files on disk. This is intentional and explicitly allowed by `templates/expected-results.md` ("Test data may be generated programmatically — note this in test-data.md"):

| Scenario | Generated data | How |
|----------|----------------|-----|
| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 `annotations` rows + 50,000 `detection` rows in the test DB | `dataseed` job runs a parameterised SQL script that bulk-inserts rows with `media_id` referencing 100 distinct seeded media rows; uses `CROSS JOIN generate_series` for speed. Cleared by `clean-state` truncation between test classes. |

The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
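
As an illustration of the upload-boundary blobs, the following sketch writes a JPEG-prefixed file of an exact size. It makes several assumptions: the header bytes `FF D8 FF E0` are one plausible reading of "4-byte JPEG magic header", "MB" is taken as 1024×1024 bytes, and the seeded generator is a choice made here to keep the bytes reproducible.

```python
import os
import random
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff\xe0"  # SOI marker followed by the APP0 marker (assumed header)


def write_jpeg_prefixed_blob(path, size_bytes, seed=0, chunk=1 << 20):
    """Write exactly size_bytes: the 4-byte header, then seeded pseudo-random filler."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("target size is smaller than the header")
    rng = random.Random(seed)
    remaining = size_bytes - len(JPEG_MAGIC)
    with open(path, "wb") as fh:
        fh.write(JPEG_MAGIC)
        while remaining > 0:
            n = min(chunk, remaining)
            # Chunked writes keep memory flat even for the largest sizes.
            fh.write(rng.getrandbits(8 * n).to_bytes(n, "little"))
            remaining -= n
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blob = write_jpeg_prefixed_blob(os.path.join(tmp, "blob_1mb.jpg"), 1024 * 1024)
        print(os.path.getsize(blob))
```

Writing into a temporary directory mirrors the "uploaded once, deleted after" rule: the file disappears when the context manager exits.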

## Bearer token harness

Annotations is verifier-only — there is no `/auth/login` to call from a test. The harness reproduces the production model in miniature:

1. **Key pair** — a fresh ES256 key pair is generated when the test stack starts (`docker compose up`). The private key is mounted into the runner container; the public key is mounted into a tiny **mock issuer** sidecar that serves `/.well-known/jwks.json` over HTTP **inside the docker-compose network**.
2. **JWKS URL configuration** — the SUT is started with `JWT_ISSUER=https://e2e-issuer.test`, `JWT_AUDIENCE=annotations-e2e`, and `JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json`. The HTTPS-only constraint of `HttpDocumentRetriever { RequireHttps = true }` is relaxed for tests by either (a) overriding `RequireHttps=false` via test-only configuration, or (b) running a TLS-terminating proxy in front of the issuer. Option (a) is preferred for simplicity; the relaxation is gated on `ASPNETCORE_ENVIRONMENT=E2ETest` and never applied in production builds. (This is the testability item flagged in `architecture.md` Open Risks §6.)
3. **Token minting** — the runner exposes a per-test helper `mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?)` that builds an ES256 JWT from the in-process private key with the configured `iss`/`aud`, `exp = now + 5m`, a per-role deterministic `sub` GUID, and the requested policy claim. `overrides` lets a test produce expired / wrong-iss / wrong-aud / forged-`alg=HS256` variants for the security suite.
4. **No persisted users** — there is no `users` table in this service. Each test mints exactly the token it needs.
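
The claim-building half of step 3 can be sketched with the standard library alone. This is not the runner's `mintToken` helper: the function names, the `permissions` claim name, and the UUID namespace are assumptions made for illustration. The ES256 signature itself is left out on purpose; a real harness would sign the returned input with the private key through a JWT library.

```python
import base64
import json
import time
import uuid

ROLES = ("ANN", "DATASET", "ADM")
# Hypothetical namespace; any fixed UUID yields a stable per-role `sub`.
SUB_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000001")


def b64url(data: bytes) -> str:
    """Base64url without padding, as used for JWS segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(role, issuer, audience, now=None, overrides=None):
    """Default claims for a role; overrides produce expired / wrong-iss / wrong-aud variants."""
    if role not in ROLES:
        raise ValueError(f"unknown role {role!r}")
    now = int(time.time()) if now is None else now
    claims = {
        "iss": issuer,
        "aud": audience,
        "sub": str(uuid.uuid5(SUB_NAMESPACE, role)),  # deterministic per role
        "exp": now + 300,                             # now + 5m
        "permissions": [role],                        # claim name is an assumption
    }
    claims.update(overrides or {})
    return claims


def signing_input(claims, alg="ES256"):
    """Return `base64url(header).base64url(payload)`, the bytes a signer would sign."""
    header = {"alg": alg, "typ": "JWT"}
    parts = (json.dumps(header, separators=(",", ":")),
             json.dumps(claims, separators=(",", ":")))
    return ".".join(b64url(p.encode("utf-8")) for p in parts)
```

A security test could then request, for example, `build_claims("ANN", iss, aud, overrides={"exp": 0})` for an expired token, or pass `alg="HS256"` to `signing_input` when constructing a forgery attempt.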

## Notes for the runner

- **Boot order**: `postgres` → `rabbitmq` → `e2e-issuer` (mock JWKS) → `annotations` (waits for postgres, rabbitmq, and a successful JWKS fetch) → `dataseed` → `e2e-runner`.
- **Fresh-state vs. carry-over**: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
- **Stream consumption**: every test that reads from `azaion-annotations` records the offset before the test acts, then consumes from `start_offset = recorded_offset + 1` to ignore historical messages.
- **Conditional probes**: tests that depend on SUT behavior decisions (e.g., the specific 4xx code returned on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, and then asserts that branch consistently within the test class. A mismatch on a subsequent run is flagged as a behavior-drift test failure.
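
The conditional-probe note can be illustrated with a small helper that records one observation and then requires later observations to match. The class and method names are hypothetical, and this sketch covers consistency within a single class only; comparing against a previous run would additionally need the recorded value to be persisted somewhere.

```python
class ConditionalProbe:
    """Record the first observed behavior, then require every later observation to match."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)
        self.recorded = None

    def record(self, observed):
        """Called once at class-init with the SUT's actual behavior."""
        if observed not in self.allowed:
            raise AssertionError(f"{observed} is not one of {sorted(self.allowed)}")
        self.recorded = observed
        return observed

    def check(self, observed):
        """Called by each test; any deviation from the recorded branch fails the test."""
        if self.recorded is None:
            raise RuntimeError("probe has not run yet")
        if observed != self.recorded:
            raise AssertionError(
                f"behavior drift: recorded {self.recorded}, observed {observed}")
```

For a row that allows `HTTP 400/422`, the fixture would create `ConditionalProbe({400, 422})`, call `record()` with the status code of the probe request, and have each test call `check()` with the status it observes.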