This commit captures everything produced during autodev existing-code Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec), together with the targeted auth + CORS re-sync triggered on 2026-05-14 when codebase drift was detected at Step 4 entry. None of this work was previously committed.

Step 1 (Document) - 50+ _docs/02_document/ files: problem, solution, architecture, system flows, glossary, module-layout, per-component specs (01..06), modules, deployment, diagrams, data model, FINAL report, verification log, discovery.

Step 2 (Architecture Baseline) - architecture_compliance_baseline.md. Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) - _docs/02_document/tests/* (67 scenarios across blackbox, security, resilience, resource-limit, performance), plus e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh, scripts/run-performance-tests.sh. Coverage 88% over the active scope (40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync - replaces the deleted in-house token issuer with a JWKS-verifier model. AuthController and TokenService removed; JwtExtensions switched from HS256 symmetric to ES256 over admin's JWKS. ConfigurationResolver and CorsConfigurationValidator added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01, SEC-02, SEC-03 marked Closed. One new testability risk recorded in architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) - ES256, JWKS, alg pinning
- src/Program.cs (modified) - DI wiring for ConfigurationResolver and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) - no in-service issuance
- src/Services/TokenService.cs (deleted) - same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) - required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec captures the change-spec for downstream services that consumed the now deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
Test Data Management
Seed Data Sets
| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|---|---|---|---|---|
| `tokens-test` | 3 ES256 access tokens minted on demand by the runner: `ann-token` (claim `ANN`), `dataset-token` (`DATASET`), `adm-token` (`ADM`). All carry `iss=$JWT_ISSUER`, `aud=$JWT_AUDIENCE`, `exp=now+5m`, and a deterministic `sub` GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12 | The harness runs a mock JWKS issuer (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at `JWT_JWKS_URL`. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; key pair regenerates on `docker compose down -v` |
| `mission-test` | One canonical waypoint id `00000000-0000-0000-0000-000000000aaa` used as `WaypointId` / `MissionId` in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit: no FK enforcement; the GUID is just a column value. | N/A |
| `classes-baseline` | The 19 detection classes seeded by `DatabaseMigrator` (ids 0–18, names per `data_parameters.md`). | F7-001 (catalog read), F1-* (`class_num` references) | Auto, by the SUT's boot-time migrator. | N/A (schema-managed) |
| `clean-state` | Empty `annotations`, `media`, `detection`, `annotations_queue_records` tables at the start of each test class. | Every test class that asserts on count / depth | xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only). | Fixture's `Dispose()` truncates again |
Data Isolation Strategy
- Per-class truncation: each xUnit test class declares an `IClassFixture<CleanStateFixture>` that truncates the four mutable tables before the first test in the class and again after the last.
- Per-test token: every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
- Per-test mission id: tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh `WaypointId` GUID per test so concurrent test runs don't leak events into each other.
- Per-test stream consumer: F4 stream-consumer scenarios use a fresh consumer name per test and start at offset `next` (current end of stream). They consume only messages produced after the test starts.
- Filesystem isolation: `annotations-images`, `annotations-videos`, `annotations-deleted` volumes are recreated by `docker compose down -v` between full runs. Per-test cleanup removes only files the test wrote (matching `<id>` patterns).
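The isolation rules above can be sketched in a few lines. This is an illustration only: the real fixture is the xUnit `CleanStateFixture`, and the helper names `truncate_statement`, `fresh_waypoint_id`, and `fresh_consumer_name` (including its naming scheme) are hypothetical. Only the table list and the `TRUNCATE` options are taken from the seed data table.

```python
import uuid

# The four mutable tables named in the clean-state data set.
MUTABLE_TABLES = ("annotations", "media", "detection", "annotations_queue_records")


def truncate_statement(tables=MUTABLE_TABLES):
    """Build the TRUNCATE statement a class fixture would run before and after a class."""
    return f"TRUNCATE {', '.join(tables)} RESTART IDENTITY CASCADE;"


def fresh_waypoint_id():
    """Return a random GUID so concurrent tests never share a WaypointId."""
    return str(uuid.uuid4())


def fresh_consumer_name(test_name):
    """Hypothetical naming scheme: test name plus a random suffix, unique per test."""
    return f"{test_name}-{uuid.uuid4().hex[:8]}"
```

A random suffix is used here rather than a counter so that two runners sharing a broker would not have to coordinate; that choice is an assumption, not something the spec prescribes.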
Input Data Mapping
| Input Data File | Source Location | Description | Covers Scenarios |
|---|---|---|---|
| `image_small.jpg` | `<fixtures>/image_small.jpg` | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-* |
| `image_dense01.jpg` | `<fixtures>/image_dense01.jpg` | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
| `image_dense02.jpg` | `<fixtures>/image_dense02.jpg` | larger dense frame (~2.8 MB) | F5-002 |
| `image_different_types.jpg` | `<fixtures>/image_different_types.jpg` | multi-class scene (900×1600) | F8-002 (class filter) |
| `image_empty_scene.jpg` | `<fixtures>/image_empty_scene.jpg` | 1920×1080 empty scene | F1-003 (zero detections), NFT-PERF-* warmup |
| `image_large.JPG` | `<fixtures>/image_large.JPG` | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
| `video_short01.mp4` | `<fixtures>/video_short01.mp4` | ~150 MB video | F1-006 (video annotation), F1-007 |
| `video_short02.mp4` | `<fixtures>/video_short02.mp4` | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |
`<fixtures>` resolves to `/fixtures` inside the test runner / SUT container, bound to `../detections/_docs/00_problem/input_data/` per `_docs/00_problem/input_data/fixtures.md`.
Synthetic request payloads
JSON request bodies for POST /annotations, PUT /annotations/{id}, POST /dataset/status/bulk, and the auth flows live under _docs/00_problem/input_data/requests/. Each test references a request file by id (F1_001_request.json). Class numbers in detections come from the seeded detection_classes (ids 0–18); coordinates are normalized 0..1 floats.
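As a rough illustration of the constraints above, the sketch below assembles a positive-path request body and rejects values a well-formed fixture should not contain. The helper names and the JSON key layout are assumptions made for the example; the field names are borrowed from the validation rules in this document, and the real schema lives in the request files.

```python
import json

CLASS_IDS = range(0, 19)  # seeded detection classes, ids 0-18


def make_detection(class_num, center_x, center_y, width, height):
    """Build one detection entry, rejecting values a positive-path fixture must not use."""
    if class_num not in CLASS_IDS:
        raise ValueError(f"class_num {class_num} is outside the seeded range 0-18")
    for name, value in (("centerX", center_x), ("centerY", center_y),
                        ("width", width), ("height", height)):
        # NaN fails this comparison too, so it is rejected like any out-of-range value.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is not a normalized 0..1 float")
    return {"class_num": class_num, "centerX": center_x, "centerY": center_y,
            "width": width, "height": height}


def make_request(waypoint_id, detections, media_type=10):
    """Assemble a request body; the key names are assumptions, not the real schema."""
    return json.dumps({"WaypointId": waypoint_id, "mediaType": media_type,
                       "detections": detections})
```

Negative scenarios would bypass `make_detection` and hand-author out-of-range values directly, since the point of those tests is to send what this guard refuses.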
Expected Results Mapping
(The full table is in `_docs/00_problem/input_data/expected_results/results_report.md`, 44 rows. Selected entries are reproduced here for cross-reference.)
| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
|---|---|---|---|---|---|
| FT-P-01 (=F1-001) | `image_small.jpg` + `F1_001_request.json` | HTTP 200 + `AnnotationDto`; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1` | exact, schema_match, regex | N/A | `expected_results/F1_001_response.json` |
| FT-P-02 (=F1-002) | Same input, second POST | Same `id` as FT-P-01; no duplicate row | exact | N/A | inline |
| FT-P-04 (=F1-004) | `image_dense01.jpg` + `F1_004_request.json` | HTTP 200; `detections.length == 5`; YOLO label file with 5 lines | exact, file_content | N/A | `expected_results/F1_004_response.json` |
| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with `operation == "Created"`, latency ≤ 1000ms | exact, threshold_max | ± 200ms | inline |
| FT-N-04 (=F1-N-004) | F1-001 with no `Authorization` header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
| NFT-PERF-LATENCY-01 | `image_small.jpg` × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
| NFT-SEC-01 | F1-001 with JWT signed by wrong key | HTTP 401 | exact | N/A | inline |
| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
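The `regex` and `threshold_max` comparison methods in the table can be sketched as small predicates. The function names are hypothetical, and two choices are assumptions of this sketch rather than statements from the spec: the tolerance is read as extra headroom added to the limit, and p95 uses the nearest-rank definition.

```python
import re

ID_PATTERN = re.compile(r"^[0-9a-f]{32}$")  # regex quoted in FT-P-01


def matches_id(value):
    """regex comparison: True when the id is exactly 32 lowercase hex characters."""
    return ID_PATTERN.fullmatch(value) is not None


def within_threshold_max(observed, limit, tolerance=0.0):
    """threshold_max comparison: observed may exceed limit by at most tolerance."""
    return observed <= limit + tolerance


def p95(samples):
    """Nearest-rank 95th percentile, computed with integer arithmetic."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based
    return ordered[rank - 1]
```

With the 50 sequential calls of NFT-PERF-LATENCY-01, nearest-rank p95 is the 48th smallest latency, so up to two slow outliers do not affect the verdict.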
External Dependency Mocks
| External Service | Mock/Stub | How Provided | Behavior |
|---|---|---|---|
| RabbitMQ Stream broker | Real `rabbitmq:3.13-management` with the streams plugin | Docker service in `e2e-net` | Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app` followed by `docker exec rabbitmq rabbitmqctl start_app` |
| Postgres | Real `postgres:13` | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic `detections[]` payloads in `requests/`. |
| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to `http://annotations:8080`. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |
Data Validation Rules
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|---|---|---|---|
| `image_bytes` (POST /annotations) | non-null, non-empty byte array | empty array `[]`, missing field | HTTP 400/422; error envelope |
| `mediaType` (POST /annotations) | enum `Image=10` or `Video=20` | `5`, `100`, missing | HTTP 400/422; error envelope |
| `detections[].class_num` | int, no range validator today | `-1`, `999` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `detections[].centerX/Y/width/height` | float, no range validator today | `1.5`, `-0.1`, `NaN` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| `Authorization` header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with `alg` pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, `alg=HS256` forgery | HTTP 401; error envelope |
| Caller policy | `ANN`, `DATASET`, or `ADM` per endpoint | mismatched policy | HTTP 403; error envelope |
| `WaypointId` (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |
Runtime-generated test data
Two scenario groups consume synthetic test data generated by the runner at execution time rather than static files on disk. This is intentional and explicitly allowed by templates/expected-results.md ("Test data may be generated programmatically — note this in test-data.md"):
| Scenario | Generated data | How |
|---|---|---|
| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 `annotations` rows + 50,000 `detection` rows in the test DB | `dataseed` job runs a parameterised SQL script that bulk-inserts rows with `media_id` referencing 100 distinct seeded `media` rows; uses `CROSS JOIN generate_series` for speed. Cleared by `clean-state` truncation between test classes. |
The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
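A minimal sketch of the NFT-RES-LIM-02 blob generator follows. The actual fixture is described as an xUnit fixture, so this Python version is only an illustration; the function name, the seeded generator, and the specific header bytes `FF D8 FF E0` (JPEG start-of-image followed by an APP0 marker) are assumptions chosen to satisfy the "4-byte JPEG magic header" description.

```python
import os
import random
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff\xe0"  # assumed header: SOI marker followed by APP0 marker


def write_synthetic_blob(size_bytes, seed=0, chunk_size=1 << 20):
    """Write a JPEG-prefixed pseudo-random temp file of exactly size_bytes; return its path."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("size must be at least the 4-byte header")
    rng = random.Random(seed)  # seeded so reruns produce identical bytes
    fd, path = tempfile.mkstemp(suffix=".jpg")
    with os.fdopen(fd, "wb") as handle:
        handle.write(JPEG_MAGIC)
        remaining = size_bytes - len(JPEG_MAGIC)
        while remaining > 0:
            n = min(chunk_size, remaining)
            # Write in chunks so a 512 MB file never has to sit in memory at once.
            handle.write(rng.getrandbits(8 * n).to_bytes(n, "little"))
            remaining -= n
    return path
```

The caller is responsible for deleting the returned path after the upload, matching the "uploaded once, deleted after" rule.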
Bearer token harness
Annotations is verifier-only — there is no /auth/login to call from a test. The harness reproduces the production model in miniature:
- Key pair: a fresh ES256 key pair is generated when the test stack starts (`docker compose up`). The private key is mounted into the runner container; the public key is mounted into a tiny mock issuer sidecar that serves `/.well-known/jwks.json` over HTTP inside the docker-compose network.
- JWKS URL configuration: the SUT is started with `JWT_ISSUER=https://e2e-issuer.test`, `JWT_AUDIENCE=annotations-e2e`, and `JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json`. The HTTPS-only constraint of `HttpDocumentRetriever { RequireHttps = true }` is relaxed for tests by either (a) overriding `RequireHttps=false` via test-only configuration, or (b) running a TLS-terminating proxy in front of the issuer. Option (a) is preferred for simplicity; the relaxation is gated on `ASPNETCORE_ENVIRONMENT=E2ETest` and never applied in production builds. (This is the testability item flagged in `architecture.md` Open Risks §6.)
- Token minting: the runner exposes a per-test helper `mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?)` that builds an ES256 JWT from the in-process private key with the configured `iss` / `aud`, `exp = now + 5m`, a per-role deterministic `sub` GUID, and the requested policy claim. `overrides` lets a test produce expired / wrong-iss / wrong-aud / forged-`alg=HS256` variants for the security suite.
- No persisted users: there is no `users` table in this service. Each test mints exactly the token it needs.
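The claim-building half of the minting helper described above might look like the sketch below. It stops short of signing: producing the ES256 signature needs an elliptic-curve library, which is left out so the example stays standard-library only. The names `build_claims` and `signing_input`, the `policy` claim key, and the use of `uuid5` for the deterministic `sub` are all assumptions of this sketch, not the harness's actual API.

```python
import base64
import json
import time
import uuid


def b64url(data):
    """Base64url without padding, the encoding JWS uses for each token segment."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(role, issuer, audience, now=None, overrides=None):
    """Claims for one test token; 'policy' is a placeholder claim name."""
    if role not in ("ANN", "DATASET", "ADM"):
        raise ValueError(f"unknown role {role!r}")
    issued = int(time.time()) if now is None else now
    claims = {
        "iss": issuer,
        "aud": audience,
        "exp": issued + 5 * 60,  # exp = now + 5m
        "sub": str(uuid.uuid5(uuid.NAMESPACE_DNS, role)),  # same GUID for the same role
        "policy": role,
    }
    claims.update(overrides or {})  # e.g. {"exp": past} or {"iss": "wrong"}
    return claims


def signing_input(claims, alg="ES256"):
    """Return '<header>.<payload>', the string an ES256 signer would sign."""
    header = {"alg": alg, "typ": "JWT"}
    parts = [b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
             for part in (header, claims)]
    return ".".join(parts)
```

Passing `overrides={"exp": 0}` yields an expired variant, and `signing_input(claims, alg="HS256")` produces the header a forged-algorithm test would start from.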
Notes for the runner
- Boot order: `postgres` → `rabbitmq` → `e2e-issuer` (mock JWKS) → `annotations` (waits for postgres, rabbitmq, and a successful JWKS fetch) → `dataseed` → `e2e-runner`.
- Fresh-state vs. carry-over: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
- Stream consumption: every test that reads from `azaion-annotations` records the offset before the test acts, then consumes from `start_offset = recorded_offset + 1` to ignore historical messages.
- Conditional probes: tests that depend on SUT behavior decisions (e.g., specific 4xx code on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, then asserts that branch consistently within the test class. Mismatch on a subsequent run flags as a behavior-drift test failure.
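The stream-consumption and conditional-probe notes can be illustrated with an in-memory stand-in. Nothing here talks to RabbitMQ or the SUT: the stream is modelled as a list of `(offset, body)` tuples, and both `messages_for_test` and `BehaviorProbe` are hypothetical names introduced for this sketch.

```python
def messages_for_test(stream, recorded_offset):
    """Keep only messages after the offset recorded before the test acted."""
    start_offset = recorded_offset + 1
    return [(offset, body) for offset, body in stream if offset >= start_offset]


class BehaviorProbe:
    """Record the first observed behavior, then require every later one to match it."""

    def __init__(self):
        self._recorded = None

    def check(self, observed):
        # The first call plays the role of the class-init probe and always passes.
        if self._recorded is None:
            self._recorded = observed
            return True
        return observed == self._recorded
```

For example, a probe that first sees HTTP 400 on a corner case returns `False` if a later request in the same class answers 422, which is the drift signal described in the last bullet.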