docs+src: complete Steps 1-3 outcomes + auth re-sync baseline

This commit captures everything produced during autodev existing-code Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec), together with the targeted auth + CORS re-sync triggered on 2026-05-14 when codebase drift was detected at Step 4 entry. None of this work was previously committed.

Step 1 (Document): 50+ _docs/02_document/ files: problem, solution, architecture, system flows, glossary, module-layout, per-component specs (01..06), modules, deployment, diagrams, data model, FINAL report, verification log, discovery.

Step 2 (Architecture Baseline): architecture_compliance_baseline.md. Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec): _docs/02_document/tests/* (67 scenarios across blackbox, security, resilience, resource-limit, performance), plus e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh, scripts/run-performance-tests.sh. Coverage 88% over the active scope (40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync: replaces the deleted in-house token issuer with a JWKS-verifier model. AuthController and TokenService removed; JwtExtensions switched from HS256 symmetric to ES256 over admin's JWKS. ConfigurationResolver and CorsConfigurationValidator added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01, SEC-02, SEC-03 marked Closed. One new testability risk recorded in architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified): ES256, JWKS, alg pinning
- src/Program.cs (modified): DI wiring for ConfigurationResolver and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted): no in-service issuance
- src/Services/TokenService.cs (deleted): same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new): required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec captures the change-spec for downstream services that consumed the now deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 13:41:07 +00:00 · 2026-05-14 20:19:05 +03:00
parent 08eadc1158
commit 03f879206e
66 changed files with 6006 additions and 133 deletions
@@ -0,0 +1,102 @@
+# Test Data Management
+
+## Seed Data Sets
+
+| Data Set | Description | Used by Tests | How Loaded | Cleanup |
+|----------|-------------|---------------|-----------|---------|
+| `tokens-test` | 3 ES256 access tokens minted on demand by the runner: `ann-token` (claim `ANN`), `dataset-token` (`DATASET`), `adm-token` (`ADM`). All carry `iss=$JWT_ISSUER`, `aud=$JWT_AUDIENCE`, `exp=now+5m`, and a deterministic `sub` GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12 | The harness runs a **mock JWKS issuer** (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at `JWT_JWKS_URL`. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; key pair regenerates on `docker compose down -v` |
+| `mission-test` | One canonical waypoint id `00000000-0000-0000-0000-000000000aaa` used as `WaypointId` / `MissionId` in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit — no FK enforcement; the GUID is just a column value. | N/A |
+| `classes-baseline` | The 19 detection classes seeded by `DatabaseMigrator` (ids 0–18, names per `data_parameters.md`). | F7-001 (catalog read), F1-* (class_num references) | Auto, by the SUT's boot-time migrator. | N/A — schema-managed |
+| `clean-state` | Empty `annotations`, `media`, `detection`, `annotations_queue_records` tables at the start of each test class. | every test class that asserts on count / depth | xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only). | Fixture's `Dispose()` truncates again |
+
+## Data Isolation Strategy
+
+- **Per-class truncation** — each xUnit test class declares an `IClassFixture<CleanStateFixture>` that truncates the four mutable tables before the first test in the class and again after the last.
+- **Per-test token** — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
+- **Per-test mission id** — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh `WaypointId` GUID per test so concurrent test runs don't leak events into each other.
+- **Per-test stream consumer** — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset `next` (current end of stream). They consume only messages produced after the test starts.
+- **Filesystem isolation** — `annotations-images`, `annotations-videos`, `annotations-deleted` volumes are recreated by `docker compose down -v` between full runs. Per-test cleanup removes only files the test wrote (matching `<id>` patterns).
+
+## Input Data Mapping
+
+| Input Data File | Source Location | Description | Covers Scenarios |
+|-----------------|----------------|-------------|-----------------|
+| `image_small.jpg` | `<fixtures>/image_small.jpg` | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-* |
+| `image_dense01.jpg` | `<fixtures>/image_dense01.jpg` | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
+| `image_dense02.jpg` | `<fixtures>/image_dense02.jpg` | larger dense frame (~2.8 MB) | F5-002 |
+| `image_different_types.jpg` | `<fixtures>/image_different_types.jpg` | multi-class scene (900×1600) | F8-002 (class filter) |
+| `image_empty_scene.jpg` | `<fixtures>/image_empty_scene.jpg` | 1920×1080 empty scene | F1-003 (zero detections), NFT-PERF-* warmup |
+| `image_large.JPG` | `<fixtures>/image_large.JPG` | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
+| `video_short01.mp4` | `<fixtures>/video_short01.mp4` | ~150 MB video | F1-006 (video annotation), F1-007 |
+| `video_short02.mp4` | `<fixtures>/video_short02.mp4` | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |
+
+`<fixtures>` resolves to `/fixtures` inside the test runner / SUT container, bound to `../detections/_docs/00_problem/input_data/` per `_docs/00_problem/input_data/fixtures.md`.
+
+## Synthetic request payloads
+
+JSON request bodies for `POST /annotations`, `PUT /annotations/{id}`, `POST /dataset/status/bulk`, and the auth flows live under `_docs/00_problem/input_data/requests/`. Each test references a request file by id (`F1_001_request.json`). Class numbers in detections come from the seeded `detection_classes` (ids 0–18); coordinates are normalized 0..1 floats.
+
+## Expected Results Mapping
+
+(The full table, 44 rows, is in `_docs/00_problem/input_data/expected_results/results_report.md`. Selected entries are listed here for cross-reference.)
+
+| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
+|-----------------|------------|-----------------|-------------------|-----------|--------|
+| FT-P-01 (=F1-001) | `image_small.jpg` + `F1_001_request.json` | HTTP 200 + `AnnotationDto`; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1` | exact, schema_match, regex | N/A | `expected_results/F1_001_response.json` |
+| FT-P-02 (=F1-002) | Same input, second POST | Same `id` as FT-P-01; no duplicate row | exact | N/A | inline |
+| FT-P-04 (=F1-004) | `image_dense01.jpg` + `F1_004_request.json` | HTTP 200; `detections.length == 5`; YOLO label file with 5 lines | exact, file_content | N/A | `expected_results/F1_004_response.json` |
+| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with `operation == "Created"`, `latency ≤ 1000ms` | exact, threshold_max | ± 200ms | inline |
+| FT-N-04 (=F1-N-004) | F1-001 with no `Authorization` header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
+| NFT-PERF-LATENCY-01 | `image_small.jpg` × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
+| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
+| NFT-SEC-01 | F1-001 with JWT signed by **wrong** key | HTTP 401 | exact | N/A | inline |
+| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
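The comparison methods named in the table above (`regex`, `threshold_max`) could be implemented as small runner-side helpers. The following is a minimal Python sketch, not the suite's actual code: the helper names `matches_regex`, `percentile`, and `within_max` are hypothetical, and the nearest-rank percentile is an assumed definition, since the spec does not say how p95 is computed.

```python
import math
import re

# Same pattern as the FT-P-01 row: 32 lowercase hex characters.
ID_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def matches_regex(value: str, pattern: re.Pattern = ID_PATTERN) -> bool:
    """regex comparison: the whole value must match the pattern."""
    return pattern.fullmatch(value) is not None


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile (assumed definition, not taken from the spec)."""
    if not samples:
        raise ValueError("percentile of an empty sample set is undefined")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_max(observed: float, limit: float) -> bool:
    """threshold_max comparison: the observed value must not exceed the limit."""
    return observed <= limit
```

For NFT-PERF-LATENCY-01 this would read as `within_max(percentile(latencies_ms, 95), 1500)` over the 50 recorded latencies. How a tolerance such as the `± 200ms` on FT-P-10 combines with a `threshold_max` limit is left open here because the table does not define it.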
+
+## External Dependency Mocks
+
+| External Service | Mock/Stub | How Provided | Behavior |
+|-----------------|-----------|-------------|----------|
+| RabbitMQ Stream broker | Real `rabbitmq:3.13-management` with the streams plugin | Docker service in `e2e-net` | Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app && docker exec rabbitmq rabbitmqctl start_app` |
+| Postgres | Real `postgres:13` | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
+| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic `detections[]` payloads in `requests/`. |
+| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to `http://annotations:8080`. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |
+
+## Data Validation Rules
+
+| Data Type | Validation | Invalid Examples | Expected System Behavior |
+|-----------|-----------|-----------------|------------------------|
+| `image_bytes` (POST /annotations) | non-null, non-empty byte array | empty array `[]`, missing field | HTTP 400/422; error envelope |
+| `mediaType` (POST /annotations) | enum `Image=10` or `Video=20` | `5`, `100`, missing | HTTP 400/422; error envelope |
+| `detections[].class_num` | int, no range validator today | `-1`, `999` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
+| `detections[].centerX/Y/width/height` | float, no range validator today | `1.5`, `-0.1`, `NaN` | HTTP 200 today (lenient); flagged as gap (SEC-05) |
+| `Authorization` header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with `alg` pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, `alg=HS256` forgery | HTTP 401; error envelope |
+| Caller policy | `ANN`, `DATASET`, or `ADM` per endpoint | mismatched policy | HTTP 403; error envelope |
+| `WaypointId` (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
+| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |
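Because the SUT currently accepts out-of-range `class_num` and coordinate values (the SEC-05 gap), a runner may want its own check to tell valid fixtures from deliberately invalid ones. The sketch below is one possible shape for such a check, assuming the field names from the table and the seeded class ids 0..18; the function name `detection_problems` is hypothetical and this is not the service's validator.

```python
import math

CLASS_IDS = range(0, 19)  # seeded detection classes, ids 0..18
COORD_FIELDS = ("centerX", "centerY", "width", "height")


def detection_problems(det: dict) -> list:
    """Return human-readable problems for one detections[] entry."""
    problems = []

    class_num = det.get("class_num")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(class_num, bool) or not isinstance(class_num, int):
        problems.append("class_num: not an integer")
    elif class_num not in CLASS_IDS:
        problems.append(f"class_num: {class_num} outside 0..18")

    for field in COORD_FIELDS:
        value = det.get(field)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field}: not a number")
        elif math.isnan(value) or not 0.0 <= value <= 1.0:
            problems.append(f"{field}: {value} outside normalized range 0..1")

    return problems
```

An empty list means the entry matches the documented shape; the invalid examples from the table (`-1`, `999`, `1.5`, `-0.1`, `NaN`) each produce one problem.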
+
+## Runtime-generated test data
+
+Two scenario groups consume **synthetic test data generated by the runner at execution time** rather than static files on disk. This is intentional and explicitly allowed by `templates/expected-results.md` ("Test data may be generated programmatically — note this in test-data.md"):
+
+| Scenario | Generated data | How |
+|----------|----------------|-----|
+| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
+| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 `annotations` rows + 50,000 `detection` rows in the test DB | `dataseed` job runs a parameterised SQL script that bulk-inserts rows with `media_id` referencing 100 distinct seeded media rows; uses `CROSS JOIN generate_series` for speed. Cleared by `clean-state` truncation between test classes. |
+
+The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
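The NFT-RES-LIM-02 blob generation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the runner's fixture: the header bytes `FF D8 FF E0` are one plausible 4-byte JPEG prefix, 1 MB is taken as 1024 × 1024 bytes, the helper names are hypothetical, and a fixed seed is used so that repeated runs upload identical bytes. `random.Random.randbytes` requires Python 3.9 or newer.

```python
import os
import random
import tempfile

JPEG_MAGIC = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # SOI + APP0 markers (assumed header)
MB = 1024 * 1024                              # assumption: binary megabytes
SIZES_MB = (1, 10, 50, 100, 256, 512)         # sizes listed for NFT-RES-LIM-02


def write_synthetic_blob(path: str, size_bytes: int, seed: int = 0) -> None:
    """Write the JPEG prefix, then pseudo-random filler up to size_bytes."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("size must be at least the length of the header")
    rng = random.Random(seed)
    remaining = size_bytes - len(JPEG_MAGIC)
    with open(path, "wb") as fh:
        fh.write(JPEG_MAGIC)
        while remaining > 0:
            chunk = min(MB, remaining)  # write in 1 MB chunks to bound memory use
            fh.write(rng.randbytes(chunk))
            remaining -= chunk


def make_temp_blob(size_bytes: int, seed: int = 0) -> str:
    """Create the blob in a temp file and return its path; the caller deletes it."""
    fd, path = tempfile.mkstemp(suffix=".jpg")
    os.close(fd)
    write_synthetic_blob(path, size_bytes, seed)
    return path
```

A test would call `make_temp_blob(size * MB)` for each entry in `SIZES_MB`, upload the file once, and remove it afterwards, which matches the "uploaded once, deleted after" rule in the table.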
+
+## Bearer token harness
+
+Annotations is verifier-only — there is no `/auth/login` to call from a test. The harness reproduces the production model in miniature:
+
+1. **Key pair** — a fresh ES256 key pair is generated when the test stack starts (`docker compose up`). The private key is mounted into the runner container; the public key is mounted into a tiny **mock issuer** sidecar that serves `/.well-known/jwks.json` over HTTP **inside the docker-compose network**.
+2. **JWKS URL configuration** — the SUT is started with `JWT_ISSUER=https://e2e-issuer.test`, `JWT_AUDIENCE=annotations-e2e`, and `JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json`. The HTTPS-only constraint of `HttpDocumentRetriever { RequireHttps = true }` is relaxed for tests by either (a) overriding `RequireHttps=false` via test-only configuration, or (b) running a TLS-terminating proxy in front of the issuer. Option (a) is preferred for simplicity; the relaxation is gated on `ASPNETCORE_ENVIRONMENT=E2ETest` and never applied in production builds. (This is the testability item flagged in `architecture.md` Open Risks §6.)
+3. **Token minting** — the runner exposes a per-test helper `mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?)` that builds an ES256 JWT from the in-process private key with the configured `iss`/`aud`, `exp = now + 5m`, a per-role deterministic `sub` GUID, and the requested policy claim. `overrides` lets a test produce expired / wrong-iss / wrong-aud / forged-`alg=HS256` variants for the security suite.
+4. **No persisted users** — there is no `users` table in this service. Each test mints exactly the token it needs.
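The token-minting step can be sketched as below. Several details are assumptions made only for illustration: the claim name `policy`, the use of `uuid.uuid5` to derive the deterministic `sub`, and the snake_case helper names. The ES256 signature itself is delegated to a caller-supplied `sign` function, because the Python standard library has no ECDSA implementation; a real harness would back it with a crypto library and the fixture's private key.

```python
import base64
import json
import time
import uuid
from typing import Callable, Optional

ROLES = ("ANN", "DATASET", "ADM")
TOKEN_LIFETIME_S = 5 * 60  # exp = now + 5m


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by the JWS compact serialization."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(role: str, issuer: str, audience: str,
                 now: Optional[int] = None,
                 overrides: Optional[dict] = None) -> dict:
    """Default claim set for a role; overrides produce the negative variants."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    issued_at = int(time.time()) if now is None else now
    claims = {
        "iss": issuer,
        "aud": audience,
        "iat": issued_at,
        "exp": issued_at + TOKEN_LIFETIME_S,
        # Assumed derivation: any stable role-to-GUID mapping would do.
        "sub": str(uuid.uuid5(uuid.NAMESPACE_URL, f"e2e-role:{role}")),
        "policy": role,  # hypothetical claim name
    }
    claims.update(overrides or {})
    return claims


def mint_token(claims: dict, sign: Callable[[bytes], bytes],
               alg: str = "ES256") -> str:
    """Assemble header.payload.signature; for ES256, sign must return 64-byte R||S."""
    header = {"alg": alg, "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{b64url(sign(signing_input.encode('ascii')))}"
```

An expired variant for the security suite would pass `overrides={"exp": now - 1}`, and a wrong-issuer variant would override `iss`. Passing `alg="HS256"` together with an HMAC-based `sign` gives the forged-algorithm case that the ES256 pinning is expected to reject.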
+
+## Notes for the runner
+
+- **Boot order**: `postgres` → `rabbitmq` → `e2e-issuer` (mock JWKS) → `annotations` (waits for postgres, rabbitmq, and a successful JWKS fetch) → `dataseed` → `e2e-runner`.
+- **Fresh-state vs. carry-over**: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
+- **Stream consumption**: every test that reads from `azaion-annotations` records the offset before the test acts, then consumes from `start_offset = recorded_offset + 1` to ignore historical messages.
+- **Conditional probes**: tests that depend on SUT behavior decisions (e.g., specific 4xx code on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, then asserts that branch consistently within the test class. A mismatch on a subsequent run is flagged as a behavior-drift test failure.
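The offset rule in the stream-consumption note can be illustrated with an in-memory stand-in for the stream. This sketch only models the arithmetic: the `(offset, payload)` tuple shape and the function names are hypothetical, and a real test would read from the `azaion-annotations` stream through a RabbitMQ stream client instead of a list.

```python
from typing import Iterable, List, Tuple

Message = Tuple[int, bytes]  # (offset, payload); assumed shape for the sketch


def last_offset(stream: Iterable[Message]) -> int:
    """Offset of the newest message, or -1 when the stream is empty."""
    return max((offset for offset, _ in stream), default=-1)


def consume_new(stream: Iterable[Message], recorded_offset: int) -> List[Message]:
    """Return only messages at or after start_offset = recorded_offset + 1."""
    start_offset = recorded_offset + 1
    return [msg for msg in stream if msg[0] >= start_offset]
```

Recording `last_offset` before the test acts and filtering with `consume_new` afterwards keeps historical messages out of the assertions, including the empty-stream case, where the recorded offset is -1 and consumption starts at offset 0.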