docs+src: complete Steps 1-3 outcomes + auth re-sync baseline

This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-14 20:19:05 +03:00
parent 08eadc1158
commit 03f879206e
66 changed files with 6006 additions and 133 deletions
@@ -0,0 +1,173 @@
# Expected Results — Azaion.Annotations
Maps every input data item the test corpus exercises against `Azaion.Annotations` to its quantifiable expected result. Tests use this mapping to compare actual system output against known-correct answers.
This contract is **annotations-service-shape**, not detections-service-shape. The same binary fixtures are reused (see `../fixtures.md`), but the expected outputs here describe annotation lifecycle behavior — content-addressed ids, persisted DTOs, label-file writes, SSE delivery, outbox + stream — not bounding-box inference.
## Result Format Legend
| Result Type | When to Use | Example |
|-------------|-------------|---------|
| Exact value | Output must match precisely | `status_code: 200`, `detection_count: 3` |
| Tolerance range | Numeric output with acceptable variance | `latency: 800ms ± 200ms` |
| Threshold | Output must exceed or stay below a limit | `latency ≤ 1000ms` |
| Pattern match | Output must match a string/regex pattern | `id =~ /^[0-9a-f]{32}$/` |
| File reference | Complex output compared against a reference file | `match expected_results/F1_001_response.json` |
| Schema match | Output structure must conform to a schema | `body matches AnnotationDto` |
| Set/count | Output must contain specific items or counts | `detections.length == 3` |
## Comparison Methods
| Method | Description | Tolerance Syntax |
|--------|-------------|-----------------|
| `exact` | Actual == Expected | N/A |
| `numeric_tolerance` | abs(actual - expected) ≤ tolerance | `± <value>` or `± <percent>%` |
| `threshold_min` | actual ≥ threshold | `≥ <value>` |
| `threshold_max` | actual ≤ threshold | `≤ <value>` |
| `regex` | actual matches regex pattern | regex string |
| `substring` | actual contains substring | substring |
| `json_diff` | structural comparison against reference JSON | diff tolerance per field |
| `schema_match` | actual conforms to a JSON schema | N/A |
| `file_exists` | a file at a computed path exists on disk | N/A |
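| `file_content` | a file's contents match expected (line-by-line) | exact / regex |
| `set_contains` | actual value is a member of the expected set, or the actual collection contains every expected item | N/A |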
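
A minimal Python sketch of how a runner might dispatch the scalar comparison methods from the table above. The `COMPARATORS` registry and the `compare` helper are hypothetical names, not part of this spec; `json_diff`, `schema_match`, `file_exists`, and `file_content` are left out because they depend on fixtures or a schema library.

```python
import re
from typing import Any, Callable, Dict


def numeric_tolerance(actual: float, expected: float, tolerance: float) -> bool:
    """abs(actual - expected) <= tolerance, as defined in the table."""
    return abs(actual - expected) <= tolerance


# Hypothetical registry: method name from the table -> predicate.
COMPARATORS: Dict[str, Callable[..., bool]] = {
    "exact": lambda actual, expected: actual == expected,
    "numeric_tolerance": numeric_tolerance,
    "threshold_min": lambda actual, threshold: actual >= threshold,
    "threshold_max": lambda actual, threshold: actual <= threshold,
    "regex": lambda actual, pattern: re.search(pattern, actual) is not None,
    "substring": lambda actual, needle: needle in actual,
}


def compare(method: str, actual: Any, *args: Any) -> bool:
    """Apply one comparison method; unknown methods fail loudly."""
    try:
        comparator = COMPARATORS[method]
    except KeyError:
        raise ValueError(f"unsupported comparison method: {method}") from None
    return comparator(actual, *args)
```

For example, `compare("numeric_tolerance", 950, 800, 200)` expresses the legend's `latency: 800ms ± 200ms` row, and `compare("threshold_max", 180, 200)` expresses `latency ≤ 200ms`.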
## Global invariants
These hold for every successful response from the service unless explicitly negated by the row's own expected result.
| Invariant | Comparison | Notes |
|-----------|------------|-------|
| Response Content-Type is `application/json` for non-binary endpoints | exact | except `/health`, image/thumbnail file routes, and SSE (`text/event-stream`) |
| Error responses follow the suite envelope `{ error: { code, message, …details } }` | schema_match | `_docs/02_document/common-helpers/01_http-error-envelope.md` |
| `id` fields in annotation responses are 32 lowercase hex chars | regex `^[0-9a-f]{32}$` | derived from `XxHash3.Hash128` (post RB-04) over sampled image bytes |
| Tokens passed by callers are ES256 JWTs issued by admin (3 base64url segments) | regex `^[\w-]+\.[\w-]+\.[\w-]+$` | annotations does not issue tokens; this is the shape it accepts |
| For `[after RB-XX]` rows: skip until the listed Refactor Backlog item lands | — | Phase 3 validation removes them otherwise |
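
The invariants above could be checked with a few small predicates. This is a sketch only: the function names are hypothetical, and treating `code` and `message` as required string fields of the envelope is an assumption drawn from the `{ error: { code, message, …details } }` shape, not from the envelope document itself.

```python
import re
from typing import Any

# Patterns copied from the invariants table.
ID_PATTERN = re.compile(r"^[0-9a-f]{32}$")
JWT_SHAPE = re.compile(r"^[\w-]+\.[\w-]+\.[\w-]+$")


def is_annotation_id(value: str) -> bool:
    """True when the value is exactly 32 lowercase hex characters."""
    return ID_PATTERN.fullmatch(value) is not None


def has_jwt_shape(token: str) -> bool:
    """True when the token has three base64url-looking segments.

    This checks shape only; it says nothing about signature validity.
    """
    return JWT_SHAPE.fullmatch(token) is not None


def is_error_envelope(body: Any) -> bool:
    """True when the body looks like { error: { code, message, ... } }.

    Assumption: code and message are strings; extra detail keys are allowed.
    """
    if not isinstance(body, dict):
        return False
    error = body.get("error")
    return (
        isinstance(error, dict)
        and isinstance(error.get("code"), str)
        and isinstance(error.get("message"), str)
    )
```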
## Input → Expected Result Mapping
### Group F1 — Annotation create (`POST /annotations`)
Each row uses one binary fixture from `fixtures.md` plus a synthetic `detections[]` payload from `requests/F1_<NNN>_request.json`. The `class_num` values come from the seeded `detection_classes` (ids 0..18).
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F1-001 | `image_small` + `requests/F1_001_request.json` (1 detection: class_num=10 Plane, normalized bbox) | Single small frame, single detection | HTTP 200; body matches `AnnotationDto`; `body.detections.length == 1`; `body.id =~ /^[0-9a-f]{32}$/` | exact (status), schema_match (body), regex (id) | N/A | `expected_results/F1_001_response.json` |
| F1-002 | Same as F1-001 (re-POST with identical payload) | Idempotency check | HTTP 200; `body.id == <id from F1-001>`; no duplicate row written (verifiable via `GET /annotations/{id}` returning a single row) | exact | N/A | N/A |
| F1-003 | `image_empty_scene` + `requests/F1_003_request.json` (0 detections) | Frame with no detections | HTTP 200; `body.detections.length == 0`; YOLO label file `<images_dir>/<id>.txt` exists with 0 lines | exact (count), file_exists, file_content | N/A | N/A |
| F1-004 | `image_dense01` + `requests/F1_004_request.json` (5 detections: mixed class_nums 0,1,2,9,10) | Dense scene, multiple classes | HTTP 200; `body.detections.length == 5`; YOLO label file has 5 lines, each `<class_num> <cx> <cy> <w> <h>` with normalized floats | exact (count), file_content (regex per line) | N/A | `expected_results/F1_004_response.json` |
| F1-005 | `image_large` + `requests/F1_005_request.json` (3 detections) | Large payload (~7 MB) | HTTP 200; same shape as F1-001; latency `≤ 5000ms` (single-instance dev DB, no concurrent load) | exact, threshold_max | latency ± 1000ms | N/A |
| F1-006 | `video_short01` (mediaType=Video) + `requests/F1_006_request.json` (1 detection at videoTime=00:00:02.000) | Video frame annotation | HTTP 200; `body.id =~ /^[0-9a-f]{32}$/`; `body.videoTime == "00:00:02"` | exact, regex | N/A | N/A |
| F1-007 | `video_short01` + `video_short02` content-distinct + same detections payload | Distinct image bytes → distinct ids | `body_F1-007_a.id != body_F1-007_b.id` | exact (inequality) | N/A | N/A |
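
The `file_content` checks in F1-003 and F1-004 could be implemented roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, fields are taken to be whitespace-separated, and "normalized" is read as each float lying in the 0..1 range.

```python
from typing import Tuple


def parse_label_line(line: str) -> Tuple[int, float, float, float, float]:
    """Parse one YOLO label line: "<class_num> <cx> <cy> <w> <h>"."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    class_num = int(parts[0])  # raises ValueError on non-integers
    cx, cy, w, h = (float(p) for p in parts[1:])
    for value in (cx, cy, w, h):
        # Assumption: normalized means within [0, 1]; NaN fails this test too.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value out of normalized range: {value}")
    return class_num, cx, cy, w, h


def check_label_text(text: str, expected_lines: int) -> bool:
    """True when the label file has the expected number of valid lines."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) != expected_lines:
        return False
    try:
        for line in lines:
            parse_label_line(line)
    except ValueError:
        return False
    return True
```

With this shape, F1-003 reduces to `check_label_text(text, 0)` and F1-004 to `check_label_text(text, 5)`.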
### Group F1-N — Annotation create negative cases
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F1-N-001 | `requests/F1_N_001_request.json` (no `image` bytes) | Missing image bytes | HTTP 400 or 422; error envelope present; `error.code` not empty | exact, schema_match | N/A | N/A |
| F1-N-002 | `requests/F1_N_002_request.json` (image bytes present but `mediaType` missing) | Missing required field | HTTP 400 or 422; error envelope | exact, schema_match | N/A | N/A |
| F1-N-003 | `image_small` + valid payload + JWT with policy `DATASET` only | Caller missing ANN policy | HTTP 403; error envelope `error.code` ∈ {`forbidden`, `policy_denied`} | exact, set_contains | N/A | N/A |
| F1-N-004 | `image_small` + valid payload + no `Authorization` header | Unauthenticated | HTTP 401; error envelope | exact | N/A | N/A |
| F1-N-005 | `image_small` + payload with `detections[0].centerX = 1.5` (out of 0..1 range) | Invalid bbox value | HTTP 200 today (no validator) → flag as documented gap; OR HTTP 400/422 if validation lands per SEC-05 | exact (today: 200) | N/A | N/A |
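
One way a runner could encode the accepted status codes for the negative cases, including the two possible outcomes of F1-N-005. The `bbox_validation_landed` flag is a hypothetical switch for illustration; the spec does not define how the runner learns that SEC-05 validation has shipped.

```python
from typing import Dict, Set

# Accepted status codes per row, taken from the table above.
_FIXED_STATUSES: Dict[str, Set[int]] = {
    "F1-N-001": {400, 422},
    "F1-N-002": {400, 422},
    "F1-N-003": {403},
    "F1-N-004": {401},
}


def allowed_statuses(case_id: str, bbox_validation_landed: bool = False) -> Set[int]:
    """Return the status codes the given negative case may produce."""
    if case_id == "F1-N-005":
        # Today the service accepts the out-of-range bbox (documented gap);
        # once validation lands the row expects a client error instead.
        return {400, 422} if bbox_validation_landed else {200}
    return _FIXED_STATUSES[case_id]


def status_ok(case_id: str, status: int, bbox_validation_landed: bool = False) -> bool:
    """True when the observed status is acceptable for the case."""
    return status in allowed_statuses(case_id, bbox_validation_landed)
```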
### Group F2 — Annotation listing & detail (`GET /annotations`, `/annotations/{id}`)
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F2-001 | `GET /annotations?limit=10` after F1-001..F1-004 succeeded | Paginated list | HTTP 200; `body.length == 4`; each item matches `AnnotationListItem` schema | exact (count), schema_match | N/A | N/A |
| F2-002 | `GET /annotations/{id from F1-001}` | Detail of an existing annotation | HTTP 200; `body.id == <id>`; `body.detections.length == 1` | exact | N/A | `expected_results/F1_001_response.json` (same file as F1-001) |
| F2-003 | `GET /annotations/00000000000000000000000000000000` | Nonexistent id | HTTP 404; error envelope; `error.code` matches `/not.?found/i` | exact, regex | N/A | N/A |
| F2-004 | `GET /annotations?missionId=<unknown-guid>` | Filter by mission with no annotations | HTTP 200; `body.length == 0` | exact | N/A | N/A |
### Group F3 — Realtime SSE (`GET /annotations/events`)
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F3-001 | Subscriber connects to `/annotations/events`, then F1-001 fires | SSE delivery for new annotation | Subscriber receives one event with `data` parsing as `AnnotationEventDto`; `event.operation == "Created"`; `event.annotationId == <id from F1-001>`; latency `≤ 1000ms` | schema_match, exact, threshold_max | latency ± 200ms | N/A |
| F3-002 | F1-001 fires, then subscriber connects | No backfill expected | Subscriber receives 0 events for the historical id within 5s window | exact (count) | N/A | N/A |
| F3-003 | Subscriber connects without `Authorization` header | Unauthenticated SSE | HTTP 401 on the SSE connection establishment | exact | N/A | N/A |
| F3-004 `[after RB-01]` | Subscriber connects, then `PUT /annotations/{id}` updates fields | Lifecycle observability for Update | Subscriber receives event with `event.operation == "Updated"`, payload reflecting the update | exact, schema_match | N/A | N/A |
| F3-005 `[after RB-01]` | Subscriber connects, then `DELETE /annotations/{id}` | Lifecycle observability for Delete (soft-delete) | Subscriber receives event with `event.operation == "Deleted"`; row status flips to `Deleted (40)`; image+label files relocate to `deleted_dir` | exact, file_exists | N/A | N/A |
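
A subscriber-side sketch for the SSE rows above: it groups `data:` lines into events and decodes each payload as JSON. Two assumptions are made here: the event body is JSON, and its keys are literally `operation` and `annotationId` as the expected results suggest. The function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the joined data payload of each complete SSE event."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one optional space after the colon
        if field == "data":
            data_parts.append(value)
    # An unterminated trailing event is discarded.


def events_for(lines: Iterable[str], annotation_id: str) -> List[Dict[str, Any]]:
    """Decode events and keep those that refer to the given annotation id."""
    decoded = (json.loads(payload) for payload in iter_sse_data(lines))
    return [event for event in decoded if event.get("annotationId") == annotation_id]
```

F3-001 then asserts that `events_for(...)` returns exactly one event whose `operation` is `"Created"`, and F3-002 asserts that it returns an empty list within the 5s window.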
### Group F4 — Outbox + Stream (`FailsafeProducer`)
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F4-001 | F1-001 succeeds | Outbox row inserted | After F1-001 returns 200, exactly one new row exists in `annotations_queue_records` with `annotation_id == <id>`, `operation == 10` (Created) | exact | N/A | N/A |
| F4-002 | After F4-001, wait for one drain cycle | Drainer publishes to RabbitMQ stream | Within `drain_interval + 2s`, the row is deleted AND a message lands on stream `azaion-annotations` | exact, threshold_max | N/A | N/A |
| F4-003 | Inspect the published stream message | Message wire format | gzip-decoded MessagePack body deserializes into the documented schema (`annotationId`, `operation`, `dateTime`, payload) | schema_match | N/A | `expected_results/F4_003_stream_message.json` |
| F4-004 `[after RB-09]` | Two F1-001 invocations with the same image bytes | Stream dedupe contract | Stream messages carry `(annotationId, operation, dateTime)`; a downstream consumer can collapse duplicates by that triple | exact, schema_match | N/A | N/A |
| F4-005 | RabbitMQ unreachable, then F1-001 fires | Drainer survives broker outage | Row stays in `annotations_queue_records` (does not get deleted); `FailsafeProducer` does not crash; queue depth grows; HTTP 200 still returned to the original caller | exact, schema_match | N/A | N/A |
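
The outbox behavior that F4-001, F4-002, and F4-005 describe can be illustrated with an in-memory SQLite stand-in. This is not the `FailsafeProducer` implementation: the `id` column, the `publish` callback, and the use of `ConnectionError` to signal an unreachable broker are all assumptions made for the sketch.

```python
import sqlite3
from typing import Callable


def create_outbox(conn: sqlite3.Connection) -> None:
    """Create a simplified outbox table (id column is an assumption)."""
    conn.execute(
        "CREATE TABLE annotations_queue_records ("
        "id INTEGER PRIMARY KEY, "
        "annotation_id TEXT NOT NULL, "
        "operation INTEGER NOT NULL)"
    )


def enqueue(conn: sqlite3.Connection, annotation_id: str, operation: int = 10) -> None:
    """Insert one outbox row; operation 10 corresponds to Created."""
    conn.execute(
        "INSERT INTO annotations_queue_records (annotation_id, operation) VALUES (?, ?)",
        (annotation_id, operation),
    )
    conn.commit()


def drain_once(conn: sqlite3.Connection, publish: Callable[[str, int], None]) -> int:
    """Publish pending rows in order; delete a row only after it is published."""
    rows = conn.execute(
        "SELECT id, annotation_id, operation FROM annotations_queue_records ORDER BY id"
    ).fetchall()
    published = 0
    for row_id, annotation_id, operation in rows:
        try:
            publish(annotation_id, operation)
        except ConnectionError:
            break  # broker unreachable: keep this row and the rest for the next cycle
        conn.execute("DELETE FROM annotations_queue_records WHERE id = ?", (row_id,))
        conn.commit()
        published += 1
    return published


def queue_depth(conn: sqlite3.Connection) -> int:
    """Number of rows still waiting in the outbox."""
    return conn.execute("SELECT COUNT(*) FROM annotations_queue_records").fetchone()[0]
```

Deleting only after a successful publish is what keeps the row in place during a broker outage, at the cost of possible duplicate deliveries if the process stops between publish and delete.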
### Group F5 — Media upload (`POST /media`, `POST /media/batch`)
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F5-001 | multipart `POST /media` with `image_small`, `mediaType=Image`, `waypointId=<guid>` | Single media upload | HTTP 200; body matches `MediaListItem`; file exists at `<media_dir>/<media_id>` (extension preserved) | exact, schema_match, file_exists | N/A | N/A |
| F5-002 | multipart `POST /media/batch` with 3 files (`image_small`, `image_dense01`, `image_dense02`) + same waypointId | Batch upload | HTTP 200; `body.length == 3`; 3 distinct `mediaId` values; 3 files on disk | exact, file_exists | N/A | N/A |
| F5-003 | `POST /media` with no `waypointId` | Missing required field | HTTP 400 or 422; error envelope | exact, schema_match | N/A | N/A |
| F5-004 | `POST /media` with caller missing ANN policy | AuthZ check | HTTP 403; error envelope | exact | N/A | N/A |
### Group F6 — Auth verification (Bearer token validation)
Annotations does not host login / refresh / register; those are owned by admin and out of scope for this test corpus. The annotations e2e harness runs against an in-stack **mock JWKS issuer** that mints ES256 tokens with the configured `JWT_ISSUER` / `JWT_AUDIENCE`. The runner mints runtime tokens on demand (`fixtures/auth/mock_issuer.py` or equivalent) using a private key whose public half is published in the JWKS that the service fetches at boot. See `_docs/02_document/tests/test-data.md` and `_docs/02_document/tests/security-tests.md`.
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F6-001 | Any authenticated route called with a freshly minted ES256 token (correct iss / aud / exp) | Happy-path verification | HTTP 200 | exact | N/A | N/A |
| F6-002 | Same route with `iss` mismatched against `JWT_ISSUER` | Issuer rejection | HTTP 401; error envelope | exact, schema_match | N/A | N/A |
| F6-003 | Same route with `aud` mismatched against `JWT_AUDIENCE` | Audience rejection | HTTP 401; error envelope | exact, schema_match | N/A | N/A |
| F6-004 | Same route with `exp` 1 minute in the past | Expired token | HTTP 401; error envelope | exact, schema_match | N/A | N/A |
| F6-005 | Same route with `alg=HS256` and admin's public ES256 key reused as the HMAC key | Algorithm-confusion attack | HTTP 401; error envelope | exact, schema_match | N/A | N/A |
| F6-006 | Same route with no `Authorization` header | Anonymous rejection (except `/health`) | HTTP 401; error envelope | exact, schema_match | N/A | N/A |
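
The rejection reasons in F6-002 through F6-005 can be sketched as claim and header pre-checks using only the standard library. Important caveat: this sketch does NOT verify the ES256 signature against the JWKS, so it is not a token verifier; a real verifier must validate the signature as well. The `precheck_token` name and its return values are hypothetical.

```python
import base64
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment that may have its padding stripped."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def precheck_token(
    token: str, issuer: str, audience: str, now: Optional[float] = None
) -> Optional[str]:
    """Return a rejection reason, or None when the unsigned checks pass.

    Signature verification is intentionally omitted from this sketch.
    """
    try:
        header_b64, payload_b64, _signature = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return "malformed"
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return "malformed"
    # Algorithm pinning: anything other than ES256 is rejected outright,
    # which is what defeats the HS256 confusion attempt in F6-005.
    if header.get("alg") != "ES256":
        return "alg"
    if claims.get("iss") != issuer:
        return "iss"
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        return "aud"
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= current:
        return "exp"
    return None
```

Each non-`None` reason corresponds to an HTTP 401 row in the table: `"iss"` to F6-002, `"aud"` to F6-003, `"exp"` to F6-004, and `"alg"` to F6-005.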
### Group F7 — Settings & metadata
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F7-001 | `GET /classes` after fresh boot | Detection class catalog | HTTP 200; `body.length == 19`; ids `[0..18]` present; entry where `id == 9` has `name == "Smoke"`; entry where `id == 10` has `name == "Plane"` | exact, set_contains | N/A | `expected_results/F7_001_classes.json` |
| F7-002 | `GET /settings/system` | System settings read | HTTP 200; `body` matches `SystemSettings` shape; `silent_detection` field present today (removed post RB-02) | exact, schema_match | N/A | N/A |
| F7-003 | `PUT /settings/directories` with new `imagesDir` value | PathResolver invariant | HTTP 200; subsequent `GET /settings/directories` returns the new value; `pathResolver.Reset()` invariant — the next `POST /annotations` writes to the new path | exact, file_exists (under new path) | N/A | N/A |
| F7-004 | `PUT /settings/directories` with caller missing ADM policy | AuthZ check | HTTP 403; error envelope | exact | N/A | N/A |
| F7-005 `[after RB-06]` | `POST /classes` (admin CRUD) with caller having ADM policy | New class added | HTTP 200; `GET /classes` returns 20 rows; cache invalidated | exact | N/A | N/A |
### Group F8 — Dataset
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F8-001 | `GET /dataset?status=10` after F1-001..F1-004 | Filter by status `Pending` | HTTP 200; all returned items have `status == 10`; `body.length` matches the count of Pending rows in DB | exact | N/A | N/A |
| F8-002 | `GET /dataset?classNum=10` | Filter by class `Plane` | HTTP 200; every returned item's annotation has at least one detection with `class_num == 10` | exact | N/A | N/A |
| F8-003 | `GET /dataset/class-distribution` | Class distribution | HTTP 200; `body` is an array; each entry has `classNum`, `label`, `color`, `count`; sum of counts equals total detection count | exact, schema_match | N/A | N/A |
| F8-004 | `POST /dataset/status/bulk` with `{ annotationIds: [<id1>, <id2>], status: 20 }` | Bulk status update | HTTP 200; both annotations have `status == 20` after the call (atomic SQL `UPDATE … WHERE id IN (…)`) | exact | N/A | N/A |
| F8-005 `[after RB-08]` | F8-004 path | Lifecycle event emission | Each updated annotation emits an `Updated` SSE event AND inserts an outbox row | exact, schema_match | N/A | N/A |
### Group F9 — Health & boot
| # | Input | Input Description | Expected Result | Comparison | Tolerance | Reference File |
|---|-------|-------------------|-----------------|------------|-----------|---------------|
| F9-001 | `GET /health` | Health check | HTTP 200; latency `≤ 200ms` | exact, threshold_max | N/A | N/A |
| F9-002 | Container fresh boot, run migrator twice | Migrator idempotence | Second boot makes 0 schema changes (no new tables, no new columns); 0 errors | exact (DDL diff) | N/A | N/A |
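
The "exact (DDL diff)" comparison in F9-002 could work by snapshotting the schema catalog before and after the second migrator run. The sketch below uses SQLite purely as a stand-in; the real service's database engine, its catalog queries, and the `run_migrator` body (including the `annotations` table definition) are assumptions for illustration.

```python
import sqlite3
from typing import List, Tuple

SchemaRow = Tuple[str, str, str]


def run_migrator(conn: sqlite3.Connection) -> None:
    """Hypothetical idempotent migrator: creates only what is missing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotations ("
        "id TEXT PRIMARY KEY, status INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotations_queue_records ("
        "id INTEGER PRIMARY KEY, "
        "annotation_id TEXT NOT NULL, "
        "operation INTEGER NOT NULL)"
    )
    conn.commit()


def schema_snapshot(conn: sqlite3.Connection) -> List[SchemaRow]:
    """Capture every schema object (type, name, DDL text) in a stable order."""
    return conn.execute(
        "SELECT type, name, COALESCE(sql, '') FROM sqlite_master ORDER BY type, name"
    ).fetchall()


def ddl_diff(before: List[SchemaRow], after: List[SchemaRow]) -> List[SchemaRow]:
    """Schema objects present in exactly one of the two snapshots."""
    return sorted(set(before) ^ set(after))
```

The F9-002 expectation then becomes: take a snapshot after the first boot, run the migrator again, and assert that `ddl_diff` returns an empty list.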
## Coverage summary
- **Functional positive**: 28 rows (F1-001..007, F2-001/002/004, F3-001..002, F4-001..003, F4-005, F5-001..002, F6-001, F7-001..003, F8-001..004, F9-001..002).
- **Functional negative**: 15 rows (F1-N-001..005, F2-003, F3-003, F5-003..004, F6-002..006, F7-004).
- **`[after RB-XX]` rows** (skipped until the backlog item lands): F3-004, F3-005, F4-004, F7-005, F8-005 (5 deferred rows). The post-RB-04 hash invariant exercised by F1-001 is a sixth deferred item, but it is a global invariant rather than a table row.
Total: **48 rows**; **43 active today**, **5 deferred behind backlog items**.
## Reference files (to author next)
The rows above point to the following files in `expected_results/`. They will be authored as part of this skill's Phase 1 input-data analysis if the runner needs them; complex JSON bodies are best captured by running F1-001 once against a real DB and saving the response. For the initial spec, the regex/schema_match patterns above are sufficient.
| File | Purpose |
|------|---------|
| `F1_001_response.json` | Reference `AnnotationDto` body for `image_small` + 1 detection |
| `F1_004_response.json` | Reference body for dense scene (5 detections) |
| `F4_003_stream_message.json` | Reference MessagePack-decoded stream payload |
| `F7_001_classes.json` | Reference class catalog (19 rows, ids 0..18) |
## Open data gaps (raised during this draft)
- **Performance baselines**: `F1-005` and `F9-001` use single-instance latency thresholds (5000ms / 200ms) inferred from the codebase, NOT a contracted SLA. If suite-level perf targets exist, they override these.
- **`F1-N-005` invalid bbox value**: today the service silently accepts out-of-range `centerX`. Documented in `security_approach.md` SEC-05; needs a decision on whether the test should target the current (lenient) or future (validated) behavior.
- **F4-005 outage simulation**: depends on the test harness being able to restart RabbitMQ between cases — operational concern for the runner script (Phase 4).