annotations/_docs/02_document/tests/test-data.md
Oleksandr Bezdieniezhnykh 03f879206e docs+src: complete Steps 1-3 outcomes + auth re-sync baseline
This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:19:05 +03:00

Test Data Management

Seed Data Sets

| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|---|---|---|---|---|
| tokens-test | 3 ES256 access tokens minted on demand by the runner: ann-token (claim ANN), dataset-token (DATASET), adm-token (ADM). All carry iss=$JWT_ISSUER, aud=$JWT_AUDIENCE, exp=now+5m, and a deterministic sub GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, `F8-*`, NFT-SEC-01..10, FT-N-10..12 | The harness runs a mock JWKS issuer (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at JWT_JWKS_URL. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; the key pair regenerates on `docker compose down -v` |
| mission-test | One canonical waypoint id 00000000-0000-0000-0000-000000000aaa used as WaypointId / MissionId in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit: no FK enforcement; the GUID is just a column value. | N/A |
| classes-baseline | The 19 detection classes seeded by DatabaseMigrator (ids 0..18, names per data_parameters.md). | F7-001 (catalog read), `F1-*` (class_num references) | Auto, by the SUT's boot-time migrator. | N/A (schema-managed) |
| clean-state | Empty annotations, media, detection, annotations_queue_records tables at the start of each test class. | Every test class that asserts on count / depth | xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only). | Fixture's Dispose() truncates again |

Data Isolation Strategy

  • Per-class truncation — each xUnit test class declares an IClassFixture<CleanStateFixture> that truncates the four mutable tables before the first test in the class and again after the last.
  • Per-test token — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
  • Per-test mission id — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh WaypointId GUID per test so concurrent test runs don't leak events into each other.
  • Per-test stream consumer — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset next (current end of stream). They consume only messages produced after the test starts.
  • Filesystem isolation: the annotations-images, annotations-videos, and annotations-deleted volumes are recreated by docker compose down -v between full runs. Per-test cleanup removes only files the test wrote (matching <id> patterns).
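
As a rough illustration of the per-test rules above, the following Python sketch shows how a runner might hand each test its own identifiers. The names `IsolationContext`, `consumer_name`, and `files_written_by`, the `e2e-` prefix, and the assumption that a test's files are named `<id>.<ext>` are all hypothetical; the actual harness may be structured differently.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IsolationContext:
    """Identifiers owned by exactly one test (hypothetical helper)."""

    test_name: str
    # Fresh WaypointId so events from other tests never match this one.
    waypoint_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Random suffix keeps stream consumer names unique across concurrent runs.
    consumer_suffix: str = field(default_factory=lambda: uuid.uuid4().hex[:8])

    @property
    def consumer_name(self) -> str:
        return f"e2e-{self.test_name}-{self.consumer_suffix}"


def files_written_by(filenames, annotation_ids):
    """Return only the files whose stem is an id this test created.

    Assumption: files are named "<id>.<ext>"; adjust to the real pattern.
    """
    owned = set(annotation_ids)
    return [name for name in filenames if name.split(".", 1)[0] in owned]
```

Two contexts created for the same test name still get different `waypoint_id` and `consumer_name` values, which is the property the isolation rules rely on.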

Input Data Mapping

| Input Data File | Source Location | Description | Covers Scenarios |
|---|---|---|---|
| image_small.jpg | `<fixtures>/image_small.jpg` | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, `F8-*` |
| image_dense01.jpg | `<fixtures>/image_dense01.jpg` | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
| image_dense02.jpg | `<fixtures>/image_dense02.jpg` | larger dense frame (~2.8 MB) | F5-002 |
| image_different_types.jpg | `<fixtures>/image_different_types.jpg` | multi-class scene (900×1600) | F8-002 (class filter) |
| image_empty_scene.jpg | `<fixtures>/image_empty_scene.jpg` | 1920×1080 empty scene | F1-003 (zero detections), `NFT-PERF-*` warmup |
| image_large.JPG | `<fixtures>/image_large.JPG` | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
| video_short01.mp4 | `<fixtures>/video_short01.mp4` | ~150 MB video | F1-006 (video annotation), F1-007 |
| video_short02.mp4 | `<fixtures>/video_short02.mp4` | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |

<fixtures> resolves to /fixtures inside the test runner / SUT container, bound to ../detections/_docs/00_problem/input_data/ per _docs/00_problem/input_data/fixtures.md.

Synthetic request payloads

JSON request bodies for POST /annotations, PUT /annotations/{id}, POST /dataset/status/bulk, and the auth flows live under _docs/00_problem/input_data/requests/. Each test references a request file by id (F1_001_request.json). Class numbers in detections come from the seeded detection_classes (ids 0..18); coordinates are normalized 0..1 floats.
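
A small lint can keep hand-authored positive fixtures inside those bounds. The sketch below is only an illustration: the function name `lint_detections` is hypothetical, the field names are taken from the validation rules in this document, the overall payload shape is an assumption, and the seeded class ids are assumed to run from 0 through 18.

```python
SEEDED_CLASS_IDS = range(19)  # assumption: 19 seeded classes with ids 0..18
COORD_FIELDS = ("centerX", "centerY", "width", "height")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def lint_detections(payload):
    """Return a list of problems found in payload["detections"]."""
    problems = []
    for index, detection in enumerate(payload.get("detections", [])):
        class_num = detection.get("class_num")
        is_int = isinstance(class_num, int) and not isinstance(class_num, bool)
        if not (is_int and class_num in SEEDED_CLASS_IDS):
            problems.append(
                f"detections[{index}].class_num={class_num!r} is not a seeded class id"
            )
        for name in COORD_FIELDS:
            value = detection.get(name)
            # NaN also fails here, because NaN compares False to everything.
            if not (_is_number(value) and 0.0 <= value <= 1.0):
                problems.append(f"detections[{index}].{name}={value!r} is outside 0..1")
    return problems
```

Request files written for negative scenarios violate these bounds on purpose, so a lint like this would apply only to fixtures meant to be valid.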

Expected Results Mapping

(Full table is _docs/00_problem/input_data/expected_results/results_report.md — 44 rows. Selected entries here for cross-reference.)

| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
|---|---|---|---|---|---|
| FT-P-01 (=F1-001) | image_small.jpg + F1_001_request.json | HTTP 200 + AnnotationDto; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1` | exact, schema_match, regex | N/A | expected_results/F1_001_response.json |
| FT-P-02 (=F1-002) | Same input, second POST | Same id as FT-P-01; no duplicate row | exact | N/A | inline |
| FT-P-04 (=F1-004) | image_dense01.jpg + F1_004_request.json | HTTP 200; `detections.length == 5`; YOLO label file with 5 lines | exact, file_content | N/A | expected_results/F1_004_response.json |
| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with `operation == "Created"`, latency ≤ 1000ms | exact, threshold_max | ± 200ms | inline |
| FT-N-04 (=F1-N-004) | F1-001 with no Authorization header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
| NFT-PERF-LATENCY-01 | image_small.jpg × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
| NFT-SEC-01 | F1-001 with JWT signed by wrong key | HTTP 401 | exact | N/A | inline |
| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
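
The comparison methods named in the table can be sketched in a few lines of Python. This is an illustration, not the runner's actual code: the function names are hypothetical, the percentile uses the nearest-rank definition (the source does not say which definition the suite uses), and the tolerance is assumed to act as extra headroom on top of the limit.

```python
import re

HEX_ID = re.compile(r"[0-9a-f]{32}")


def matches_id_format(value):
    """regex comparison: exactly 32 lowercase hex characters."""
    return HEX_ID.fullmatch(value) is not None


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def within_threshold_max(value, limit, tolerance=0.0):
    """threshold_max comparison; tolerance is treated as extra headroom."""
    return value <= limit + tolerance
```

Under these assumptions, an FT-P-10 latency of 1150 ms passes (limit 1000 ms plus 200 ms tolerance) while 1250 ms fails, and for the 50 sequential calls of NFT-PERF-LATENCY-01 the p95 is the 48th-smallest sample.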

External Dependency Mocks

| External Service | Mock/Stub | How Provided | Behavior |
|---|---|---|---|
| RabbitMQ Stream broker | Real rabbitmq:3.13-management with the streams plugin | Docker service in e2e-net | Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app && docker exec rabbitmq rabbitmqctl start_app` |
| Postgres | Real postgres:13 | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic detections[] payloads in requests/. |
| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to http://annotations:8080. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |

Data Validation Rules

| Data Type | Validation | Invalid Examples | Expected System Behavior |
|---|---|---|---|
| image_bytes (POST /annotations) | non-null, non-empty byte array | empty array [], missing field | HTTP 400/422; error envelope |
| mediaType (POST /annotations) | enum Image=10 or Video=20 | 5, 100, missing | HTTP 400/422; error envelope |
| detections[].class_num | int, no range validator today | -1, 999 | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| detections[].centerX/Y/width/height | float, no range validator today | 1.5, -0.1, NaN | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| Authorization header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with alg pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, alg=HS256 forgery | HTTP 401; error envelope |
| Caller policy | ANN, DATASET, or ADM per endpoint | mismatched policy | HTTP 403; error envelope |
| WaypointId (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |

Runtime-generated test data

Two scenario groups consume synthetic test data generated by the runner at execution time rather than static files on disk. This is intentional and explicitly allowed by templates/expected-results.md ("Test data may be generated programmatically — note this in test-data.md"):

| Scenario | Generated data | How |
|---|---|---|
| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 annotations rows + 50,000 detection rows in the test DB | dataseed job runs a parameterised SQL script that bulk-inserts rows with media_id referencing 100 distinct seeded media rows; uses CROSS JOIN generate_series for speed. Cleared by clean-state truncation between test classes. |

The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
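
The NFT-RES-LIM-02 blobs could be produced along the following lines. This is a Python sketch of the idea only (the source describes an xUnit fixture). The names `write_synthetic_jpeg` and `temporary_blob`, the seed parameter, the choice of `FF D8 FF E0` as the 4-byte header, and the reading of "MB" as MiB are assumptions.

```python
import contextlib
import os
import random
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff\xe0"  # SOI marker followed by the APP0 marker
MIB = 1024 * 1024                 # assumption: "MB" in the table means MiB
CHUNK_BYTES = MIB                 # write in 1 MiB chunks to bound memory use


def write_synthetic_jpeg(path, size_bytes, seed=0):
    """Write JPEG magic + seeded pseudo-random filler, exactly size_bytes long."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("size_bytes must be at least 4")
    rng = random.Random(seed)  # seeded, so every run produces identical bytes
    remaining = size_bytes - len(JPEG_MAGIC)
    with open(path, "wb") as handle:
        handle.write(JPEG_MAGIC)
        while remaining > 0:
            chunk = min(CHUNK_BYTES, remaining)
            handle.write(rng.randbytes(chunk))  # requires Python 3.9+
            remaining -= chunk
    return path


@contextlib.contextmanager
def temporary_blob(size_bytes, seed=0):
    """Yield the path of a synthetic blob and delete it afterwards."""
    with tempfile.TemporaryDirectory() as directory:
        yield write_synthetic_jpeg(os.path.join(directory, "blob.jpg"), size_bytes, seed)
```

A test would wrap its upload in `with temporary_blob(256 * MIB) as path:` so the file exists only for the duration of that one request and never reaches the repository.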

Bearer token harness

Annotations is verifier-only — there is no /auth/login to call from a test. The harness reproduces the production model in miniature:

  1. Key pair — a fresh ES256 key pair is generated when the test stack starts (docker compose up). The private key is mounted into the runner container; the public key is mounted into a tiny mock issuer sidecar that serves /.well-known/jwks.json over HTTP inside the docker-compose network.
  2. JWKS URL configuration — the SUT is started with JWT_ISSUER=https://e2e-issuer.test, JWT_AUDIENCE=annotations-e2e, and JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json. The HTTPS-only constraint of HttpDocumentRetriever { RequireHttps = true } is relaxed for tests by either (a) overriding RequireHttps=false via test-only configuration, or (b) running a TLS-terminating proxy in front of the issuer. Option (a) is preferred for simplicity; the relaxation is gated on ASPNETCORE_ENVIRONMENT=E2ETest and never applied in production builds. (This is the testability item flagged in architecture.md Open Risks §6.)
  3. Token minting — the runner exposes a per-test helper mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?) that builds an ES256 JWT from the in-process private key with the configured iss/aud, exp = now + 5m, a per-role deterministic sub GUID, and the requested policy claim. overrides lets a test produce expired / wrong-iss / wrong-aud / forged-alg=HS256 variants for the security suite.
  4. No persisted users — there is no users table in this service. Each test mints exactly the token it needs.
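
The claim-building half of the minting step can be sketched without any crypto dependency. The Python below is illustrative only: `build_claims`, `signing_input`, the namespace GUID, and the `permissions` claim name are hypothetical (the source does not name the policy claim), and the ES256 signature itself is deliberately left out because it needs a crypto library.

```python
import base64
import json
import time
import uuid

ROLE_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000e2e")  # hypothetical
POLICY_CLAIM = "permissions"  # hypothetical claim name; check the SUT's policy setup


def build_claims(role, issuer, audience, now=None, overrides=None):
    """Default claims for one role; overrides produce the negative variants."""
    now = int(time.time()) if now is None else now
    claims = {
        "iss": issuer,
        "aud": audience,
        "sub": str(uuid.uuid5(ROLE_NAMESPACE, role)),  # deterministic per role
        "iat": now,
        "exp": now + 300,  # 5 minutes
        POLICY_CLAIM: role,
    }
    claims.update(overrides or {})
    return claims


def _b64url(data):
    # JWTs use unpadded base64url.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def signing_input(claims, alg="ES256"):
    """Return "<header>.<payload>", the bytes an ES256 signer would sign."""
    header = {"alg": alg, "typ": "JWT"}
    parts = [json.dumps(part, separators=(",", ":")).encode() for part in (header, claims)]
    return ".".join(_b64url(part) for part in parts)
```

An expired token then comes from `overrides={"exp": now - 1}`, a wrong-issuer token from `overrides={"iss": "https://other.test"}`, and the forged-alg header from `signing_input(claims, alg="HS256")`.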

Notes for the runner

  • Boot order: postgres → rabbitmq → e2e-issuer (mock JWKS) → annotations (waits for postgres, rabbitmq, and a successful JWKS fetch) → dataseed → e2e-runner.
  • Fresh-state vs. carry-over: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
  • Stream consumption: every test that reads from azaion-annotations records the offset before the test acts, then consumes from start_offset = recorded_offset + 1 to ignore historical messages.
  • Conditional probes: tests that depend on SUT behavior decisions (e.g., specific 4xx code on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, then asserts that branch consistently within the test class. Mismatch on a subsequent run flags as a behavior-drift test failure.
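
The conditional-probe note can be made concrete with a small sketch. The Python below assumes the recorded behavior is kept in a JSON baseline file between runs; the class name `ConditionalProbe`, its methods, and the file-based storage are hypothetical choices, not something the source specifies.

```python
import json
from pathlib import Path


class ConditionalProbe:
    """Probe the SUT once, remember the answer, and flag drift between runs."""

    def __init__(self, name, probe, baseline_path):
        self.name = name
        self._probe = probe  # callable returning e.g. an HTTP status code
        self._baseline_path = Path(baseline_path)
        self.recorded = None

    def _load(self):
        if self._baseline_path.exists():
            return json.loads(self._baseline_path.read_text())
        return {}

    def record(self):
        """Run at class-init: probe once and compare with the previous run."""
        observed = self._probe()
        baseline = self._load()
        previous = baseline.get(self.name)
        if previous is not None and previous != observed:
            raise AssertionError(
                f"behavior drift for {self.name}: recorded {previous}, now {observed}"
            )
        baseline[self.name] = observed
        self._baseline_path.write_text(json.dumps(baseline, indent=2))
        self.recorded = observed
        return observed

    def assert_consistent(self, actual):
        """Run inside each test: the SUT must keep taking the recorded branch."""
        if self.recorded is None:
            raise RuntimeError("record() must run before assert_consistent()")
        assert actual == self.recorded, (
            f"{self.name}: expected {self.recorded}, got {actual}"
        )
```

The first run writes the observed value to the baseline; a later run that observes a different value fails in `record()` before any test in the class executes, which matches the behavior-drift failure described above.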