annotations/_docs/02_document/tests/test-data.md
Oleksandr Bezdieniezhnykh d7d1c0ed6a [AZ-PENDING-1] [AZ-PENDING-2] Step 4 close-out: verification + docs
Phase 6 smoke (Docker, _docs/04_refactoring/01-testability-refactoring/
smoke-compose.yml):
  - Annotations app boots clean under ASPNETCORE_ENVIRONMENT=E2ETest.
  - /health 200 OK; /annotations with bearer returns 401 with the
    JWT library's own malformed-token rejection.
  - 0 IDX20108 occurrences in logs (C01 verified).
  - 0 IPAddress.Parse FormatException occurrences; FailsafeProducer
    reaches the broker via Docker DNS (C02 verified).
  - Full smoke report in verification.md.

Phase 7 docs:
  - architecture.md: retire Open Risks §6 (testability blocker
    resolved). Update the constraints block to describe the
    ASPNETCORE_ENVIRONMENT-gated RequireHttps behavior.
  - components/06_platform/description.md: one-liner on JwtExtensions
    JWKS gating.
  - components/02_annotations-realtime-sync/description.md: one-liner
    on FailsafeProducer host resolution accepting literal IP or DNS.
  - tests/test-data.md: refresh the JWKS URL configuration section to
    point at the resolved implementation instead of the open risk.

Task housekeeping:
  - _docs/02_tasks/todo/01_*.md -> done/
  - _docs/02_tasks/todo/02_*.md -> done/
  - _docs/_autodev_state.md: advance to Step 5 (Refactor Backlog Triage).

Tracker IDs remain placeholders pending Atlassian MCP availability —
real IDs to be assigned per
_docs/_process_leftovers/2026-05-14_testability-tracker.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:38:14 +03:00


Test Data Management

Seed Data Sets

| Data Set | Description | Used by Tests | How Loaded | Cleanup |
|---|---|---|---|---|
| tokens-test | 3 ES256 access tokens minted on demand by the runner: ann-token (claim ANN), dataset-token (DATASET), adm-token (ADM). All carry iss=$JWT_ISSUER, aud=$JWT_AUDIENCE, exp=now+5m, and a deterministic sub GUID per role. | F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12 | The harness runs a mock JWKS issuer (Python script tests/harness/mock_issuer.py or the equivalent .NET fixture) that publishes the public ES256 key at JWT_JWKS_URL. The runner imports the matching private key as a fixture and mints tokens per test. | Tokens are short-lived (5m) and never persisted; key pair regenerates on docker compose down -v |
| mission-test | One canonical waypoint id 00000000-0000-0000-0000-000000000aaa used as WaypointId / MissionId in every annotation create. | All F1, F2, F3, F4, F5, F8 | Implicit: no FK enforcement; the GUID is just a column value. | N/A |
| classes-baseline | The 19 detection classes seeded by DatabaseMigrator (ids 0..18, names per data_parameters.md). | F7-001 (catalog read), F1-* (class_num references) | Auto, by the SUT's boot-time migrator. | N/A (schema-managed) |
| clean-state | Empty annotations, media, detection, annotations_queue_records tables at the start of each test class. | Every test class that asserts on count / depth | xUnit class fixture: TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE; via direct DB connection (out-of-band, runner-only). | Fixture's Dispose() truncates again |

Data Isolation Strategy

  • Per-class truncation — each xUnit test class declares an IClassFixture<CleanStateFixture> that truncates the four mutable tables before the first test in the class and again after the last.
  • Per-test token — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
  • Per-test mission id — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh WaypointId GUID per test so concurrent test runs don't leak events into each other.
  • Per-test stream consumer — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset next (current end of stream). They consume only messages produced after the test starts.
  • Filesystem isolation: annotations-images, annotations-videos, annotations-deleted volumes are recreated by docker compose down -v between full runs. Per-test cleanup removes only files the test wrote (matching <id> patterns).
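
The per-test identifiers described above could be produced along these lines. This is a minimal sketch, not the runner's actual fixture: the function name `fresh_isolation_ids` and the `e2e-` consumer-name prefix are hypothetical, and the real suite is xUnit-based.

```python
import uuid


def fresh_isolation_ids(test_name: str) -> dict:
    """Return identifiers unique to one test so concurrent runs do not share events."""
    suffix = uuid.uuid4().hex[:8]  # short random suffix keeps consumer names readable
    return {
        # Fresh WaypointId so SSE fan-out from other tests is never observed.
        "waypoint_id": str(uuid.uuid4()),
        # Fresh stream consumer name (hypothetical naming scheme).
        "consumer_name": f"e2e-{test_name}-{suffix}",
    }


first = fresh_isolation_ids("F3_001")
second = fresh_isolation_ids("F3_001")
```

Two calls for the same test name still yield different values, which is what keeps parallel executions of one scenario from colliding.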

Input Data Mapping

| Input Data File | Source Location | Description | Covers Scenarios |
|---|---|---|---|
| image_small.jpg | <fixtures>/image_small.jpg | 1280×720 frame, ~1.5 MB | F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-* |
| image_dense01.jpg | <fixtures>/image_dense01.jpg | small dense frame (~230 KB) | F1-004, F5-002, F8-002 |
| image_dense02.jpg | <fixtures>/image_dense02.jpg | larger dense frame (~2.8 MB) | F5-002 |
| image_different_types.jpg | <fixtures>/image_different_types.jpg | multi-class scene (900×1600) | F8-002 (class filter) |
| image_empty_scene.jpg | <fixtures>/image_empty_scene.jpg | 1920×1080 empty scene | F1-003 (zero detections), NFT-PERF-* warmup |
| image_large.JPG | <fixtures>/image_large.JPG | 6252×4168, ~7 MB | F1-005 (large payload), NFT-PERF-LATENCY |
| video_short01.mp4 | <fixtures>/video_short01.mp4 | ~150 MB video | F1-006 (video annotation), F1-007 |
| video_short02.mp4 | <fixtures>/video_short02.mp4 | distinct-bytes second video | F1-007 (distinct bytes → distinct ids) |

<fixtures> resolves to /fixtures inside the test runner / SUT container, bound to ../detections/_docs/00_problem/input_data/ per _docs/00_problem/input_data/fixtures.md.

Synthetic request payloads

JSON request bodies for POST /annotations, PUT /annotations/{id}, POST /dataset/status/bulk, and the auth flows live under _docs/00_problem/input_data/requests/. Each test references a request file by id (F1_001_request.json). Class numbers in detections come from the seeded detection_classes (ids 0..18); coordinates are normalized 0..1 floats.
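
When hand-authoring these payloads, a small pre-flight check can catch entries that break the stated conventions before they reach the SUT. The sketch below is illustrative only: it assumes the 19 seeded classes have ids 0 through 18 and uses the field names `class_num`, `centerX`, `centerY`, `width`, and `height` as inferred from the validation rules in this document; the helper name `detection_problems` is hypothetical.

```python
import math

CLASS_IDS = range(0, 19)  # assumption: 19 seeded classes, ids 0..18
BOX_FIELDS = ("centerX", "centerY", "width", "height")


def detection_problems(det: dict) -> list:
    """Return a list of convention violations for one detections[] entry."""
    problems = []
    if det.get("class_num") not in CLASS_IDS:
        problems.append("class_num outside seeded ids")
    for key in BOX_FIELDS:
        value = det.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        # NaN fails the range comparison too, but the explicit check documents intent.
        if not is_number or math.isnan(value) or not 0.0 <= value <= 1.0:
            problems.append(f"{key} is not a normalized 0..1 float")
    return problems


valid = {"class_num": 3, "centerX": 0.5, "centerY": 0.5, "width": 0.1, "height": 0.2}
invalid = {"class_num": 999, "centerX": 1.5, "centerY": float("nan"), "width": -0.1, "height": 0.2}
```

Payloads that deliberately violate these conventions (for the SEC-05 gap scenarios) would skip this check rather than be rejected by it.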

Expected Results Mapping

(The full table is in _docs/00_problem/input_data/expected_results/results_report.md and has 44 rows. Selected entries are listed here for cross-reference.)

| Test Scenario ID | Input Data | Expected Result | Comparison Method | Tolerance | Source |
|---|---|---|---|---|---|
| FT-P-01 (=F1-001) | image_small.jpg + F1_001_request.json | HTTP 200 + AnnotationDto; id =~ /^[0-9a-f]{32}$/; detections.length == 1 | exact, schema_match, regex | N/A | expected_results/F1_001_response.json |
| FT-P-02 (=F1-002) | Same input, second POST | Same id as FT-P-01; no duplicate row | exact | N/A | inline |
| FT-P-04 (=F1-004) | image_dense01.jpg + F1_004_request.json | HTTP 200; detections.length == 5; YOLO label file with 5 lines | exact, file_content | N/A | expected_results/F1_004_response.json |
| FT-P-10 (=F3-001) | F1-001 fires, SSE subscriber connected | event with operation == "Created", latency ≤ 1000ms | exact, threshold_max | ± 200ms | inline |
| FT-N-04 (=F1-N-004) | F1-001 with no Authorization header | HTTP 401 + error envelope | exact, schema_match | N/A | inline |
| NFT-PERF-LATENCY-01 | image_small.jpg × 50 sequential calls | p95 latency ≤ 1500ms | threshold_max | N/A | inline |
| NFT-RES-01 | RabbitMQ stopped, F1-001 fires | HTTP 200 returned to caller; outbox row stays; SUT stays alive | exact | N/A | inline |
| NFT-SEC-01 | F1-001 with JWT signed by wrong key | HTTP 401 | exact | N/A | inline |
| NFT-RES-LIM-01 | F4 outbox under sustained load | queue depth ≤ 10× steady-state for ≥ 30 min | threshold_max | N/A | inline |
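
Two of the comparison methods above, regex and threshold_max, can be sketched as follows. The percentile definition the suite actually uses is not stated in this document; the nearest-rank method here is an assumption, and the helper names `p95` and `within_threshold_max` are hypothetical.

```python
import re

# Same pattern as the FT-P-01 expectation: 32 lowercase hex characters.
ID_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def within_threshold_max(value, limit):
    """threshold_max comparison: pass when the value does not exceed the limit."""
    return value <= limit


latencies = list(range(1, 51))  # stand-in for 50 sequential call timings, in ms
passed = within_threshold_max(p95(latencies), 1500)
```

With 50 samples the nearest-rank method selects the 48th smallest value, so up to two slow outliers do not fail NFT-PERF-LATENCY-01 under this definition.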

External Dependency Mocks

| External Service | Mock/Stub | How Provided | Behavior |
|---|---|---|---|
| RabbitMQ Stream broker | Real rabbitmq:3.13-management with the streams plugin | Docker service in e2e-net | Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using docker exec rabbitmq rabbitmqctl stop_app followed by docker exec rabbitmq rabbitmqctl start_app |
| Postgres | Real postgres:13 | Docker service | Real DB; resilience tests (NFT-RES-04) crash and restart it |
| Detections service | Not run | N/A | The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic detections[] payloads in requests/. |
| Suite-level reverse proxy / TLS terminator | Not run | N/A | Tests speak directly to http://annotations:8080. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT". |

Data Validation Rules

| Data Type | Validation | Invalid Examples | Expected System Behavior |
|---|---|---|---|
| image_bytes (POST /annotations) | non-null, non-empty byte array | empty array [], missing field | HTTP 400/422; error envelope |
| mediaType (POST /annotations) | enum Image=10 or Video=20 | 5, 100, missing | HTTP 400/422; error envelope |
| detections[].class_num | int, no range validator today | -1, 999 | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| detections[].centerX/Y/width/height | float, no range validator today | 1.5, -0.1, NaN | HTTP 200 today (lenient); flagged as gap (SEC-05) |
| Authorization header | bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with alg pinned to ES256 | missing, wrong issuer, wrong audience, wrong signature, expired, alg=HS256 forgery | HTTP 401; error envelope |
| Caller policy | ANN, DATASET, or ADM per endpoint | mismatched policy | HTTP 403; error envelope |
| WaypointId (POST /annotations, /media) | GUID format | not a GUID | HTTP 400/422 from model binder |
| File-upload size (POST /media) | no explicit limit visible at controller; underlying ASP.NET form-options apply | >256 MB single file | likely HTTP 400 from form-options; verify in NFT-RES-LIM-02 |

Runtime-generated test data

Two scenario groups consume synthetic test data generated by the runner at execution time rather than static files on disk. This is intentional and explicitly allowed by templates/expected-results.md ("Test data may be generated programmatically — note this in test-data.md"):

| Scenario | Generated data | How |
|---|---|---|
| NFT-RES-LIM-02 (single-file upload boundary) | Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB | Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo. |
| NFT-PERF-LIST-01, NFT-PERF-DATASET-01 | 10,000 annotations rows + 50,000 detection rows in the test DB | dataseed job runs a parameterised SQL script that bulk-inserts rows with media_id referencing 100 distinct seeded media rows; uses CROSS JOIN generate_series for speed. Cleared by clean-state truncation between test classes. |

The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
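
As an illustration of the NFT-RES-LIM-02 blob generation, the sketch below writes a JPEG-prefixed file of an exact size. It is not the runner's xUnit fixture. The specific magic bytes (FF D8 FF E0, the JFIF start) are an assumption, since this document only says "4-byte JPEG magic header", and `write_jpeg_prefixed_blob` is a hypothetical name.

```python
import os
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff\xe0"  # assumption: SOI marker followed by APP0 (JFIF)


def write_jpeg_prefixed_blob(path, size_bytes, chunk_size=1 << 20):
    """Write JPEG magic plus random filler so the file is exactly size_bytes long."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("size must cover the 4-byte header")
    remaining = size_bytes - len(JPEG_MAGIC)
    with open(path, "wb") as handle:
        handle.write(JPEG_MAGIC)
        # Write filler in chunks so a 512 MB blob never sits in memory at once.
        while remaining > 0:
            step = min(chunk_size, remaining)
            handle.write(os.urandom(step))
            remaining -= step
    return path


# Small demonstration size; the scenario itself uses 1 to 512 MB.
with tempfile.TemporaryDirectory() as tmp:
    blob_path = write_jpeg_prefixed_blob(os.path.join(tmp, "blob.jpg"), 4096)
    blob_size = os.path.getsize(blob_path)
    with open(blob_path, "rb") as handle:
        blob_head = handle.read(4)
```

The temporary directory is removed on exit, mirroring the "uploaded once, deleted after" rule for generated files.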

Bearer token harness

Annotations is verifier-only — there is no /auth/login to call from a test. The harness reproduces the production model in miniature:

  1. Key pair — a fresh ES256 key pair is generated when the test stack starts (docker compose up). The private key is mounted into the runner container; the public key is mounted into a tiny mock issuer sidecar that serves /.well-known/jwks.json over HTTP inside the docker-compose network.
  2. JWKS URL configuration — the SUT is started with JWT_ISSUER=https://e2e-issuer.test, JWT_AUDIENCE=annotations-e2e, and JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json. The HTTPS-only constraint of HttpDocumentRetriever.RequireHttps is relaxed in source: JwtExtensions.AddJwtAuth sets RequireHttps = false if and only if ASPNETCORE_ENVIRONMENT == "E2ETest" (case-insensitive). Any other value — including unset, Development, Staging, Production — keeps HTTPS required. This is the resolved form of architecture.md Open Risks §6 (see also _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md item C01).
  3. Token minting — the runner exposes a per-test helper mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?) that builds an ES256 JWT from the in-process private key with the configured iss/aud, exp = now + 5m, a per-role deterministic sub GUID, and the requested policy claim. overrides lets a test produce expired / wrong-iss / wrong-aud / forged-alg=HS256 variants for the security suite.
  4. No persisted users — there is no users table in this service. Each test mints exactly the token it needs.
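
The claim set that such a minting helper would sign can be sketched as below. This covers only the header and payload assembly. ES256 signing needs a crypto library and is deliberately left out. The claim name `policy`, the UUID-namespace scheme for the deterministic sub, and the function name `build_token_parts` are all assumptions made for illustration, not the harness's actual choices.

```python
import time
import uuid

ROLES = ("ANN", "DATASET", "ADM")


def build_token_parts(claim, issuer, audience, now=None, overrides=None):
    """Return (header, claims) for a test token; signing is out of scope here."""
    if claim not in ROLES:
        raise ValueError(f"unknown policy claim: {claim}")
    issued = int(time.time()) if now is None else now
    header = {"alg": "ES256", "typ": "JWT"}
    claims = {
        "iss": issuer,
        "aud": audience,
        "exp": issued + 5 * 60,  # tokens live for five minutes
        # Deterministic per-role subject (hypothetical derivation).
        "sub": str(uuid.uuid5(uuid.NAMESPACE_URL, f"e2e-role:{claim}")),
        "policy": claim,  # hypothetical claim name
    }
    # Overrides produce the security-suite variants (expired, wrong iss, forged alg).
    for key, value in (overrides or {}).items():
        target = header if key == "alg" else claims
        target[key] = value
    return header, claims


header, claims = build_token_parts(
    "ANN", "https://e2e-issuer.test", "annotations-e2e", now=1_000
)
expired_header, expired_claims = build_token_parts(
    "ANN", "https://e2e-issuer.test", "annotations-e2e", now=1_000,
    overrides={"exp": 999, "alg": "HS256"},
)
```

Routing an `alg` override to the header rather than the payload is what lets one helper cover both claim-level and algorithm-level negative cases.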

Notes for the runner

  • Boot order: postgres → rabbitmq → e2e-issuer (mock JWKS) → annotations (waits for postgres, rabbitmq, and a successful JWKS fetch) → dataseed → e2e-runner.
  • Fresh-state vs. carry-over: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
  • Stream consumption: every test that reads from azaion-annotations records the offset before the test acts, then consumes from start_offset = recorded_offset + 1 to ignore historical messages.
  • Conditional probes: tests that depend on SUT behavior decisions (e.g., specific 4xx code on a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, then asserts that branch consistently within the test class. Mismatch on a subsequent run flags as a behavior-drift test failure.
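
The conditional-probe idea can be sketched as a record-then-compare step. This is a simplified illustration: `baseline` is a plain dict standing in for wherever the runner keeps previously observed behavior, and both `check_probe` and the probe name `empty_image_bytes` are hypothetical.

```python
def check_probe(name, observed_status, baseline):
    """Record a probed status code, or fail if it differs from the recorded one."""
    recorded = baseline.get(name)
    if recorded is None:
        # First observation: this becomes the branch the test class asserts on.
        baseline[name] = observed_status
        return observed_status
    if recorded != observed_status:
        raise AssertionError(
            f"behavior drift for {name}: recorded {recorded}, observed {observed_status}"
        )
    return recorded


baseline = {}
branch = check_probe("empty_image_bytes", 400, baseline)  # class-init probe
```

Tests in the class then assert against `branch` instead of a hard-coded 400 or 422, and a later probe returning a different code raises instead of silently switching branches.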