mirror of https://github.com/azaion/annotations.git synced 2026-06-21 21:11:05 +00:00


Oleksandr Bezdieniezhnykh d7d1c0ed6a [AZ-PENDING-1] [AZ-PENDING-2] Step 4 close-out: verification + docs

Phase 6 smoke (Docker, _docs/04_refactoring/01-testability-refactoring/
smoke-compose.yml):
  - Annotations app boots clean under ASPNETCORE_ENVIRONMENT=E2ETest.
  - /health 200 OK; /annotations with bearer returns 401 with the
    JWT library's own malformed-token rejection.
  - 0 IDX20108 occurrences in logs (C01 verified).
  - 0 IPAddress.Parse FormatException occurrences; FailsafeProducer
    reaches the broker via Docker DNS (C02 verified).
  - Full smoke report in verification.md.

Phase 7 docs:
  - architecture.md: retire Open Risks §6 (testability blocker
    resolved). Update the constraints block to describe the
    ASPNETCORE_ENVIRONMENT-gated RequireHttps behavior.
  - components/06_platform/description.md: one-liner on JwtExtensions
    JWKS gating.
  - components/02_annotations-realtime-sync/description.md: one-liner
    on FailsafeProducer host resolution accepting literal IP or DNS.
  - tests/test-data.md: refresh the JWKS URL configuration section to
    point at the resolved implementation instead of the open risk.

Task housekeeping:
  - _docs/02_tasks/todo/01_*.md -> done/
  - _docs/02_tasks/todo/02_*.md -> done/
  - _docs/_autodev_state.md: advance to Step 5 (Refactor Backlog Triage).

Tracker IDs remain placeholders pending Atlassian MCP availability —
real IDs to be assigned per
_docs/_process_leftovers/2026-05-14_testability-tracker.md.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-14 20:38:14 +03:00

12 KiB


Test Data Management

Seed Data Sets

Data Set	Description	Used by Tests	How Loaded	Cleanup
`tokens-test`	Three ES256 access tokens minted on demand by the runner: `ann-token` (claim `ANN`), `dataset-token` (claim `DATASET`), `adm-token` (claim `ADM`). All carry `iss=$JWT_ISSUER`, `aud=$JWT_AUDIENCE`, `exp=now+5m`, and a deterministic `sub` GUID per role.	F1-N-003, F1-N-004, F5-004, F6-001..006, F7-004, F8-*, NFT-SEC-01..10, FT-N-10..12	The harness runs a mock JWKS issuer (Python script `tests/harness/mock_issuer.py` or the equivalent .NET fixture) that publishes the public ES256 key at `JWT_JWKS_URL`. The runner imports the matching private key as a fixture and mints tokens per test.	Tokens are short-lived (5m) and never persisted; the key pair is discarded by `docker compose down -v` and regenerated on the next stack start
`mission-test`	One canonical waypoint id `00000000-0000-0000-0000-000000000aaa` used as `WaypointId` / `MissionId` in every annotation create.	All F1, F2, F3, F4, F5, F8	Implicit — no FK enforcement; the GUID is just a column value.	N/A
`classes-baseline`	The 19 detection classes seeded by `DatabaseMigrator` (ids 0–18, names per `data_parameters.md`).	F7-001 (catalog read), F1-* (class_num references)	Auto, by the SUT's boot-time migrator.	N/A — schema-managed
`clean-state`	Empty `annotations`, `media`, `detection`, `annotations_queue_records` tables at the start of each test class.	every test class that asserts on count / depth	xUnit class fixture: `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE;` via direct DB connection (out-of-band, runner-only).	Fixture's `Dispose()` truncates again

Data Isolation Strategy

Per-class truncation — each xUnit test class declares an IClassFixture<CleanStateFixture> that truncates the four mutable tables before the first test in the class and again after the last.
Per-test token — every test mints its own ES256 token via the mock issuer fixture (see "Bearer token harness" below); tokens never cross test boundaries.
Per-test mission id — tests that need fan-out isolation (e.g., F3 SSE subscribers) generate a fresh WaypointId GUID per test so concurrent test runs don't leak events into each other.
Per-test stream consumer — F4 stream-consumer scenarios use a fresh consumer name per test and start at offset next (current end of stream). They consume only messages produced after the test starts.
Filesystem isolation — annotations-images, annotations-videos, annotations-deleted volumes are recreated by docker compose down -v between full runs. Per-test cleanup removes only files the test wrote (matching <id> patterns).
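The per-test mission id and per-test stream consumer rules above can be sketched as one small helper. This is only an illustration under assumptions: the names `IsolationIds` and `new_isolation`, the `e2e-` prefix, and the character filtering for consumer names are hypothetical and do not come from the harness.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class IsolationIds:
    """Per-test identifiers (hypothetical container, not a harness type)."""
    waypoint_id: str    # fresh WaypointId GUID, used for fan-out isolation
    consumer_name: str  # fresh stream consumer name for this test only


def new_isolation(test_name):
    """Return identifiers that no other test or concurrent run will share."""
    suffix = uuid.uuid4().hex[:12]
    # Assumption: restrict the consumer name to alphanumerics and dashes.
    safe = "".join(c if c.isalnum() else "-" for c in test_name)
    return IsolationIds(
        waypoint_id=str(uuid.uuid4()),
        consumer_name="e2e-{}-{}".format(safe, suffix),
    )
```

Calling `new_isolation` once per test keeps SSE events and stream offsets from one test out of every other test, including tests in a concurrent run.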

Input Data Mapping

Input Data File	Source Location	Description	Covers Scenarios
`image_small.jpg`	`<fixtures>/image_small.jpg`	1280×720 frame, ~1.5 MB	F1-001, F1-002, F1-N-003..005, F2-001/002, F3-001/002, F4-001/002, F5-001/002, F8-*
`image_dense01.jpg`	`<fixtures>/image_dense01.jpg`	small dense frame (~230 KB)	F1-004, F5-002, F8-002
`image_dense02.jpg`	`<fixtures>/image_dense02.jpg`	larger dense frame (~2.8 MB)	F5-002
`image_different_types.jpg`	`<fixtures>/image_different_types.jpg`	multi-class scene (900×1600)	F8-002 (class filter)
`image_empty_scene.jpg`	`<fixtures>/image_empty_scene.jpg`	1920×1080 empty scene	F1-003 (zero detections), NFT-PERF-* warmup
`image_large.JPG`	`<fixtures>/image_large.JPG`	6252×4168, ~7 MB	F1-005 (large payload), NFT-PERF-LATENCY
`video_short01.mp4`	`<fixtures>/video_short01.mp4`	~150 MB video	F1-006 (video annotation), F1-007
`video_short02.mp4`	`<fixtures>/video_short02.mp4`	distinct-bytes second video	F1-007 (distinct bytes → distinct ids)

<fixtures> resolves to /fixtures inside the test runner / SUT container, bound to ../detections/_docs/00_problem/input_data/ per _docs/00_problem/input_data/fixtures.md.

Synthetic request payloads

JSON request bodies for POST /annotations, PUT /annotations/{id}, POST /dataset/status/bulk, and the auth flows live under _docs/00_problem/input_data/requests/. Each test references a request file by its scenario id (for example, F1_001_request.json). Class numbers in detections come from the seeded detection_classes (ids 0–18); coordinates are normalized floats in the range 0..1.
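When hand-authoring positive-path request files, a small lint can catch detections that break the two conventions above. The sketch below is an authoring aid under assumptions: the function name `detection_problems` is hypothetical, and the field names follow the `detections[]` entries listed under Data Validation Rules. Negative tests that probe the lenient behavior deliberately bypass it.

```python
import math

CLASS_IDS = range(0, 19)  # ids 0..18, matching the seeded detection classes
BOX_FIELDS = ("centerX", "centerY", "width", "height")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def detection_problems(detection):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    class_num = detection.get("class_num")
    if not (isinstance(class_num, int) and not isinstance(class_num, bool)
            and class_num in CLASS_IDS):
        problems.append("class_num not in 0..18: {!r}".format(class_num))
    for field in BOX_FIELDS:
        value = detection.get(field)
        # NaN and infinities fail isfinite; the range check covers the rest.
        if not (_is_number(value) and math.isfinite(value) and 0.0 <= value <= 1.0):
            problems.append("{} not a normalized 0..1 float: {!r}".format(field, value))
    return problems
```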

Expected Results Mapping

(The full table, 44 rows, is in _docs/00_problem/input_data/expected_results/results_report.md. Selected entries are reproduced here for cross-reference.)

Test Scenario ID	Input Data	Expected Result	Comparison Method	Tolerance	Source
FT-P-01 (=F1-001)	`image_small.jpg` + `F1_001_request.json`	HTTP 200 + `AnnotationDto`; `id =~ /^[0-9a-f]{32}$/`; `detections.length == 1`	exact, schema_match, regex	N/A	`expected_results/F1_001_response.json`
FT-P-02 (=F1-002)	Same input, second POST	Same `id` as FT-P-01; no duplicate row	exact	N/A	inline
FT-P-04 (=F1-004)	`image_dense01.jpg` + `F1_004_request.json`	HTTP 200; `detections.length == 5`; YOLO label file with 5 lines	exact, file_content	N/A	`expected_results/F1_004_response.json`
FT-P-10 (=F3-001)	F1-001 fires, SSE subscriber connected	event with `operation == "Created"`, `latency ≤ 1000ms`	exact, threshold_max	± 200ms	inline
FT-N-04 (=F1-N-004)	F1-001 with no `Authorization` header	HTTP 401 + error envelope	exact, schema_match	N/A	inline
NFT-PERF-LATENCY-01	`image_small.jpg` × 50 sequential calls	p95 latency ≤ 1500ms	threshold_max	N/A	inline
NFT-RES-01	RabbitMQ stopped, F1-001 fires	HTTP 200 returned to caller; outbox row stays; SUT stays alive	exact	N/A	inline
NFT-SEC-01	F1-001 with JWT signed by wrong key	HTTP 401	exact	N/A	inline
NFT-RES-LIM-01	F4 outbox under sustained load	queue depth ≤ 10× steady-state for ≥ 30 min	threshold_max	N/A	inline
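The `regex` and `threshold_max` comparison methods in the table can be sketched as below. The nearest-rank percentile convention is an assumption, since the source does not say how p95 is computed, and the helper names are hypothetical.

```python
import re

# Pattern taken from the FT-P-01 row: 32 lowercase hex characters.
ANNOTATION_ID = re.compile(r"^[0-9a-f]{32}$")


def is_annotation_id(value):
    """regex comparison; fullmatch also rejects a trailing newline."""
    return ANNOTATION_ID.fullmatch(value) is not None


def p95(samples_ms):
    """95th percentile by the nearest-rank method (assumed convention)."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_threshold_max(value, limit):
    """threshold_max comparison: the measured value must not exceed the limit."""
    return value <= limit
```

For NFT-PERF-LATENCY-01 the check would then read `within_threshold_max(p95(latencies_ms), 1500)`, with `latencies_ms` holding the 50 sequential measurements.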

External Dependency Mocks

External Service	Mock/Stub	How Provided	Behavior
RabbitMQ Stream broker	Real `rabbitmq:3.13-management` with the streams plugin	Docker service in `e2e-net`	Real broker; resilience tests (NFT-RES-01..03) restart it mid-test using `docker exec rabbitmq rabbitmqctl stop_app && docker exec rabbitmq rabbitmqctl start_app`
Postgres	Real `postgres:13`	Docker service	Real DB; resilience tests (NFT-RES-04) crash and restart it
Detections service	Not run	N/A	The annotations service does not call the detections service; tests bypass it by hand-authoring synthetic `detections[]` payloads in `requests/`.
Suite-level reverse proxy / TLS terminator	Not run	N/A	Tests speak directly to `http://annotations:8080`. SEC-tests for HTTPS / HSTS therefore explicitly skip with reason "out-of-process for SUT".

Data Validation Rules

Data Type	Validation	Invalid Examples	Expected System Behavior
`image_bytes` (POST /annotations)	non-null, non-empty byte array	empty array `[]`, missing field	HTTP 400/422; error envelope
`mediaType` (POST /annotations)	enum `Image=10` or `Video=20`	`5`, `100`, missing	HTTP 400/422; error envelope
`detections[].class_num`	int, no range validator today	`-1`, `999`	HTTP 200 today (lenient); flagged as gap (SEC-05)
`detections[].centerX/Y/width/height`	float, no range validator today	`1.5`, `-0.1`, `NaN`	HTTP 200 today (lenient); flagged as gap (SEC-05)
`Authorization` header	bearer ES256 JWT issued by the mock issuer; validated for issuer / audience / signature / expiry, with `alg` pinned to ES256	missing, wrong issuer, wrong audience, wrong signature, expired, `alg=HS256` forgery	HTTP 401; error envelope
Caller policy	`ANN`, `DATASET`, or `ADM` per endpoint	mismatched policy	HTTP 403; error envelope
`WaypointId` (POST /annotations, /media)	GUID format	not a GUID	HTTP 400/422 from model binder
File-upload size (POST /media)	no explicit limit visible at controller; underlying ASP.NET form-options apply	>256 MB single file	likely HTTP 400 from form-options; verify in NFT-RES-LIM-02
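The `alg=HS256` forgery listed as an invalid `Authorization` example can be produced with the standard library alone, because HS256 is plain HMAC-SHA256. The sketch below is one possible way to build such a negative-test token; the function name and the choice of secret are assumptions, and the SUT is expected to answer HTTP 401 because `alg` is pinned to ES256.

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    """Base64url without padding, as used by the JWT compact serialization."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def forge_hs256_token(claims, secret):
    """Build a compact JWT whose header claims HS256 instead of ES256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)
```

Any byte string works as `secret` for this test, since the verifier should reject the token on the algorithm alone.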

Runtime-generated test data

Two scenario groups consume synthetic test data generated by the runner at execution time rather than static files on disk. This is intentional and explicitly allowed by templates/expected-results.md ("Test data may be generated programmatically — note this in test-data.md"):

Scenario	Generated data	How
NFT-RES-LIM-02 (single-file upload boundary)	Synthetic JPEG-prefixed binary blobs at sizes 1, 10, 50, 100, 256, 512 MB	Runner xUnit fixture writes a temp file: 4-byte JPEG magic header + pseudo-random bytes filling to the target size; uploaded once, deleted after. Files NOT committed to the repo.
NFT-PERF-LIST-01, NFT-PERF-DATASET-01	10,000 `annotations` rows + 50,000 `detection` rows in the test DB	`dataseed` job runs a parameterised SQL script that bulk-inserts rows with `media_id` referencing 100 distinct seeded media rows; uses `CROSS JOIN generate_series` for speed. Cleared by `clean-state` truncation between test classes.

The generated data still satisfies Phase 3 quantifiability: every generated input has a deterministic shape (size, count) AND a quantifiable expected result (HTTP code, latency threshold, returned row count).
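The NFT-RES-LIM-02 blob generation can be sketched as follows. The exact four header bytes are an assumption (`FF D8 FF E0`, the JPEG start-of-image marker followed by an APP0 marker), as are the function name and the seeded generator; the source only states a 4-byte JPEG magic header followed by pseudo-random bytes.

```python
import random

# Assumed header: SOI marker (FF D8) followed by an APP0 marker (FF E0).
JPEG_MAGIC = bytes([0xFF, 0xD8, 0xFF, 0xE0])
CHUNK = 1024 * 1024  # write in 1 MiB chunks so 512 MB never sits in memory


def write_jpeg_prefixed_blob(path, size_bytes, seed=0):
    """Write exactly size_bytes: the magic header, then pseudo-random filler."""
    if size_bytes < len(JPEG_MAGIC):
        raise ValueError("size_bytes must be at least 4")
    rng = random.Random(seed)  # seeded so the blob is reproducible
    remaining = size_bytes - len(JPEG_MAGIC)
    with open(path, "wb") as handle:
        handle.write(JPEG_MAGIC)
        while remaining > 0:
            n = min(CHUNK, remaining)
            handle.write(rng.getrandbits(8 * n).to_bytes(n, "little"))
            remaining -= n
```

A fixture would call this with a temporary path for each target size, upload the file once, and delete it afterwards, so nothing is committed to the repository.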

Bearer token harness

Annotations is verifier-only — there is no /auth/login to call from a test. The harness reproduces the production model in miniature:

Key pair — a fresh ES256 key pair is generated when the test stack starts (docker compose up). The private key is mounted into the runner container; the public key is mounted into a tiny mock issuer sidecar that serves /.well-known/jwks.json over HTTP inside the docker-compose network.
JWKS URL configuration — the SUT is started with JWT_ISSUER=https://e2e-issuer.test, JWT_AUDIENCE=annotations-e2e, and JWT_JWKS_URL=http://e2e-issuer:8080/.well-known/jwks.json. The HTTPS-only constraint of HttpDocumentRetriever.RequireHttps is relaxed in source: JwtExtensions.AddJwtAuth sets RequireHttps = false if and only if ASPNETCORE_ENVIRONMENT == "E2ETest" (case-insensitive). Any other value — including unset, Development, Staging, Production — keeps HTTPS required. This is the resolved form of architecture.md Open Risks §6 (see also _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md item C01).
Token minting — the runner exposes a per-test helper mintToken(claim: "ANN" | "DATASET" | "ADM", overrides?) that builds an ES256 JWT from the in-process private key with the configured iss/aud, exp = now + 5m, a per-role deterministic sub GUID, and the requested policy claim. overrides lets a test produce expired / wrong-iss / wrong-aud / forged-alg=HS256 variants for the security suite.
No persisted users — there is no users table in this service. Each test mints exactly the token it needs.
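The claim-assembly half of the token minting step can be sketched as below. It is not the runner's `mintToken`: the claim key `permissions` is hypothetical because the source does not name the claim that carries the policy, and deriving `sub` with `uuid5` is only one assumed way to get a deterministic GUID per role. ES256 signing is left out because the Python standard library has no ECDSA support; a JWT library would sign the returned claims with the private key.

```python
import time
import uuid

ROLES = ("ANN", "DATASET", "ADM")
POLICY_CLAIM = "permissions"  # hypothetical claim key, not from the source
TOKEN_LIFETIME_S = 5 * 60     # exp = now + 5m


def build_claims(role, issuer, audience, now=None, overrides=None):
    """Return the claims set for one test token, before signing."""
    if role not in ROLES:
        raise ValueError("unknown role: {!r}".format(role))
    issued_at = int(time.time()) if now is None else int(now)
    claims = {
        "iss": issuer,
        "aud": audience,
        # Assumption: a name-based UUID gives the same sub for the same role.
        "sub": str(uuid.uuid5(uuid.NAMESPACE_URL, "e2e-role:" + role)),
        "exp": issued_at + TOKEN_LIFETIME_S,
        POLICY_CLAIM: role,
    }
    # overrides lets the security suite build expired / wrong-iss / wrong-aud variants.
    claims.update(overrides or {})
    return claims
```

For example, `build_claims("ANN", issuer, audience, overrides={"exp": 0})` yields the claims for an expired token.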

Notes for the runner

Boot order: postgres → rabbitmq → e2e-issuer (mock JWKS) → annotations (waits for postgres, rabbitmq, and a successful JWKS fetch) → dataseed → e2e-runner.
Fresh-state vs. carry-over: the suite truncates per class, so test ordering inside a class matters; ordering across classes does not.
Stream consumption: every test that reads from azaion-annotations records the offset before the test acts, then consumes from start_offset = recorded_offset + 1 to ignore historical messages.
Conditional probes: tests that depend on SUT behavior decisions (e.g., the specific 4xx code returned for a corner case) include a fixture step that probes the SUT once at class-init, records the actual behavior, and then asserts that branch consistently within the test class. A mismatch on a subsequent run is flagged as a behavior-drift test failure.
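The stream consumption rule above reduces to a simple offset filter. The sketch below assumes messages arrive as `(offset, payload)` pairs, which is an illustrative shape rather than the client library's actual message type; the function name is hypothetical.

```python
def messages_for_test(messages, recorded_offset):
    """Keep only payloads produced after the offset recorded before the test acted."""
    start_offset = recorded_offset + 1
    return [payload for offset, payload in messages if offset >= start_offset]
```

With the offset recorded as 6, messages at offsets 5 and 6 are treated as historical and only offsets 7 and above are visible to the test.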
