From cf632d9e2e5c607c383ff1b2b8c2d554348ea079 Mon Sep 17 00:00:00 2001
From: Oleksandr Bezdieniezhnykh
Date: Thu, 14 May 2026 21:13:53 +0300
Subject: [PATCH] [AZ-563] Decompose blackbox tests into AZ-564..574 task specs

Step 5 of autodev existing-code flow. Epic AZ-563 plus 11 atomic tasks
covering all 67 test scenarios from _docs/02_document/tests/* exactly
once:

- AZ-564 test infrastructure (xUnit + Docker + mock JWKS + dataseed)
- AZ-565..568 functional positive (FT-P-01..22)
- AZ-569..570 functional negative (FT-N-01..16)
- AZ-571 security (NFT-SEC-01..10)
- AZ-572 resilience (NFT-RES-01..06)
- AZ-573 resource limits (NFT-RES-LIM-01..06)
- AZ-574 performance (NFT-PERF-*)

_dependencies_table.md records the cross-check vs traceability matrix
(22 + 16 + 29 = 67 scenarios, no overlaps, no gaps; deferred items
remain deferred per matrix). All task headers carry their Jira IDs
(tracker: jira). Autodev state advanced to Step 6 (Implement Tests).

Co-authored-by: Cursor
---
 _docs/02_tasks/_dependencies_table.md         |  26 +++
 .../todo/AZ-564_test_infrastructure.md        | 196 ++++++++++++++++++
 .../AZ-565_test_annotations_rest_positive.md  |  53 +++++
 .../AZ-566_test_realtime_outbox_positive.md   |  51 +++++
 .../AZ-567_test_media_dataset_positive.md     |  49 +++++
 ...-568_test_auth_health_migrator_positive.md |  35 ++++
 ...Z-569_test_validation_envelope_negative.md |  43 ++++
 .../AZ-570_test_authorization_negative.md     |  44 ++++
 _docs/02_tasks/todo/AZ-571_test_security.md   |  52 +++++
 _docs/02_tasks/todo/AZ-572_test_resilience.md |  49 +++++
 .../todo/AZ-573_test_resource_limits.md       |  49 +++++
 .../02_tasks/todo/AZ-574_test_performance.md  |  50 +++++
 _docs/_autodev_state.md                       |   8 +-
 13 files changed, 703 insertions(+), 2 deletions(-)
 create mode 100644 _docs/02_tasks/todo/AZ-564_test_infrastructure.md
 create mode 100644 _docs/02_tasks/todo/AZ-565_test_annotations_rest_positive.md
 create mode 100644 _docs/02_tasks/todo/AZ-566_test_realtime_outbox_positive.md
 create mode 100644 _docs/02_tasks/todo/AZ-567_test_media_dataset_positive.md
 create mode 100644 _docs/02_tasks/todo/AZ-568_test_auth_health_migrator_positive.md
 create mode 100644 _docs/02_tasks/todo/AZ-569_test_validation_envelope_negative.md
 create mode 100644 _docs/02_tasks/todo/AZ-570_test_authorization_negative.md
 create mode 100644 _docs/02_tasks/todo/AZ-571_test_security.md
 create mode 100644 _docs/02_tasks/todo/AZ-572_test_resilience.md
 create mode 100644 _docs/02_tasks/todo/AZ-573_test_resource_limits.md
 create mode 100644 _docs/02_tasks/todo/AZ-574_test_performance.md

diff --git a/_docs/02_tasks/_dependencies_table.md b/_docs/02_tasks/_dependencies_table.md
index e180d0a..02ee90c 100644
--- a/_docs/02_tasks/_dependencies_table.md
+++ b/_docs/02_tasks/_dependencies_table.md
@@ -11,6 +11,32 @@ Tracks ordering and inter-task dependencies for all task specs in `_docs/02_task
 
 Tasks AZ-561 and AZ-562 touch disjoint files and were implemented as a single batch.
 
+## Open — Step 5: Blackbox Tests (epic AZ-563)
+
+All test tasks below land their xUnit code in a single new test project rooted at `e2e/Azaion.Annotations.E2E/` (per `AZ-564`). The infrastructure task is a hard prerequisite for every other test task.
+
+| Task | Title | Scope | Scenarios | Complexity | Depends on |
+|------|-------|-------|-----------|-----------|------------|
+| [AZ-564](https://denyspopov.atlassian.net/browse/AZ-564) | Test infrastructure (Annotations e2e) | `e2e/Azaion.Annotations.E2E/`, `e2e/docker-compose.test.yml`, mock JWKS issuer, dataseed, runner script | n/a — bootstrap | 5 | None |
+| [AZ-565](https://denyspopov.atlassian.net/browse/AZ-565) | Annotations REST positive | `Tests/AnnotationsRest/` | FT-P-01..06 (6) | 5 | AZ-564 |
+| [AZ-566](https://denyspopov.atlassian.net/browse/AZ-566) | Realtime + outbox positive | `Tests/Realtime/`, `Tests/Outbox/` | FT-P-07,08,09 + FT-P-21,22 (skipped: RB-01) (5) | 5 | AZ-564 |
+| [AZ-567](https://denyspopov.atlassian.net/browse/AZ-567) | Media + Dataset positive | `Tests/Media/`, `Tests/Settings/`, `Tests/Dataset/` | FT-P-10,11,14,15,16,17,18 (7) | 5 | AZ-564 |
+| [AZ-568](https://denyspopov.atlassian.net/browse/AZ-568) | Auth + Health + Migrator positive | `Tests/Auth/`, `Tests/Health/`, `Tests/Migrator/` | FT-P-12,13,19,20 (4) | 2 | AZ-564 |
+| [AZ-569](https://denyspopov.atlassian.net/browse/AZ-569) | Validation + envelope negative | `Tests/Validation/` | FT-N-01,02,05,06,07,14,16 (7) | 3 | AZ-564 |
+| [AZ-570](https://denyspopov.atlassian.net/browse/AZ-570) | Authorization negative | `Tests/Authorization/` | FT-N-03,04,08,09,10,11,12,13,15 (9) | 3 | AZ-564 |
+| [AZ-571](https://denyspopov.atlassian.net/browse/AZ-571) | Security tests | `Tests/Security/` (+ Production-env xUnit collection) | NFT-SEC-01..10 (10) | 5 | AZ-564 |
+| [AZ-572](https://denyspopov.atlassian.net/browse/AZ-572) | Resilience tests | `Tests/Resilience/` (broker / DB outage fixtures) | NFT-RES-01..06 (6) | 5 | AZ-564 |
+| [AZ-573](https://denyspopov.atlassian.net/browse/AZ-573) | Resource-limit tests | `Tests/ResourceLimit/` (profile-gated nightly variants) | NFT-RES-LIM-01..06 (6) | 3 | AZ-564 |
+| [AZ-574](https://denyspopov.atlassian.net/browse/AZ-574) | Performance tests | `Tests/Performance/` (perf profile + dataseed-loaded DB) | NFT-PERF-* (7) | 3 | AZ-564 (incl. dataseed) |
+
+### Coverage cross-check vs `_docs/02_document/tests/traceability-matrix.md`
+
+- **Functional positive**: FT-P-01..22 = 22 scenarios → covered exactly once across AZ-565 (6) + AZ-566 (5) + AZ-567 (7) + AZ-568 (4).
+- **Functional negative**: FT-N-01..16 = 16 scenarios → covered exactly once across AZ-569 (7) + AZ-570 (9).
+- **Non-functional**: 10 + 6 + 6 + 7 = 29 scenarios → covered exactly once across AZ-571..574.
+- **Total decomposed**: 22 + 16 + 29 = **67 scenarios**, no overlaps, no gaps.
+- **Deferred items** (RB-01 gated FT-P-21/22, RB-02/06/08/09 follow-ups, AC-F-13, ENV-04/05, OP-02 multi-instance) remain marked deferred per the traceability matrix and will be re-decomposed in cycle-update once the gating refactor tasks land.
+
 ## Tracker Status
 
 `tracker: jira` (per `_docs/_autodev_state.md`). All task headers carry their Jira issue key. The deferred-write leftover at `_docs/_process_leftovers/2026-05-14_testability-tracker.md` was replayed on 2026-05-14 and removed.
diff --git a/_docs/02_tasks/todo/AZ-564_test_infrastructure.md b/_docs/02_tasks/todo/AZ-564_test_infrastructure.md
new file mode 100644
index 0000000..845ee2e
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-564_test_infrastructure.md
@@ -0,0 +1,196 @@
+# Test Infrastructure
+
+**Task**: AZ-564
+**Name**: Test Infrastructure (Annotations e2e)
+**Description**: Scaffold the executable blackbox test project — xUnit runner, mock JWKS issuer, ES256 key-pair fixture, Docker test stack, fixture mounts, seed script, CSV reporting. After this task lands, every other test task can declare itself a child of this scaffold.
+**Complexity**: 5 points
+**Dependencies**: AZ-560 (testability refactor — already landed via AZ-561 and AZ-562)
+**Component**: Blackbox Tests
+**Tracker**: jira
+**Epic**: AZ-563 — `Blackbox Tests — Annotations`
+
+## Test Project Folder Layout
+
+```
+tests/
+├── Azaion.Annotations.E2E/
+│   ├── Azaion.Annotations.E2E.csproj
+│   ├── Dockerfile
+│   ├── TestBase.cs                   # base class with HttpClient, token helper
+│   ├── Fixtures/
+│   │   ├── DockerStackFixture.cs     # CollectionFixture — boot order check
+│   │   ├── CleanStateFixture.cs      # TRUNCATE between test classes
+│   │   ├── BrokerFixture.cs          # RabbitMQ stop/start helpers
+│   │   └── TokenMinter.cs            # ES256 token minting via the in-stack key
+│   ├── Domain/                       # one file per category (one task per file)
+│   │   ├── (populated by AZ-565 ... AZ-573)
+│   └── README.md
+└── harness/
+    ├── mock_issuer.py                # ~40-line Python http.server (writes JWKS, mounts private key)
+    └── gen_keys.sh                   # one-shot ES256 keypair generator (invoked by mock_issuer at boot)
+
+e2e/
+├── docker-compose.test.yml           # already produced in autodev Step 3; this task wires the new services into it
+├── seed/
+│   └── run.sh                        # already drafted in Step 3; this task adds bulk-insert SQL for NFT-PERF-LIST-01 and NFT-PERF-DATASET-01
+└── e2e-results/                      # output of test runs (gitignored)
+```
+
+### Layout Rationale
+
+- Tests live under `tests/Azaion.Annotations.E2E/` to mirror the .NET convention (sibling of `src/`).
+- The mock issuer lives in `tests/harness/` so it can be shared by smoke / debug stacks without polluting the test runner project.
+- Fixtures are separated from test classes to make the docker-stack boot pattern reusable.
+- All tests are xUnit (matches the SUT runtime; avoids a Python toolchain in CI).
+
+## Mock Services
+
+| Mock Service | Replaces | Endpoints | Behavior |
+|--------------|----------|-----------|----------|
+| `e2e-issuer` (Python `http.server`) | Admin's JWKS issuer | `GET /.well-known/jwks.json` (returns a 1-key JWKS for the in-stack ES256 public key) | Static for the lifetime of the docker-compose stack. Public key regenerates per `docker compose down -v` cycle. No test-time mutability needed — variant tokens (expired / wrong-iss / wrong-aud / `alg=HS256` forgery) are minted with overrides by the runner against the same private key (NFT-SEC-01..10 verifies the SUT rejects them). |
+
+There are no other mock services. All other infrastructure is real (Postgres 13, RabbitMQ 3.13 streams) — restrictions.md mandates "no mocking of internal services". External dependencies that *could* be mocked (admin sync worker, AI training consumer) are simply not run because the SUT does not initiate calls to them; it publishes to the stream and the stream is read by the test runner directly.
+
+### Mock Control API
+
+Not applicable for this suite. The mock issuer is static; behavior variation is performed by the runner minting different tokens. Broker / DB resilience is performed by `docker exec rabbitmq rabbitmqctl stop_app` and `docker restart postgres` invoked from the test runner — driven via .NET's `Process.Start` against the host docker socket bound into the runner container.
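As a rough illustration of the static `e2e-issuer` described above, the sketch below serves a one-key JWKS using Python's `http.server`. It is not the actual `mock_issuer.py`: the helper names (`b64url`, `build_jwks`, `make_handler`, `serve`) and the way the key coordinates are passed in are assumptions made for illustration, and key generation is left to `gen_keys.sh`.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWK members require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def build_jwks(x: bytes, y: bytes, kid: str) -> dict:
    """Build a one-key JWKS for a P-256 (ES256) public key.

    x and y are the 32-byte affine coordinates of the public point.
    """
    if len(x) != 32 or len(y) != 32:
        raise ValueError("P-256 coordinates must be 32 bytes each")
    key = {
        "kty": "EC", "crv": "P-256", "alg": "ES256", "use": "sig",
        "kid": kid, "x": b64url(x), "y": b64url(y),
    }
    return {"keys": [key]}


def make_handler(jwks: dict):
    """Return a request handler class that serves the given JWKS statically."""
    body = json.dumps(jwks).encode("utf-8")

    class JwksHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/.well-known/jwks.json":
                self.send_error(404)
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return JwksHandler


def serve(jwks: dict, port: int = 8080) -> None:
    """Block forever serving the JWKS (hypothetical entry point)."""
    HTTPServer(("0.0.0.0", port), make_handler(jwks)).serve_forever()
```

Because the document is built once at start-up and never mutated, this matches the "static for the lifetime of the stack" behavior: variant tokens come from the runner, not from the issuer.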
+
+## Docker Test Environment
+
+### docker-compose.test.yml structure
+
+| Service | Image / Build | Purpose | Depends On |
+|---------|---------------|---------|------------|
+| `postgres` | `postgres:13` | SUT's DB | — |
+| `rabbitmq` | `rabbitmq:3.13-management` + streams plugin | Stream broker | — |
+| `e2e-issuer` | `python:3.12-alpine` running `tests/harness/mock_issuer.py` | Mock JWKS issuer + key pair generator | — |
+| `annotations` | Built from `src/Dockerfile` | SUT | `postgres` (healthy), `rabbitmq` (healthy), `e2e-issuer` (healthy) |
+| `dataseed` | `postgres:13` (one-shot psql) | Loads `classes-baseline`, mission row, and the bulk rows for NFT-PERF-LIST-01 / NFT-PERF-DATASET-01 | `annotations` (healthy) |
+| `e2e-runner` | Built from `tests/Azaion.Annotations.E2E/Dockerfile` (.NET SDK 10.0) | Test runner (xUnit) | `dataseed` (completed_successfully) |
+
+### Networks and Volumes
+
+- **Network**: `e2e-net` (bridge, isolated). All services reach each other by service name.
+- **Volumes**:
+  - `pg-data` — Postgres durability across restart (resilience scenarios).
+  - `annotations-images`, `annotations-videos`, `annotations-deleted` — SUT file dirs.
+  - `jwt-keys` — ES256 keypair shared between `e2e-issuer` (writes public + serves JWKS) and `e2e-runner` (reads private key for token minting).
+- **Bind mount (read-only)**: `../detections/_docs/00_problem/input_data` → `/fixtures` in both the SUT and the runner.
+
+### Test runner host-docker access
+
+The runner needs to execute `docker exec rabbitmq rabbitmqctl stop_app` (NFT-RES-01..03) and `docker restart postgres` (NFT-RES-02..03). Solution: bind-mount the host docker socket into the runner (`/var/run/docker.sock:/var/run/docker.sock`) under a `RESILIENCE_DOCKER_SOCKET` env var; the `BrokerFixture` / `DbFixture` use it. This is gated to the test stack — the production SUT never mounts the docker socket.
+
+## Test Runner Configuration
+
+**Framework**: xUnit (matches SUT toolchain — .NET 10).
+**Plugins / NuGet refs**:
+- `Microsoft.NET.Test.Sdk` (test platform host for `dotnet test`)
+- `xunit` + `xunit.runner.visualstudio`
+- `RabbitMQ.Stream.Client` (same version as `src/Azaion.Annotations.csproj`)
+- `MessagePack` (same version) — to decode stream messages for FT-P-09
+- `System.IdentityModel.Tokens.Jwt` — for ES256 minting
+- `Npgsql` — for direct DB introspection assertions (read-only, documented per test)
+- `coverlet.collector` — for coverage collection (optional; this run has no coverage gate)
+- Deliberately NOT referenced: `Microsoft.AspNetCore.SignalR.Client`. SSE is plain HTTP `text/event-stream`, so the tests use `HttpClient` directly.
+
+**Entry point**: `dotnet test --logger "trx;LogFileName=results.trx" --results-directory /results` — followed by a tiny CSV-converter post-step in `Dockerfile`'s ENTRYPOINT that produces `/results/report.csv` from `results.trx`.
+
+### Fixture Strategy
+
+| Fixture | Scope | Purpose |
+|---------|-------|---------|
+| `DockerStackFixture` | Collection (one per assembly) | Smoke-pings `/health` and waits for JWKS fetch on boot. Does NOT bring the stack up — that's `docker compose up`'s job. |
+| `CleanStateFixture` | Class (per test class) | `TRUNCATE annotations, media, detection, annotations_queue_records RESTART IDENTITY CASCADE` via direct Postgres. Run before first test, again after last. |
+| `TokenMinter` | Singleton (within fixture lifetime) | Holds the ES256 private key (read from `/keys` mount) and exposes `MintToken(claim, overrides?)`. |
+| `BrokerFixture` | Per-test (only for resilience tests) | `StopBroker()`, `StartBroker()` via `docker exec`. Asserts pre/post state. |
+| `StreamConsumerFixture` | Per-test (only for stream-consumer tests) | Creates a fresh consumer name, starts at offset `next`, decodes MessagePack + gzip into typed events. |
+
+## Test Data Fixtures
+
+| Data Set | Source | Format | Used By |
+|----------|--------|--------|---------|
+| Image / video fixtures | Bind-mount `../detections/_docs/00_problem/input_data/` → `/fixtures` (read-only) | JPEG / MP4 binary | All FT-P-* and most FT-N-* |
+| `classes-baseline` (19 detection classes) | Auto-seeded by `DatabaseMigrator` on `annotations` first boot | DB rows | FT-P-14 (catalog read), every FT-P that references `class_num` |
+| `mission-test` GUID `00000000-0000-0000-0000-000000000aaa` | Inlined in request payloads | GUID | All annotation-create paths |
+| Synthetic JPEGs for NFT-RES-LIM-02 | Generated at test time by `LargePayloadFixture` (1, 10, 50, 100, 256, 512 MB) | binary | NFT-RES-LIM-02 |
+| Bulk rows for NFT-PERF-LIST-01 / NFT-PERF-DATASET-01 (10k annotations, 50k detections) | `dataseed/run.sh` SQL block | DB rows | NFT-PERF-LIST-01, NFT-PERF-DATASET-01 |
+| Per-test ES256 tokens | `TokenMinter` (in-process minting) | JWT | All FT-* requiring `Authorization` header and all NFT-SEC-* |
+
+### Data Isolation
+
+- **Per-class truncation** via `CleanStateFixture` (above).
+- **Per-test mission GUID** for SSE fan-out tests (FT-P-07, NFT-PERF-SSE-FANOUT-01).
+- **Per-test stream consumer name** for FT-P-09 and NFT-RES-06.
+- **Volume reset on `docker compose down -v`** — image / video dirs and the JWKS keypair regenerate.
+
+## Test Reporting
+
+**Format**: `.trx` (the VSTest results format written by the `trx` logger), converted to flat CSV by the runner.
+**CSV columns**: `test_id`, `test_name`, `category`, `traces_to`, `execution_time_ms`, `result`, `error_message`.
+**Output path**: `/results/report.csv` and `/results/results.trx` inside the runner; mounted to `./e2e-results/` on the host.
+
+`traces_to` is populated from a `[Trait("traces_to", "AC-F-01, HW-02")]` attribute on each test method — the converter reads the attribute and writes a comma-separated cell. This makes the resulting CSV self-describing for the traceability-matrix check at autodev Step 7 (Run Tests).
+
+## Acceptance Criteria
+
+**AC-1: Test environment starts**
+Given a clean clone of the repo on a host with Docker installed,
+When `./scripts/run-tests.sh` is executed (or equivalent `docker compose -f e2e/docker-compose.test.yml up`),
+Then `postgres`, `rabbitmq`, `e2e-issuer`, `annotations`, `dataseed`, and `e2e-runner` all start in dependency order, the `annotations` service reaches `healthy`, and the test runner begins discovery.
+
+**AC-2: Mock JWKS responds with the in-stack public key**
+Given the test environment is up,
+When `wget http://e2e-issuer:8080/.well-known/jwks.json` is executed from the `annotations` container,
+Then the response is a valid JWKS with exactly one ES256 key whose `kid` matches the private key shared with `e2e-runner`.
+
+**AC-3: Token minter mints a valid token end-to-end**
+Given the test environment is up and `TokenMinter.MintToken("ANN")` is invoked,
+When the resulting token is presented as `Authorization: Bearer ` on `POST /annotations` with a fixture payload,
+Then the SUT returns HTTP 200 (token validates against the JWKS-published public key).
+
+**AC-4: Truncation fixture isolates classes**
+Given two test classes that each create one annotation row,
+When both run within the same test session,
+Then each class observes an empty `annotations` table at start and the SUT keeps no cross-class state.
+
+**AC-5: CSV report generated with required columns**
+Given a test session has completed,
+When the runner exits,
+Then `./e2e-results/report.csv` exists on the host and contains the columns: `test_id`, `test_name`, `category`, `traces_to`, `execution_time_ms`, `result`, `error_message`.
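The TRX-to-CSV post-step described under Test Reporting might look roughly like the following. This is a sketch under stated assumptions, not the converter the task will ship: the spec does not fix the converter's language, and the sketch does not try to read `[Trait]` values out of the TRX file. It takes them from a hypothetical `traits` lookup keyed by test name instead, and `trx_to_csv` and `duration_to_ms` are illustrative names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# XML namespace used by TRX result files.
TRX_NS = {"t": "http://microsoft.com/schemas/VisualStudio/TeamTest/2010"}
COLUMNS = ["test_id", "test_name", "category", "traces_to",
           "execution_time_ms", "result", "error_message"]


def duration_to_ms(duration: str) -> int:
    """Convert a TRX duration such as '00:00:01.5000000' to milliseconds."""
    hours, minutes, seconds = duration.split(":")
    total_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return round(total_seconds * 1000)


def trx_to_csv(trx_xml: str, traits: dict) -> str:
    """Flatten the UnitTestResult elements of a TRX document into CSV text.

    traits maps test name -> {"category": ..., "traces_to": ...}; it is an
    assumed side input standing in for the [Trait] attributes.
    """
    root = ET.fromstring(trx_xml)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for result in root.iterfind(".//t:UnitTestResult", TRX_NS):
        name = result.get("testName", "")
        meta = traits.get(name, {})
        message = result.findtext(
            "t:Output/t:ErrorInfo/t:Message", default="", namespaces=TRX_NS)
        writer.writerow({
            "test_id": result.get("testId", ""),
            "test_name": name,
            "category": meta.get("category", ""),
            "traces_to": meta.get("traces_to", ""),
            "execution_time_ms": duration_to_ms(
                result.get("duration", "00:00:00")),
            "result": result.get("outcome", ""),
            "error_message": message,
        })
    return out.getvalue()
```

A multi-value `traces_to` cell such as `AC-F-01, HW-02` contains a comma, so the CSV writer quotes it; consumers of `report.csv` should use a real CSV parser rather than splitting on commas.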
+
+**AC-6: Resilience helpers work**
+Given the test environment is up,
+When `BrokerFixture.StopBroker()` is invoked from a test,
+Then `docker exec rabbitmq rabbitmqctl stop_app` succeeds and `BrokerFixture.StartBroker()` reverses it within 5 s; the SUT recovers (subsequent `POST /annotations` returns 200) within the documented backoff window.
+
+## Constraints
+
+- `restrictions.md` SW-01: .NET 10 toolchain only — test runner pins `Microsoft.NET.Test.Sdk` to the version compatible with .NET 10.
+- `restrictions.md` HW-01: ARM64-only — the e2e-runner Dockerfile uses `mcr.microsoft.com/dotnet/sdk:10.0` which is multi-arch.
+- `restrictions.md` ENV-02: no in-image TLS — the test stack uses plain HTTP; the JWKS HTTPS gate (AZ-561) is satisfied by `ASPNETCORE_ENVIRONMENT=E2ETest`.
+- Every test must use the Arrange / Act / Assert pattern with `// Arrange`, `// Act`, `// Assert` comments (per `coderule.mdc`).
+- No mocks for internal services (`AnnotationService`, `FailsafeProducer`, etc.) — every test exercises the real public surface.
+- No direct writes to the SUT's tables from the runner. Read-only DB access is allowed only for blackbox-documented assertions (outbox row count, queue depth) and must be marked with a `[Trait("db_access", "read-only")]` attribute.
+
+## Risks & Mitigation
+
+**Risk 1: Docker socket bind exposes too much**
+- *Risk*: Mounting `/var/run/docker.sock` into the runner gives it root-equivalent access to the host. Acceptable in CI runners; risky on developer laptops.
+- *Mitigation*: The socket bind is in `docker-compose.test.yml`'s `e2e-runner` block only (not the SUT). Document that the test stack assumes a CI-like or isolated dev environment. `restrictions.md` does not forbid this.
+
+**Risk 2: JWKS keypair freshness**
+- *Risk*: A stale keypair lingering in the `jwt-keys` volume could cause cryptic JWKS failures between test runs.
+- *Mitigation*: `mock_issuer.py` regenerates the keypair on every container start if `gen_keys.sh` has not been run in the current container lifetime. `docker compose down -v` between full runs guarantees a fresh key.
+
+**Risk 3: Bulk seed slows boot**
+- *Risk*: 10k annotation rows + 50k detection rows in `dataseed` could push boot from ~5 s to ~30 s.
+- *Mitigation*: Bulk insert uses `CROSS JOIN generate_series` and a single `COPY FROM STDIN` so the seed completes in <10 s on local hardware. NFT-PERF tests already document a separate boot allowance; functional tests do not depend on the perf seed and run independently if the seed is split into a profile-gated step.
+
+## Self-Verification
+
+- [x] Every external dependency from `environment.md` has a mock service defined OR an explicit "real service used" justification (real Postgres, real Rabbit, mock issuer only).
+- [x] Docker Compose structure covers all services from `environment.md`.
+- [x] Test data fixtures cover all seed data sets from `test-data.md` (tokens-test, mission-test, classes-baseline, clean-state, runtime-generated big payloads, bulk-perf rows).
+- [x] Test runner configuration matches SUT tech stack (.NET 10, xUnit, RabbitMQ.Stream.Client at the same NuGet version).
+- [x] Data isolation strategy is defined (per-class truncate, per-test mission/consumer/token).
diff --git a/_docs/02_tasks/todo/AZ-565_test_annotations_rest_positive.md b/_docs/02_tasks/todo/AZ-565_test_annotations_rest_positive.md
new file mode 100644
index 0000000..d2944f9
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-565_test_annotations_rest_positive.md
@@ -0,0 +1,53 @@
+# Annotations REST positive tests
+
+**Task**: AZ-565
+**Name**: Annotations REST positive flow tests
+**Description**: Implement xUnit tests for FT-P-01..06 — annotation create (small / empty / dense), idempotency on identical re-POST, paginated listing, detail-by-id. The core happy-path surface of the annotations REST API.
+**Complexity**: 5 points
+**Dependencies**: AZ-564 (test infrastructure)
+**Component**: Blackbox Tests → Annotations REST
+**Tracker**: jira
+**Epic**: AZ-563
+
+## Scenarios Covered
+
+| Test ID | Source | What it asserts |
+|---------|--------|-----------------|
+| FT-P-01 | `_docs/02_document/tests/blackbox-tests.md` | Annotation create — single detection, small image. HTTP 200, AnnotationDto, 32-char hex id, label file on disk. |
+| FT-P-02 | same | Idempotency on identical re-POST. Same id, no new DB row. |
+| FT-P-03 | same | Empty scene, 0 detections. HTTP 200; no label file or zero-line label file (per Spec). |
+| FT-P-04 | same | Dense scene, 5 mixed-class detections. HTTP 200; YOLO label file has 5 lines, class numbers from `classes-baseline`. |
+| FT-P-05 | same | Paginated listing — `GET /annotations?missionId=…&offset=&limit=`. PaginatedResponse envelope; ordering deterministic. |
+| FT-P-06 | same | Detail by id. `GET /annotations/{id}`. Full DTO including detections. |
+
+## System Under Test Boundary
+
+- Tests MUST drive the system through `http://annotations:8080/annotations` HTTP only. No in-process imports of `Azaion.Annotations.*`.
+- Stubs are NOT allowed for `AnnotationService`, `MediaService`, `PathResolver`, the hash function, the migrator, or the SUT's DB schema. The test exercises the real production code path end to end.
+- Read-only DB introspection is allowed only for asserting that the label file row exists in the `annotations` table (FT-P-01 step 4). Marked with `[Trait("db_access", "read-only")]`. No writes.
+- Read-only filesystem introspection on `annotations-images` volume is allowed only for asserting the label file contents (FT-P-01, FT-P-04). The test mounts the volume read-only.
+- Outputs are compared against `_docs/00_problem/input_data/expected_results/results_report.md` row F1-001 / F1-002 / F1-003 / F1-004 / F1-005 (regex on id, exact on detections.length, file_content on label files).
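The comparison methods named above (regex on id, exact line count on label files) can be pictured with two small helpers. The real tests are xUnit; this sketch uses Python only for brevity, and it rests on assumptions the spec does not state: that the 32-char hex id may be upper or lower case, and that label lines follow the common YOLO text layout `class cx cy w h`. No range check is applied to the box values, since FT-N-05 documents that out-of-range values are currently accepted.

```python
import re

# Assumption: a 32-character hexadecimal id, either case accepted.
ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def is_annotation_id(value: str) -> bool:
    """Return True when the whole string is a 32-char hex id."""
    return ID_PATTERN.fullmatch(value) is not None


def parse_yolo_labels(text: str) -> list:
    """Parse label-file text into (class_num, cx, cy, w, h) tuples.

    Blank lines are skipped; any other line must have exactly five fields.
    """
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        parts = line.split()
        if len(parts) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 fields, got {len(parts)}")
        class_num = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
        rows.append((class_num, cx, cy, w, h))
    return rows
```

With helpers like these, FT-P-04 reduces to checking that `parse_yolo_labels` returns five rows whose class numbers all appear in `classes-baseline`, and FT-P-03 to checking that it returns none.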
+
+## Acceptance Criteria
+
+**AC-1: Every scenario passes per its spec**
+Given the e2e stack is up and clean,
+When the runner executes each FT-P-01..06 test exactly as documented,
+Then each test reports PASS against the comparison method and tolerance in `results_report.md`.
+
+**AC-2: Tests are deterministic across re-runs**
+Given two consecutive runs of FT-P-01..06,
+When both complete successfully,
+Then the assertion outcomes are identical (same ids, same response shapes, same DB rows / label files).
+
+## Non-Functional Requirements
+
+- **Performance**: Each test in this batch completes in ≤5 s on the documented hardware; total batch runs in ≤30 s (no perf gates here — those are in T11, the performance task AZ-574).
+- **Reliability**: Tests use `CleanStateFixture` to isolate state; no carry-over between tests in the class.
+
+## Constraints
+
+- Use AAA pattern with `// Arrange`, `// Act`, `// Assert` comments per `coderule.mdc`.
+- Token minting via `TokenMinter.MintToken("ANN")` for every test in this batch.
+- `[Trait("traces_to", "AC-F-01, AC-F-02, AC-F-03, AC-F-04, HW-02")]` (or the per-test subset) on every test method.
+- One xUnit test class per scenario file or per closely-related group (e.g., `AnnotationCreateTests`, `AnnotationListingTests`).
diff --git a/_docs/02_tasks/todo/AZ-566_test_realtime_outbox_positive.md b/_docs/02_tasks/todo/AZ-566_test_realtime_outbox_positive.md
new file mode 100644
index 0000000..045627a
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-566_test_realtime_outbox_positive.md
@@ -0,0 +1,51 @@
+# Realtime + outbox positive tests
+
+**Task**: AZ-566
+**Name**: Realtime + outbox positive tests
+**Description**: Implement xUnit tests for FT-P-07..09 (SSE delivery, outbox row, stream message round-trip) plus the two RB-01-gated lifecycle tests FT-P-21/FT-P-22 (authored as `Skip(Reason="awaiting RB-01")` per the test-spec convention).
+**Complexity**: 5 points
+**Dependencies**: AZ-564 (test infrastructure)
+**Component**: Blackbox Tests → Realtime + Outbox
+**Tracker**: jira
+**Epic**: AZ-563
+
+## Scenarios Covered
+
+| Test ID | Source | What it asserts |
+|---------|--------|-----------------|
+| FT-P-07 | `_docs/02_document/tests/blackbox-tests.md` | SSE event for new annotation. Latency ≤ 1 s. No backfill (assertion in step 2). |
+| FT-P-08 | same | Outbox row inserted on annotation create. Direct DB SELECT. |
+| FT-P-09 | same | Stream message round-trip. Decode MessagePack + gzip; assert payload schema. |
+| FT-P-21 | same `[after RB-01]` | Lifecycle event on annotation update (Skipped — awaiting RB-01). |
+| FT-P-22 | same `[after RB-01]` | Lifecycle event on delete + soft-delete file relocation (Skipped — awaiting RB-01). |
+
+## System Under Test Boundary
+
+- SSE: connect via `HttpClient` with `Accept: text/event-stream` and `Authorization: Bearer …`. No stubbing of `AnnotationEventService` or its `Channel`.
+- Outbox: read-only DB query on `annotations_queue_records` table after the create call. `[Trait("db_access", "read-only")]`.
+- Stream: connect to `rabbitmq:5552` via `RabbitMQ.Stream.Client` with a fresh consumer name starting at offset `next`. Decode payload using the same MessagePack + gzip pipeline the SUT uses. No stubbing of `FailsafeProducer` or `RabbitMqConfig`.
+- Compare against `results_report.md` row F3-001 (latency_threshold_max), F4-001 (outbox row content), F4-002 (decoded stream message).
+
+## Acceptance Criteria
+
+**AC-1: Every active scenario passes per its spec.**
+Given the e2e stack is up,
+When FT-P-07, FT-P-08, FT-P-09 are executed,
+Then each reports PASS within its tolerance (FT-P-07 ≤ 1 s latency; FT-P-08 row exists with expected `operation=Created`; FT-P-09 MessagePack payload matches expected schema).
+
+**AC-2: FT-P-21 and FT-P-22 are authored as skipped tests with the documented reason.**
+Given the test discovery,
+When the runner enumerates tests,
+Then FT-P-21 and FT-P-22 appear in the report with `result=Skipped` and `error_message="awaiting RB-01"` (or equivalent). They auto-enable when the cycle-update test-spec mode flips them to active.
+
+**AC-3: SSE no-backfill assertion**
+Given a subscriber that connects AFTER an annotation has been created,
+When the subscriber waits 1 s,
+Then no event is received for the pre-connection annotation. (FT-P-07 step 2; also satisfies AC-F-11.)
+
+## Constraints
+
+- AAA pattern with `// Arrange`, `// Act`, `// Assert` per `coderule.mdc`.
+- `[Trait("traces_to", "AC-F-05, AC-F-10, AC-F-11, AC-F-12, SW-03, SW-04")]` (per-test subset).
+- Stream consumer must be torn down after each test to avoid offset leakage.
+- SSE client must be cancelled cleanly after each test (no zombie connections).
diff --git a/_docs/02_tasks/todo/AZ-567_test_media_dataset_positive.md b/_docs/02_tasks/todo/AZ-567_test_media_dataset_positive.md
new file mode 100644
index 0000000..2f804ed
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-567_test_media_dataset_positive.md
@@ -0,0 +1,49 @@
+# Media + Dataset positive tests
+
+**Task**: AZ-567
+**Name**: Media + Dataset positive tests
+**Description**: Implement xUnit tests for media single/batch upload (FT-P-10, FT-P-11), classes catalog read (FT-P-14), directory settings invariant (FT-P-15), dataset filter / class distribution / bulk status (FT-P-16, FT-P-17, FT-P-18).
+**Complexity**: 5 points
+**Dependencies**: AZ-564 (test infrastructure)
+**Component**: Blackbox Tests → Media, Settings, Dataset
+**Tracker**: jira
+**Epic**: AZ-563
+
+## Scenarios Covered
+
+| Test ID | Source | What it asserts |
+|---------|--------|-----------------|
+| FT-P-10 | `_docs/02_document/tests/blackbox-tests.md` | Single media upload. 200 + DTO; file lives in `images_dir`. |
+| FT-P-11 | same | Batch media upload. All rows accepted; correct ids returned. |
+| FT-P-14 | same | `GET /classes` returns 19 rows from `classes-baseline`. |
+| FT-P-15 | same | `PUT /settings/directories` triggers `PathResolver.Reset()`. Subsequent uploads land in the new root. |
+| FT-P-16 | same | `GET /dataset?status=…` filter. Result set matches DB state. |
+| FT-P-17 | same | Dataset class distribution. Sums match raw counts. |
+| FT-P-18 | same | `POST /dataset/status/bulk`. Transitions exactly the listed ids; non-listed ids untouched. |
+
+## System Under Test Boundary
+
+- Drive via HTTP. No imports.
+- No stubbing of `MediaService`, `DatasetService`, `SettingsService`, `ClassesController`, `PathResolver`.
+- FT-P-15 requires direct write to the SUT's `images_dir` volume only to seed pre-existing files for the post-Reset assertion. Marked with `[Trait("fs_access", "write-to-image-dir")]` and only allowed for this specific test per the System Under Test Boundary rule.
+- Compare against `results_report.md` rows F5-001, F5-002, F7-001, F6-002 etc.
+
+## Acceptance Criteria
+
+**AC-1: Every scenario passes per its spec.** Given the stack is up, when each FT-P-10..18 test runs, then each reports PASS within tolerance.
+
+**AC-2: FT-P-15 invariant holds across PUT**
+Given an annotation was created under the original `images_dir`,
+When `PUT /settings/directories` changes the root and a new annotation is created,
+Then the new annotation's label file lives under the new root and the previous file is untouched (no migration).
+
+## Constraints
+
+- AAA pattern, `// Arrange`/`// Act`/`// Assert` comments.
+- Token policy varies per endpoint:
+  - `POST /media` → `ANN`
+  - `GET /classes` → any authenticated
+  - `PUT /settings/*` → `ADM`
+  - `GET /dataset` → `DATASET`
+  - `POST /dataset/status/bulk` → `DATASET`
+- `[Trait("traces_to", "AC-F-20, AC-F-21, AC-F-30, AC-F-31, AC-F-40, AC-F-41, HW-02")]` (per-test subset).
diff --git a/_docs/02_tasks/todo/AZ-568_test_auth_health_migrator_positive.md b/_docs/02_tasks/todo/AZ-568_test_auth_health_migrator_positive.md
new file mode 100644
index 0000000..b162a23
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-568_test_auth_health_migrator_positive.md
@@ -0,0 +1,35 @@
+# Auth + Health + Migrator positive tests
+
+**Task**: AZ-568
+**Name**: Auth + Health + Migrator positive tests
+**Description**: Implement xUnit tests for the bearer-token happy path (FT-P-12), alg pinning happy path (FT-P-13), health endpoint (FT-P-19), and migrator idempotence (FT-P-20).
+**Complexity**: 2 points
+**Dependencies**: AZ-564 (test infrastructure)
+**Component**: Blackbox Tests → Auth + Health + Migrator
+**Tracker**: jira
+**Epic**: AZ-563
+
+## Scenarios Covered
+
+| Test ID | Source | What it asserts |
+|---------|--------|-----------------|
+| FT-P-12 | `_docs/02_document/tests/blackbox-tests.md` | Bearer token happy path. ES256 token with valid `iss`/`aud`/`exp` + `ANN` claim is accepted. |
+| FT-P-13 | same | Alg pinning happy path — token signed with ES256 and `alg=ES256` header is accepted. (Negative variant `alg=HS256` is covered by NFT-SEC-10.) |
+| FT-P-19 | same | `GET /health` returns 200 OK without auth. Anonymous. |
+| FT-P-20 | same | Migrator idempotence — drop the database, recreate it twice via `docker restart annotations` and assert no errors. |
+
+## System Under Test Boundary
+
+- HTTP only. No imports.
+- FT-P-20 requires `docker restart annotations` from the runner (uses `BrokerFixture` pattern but renamed `SutRestartFixture`). DB state preserved across restart (via `pg-data` volume).
+- Compare against `results_report.md` row F8-001 (health), F8-002 (token validation succeeds).
+
+## Acceptance Criteria
+
+**AC-1: Every scenario passes per its spec.** Given the stack is up, when each FT-P-12, FT-P-13, FT-P-19, FT-P-20 test runs, then each reports PASS within tolerance.
+
+## Constraints
+
+- AAA pattern.
+- FT-P-19 uses no auth header (verifies `/health` is anonymous and disables the auth pipeline for this endpoint). +- `[Trait("traces_to", "AC-F-50, AC-F-54, AC-N-01, AC-N-02, SW-05, OP-05")]` (per-test subset). diff --git a/_docs/02_tasks/todo/AZ-569_test_validation_envelope_negative.md b/_docs/02_tasks/todo/AZ-569_test_validation_envelope_negative.md new file mode 100644 index 0000000..e495dc5 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-569_test_validation_envelope_negative.md @@ -0,0 +1,43 @@ +# Validation + error-envelope negative tests + +**Task**: AZ-569 +**Name**: Validation + error-envelope negative tests +**Description**: Implement xUnit tests for FT-N-01, FT-N-02, FT-N-05, FT-N-06, FT-N-07, FT-N-14, FT-N-16. Cover input validation failures, lenient-bbox behaviour, unknown ids, unknown missions, missing waypoint, empty bulk list. Each test asserts the documented error envelope shape. +**Complexity**: 3 points +**Dependencies**: AZ-564 (test infrastructure) +**Component**: Blackbox Tests → Validation +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| FT-N-01 | `_docs/02_document/tests/blackbox-tests.md` | `POST /annotations` without `image_bytes`. HTTP 400/422; error envelope. | +| FT-N-02 | same | `POST /annotations` without `mediaType`. HTTP 400/422; error envelope. | +| FT-N-05 | same | Out-of-range bbox value — lenient behavior today (HTTP 200). Test pins that observed behavior; flagged as SEC-05 in security-tests.md. | +| FT-N-06 | same | `GET /annotations/{nonexistent_id}`. HTTP 404; error envelope. | +| FT-N-07 | same | Filter by unknown mission — returns empty page (not 404). | +| FT-N-14 | same | Media upload missing `waypoint_id`. HTTP 400/422. | +| FT-N-16 | same | `POST /dataset/status/bulk` with empty list. HTTP 400; error envelope. | + +## System Under Test Boundary + +- HTTP only. +- No stubbing. 
+- Every test asserts the error envelope shape against the contract in `_docs/02_document/common-helpers/01_http-error-envelope.md` and the global invariant AC-F-53. + +## Acceptance Criteria + +**AC-1: Every scenario produces the documented HTTP status + error envelope.** + +**AC-2: FT-N-05 pins the current lenient behavior and is tagged as SEC-05 follow-up.** +Given an annotation with a bbox value of `1.5` or `-0.1`, +When `POST /annotations` is called, +Then HTTP 200 is returned today (lenient). The test is tagged `[Trait("known_lenient", "true")]`. When SEC-05 lands, the test flips to expect 400; the test-spec cycle-update handles that change. + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-F-04, AC-F-53")]` plus per-test specific traces. +- Token policy: most tests use `ANN`; FT-N-16 uses `DATASET`. diff --git a/_docs/02_tasks/todo/AZ-570_test_authorization_negative.md b/_docs/02_tasks/todo/AZ-570_test_authorization_negative.md new file mode 100644 index 0000000..739a3d6 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-570_test_authorization_negative.md @@ -0,0 +1,44 @@ +# Authorization negative tests + +**Task**: AZ-570 +**Name**: Authorization negative tests +**Description**: Implement xUnit tests for the 9 authorization-failure scenarios in `blackbox-tests.md`: wrong policy, missing token, expired token, wrong issuer, wrong audience, SSE without auth, settings without ADM, directories without ADM, media without ANN. +**Complexity**: 3 points +**Dependencies**: AZ-564 (test infrastructure) +**Component**: Blackbox Tests → Authorization +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| FT-N-03 | `_docs/02_document/tests/blackbox-tests.md` | `POST /annotations` without `ANN` policy. HTTP 403. | +| FT-N-04 | same | `POST /annotations` unauthenticated. HTTP 401. | +| FT-N-08 | same | `GET /annotations/events` (SSE) without auth. HTTP 401.
| +| FT-N-09 | same | Bearer token expired. HTTP 401. | +| FT-N-10 | same | Bearer token wrong issuer. HTTP 401. | +| FT-N-11 | same | Bearer token wrong audience. HTTP 401. | +| FT-N-12 | same | Mutating settings without `ADM`. HTTP 403. | +| FT-N-13 | same | `PUT /settings/directories` without `ADM`. HTTP 403. | +| FT-N-15 | same | Media upload without `ANN`. HTTP 403. | + +## System Under Test Boundary + +- HTTP only. +- Token variants minted via `TokenMinter.MintToken(claim, overrides)` with `overrides` covering: expired, wrong-iss, wrong-aud. +- Cross-policy tests use a token minted with a different claim than the endpoint requires (e.g., a `DATASET` token on `POST /annotations`). + +## Acceptance Criteria + +**AC-1: Every scenario produces the documented HTTP status + error envelope.** + +**AC-2: 401 vs 403 distinction is preserved.** +- Missing / invalid token → 401 (authentication failed). +- Valid token, wrong policy → 403 (authorization failed). + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-F-50, AC-F-52, SW-05")]` plus per-test specific traces. +- Tests must not retry on 401/403 — single request, single assertion. diff --git a/_docs/02_tasks/todo/AZ-571_test_security.md b/_docs/02_tasks/todo/AZ-571_test_security.md new file mode 100644 index 0000000..92531cb --- /dev/null +++ b/_docs/02_tasks/todo/AZ-571_test_security.md @@ -0,0 +1,52 @@ +# Security tests (NFT-SEC-01..10) + +**Task**: AZ-571 +**Name**: Security tests +**Description**: Implement xUnit tests for all 10 security scenarios: JWT signature mismatch, expired, cross-policy DATASET/ANN, anonymous-access denials, error envelope no-stack-leak, path traversal in image/thumbnail GETs, token claim tampering, CORS preflight, alg-confusion `alg=HS256` forgery. 
+**Complexity**: 5 points +**Dependencies**: AZ-564 (test infrastructure) +**Component**: Blackbox Tests → Security +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| NFT-SEC-01 | `_docs/02_document/tests/security-tests.md` | JWT signed with key NOT in JWKS. HTTP 401. | +| NFT-SEC-02 | same | JWT expired. HTTP 401. | +| NFT-SEC-03 | same | DATASET token → `POST /annotations`. HTTP 403. | +| NFT-SEC-04 | same | ANN token → `PUT /settings/*`. HTTP 403. | +| NFT-SEC-05 | same | Anonymous access to non-public endpoints. Only `/health` is anonymous; everything else returns 401 without auth. | +| NFT-SEC-06 | same | Error envelope under Production env mode does NOT leak stack traces. | +| NFT-SEC-07 | same | Path traversal in image / thumbnail GET routes. `../etc/passwd` style payloads return 400/404, never 200 with foreign content. | +| NFT-SEC-08 | same | Token claim modification (signature breaks). HTTP 401. | +| NFT-SEC-09 | same | CORS preflight respects `CorsConfig:AllowedOrigins` allow-list under Production. | +| NFT-SEC-10 | same | Algorithm confusion — token forged with `alg=HS256` using the published ES256 public key as the HMAC secret. HTTP 401. | + +## System Under Test Boundary + +- HTTP only. +- Token variants minted via `TokenMinter.MintToken(claim, overrides)`. +- NFT-SEC-06 requires the SUT to be re-booted with `ASPNETCORE_ENVIRONMENT=Production` (and a Production-safe CORS config). This is a separate compose profile or test class with its own `SutRestartFixture`. +- NFT-SEC-09 requires a second SUT boot under Production with `CorsConfig__AllowedOrigins__0: https://app.azaion.local`. Asserts ACAO is exactly that one origin. 
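
The NFT-SEC-10 forgery listed in the table above can be sketched without any JWT library. This is a minimal illustration, not the project's `TokenMinter` API: the `AlgConfusionSketch` class, the `ForgeHs256Token` name, and the choice to use the PEM text of the public key as the HMAC secret are all assumptions. A real attacker would use whatever byte form the verifier loads the key in.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; not part of the project's TokenMinter.
public static class AlgConfusionSketch
{
    // Base64url without padding, as used for JWS segments (RFC 7515).
    private static string Base64Url(byte[] bytes) =>
        Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    // Builds an HS256 token whose HMAC secret is the text of the published public key.
    public static string ForgeHs256Token(string publicKeyPem, string payloadJson)
    {
        const string headerJson = "{\"alg\":\"HS256\",\"typ\":\"JWT\"}";
        string signingInput =
            Base64Url(Encoding.UTF8.GetBytes(headerJson)) + "." +
            Base64Url(Encoding.UTF8.GetBytes(payloadJson));

        // Assumption: the attacker feeds the PEM text to HMAC as-is.
        using var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(publicKeyPem));
        byte[] signature = hmac.ComputeHash(Encoding.ASCII.GetBytes(signingInput));
        return signingInput + "." + Base64Url(signature);
    }
}
```

A test would send the returned string as `Authorization: Bearer <token>` on `POST /annotations` and expect HTTP 401, since a verifier pinned to ES256 must reject the `alg=HS256` header before checking any signature.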
+ +## Acceptance Criteria + +**AC-1: Every scenario passes per its spec.** + +**AC-2: NFT-SEC-10 explicitly verifies algorithm pinning** +Given a token forged with `alg=HS256` and the published ES256 public key as the HMAC secret, +When the runner presents it to `POST /annotations`, +Then HTTP 401 is returned and the `WWW-Authenticate` response header contains `Bearer error="invalid_token"`. + +**AC-3: NFT-SEC-06 verifies no stack leak** +Given `ASPNETCORE_ENVIRONMENT=Production`, +When a request triggers a 500-class error, +Then the response body's error envelope contains only the safe error code and message: no `stackTrace`, no `innerException`, no file paths. + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-F-50, AC-F-51, AC-F-52, SW-05, ENV-06")]` plus per-test specific traces. +- Production-env tests run in a dedicated test class with its own fixture (no leak between Production and E2ETest boots). diff --git a/_docs/02_tasks/todo/AZ-572_test_resilience.md b/_docs/02_tasks/todo/AZ-572_test_resilience.md new file mode 100644 index 0000000..1a39cd8 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-572_test_resilience.md @@ -0,0 +1,49 @@ +# Resilience tests (NFT-RES-01..06) + +**Task**: AZ-572 +**Name**: Resilience tests +**Description**: Implement xUnit tests for the 6 resilience scenarios: RabbitMQ outage during create, Postgres restart, Postgres unreachable, SSE subscriber disconnect mid-stream, `FailsafeProducer` empty-catch path, stream consumer reconnect. +**Complexity**: 5 points +**Dependencies**: AZ-564 (test infrastructure) +**Component**: Blackbox Tests → Resilience +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| NFT-RES-01 | `_docs/02_document/tests/resilience-tests.md` | RabbitMQ outage during create. `POST /annotations` returns 200; outbox row stays; on broker recovery, message is delivered.
| +| NFT-RES-02 | same | Postgres restart between writes. `POST /annotations` after restart succeeds without errors. | +| NFT-RES-03 | same | Postgres unreachable during create. `POST /annotations` returns 5xx; error envelope; no partial state. | +| NFT-RES-04 | same | SSE subscriber disconnects mid-stream. Server tears down channel cleanly; no zombie subscriptions; per-instance state cleanup. | +| NFT-RES-05 | same | Repeated FailsafeProducer empty-catch path (catch{} swallowing IOException). Drain loop survives missing image file; no crash. | +| NFT-RES-06 | same | Stream consumer reconnect. After broker restart, consumer resumes from offset and reads the same messages. | + +## System Under Test Boundary + +- HTTP only for SUT invocations. +- `BrokerFixture.StopBroker()` / `StartBroker()` for NFT-RES-01, NFT-RES-06. +- `docker exec postgres pg_ctl stop` / `start` (or `docker restart postgres`) for NFT-RES-02, NFT-RES-03. +- NFT-RES-05 deliberately removes a specific image file from `annotations-images` volume (out-of-band, runner-only) to trigger the empty-catch path. Marked with `[Trait("fs_access", "delete-image-file")]`. +- Long-running scenarios (NFT-RES-06 reconnect window) use a `[Fact(Timeout = 60000)]` cap. + +## Acceptance Criteria + +**AC-1: Every scenario passes per its spec.** + +**AC-2: SUT never crashes during the test class** +Given any resilience test in the class is running, +When the runner asserts the test's post-condition, +Then the SUT's `/health` endpoint still returns 200 (the SUT survives every external failure). + +**AC-3: NFT-RES-01 verifies stream delivery on recovery** +Given the broker was stopped before a `POST /annotations` and restarted after, +When `BrokerFixture.StartBroker()` returns, +Then the stream consumer reads the queued message within 30 s of broker recovery. + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-F-04, AC-F-12, AC-N-04, SW-03")]` plus per-test specific traces. 
+- Resilience tests are long; group them in their own xUnit collection to avoid blocking the fast suite. diff --git a/_docs/02_tasks/todo/AZ-573_test_resource_limits.md b/_docs/02_tasks/todo/AZ-573_test_resource_limits.md new file mode 100644 index 0000000..369ec7d --- /dev/null +++ b/_docs/02_tasks/todo/AZ-573_test_resource_limits.md @@ -0,0 +1,49 @@ +# Resource-limit tests (NFT-RES-LIM-01..06) + +**Task**: AZ-573 +**Name**: Resource-limit tests +**Description**: Implement xUnit tests for the 6 resource-limit scenarios: sustained-load process memory, single-file upload boundary, outbox depth under broker outage, disk usage by `images_dir`, concurrent SSE subscribers, migration on cold-start cost. +**Complexity**: 3 points +**Dependencies**: AZ-564 (test infrastructure) +**Component**: Blackbox Tests → Resource Limits +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| NFT-RES-LIM-01 | `_docs/02_document/tests/resource-limit-tests.md` | Sustained-load process memory stays within configured envelope. | +| NFT-RES-LIM-02 | same | Single-file upload boundary — 1, 10, 50, 100, 256, 512 MB. Uses `LargePayloadFixture` synthetic JPEGs. | +| NFT-RES-LIM-03 | same | Outbox queue depth bounded under broker outage. Depth never exceeds documented ceiling for ≥ 30 min run. | +| NFT-RES-LIM-04 | same | Disk usage by `images_dir` over many distinct uploads. Stays under documented HW-02 budget. | +| NFT-RES-LIM-05 | same | Concurrent SSE subscribers — process-memory boundary. N concurrent subscribers don't push memory past envelope. | +| NFT-RES-LIM-06 | same | Migration on cold-start cost. Time-to-`/health=200` from cold start within the documented boot budget. | + +## System Under Test Boundary + +- HTTP only. +- Memory + disk metrics read from `docker stats` (out-of-band, runner-only). Marked `[Trait("docker_stats", "true")]`. 
+- NFT-RES-LIM-02 uses `LargePayloadFixture` to generate synthetic JPEGs at runtime; the files are never committed to the repo. +- NFT-RES-LIM-03 is long-running: the nightly profile runs the full 30 min, the standard profile runs a 5-min smoke proxy. +- NFT-RES-LIM-05 spawns N parallel SSE subscribers via `Parallel.For` + per-subscriber `HttpClient`. + +## Acceptance Criteria + +**AC-1: Every scenario passes per its spec.** + +**AC-2: Smoke vs nightly profile distinction** +Given a profile environment variable `E2E_RUN_PROFILE=functional` (default), +When NFT-RES-LIM-03 runs, +Then it executes a 5-min smoke proxy (not the 30-min full run); under `E2E_RUN_PROFILE=performance`, it runs the full 30 min. + +**AC-3: Memory + disk readings have measurement uncertainty noted** +Given `docker stats` is the measurement source, +When the test records a memory or disk reading, +Then the result includes a tolerance margin (e.g., ± 50 MB for memory, ± 100 MB for disk) per the documented `results_report.md` tolerance. + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-N-03, AC-N-05, HW-02, HW-03")]` plus per-test specific traces. +- Long-running tests set `[Fact(Timeout = ?)]` per the documented duration so they never hang the runner. diff --git a/_docs/02_tasks/todo/AZ-574_test_performance.md b/_docs/02_tasks/todo/AZ-574_test_performance.md new file mode 100644 index 0000000..e140b4c --- /dev/null +++ b/_docs/02_tasks/todo/AZ-574_test_performance.md @@ -0,0 +1,50 @@ +# Performance tests (NFT-PERF-*) + +**Task**: AZ-574 +**Name**: Performance tests +**Description**: Implement xUnit tests for the 7 performance scenarios: annotation create p95 latency (small + large), sustained writes throughput, FailsafeProducer drain rate, SSE delivery latency under fan-out, annotation listing at scale, dataset class distribution at scale.
+**Complexity**: 3 points +**Dependencies**: AZ-564 (test infrastructure; depends on dataseed populating the 10k/50k bulk rows for the "at scale" tests) +**Component**: Blackbox Tests → Performance +**Tracker**: jira +**Epic**: AZ-563 + +## Scenarios Covered + +| Test ID | Source | What it asserts | +|---------|--------|-----------------| +| NFT-PERF-LATENCY-01 | `_docs/02_document/tests/performance-tests.md` | `POST /annotations` p95 latency — small image (image_small.jpg) ≤ documented threshold (≤ 1500 ms per spec). | +| NFT-PERF-LATENCY-02 | same | `POST /annotations` p95 latency — large image (image_large.JPG, ~7 MB) ≤ documented threshold. | +| NFT-PERF-THROUGHPUT-01 | same | Sustained writes throughput — RPS over a 60-s window meets the documented threshold. | +| NFT-PERF-OUTBOX-DRAIN-01 | same | FailsafeProducer drain rate — outbox depth converges to 0 within the documented window after a burst. | +| NFT-PERF-SSE-FANOUT-01 | same | SSE delivery latency under modest fan-out (N=20 subscribers) — p95 latency ≤ documented threshold. | +| NFT-PERF-LIST-01 | same | `GET /annotations` listing on populated DB (10k rows). p95 latency ≤ documented threshold. | +| NFT-PERF-DATASET-01 | same | Dataset class distribution at scale (50k detections). p95 latency ≤ documented threshold. | + +## System Under Test Boundary + +- HTTP only. +- p95 computed by the test from a sample of N requests (per-scenario sample size in the spec). +- NFT-PERF-LIST-01 / NFT-PERF-DATASET-01 require `dataseed` to have populated the bulk rows (AZ-564 covers this). +- Profile gate: `E2E_RUN_PROFILE=performance` enables these tests; the standard `functional` profile skips them (they are too long for the merge gate). 
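
The p95 computation and the profile gate described above could look roughly like the following. This is a sketch under stated assumptions: the spec does not say which percentile method to use, so nearest-rank is chosen here, and the `PerfStats` and `RunProfile` names are hypothetical rather than taken from the test project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers for illustration; names are not from the test project.
public static class PerfStats
{
    // Nearest-rank percentile (an assumed method): the smallest sample
    // such that at least p percent of all samples are at or below it.
    public static double Percentile(IReadOnlyCollection<double> samplesMs, double p)
    {
        if (samplesMs.Count == 0) throw new ArgumentException("No samples.", nameof(samplesMs));
        if (p <= 0 || p > 100) throw new ArgumentOutOfRangeException(nameof(p));

        double[] sorted = samplesMs.OrderBy(x => x).ToArray();
        int rank = (int)Math.Ceiling(p * sorted.Length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }
}

public static class RunProfile
{
    // Interprets the value of E2E_RUN_PROFILE; anything other than
    // "performance" (including null or empty) counts as the functional profile.
    public static bool IsPerformance(string? profileValue) =>
        string.Equals(profileValue, "performance", StringComparison.OrdinalIgnoreCase);
}
```

A test would call `RunProfile.IsPerformance(Environment.GetEnvironmentVariable("E2E_RUN_PROFILE"))` to pick its sample size, then compare `PerfStats.Percentile(samples, 95)` against the threshold supplied by the fixture.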
+ +## Acceptance Criteria + +**AC-1: Every perf scenario passes its threshold under the `performance` profile.** + +**AC-2: Smoke variant runs in the standard profile** +Given `E2E_RUN_PROFILE=functional`, +When the test runs, +Then a short smoke variant (e.g., 10 requests instead of 1000) executes and only asserts p95 < 2× the threshold (a sanity check, not a perf gate). + +**AC-3: Measurement uncertainty acknowledged** +Given p95 is computed from a finite sample, +When the test reports its result, +Then the result includes the sample size, the actual p95, and the documented threshold. Failures include a JSON report file at `e2e-results/perf-.json`. + +## Constraints + +- AAA pattern. +- `[Trait("traces_to", "AC-F-10, AC-N-01")]` plus per-test specific traces. +- Perf tests run in their own xUnit collection so they don't block functional tests during interactive runs. +- Performance thresholds come from `results_report.md`; tests must not hard-code numbers — they read them from a fixture. diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 8624645..66bff2f 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,8 +2,8 @@ ## Current Step flow: existing-code -step: 5 -name: Decompose Tests +step: 6 +name: Implement Tests status: not_started sub_step: phase: 0 @@ -29,6 +29,10 @@ tracker: jira name: Code Testability Revision status: completed outcome: "2 surgical fixes (C01 JWKS HTTPS env gate, C02 RabbitMQ host DNS resolution); commits 90d48cf + Phase 7 docs; smoke PASS (IDX20108=0, IPAddress.Parse FormatException=0); architecture.md Open Risks §6 retired" +- step: 5 + name: Decompose Tests + status: completed + outcome: "Epic AZ-563 + 11 tasks AZ-564..574; 67 scenarios covered exactly once (cross-checked vs traceability matrix); _dependencies_table.md updated" ## Mid-step adjustments - 2026-05-14: targeted auth + CORS re-sync triggered by codebase drift discovered at Step 4 entry.