# Resilience tests (NFT-RES-01..06)

**Task**: AZ-572
**Name**: Resilience tests
**Description**: Implement xUnit tests for the 6 resilience scenarios: RabbitMQ outage during create, Postgres restart, Postgres unreachable, SSE subscriber disconnect mid-stream, `FailsafeProducer` empty-catch path, stream consumer reconnect.
**Complexity**: 5 points
**Dependencies**: AZ-564 (test infrastructure)
**Component**: Blackbox Tests → Resilience
**Tracker**: jira
**Epic**: AZ-563

## Scenarios Covered

| Test ID | Source | What it asserts |
|---------|--------|-----------------|
| NFT-RES-01 | `_docs/02_document/tests/resilience-tests.md` | RabbitMQ outage during create. `POST /annotations` returns 200; outbox row stays; on broker recovery, message is delivered. |
| NFT-RES-02 | same | Postgres restart between writes. `POST /annotations` after restart succeeds without errors. |
| NFT-RES-03 | same | Postgres unreachable during create. `POST /annotations` returns 5xx; error envelope; no partial state. |
| NFT-RES-04 | same | SSE subscriber disconnects mid-stream. Server tears down channel cleanly; no zombie subscriptions; per-instance state cleanup. |
| NFT-RES-05 | same | Repeated `FailsafeProducer` empty-catch path (`catch {}` swallowing `IOException`). Drain loop survives missing image file; no crash. |
| NFT-RES-06 | same | Stream consumer reconnect. After broker restart, consumer resumes from offset and reads the same messages. |

## System Under Test Boundary

- HTTP only for SUT invocations.
- `BrokerFixture.StopBroker()` / `StartBroker()` for NFT-RES-01, NFT-RES-06.
- `docker exec postgres pg_ctl stop` / `start` (or `docker restart postgres`) for NFT-RES-02, NFT-RES-03.
- NFT-RES-05 deliberately removes a specific image file from `annotations-images` volume (out-of-band, runner-only) to trigger the empty-catch path. Marked with `[Trait("fs_access", "delete-image-file")]`.
- Long-running scenarios (NFT-RES-06 reconnect window) use a `[Fact(Timeout = 60000)]` cap.
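
The out-of-band Postgres control listed above could be wrapped in a small runner-side helper. The following is a minimal sketch, not the project's actual fixture: the `DockerCli` class and its method names are hypothetical, and it assumes the `docker` CLI is on the runner's `PATH` and that the container is named `postgres`, as in the commands above.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper (not part of the existing test infrastructure).
public static class DockerCli
{
    // Builds the process description separately so it can be inspected without running docker.
    public static ProcessStartInfo BuildStartInfo(params string[] args)
    {
        var psi = new ProcessStartInfo("docker")
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
        };
        // ArgumentList handles quoting of each argument.
        foreach (var arg in args)
        {
            psi.ArgumentList.Add(arg);
        }
        return psi;
    }

    // Runs `docker <args>` and throws if the command exits with a non-zero code.
    public static async Task RunAsync(params string[] args)
    {
        using var process = Process.Start(BuildStartInfo(args))
            ?? throw new InvalidOperationException("Failed to start the docker CLI.");

        // Start draining both pipes before waiting, so a full pipe buffer cannot block the child.
        var stdoutTask = process.StandardOutput.ReadToEndAsync();
        var stderrTask = process.StandardError.ReadToEndAsync();
        await process.WaitForExitAsync();
        await stdoutTask;
        var stderr = await stderrTask;

        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException(
                $"docker {string.Join(' ', args)} failed with exit code {process.ExitCode}: {stderr}");
        }
    }

    // Assumption: the container name `postgres` matches the commands in the boundary list.
    public static Task RestartPostgresAsync() => RunAsync("restart", "postgres");
}
```

In the Arrange or Act step of NFT-RES-02, a test would then call `await DockerCli.RestartPostgresAsync();` between the two `POST /annotations` requests. Throwing on a non-zero exit code keeps a failed restart from being misread as a SUT failure.
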
## Acceptance Criteria

**AC-1: Every scenario passes per its spec.**

**AC-2: SUT never crashes during the test class**
Given any resilience test in the class is running,
When the runner asserts the test's post-condition,
Then the SUT's `/health` endpoint still returns 200 (the SUT survives every external failure).

**AC-3: NFT-RES-01 verifies stream delivery on recovery**
Given the broker was stopped before a `POST /annotations` and restarted after,
When `BrokerFixture.StartBroker()` returns,
Then the stream consumer reads the queued message within 30 s of broker recovery.

## Constraints

- AAA pattern.
- `[Trait("traces_to", "AC-F-04, AC-F-12, AC-N-04, SW-03")]` plus per-test specific traces.
- Resilience tests are long; group them in their own xUnit collection to avoid blocking the fast suite.
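
AC-3's 30 s recovery window and AC-2's `/health` post-condition both need a bounded wait or a tolerant probe in the Assert step. One possible shape is sketched below. The `Eventually` class and its method names are hypothetical, and the sketch assumes the `HttpClient` passed in has its `BaseAddress` pointing at the SUT.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper (not part of the existing test infrastructure).
public static class Eventually
{
    // Polls `condition` until it returns true or `timeout` elapses.
    // The condition is always evaluated at least once.
    public static async Task<bool> WaitUntilAsync(
        Func<Task<bool>> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        var stopwatch = Stopwatch.StartNew();
        while (true)
        {
            if (await condition())
            {
                return true;
            }
            if (stopwatch.Elapsed >= timeout)
            {
                return false;
            }
            await Task.Delay(pollInterval);
        }
    }

    // Returns true only when GET /health answers 200.
    // A connection failure counts as "not healthy" instead of throwing.
    public static async Task<bool> IsHealthyAsync(HttpClient client)
    {
        try
        {
            using var response = await client.GetAsync("/health");
            return response.StatusCode == HttpStatusCode.OK;
        }
        catch (HttpRequestException)
        {
            return false;
        }
    }
}
```

For AC-3, the Assert step could read `Assert.True(await Eventually.WaitUntilAsync(messageWasRead, TimeSpan.FromSeconds(30), TimeSpan.FromMilliseconds(500)))`, where `messageWasRead` is a placeholder for whatever check the stream consumer fixture exposes. For AC-2, each test would end with `Assert.True(await Eventually.IsHealthyAsync(client))`.
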