# Resource Limit Tests

**Task**: AZ-585_test_resource_limits
**Name**: Resource limit tests (NFT-RES-LIM-01..04)
**Description**: Implement xUnit blackbox tests for the 4 resource-limit observation scenarios: steady-state RSS memory under 5-min sustained load (P95 ≤ 250 MiB; no monotonic climb), Npgsql connection pool ≤ 100 with no unbounded growth, file-descriptor count ≤ 1024 with no leak, and cold-start RSS ≤ 200 MiB at `t=30s` after health-ok. Provisional gates are documented per `restrictions.md` H6 and locked in after the first green run.
**Complexity**: 3 points
**Dependencies**: AZ-576_test_infrastructure
**Component**: Blackbox Tests
**Tracker**: AZ-585
**Epic**: AZ-575

## Problem

Per H6, container-level resource limits are NOT enforced inside the container; they will be set at the suite level (`_infra/_compose/`) per device type once locked. These tests establish baseline observations so the suite can size the cgroup limits correctly AND provide an upper-bound regression gate so future changes do not silently 10× the memory or FD footprint. The 8 GB Jetson Orin must accommodate ~6 .NET edge services + Postgres + UI; `missions`'s budget is ~200 MiB cold + ~250 MiB hot. Without these observation tests, a leak or library bloat could ship to the device and force a re-sizing decision late in deployment.

## Outcome

- All four NFT-RES-LIM-01..04 scenarios run and pass against the dockerised `missions` service.
- Each test produces a CSV row with `Category=ResLim`, `Traces=H1|H3|H6|O10`, `Result=pass`, AND records the measured value (e.g., `P95_RSS_MiB=187`) in the `Traces` column so suite-level deployment planning can read it.
- NFT-RES-LIM-01 measures P95 RSS over 5 minutes of mixed sustained load AND asserts `final_RSS - P95_RSS ≤ 20% * P95_RSS` (no monotonic climb).
- NFT-RES-LIM-02 measures Npgsql connection count via `pg_stat_activity` every 5s AND asserts both `max ≤ 100` AND `final ≤ 1.3 * first_minute_steady_state`.
- NFT-RES-LIM-03 measures `ls /proc/<pid>/fd | wc -l` inside the container every 5s AND asserts both `max ≤ 1024` AND `final ≤ 1.3 * minute_one_count`.
- NFT-RES-LIM-04 measures cold-start RSS exactly 30s after `GET /health` first returns 200 (no requests issued yet) AND asserts `RSS ≤ 200 MiB`.

## Scope

### Included

- NFT-RES-LIM-01 Steady-state memory under 5-min sustained load.
- NFT-RES-LIM-02 Connection pool steady-state.
- NFT-RES-LIM-03 File-descriptor steady-state.
- NFT-RES-LIM-04 Cold-start RSS budget.
- Each test records the measured value to the CSV `Traces` field so deployment planning can pick it up.
- Provisional gates: 250 MiB hot, 200 MiB cold, 100 connections, 1024 FDs. On first green run, replace provisional gates with `measured + 50%` and open a Refactor Backlog ticket if the provisional gate was exceeded.

### Excluded

- Performance (latency / throughput) tests; these live in Task 19.
- GPU / temperature / disk-I/O monitoring (per `restrictions.md` H8: no specialised hardware on a CRUD service).
- Long-soak / endurance tests (> 5 min), explicitly deferred per `restrictions.md` H8.

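
The gate arithmetic in the Outcome and Scope sections can be illustrated with a small, language-neutral sketch. It is written in Python for brevity rather than as the xUnit implementation. All function names are hypothetical, the sample data is invented, and the nearest-rank percentile method is an assumption, since this document does not say how P95 is computed.

```python
def p95(samples):
    """95th percentile by the nearest-rank method (assumed; the spec names no method)."""
    if not samples:
        raise ValueError("no samples collected")
    ordered = sorted(samples)
    # ceil(0.95 * n) as a 1-based rank, using integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def no_monotonic_climb(samples, tolerance=0.20):
    """True when final_RSS - P95_RSS <= tolerance * P95_RSS (NFT-RES-LIM-01)."""
    p = p95(samples)
    return samples[-1] - p <= tolerance * p


def no_unbounded_growth(final, baseline, factor=1.3):
    """True when final <= factor * baseline (NFT-RES-LIM-02 and NFT-RES-LIM-03)."""
    return final <= factor * baseline


def locked_gate(measured, headroom=0.50):
    """Gate to lock in after the first green run: measured + 50%."""
    return measured * (1.0 + headroom)


# 60 invented RSS samples (one every 5s for 5 minutes), in MiB
rss_mib = [180.0] * 57 + [190.0, 195.0, 200.0]
print(p95(rss_mib), no_monotonic_climb(rss_mib), locked_gate(187.0))
```

With the example value from the Outcome section (`P95_RSS_MiB=187`), `measured + 50%` gives a locked gate of 280.5 MiB.
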
## Acceptance Criteria

**AC-1: NFT-RES-LIM-01 steady-state RSS ≤ provisional 250 MiB with no monotonic climb**
Given `missions` running with `seed_25_missions` + `seed_3_vehicles_2_default` and no host-side memory limit
When the test orchestrator drives ~50 RPS of mixed `GET /vehicles`, `GET /missions`, `GET /missions/{id}/waypoints` for 5 minutes from a single client, while polling `docker stats --no-stream missions-sut` every 5s
Then the P95 of the 60 RSS samples is `≤ 250 MiB` (provisional gate)
And the final-sample RSS is within ± 20% of the P95 RSS (no sustained leak: RSS does not climb monotonically)
And the measured P95 is recorded to the CSV `Traces` column as `P95_RSS_MiB=`

**AC-2: NFT-RES-LIM-02 connection pool ≤ 100 with no unbounded growth**
Given the same setup as NFT-RES-LIM-01
When the test orchestrator polls side-channel `SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE 'Npgsql%' OR (usename='postgres' AND backend_type='client backend')` every 5s for 5 minutes
Then the max sampled connection count is `≤ 100`
And the final-sample count is `≤ 1.3 × (mean of samples in the first minute)`
And the measured max is recorded as `MAX_NPGSQL_CONNS=`

**AC-3: NFT-RES-LIM-03 file descriptors ≤ 1024 with no leak**
Given the same setup as NFT-RES-LIM-01
When the test orchestrator executes `docker exec missions-sut sh -c 'ls /proc/$(pgrep -f Azaion.Missions.dll | head -1)/fd | wc -l'` every 5s for 5 minutes
Then the max sampled FD count is `≤ 1024`
And the final-sample count is `≤ 1.3 × (count at t=1min)`
And the measured max is recorded as `MAX_FD=`

**AC-4: NFT-RES-LIM-04 cold-start RSS ≤ 200 MiB**
Given `missions` has been started fresh (via `docker compose up -d missions` after `down -v`), no requests issued yet
When `GET /health` first returns `200` AND 30s have elapsed
Then `docker stats --no-stream missions-sut` reports `MEM USAGE` ≤ 200 MiB
And the measured cold-start RSS is recorded as `COLD_RSS_MiB=`

## Non-Functional Requirements

**Performance**
- NFT-RES-LIM-01..03: each takes exactly 5 minutes (sampling window). With Arrange/teardown, ≤ 6 minutes wall-clock.
- NFT-RES-LIM-04: ≤ 60s wall-clock (fresh start + health-poll + 30s wait + measurement).
- The total task runtime budget is stated as ≤ 20 minutes, which matches running the four tests back-to-back (3 × 6 min + 1 min) and would not fit the documented 15-min suite CI gate per `environment.md`. NFT-RES-LIM-01..03 therefore share the same 5-minute window and run concurrently against a single dockerised `missions`, while NFT-RES-LIM-04 runs separately because it requires a fresh start. The resulting wall-clock is about 7 minutes (≤ 6 min + ≤ 60s), which fits inside the gate.

**Reliability**
- The load generator is a single-threaded `HttpClient` driving requests in a tight loop, documented to reach approximately 50 RPS on the in-suite test runner. If the runner cannot sustain 50 RPS (CI infrastructure too slow), NFT-RES-LIM-01..03 are SKIPPED with `Result=skip` and a clear `ErrorMessage=runner cannot sustain target load`. CI then reruns them on a more powerful worker.

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | `seed_25_missions` + 50 RPS for 5 min | P95 RSS sampling | P95 ≤ 250 MiB + no monotonic climb | H1, H6, O10 |
| AC-2 | same | `pg_stat_activity` polling | max ≤ 100 + final ≤ 1.3×steady | O10 |
| AC-3 | same | `/proc/<pid>/fd` polling | max ≤ 1024 + final ≤ 1.3×minute-one | H6, O10 |
| AC-4 | fresh `docker compose up -d` | cold-start RSS at t=30s | RSS ≤ 200 MiB | H1, H3 |

## Constraints

- `docker stats` and `docker exec` run from inside the runner and require Docker socket access; AZ-576 covers this.
- NFT-RES-LIM-03 requires `pgrep` inside the `missions` image; the test FAILS in Arrange (not Assert) if `pgrep` is unavailable. Alternative: parse `/proc/1/comm` if PID 1 is the .NET process (preferred for the small Dockerfile).
- All measurements are recorded to the CSV report's `Traces` field so deployment planning can pick them up; this is more important than the pass/fail gate.
- Provisional gates are documented per `restrictions.md` H6 and locked in based on the first measured run.
- AAA pattern with `// Arrange` / `// Act` / `// Assert` per test.

## Risks & Mitigation

**Risk 1: Measurement variance on shared CI runners**
- *Risk*: A runner under noisy-neighbour load reports inflated RSS, flaking the gate.
- *Mitigation*: Gates are provisional and generous (250 MiB vs. typical .NET service of ~150 MiB; 100 connections vs. typical idle pool of ~5–10). After the first green run, the gate is locked at `measured + 50%`.

**Risk 2: NFT-RES-LIM-01..03 share a 5-minute window (flake correlation)**
- *Risk*: A CI hiccup that kills the SUT mid-window flakes all three at once.
- *Mitigation*: Each test asserts its own metric; on `missions-sut` exit during the window, the test FAILS with a `"SUT exited during measurement window"` ErrorMessage rather than reporting a misleading metric value.

**Risk 3: Provisional gates silently accepted as the locked gate**
- *Risk*: If the first green run measures 200 MiB and the test passes, a future engineer treats 250 MiB as the gate forever, although actual headroom is only 50 MiB.
- *Mitigation*: The test logs the `measured / gate` ratio; CI dashboards flag ratios > 0.8 for re-tuning consideration. The lock-in workflow is documented in `restrictions.md` H6.

## System Under Test Boundary

- Tests drive the product through the public HTTP surface for load generation and use `docker stats`, `docker exec`, and side-channel `pg_stat_activity` for measurement. Expected outputs are the documented gates from `_docs/02_document/tests/resource-limit-tests.md` (provisional) and the corresponding entries in `_docs/00_problem/input_data/expected_results/results_report.md` (when locked).
- Stubs are allowed ONLY for: the external `admin` JWT issuer (`jwks-mock` container) and the DB-only stub tables for `media`, `annotations`, `detection`, `map_objects`.
- Stubs, fakes, deterministic fallbacks, monkeypatches, or direct imports are NOT allowed for any internal product module, including the Npgsql connection pool, the `AppDataConnection` lifetime, or the `Program.cs` startup path. If any of these is not implemented, the test MUST fail/block as missing product implementation; it must not pass by replacing the module with a test stub.
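
The `docker stats` measurement named in the boundary above reports memory as a human-readable string, so a test has to convert it before comparing against a gate or logging the `measured / gate` ratio from Risk 3. The following is a minimal sketch in Python, not the xUnit implementation. It assumes the test reads the value with `docker stats --no-stream --format "{{.MemUsage}}" missions-sut` and that the output looks like `187.4MiB / 7.6GiB`; the function names and the sample strings are hypothetical.

```python
import re

# Binary units as printed by docker stats for memory, converted to MiB
_UNIT_TO_MIB = {
    "B": 1.0 / (1024 * 1024),
    "KiB": 1.0 / 1024,
    "MiB": 1.0,
    "GiB": 1024.0,
}

_MEM_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*(B|KiB|MiB|GiB)\s*$")


def mem_usage_to_mib(mem_usage):
    """Convert a 'used / limit' string such as '187.4MiB / 7.6GiB' to used MiB."""
    used = mem_usage.split("/")[0]
    match = _MEM_RE.match(used)
    if match is None:
        # Fail loudly instead of reporting a misleading metric value
        raise ValueError(f"unrecognised memory value: {mem_usage!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MIB[unit]


def gate_ratio(measured, gate):
    """The measured / gate ratio that the test logs."""
    return measured / gate


def needs_retuning(measured, gate, threshold=0.8):
    """True when the ratio exceeds the 0.8 threshold flagged by CI dashboards."""
    return gate_ratio(measured, gate) > threshold


cold_rss = mem_usage_to_mib("187.4MiB / 7.6GiB")
print(f"COLD_RSS_MiB={cold_rss:.0f} ratio={gate_ratio(cold_rss, 200.0):.2f}")
```

Raising on an unparseable value keeps the failure in the measurement step, in the same spirit as failing in Arrange when `pgrep` is unavailable.
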