[AZ-585] [AZ-586] ResLim+Perf NFT tests; close test cycle 1

Batch 4 of test implementation cycle 1 (existing-code Step 6, final batch).

- AZ-585 SteadyStateLoadTests + ColdStartRssTests: NFT-RES-LIM-01..04.
  SteadyStateLoadFixture runs one 5-min sustained-load window and samples
  RSS (docker stats), Npgsql conns (pg_stat_activity), and FDs
  (/proc/1/fd) every 5s; three test methods assert independently. All
  SkippableFact-gated on docker primitives.
- AZ-586 PerformanceTests: NFT-PERF-01..04. Sequential single-client,
  5 warm-ups + N measured calls, P50+P95 via LatencyPercentiles, recorded
  to PERF_RESULTS_FILE. Tagged Category=Perf so default gate excludes them.

Infrastructure:
- entrypoint.sh now applies --filter "${TEST_FILTER:-Category!=Perf}"
  per AZ-586 (default CI gate excludes performance).
- MetricCsvRecorder: idempotent CSV appender keyed on env var, used by
  both Perf and ResLim categories.

Step 6 (Implement Tests) is complete. Final report at
_docs/03_implementation/implementation_report_tests.md handoffs the
full-suite gate to test-run/SKILL.md (Step 7).

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-15 09:11:53 +03:00
parent 26126e6216
commit 001e80fe96
14 changed files with 1181 additions and 52 deletions
@@ -0,0 +1,80 @@
# Batch Report
**Batch**: 4
**Tasks**: AZ-585, AZ-586
**Date**: 2026-05-15
**Run mode**: Test implementation (existing-code Step 6)
**Total complexity**: 6 SP (3 + 3)
## Task Results
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|----------------|-------|-------------|--------|
| AZ-585_test_resource_limits | Done | 3 added, 1 deleted | 4 / 4 discovery | 4/4 NFT-RES-LIM covered | 0 |
| AZ-586_test_performance | Done | 1 added, 1 deleted, 2 helpers added, entrypoint.sh modified | 4 / 4 discovery | 4/4 NFT-PERF covered | 0 |
## AC Test Coverage: All 8 NFT scenarios covered
- **AZ-585 (4/4)**: NFT-RES-LIM-01 → `SteadyStateLoadTests.NFT_RES_LIM_01_*` (P95 RSS + no-leak ratio), NFT-RES-LIM-02 → `SteadyStateLoadTests.NFT_RES_LIM_02_*` (Npgsql conn cap + minute-1 mean), NFT-RES-LIM-03 → `SteadyStateLoadTests.NFT_RES_LIM_03_*` (FD cap + minute-1 anchor), NFT-RES-LIM-04 → `ColdStartRssTests.NFT_RES_LIM_04_*` (30s settle + cold-RSS cap).
- **AZ-586 (4/4)**: NFT-PERF-01 → `PerformanceTests.NFT_PERF_01_*` (100 minimal-cascade DELETEs, P50 ≤ 50ms), NFT-PERF-02 → `*.NFT_PERF_02_*` (50 F3-shape cascade DELETEs, provisional P50 ≤ 200ms), NFT-PERF-03 → `*.NFT_PERF_03_*` (100 `/health`, P50 ≤ 10ms), NFT-PERF-04 → `*.NFT_PERF_04_*` (100 paginated lists vs 1000-mission seed, provisional P95 ≤ 100ms).
## Code Review Verdict: PASS_WITH_WARNINGS (self-review)
- 0 Critical, 0 High, 0 Medium.
- **Low — coverage**: 4 of 4 ResLim tests are `SkippableFact` gated on `COMPOSE_RESTART_ENABLED=1` + docker CLI in the e2e-consumer image — same Docker-socket follow-up already flagged in batch 3 report. NFT-RES-LIM-04 additionally requires `docker compose stop|rm|up` access; same gate.
- **Low — maintainability**: `SteadyStateLoadFixture.ParseHumanBytes` and `ColdStartRssTests.ParseHumanBytes` are duplicated. Both files parse the LHS of `docker stats --no-stream --format '{{.MemUsage}}'`; the duplication is intentional today because the two files have different gating predicates (fixture uses `Enabled` property + `CommandAvailable` probe, ColdStart uses `MissionsContainerHelper.Enabled`), and lifting the helper to `Helpers/HumanBytes.cs` would be a shared-helper change worth a separate refactor. Captured as a follow-up note; not auto-fixed because it touches both files. **Recommend folding into the docker-CLI follow-up task.**
- **Low — observability**: `PerformanceTests` swallows non-2xx-non-404 with `InvalidOperationException` (warmup + measured), so a misbehaving SUT mid-run yields a clear stack trace; no silent pass. This is intended.
## Auto-Fix Attempts: 1
`SteadyStateLoadFixture.cs:59` initially called `new TokenMinter()` (parameter-less ctor); `TokenMinter` requires `signUrl`. Fixed to `new TokenMinter(TestEnvironment.JwksMockBaseUrl + "/sign")` — same pattern as `TestBase`. Style/Low under the Auto-Fix Gate matrix. Rebuild: 0 warnings, 0 errors.
## Stuck Agents: None
## Files Created (5) + 2 deletions + 1 modified script
### Helpers (2)
- `tests/Azaion.Missions.E2E.Tests/Helpers/LatencyPercentiles.cs` — nearest-rank P50/P95/Percentile/Mean over `IReadOnlyList<double>`. Sorts a defensive copy.
- `tests/Azaion.Missions.E2E.Tests/Helpers/MetricCsvRecorder.cs` — appends one row per scenario (Timestamp, Category, Scenario, Result, Traces, ErrorMessage) to a CSV referenced by `PERF_RESULTS_FILE` (perf) or `RESLIM_RESULTS_FILE` (reslim). No-op when the env var is unset.
### Fixtures (1)
- `tests/Azaion.Missions.E2E.Tests/Fixtures/SteadyStateLoadFixture.cs` — class-scoped 5-minute sustained-load fixture. Generates ~50 RPS via a single-threaded `HttpClient` loop, samples RSS / Npgsql conn count / FD count every 5s. Exposes the time series + `LoadGeneratorMetTargetRps` + `SutExitedDuringWindow` + `SkipReason`. Tests inspect `SkipReason` to surface explicit skips when docker primitives are unavailable.
### Test classes (3)
- `tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/SteadyStateLoadTests.cs` — NFT-RES-LIM-01..03 share the fixture window. Each test asserts one metric independently. `[Collection("ResLimSteadyState")]`.
- `tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/ColdStartRssTests.cs` — NFT-RES-LIM-04. Runs `docker compose stop|rm|up missions` for a fresh start, waits 30s after `/health` returns 200, reads RSS, asserts ≤ 200 MiB. Lives in the `MigratorRestart` collection to serialise with the other compose-restarting tests.
- `tests/Azaion.Missions.E2E.Tests/Tests/Performance/PerformanceTests.cs` — NFT-PERF-01..04, all `[Trait("Category","Perf")]`. Sequential single-client, 5 warm-ups + N measured, records P50 + P95 to `PERF_RESULTS_FILE`.
### Deleted (2 Sanity placeholders)
- `tests/Azaion.Missions.E2E.Tests/Tests/Performance/Sanity.cs` — dead placeholder from AZ-576; replaced by `PerformanceTests`.
- `tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/Sanity.cs` — same.
### Modified (entrypoint filter, per AZ-586 Spec)
- `tests/Azaion.Missions.E2E.Tests/entrypoint.sh` — added `--filter "${TEST_FILTER:-Category!=Perf}"`. The default CI gate now excludes the Performance category (AZ-586 Spec § Outcome: "default test suite filter excludes performance to keep the standard CI gate ≤ 15 min"); `scripts/run-performance-tests.sh` bypasses the entrypoint anyway and invokes `dotnet test --filter Category=Perf` directly. The shell variable `TEST_FILTER` is overridable for ad-hoc invocations (e.g., to include Perf during a local profiling session).
## Local Verification
- `dotnet build tests/Azaion.Missions.E2E.Tests/Azaion.Missions.E2E.Tests.csproj` — 0 warnings, 0 errors.
- 8 new NFT methods discoverable via `[Trait("Category","Perf")]` (4) and `[Trait("Category","ResLim")]` (4).
## Pre-existing issues NOT in scope
- `scripts/run-performance-tests.sh` line 104 references `/app/Azaion.Missions.E2E.Tests.csproj`, but the Dockerfile copies the test project to `/src/`. Pre-existing script bug — flag for the docker-CLI follow-up task that re-validates the run-perf script end-to-end. Not introduced by this batch.
- Root `Azaion.Missions.csproj` Sdk.Web globs still pull `tests/**/*.cs` into the main project compilation — same flag as batch 3 cumulative review report; pre-existing.
## Docker Stack Validation
Not run as part of this batch — same hand-off as batches 1-3. Step 7 (`test-run/SKILL.md`) owns the `docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-consumer` gate. The 5 SkippableFact tests in this batch activate when the consumer image has `docker` CLI + socket bind; otherwise they emit explicit skip reasons (no silent pass).
## Tracker Updates
AZ-585, AZ-586 transitioned to `In Testing` via the Atlassian MCP after this commit (Step 12).
## Next Batch
All 11 test tasks (AZ-576 + AZ-577..AZ-586) are now done. Step 6 (Implement Tests) is **complete**. Autodev advances to Step 7 (Run Tests) — `test-run/SKILL.md` owns the full-suite gate.