Oleksandr Bezdieniezhnykh 001e80fe96 [AZ-585] [AZ-586] ResLim+Perf NFT tests; close test cycle 1
Batch 4 of test implementation cycle 1 (existing-code Step 6, final batch).

- AZ-585 SteadyStateLoadTests + ColdStartRssTests: NFT-RES-LIM-01..04.
  SteadyStateLoadFixture runs one 5-min sustained-load window and samples
  RSS (docker stats), Npgsql conns (pg_stat_activity), and FDs
  (/proc/1/fd) every 5s; three test methods assert independently. All
  SkippableFact-gated on docker primitives.
- AZ-586 PerformanceTests: NFT-PERF-01..04. Sequential single-client,
  5 warm-ups + N measured calls, P50+P95 via LatencyPercentiles, recorded
  to PERF_RESULTS_FILE. Tagged Category=Perf so default gate excludes them.

Infrastructure:
- entrypoint.sh now applies --filter "${TEST_FILTER:-Category!=Perf}"
  per AZ-586 (default CI gate excludes performance).
- MetricCsvRecorder: idempotent CSV appender keyed on env var, used by
  both Perf and ResLim categories.

Step 6 (Implement Tests) is complete. Final report at
_docs/03_implementation/implementation_report_tests.md hands off the
full-suite gate to test-run/SKILL.md (Step 7).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-15 09:11:53 +03:00


Batch Report

Batch: 4
Tasks: AZ-585, AZ-586
Date: 2026-05-15
Run mode: Test implementation (existing-code Step 6)
Total complexity: 6 SP (3 + 3)

Task Results

| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|----------------|-------|-------------|--------|
| AZ-585_test_resource_limits | Done | 3 added, 1 deleted | 4 / 4 discovery | 4/4 NFT-RES-LIM covered | 0 |
| AZ-586_test_performance | Done | 1 added, 1 deleted, 2 helpers added, entrypoint.sh modified | 4 / 4 discovery | 4/4 NFT-PERF covered | 0 |

AC Test Coverage: All 8 NFT scenarios covered

  • AZ-585 (4/4): NFT-RES-LIM-01 → SteadyStateLoadTests.NFT_RES_LIM_01_* (P95 RSS + no-leak ratio), NFT-RES-LIM-02 → SteadyStateLoadTests.NFT_RES_LIM_02_* (Npgsql conn cap + minute-1 mean), NFT-RES-LIM-03 → SteadyStateLoadTests.NFT_RES_LIM_03_* (FD cap + minute-1 anchor), NFT-RES-LIM-04 → ColdStartRssTests.NFT_RES_LIM_04_* (30s settle + cold-RSS cap).
  • AZ-586 (4/4): NFT-PERF-01 → PerformanceTests.NFT_PERF_01_* (100 minimal-cascade DELETEs, P50 ≤ 50ms), NFT-PERF-02 → *.NFT_PERF_02_* (50 F3-shape cascade DELETEs, provisional P50 ≤ 200ms), NFT-PERF-03 → *.NFT_PERF_03_* (100 /health, P50 ≤ 10ms), NFT-PERF-04 → *.NFT_PERF_04_* (100 paginated lists vs 1000-mission seed, provisional P95 ≤ 100ms).

Code Review Verdict: PASS_WITH_WARNINGS (self-review)

  • 0 Critical, 0 High, 0 Medium.
  • Low — coverage: 4 of 4 ResLim tests are SkippableFact gated on COMPOSE_RESTART_ENABLED=1 + docker CLI in the e2e-consumer image — same Docker-socket follow-up already flagged in batch 3 report. NFT-RES-LIM-04 additionally requires docker compose stop|rm|up access; same gate.
  • Low — maintainability: SteadyStateLoadFixture.ParseHumanBytes and ColdStartRssTests.ParseHumanBytes are duplicated. Both files parse the LHS of docker stats --no-stream --format '{{.MemUsage}}'; the duplication is intentional today because the two files have different gating predicates (fixture uses Enabled property + CommandAvailable probe, ColdStart uses MissionsContainerHelper.Enabled), and lifting the helper to Helpers/HumanBytes.cs would be a shared-helper change worth a separate refactor. Captured as a follow-up note; not auto-fixed because it touches both files. Recommend folding into the docker-CLI follow-up task.
  • Low — observability: PerformanceTests throws InvalidOperationException on any response that is neither 2xx nor 404, in both the warm-up and measured phases, so a misbehaving SUT mid-run fails with a clear stack trace instead of passing silently. This is intended.
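
The ParseHumanBytes duplication noted in the maintainability item could eventually collapse into one shared helper. The following is a minimal sketch only, assuming the input is the MemUsage string as printed by docker stats (for example "152.3MiB / 1.944GiB") and that only the binary units B, KiB, MiB, and GiB appear. The class and method names are hypothetical and are not taken from the repository.

```csharp
using System;
using System.Globalization;

// Hypothetical shared helper; the report only names Helpers/HumanBytes.cs as a possible home.
public static class HumanBytes
{
    // Parses the left-hand side of a docker-stats MemUsage value into bytes,
    // e.g. "128MiB / 1.944GiB" -> 128 * 1024 * 1024.
    public static double ParseMemUsageLhs(string memUsage)
    {
        if (string.IsNullOrWhiteSpace(memUsage))
            throw new FormatException("Empty MemUsage value.");

        // Keep only the part before the slash (current usage, not the limit).
        string lhs = memUsage.Split('/')[0].Trim();

        // Split the numeric prefix from the unit suffix.
        int i = 0;
        while (i < lhs.Length && (char.IsDigit(lhs[i]) || lhs[i] == '.'))
            i++;
        if (i == 0)
            throw new FormatException($"No numeric value in '{memUsage}'.");

        double value = double.Parse(lhs.Substring(0, i), CultureInfo.InvariantCulture);
        string unit = lhs.Substring(i).Trim();

        // Assumption: binary units only; anything else is rejected loudly.
        double multiplier = unit switch
        {
            "B" => 1d,
            "KiB" => 1024d,
            "MiB" => 1024d * 1024,
            "GiB" => 1024d * 1024 * 1024,
            _ => throw new FormatException($"Unsupported unit '{unit}' in '{memUsage}'."),
        };

        return value * multiplier;
    }
}
```

Rejecting unknown units with an exception, rather than returning zero, keeps a format change in the docker output from turning into a trivially passing RSS assertion.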

Auto-Fix Attempts: 1

SteadyStateLoadFixture.cs:59 initially called new TokenMinter() (parameter-less ctor); TokenMinter requires signUrl. Fixed to new TokenMinter(TestEnvironment.JwksMockBaseUrl + "/sign") — same pattern as TestBase. Style/Low under the Auto-Fix Gate matrix. Rebuild: 0 warnings, 0 errors.

Stuck Agents: None

Files Created (6) + 2 deletions + 1 modified script

Helpers (2)

  • tests/Azaion.Missions.E2E.Tests/Helpers/LatencyPercentiles.cs — nearest-rank P50/P95/Percentile/Mean over IReadOnlyList<double>. Sorts a defensive copy.
  • tests/Azaion.Missions.E2E.Tests/Helpers/MetricCsvRecorder.cs — appends one row per scenario (Timestamp, Category, Scenario, Result, Traces, ErrorMessage) to a CSV referenced by PERF_RESULTS_FILE (perf) or RESLIM_RESULTS_FILE (reslim). No-op when the env var is unset.
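
For readers unfamiliar with the nearest-rank method mentioned for LatencyPercentiles, here is a minimal sketch of the technique. It is not the repository's implementation. The class name LatencyPercentilesSketch and the choice to reject p outside (0, 100] are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch of nearest-rank percentiles; names are hypothetical.
public static class LatencyPercentilesSketch
{
    // Nearest-rank: the value at 1-based rank ceil(p/100 * N) in the sorted samples.
    public static double Percentile(IReadOnlyList<double> samples, double p)
    {
        if (samples is null || samples.Count == 0)
            throw new ArgumentException("At least one sample is required.", nameof(samples));
        if (p <= 0 || p > 100)
            throw new ArgumentOutOfRangeException(nameof(p), "p must be in (0, 100].");

        // Sort a defensive copy so the caller's list is never reordered.
        double[] sorted = samples.ToArray();
        Array.Sort(sorted);

        int rank = (int)Math.Ceiling(p * sorted.Length / 100.0);
        return sorted[rank - 1];
    }

    public static double P50(IReadOnlyList<double> samples) => Percentile(samples, 50);

    public static double P95(IReadOnlyList<double> samples) => Percentile(samples, 95);

    public static double Mean(IReadOnlyList<double> samples) => samples.Average();
}
```

Nearest-rank always returns a value that was actually observed, so a reported P95 corresponds to a real request rather than an interpolated figure.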

Fixtures (1)

  • tests/Azaion.Missions.E2E.Tests/Fixtures/SteadyStateLoadFixture.cs — class-scoped 5-minute sustained-load fixture. Generates ~50 RPS via a single-threaded HttpClient loop, samples RSS / Npgsql conn count / FD count every 5s. Exposes the time series + LoadGeneratorMetTargetRps + SutExitedDuringWindow + SkipReason. Tests inspect SkipReason to surface explicit skips when docker primitives are unavailable.
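
One way a single sequential loop can hold a target request rate is to pace each call against an absolute schedule. The sketch below illustrates that idea under stated assumptions and is not the fixture's code. The names PacedLoadSketch, RunAsync, and MetTargetRps are hypothetical, the send delegate stands in for the HttpClient call, and the 0.9 tolerance is an assumed threshold, not a value from the report.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch of a paced, sequential load loop; names are hypothetical.
public static class PacedLoadSketch
{
    // Invokes sendAsync one call at a time, aiming for targetRps calls per second
    // over the given window. Returns the number of completed calls.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> sendAsync,
        double targetRps,
        TimeSpan window,
        CancellationToken ct = default)
    {
        if (targetRps <= 0)
            throw new ArgumentOutOfRangeException(nameof(targetRps));

        var clock = Stopwatch.StartNew();
        int completed = 0;

        while (clock.Elapsed < window)
        {
            ct.ThrowIfCancellationRequested();
            await sendAsync(ct);
            completed++;

            // Pace against absolute deadlines so one slow call does not
            // permanently lower the achieved rate.
            TimeSpan nextDue = TimeSpan.FromSeconds(completed / targetRps);
            TimeSpan wait = nextDue - clock.Elapsed;
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait, ct);
        }

        return completed;
    }

    // Assumed definition of "met target": at least `tolerance` of the ideal call count.
    public static bool MetTargetRps(int completed, double targetRps, TimeSpan window, double tolerance = 0.9)
        => completed >= targetRps * window.TotalSeconds * tolerance;
}
```

A flag of this kind lets the resource assertions distinguish a healthy SUT under real load from a run where the generator itself fell short.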

Test classes (3)

  • tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/SteadyStateLoadTests.cs — NFT-RES-LIM-01..03 share the fixture window. Each test asserts one metric independently. [Collection("ResLimSteadyState")].
  • tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/ColdStartRssTests.cs — NFT-RES-LIM-04. Runs docker compose stop|rm|up missions for a fresh start, waits 30s after /health returns 200, reads RSS, asserts ≤ 200 MiB. Lives in the MigratorRestart collection to serialise with the other compose-restarting tests.
  • tests/Azaion.Missions.E2E.Tests/Tests/Performance/PerformanceTests.cs — NFT-PERF-01..04, all [Trait("Category","Perf")]. Sequential single-client, 5 warm-ups + N measured, records P50 + P95 to PERF_RESULTS_FILE.
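
The measurement shape described for PerformanceTests (sequential single client, warm-ups discarded, failure on unexpected status) can be sketched as follows. This is an illustration under assumptions, not the repository's code. SequentialLatencyProbe and its members are hypothetical names, and the call delegate stands in for whatever HTTP request a scenario issues.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Threading.Tasks;

// Illustrative sketch of a warm-up + measured latency loop; names are hypothetical.
public static class SequentialLatencyProbe
{
    // Runs warmUps untimed calls, then `measured` timed calls, one at a time.
    // Returns the per-call latencies in milliseconds for the measured phase only.
    public static async Task<IReadOnlyList<double>> MeasureAsync(
        Func<Task<HttpStatusCode>> call, int warmUps, int measured)
    {
        for (int i = 0; i < warmUps; i++)
            EnsureExpected(await call(), "warm-up", i);

        var latenciesMs = new List<double>(measured);
        for (int i = 0; i < measured; i++)
        {
            var sw = Stopwatch.StartNew();
            HttpStatusCode status = await call();
            sw.Stop();

            EnsureExpected(status, "measured", i);
            latenciesMs.Add(sw.Elapsed.TotalMilliseconds);
        }

        return latenciesMs;
    }

    // Anything other than 2xx or 404 aborts the run instead of being recorded.
    private static void EnsureExpected(HttpStatusCode status, string phase, int index)
    {
        int code = (int)status;
        bool ok = (code >= 200 && code <= 299) || status == HttpStatusCode.NotFound;
        if (!ok)
            throw new InvalidOperationException(
                $"Unexpected HTTP {code} during {phase} call #{index}.");
    }
}
```

The returned list can then be passed to a percentile helper and the resulting P50 and P95 written to the file named by PERF_RESULTS_FILE.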

Deleted (2 Sanity placeholders)

  • tests/Azaion.Missions.E2E.Tests/Tests/Performance/Sanity.cs — dead placeholder from AZ-576; replaced by PerformanceTests.
  • tests/Azaion.Missions.E2E.Tests/Tests/ResourceLimits/Sanity.cs — same.

Modified (entrypoint filter, per AZ-586 Spec)

  • tests/Azaion.Missions.E2E.Tests/entrypoint.sh — added --filter "${TEST_FILTER:-Category!=Perf}". The default CI gate now excludes the Performance category (AZ-586 Spec § Outcome: "default test suite filter excludes performance to keep the standard CI gate ≤ 15 min"); scripts/run-performance-tests.sh bypasses the entrypoint anyway and invokes dotnet test --filter Category=Perf directly. The shell variable TEST_FILTER is overridable for ad-hoc invocations (e.g., to include Perf during a local profiling session).

Local Verification

  • dotnet build tests/Azaion.Missions.E2E.Tests/Azaion.Missions.E2E.Tests.csproj — 0 warnings, 0 errors.
  • 8 new NFT methods discoverable via [Trait("Category","Perf")] (4) and [Trait("Category","ResLim")] (4).

Pre-existing issues NOT in scope

  • scripts/run-performance-tests.sh line 104 references /app/Azaion.Missions.E2E.Tests.csproj, but the Dockerfile copies the test project to /src/. Pre-existing script bug — flag for the docker-CLI follow-up task that re-validates the run-perf script end-to-end. Not introduced by this batch.
  • Root Azaion.Missions.csproj Sdk.Web globs still pull tests/**/*.cs into the main project compilation — same flag as batch 3 cumulative review report; pre-existing.

Docker Stack Validation

Not run as part of this batch; same hand-off as batches 1-3. Step 7 (test-run/SKILL.md) owns the docker compose -f docker-compose.test.yml up --build --abort-on-container-exit e2e-consumer gate. The 4 SkippableFact-gated ResLim tests in this batch activate when the consumer image has the docker CLI and a socket bind; otherwise they emit explicit skip reasons (no silent pass).

Tracker Updates

AZ-585, AZ-586 transitioned to In Testing via the Atlassian MCP after this commit (Step 12).

Next Batch

All 11 test tasks (AZ-576 + AZ-577..AZ-586) are now done. Step 6 (Implement Tests) is complete. Autodev advances to Step 7 (Run Tests) — test-run/SKILL.md owns the full-suite gate.