E2E Non-Functional Tests
Performance Tests
NFT-PERF-01: Single image detection latency
Summary: Measure end-to-end latency for a single small image detection request after engine is warm.
Traces to: AC-API-2
Metric: Request-to-response latency (ms)
Preconditions:
- Engine is initialized and warm (at least 1 prior detection)
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Send 10 sequential POST /detect with small-image | Record each request-response latency |
| 2 | Compute p50, p95, p99 | — |
Pass criteria: p95 latency < 5000ms for ONNX CPU, p95 < 1000ms for TensorRT GPU
Duration: ~60s (10 requests)
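
Step 2 can be evaluated with a small helper. The sketch below uses the nearest-rank percentile definition, which is an assumption (the spec does not say which definition applies); `percentile` and `summarize` are hypothetical names, not part of the service.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(latencies_ms, p95_limit_ms):
    """Compute p50/p95/p99 and compare p95 against the limit for the engine under test."""
    stats = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
    stats["passed"] = stats["p95"] < p95_limit_ms
    return stats

# Illustrative latencies only, not measured results.
print(summarize([120, 95, 110, 400, 105, 98, 102, 99, 101, 97], p95_limit_ms=5000))
```

With only 10 samples, nearest-rank p95 and p99 both equal the slowest request, so a single outlier decides the result.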
NFT-PERF-02: Concurrent inference throughput
Summary: Verify the system handles 2 concurrent inference requests (ThreadPoolExecutor limit).
Traces to: RESTRICT-HW-3
Metric: Throughput (requests/second), latency under concurrency
Preconditions:
- Engine is initialized and warm
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | Send 2 concurrent POST /detect requests with small-image | Measure both response times |
| 2 | Send 3 concurrent requests | Third request should queue behind the first two |
| 3 | Record total time for 3 concurrent requests vs 2 concurrent | — |
Pass criteria: 2 concurrent requests complete without error. 3 concurrent requests: total time > time for 2 (queuing observed).
Duration: ~30s
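
The queuing effect this test looks for can be reproduced locally. The sketch below stands in for the service with a 2-worker `ThreadPoolExecutor` and a sleep in place of inference; `fake_inference` and `run_concurrent` are hypothetical names, and the 0.2 s duration is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_inference(duration_s):
    """Stand-in for one inference call; sleeps instead of running a model."""
    time.sleep(duration_s)
    return "ok"

def run_concurrent(n_requests, workers=2, duration_s=0.2):
    """Submit n_requests at once to a pool and return (total wall time, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_inference, [duration_s] * n_requests))
    return time.perf_counter() - start, results

t2, _ = run_concurrent(2)
t3, _ = run_concurrent(3)
# The third task waits for a free worker, so t3 is roughly twice t2.
print(f"2 requests: {t2:.2f}s, 3 requests: {t3:.2f}s")
```

Against the real service, the same comparison applies to the wall time of the 2-request and 3-request runs.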
NFT-PERF-03: Large image tiling processing time
Summary: Measure processing time for a large image that triggers GSD-based tiling.
Traces to: AC-IP-2
Metric: Total processing time (ms), tiles processed
Preconditions:
- Engine is initialized and warm
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | POST /detect with large-image (4000×3000) and GSD config | Record total response time |
| 2 | Compare with small-image baseline from NFT-PERF-01 | Ratio indicates tiling overhead |
Pass criteria: Request completes within 120s. Processing time scales proportionally with number of tiles (not exponentially).
Duration: ~120s
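
The proportional-scaling check in step 2 could be approximated as below. This is a rough sketch: it assumes a regular grid of square tiles with an optional pixel overlap, whereas the service derives its tiling from the GSD config. The 1280 px tile size and the timings are placeholders, not values from the service.

```python
import math

def tile_count(width, height, tile_size, overlap=0):
    """Number of square tiles needed to cover the image (assumed regular grid)."""
    stride = tile_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile_size")
    cols = max(1, math.ceil((width - overlap) / stride))
    rows = max(1, math.ceil((height - overlap) / stride))
    return cols * rows

def per_tile_ratio(large_ms, tiles, baseline_ms):
    """Time per tile relative to the small-image baseline; values near 1.0 suggest linear scaling."""
    return (large_ms / tiles) / baseline_ms

tiles = tile_count(4000, 3000, tile_size=1280)  # hypothetical tile size
print(tiles, per_tile_ratio(large_ms=12000, tiles=tiles, baseline_ms=1000))
```

A ratio that grows with image size, rather than staying roughly constant, would point at super-linear overhead.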
NFT-PERF-04: Video processing frame rate
Summary: Measure effective frame processing rate during video detection.
Traces to: AC-VP-1
Metric: Frames processed per second, total processing time
Preconditions:
- Engine is initialized and warm
- SSE client connected
Steps:
| Step | Consumer Action | Measurement |
|---|---|---|
| 1 | POST /detect/test-media-perf with test-video and frame_period_recognition: 4 | — |
| 2 | Count SSE events and measure total time from "started" to "AIProcessed" | Compute frames/second |
Pass criteria: Processing completes within 5× video duration (10s video → < 50s processing). Frame processing rate is consistent (no stalls > 10s between events).
Duration: ~120s
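
Both pass criteria can be computed from the arrival timestamps of the SSE events. The sketch below assumes each event corresponds to one processed frame and that timestamps are in seconds, with the first being "started" and the last "AIProcessed". `analyze_events` is a hypothetical helper.

```python
def analyze_events(timestamps_s, video_duration_s, max_gap_s=10.0, budget_factor=5.0):
    """Derive total time, event rate, and longest stall from SSE arrival times."""
    if len(timestamps_s) < 2:
        raise ValueError("need at least two event timestamps")
    total = timestamps_s[-1] - timestamps_s[0]
    if total <= 0:
        raise ValueError("timestamps must be increasing")
    gaps = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    longest = max(gaps)
    return {
        "total_s": total,
        "events_per_s": (len(timestamps_s) - 1) / total,
        "max_gap_s": longest,
        "passed": total < budget_factor * video_duration_s and longest <= max_gap_s,
    }

# Illustrative timestamps only.
print(analyze_events([0.0, 1.0, 2.0, 3.0, 4.0], video_duration_s=10))
```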
Resilience Tests
NFT-RES-01: Loader service outage after engine initialization
Summary: Verify that detections continue working when the Loader service goes down after the engine is already loaded.
Traces to: RESTRICT-ENV-1
Preconditions:
- Engine is initialized (model already downloaded)
Fault injection:
- Stop mock-loader service
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Stop mock-loader | — |
| 2 | POST /detect with small-image | 200 OK; detection succeeds (engine already in memory) |
| 3 | GET /health | aiAvailability remains "Enabled" |
Pass criteria: Detection continues to work. Health status remains stable. No errors from loader unavailability.
NFT-RES-02: Annotations service outage during async detection
Summary: Verify that async detection completes and delivers SSE events even when Annotations service is down.
Traces to: RESTRICT-ENV-2
Preconditions:
- Engine is initialized
- SSE client connected
Fault injection:
- Stop mock-annotations mid-processing
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Start async detection: POST /detect/test-media-res01 | {"status": "started"} |
| 2 | After first few SSE events, stop mock-annotations | — |
| 3 | Continue listening to SSE | Events continue arriving. Annotations POST failures are silently caught |
| 4 | Wait for completion | Final AIProcessed event received |
Pass criteria: Detection pipeline completes fully. SSE delivery is unaffected. No crash or 500 errors.
NFT-RES-03: Engine initialization retry after transient loader failure
Summary: Verify that if model download fails on first attempt, a subsequent detection request retries initialization.
Traces to: AC-EL-2
Preconditions:
- Fresh service (engine not initialized)
Fault injection:
- Mock-loader returns 503 on first model request, then recovers
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Configure mock-loader to fail first request | — |
| 2 | POST /detect with small-image | Error (503 or 422) |
| 3 | Configure mock-loader to succeed | — |
| 4 | POST /detect with small-image | 200 OK; engine initializes on retry |
Pass criteria: Second detection succeeds after loader recovers. System does not permanently lock into error state.
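
The behavior under test is that a failed initialization is not cached. The sketch below illustrates it with two hypothetical classes: `FlakyLoader` plays the role of mock-loader and `LazyEngine` models lazy initialization. Neither reflects the service's actual code.

```python
class FlakyLoader:
    """Hypothetical stand-in for mock-loader: fails the first N model requests."""

    def __init__(self, failures=1):
        self.failures_left = failures

    def download_model(self):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("503 Service Unavailable")
        return b"model-bytes"

class LazyEngine:
    """Initializes on first use and leaves no error state behind on failure."""

    def __init__(self, loader):
        self.loader = loader
        self.model = None

    def detect(self, image):
        if self.model is None:
            # If this raises, self.model stays None and the next call retries.
            self.model = self.loader.download_model()
        return {"status": 200, "detections": []}

engine = LazyEngine(FlakyLoader(failures=1))
try:
    engine.detect(b"small-image")
except ConnectionError as exc:
    print("first attempt failed:", exc)
print("second attempt:", engine.detect(b"small-image"))
```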
NFT-RES-04: Service restart with in-memory state loss
Summary: Verify that after a service restart, all in-memory state (_active_detections, _event_queues) is cleanly reset.
Traces to: RESTRICT-OP-5, RESTRICT-OP-6
Preconditions:
- Previous detection may have been in progress
Fault injection:
- Restart detections container
Steps:
| Step | Action | Expected Behavior |
|---|---|---|
| 1 | Restart detections container | — |
| 2 | GET /health | Returns aiAvailability: "None" (fresh start) |
| 3 | POST /detect/any-media-id | Accepted (no stale _active_detections blocking it) |
Pass criteria: No stale state from previous session. All endpoints functional after restart.
Security Tests
NFT-SEC-01: Malformed multipart payload handling
Summary: Verify that the service handles malformed multipart requests without crashing.
Traces to: AC-API-2 (security)
Steps:
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Send POST /detect with truncated multipart body (missing boundary) | 400 or 422, not 500 |
| 2 | Send POST /detect with Content-Type: multipart but no file part | 400 (empty image) |
| 3 | GET /health after malformed requests | Service is still healthy |
Pass criteria: All malformed requests return 4xx. Service remains operational.
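
The malformed payloads for steps 1 and 2 can be built as raw bytes so that no HTTP client library repairs them. The sketch below is one way to do it; the form field name `file` and the boundary string are assumptions, and all function names are hypothetical.

```python
def multipart_body(boundary, filename, payload):
    """Build a well-formed multipart/form-data body with a single file part."""
    lines = [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        payload,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines)

def truncated_body(body, boundary):
    """Cut the body just before the closing boundary (step 1)."""
    closing = f"--{boundary}--".encode()
    return body[: body.rindex(closing)]

def empty_body(boundary):
    """Multipart body that contains no parts at all (step 2)."""
    return f"--{boundary}--\r\n".encode()

def is_client_error(status):
    """True for any 4xx status code."""
    return 400 <= status < 500

body = multipart_body("testboundary", "small.jpg", b"\xff\xd8fake-jpeg")
print(len(body), len(truncated_body(body, "testboundary")), empty_body("testboundary"))
```

Each body would be sent with `Content-Type: multipart/form-data; boundary=testboundary`, and the returned status checked with `is_client_error`.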
NFT-SEC-02: Oversized request body
Summary: Verify system behavior when an extremely large file is uploaded.
Traces to: RESTRICT-OP-4
Steps:
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | Send POST /detect with a 500 MB random file | Error response (413, 400, or timeout), not an OOM crash |
| 2 | GET /health | Service is still running |
Pass criteria: Service does not crash or run out of memory. Returns an error or times out gracefully.
NFT-SEC-03: JWT token is forwarded without modification
Summary: Verify that the Authorization header is forwarded to the Annotations service as-is.
Traces to: AC-API-3
Steps:
| Step | Consumer Action | Expected Response |
|---|---|---|
| 1 | POST /detect/test-media-sec with Authorization: Bearer test-jwt-123 and x-refresh-token: refresh-456 | {"status": "started"} |
| 2 | After processing, query mock-annotations GET /mock/annotations | Recorded request contains Authorization: Bearer test-jwt-123 header |
Pass criteria: Exact token received by mock-annotations matches what the consumer sent.
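
The comparison in step 2 needs a case-insensitive header name match and an exact value match. The sketch below assumes the recorded request exposes its headers as a plain dict, since the response shape of GET /mock/annotations is not specified here. `forwarded_unchanged` is a hypothetical helper.

```python
def forwarded_unchanged(sent_headers, recorded_headers, name="Authorization"):
    """True if the recorded header value is byte-for-byte what the consumer sent."""
    recorded = {key.lower(): value for key, value in recorded_headers.items()}
    return recorded.get(name.lower()) == sent_headers[name]

sent = {"Authorization": "Bearer test-jwt-123", "x-refresh-token": "refresh-456"}
recorded = {"authorization": "Bearer test-jwt-123"}
print(forwarded_unchanged(sent, recorded))
```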
Resource Limit Tests
NFT-RES-LIM-01: ThreadPoolExecutor worker limit (2 concurrent)
Summary: Verify that no more than 2 inference operations run simultaneously.
Traces to: RESTRICT-HW-3
Preconditions:
- Engine is initialized
Monitoring:
- Track concurrent request timings
Steps:
| Step | Consumer Action | Expected Behavior |
|---|---|---|
| 1 | Send 4 concurrent POST /detect requests | — |
| 2 | Measure response arrival times | First 2 complete roughly together; next 2 complete after |
Duration: ~60s
Pass criteria: Clear evidence of 2-at-a-time processing (second batch starts after first completes). All 4 requests eventually succeed.
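
"Roughly together" in step 2 can be made measurable by grouping completion times into waves. In the sketch below, `group_into_waves` is a hypothetical helper, and the gap threshold is an assumption that should be set well below the duration of a single inference.

```python
def group_into_waves(completion_times_s, gap_s):
    """Group sorted completion times; a gap larger than gap_s starts a new wave."""
    if not completion_times_s:
        return []
    ordered = sorted(completion_times_s)
    waves = [[ordered[0]]]
    for t in ordered[1:]:
        if t - waves[-1][-1] > gap_s:
            waves.append([t])
        else:
            waves[-1].append(t)
    return waves

# Illustrative completion times for 4 requests, in seconds since submission.
waves = group_into_waves([1.00, 1.05, 2.00, 2.10], gap_s=0.5)
print(waves, all(len(w) <= 2 for w in waves))
```

Two waves of two completions each match a 2-worker limit, while a single wave of four would suggest the limit is not enforced.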
NFT-RES-LIM-02: SSE queue depth limit (100 events)
Summary: Verify that the SSE queue per client does not exceed 100 events.
Traces to: AC-API-4
Preconditions:
- Engine is initialized
Monitoring:
- SSE event count
Steps:
| Step | Consumer Action | Expected Behavior |
|---|---|---|
| 1 | Open SSE connection but do not read (stall client) | — |
| 2 | Trigger async detection that produces > 100 events | — |
| 3 | After processing completes, drain the SSE queue | ≤ 100 events received |
Duration: ~120s
Pass criteria: No more than 100 events buffered. No OOM or connection errors from queue growth.
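
The expected outcome can be pictured with a bounded queue. The sketch below uses the standard-library `queue.Queue` and drops the newest event when the queue is full. Both choices are assumptions, since the service may use a different queue type or drop policy, and `enqueue_bounded` is a hypothetical name.

```python
import queue

def enqueue_bounded(q, event):
    """Store the event without blocking; return False if the queue is full and the event is dropped."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False

events = queue.Queue(maxsize=100)  # per-client limit from this test case
stored = sum(enqueue_bounded(events, i) for i in range(150))
print(stored, events.qsize())
```

A stalled client therefore sees at most 100 buffered events when it resumes reading, whatever the number produced.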
NFT-RES-LIM-03: Max 300 detections per frame
Summary: Verify that the system returns at most 300 detections per frame (model output limit).
Traces to: RESTRICT-SW-6
Preconditions:
- Engine is initialized
- Image with dense scene expected to produce many detections
Monitoring:
- Detection count per response
Duration: ~30s
Pass criteria: No response contains more than 300 detections. Dense images hit the cap without errors.
NFT-RES-LIM-04: Log file rotation and retention
Summary: Verify that log files rotate daily and are retained for 30 days.
Traces to: AC-LOG-1, AC-LOG-2
Preconditions:
- Detections service running with Logs/ volume mounted for inspection
Monitoring:
- Log file creation, naming, and count
Steps:
| Step | Consumer Action | Expected Behavior |
|---|---|---|
| 1 | Make several detection requests | Logs written to Logs/log_inference_YYYYMMDD.txt |
| 2 | Verify log file name matches current date | File name contains today's date |
| 3 | Verify log content format | Contains INFO/DEBUG/WARNING entries with timestamps |
Duration: ~10s
Pass criteria: Log file exists with correct date-based naming. Content includes structured log entries.
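
The naming check in step 2, and a retention check the steps above do not cover, might look like the sketch below. It assumes the `log_inference_YYYYMMDD.txt` pattern from step 1 and the 30-day retention from the summary; `expected_log_name` and `expired_logs` are hypothetical helpers.

```python
import re
from datetime import date, datetime, timedelta

LOG_PATTERN = re.compile(r"^log_inference_(\d{8})\.txt$")

def expected_log_name(day):
    """File name the service is expected to write on the given day."""
    return f"log_inference_{day:%Y%m%d}.txt"

def expired_logs(file_names, today, retention_days=30):
    """Return log files older than the retention window; other files are ignored."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in file_names:
        match = LOG_PATTERN.match(name)
        if match is None:
            continue
        day = datetime.strptime(match.group(1), "%Y%m%d").date()
        if day < cutoff:
            expired.append(name)
    return expired

print(expected_log_name(date.today()))
```

An empty result from `expired_logs` over the contents of the Logs/ volume would indicate that retention cleanup is working.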