detections/_docs/02_tasks/done/AZ-146_test_performance.md
Performance Tests

  • Task: AZ-146_test_performance
  • Name: Performance Tests
  • Description: Implement E2E tests measuring detection latency, concurrent inference throughput, tiling overhead, and video processing frame rate
  • Complexity: 3 points
  • Dependencies: AZ-138_test_infrastructure
  • Component: Integration Tests
  • Jira: AZ-146
  • Epic: AZ-137

Problem

Performance characteristics must be baselined and verified: single image latency, concurrent request handling with the 2-worker ThreadPoolExecutor, tiling overhead for large images, and video processing frame rate. These tests establish performance contracts.

Outcome

  • Single image latency profiled (p50, p95, p99) for warm engine
  • Concurrent inference behavior validated (2-at-a-time processing confirmed)
  • Large image tiling overhead measured and bounded
  • Video processing frame rate baselined

Scope

Included

  • NFT-PERF-01: Single image detection latency
  • NFT-PERF-02: Concurrent inference throughput
  • NFT-PERF-03: Large image tiling processing time
  • NFT-PERF-04: Video processing frame rate

Excluded

  • GPU vs CPU comparative benchmarks
  • Memory usage profiling
  • Load testing beyond 4 concurrent requests

Acceptance Criteria

AC-1: Single image latency
Given a warm engine
When 10 sequential POST /detect requests are sent with small-image
Then p95 latency < 5000ms (ONNX CPU) or p95 < 1000ms (TensorRT GPU)

AC-2: Concurrent throughput
Given a warm engine
When 2 concurrent POST /detect requests are sent
Then both complete without error
And 3 concurrent requests show queuing (total time > time for 2)

AC-3: Tiling overhead
Given a warm engine
When POST /detect is sent with large-image (4000x3000)
Then the request completes within 120s
And processing time scales proportionally with tile count

AC-4: Video frame rate
Given a warm engine with SSE connected
When async detection processes test-video with frame_period=4
Then processing completes within 5x video duration (< 50s)
And frame processing rate is consistent (no stalls > 10s)
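Stall detection reduces to checking the gaps between per-frame timestamps recorded as SSE events arrive. The helper below is independent of the event-collection mechanics, which depend on the (assumed) SSE client used by the test infrastructure:

```python
def max_gap(timestamps):
    """Largest interval between consecutive frame-event timestamps (seconds)."""
    ordered = sorted(timestamps)
    return max((b - a for a, b in zip(ordered, ordered[1:])), default=0.0)

# usage, after recording time.perf_counter() once per SSE frame event
# (frame_times and total_elapsed come from the assumed collection loop):
#   assert total_elapsed < 50          # 5x the ~10s test video
#   assert max_gap(frame_times) < 10   # no stalls > 10s
```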

Non-Functional Requirements

Performance

  • Tests themselves should complete within defined bounds
  • Results should be logged for trend analysis

Integration Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | Est. Duration |
| --- | --- | --- | --- | --- |
| AC-1 | Engine warm | 10 sequential detections | p95 < 5000ms (CPU) | ~60s |
| AC-2 | Engine warm | 2 then 3 concurrent requests | Queuing observed at 3 | ~30s |
| AC-3 | Engine warm, large-image | Single large image detection | Completes < 120s | ~120s |
| AC-4 | Engine warm, SSE connected | Video detection | < 50s, consistent rate | ~120s |

Constraints

  • Pass criteria differ between CPU (ONNX) and GPU (TensorRT) profiles
  • Concurrent request tests must account for connection overhead
  • Video frame rate depends on hardware; test measures consistency, not absolute speed
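One way to keep the CPU/GPU split in a single place is a profile lookup selected by environment variable. The `INFERENCE_PROFILE` variable name and profile keys here are assumptions for illustration, not confirmed configuration:

```python
import os

# hypothetical profile switch; names and values are illustrative assumptions
THRESHOLDS_MS = {"onnx-cpu": 5000, "tensorrt-gpu": 1000}

def latency_threshold_ms(profile=None):
    """Pick the AC-1 p95 latency budget for the active inference profile."""
    profile = profile or os.environ.get("INFERENCE_PROFILE", "onnx-cpu")
    return THRESHOLDS_MS[profile]
```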

Risks & Mitigation

Risk 1: CI hardware variability

  • Risk: Latency thresholds may fail on slower CI hardware
  • Mitigation: Use generous thresholds; mark as performance benchmark tests that can be skipped in resource-constrained CI