ai-training/_docs/02_tasks/done/AZ-146_test_performance.md
Oleksandr Bezdieniezhnykh cbf370c765 Refactor task management structure and update documentation
- Changed the directory structure for task specifications to include a dedicated `todo/` folder within `_docs/02_tasks/` for tasks ready for implementation.
- Updated references in various skills and documentation to reflect the new task lifecycle, including changes in the `implementer` and `decompose` skills.
- Enhanced the README and flow documentation to clarify the new task organization and its implications for the implementation process.

These updates improve task management clarity and streamline the implementation workflow.
2026-03-28 01:17:45 +02:00


Performance Tests

Task: AZ-146_test_performance
Name: Performance Tests
Description: Implement E2E tests measuring detection latency, concurrent inference throughput, tiling overhead, and video processing frame rate
Complexity: 3 points
Dependencies: AZ-138_test_infrastructure
Component: Integration Tests
Jira: AZ-146
Epic: AZ-137

Problem

Performance characteristics must be baselined and verified: single image latency, concurrent request handling with the 2-worker ThreadPoolExecutor, tiling overhead for large images, and video processing frame rate. These tests establish performance contracts.

Outcome

  • Single image latency profiled (p50, p95, p99) for warm engine
  • Concurrent inference behavior validated (2-at-a-time processing confirmed)
  • Large image tiling overhead measured and bounded
  • Video processing frame rate baselined

Scope

Included

  • NFT-PERF-01: Single image detection latency
  • NFT-PERF-02: Concurrent inference throughput
  • NFT-PERF-03: Large image tiling processing time
  • NFT-PERF-04: Video processing frame rate

Excluded

  • GPU vs CPU comparative benchmarks
  • Memory usage profiling
  • Load testing beyond 4 concurrent requests

Acceptance Criteria

AC-1: Single image latency
Given a warm engine
When 10 sequential POST /detect requests are sent with small-image
Then p95 latency < 5000ms for ONNX CPU, or p95 < 1000ms for TensorRT GPU
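A latency profile for AC-1 could be computed along these lines. The nearest-rank percentile helper is self-contained; in the real test the input list would hold wall-clock times of 10 sequential POST /detect calls against a warm engine (note that with only 10 samples, p95 and p99 both reduce to the slowest request).

```python
import math

def percentile(samples: list[float], q: float) -> float:
    """Nearest-rank percentile (q in 0..100) over measured latencies."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Profile p50/p95/p99 as required by AC-1."""
    return {f"p{q}": percentile(latencies_ms, q) for q in (50, 95, 99)}
```

The test would then assert `summarize(latencies)["p95"] < 5000` (or `< 1000` on the TensorRT GPU profile).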

AC-2: Concurrent throughput
Given a warm engine
When 2 concurrent POST /detect requests are sent
Then both complete without error
And 3 concurrent requests show queuing (total time for 3 exceeds the time for 2, because the third request waits for a free worker)
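The queuing effect AC-2 checks for can be sketched by mirroring the server's 2-worker ThreadPoolExecutor locally; `time.sleep` stands in for a detection call here, so the timings are illustrative rather than the service's actual latencies.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def elapsed_for(n_requests: int, work_s: float = 0.1) -> float:
    """Total wall time for n simulated detections on a 2-worker pool."""
    with ThreadPoolExecutor(max_workers=2) as pool:  # mirrors the server's pool size
        start = time.monotonic()
        futures = [pool.submit(time.sleep, work_s) for _ in range(n_requests)]
        for f in futures:
            f.result()  # surface any errors, as AC-2 requires
        return time.monotonic() - start

two = elapsed_for(2)    # both run in parallel: ~1x work_s
three = elapsed_for(3)  # third request queues: ~2x work_s
```

The E2E version replaces the sleeps with concurrent POST /detect calls and asserts `three > two` to confirm 2-at-a-time processing.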

AC-3: Tiling overhead
Given a warm engine
When POST /detect is sent with large-image (4000x3000)
Then the request completes within 120s
And processing time scales proportionally with tile count
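To assert proportional scaling, the test needs an expected tile count for a given image size. A minimal sketch of a sliding-window tile counter follows; the 640px tile size and 64px overlap are illustrative assumptions, not the service's actual tiling parameters.

```python
import math

def tile_count(width: int, height: int, tile: int = 640, overlap: int = 64) -> int:
    """Tiles needed to cover an image with an overlapping sliding window.

    Tile size and overlap are placeholder values; substitute the
    service's real tiling configuration.
    """
    stride = tile - overlap
    cols = max(1, math.ceil((width - overlap) / stride))
    rows = max(1, math.ceil((height - overlap) / stride))
    return cols * rows

large = tile_count(4000, 3000)  # the AC-3 large-image case
small = tile_count(640, 640)    # single-tile baseline
```

The proportionality check would then bound `time_large / time_small` by some tolerance factor around `large / small`.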

AC-4: Video frame rate
Given a warm engine with SSE connected
When async detection processes test-video with frame_period=4
Then processing completes within 5x the video duration (< 50s)
And the frame processing rate is consistent (no stalls > 10s)
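The "no stalls" clause of AC-4 reduces to checking the largest gap between consecutive frame-completion timestamps. A self-contained sketch is below; in the real test the timestamps would be recorded as SSE progress events arrive (the exact event names and payload shape are not specified here).

```python
def max_gap_s(frame_times: list[float]) -> float:
    """Largest gap between consecutive frame-completion timestamps (seconds)."""
    if len(frame_times) < 2:
        return 0.0
    ordered = sorted(frame_times)
    return max(b - a for a, b in zip(ordered, ordered[1:]))

def is_consistent(frame_times: list[float], stall_limit_s: float = 10.0) -> bool:
    """AC-4 consistency check: no inter-frame stall longer than the limit."""
    return max_gap_s(frame_times) <= stall_limit_s
```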

Non-Functional Requirements

Performance

  • Tests themselves should complete within defined bounds
  • Results should be logged for trend analysis
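Logging results for trend analysis could be as simple as appending one JSON line per run, which CI can later aggregate and plot. The file name and record fields below are assumptions, not an existing convention in this repo.

```python
import json
import time
from pathlib import Path

def log_result(path: Path, test_id: str, metrics: dict) -> None:
    """Append one JSON line per test run so CI can track latency trends."""
    record = {"ts": time.time(), "test": test_id, **metrics}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

A run might call `log_result(Path("perf_results.jsonl"), "NFT-PERF-01", {"p95_ms": 4200.0})` after each test.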

Integration Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|-------------------------|--------------|-------------------|----------------|
| AC-1 | Engine warm | 10 sequential detections | p95 < 5000ms (CPU) | ~60s |
| AC-2 | Engine warm | 2 then 3 concurrent requests | Queuing observed at 3 | ~30s |
| AC-3 | Engine warm, large-image | Single large image detection | Completes < 120s | ~120s |
| AC-4 | Engine warm, SSE connected | Video detection | < 50s, consistent rate | ~120s |

Constraints

  • Pass criteria differ between CPU (ONNX) and GPU (TensorRT) profiles
  • Concurrent request tests must account for connection overhead
  • Video frame rate depends on hardware; test measures consistency, not absolute speed

Risks & Mitigation

Risk 1: CI hardware variability

  • Risk: Latency thresholds may fail on slower CI hardware
  • Mitigation: Use generous thresholds; mark as performance benchmark tests that can be skipped in resource-constrained CI