Test Infrastructure
Task: AZ-138_test_infrastructure
Name: Test Infrastructure
Description: Scaffold the E2E test project — test runner, mock services, Docker test environment, test data fixtures, reporting
Complexity: 5 points
Dependencies: None
Component: Integration Tests
Jira: AZ-138
Epic: AZ-137
Test Project Folder Layout
```
e2e/
├── conftest.py
├── requirements.txt
├── Dockerfile
├── pytest.ini
├── mocks/
│   ├── loader/
│   │   ├── Dockerfile
│   │   └── app.py
│   └── annotations/
│       ├── Dockerfile
│       └── app.py
├── fixtures/
│   ├── image_small.jpg            (1280×720 JPEG, aerial, detectable objects)
│   ├── image_large.JPG            (6252×4168 JPEG, triggers tiling)
│   ├── image_dense01.jpg          (1280×720 JPEG, dense scene, clustered objects)
│   ├── image_dense02.jpg          (1920×1080 JPEG, dense scene variant)
│   ├── image_different_types.jpg  (900×1600 JPEG, varied object classes)
│   ├── image_empty_scene.jpg      (1920×1080 JPEG, no detectable objects)
│   ├── video_short01.mp4          (short MP4 with moving objects)
│   ├── video_short02.mp4          (short MP4 variant for concurrent tests)
│   ├── video_long03.mp4           (long MP4, generates >100 SSE events)
│   ├── empty_image                (zero-byte file, generated at build)
│   ├── corrupt_image              (random binary garbage, generated at build)
│   ├── classes.json               (19 classes, 3 weather modes, MaxSizeM values)
│   └── azaion.onnx                (YOLO ONNX model, 1280×1280 input, 19 classes, 81MB)
├── tests/
│   ├── test_health_engine.py
│   ├── test_single_image.py
│   ├── test_tiling.py
│   ├── test_async_sse.py
│   ├── test_video.py
│   ├── test_negative.py
│   ├── test_resilience.py
│   ├── test_performance.py
│   ├── test_security.py
│   └── test_resource_limits.py
└── docker-compose.test.yml
```
Layout Rationale
- `mocks/` separated from `tests/` — each mock is a standalone Docker service with its own Dockerfile
- `fixtures/` holds all static test data, volume-mounted into containers
- `tests/` organized by test category matching the test spec structure (one file per task group)
- `conftest.py` provides shared pytest fixtures (HTTP clients, SSE helpers, service readiness checks)
- `pytest.ini` configures markers for gpu/cpu profiles and test ordering
Mock Services
| Mock Service | Replaces | Endpoints | Behavior |
|---|---|---|---|
| mock-loader | Loader service (model download/upload) | `GET /models/azaion.onnx` — serves ONNX model from volume. `POST /upload` — accepts TensorRT engine upload, stores in memory. `POST /mock/config` — control API (simulate 503, reset state). `GET /mock/status` — returns mock state. | Deterministic: serves model file from `/models/` volume. Configurable downtime via control endpoint. First-request-fail mode for retry tests. |
| mock-annotations | Annotations service (result posting, token refresh) | `POST /annotations` — accepts annotation POST, stores in memory. `POST /auth/refresh` — returns refreshed token. `POST /mock/config` — control API (simulate 503, reset state). `GET /mock/annotations` — returns recorded annotations for assertion. | Records all incoming annotations in memory. Provides token refresh. Configurable downtime. Assertions via GET endpoint to verify what was received. |
Mock Control API
Both mock services expose:
- `POST /mock/config` — accepts JSON `{"mode": "normal"|"error"|"first_fail"}` to control behavior
- `POST /mock/reset` — clears recorded state (annotations, uploads)
- `GET /mock/status` — returns current mode and recorded interaction count
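The behavior behind these endpoints can be captured by a small in-memory state object. The sketch below is illustrative only — the class and method names are assumptions, not the actual `app.py` implementation:

```python
# Sketch of the mode-switching state behind the mock control API.
# Names (MockState, configure, record) are illustrative assumptions.
class MockState:
    MODES = {"normal", "error", "first_fail"}

    def __init__(self):
        self.mode = "normal"
        self.recorded = []        # interactions kept for later assertions
        self._failed_once = False

    def configure(self, mode: str) -> None:
        """POST /mock/config — switch behavior mode."""
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self._failed_once = False

    def reset(self) -> None:
        """POST /mock/reset — clear recorded state, keep current mode."""
        self.recorded.clear()
        self._failed_once = False

    def status_code(self) -> int:
        """HTTP status the mock should answer with for the next request."""
        if self.mode == "error":
            return 503
        if self.mode == "first_fail" and not self._failed_once:
            self._failed_once = True
            return 503            # fail only the first request (retry tests)
        return 200

    def record(self, payload) -> None:
        """Remember an incoming request body for GET-based assertions."""
        self.recorded.append(payload)
```

In `first_fail` mode the first `status_code()` call returns 503 and every later call returns 200, which is what the Loader retry tests rely on.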
Docker Test Environment
docker-compose.test.yml Structure
| Service | Image / Build | Purpose | Depends On |
|---|---|---|---|
| detections | Build from repo root (`Dockerfile`) | System under test — FastAPI detection service | mock-loader, mock-annotations |
| mock-loader | Build from `e2e/mocks/loader/` | Serves ONNX model, accepts TensorRT uploads | — |
| mock-annotations | Build from `e2e/mocks/annotations/` | Accepts annotation results, provides token refresh | — |
| e2e-consumer | Build from `e2e/` | pytest test runner | detections |
Networks and Volumes
Network: e2e-net — isolated bridge network, all services communicate via hostnames
Volumes:
| Volume | Mount Target | Content |
|---|---|---|
| test-models | mock-loader:/models | azaion.onnx model file |
| test-media | e2e-consumer:/media | Test images and video files |
| test-classes | detections:/app/classes.json | classes.json with 19 detection classes |
| test-results | e2e-consumer:/results | CSV test report output |
GPU Profile
Two Docker Compose profiles:
- cpu (default): `detections` runs without GPU runtime, exercises ONNX fallback path
- gpu: `detections` runs with `runtime: nvidia` and `NVIDIA_VISIBLE_DEVICES=all`, exercises TensorRT path
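The profile split might be wired in Compose roughly as follows — a sketch only (whether the GPU variant is a second service or a compose override file is an implementation choice; the keys shown are standard Compose options):

```yaml
# Sketch: profile wiring only, not the full docker-compose.test.yml
services:
  detections:
    build: .
    profiles: ["cpu"]        # ONNX fallback path, no GPU runtime

  detections-gpu:            # assumed name; an override file is an alternative
    build: .
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    profiles: ["gpu"]        # TensorRT path
```

Note that Compose only starts profiled services when the profile is selected, e.g. `COMPOSE_PROFILES=cpu docker compose up` for the default run or `docker compose --profile gpu up` for the GPU run.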
Environment Variables (detections service)
| Variable | Value | Purpose |
|---|---|---|
| LOADER_URL | http://mock-loader:8080 | Points to mock Loader |
| ANNOTATIONS_URL | http://mock-annotations:8081 | Points to mock Annotations |
Test Runner Configuration
Framework: pytest
Plugins: `pytest-csv` (CSV reporting), `pytest-timeout` (per-test timeouts); supporting libraries: `requests` (HTTP client), `sseclient-py` (SSE streaming)
Entry point: `pytest --csv=/results/report.csv -v`
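A `pytest.ini` consistent with the entry point and the gpu/cpu markers above might look like this (the timeout value and exact marker descriptions are assumptions):

```ini
[pytest]
addopts = --csv=/results/report.csv -v
markers =
    gpu: requires the gpu compose profile (TensorRT path)
    cpu: runs under the default cpu profile (ONNX fallback)
timeout = 120
```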
Fixture Strategy
| Fixture | Scope | Purpose |
|---|---|---|
| `base_url` | session | Detections service base URL (`http://detections:8000`) |
| `http_client` | session | `requests.Session` configured with base URL and default timeout |
| `sse_client_factory` | function | Factory that opens SSE connection to `/detect/stream` |
| `mock_loader_url` | session | Mock-loader base URL for control API calls |
| `mock_annotations_url` | session | Mock-annotations base URL for control API and assertion calls |
| `wait_for_services` | session (autouse) | Polls health endpoints until all services are ready |
| `reset_mocks` | function (autouse) | Calls `POST /mock/reset` on both mocks before each test |
| `image_small` | session | Reads `image_small.jpg` from `/media/` volume |
| `image_large` | session | Reads `image_large.JPG` from `/media/` volume |
| `image_dense` | session | Reads `image_dense01.jpg` from `/media/` volume |
| `image_dense_02` | session | Reads `image_dense02.jpg` from `/media/` volume |
| `image_different_types` | session | Reads `image_different_types.jpg` from `/media/` volume |
| `image_empty_scene` | session | Reads `image_empty_scene.jpg` from `/media/` volume |
| `video_short_path` | session | Path to `video_short01.mp4` on `/media/` volume |
| `video_short_02_path` | session | Path to `video_short02.mp4` on `/media/` volume |
| `video_long_path` | session | Path to `video_long03.mp4` on `/media/` volume |
| `empty_image` | session | Reads zero-byte file |
| `corrupt_image` | session | Reads random binary file |
| `jwt_token` | function | Generates a valid JWT with `exp` claim for auth tests |
| `warm_engine` | module | Sends one detection request to initialize the engine; used by tests that need a warm engine |
Test Data Fixtures
| Data Set | Source | Format | Used By |
|---|---|---|---|
| azaion.onnx | `input_data/azaion.onnx` | ONNX (1280×1280 input, 19 classes, 81MB) | All detection tests (via mock-loader) |
| classes.json | repo root `classes.json` | JSON (19 objects with Id, Name, Color, MaxSizeM) | All tests (volume mount to detections) |
| image_small.jpg | `input_data/image_small.jpg` | JPEG 1280×720 | Health, single image, filtering, negative, performance tests |
| image_large.JPG | `input_data/image_large.JPG` | JPEG 6252×4168 | Tiling tests, performance tests |
| image_dense01.jpg | `input_data/image_dense01.jpg` | JPEG 1280×720 dense scene | Dedup tests, detection cap tests |
| image_dense02.jpg | `input_data/image_dense02.jpg` | JPEG 1920×1080 dense scene | Dedup variant |
| image_different_types.jpg | `input_data/image_different_types.jpg` | JPEG 900×1600 varied classes | Weather mode class variant tests |
| image_empty_scene.jpg | `input_data/image_empty_scene.jpg` | JPEG 1920×1080 empty | Zero-detection edge case |
| video_short01.mp4 | `input_data/video_short01.mp4` | MP4 short video | Async, SSE, video processing tests |
| video_short02.mp4 | `input_data/video_short02.mp4` | MP4 short video variant | Concurrent, resilience tests |
| video_long03.mp4 | `input_data/video_long03.mp4` | MP4 long video (288MB) | SSE overflow, queue depth tests |
| empty_image | Generated at build | Zero-byte file | FT-N-01 |
| corrupt_image | Generated at build | Random binary | FT-N-02 |
Data Isolation
Each test run starts with fresh containers (`docker compose down -v && docker compose up`). The detections service is stateless — no persistent data between runs. Mock services reset state via `POST /mock/reset` before each test. Tests that modify mock behavior (e.g., making the loader unreachable) run with function-scoped mock resets.
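Because every run begins with containers still booting, the `wait_for_services` fixture has to poll readiness before any test fires. A sketch of such a helper, with an injectable `probe` so it can be exercised without real HTTP (the function name, timeout, and probe signature are illustrative):

```python
import time


def wait_until_ready(probe, urls, timeout_s=60.0, interval_s=0.5):
    """Poll probe(url) -> bool for every URL until all report healthy,
    or raise TimeoutError when the deadline passes.

    `probe` would typically wrap requests.get(url) and check for HTTP 200;
    injecting it keeps the polling logic testable offline.
    """
    deadline = time.monotonic() + timeout_s
    pending = list(urls)
    while pending:
        pending = [u for u in pending if not probe(u)]
        if not pending:
            return True
        if time.monotonic() > deadline:
            raise TimeoutError(f"services not ready: {pending}")
        time.sleep(interval_s)
    return True
```

A session-scoped autouse fixture would call this once with the health URLs of detections and both mocks.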
Test Reporting
Format: CSV
Columns: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
Output path: /results/report.csv → mounted to ./e2e-results/report.csv on host
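The report contract can be checked with the stdlib `csv` module. A sketch — the exact header strings are assumptions derived from the column list above, and the sample rows in the usage note are invented for illustration:

```python
import csv
import io

# Assumed header strings, derived from the column spec above.
EXPECTED_COLUMNS = [
    "Test ID",
    "Test Name",
    "Execution Time (ms)",
    "Result (PASS/FAIL/SKIP)",
    "Error Message (if FAIL)",
]


def summarize_report(csv_text: str) -> dict:
    """Validate the report header and count results per outcome."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    counts = {"PASS": 0, "FAIL": 0, "SKIP": 0}
    for row in reader:
        counts[row["Result (PASS/FAIL/SKIP)"]] += 1
    return counts
```

A CI gate could then read `./e2e-results/report.csv` on the host and fail the build when `counts["FAIL"] > 0`.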
Acceptance Criteria
AC-1: Test environment starts
Given the docker-compose.test.yml
When docker compose -f docker-compose.test.yml up is executed
Then all services start and the detections service is reachable at http://detections:8000/health
AC-2: Mock services respond
Given the test environment is running
When the e2e-consumer sends requests to mock-loader and mock-annotations
Then mock services respond with configured behavior and record interactions
AC-3: Test runner executes
Given the test environment is running
When the e2e-consumer starts
Then pytest discovers and executes test files from tests/ directory
AC-4: Test report generated
Given tests have been executed
When the test run completes
Then /results/report.csv exists with columns: Test ID, Test Name, Execution Time, Result, Error Message