ai-training/_docs/02_tasks/done/AZ-138_test_infrastructure.md
Oleksandr Bezdieniezhnykh cbf370c765 Refactor task management structure and update documentation
- Changed the directory structure for task specifications to include a dedicated `todo/` folder within `_docs/02_tasks/` for tasks ready for implementation.
- Updated references in various skills and documentation to reflect the new task lifecycle, including changes in the `implementer` and `decompose` skills.
- Enhanced the README and flow documentation to clarify the new task organization and its implications for the implementation process.

These updates improve task management clarity and streamline the implementation workflow.
2026-03-28 01:17:45 +02:00


Test Infrastructure

Task: AZ-138_test_infrastructure
Name: Test Infrastructure
Description: Scaffold the E2E test project: test runner, mock services, Docker test environment, test data fixtures, reporting
Complexity: 5 points
Dependencies: None
Component: Integration Tests
Jira: AZ-138
Epic: AZ-137

Test Project Folder Layout

e2e/
├── conftest.py
├── requirements.txt
├── Dockerfile
├── pytest.ini
├── mocks/
│   ├── loader/
│   │   ├── Dockerfile
│   │   └── app.py
│   └── annotations/
│       ├── Dockerfile
│       └── app.py
├── fixtures/
│   ├── image_small.jpg          (1280×720 JPEG, aerial, detectable objects)
│   ├── image_large.JPG          (6252×4168 JPEG, triggers tiling)
│   ├── image_dense01.jpg        (1280×720 JPEG, dense scene, clustered objects)
│   ├── image_dense02.jpg        (1920×1080 JPEG, dense scene variant)
│   ├── image_different_types.jpg (900×1600 JPEG, varied object classes)
│   ├── image_empty_scene.jpg    (1920×1080 JPEG, no detectable objects)
│   ├── video_short01.mp4        (short MP4 with moving objects)
│   ├── video_short02.mp4        (short MP4 variant for concurrent tests)
│   ├── video_long03.mp4         (long MP4, generates >100 SSE events)
│   ├── empty_image              (zero-byte file, generated at build)
│   ├── corrupt_image            (random binary garbage, generated at build)
│   ├── classes.json             (19 classes, 3 weather modes, MaxSizeM values)
│   └── azaion.onnx              (YOLO ONNX model, 1280×1280 input, 19 classes, 81MB)
├── tests/
│   ├── test_health_engine.py
│   ├── test_single_image.py
│   ├── test_tiling.py
│   ├── test_async_sse.py
│   ├── test_video.py
│   ├── test_negative.py
│   ├── test_resilience.py
│   ├── test_performance.py
│   ├── test_security.py
│   └── test_resource_limits.py
└── docker-compose.test.yml

Layout Rationale

  • mocks/ separated from tests — each mock is a standalone Docker service with its own Dockerfile
  • fixtures/ holds all static test data, volume-mounted into containers
  • tests/ organized by test category matching the test spec structure (one file per task group)
  • conftest.py provides shared pytest fixtures (HTTP clients, SSE helpers, service readiness checks)
  • pytest.ini configures markers for gpu/cpu profiles and test ordering

Mock Services

| Mock Service | Replaces | Endpoints | Behavior |
| --- | --- | --- | --- |
| mock-loader | Loader service (model download/upload) | GET /models/azaion.onnx: serves ONNX model from volume. POST /upload: accepts TensorRT engine upload, stores in memory. POST /mock/config: control API (simulate 503, reset state). GET /mock/status: returns mock state. | Deterministic: serves model file from /models/ volume. Configurable downtime via control endpoint. First-request-fail mode for retry tests. |
| mock-annotations | Annotations service (result posting, token refresh) | POST /annotations: accepts annotation POST, stores in memory. POST /auth/refresh: returns refreshed token. POST /mock/config: control API (simulate 503, reset state). GET /mock/annotations: returns recorded annotations for assertion. | Records all incoming annotations in memory. Provides token refresh. Configurable downtime. Assertions via GET endpoint to verify what was received. |

Mock Control API

Both mock services expose:

  • POST /mock/config — accepts JSON {"mode": "normal"|"error"|"first_fail"} to control behavior
  • POST /mock/reset — clears recorded state (annotations, uploads)
  • GET /mock/status — returns current mode and recorded interaction count
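The three control endpoints above can be backed by one small in-memory state object. The following is a minimal, framework-agnostic sketch, not the actual `app.py`: the class name `MockState`, the `interactions` key in the status payload, the choice to record only successful requests, and the assumption that a reset also restores `normal` mode are all illustrative and not stated in the spec.

```python
from dataclasses import dataclass, field
from typing import Any

VALID_MODES = {"normal", "error", "first_fail"}


@dataclass
class MockState:
    """In-memory state shared by a mock's real and control endpoints."""

    mode: str = "normal"
    interactions: list[Any] = field(default_factory=list)
    _failed_once: bool = False

    def configure(self, payload: dict) -> None:
        """Handle POST /mock/config with a body such as {"mode": "error"}."""
        mode = payload.get("mode")
        if mode not in VALID_MODES:
            raise ValueError(f"unsupported mode: {mode!r}")
        self.mode = mode
        self._failed_once = False

    def reset(self) -> None:
        """Handle POST /mock/reset: drop recorded interactions.

        Assumption: a reset also returns the mock to normal mode.
        """
        self.interactions.clear()
        self.mode = "normal"
        self._failed_once = False

    def status(self) -> dict:
        """Handle GET /mock/status: current mode and interaction count."""
        return {"mode": self.mode, "interactions": len(self.interactions)}

    def handle(self, request_body: Any) -> int:
        """Return the HTTP status a real endpoint should answer with."""
        if self.mode == "error":
            return 503
        if self.mode == "first_fail" and not self._failed_once:
            # Fail exactly one request, then behave normally (retry tests).
            self._failed_once = True
            return 503
        self.interactions.append(request_body)
        return 200
```

Each route handler in a mock would call `handle()` first and return its status code, so the same object serves both mock-loader and mock-annotations regardless of the web framework chosen.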

Docker Test Environment

docker-compose.test.yml Structure

| Service | Image / Build | Purpose | Depends On |
| --- | --- | --- | --- |
| detections | Build from repo root (Dockerfile) | System under test: FastAPI detection service | mock-loader, mock-annotations |
| mock-loader | Build from e2e/mocks/loader/ | Serves ONNX model, accepts TensorRT uploads | |
| mock-annotations | Build from e2e/mocks/annotations/ | Accepts annotation results, provides token refresh | |
| e2e-consumer | Build from e2e/ | pytest test runner | detections |

Networks and Volumes

Network: e2e-net — isolated bridge network, all services communicate via hostnames

Volumes:

| Volume | Mount Target | Content |
| --- | --- | --- |
| test-models | mock-loader:/models | azaion.onnx model file |
| test-media | e2e-consumer:/media | Test images and video files |
| test-classes | detections:/app/classes.json | classes.json with 19 detection classes |
| test-results | e2e-consumer:/results | CSV test report output |

GPU Profile

Two Docker Compose profiles:

  • cpu (default): detections runs without GPU runtime, exercises ONNX fallback path
  • gpu: detections runs with runtime: nvidia and NVIDIA_VISIBLE_DEVICES=all, exercises TensorRT path

Environment Variables (detections service)

| Variable | Value | Purpose |
| --- | --- | --- |
| LOADER_URL | http://mock-loader:8080 | Points to mock Loader |
| ANNOTATIONS_URL | http://mock-annotations:8081 | Points to mock Annotations |

Test Runner Configuration

Framework: pytest
Plugins: pytest-csv (reporting), pytest-timeout (per-test timeouts)
Libraries: requests (HTTP client), sseclient-py (SSE streaming)
Entry point: pytest --csv=/results/report.csv -v

Fixture Strategy

| Fixture | Scope | Purpose |
| --- | --- | --- |
| base_url | session | Detections service base URL (http://detections:8000) |
| http_client | session | requests.Session configured with base URL and default timeout |
| sse_client_factory | function | Factory that opens SSE connection to /detect/stream |
| mock_loader_url | session | Mock-loader base URL for control API calls |
| mock_annotations_url | session | Mock-annotations base URL for control API and assertion calls |
| wait_for_services | session (autouse) | Polls health endpoints until all services are ready |
| reset_mocks | function (autouse) | Calls POST /mock/reset on both mocks before each test |
| image_small | session | Reads image_small.jpg from /media/ volume |
| image_large | session | Reads image_large.JPG from /media/ volume |
| image_dense | session | Reads image_dense01.jpg from /media/ volume |
| image_dense_02 | session | Reads image_dense02.jpg from /media/ volume |
| image_different_types | session | Reads image_different_types.jpg from /media/ volume |
| image_empty_scene | session | Reads image_empty_scene.jpg from /media/ volume |
| video_short_path | session | Path to video_short01.mp4 on /media/ volume |
| video_short_02_path | session | Path to video_short02.mp4 on /media/ volume |
| video_long_path | session | Path to video_long03.mp4 on /media/ volume |
| empty_image | session | Reads zero-byte file |
| corrupt_image | session | Reads random binary file |
| jwt_token | function | Generates a valid JWT with exp claim for auth tests |
| warm_engine | module | Sends one detection request to initialize the engine; used by tests that need a warm engine |
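The polling behind `wait_for_services` could look roughly like the sketch below. It uses only the standard library so it stays self-contained; the real fixture may well use `requests` instead. The helper names, the 120-second timeout, and the one-second interval are assumptions, and which URL each mock exposes for readiness is not specified here (`/mock/status` would be one candidate).

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True when GET `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a 4xx/5xx answer.
        return False


def wait_until_ready(urls, probe=http_ok, timeout=120.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll every URL until all probes succeed or `timeout` seconds pass."""
    deadline = clock() + timeout
    pending = list(urls)
    while pending:
        # Keep only the services that are still not answering.
        pending = [url for url in pending if not probe(url)]
        if not pending:
            return
        if clock() >= deadline:
            raise TimeoutError(f"services not ready: {pending}")
        sleep(interval)
```

A session-scoped autouse fixture would call `wait_until_ready` once with the detections health URL (`http://detections:8000/health`) plus one readiness URL per mock, so that no test starts before the environment is up.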

Test Data Fixtures

| Data Set | Source | Format | Used By |
| --- | --- | --- | --- |
| azaion.onnx | input_data/azaion.onnx | ONNX (1280×1280 input, 19 classes, 81MB) | All detection tests (via mock-loader) |
| classes.json | repo root classes.json | JSON (19 objects with Id, Name, Color, MaxSizeM) | All tests (volume mount to detections) |
| image_small.jpg | input_data/image_small.jpg | JPEG 1280×720 | Health, single image, filtering, negative, performance tests |
| image_large.JPG | input_data/image_large.JPG | JPEG 6252×4168 | Tiling tests, performance tests |
| image_dense01.jpg | input_data/image_dense01.jpg | JPEG 1280×720 dense scene | Dedup tests, detection cap tests |
| image_dense02.jpg | input_data/image_dense02.jpg | JPEG 1920×1080 dense scene | Dedup variant |
| image_different_types.jpg | input_data/image_different_types.jpg | JPEG 900×1600 varied classes | Weather mode class variant tests |
| image_empty_scene.jpg | input_data/image_empty_scene.jpg | JPEG 1920×1080 empty | Zero-detection edge case |
| video_short01.mp4 | input_data/video_short01.mp4 | MP4 short video | Async, SSE, video processing tests |
| video_short02.mp4 | input_data/video_short02.mp4 | MP4 short video variant | Concurrent, resilience tests |
| video_long03.mp4 | input_data/video_long03.mp4 | MP4 long video (288MB) | SSE overflow, queue depth tests |
| empty_image | Generated at build | Zero-byte file | FT-N-01 |
| corrupt_image | Generated at build | Random binary | FT-N-02 |
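The two fixtures marked "Generated at build" are simple enough to create with a few lines of Python. This is only a sketch of one possible build step: the function name and the 4096-byte default size for the corrupt file are assumptions, since the spec does not say how large the random file should be or which tool generates it.

```python
import os
from pathlib import Path


def generate_negative_fixtures(target_dir, corrupt_size=4096):
    """Create the zero-byte and random-binary fixtures in `target_dir`."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # empty_image: a file that exists but has no content (FT-N-01).
    empty = target / "empty_image"
    empty.write_bytes(b"")

    # corrupt_image: random bytes that should not decode as an image (FT-N-02).
    corrupt = target / "corrupt_image"
    corrupt.write_bytes(os.urandom(corrupt_size))
    return empty, corrupt
```

Calling `generate_negative_fixtures("e2e/fixtures")` from a build script would place both files next to the static images.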

Data Isolation

Each test run starts with fresh containers (docker compose down -v && docker compose up). The detections service is stateless, so no data persists between runs. Mock services reset their state via POST /mock/reset before each test. Because that reset is function-scoped, a test that modifies mock behavior (e.g., making the loader unreachable) cannot leak the change into later tests.
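The per-test reset can be sketched as a small helper that the `reset_mocks` fixture would call. The base URLs reuse the hostnames and ports from the environment-variable table; the helper names are hypothetical, the standard library stands in for `requests`, and the assumption that the mocks answer a successful reset with a 2xx status is not stated in the spec.

```python
import urllib.request

# Assumption: the mocks are reachable from the test runner under the same
# hostnames and ports that the detections service is configured with.
MOCK_URLS = ("http://mock-loader:8080", "http://mock-annotations:8081")


def post_empty(url, timeout=5.0):
    """Send a body-less POST and return the HTTP status code."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def reset_all_mocks(mock_urls=MOCK_URLS, post=post_empty):
    """Call POST /mock/reset on every mock; fail loudly if one refuses."""
    for base in mock_urls:
        status = post(f"{base.rstrip('/')}/mock/reset")
        if not 200 <= status < 300:
            raise RuntimeError(f"reset failed for {base}: HTTP {status}")
```

Raising instead of silently continuing keeps a broken mock from producing misleading failures in the test that follows.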

Test Reporting

Format: CSV
Columns: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
Output path: /results/report.csv → mounted to ./e2e-results/report.csv on host
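To make the column layout concrete, the sketch below writes rows in that shape with Python's `csv` module. It is a hand-rolled illustration of the target format, not how pytest-csv produces its output; the function name and the sample row are hypothetical.

```python
import csv
import io

COLUMNS = ["Test ID", "Test Name", "Execution Time (ms)", "Result", "Error Message"]


def write_report(rows, stream):
    """Write result dicts to `stream` using the report's column order."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        if row["Result"] not in {"PASS", "FAIL", "SKIP"}:
            raise ValueError(f"unexpected result: {row['Result']!r}")
        # Error Message is only kept for failures; other rows leave it empty.
        message = row.get("Error Message", "") if row["Result"] == "FAIL" else ""
        writer.writerow({**row, "Error Message": message})


buffer = io.StringIO()
write_report(
    [{"Test ID": "FT-N-01", "Test Name": "empty image",
      "Execution Time (ms)": 12, "Result": "PASS"}],
    buffer,
)
```

A check for AC-4 could read the first line of `/results/report.csv` and compare it against `COLUMNS`.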

Acceptance Criteria

AC-1: Test environment starts
Given the docker-compose.test.yml
When docker compose -f docker-compose.test.yml up is executed
Then all services start and the detections service is reachable at http://detections:8000/health

AC-2: Mock services respond
Given the test environment is running
When the e2e-consumer sends requests to mock-loader and mock-annotations
Then mock services respond with configured behavior and record interactions

AC-3: Test runner executes
Given the test environment is running
When the e2e-consumer starts
Then pytest discovers and executes test files from tests/ directory

AC-4: Test report generated
Given tests have been executed
When the test run completes
Then /results/report.csv exists with columns: Test ID, Test Name, Execution Time, Result, Error Message