# Test Environment

## Overview

- System under test: `Azaion.Detections` — a FastAPI HTTP service exposing `POST /detect`, `POST /detect/{media_id}`, `GET /detect/stream`, and `GET /health`
- Consumer app purpose: a standalone test runner that exercises the detection service through its public HTTP/SSE interfaces, validating black-box use cases without access to internals
## Test Execution

Decision: **local**
### Hardware Dependencies Found

| # | Category | Indicator | File | Detail |
|---|---|---|---|---|
| 1 | Apple CoreML | `import coremltools`, `ct.models.MLModel`, `ct.ComputeUnit.ALL` | `engines/coreml_engine.pyx:15-22` | CoreML engine requires macOS + Apple Silicon for Neural Engine / GPU inference |
| 2 | Apple platform gate | `sys.platform == "darwin"`, `platform.machine() == "arm64"` | `engines/__init__.py:29` | Engine auto-selection branches on macOS arm64 |
| 3 | NVIDIA GPU / CUDA | `import pynvml`, `nvmlDeviceGetCudaComputeCapability` | `engines/__init__.py:7-14` | TensorRT engine requires NVIDIA GPU with compute capability ≥ 6.1 |
| 4 | TensorRT | `from engines.tensorrt_engine import TensorRTEngine` | `engines/__init__.py:43` | TensorRT runtime only available on Linux with NVIDIA drivers |
| 5 | Cython compilation | `setup.py build_ext --inplace` | `Dockerfile:8`, `setup.py` | Cython modules must be compiled natively for the host architecture |
### Rationale

The detection service uses a polymorphic engine factory (`engines/__init__.py:EngineClass`) that auto-selects the inference backend at startup:

- TensorRT — if an NVIDIA GPU with CUDA compute capability ≥ 6.1 is detected
- CoreML — if macOS arm64 with `coremltools` installed
- ONNX — CPU-only fallback

Running in Docker on macOS means a Linux VM where neither CoreML nor TensorRT is available. The service then falls back to the ONNX CPU engine — a valid but different code path. To test the engine actually used in the target deployment environment, the tests must run locally.
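The fallback order above can be sketched as a small pure function. This is an illustrative sketch only: the real factory in `engines/__init__.py` derives these flags itself from `pynvml`, `sys.platform`, `platform.machine()`, and an import probe, and the function and parameter names here are assumptions.

```python
def select_engine(cuda_capability=None, macos_arm64=False, coremltools_ok=False):
    """Pick an inference backend in the priority order described above.

    Sketch only: the real factory (engines/__init__.py) computes these flags
    from pynvml, sys.platform, platform.machine(), and a coremltools import.
    """
    # 1. TensorRT: NVIDIA GPU with CUDA compute capability >= 6.1
    if cuda_capability is not None and cuda_capability >= (6, 1):
        return "tensorrt"
    # 2. CoreML: macOS on Apple Silicon with coremltools importable
    if macos_arm64 and coremltools_ok:
        return "coreml"
    # 3. ONNX: CPU-only fallback (the Docker-on-macOS Linux VM lands here)
    return "onnx"

print(select_engine())                                       # → onnx
print(select_engine(cuda_capability=(8, 6)))                 # → tensorrt
print(select_engine(macos_arm64=True, coremltools_ok=True))  # → coreml
```

With no capability flags set, as in the Linux VM case described above, the CPU fallback always wins, which is exactly why a Docker run exercises a different code path.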
## Local Execution Instructions

### Prerequisites

- macOS arm64 (Apple Silicon) with `coremltools` installed, OR Linux with an NVIDIA GPU + TensorRT for the GPU path
- Python 3.11 with the Cython modules compiled (`python setup.py build_ext --inplace`)
- `flask` and `gunicorn` installed (for the mock services)
- `pytest`, `requests`, `sseclient-py`, `pytest-csv`, and `pytest-timeout` installed (for the test runner)
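A quick preflight check for the Python-level prerequisites is an import probe. The module names below follow the list above (`sseclient-py` installs as the `sseclient` module); the script itself is a convenience sketch, not part of the repository.

```python
import importlib.util

# Module names assumed from the prerequisite list above; sseclient-py
# installs under the import name "sseclient".
REQUIRED = ["flask", "gunicorn", "pytest", "requests", "sseclient"]

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all prerequisites importable")
```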
### 1. Start mock services

Open two terminal tabs and start the mocks. The mock-loader serves ONNX/CoreML model files from `e2e/fixtures/models/`, and the mock-annotations accepts detection result posts.

```bash
# Terminal 1 — mock-loader (port 8080)
cd e2e/mocks/loader
MODELS_ROOT=../../fixtures gunicorn -b 0.0.0.0:8080 -w 1 --timeout 120 app:app
```

```bash
# Terminal 2 — mock-annotations (port 8081)
cd e2e/mocks/annotations
gunicorn -b 0.0.0.0:8081 -w 1 --timeout 120 app:app
```
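For orientation, a minimal Flask app compatible with the gunicorn commands above could look like the following. The real mocks live in `e2e/mocks/`; the route paths here are assumptions based on the model-serving role and the `MODELS_ROOT` variable described above.

```python
import os

from flask import Flask, send_from_directory

app = Flask(__name__)

# Directory of model fixtures; the command above sets MODELS_ROOT=../../fixtures.
MODELS_ROOT = os.path.abspath(os.environ.get("MODELS_ROOT", "."))

@app.route("/models/<path:name>")
def get_model(name):
    """Serve an ONNX/CoreML model file from the fixtures directory.

    Route path is an assumption; check e2e/mocks/loader/app.py for the real one.
    """
    return send_from_directory(os.path.join(MODELS_ROOT, "models"), name)

@app.route("/health")
def health():
    # Flask serializes a returned dict to JSON automatically.
    return {"status": "ok"}
```

Run it with `gunicorn -b 0.0.0.0:8080 app:app` exactly as in Terminal 1 above.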
### 2. Start the detection service

```bash
# Terminal 3 — detections (service defaults to port 8080; override to 8000 to avoid clashing with mock-loader)
LOADER_URL=http://localhost:8080 \
ANNOTATIONS_URL=http://localhost:8081 \
python -m uvicorn main:app --host 0.0.0.0 --port 8000
```
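Before running the tests, it can help to confirm the service actually came up on the overridden port. A small poll loop against the `GET /health` endpoint from the overview (the URL below assumes the port-8000 override shown above):

```python
import time
import urllib.error
import urllib.request

def wait_for_health(url="http://localhost:8000/health", timeout=30.0):
    """Poll the health endpoint until it answers HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; retry
        time.sleep(0.5)
    return False
```

Calling `wait_for_health()` after starting Terminal 3 returns `True` once the service is ready and `False` if it never responds within the timeout.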
### 3. Run tests

```bash
# Terminal 4 — test runner
cd e2e
BASE_URL=http://localhost:8000 \
MOCK_LOADER_URL=http://localhost:8080 \
MOCK_ANNOTATIONS_URL=http://localhost:8081 \
MEDIA_DIR=./fixtures \
pytest tests/ -v --csv=./results/report.csv --timeout=300
```
## Environment Variables

| Variable | Default | Purpose |
|---|---|---|
| `BASE_URL` | `http://detections:8080` | Detection service URL (override to `http://localhost:8000` for local) |
| `MOCK_LOADER_URL` | `http://mock-loader:8080` | Mock loader URL (override to `http://localhost:8080` for local) |
| `MOCK_ANNOTATIONS_URL` | `http://mock-annotations:8081` | Mock annotations URL (override to `http://localhost:8081` for local) |
| `MEDIA_DIR` | `/media` | Path to test media fixtures (override to `./fixtures` for local) |
| `LOADER_URL` | `http://loader:8080` | Loader URL for the detections service |
| `ANNOTATIONS_URL` | `http://annotations:8080` | Annotations URL for the detections service |
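The test runner's side of this table can be centralized in one helper so every test reads the same resolved values. The defaults below copy the table; the function name is illustrative, not taken from the repository.

```python
import os

# Defaults mirror the table above (Docker-network hostnames); local runs
# override them as shown in the instructions.
DEFAULTS = {
    "BASE_URL": "http://detections:8080",
    "MOCK_LOADER_URL": "http://mock-loader:8080",
    "MOCK_ANNOTATIONS_URL": "http://mock-annotations:8081",
    "MEDIA_DIR": "/media",
}

def env_config(environ=os.environ):
    """Resolve each variable from the environment, falling back to the defaults."""
    return {key: environ.get(key, default) for key, default in DEFAULTS.items()}
```

For a local run, exporting the overrides from step 3 makes `env_config()["BASE_URL"]` resolve to `http://localhost:8000`.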
## Services

| Service | Role | Local Port | Startup Command |
|---|---|---|---|
| detections | System under test | 8000 | `uvicorn main:app --port 8000` |
| mock-loader | Serves model files, accepts engine uploads | 8080 | `gunicorn -b 0.0.0.0:8080 app:app` |
| mock-annotations | Accepts detection results, provides token refresh | 8081 | `gunicorn -b 0.0.0.0:8081 app:app` |
## Consumer Application

- Tech stack: Python 3, pytest, requests, sseclient-py
- Entry point: `pytest tests/ -v --csv=./results/report.csv`

### Communication with system under test

| Interface | Protocol | Endpoint | Authentication |
|---|---|---|---|
| Health check | HTTP GET | `{BASE_URL}/health` | None |
| Single image detect | HTTP POST (multipart) | `{BASE_URL}/detect` | None |
| Media detect | HTTP POST (JSON) | `{BASE_URL}/detect/{media_id}` | Bearer JWT + `x-refresh-token` headers |
| SSE stream | HTTP GET (SSE) | `{BASE_URL}/detect/stream` | None |
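The multipart row above can be illustrated with `requests` without touching a live service, by preparing the request the runner would send. The multipart field name `"file"` is an assumption about the service's FastAPI upload parameter; the helper name is illustrative.

```python
import requests

def build_detect_request(base_url, image_bytes, filename="frame.jpg"):
    """Prepare the multipart POST /detect request the test runner sends.

    The field name "file" is an assumption about the FastAPI UploadFile
    parameter; adjust it to match the service's actual signature.
    """
    req = requests.Request(
        "POST",
        f"{base_url}/detect",
        files={"file": (filename, image_bytes, "image/jpeg")},
    )
    return req.prepare()

prepared = build_detect_request("http://localhost:8000", b"\xff\xd8\xff")
print(prepared.url)                                    # → http://localhost:8000/detect
print(prepared.headers["Content-Type"].split(";")[0])  # → multipart/form-data
```

Sending it is then one call: `requests.Session().send(prepared)` against a running service.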
### What the consumer does NOT have access to

- No direct import of the Cython modules (`inference`, `annotation`, `engines`)
- No direct access to the detections service filesystem or `Logs/` directory
- No shared memory with the detections process
- No direct calls to mock-loader or mock-annotations (except for test setup/teardown verification)
## CI/CD Integration

- When to run: on PR merge to `dev`, plus a nightly scheduled run
- Pipeline stage: after unit tests, before deployment
- Gate behavior: block merge if any functional test fails; non-functional failures are warnings
- Timeout: 15 minutes for the CPU profile, 30 minutes for the GPU profile
- Runner requirement: macOS arm64 self-hosted runner (CoreML path) or Linux with an NVIDIA GPU (TensorRT path)
## Reporting

- Format: CSV
- Columns: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
- Output path: `e2e/results/report.csv`
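A report with these columns can be summarized with the standard `csv` module. The header names below are taken verbatim from the column list above (the headers `pytest-csv` actually emits may differ), and the two sample rows are synthetic, included only to show the shape.

```python
import csv
import io
from collections import Counter

def summarize(report_file):
    """Count PASS/FAIL/SKIP results in a report with the columns listed above."""
    reader = csv.DictReader(report_file)
    return Counter(row["Result (PASS/FAIL/SKIP)"] for row in reader)

# Synthetic two-row report, only to illustrate the column layout.
sample = io.StringIO(
    "Test ID,Test Name,Execution Time (ms),Result (PASS/FAIL/SKIP),Error Message (if FAIL)\n"
    "T-001,health_check,12,PASS,\n"
    "T-002,detect_single_image,840,FAIL,timeout waiting for response\n"
)
print(summarize(sample))  # → Counter({'PASS': 1, 'FAIL': 1})
```

In CI, the same function applied to `e2e/results/report.csv` gives the pass/fail totals the merge gate needs.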