Update .gitignore and refine documentation for execution environment

- Added Cython-generated files to .gitignore to prevent unnecessary tracking.
- Updated paths in `inference.c` and `coreml_engine.c` to reflect the correct virtual environment.
- Revised the execution environment documentation to clarify hardware dependency checks and local execution instructions, ensuring accurate guidance for users.
- Removed outdated Docker suitability checks and streamlined the assessment process for test execution environments.
Oleksandr Bezdieniezhnykh
2026-03-30 00:53:46 +03:00
parent 27f4aceb52
commit 5a968edcba
12 changed files with 328 additions and 401 deletions
+6 -29
@@ -32,36 +32,13 @@ Check in order — first match wins:
If no runner detected → report failure and ask user to specify.
#### Docker Suitability Check
#### Execution Environment Check
Docker is the preferred test environment. Before using it, verify no constraints prevent easy Docker execution:
1. Check `_docs/02_document/tests/environment.md` for a "Test Execution" decision (if the test-spec skill already assessed this, follow that decision)
2. If no prior decision exists, check for disqualifying factors:
- Hardware bindings: GPU, MPS, CUDA, TPU, FPGA, sensors, cameras, serial devices, host-level drivers
- Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs
- Data/volume constraints: large files (> 100MB) impractical to copy into a container
- Network/environment: host networking, VPN, specific DNS/firewall rules
- Performance: Docker overhead would invalidate benchmarks or latency measurements
3. If any disqualifying factor found → fall back to local test runner. Present to user using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Docker is preferred but factors
preventing easy Docker execution detected
══════════════════════════════════════
Factors detected:
- [list factors]
══════════════════════════════════════
A) Run tests locally (recommended)
B) Run tests in Docker anyway
══════════════════════════════════════
Recommendation: A — detected constraints prevent
easy Docker execution
══════════════════════════════════════
```
4. If no disqualifying factors → use Docker (preferred default)
1. Check `_docs/02_document/tests/environment.md` for a "Test Execution" section. If the test-spec skill already assessed hardware dependencies and recorded a decision (local / docker / both), **follow that decision**.
2. If the "Test Execution" section says **local** → run tests directly on host (no Docker).
3. If the "Test Execution" section says **docker** → use Docker (docker-compose).
4. If the "Test Execution" section says **both** → run local first, then Docker (or vice versa), and merge results.
5. If no prior decision exists → fall back to the hardware-dependency detection logic from the test-spec skill's "Hardware-Dependency & Execution Environment Assessment" section. Ask the user if hardware indicators are found.
### 2. Run Tests
+61 -24
@@ -209,7 +209,7 @@ Based on all acquired data, acceptance_criteria, and restrictions, form detailed
- [ ] Expected results use comparison methods from `.cursor/skills/test-spec/templates/expected-results.md`
- [ ] Positive and negative scenarios are balanced
- [ ] Consumer app has no direct access to system internals
- [ ] Test environment matches project constraints (see Docker Suitability Assessment below)
- [ ] Test environment matches project constraints (see Hardware-Dependency & Execution Environment Assessment below)
- [ ] External dependencies have mock/stub services defined
- [ ] Traceability matrix has no uncovered AC or restrictions
@@ -337,43 +337,80 @@ When coverage ≥ 70% and all remaining tests have validated data AND quantifiab
---
### Docker Suitability Assessment (BLOCKING — runs before Phase 4)
### Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs before Phase 4)
Docker is the **preferred** test execution environment (reproducibility, isolation, CI parity). Before generating scripts, check whether the project has any constraints that prevent easy Docker usage.
Docker is the **preferred** test execution environment (reproducibility, isolation, CI parity). However, hardware-dependent projects may require local execution to exercise the real code paths. This assessment determines the right execution strategy by scanning both documentation and source code.
**Disqualifying factors** (any one is sufficient to fall back to local):
- Hardware bindings: GPU, MPS, TPU, FPGA, accelerators, sensors, cameras, serial devices, host-level drivers (CUDA, Metal, OpenCL, etc.)
- Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs not installable in a container
- Data/volume constraints: large files (> 100MB) that would be impractical to copy into a container, databases that must run on the host
- Network/environment: tests that require host networking, VPN access, or specific DNS/firewall rules
- Performance: Docker overhead would invalidate benchmarks or latency-sensitive measurements
#### Step 1 — Documentation scan
**Assessment steps**:
1. Scan project source, config files, and dependencies for indicators of the factors above
2. Check `TESTS_OUTPUT_DIR/environment.md` for environment requirements
3. Check `_docs/00_problem/restrictions.md` and `_docs/01_solution/solution.md` for constraints
Check the following files for mentions of hardware-specific requirements:
**Decision**:
- If ANY disqualifying factor is found → recommend **local test execution** as fallback. Present to user using Choose format:
| File | Look for |
|------|----------|
| `_docs/00_problem/restrictions.md` | Platform requirements, hardware constraints, OS-specific features |
| `_docs/01_solution/solution.md` | Engine selection logic, platform-dependent paths, hardware acceleration |
| `_docs/02_document/architecture.md` | Component diagrams showing hardware layers, engine adapters |
| `_docs/02_document/components/*/description.md` | Per-component hardware mentions |
| `TESTS_OUTPUT_DIR/environment.md` | Existing environment decisions |
#### Step 2 — Code scan
Search the project source for indicators of hardware dependence. The project is **hardware-dependent** if ANY of the following are found:
| Category | Code indicators (imports, APIs, config) |
|----------|-----------------------------------------|
| GPU / CUDA | `import pycuda`, `import tensorrt`, `import pynvml`, `torch.cuda`, `nvidia-smi`, `CUDA_VISIBLE_DEVICES`, `runtime: nvidia` |
| Apple Neural Engine / CoreML | `import coremltools`, `CoreML`, `MLModel`, `ComputeUnit`, `MPS`, `sys.platform == "darwin"`, `platform.machine() == "arm64"` |
| OpenCL / Vulkan | `import pyopencl`, `clCreateContext`, vulkan headers |
| TPU / FPGA | `tf.distribute.TPUStrategy`, FPGA bitstream loaders |
| Sensors / Cameras | `cv2.VideoCapture(0)` with a device index, serial port access, GPIO, V4L2 |
| OS-specific services | Kernel modules (`modprobe`), host-level drivers, platform-gated code (`sys.platform` branches selecting different backends) |
Also check dependency files (`requirements.txt`, `setup.py`, `pyproject.toml`, `Cargo.toml`, `*.csproj`) for hardware-specific packages.
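The Step 2 scan amounts to grepping source and dependency files for the indicator strings. A minimal sketch follows; the indicator table is abbreviated, and the file-extension filter and walk logic are assumptions, not prescribed by the skill:

```python
from pathlib import Path

# Abbreviated indicator table from Step 2 (category -> substrings to search for)
HARDWARE_INDICATORS = {
    "GPU / CUDA": ["import pycuda", "import tensorrt", "torch.cuda",
                   "CUDA_VISIBLE_DEVICES"],
    "Apple CoreML": ["import coremltools", "ComputeUnit",
                     'sys.platform == "darwin"'],
    "OpenCL": ["import pyopencl"],
}


def scan_for_hardware(root: Path) -> list[tuple[str, str, int]]:
    """Return (category, file, line_no) for every indicator hit under root."""
    hits = []
    for path in root.rglob("*"):
        # Only scan source and dependency files (assumed extension list)
        if not path.is_file() or path.suffix not in {".py", ".pyx", ".txt",
                                                     ".toml", ".cfg"}:
            continue
        try:
            lines = path.read_text(errors="ignore").splitlines()
        except OSError:
            continue
        for no, line in enumerate(lines, start=1):
            for category, needles in HARDWARE_INDICATORS.items():
                if any(n in line for n in needles):
                    hits.append((category, str(path), no))
    return hits

# Step 3 classification: the project is hardware-dependent
# if scan_for_hardware(project_root) returns any hits.
```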
#### Step 3 — Classify the project
Based on Steps 1 and 2, classify the project:
- **Not hardware-dependent**: no indicators found → use Docker (preferred default), skip to "Record the decision" below
- **Hardware-dependent**: one or more indicators found → proceed to Step 4
#### Step 4 — Present execution environment choice
Present the findings and ask the user using Choose format:
```
══════════════════════════════════════
DECISION REQUIRED: Test execution environment
══════════════════════════════════════
Docker is preferred, but factors preventing easy
Docker execution detected:
- [list factors found]
Hardware dependencies detected:
- [list each indicator found, with file:line]
══════════════════════════════════════
A) Local execution (recommended)
B) Docker execution (constraints may cause issues)
Running in Docker means these hardware code paths
are NOT exercised — Docker uses a Linux VM where
[specific hardware, e.g. CoreML / CUDA] is unavailable.
The system would fall back to [fallback engine/path].
══════════════════════════════════════
Recommendation: A — detected constraints prevent
easy Docker execution
A) Local execution only (tests the real hardware path)
B) Docker execution only (tests the fallback path)
C) Both local and Docker (tests both paths, requires
two test runs — recommended for CI with heterogeneous
runners)
══════════════════════════════════════
Recommendation: [A, B, or C] — [reason]
══════════════════════════════════════
```
- If NO disqualifying factors → use Docker (preferred default)
- Record the decision in `TESTS_OUTPUT_DIR/environment.md` under a "Test Execution" section
#### Step 5 — Record the decision
Write or update a **"Test Execution"** section in `TESTS_OUTPUT_DIR/environment.md` with:
1. **Decision**: local / docker / both
2. **Hardware dependencies found**: list with file references
3. **Execution instructions** per chosen mode:
- **Local mode**: prerequisites (OS, SDK, hardware), how to start services, how to run the test runner, environment variables
- **Docker mode**: docker-compose profile/command, required images, how results are collected
- **Both mode**: instructions for each, plus guidance on which CI runner type runs which mode
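A hypothetical example of the recorded section (file names, indicators, and commands are illustrative only):

```markdown
## Test Execution

**Decision**: local

### Hardware Dependencies Found
- Apple CoreML: `import coremltools` in `engines/coreml_engine.pyx:15`

### Local Execution Instructions
1. Compile Cython modules: `python setup.py build_ext --inplace`
2. Start the mock services, then the detection service
3. Run `pytest tests/ -v` with `BASE_URL=http://localhost:8000`
```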
---
+3
@@ -4,6 +4,9 @@
*~
Thumbs.db
# Cython generated
*.c
# Python
__pycache__/
*.py[cod]
+177
@@ -1590,3 +1590,180 @@
[00:21:21 INFO] init AI...
[00:21:21 INFO] init AI...
[00:21:21 INFO] init AI...
[00:41:54 INFO] init AI...
[00:41:54 INFO] Downloading
[00:41:57 INFO] CoreML model: 1280x1280
[00:41:57 INFO] Enabled
[00:42:05 INFO] init AI...
[00:42:05 INFO] init AI...
[00:42:05 INFO] run inference on ./fixtures/image_small.jpg...
[00:42:06 INFO] init AI...
[00:42:06 INFO] run inference on ./fixtures/video_test01.mp4...
[00:42:06 ERROR] Failed to open video: ./fixtures/video_test01.mp4
[00:45:22 INFO] init AI...
[00:45:22 INFO] Downloading
[00:45:24 INFO] CoreML model: 1280x1280
[00:45:24 INFO] Enabled
[00:45:31 INFO] init AI...
[00:45:31 INFO] init AI...
[00:45:31 INFO] run inference on /Users/obezdienie001/dev/azaion/suite/detections/e2e/fixtures/image_small.jpg...
[00:45:31 INFO] ground sampling distance: 0.3059895833333333
[00:45:31 INFO] Initial ann: image_small_000000: class: 0 77.0% (0.47, 0.21) (0.14, 0.19)
[00:45:31 INFO] Removed (53.80277931690216 42.89022199809551) > 8. class: ArmorVehicle
[00:45:32 INFO] init AI...
[00:45:32 INFO] run inference on /Users/obezdienie001/dev/azaion/suite/detections/e2e/fixtures/video_test01.mp4...
[00:45:32 INFO] Video: 200 frames, 25.0 fps, 2560x1440
[00:45:32 INFO] Video batch 1: frame 4/200 (2%)
[00:45:32 INFO] Video batch 2: frame 8/200 (4%)
[00:45:32 INFO] Video batch 3: frame 12/200 (6%)
[00:45:32 INFO] Video batch 4: frame 16/200 (8%)
[00:45:32 INFO] Video batch 4: 1 detections from postprocess
[00:45:32 INFO] Video frame video_test01_000006: 1 dets, valid=True
[00:45:32 INFO] Video batch 5: frame 20/200 (10%)
[00:45:33 INFO] Video batch 5: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000007: 1 dets, valid=True
[00:45:33 INFO] Video batch 6: frame 24/200 (12%)
[00:45:33 INFO] Video batch 6: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000009: 1 dets, valid=True
[00:45:33 INFO] Video batch 7: frame 28/200 (14%)
[00:45:33 INFO] Video batch 7: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000010: 1 dets, valid=True
[00:45:33 INFO] Video batch 8: frame 32/200 (16%)
[00:45:33 INFO] Video batch 8: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000012: 1 dets, valid=True
[00:45:33 INFO] Video batch 9: frame 36/200 (18%)
[00:45:33 INFO] Video batch 9: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000014: 1 dets, valid=True
[00:45:33 INFO] Video batch 10: frame 40/200 (20%)
[00:45:33 INFO] Video batch 10: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000015: 1 dets, valid=True
[00:45:33 INFO] Video batch 11: frame 44/200 (22%)
[00:45:33 INFO] Video batch 11: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000017: 1 dets, valid=True
[00:45:33 INFO] Video batch 12: frame 48/200 (24%)
[00:45:33 INFO] Video batch 12: 1 detections from postprocess
[00:45:33 INFO] Video frame video_test01_000018: 1 dets, valid=True
[00:45:33 INFO] Video batch 13: frame 52/200 (26%)
[00:45:34 INFO] Video batch 13: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000020: 1 dets, valid=True
[00:45:34 INFO] Video batch 14: frame 56/200 (28%)
[00:45:34 INFO] Video batch 14: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000022: 1 dets, valid=True
[00:45:34 INFO] Video batch 15: frame 60/200 (30%)
[00:45:34 INFO] Video batch 15: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000023: 1 dets, valid=True
[00:45:34 INFO] Video batch 16: frame 64/200 (32%)
[00:45:34 INFO] Video batch 16: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000025: 1 dets, valid=True
[00:45:34 INFO] Video batch 17: frame 68/200 (34%)
[00:45:34 INFO] Video batch 17: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000026: 1 dets, valid=True
[00:45:34 INFO] Video batch 18: frame 72/200 (36%)
[00:45:34 INFO] Video batch 18: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000028: 1 dets, valid=True
[00:45:34 INFO] Video batch 19: frame 76/200 (38%)
[00:45:34 INFO] Video batch 19: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000030: 1 dets, valid=True
[00:45:34 INFO] Video batch 20: frame 80/200 (40%)
[00:45:34 INFO] Video batch 20: 1 detections from postprocess
[00:45:34 INFO] Video frame video_test01_000031: 1 dets, valid=True
[00:45:34 INFO] Video batch 21: frame 84/200 (42%)
[00:45:35 INFO] Video batch 21: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000033: 1 dets, valid=True
[00:45:35 INFO] Video batch 22: frame 88/200 (44%)
[00:45:35 INFO] Video batch 22: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000034: 1 dets, valid=True
[00:45:35 INFO] Video batch 23: frame 92/200 (46%)
[00:45:35 INFO] Video batch 24: frame 96/200 (48%)
[00:45:35 INFO] Video batch 25: frame 100/200 (50%)
[00:45:35 INFO] Video batch 26: frame 104/200 (52%)
[00:45:35 INFO] Video batch 26: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000041: 1 dets, valid=True
[00:45:35 INFO] Video batch 27: frame 108/200 (54%)
[00:45:35 INFO] Video batch 27: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000042: 1 dets, valid=True
[00:45:35 INFO] Video batch 28: frame 112/200 (56%)
[00:45:35 INFO] Video batch 28: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000044: 1 dets, valid=True
[00:45:35 INFO] Video batch 29: frame 116/200 (58%)
[00:45:35 INFO] Video batch 29: 1 detections from postprocess
[00:45:35 INFO] Video frame video_test01_000046: 1 dets, valid=True
[00:45:36 INFO] Video batch 30: frame 120/200 (60%)
[00:45:36 INFO] Video batch 30: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000047: 1 dets, valid=True
[00:45:36 INFO] Video batch 31: frame 124/200 (62%)
[00:45:36 INFO] Video batch 31: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000049: 1 dets, valid=True
[00:45:36 INFO] Video batch 32: frame 128/200 (64%)
[00:45:36 INFO] Video batch 32: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000050: 1 dets, valid=True
[00:45:36 INFO] Video batch 33: frame 132/200 (66%)
[00:45:36 INFO] Video batch 33: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000052: 1 dets, valid=True
[00:45:36 INFO] Video batch 34: frame 136/200 (68%)
[00:45:36 INFO] Video batch 34: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000054: 1 dets, valid=True
[00:45:36 INFO] Video batch 35: frame 140/200 (70%)
[00:45:36 INFO] Video batch 35: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000055: 1 dets, valid=True
[00:45:36 INFO] Video batch 36: frame 144/200 (72%)
[00:45:36 INFO] Video batch 36: 1 detections from postprocess
[00:45:36 INFO] Video frame video_test01_000057: 1 dets, valid=True
[00:45:36 INFO] Video batch 37: frame 148/200 (74%)
[00:45:36 INFO] Video batch 38: frame 152/200 (76%)
[00:45:37 INFO] Video batch 39: frame 156/200 (78%)
[00:45:37 INFO] Video batch 40: frame 160/200 (80%)
[00:45:37 INFO] Video batch 41: frame 164/200 (82%)
[00:45:37 INFO] Video batch 42: frame 168/200 (84%)
[00:45:37 INFO] Video batch 42: 1 detections from postprocess
[00:45:37 INFO] Video frame video_test01_000066: 1 dets, valid=True
[00:45:37 INFO] Video batch 43: frame 172/200 (86%)
[00:45:37 INFO] Video batch 43: 1 detections from postprocess
[00:45:37 INFO] Video frame video_test01_000068: 1 dets, valid=True
[00:45:37 INFO] Video batch 44: frame 176/200 (88%)
[00:45:37 INFO] Video batch 45: frame 180/200 (90%)
[00:45:37 INFO] Video batch 46: frame 184/200 (92%)
[00:45:37 INFO] Video batch 46: 1 detections from postprocess
[00:45:37 INFO] Video frame video_test01_000073: 1 dets, valid=True
[00:45:38 INFO] Video batch 47: frame 188/200 (94%)
[00:45:38 INFO] Video batch 47: 1 detections from postprocess
[00:45:38 INFO] Video frame video_test01_000074: 1 dets, valid=True
[00:45:38 INFO] Video batch 48: frame 192/200 (96%)
[00:45:38 INFO] Video batch 48: 1 detections from postprocess
[00:45:38 INFO] Video frame video_test01_000076: 1 dets, valid=True
[00:45:38 INFO] Video batch 49: frame 196/200 (98%)
[00:45:38 INFO] Video batch 49: 1 detections from postprocess
[00:45:38 INFO] Video frame video_test01_000078: 1 dets, valid=True
[00:45:38 INFO] Video batch 50: frame 200/200 (100%)
[00:45:38 INFO] Video batch 50: 1 detections from postprocess
[00:45:38 INFO] Video frame video_test01_000079: 1 dets, valid=True
[00:45:38 INFO] Video done: 200 frames read, 50 batches processed
[00:45:38 INFO] init AI...
[00:45:38 INFO] run inference on /Users/obezdienie001/dev/azaion/suite/detections/e2e/fixtures/image_small.jpg...
[00:45:38 INFO] ground sampling distance: 0.3059895833333333
[00:45:38 INFO] init AI...
[00:45:38 INFO] Initial ann: image_small_000000: class: 0 77.0% (0.47, 0.21) (0.14, 0.19)
[00:45:38 INFO] Removed (53.80277931690216 42.89022199809551) > 8. class: ArmorVehicle
[00:45:38 INFO] init AI...
[00:45:38 INFO] init AI...
[00:45:38 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:39 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:45:40 INFO] init AI...
[00:53:38 INFO] init AI...
[00:53:38 INFO] Downloading
[00:53:41 INFO] CoreML model: 1280x1280
[00:53:41 INFO] Enabled
+78 -75
@@ -5,104 +5,106 @@
**System under test**: Azaion.Detections — FastAPI HTTP service exposing `POST /detect`, `POST /detect/{media_id}`, `GET /detect/stream`, `GET /health`
**Consumer app purpose**: Standalone test runner that exercises the detection service through its public HTTP/SSE interfaces, validating black-box use cases without access to internals.
## Docker Environment
## Test Execution
### Services
**Decision**: local
| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| detections | Build from repo root (setup.py + Cython compile, uvicorn entrypoint) | System under test — the detection microservice | 8000:8000 |
| mock-loader | Custom lightweight HTTP stub (Python/Node) | Mock of the Loader service — serves ONNX model files, accepts TensorRT uploads | 8080:8080 |
| mock-annotations | Custom lightweight HTTP stub (Python/Node) | Mock of the Annotations service — accepts detection results, provides token refresh | 8081:8081 |
| e2e-consumer | Build from `e2e/` directory | Black-box test runner (pytest) | — |
### Hardware Dependencies Found
### GPU Configuration
| # | Category | Indicator | File | Detail |
|---|----------|-----------|------|--------|
| 1 | Apple CoreML | `import coremltools`, `ct.models.MLModel`, `ct.ComputeUnit.ALL` | `engines/coreml_engine.pyx:15-22` | CoreML engine requires macOS + Apple Silicon for Neural Engine / GPU inference |
| 2 | Apple platform gate | `sys.platform == "darwin"`, `platform.machine() == "arm64"` | `engines/__init__.py:29` | Engine auto-selection branches on macOS arm64 |
| 3 | NVIDIA GPU / CUDA | `import pynvml`, `nvmlDeviceGetCudaComputeCapability` | `engines/__init__.py:7-14` | TensorRT engine requires NVIDIA GPU with compute capability ≥ 6.1 |
| 4 | TensorRT | `from engines.tensorrt_engine import TensorRTEngine` | `engines/__init__.py:43` | TensorRT runtime only available on Linux with NVIDIA drivers |
| 5 | Cython compilation | `setup.py build_ext --inplace` | `Dockerfile:8`, `setup.py` | Cython modules must be compiled natively for the host architecture |
For tests requiring TensorRT (GPU path):
- Deploy `detections` with `runtime: nvidia` and `NVIDIA_VISIBLE_DEVICES=all`
- The test suite has two profiles: `gpu` (TensorRT tests) and `cpu` (ONNX fallback tests)
- CPU-only tests run without GPU runtime, verifying ONNX fallback behavior
### Rationale
### Networks
The detection service uses a polymorphic engine factory (`engines/__init__.py:EngineClass`) that auto-selects the inference backend at startup:
| Network | Services | Purpose |
|---------|----------|---------|
| e2e-net | all | Isolated test network — all service-to-service communication via hostnames |
1. **TensorRT** — if NVIDIA GPU with CUDA compute ≥ 6.1 is detected
2. **CoreML** — if macOS arm64 with `coremltools` installed
3. **ONNX** — CPU-only fallback
### Volumes
Running in Docker on macOS means running inside a Linux VM where neither CoreML nor TensorRT is available. The service falls back to the ONNX CPU engine, which is a valid but different code path. To test the actual engine used in the target deployment environment, tests must run locally.
| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| test-models | mock-loader:/models | Pre-built ONNX model file for test inference |
| test-media | e2e-consumer:/media | Sample images and video files for detection requests |
| test-classes | detections:/app/classes.json | classes.json with 19 detection classes |
| test-results | e2e-consumer:/results | CSV test report output |
### Local Execution Instructions
### docker-compose structure
#### Prerequisites
```yaml
services:
mock-loader:
build: ./e2e/mocks/loader
ports: ["8080:8080"]
volumes:
- test-models:/models
networks: [e2e-net]
- macOS arm64 (Apple Silicon) with `coremltools` installed, OR Linux with NVIDIA GPU + TensorRT for the GPU path
- Python 3.11 with Cython modules compiled (`python setup.py build_ext --inplace`)
- `flask` and `gunicorn` installed (for mock services)
- `pytest`, `requests`, `sseclient-py`, `pytest-csv`, `pytest-timeout` installed (for test runner)
mock-annotations:
build: ./e2e/mocks/annotations
ports: ["8081:8081"]
networks: [e2e-net]
#### 1. Start mock services
detections:
build:
context: .
dockerfile: Dockerfile
ports: ["8000:8000"]
environment:
- LOADER_URL=http://mock-loader:8080
- ANNOTATIONS_URL=http://mock-annotations:8081
volumes:
- test-classes:/app/classes.json
depends_on:
- mock-loader
- mock-annotations
networks: [e2e-net]
# GPU profile adds: runtime: nvidia
Open two terminal tabs and start the mocks. The mock-loader serves ONNX/CoreML model files from `e2e/fixtures/models/` and the mock-annotations accepts detection result posts.
e2e-consumer:
build: ./e2e
volumes:
- test-media:/media
- test-results:/results
depends_on:
- detections
networks: [e2e-net]
command: pytest --csv=/results/report.csv
```bash
# Terminal 1 — mock-loader (port 8080)
cd e2e/mocks/loader
MODELS_ROOT=../../fixtures gunicorn -b 0.0.0.0:8080 -w 1 --timeout 120 app:app
volumes:
test-models:
test-media:
test-classes:
test-results:
networks:
e2e-net:
# Terminal 2 — mock-annotations (port 8081)
cd e2e/mocks/annotations
gunicorn -b 0.0.0.0:8081 -w 1 --timeout 120 app:app
```
#### 2. Start the detection service
```bash
# Terminal 3 — detections (port 8080 by default, override to 8000 to avoid mock conflict)
LOADER_URL=http://localhost:8080 \
ANNOTATIONS_URL=http://localhost:8081 \
python -m uvicorn main:app --host 0.0.0.0 --port 8000
```
#### 3. Run tests
```bash
# Terminal 4 — test runner
cd e2e
BASE_URL=http://localhost:8000 \
MOCK_LOADER_URL=http://localhost:8080 \
MOCK_ANNOTATIONS_URL=http://localhost:8081 \
MEDIA_DIR=./fixtures \
pytest tests/ -v --csv=./results/report.csv --timeout=300
```
#### Environment Variables
| Variable | Default | Purpose |
|----------|---------|---------|
| `BASE_URL` | `http://detections:8080` | Detection service URL (override to `http://localhost:8000` for local) |
| `MOCK_LOADER_URL` | `http://mock-loader:8080` | Mock loader URL (override to `http://localhost:8080` for local) |
| `MOCK_ANNOTATIONS_URL` | `http://mock-annotations:8081` | Mock annotations URL (override to `http://localhost:8081` for local) |
| `MEDIA_DIR` | `/media` | Path to test media fixtures (override to `./fixtures` for local) |
| `LOADER_URL` | `http://loader:8080` | Loader URL for detections service |
| `ANNOTATIONS_URL` | `http://annotations:8080` | Annotations URL for detections service |
## Services
| Service | Role | Local Port | Startup Command |
|---------|------|------------|-----------------|
| detections | System under test | 8000 | `uvicorn main:app --port 8000` |
| mock-loader | Serves model files, accepts engine uploads | 8080 | `gunicorn -b 0.0.0.0:8080 app:app` |
| mock-annotations | Accepts detection results, provides token refresh | 8081 | `gunicorn -b 0.0.0.0:8081 app:app` |
## Consumer Application
**Tech stack**: Python 3, pytest, requests, sseclient-py
**Entry point**: `pytest --csv=/results/report.csv`
**Entry point**: `pytest tests/ -v --csv=./results/report.csv`
### Communication with system under test
| Interface | Protocol | Endpoint | Authentication |
|-----------|----------|----------|----------------|
| Health check | HTTP GET | `http://detections:8000/health` | None |
| Single image detect | HTTP POST (multipart) | `http://detections:8000/detect` | None |
| Media detect | HTTP POST (JSON) | `http://detections:8000/detect/{media_id}` | Bearer JWT + x-refresh-token headers |
| SSE stream | HTTP GET (SSE) | `http://detections:8000/detect/stream` | None |
| Health check | HTTP GET | `{BASE_URL}/health` | None |
| Single image detect | HTTP POST (multipart) | `{BASE_URL}/detect` | None |
| Media detect | HTTP POST (JSON) | `{BASE_URL}/detect/{media_id}` | Bearer JWT + x-refresh-token headers |
| SSE stream | HTTP GET (SSE) | `{BASE_URL}/detect/stream` | None |
### What the consumer does NOT have access to
@@ -117,9 +119,10 @@ networks:
**Pipeline stage**: After unit tests, before deployment
**Gate behavior**: Block merge if any functional test fails; non-functional failures are warnings
**Timeout**: 15 minutes for CPU profile, 30 minutes for GPU profile
**Runner requirement**: macOS arm64 self-hosted runner (for CoreML path) or Linux with NVIDIA GPU (for TensorRT path)
## Reporting
**Format**: CSV
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
**Output path**: `/results/report.csv` (mounted volume → `./e2e-results/report.csv` on host)
**Output path**: `e2e/results/report.csv`
-4
@@ -46,7 +46,3 @@ def test_ft_n_03_loader_error_mode_detect_does_not_500(
assert r.status_code != 500
@pytest.mark.skip(reason="Requires separate Docker profile without classes.json")
@pytest.mark.cpu
def test_ft_n_05_missing_classes_json_prevents_normal_operation():
pass
-61
@@ -1,14 +1,9 @@
import json
import os
import threading
import time
import uuid
from concurrent.futures import ThreadPoolExecutor
import pytest
_MEDIA = os.environ.get("MEDIA_DIR", "/media")
def _percentile_ms(sorted_ms, p):
n = len(sorted_ms)
@@ -119,59 +114,3 @@ def test_nft_perf_03_tiling_overhead_large_image(
assert large_ms > small_ms - 500.0
@pytest.mark.skip(reason="video perf covered by test_ft_p09_sse_event_delivery")
@pytest.mark.slow
@pytest.mark.timeout(300)
def test_nft_perf_04_video_frame_rate_sse(
warm_engine,
http_client,
jwt_token,
sse_client_factory,
):
media_id = f"perf-sse-{uuid.uuid4().hex}"
body = {
"probability_threshold": 0.25,
"paths": [f"{_MEDIA}/video_test01.mp4"],
"frame_period_recognition": 4,
"frame_recognition_seconds": 2,
}
headers = {"Authorization": f"Bearer {jwt_token}"}
stamps = []
thread_exc = []
done = threading.Event()
def _listen():
try:
with sse_client_factory() as sse:
time.sleep(0.3)
for event in sse.events():
if not event.data or not str(event.data).strip():
continue
data = json.loads(event.data)
if data.get("mediaId") != media_id:
continue
stamps.append(time.monotonic())
if (
data.get("mediaStatus") == "AIProcessed"
and data.get("mediaPercent") == 100
):
break
except BaseException as e:
thread_exc.append(e)
finally:
done.set()
th = threading.Thread(target=_listen, daemon=True)
th.start()
time.sleep(0.5)
r = http_client.post(f"/detect/{media_id}", json=body, headers=headers)
assert r.status_code == 200
ok = done.wait(timeout=290)
assert ok
th.join(timeout=5)
assert not thread_exc
assert len(stamps) >= 2
span = stamps[-1] - stamps[0]
assert span <= 290.0
gaps = [stamps[i + 1] - stamps[i] for i in range(len(stamps) - 1)]
assert max(gaps) <= 30.0
-137
@@ -1,27 +1,7 @@
import json
import os
import threading
import time
import uuid
import pytest
import requests
_DETECT_TIMEOUT = 60
_MEDIA = os.environ.get("MEDIA_DIR", "/media")
def _ai_config_video() -> dict:
return {
"probability_threshold": 0.25,
"tracking_intersection_threshold": 0.6,
"altitude": 400,
"focal_length": 24,
"sensor_width": 23.5,
"paths": [f"{_MEDIA}/video_test01.mp4"],
"frame_period_recognition": 4,
"frame_recognition_seconds": 2,
}
def test_ft_n_06_loader_unreachable_during_init_health(
@@ -44,62 +24,6 @@ def test_ft_n_06_loader_unreachable_during_init_health(
assert d.get("errorMessage") is None
@pytest.mark.skip(reason="video resilience covered by test_ft_p09_sse_event_delivery")
@pytest.mark.slow
@pytest.mark.timeout(300)
def test_ft_n_07_annotations_unreachable_detection_continues(
warm_engine,
http_client,
jwt_token,
mock_annotations_url,
sse_client_factory,
):
requests.post(
f"{mock_annotations_url}/mock/config", json={"mode": "error"}, timeout=10
).raise_for_status()
media_id = f"res-n07-{uuid.uuid4().hex}"
body = _ai_config_video()
headers = {"Authorization": f"Bearer {jwt_token}"}
collected = []
thread_exc = []
done = threading.Event()
def _listen():
try:
with sse_client_factory() as sse:
time.sleep(0.3)
for event in sse.events():
if not event.data or not str(event.data).strip():
continue
data = json.loads(event.data)
if data.get("mediaId") != media_id:
continue
collected.append(data)
if (
data.get("mediaStatus") == "AIProcessed"
and data.get("mediaPercent") == 100
):
break
except BaseException as e:
thread_exc.append(e)
finally:
done.set()
th = threading.Thread(target=_listen, daemon=True)
th.start()
time.sleep(0.5)
pr = http_client.post(f"/detect/{media_id}", json=body, headers=headers)
assert pr.status_code == 200
ok = done.wait(timeout=290)
assert ok
th.join(timeout=5)
assert not thread_exc
assert any(
e.get("mediaStatus") == "AIProcessed" and e.get("mediaPercent") == 100
for e in collected
)
def test_nft_res_01_loader_outage_after_init(
warm_engine, http_client, mock_loader_url, image_small
):
@@ -117,62 +41,6 @@ def test_nft_res_01_loader_outage_after_init(
assert hd.get("errorMessage") is None
@pytest.mark.skip(reason="Single video run — covered by test_ft_p09_sse_event_delivery")
@pytest.mark.slow
@pytest.mark.timeout(300)
def test_nft_res_02_annotations_outage_during_async_detection(
warm_engine,
http_client,
jwt_token,
mock_annotations_url,
sse_client_factory,
):
media_id = f"res-n02-{uuid.uuid4().hex}"
body = _ai_config_video()
headers = {"Authorization": f"Bearer {jwt_token}"}
collected = []
thread_exc = []
done = threading.Event()
def _listen():
try:
with sse_client_factory() as sse:
time.sleep(0.3)
for event in sse.events():
if not event.data or not str(event.data).strip():
continue
data = json.loads(event.data)
if data.get("mediaId") != media_id:
continue
collected.append(data)
if (
data.get("mediaStatus") == "AIProcessed"
and data.get("mediaPercent") == 100
):
break
except BaseException as e:
thread_exc.append(e)
finally:
done.set()
th = threading.Thread(target=_listen, daemon=True)
th.start()
time.sleep(0.5)
pr = http_client.post(f"/detect/{media_id}", json=body, headers=headers)
assert pr.status_code == 200
requests.post(
f"{mock_annotations_url}/mock/config", json={"mode": "error"}, timeout=10
).raise_for_status()
ok = done.wait(timeout=290)
assert ok
th.join(timeout=5)
assert not thread_exc
assert any(
e.get("mediaStatus") == "AIProcessed" and e.get("mediaPercent") == 100
for e in collected
)
def test_nft_res_03_transient_loader_first_fail(
mock_loader_url, http_client, image_small
):
@@ -188,8 +56,3 @@ def test_nft_res_03_transient_loader_first_fail(
assert r1.status_code != 500
@pytest.mark.skip(
reason="Requires docker compose restart capability not available in e2e-runner"
)
def test_nft_res_04_service_restart():
pass
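The background SSE-listener scaffolding (collect events for one `mediaId` until the terminal `AIProcessed`/100% event) is duplicated verbatim in three tests above. A possible extraction into a shared helper is sketched below; the helper name and signature are illustrative, not from the codebase:

```python
import json
import threading
import time


def collect_sse_until_processed(sse_client_factory, media_id, timeout=290):
    """Listen on a background thread, collecting SSE events whose mediaId
    matches, until the terminal AIProcessed/100% event arrives.
    Returns (collected_events, listener_exceptions)."""
    collected, thread_exc = [], []
    done = threading.Event()

    def _listen():
        try:
            with sse_client_factory() as sse:
                time.sleep(0.3)  # let the subscription settle first
                for event in sse.events():
                    if not event.data or not str(event.data).strip():
                        continue  # skip keep-alive / empty frames
                    data = json.loads(event.data)
                    if data.get("mediaId") != media_id:
                        continue
                    collected.append(data)
                    if (
                        data.get("mediaStatus") == "AIProcessed"
                        and data.get("mediaPercent") == 100
                    ):
                        break
        except BaseException as e:  # surface listener failures to the test
            thread_exc.append(e)
        finally:
            done.set()

    th = threading.Thread(target=_listen, daemon=True)
    th.start()
    if not done.wait(timeout=timeout):
        thread_exc.append(TimeoutError("SSE stream did not complete"))
    th.join(timeout=5)
    return collected, thread_exc
```

Each test would then reduce to calling the helper, POSTing to `/detect/{media_id}`, and asserting on the returned lists, instead of re-declaring the thread, event, and parsing loop.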
-68
@@ -1,78 +1,10 @@
import json
import re
import threading
import time
import uuid
from datetime import datetime
from pathlib import Path
import pytest
def _video_ai_body(video_path: str) -> dict:
return {
"probability_threshold": 0.25,
"tracking_intersection_threshold": 0.6,
"altitude": 400,
"focal_length": 24,
"sensor_width": 23.5,
"paths": [video_path],
"frame_period_recognition": 4,
"frame_recognition_seconds": 2,
}
@pytest.mark.skip(reason="Single video run — covered by test_ft_p09_sse_event_delivery")
@pytest.mark.slow
@pytest.mark.timeout(300)
def test_ft_n_08_nft_res_lim_02_sse_queue_bounded_best_effort(
warm_engine,
http_client,
jwt_token,
video_short_path,
sse_client_factory,
):
media_id = f"rlim-sse-{uuid.uuid4().hex}"
body = _video_ai_body(video_short_path)
headers = {"Authorization": f"Bearer {jwt_token}"}
collected: list[dict] = []
thread_exc: list[BaseException] = []
done = threading.Event()
def _listen():
try:
with sse_client_factory() as sse:
time.sleep(0.3)
for event in sse.events():
if not event.data or not str(event.data).strip():
continue
data = json.loads(event.data)
if data.get("mediaId") != media_id:
continue
collected.append(data)
if (
data.get("mediaStatus") == "AIProcessed"
and data.get("mediaPercent") == 100
):
break
except BaseException as e:
thread_exc.append(e)
finally:
done.set()
th = threading.Thread(target=_listen, daemon=True)
th.start()
time.sleep(0.5)
r = http_client.post(f"/detect/{media_id}", json=body, headers=headers)
assert r.status_code == 200
assert done.wait(timeout=290)
th.join(timeout=5)
assert not thread_exc, thread_exc
assert collected
assert collected[-1].get("mediaStatus") == "AIProcessed"
@pytest.mark.slow
@pytest.mark.timeout(120)
def test_nft_res_lim_03_max_detections_per_frame(
+1 -1
@@ -4,7 +4,7 @@
{
"distutils": {
"include_dirs": [
"/Users/obezdienie001/dev/azaion/suite/detections/.venv-e2e/lib/python3.13/site-packages/numpy/_core/include"
"/Users/obezdienie001/dev/azaion/suite/detections/.venv/lib/python3.13/site-packages/numpy/_core/include"
],
"name": "engines.coreml_engine",
"sources": [
+1 -1
@@ -4,7 +4,7 @@
{
"distutils": {
"include_dirs": [
"/Users/obezdienie001/dev/azaion/suite/detections/.venv-e2e/lib/python3.13/site-packages/numpy/_core/include"
"/Users/obezdienie001/dev/azaion/suite/detections/.venv/lib/python3.13/site-packages/numpy/_core/include"
],
"name": "engines.inference_engine",
"sources": [
+1 -1
@@ -4,7 +4,7 @@
{
"distutils": {
"include_dirs": [
"/Users/obezdienie001/dev/azaion/suite/detections/.venv-e2e/lib/python3.13/site-packages/numpy/_core/include"
"/Users/obezdienie001/dev/azaion/suite/detections/.venv/lib/python3.13/site-packages/numpy/_core/include"
],
"name": "inference",
"sources": [
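The three JSON files above are Cython build metadata and embed an absolute, machine-specific virtualenv path, which is why the commit message says Cython-generated files were added to `.gitignore`. The actual ignore entries are not shown in this diff; the patterns below are an illustrative sketch only (directory and extension guesses based on the module names `engines.coreml_engine`, `engines.inference_engine`, and `inference`):

```gitignore
# Cython-generated sources, binaries, and build metadata (illustrative)
inference.c
engines/*.c
*.cpython-*.so
*.cpython-*.json
build/
```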