mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 08:36:31 +00:00
Update .gitignore and refine documentation for execution environment
- Added Cython generated files to .gitignore to prevent unnecessary tracking.
- Updated paths in `inference.c` and `coreml_engine.c` to reflect the correct virtual environment.
- Revised the execution environment documentation to clarify hardware dependency checks and local execution instructions, ensuring accurate guidance for users.
- Removed outdated Docker suitability checks and streamlined the assessment process for test execution environments.
**System under test**: Azaion.Detections — FastAPI HTTP service exposing `POST /detect`, `POST /detect/{media_id}`, `GET /detect/stream`, `GET /health`
**Consumer app purpose**: Standalone test runner that exercises the detection service through its public HTTP/SSE interfaces, validating black-box use cases without access to internals.

## Test Execution

**Decision**: local

### Hardware Dependencies Found

| # | Category | Indicator | File | Detail |
|---|----------|-----------|------|--------|
| 1 | Apple CoreML | `import coremltools`, `ct.models.MLModel`, `ct.ComputeUnit.ALL` | `engines/coreml_engine.pyx:15-22` | CoreML engine requires macOS + Apple Silicon for Neural Engine / GPU inference |
| 2 | Apple platform gate | `sys.platform == "darwin"`, `platform.machine() == "arm64"` | `engines/__init__.py:29` | Engine auto-selection branches on macOS arm64 |
| 3 | NVIDIA GPU / CUDA | `import pynvml`, `nvmlDeviceGetCudaComputeCapability` | `engines/__init__.py:7-14` | TensorRT engine requires an NVIDIA GPU with compute capability ≥ 6.1 |
| 4 | TensorRT | `from engines.tensorrt_engine import TensorRTEngine` | `engines/__init__.py:43` | TensorRT runtime is only available on Linux with NVIDIA drivers |
| 5 | Cython compilation | `setup.py build_ext --inplace` | `Dockerfile:8`, `setup.py` | Cython modules must be compiled natively for the host architecture |

For tests requiring TensorRT (GPU path):

- Deploy `detections` with `runtime: nvidia` and `NVIDIA_VISIBLE_DEVICES=all`
- The test suite has two profiles: `gpu` (TensorRT tests) and `cpu` (ONNX fallback tests)
- CPU-only tests run without the GPU runtime, verifying ONNX fallback behavior

### Rationale

The detection service uses a polymorphic engine factory (`engines/__init__.py:EngineClass`) that auto-selects the inference backend at startup:

1. **TensorRT** — if an NVIDIA GPU with CUDA compute capability ≥ 6.1 is detected
2. **CoreML** — if macOS arm64 with `coremltools` installed
3. **ONNX** — CPU-only fallback

Running in Docker on macOS means a Linux VM where neither CoreML nor TensorRT is available. The service then falls back to the ONNX CPU engine — a valid but different code path. To exercise the engine actually used in the target deployment environment, tests must run locally.
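The selection order above can be condensed into a small decision function. This is an illustrative sketch, not the repo's actual `engines/__init__.py` code; the function and parameter names are hypothetical:

```python
import platform
import sys

def pick_engine(has_nvidia_gpu: bool, cuda_capability: tuple, has_coremltools: bool) -> str:
    """Mirror the engine-selection priority described above (hypothetical helper)."""
    # 1. TensorRT: NVIDIA GPU with CUDA compute capability >= 6.1
    if has_nvidia_gpu and cuda_capability >= (6, 1):
        return "tensorrt"
    # 2. CoreML: macOS on Apple Silicon with coremltools importable
    if sys.platform == "darwin" and platform.machine() == "arm64" and has_coremltools:
        return "coreml"
    # 3. ONNX: CPU-only fallback, always available
    return "onnx"

print(pick_engine(has_nvidia_gpu=True, cuda_capability=(8, 6), has_coremltools=False))
print(pick_engine(has_nvidia_gpu=False, cuda_capability=(0, 0), has_coremltools=False))
```

The second call shows why Docker-on-macOS tests a different path: with no GPU and no CoreML, only the ONNX branch can be taken.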
### Local Execution Instructions

#### Prerequisites

- macOS arm64 (Apple Silicon) with `coremltools` installed, OR Linux with an NVIDIA GPU + TensorRT for the GPU path
- Python 3.11 with the Cython modules compiled (`python setup.py build_ext --inplace`)
- `flask` and `gunicorn` installed (for the mock services)
- `pytest`, `requests`, `sseclient-py`, `pytest-csv`, `pytest-timeout` installed (for the test runner)

#### 1. Start mock services

Open two terminal tabs and start the mocks. The mock-loader serves ONNX/CoreML model files from `e2e/fixtures/models/`, and the mock-annotations accepts detection result posts.

```bash
# Terminal 1 — mock-loader (port 8080)
cd e2e/mocks/loader
MODELS_ROOT=../../fixtures gunicorn -b 0.0.0.0:8080 -w 1 --timeout 120 app:app

# Terminal 2 — mock-annotations (port 8081)
cd e2e/mocks/annotations
gunicorn -b 0.0.0.0:8081 -w 1 --timeout 120 app:app
```
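The mocks only need to accept requests and return canned JSON. The real stubs in `e2e/mocks/` are Flask apps served by gunicorn; the sketch below shows the same shape as a plain stdlib WSGI callable, with hypothetical route names, exercised in-process:

```python
import json
from io import BytesIO
from wsgiref.util import setup_testing_defaults

received = []  # detection results posted by the detections service, kept for assertions

def mock_annotations_app(environ, start_response):
    """WSGI sketch of a mock-annotations stub (the real stub is a Flask app)."""
    path = environ["PATH_INFO"]
    if environ["REQUEST_METHOD"] == "POST" and path == "/annotations":  # hypothetical route name
        size = int(environ.get("CONTENT_LENGTH") or 0)
        received.append(json.loads(environ["wsgi.input"].read(size) or b"{}"))
        body = b'{"status": "accepted"}'
    elif path == "/token/refresh":  # hypothetical route name
        body = b'{"token": "test-jwt"}'
    else:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b"{}"]
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]

# Exercise it in-process, no real server needed:
environ = {}
setup_testing_defaults(environ)
payload = json.dumps({"media_id": "m1", "detections": []}).encode()
environ.update(REQUEST_METHOD="POST", PATH_INFO="/annotations", CONTENT_LENGTH=str(len(payload)))
environ["wsgi.input"] = BytesIO(payload)
statuses = []
body = mock_annotations_app(environ, lambda status, headers: statuses.append(status))
print(statuses[0])
```

Keeping posted payloads in memory lets black-box tests later assert that the service actually forwarded its detection results.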
#### 2. Start the detection service

```bash
# Terminal 3 — detections (the service defaults to port 8080; override to 8000 to avoid clashing with mock-loader)
LOADER_URL=http://localhost:8080 \
ANNOTATIONS_URL=http://localhost:8081 \
python -m uvicorn main:app --host 0.0.0.0 --port 8000
```
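Before running tests, the service should be answering `GET /health`. A generic readiness poll can be sketched as follows; `wait_until_ready` and `probe` are hypothetical helpers, not part of the repo:

```python
import time

def wait_until_ready(probe, attempts: int = 30, delay_s: float = 1.0) -> bool:
    """Poll a readiness probe until it succeeds or attempts run out.
    `probe` is any zero-argument callable, e.g. one returning True when GET /health answers 200."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay_s)
    return False

# Fake probe for illustration: down twice, then up.
answers = iter([False, False, True])
print(wait_until_ready(lambda: next(answers), attempts=5, delay_s=0))
```

In a real run the probe would wrap `requests.get(f"{BASE_URL}/health")`; the fake iterator keeps the sketch self-contained.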
#### 3. Run tests

```bash
# Terminal 4 — test runner
cd e2e
BASE_URL=http://localhost:8000 \
MOCK_LOADER_URL=http://localhost:8080 \
MOCK_ANNOTATIONS_URL=http://localhost:8081 \
MEDIA_DIR=./fixtures \
pytest tests/ -v --csv=./results/report.csv --timeout=300
```
#### Environment Variables

| Variable | Default | Purpose |
|----------|---------|---------|
| `BASE_URL` | `http://detections:8080` | Detection service URL (override to `http://localhost:8000` for local) |
| `MOCK_LOADER_URL` | `http://mock-loader:8080` | Mock loader URL (override to `http://localhost:8080` for local) |
| `MOCK_ANNOTATIONS_URL` | `http://mock-annotations:8081` | Mock annotations URL (override to `http://localhost:8081` for local) |
| `MEDIA_DIR` | `/media` | Path to test media fixtures (override to `./fixtures` for local) |
| `LOADER_URL` | `http://loader:8080` | Loader URL used by the detections service |
| `ANNOTATIONS_URL` | `http://annotations:8080` | Annotations URL used by the detections service |
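The table above suggests a simple resolution rule: read each variable from the environment and fall back to its in-network default. A sketch of how a test runner might do this (the `config` helper is hypothetical; defaults are copied from the table):

```python
import os

# Defaults mirror the table above; override any of them in the environment for local runs.
DEFAULTS = {
    "BASE_URL": "http://detections:8080",
    "MOCK_LOADER_URL": "http://mock-loader:8080",
    "MOCK_ANNOTATIONS_URL": "http://mock-annotations:8081",
    "MEDIA_DIR": "/media",
}

def config() -> dict:
    """Resolve each setting from the environment, falling back to the table's default."""
    return {name: os.getenv(name, default) for name, default in DEFAULTS.items()}

os.environ["BASE_URL"] = "http://localhost:8000"  # the local override used in step 3
cfg = config()
print(cfg["BASE_URL"])
```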

## Services

| Service | Role | Local Port | Startup Command |
|---------|------|------------|-----------------|
| detections | System under test | 8000 | `uvicorn main:app --port 8000` |
| mock-loader | Serves model files, accepts engine uploads | 8080 | `gunicorn -b 0.0.0.0:8080 app:app` |
| mock-annotations | Accepts detection results, provides token refresh | 8081 | `gunicorn -b 0.0.0.0:8081 app:app` |
## Consumer Application

**Tech stack**: Python 3, pytest, requests, sseclient-py

**Entry point**: `pytest tests/ -v --csv=./results/report.csv`
### Communication with system under test

| Interface | Protocol | Endpoint | Authentication |
|-----------|----------|----------|----------------|
| Health check | HTTP GET | `{BASE_URL}/health` | None |
| Single image detect | HTTP POST (multipart) | `{BASE_URL}/detect` | None |
| Media detect | HTTP POST (JSON) | `{BASE_URL}/detect/{media_id}` | Bearer JWT + `x-refresh-token` headers |
| SSE stream | HTTP GET (SSE) | `{BASE_URL}/detect/stream` | None |
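`GET /detect/stream` uses standard SSE framing (`data:` lines, a blank line ends an event), so a black-box test can split events without knowing this service's payload schema. A minimal parser for that framing (illustrative; the suite itself uses `sseclient-py`, and the `{"frame": ...}` payloads are made up):

```python
def parse_sse(raw: str) -> list[str]:
    """Return the data payload of each event in a raw SSE body.
    Per the SSE format, consecutive `data:` lines accumulate and a blank line ends the event."""
    events, data_lines = [], []
    for line in raw.splitlines():
        if line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
        elif line == "" and data_lines:
            events.append("\n".join(data_lines))
            data_lines = []
    return events

raw = 'data: {"frame": 1}\n\ndata: {"frame": 2}\n\n'
print(parse_sse(raw))
```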
### What the consumer does NOT have access to
**Pipeline stage**: After unit tests, before deployment

**Gate behavior**: Block merge if any functional test fails; non-functional failures are warnings

**Timeout**: 15 minutes for the CPU profile, 30 minutes for the GPU profile

**Runner requirement**: macOS arm64 self-hosted runner (for the CoreML path) or Linux with an NVIDIA GPU (for the TensorRT path)
## Reporting

**Format**: CSV

**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)

**Output path**: `e2e/results/report.csv`
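A CI gate can decide pass/fail by scanning the Result column of the report. A sketch with the stdlib `csv` module, assuming the CSV headers match the column list above (the sample rows are made up for illustration):

```python
import csv
import io

# Made-up sample rows using the column names listed above (exact header text is an assumption).
sample = """Test ID,Test Name,Execution Time (ms),Result,Error Message
T001,health_check,12,PASS,
T002,detect_single_image,431,FAIL,HTTP 500
T003,sse_stream,0,SKIP,
"""

def failed_tests(report_text: str) -> list[str]:
    """Test IDs of FAIL rows; a CI gate blocks the merge when this list is non-empty."""
    return [row["Test ID"]
            for row in csv.DictReader(io.StringIO(report_text))
            if row["Result"] == "FAIL"]

print(failed_tests(sample))
```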