Mirror of https://github.com/azaion/detections.git, synced 2026-04-22 22:16:31 +00:00.

Commit `2149cd6c08`:

- `Dockerfile.jetson`: JetPack 6.x L4T base image (aarch64); TensorRT and PyCUDA from apt
- `requirements-jetson.txt`: derived from `requirements.txt`; no pip tensorrt/pycuda
- `docker-compose.jetson.yml`: `runtime: nvidia` for the NVIDIA Container Runtime
- `tensorrt_engine.pyx`: `convert_from_source` accepts an optional `calib_cache_path`; INT8 is used when the cache is present, with FP16 fallback; `get_engine_filename` encodes a precision suffix to avoid engine-cache confusion
- `inference.pyx`: `init_ai` tries the INT8 engine, then FP16, on lookup; downloads the calibration cache before starting the conversion thread; passes the cache path through to `convert_from_source`
- `constants_inf`: add `INT8_CALIB_CACHE_FILE` constant
- Unit tests for AC-3 (INT8 flag set when cache provided) and AC-4 (FP16 when no cache)

Made-with: Cursor
150 lines
4.5 KiB
Markdown
# Containerization Plan
## Image Variants
### detections-cpu (Dockerfile)
| Aspect | Specification |
|--------|--------------|
| Base image | `python:3.11-slim` (pinned digest recommended) |
| Build stages | Single stage (Cython compile requires gcc at runtime for setup.py) |
| Non-root user | `adduser --disabled-password --gecos '' appuser` + `USER appuser` |
| Health check | `HEALTHCHECK --interval=30s --timeout=5s CMD curl -f http://localhost:8080/health \|\| exit 1` |
| Exposed ports | 8080 |
| Entrypoint | `uvicorn main:app --host 0.0.0.0 --port 8080` |
**Changes needed to existing Dockerfile**:

1. Add non-root user (security finding F7)
2. Add HEALTHCHECK directive
3. Pin `python:3.11-slim` to a specific digest
4. Add `curl` to apt-get install (for the health check)
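Applied together, the four changes might look roughly like the fragment below. This is a sketch, not the actual Dockerfile: the digest is a placeholder, and the build steps (`pip install`, the Cython `build_ext` call) are assumptions about the existing file's layout.

```dockerfile
# Pin the base image to a digest (placeholder shown; substitute the real one)
FROM python:3.11-slim@sha256:<digest>

# curl is required by the HEALTHCHECK below; gcc for the Cython compile
RUN apt-get update && apt-get install -y --no-install-recommends gcc curl \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt \
    && python setup.py build_ext --inplace

# Non-root user (security finding F7)
RUN adduser --disabled-password --gecos '' appuser
USER appuser

HEALTHCHECK --interval=30s --timeout=5s \
    CMD curl -f http://localhost:8080/health || exit 1

EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
```

Note the ordering: the `USER` directive comes after the build steps so that apt and pip still run as root, but the running process does not.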
### detections-gpu (Dockerfile.gpu)
| Aspect | Specification |
|--------|--------------|
| Base image | `nvidia/cuda:12.2.0-runtime-ubuntu22.04` |
| Build stages | Single stage |
| Non-root user | `adduser --disabled-password --gecos '' appuser` + `USER appuser` |
| Health check | `HEALTHCHECK --interval=30s --timeout=5s CMD curl -f http://localhost:8080/health \|\| exit 1` |
| Exposed ports | 8080 |
| Entrypoint | `uvicorn main:app --host 0.0.0.0 --port 8080` |
| Runtime | Requires `--runtime=nvidia` or `nvidia` runtime in Docker |
**Changes needed to existing Dockerfile.gpu**:

1. Add non-root user
2. Add HEALTHCHECK directive
3. Add `curl` to apt-get install
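The same pattern applies on top of the CUDA base image; a fragment covering only the three listed changes (the surrounding build steps of the existing `Dockerfile.gpu` are omitted and assumed unchanged):

```dockerfile
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04

# curl for the HEALTHCHECK; existing install/build steps go here unchanged
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

RUN adduser --disabled-password --gecos '' appuser
USER appuser

HEALTHCHECK --interval=30s --timeout=5s \
    CMD curl -f http://localhost:8080/health || exit 1
```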
### .dockerignore
```
.git
.gitignore
_docs/
_standalone/
e2e/
tests/
*.md
.env
.env.*
.cursor/
.venv/
venv/
__pycache__/
*.pyc
build/
dist/
*.egg-info
Logs/
```
## Docker Compose — Local Development
`docker-compose.yml` (already partially exists as `e2e/docker-compose.mocks.yml`):
```yaml
name: detections-dev

services:
  mock-loader:
    build: ./e2e/mocks/loader
    ports:
      - "18080:8080"
    volumes:
      - ./e2e/fixtures:/models
    networks:
      - dev-net

  mock-annotations:
    build: ./e2e/mocks/annotations
    ports:
      - "18081:8081"
    networks:
      - dev-net

  detections:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    depends_on:
      - mock-loader
      - mock-annotations
    env_file: .env
    environment:
      LOADER_URL: http://mock-loader:8080
      ANNOTATIONS_URL: http://mock-annotations:8081
    volumes:
      - ./e2e/fixtures/classes.json:/app/classes.json:ro
      - detections-logs:/app/Logs
    shm_size: 512m
    networks:
      - dev-net

volumes:
  detections-logs:

networks:
  dev-net:
    driver: bridge
```
## Docker Compose — Blackbox Tests
Already exists: `e2e/docker-compose.test.yml`. No changes needed — supports both `cpu` and `gpu` profiles with mock services and test runner.
### detections-jetson (Dockerfile.jetson)
| Aspect | Specification |
|--------|--------------|
| Base image | `nvcr.io/nvidia/l4t-base:r36.3.0` (JetPack 6.x, aarch64) |
| TensorRT | Pre-installed via JetPack — `python3-libnvinfer` apt package (NOT pip) |
| PyCUDA | Pre-installed via JetPack — `python3-pycuda` apt package (NOT pip) |
| Build stages | Single stage (Cython compile requires gcc) |
| Non-root user | `adduser --disabled-password --gecos '' appuser` + `USER appuser` |
| Exposed ports | 8080 |
| Entrypoint | `uvicorn main:app --host 0.0.0.0 --port 8080` |
| Runtime | Requires NVIDIA Container Runtime (`runtime: nvidia` in docker-compose) |
**Jetson-specific behaviour**:

- `requirements-jetson.txt` derives from `requirements.txt` — `tensorrt` and `pycuda` are excluded from pip and provided by JetPack
- Engine filename auto-encodes CC+SM (e.g. `azaion.cc_8.7_sm_16.engine` for Orin Nano), ensuring the Jetson engine is distinct from any x86-cached engine
- INT8 is used when `azaion.int8_calib.cache` is available on the Loader service; a precision suffix is appended to the engine filename (`*.int8.engine`); FP16 is the fallback when the cache is absent
- `docker-compose.jetson.yml` uses `runtime: nvidia` for the NVIDIA Container Runtime
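A minimal `Dockerfile.jetson` sketch consistent with the table and bullets above. The apt package names come from the TensorRT/PyCUDA rows; `python3-pip`, the copy order, and the pip flags are illustrative assumptions, not the actual file.

```dockerfile
FROM nvcr.io/nvidia/l4t-base:r36.3.0

# TensorRT and PyCUDA come from JetPack apt packages, not pip
# (python3-pip assumed needed for the remaining requirements)
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3-libnvinfer python3-pycuda python3-pip gcc \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements-jetson.txt .
# tensorrt/pycuda are deliberately absent from this requirements file
RUN pip3 install --no-cache-dir -r requirements-jetson.txt

COPY . .

RUN adduser --disabled-password --gecos '' appuser
USER appuser

EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
```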
**Compose usage on Jetson**:
```bash
docker compose -f docker-compose.jetson.yml up
```
## Image Tagging Strategy
| Context | Tag Format | Example |
|---------|------------|---------|
| CI builds | `<registry>/azaion/detections-cpu:<git-sha>` | `registry.example.com/azaion/detections-cpu:a1b2c3d` |
| CI builds (GPU) | `<registry>/azaion/detections-gpu:<git-sha>` | `registry.example.com/azaion/detections-gpu:a1b2c3d` |
| Local development | `detections-cpu:dev` | — |
| Latest stable | `<registry>/azaion/detections-cpu:latest` | Updated on merge to main |