Mirror of https://github.com/azaion/detections.git (synced 2026-06-21 05:21:08 +00:00)
@@ -6,7 +6,6 @@ RUN apt-get update && apt-get install -y \
 python3 python3-pip python3-dev gcc \
 libgl1 libglib2.0-0 \
 python3-libnvinfer python3-libnvinfer-dev \
-python3-pycuda \
 && rm -rf /var/lib/apt/lists/*
 
 

@@ -18,7 +18,7 @@ The detection service cannot run on NVIDIA Jetson Orin Nano for two reasons:
 ## Outcome
 
 - A `Dockerfile.jetson` that builds and runs on Jetson Orin Nano (aarch64, JetPack 6.x)
-- A `requirements-jetson.txt` that installs Python dependencies without pip-installing tensorrt or pycuda
+- A `requirements-jetson.txt` that installs Python dependencies without pip-installing tensorrt
 - A `docker-compose.jetson.yml` with NVIDIA Container Runtime configuration
 - `convert_from_source()` in `tensorrt_engine.pyx` extended to accept an optional INT8 calibration cache path — if the cache is present, INT8 is used; otherwise FP16 fallback
 - `init_ai()` in `inference.pyx` extended to try downloading the calibration cache from the Loader service before starting the conversion thread
@@ -28,7 +28,7 @@ The detection service cannot run on NVIDIA Jetson Orin Nano for two reasons:
 
 ### Included
 - `Dockerfile.jetson` using a JetPack 6.x L4T base image with pre-installed TensorRT and PyCUDA
-- `requirements-jetson.txt` derived from `requirements.txt`, excluding tensorrt and pycuda
+- `requirements-jetson.txt` derived from `requirements.txt`, excluding tensorrt and installing PyCUDA via pip where the JetPack apt package is unavailable
 - `docker-compose.jetson.yml` with `runtime: nvidia`
 - `tensorrt_engine.pyx`: extend `convert_from_source(bytes onnx_model, str calib_cache_path=None)` — set `INT8` flag and load cache when path is provided; fall back to FP16 when not
 - `inference.pyx`: extend `init_ai()` to attempt download of `azaion.int8_calib.cache` from Loader before spawning the conversion thread; pass the local path to `convert_from_source()`
@@ -101,7 +101,7 @@ Note: AC-2, AC-5, AC-6 require physical Jetson hardware and cannot run in standa
 
 ## Constraints
 
-- TensorRT and PyCUDA must NOT be pip-installed — provided by JetPack in the base image
+- TensorRT must NOT be pip-installed — provided by JetPack in the base image. PyCUDA may be pip-installed on `l4t-jetpack:r36.4.0` because `python3-pycuda` is unavailable in the apt repositories.
 - Base image must be a JetPack 6.x L4T image — not a generic CUDA image
 - Calibration cache download failure must be non-fatal — log a warning and fall back to FP16
 - INT8 conversion and FP16 conversion produce different engine files (different filenames) so cached engines are not confused
@@ -114,7 +114,7 @@ Note: AC-2, AC-5, AC-6 require physical Jetson hardware and cannot run in standa
 
 **Risk 2: PyCUDA availability in base image**
 - *Risk*: Some L4T images do not include pycuda
-- *Mitigation*: Fall back to `apt-get install python3-pycuda` or source build with `CUDA_ROOT` set
+- *Mitigation*: Fall back to pip source build with `CUDA_ROOT` set when no `python3-pycuda` apt package is available
 
 **Risk 3: INT8 accuracy degradation**
 - *Risk*: Without a well-representative calibration dataset, mAP may drop >1 point

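The fallback rule described in the hunks above (INT8 when the calibration cache is available, FP16 otherwise, with a failed download treated as non-fatal) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the repository's Cython code: `fetch_calib_cache`, `choose_precision`, and the `download` callable are hypothetical names introduced for illustration, since the Loader client API does not appear in this diff.

```python
import logging
import os
from typing import Callable, Optional

log = logging.getLogger("calib")

CALIB_CACHE_NAME = "azaion.int8_calib.cache"  # filename taken from the spec


def fetch_calib_cache(download: Callable[[str], bytes], dest_dir: str) -> Optional[str]:
    """Try to download the calibration cache; return its local path or None.

    `download` is a hypothetical stand-in for the Loader client call.
    Any failure is logged as a warning and treated as "no cache".
    """
    try:
        data = download(CALIB_CACHE_NAME)
    except Exception as exc:  # non-fatal by design: fall back to FP16
        log.warning("calibration cache unavailable, using FP16: %s", exc)
        return None
    if not data:
        log.warning("calibration cache is empty, using FP16")
        return None
    path = os.path.join(dest_dir, CALIB_CACHE_NAME)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


def choose_precision(calib_cache_path: Optional[str]) -> str:
    """Return "int8" when a cache file exists at the given path, else "fp16"."""
    if calib_cache_path and os.path.isfile(calib_cache_path):
        return "int8"
    return "fp16"
```

In this sketch the caller would run `fetch_calib_cache()` before spawning the conversion thread and hand the resulting path (or `None`) to the conversion step, mirroring the order the spec gives for `init_ai()` and `convert_from_source()`.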
@@ -121,7 +121,7 @@ Already exists: `e2e/docker-compose.test.yml`. No changes needed — supports bo
 |--------|--------------|
 | Base image | `nvcr.io/nvidia/l4t-jetpack:r36.4.0` (JetPack 6.2.x-compatible, aarch64) |
 | TensorRT | Pre-installed via JetPack — `python3-libnvinfer` apt package (NOT pip) |
-| PyCUDA | Pre-installed via JetPack — `python3-pycuda` apt package (NOT pip) |
+| PyCUDA | Installed via pip in `requirements-jetson.txt` because `python3-pycuda` is not available in the `l4t-jetpack:r36.4.0` apt repositories |
 | Build stages | Single stage (Cython compile requires gcc) |
 | Non-root user | `adduser --disabled-password --gecos '' appuser` + `USER appuser` |
 | Exposed ports | 8080 |
@@ -129,7 +129,7 @@ Already exists: `e2e/docker-compose.test.yml`. No changes needed — supports bo
 | Runtime | Requires NVIDIA Container Runtime (`runtime: nvidia` in docker-compose) |
 
 **Jetson-specific behaviour**:
-- `requirements-jetson.txt` derives from `requirements.txt` — `tensorrt` and `pycuda` are excluded from pip; TensorRT and PyCUDA are installed from the JetPack/L4T apt packages in `Dockerfile.jetson`
+- `requirements-jetson.txt` derives from `requirements.txt` — `tensorrt` is excluded from pip and installed from the JetPack/L4T apt packages in `Dockerfile.jetson`; PyCUDA is installed via pip on this image line because the apt package is unavailable
 - Engine filename auto-encodes CC+SM (e.g. `azaion.cc_8.7_sm_16.engine` for Orin Nano), ensuring the Jetson engine is distinct from any x86-cached engine
 - INT8 is used when `azaion.int8_calib.cache` is available on the Loader service; precision suffix appended to engine filename (`*.int8.engine`); FP16 fallback when cache is absent
 - `docker-compose.jetson.yml` uses `runtime: nvidia` for the NVIDIA Container Runtime

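The engine naming scheme described in the Jetson-specific behaviour list above can be illustrated with a small helper. This is a sketch only: `engine_filename` is a hypothetical name, and it assumes that only INT8 engines carry a precision suffix, since the spec shows `*.int8.engine` and an unsuffixed example but does not spell out the FP16 filename.

```python
def engine_filename(cc_major: int, cc_minor: int, sm_count: int,
                    precision: str = "fp16") -> str:
    """Build an engine filename encoding compute capability, SM count, and precision."""
    if precision not in ("fp16", "int8"):
        raise ValueError(f"unsupported precision: {precision}")
    stem = f"azaion.cc_{cc_major}.{cc_minor}_sm_{sm_count}"
    # Assumption: FP16 keeps the unsuffixed name, INT8 appends ".int8".
    suffix = ".int8" if precision == "int8" else ""
    return f"{stem}{suffix}.engine"


print(engine_filename(8, 7, 16))          # azaion.cc_8.7_sm_16.engine
print(engine_filename(8, 7, 16, "int8"))  # azaion.cc_8.7_sm_16.int8.engine
```

Because the compute capability, SM count, and precision all appear in the name, an engine cached for one device or precision would not be picked up for another, which is the property the Constraints section asks for.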
@@ -12,3 +12,4 @@ requests==2.32.4
 loguru==0.7.3
 av==14.2.0
 xxhash==3.5.0
+pycuda