mirror of
https://github.com/azaion/detections-semantic.git
synced 2026-04-22 20:36:37 +00:00
# Containerization Plan

## Container Architecture

| Container | Base Image | Purpose | GPU Access |
|-----------|-----------|---------|------------|
| semantic-detection | nvcr.io/nvidia/l4t-tensorrt:r36.x (JetPack 6.2) | Main detection service (Cython + TRT + scan controller + gimbal + recorder) | Yes (TRT inference) |
| vlm-service | dustynv/nanollm:r36 (NanoLLM for JetPack 6) | VLM inference (VILA1.5-3B, 4-bit MLC) | Yes (GPU inference) |

## Dockerfile: semantic-detection

```dockerfile
# Outline for planning purposes, not a tested build
FROM nvcr.io/nvidia/l4t-tensorrt:r36.x

# System dependencies
# NOTE: pip3 and python3 resolve to the distro default interpreter, which may
# not be python3.11; confirm which interpreter the build and entrypoint use.
RUN apt-get update \
    && apt-get install -y python3.11 python3-pip libopencv-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Python dependencies: pyserial, crcmod, scikit-image, pyyaml
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Cython build
COPY src/ /app/src/
RUN cd /app/src && python3 setup.py build_ext --inplace

# Config, models, and output are mounted as volumes
VOLUME ["/models", "/etc/semantic-detection", "/data/output"]

ENTRYPOINT ["python3", "/app/src/main.py"]
```

## Dockerfile: vlm-service

The vlm-service container uses the pre-built NanoLLM Docker image, so no custom Dockerfile is needed. Configuration is supplied through environment variables and volume mounts.

```yaml
# docker-compose snippet
vlm-service:
  image: dustynv/nanollm:r36
  runtime: nvidia
  environment:
    - MODEL=VILA1.5-3B
    - QUANTIZATION=w4a16
  volumes:
    - vlm-models:/models
    - vlm-socket:/tmp
  ipc: host
  shm_size: 8g
```

## Volume Strategy

| Volume | Mount Point | Contents | Persistence |
|--------|-----------|----------|-------------|
| models | /models | TRT FP16 engines (yoloe-11s-seg.engine, yoloe-26s-seg.engine, mobilenetv3.engine) | Persistent on NVMe |
| config | /etc/semantic-detection | config.yaml, class definitions | Persistent on NVMe |
| output | /data/output | Detection logs, recorded frames, gimbal logs | Persistent on NVMe (circular buffer) |
| vlm-models | /models (vlm-service) | VILA1.5-3B MLC weights | Persistent on NVMe |
| vlm-socket | /tmp (both containers) | Unix domain socket for IPC | Ephemeral |

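
The table marks the output volume as a circular buffer but does not say how old data is evicted. One possible approach, sketched below, deletes the oldest files once the directory exceeds a byte budget. The function name `prune_oldest`, the budget parameter, and the choice of modification time as the age criterion are all assumptions made for illustration, not part of the plan.

```python
from pathlib import Path


def prune_oldest(directory: str, max_bytes: int) -> list[str]:
    """Delete the oldest files in `directory` until its total size fits `max_bytes`.

    Hypothetical helper: returns the names of the deleted files, oldest first.
    """
    # Collect (mtime, size, path) for regular files only; subdirectories are ignored.
    entries = []
    for path in Path(directory).iterdir():
        if path.is_file():
            stat = path.stat()
            entries.append((stat.st_mtime, stat.st_size, path))

    entries.sort(key=lambda entry: entry[0])  # oldest modification time first
    total = sum(size for _, size, _ in entries)

    deleted = []
    for _, size, path in entries:
        if total <= max_bytes:
            break  # the directory now fits the budget
        path.unlink()
        total -= size
        deleted.append(path.name)
    return deleted
```

A recorder could call something like `prune_oldest("/data/output", budget)` after each write batch. Frames that are still being written or referenced would need to be excluded, which this sketch does not handle.
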
## GPU Sharing

Both containers share the same GPU. Sequential scheduling is enforced at the application level:

- During Level 1, only semantic-detection uses the GPU (YOLOE inference).
- During Level 2 Tier 3, semantic-detection pauses YOLOE while vlm-service runs VLM inference.
- Both containers run with `--runtime=nvidia`, but application logic prevents concurrent GPU access.

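
The handoff described above could look roughly like the following sketch: the detection side pauses its loop, sends one blocking request over the socket, and resumes only after the reply arrives. Everything here is an assumption for illustration: the `DetectionLoop` stand-in, the newline-delimited JSON messages, and the in-process `stub_vlm` thread. `socket.socketpair()` substitutes for the Unix domain socket on the shared `/tmp` volume so the example runs in one process.

```python
import json
import socket
import threading


class DetectionLoop:
    """Stand-in for the YOLOE loop; it only tracks whether it may use the GPU."""

    def __init__(self) -> None:
        self._running = threading.Event()
        self._running.set()

    def pause(self) -> None:
        self._running.clear()

    def resume(self) -> None:
        self._running.set()

    @property
    def uses_gpu(self) -> bool:
        return self._running.is_set()


def request_vlm(sock: socket.socket, loop: DetectionLoop, prompt: str) -> dict:
    """Pause detection, run one blocking VLM request, then resume detection."""
    loop.pause()
    try:
        sock.sendall(json.dumps({"prompt": prompt}).encode() + b"\n")
        with sock.makefile("r") as reader:
            return json.loads(reader.readline())
    finally:
        loop.resume()  # resume even if the VLM call fails


def stub_vlm(sock: socket.socket, loop: DetectionLoop) -> None:
    """Hypothetical VLM side: answers one request and reports GPU exclusivity."""
    with sock.makefile("r") as reader:
        request = json.loads(reader.readline())
    answer = {"echo": request["prompt"], "detector_was_paused": not loop.uses_gpu}
    sock.sendall(json.dumps(answer).encode() + b"\n")


def demo() -> dict:
    """Run one request/reply round trip between the two sides."""
    det_sock, vlm_sock = socket.socketpair()
    loop = DetectionLoop()
    worker = threading.Thread(target=stub_vlm, args=(vlm_sock, loop))
    worker.start()
    try:
        return request_vlm(det_sock, loop, "describe frame")
    finally:
        worker.join()
        det_sock.close()
        vlm_sock.close()
```

In the real deployment the two sides live in separate containers, so the VLM service cannot inspect the detector's state directly. The `detector_was_paused` field exists only to make the ordering visible in this single-process sketch. A production version would also need a timeout so a stalled VLM call cannot block detection indefinitely.
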
## Resource Limits

| Container | Memory Limit | CPU Limit | GPU |
|-----------|-------------|-----------|-----|
| semantic-detection | 4 GB | No limit (all 6 cores available) | Shared |
| vlm-service | 4 GB | No limit | Shared |

Note: Treat these limits as soft targets. The CPU and GPU share the same LPDDR5, so actual allocation is dynamic. Application-level monitoring (HealthMonitor) tracks actual usage.

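
The plan does not describe how HealthMonitor measures usage. As one possibility, a monitor on Linux could read system-wide figures from `/proc/meminfo`. The sketch below shows only the parsing step; the function names and the choice of `MemTotal` and `MemAvailable` as the tracked fields are assumptions, not the actual HealthMonitor design.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; unit suffixes are ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_fraction(meminfo: dict[str, int]) -> float:
    """Return the fraction of total memory that is not currently available."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) / total


if __name__ == "__main__":
    # On the target device this would read the live file instead of a sample.
    with open("/proc/meminfo") as handle:
        print(f"memory in use: {used_fraction(parse_meminfo(handle.read())):.1%}")
```

Whether GPU allocations appear in these counters on the target board should be verified on the device before relying on them for alerts.
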
## Development Environment

```yaml
# docker-compose.dev.yaml
services:
  semantic-detection:
    build: .
    environment:
      - ENV=development
      - GIMBAL_MODE=mock_tcp
      - INFERENCE_ENGINE=onnxruntime
    volumes:
      - ./src:/app/src
      - ./config/config.dev.yaml:/etc/semantic-detection/config.yaml
    ports:
      - "8080:8080"

  vlm-stub:
    build: ./tests/vlm_stub
    volumes:
      - vlm-socket:/tmp

  mock-gimbal:
    build: ./tests/mock_gimbal
    ports:
      - "9090:9090"

# Named volumes must be declared at the top level for Compose to accept them
volumes:
  vlm-socket:
```

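
The development compose file switches behaviour through three environment variables. A minimal sketch of how the service might read them is shown below. The `RuntimeSettings` class, the `load_settings` function, and the fallback values (`production`, `serial`, `tensorrt`) are hypothetical; only the variable names and the development values come from the compose file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeSettings:
    env: str
    gimbal_mode: str
    inference_engine: str


def load_settings(environ: Optional[Mapping[str, str]] = None) -> RuntimeSettings:
    """Build settings from environment variables, with assumed defaults."""
    source = os.environ if environ is None else environ
    return RuntimeSettings(
        env=source.get("ENV", "production"),                      # assumed default
        gimbal_mode=source.get("GIMBAL_MODE", "serial"),          # assumed default
        inference_engine=source.get("INFERENCE_ENGINE", "tensorrt"),  # assumed default
    )
```

Passing a plain mapping instead of `os.environ` keeps the loader easy to test without modifying the process environment.
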