mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 21:56:33 +00:00
# Module: onnx_engine
## Purpose
ONNX Runtime-based inference engine; serves as the CPU/CUDA fallback when TensorRT is unavailable.
## Public Interface
### Class: OnnxEngine (extends InferenceEngine)
| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(bytes model_bytes, int batch_size=1, **kwargs)` | Loads the ONNX model from bytes and creates an `InferenceSession` with CUDA > CPU provider priority. Reads the input shape and batch size from model metadata. |
| `get_input_shape` | `() -> tuple` | Returns `(height, width)` from the input tensor shape. |
| `get_batch_size` | `() -> int` | Returns the batch size (from the model if static, else from the constructor). |
| `run` | `(input_data) -> list` | Runs session inference and returns the output tensors. |
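
As a rough sketch of the `get_input_shape` behaviour, assuming an NCHW input layout (the standalone function below is an illustration, not the module's actual code):

```python
def get_input_shape(onnx_input_shape):
    """Return (height, width) from an ONNX input shape list.

    Assumes an NCHW layout like the [N, C, H, W] shapes onnxruntime
    reports via session.get_inputs()[0].shape.
    """
    _, _, height, width = onnx_input_shape
    return (height, width)
```

For example, a `[1, 3, 640, 640]` input shape would yield `(640, 640)`.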
## Internal Logic
- Provider order: `["CUDAExecutionProvider", "CPUExecutionProvider"]`; ONNX Runtime tries providers in this order and uses the first one available.
- If the model's batch dimension is dynamic (-1), uses the constructor's `batch_size` parameter.
- Logs model input metadata and custom metadata map at init.
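
The dynamic-batch rule in the second bullet can be sketched as a standalone function (the name and exact semantics are assumptions, not the module's actual code):

```python
def resolve_batch_size(model_batch_dim, constructor_batch_size=1):
    """Pick the effective batch size.

    ONNX models expose a dynamic batch dimension as -1 or as a symbolic
    string (e.g. "N"); only a positive integer is treated as static.
    """
    if isinstance(model_batch_dim, int) and model_batch_dim > 0:
        return model_batch_dim         # static: baked into the model
    return constructor_batch_size      # dynamic: fall back to the constructor value
```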
## Dependencies
- **External**: `onnxruntime`
- **Internal**: `inference_engine` (base class), `constants_inf` (logging)
## Consumers
- `inference` — instantiated when no compatible NVIDIA GPU is found
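
The selection in `inference` presumably resembles the following (the helper and the TensorRT engine-class name are hypothetical; the actual module's logic may differ):

```python
def select_engine_name(compatible_gpu_found: bool) -> str:
    """Return which engine class to instantiate.

    Hypothetical helper: the real `inference` module checks for a
    compatible NVIDIA GPU and falls back to OnnxEngine when none is found.
    """
    return "TrtEngine" if compatible_gpu_found else "OnnxEngine"
```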
## Data Models
None (wraps onnxruntime.InferenceSession).
## Configuration
None.
## External Integrations
None directly; model bytes are provided by the caller (loaded via `loader_http_client`).
## Security
None.
## Tests
None found.