Mirror of https://github.com/azaion/detections.git (synced 2026-04-22).
# Module: `onnx_engine`

## Purpose
ONNX Runtime-based inference engine — CPU/CUDA fallback when TensorRT is unavailable.
## Public Interface

### Class: `OnnxEngine` (extends `InferenceEngine`)
| Method | Signature | Description |
|---|---|---|
| `__init__` | `(bytes model_bytes, int batch_size=1, **kwargs)` | Loads the ONNX model from bytes and creates an `InferenceSession` with CUDA > CPU provider priority. Reads the input shape and batch size from model metadata. |
| `get_input_shape` | `() -> tuple` | Returns `(height, width)` from the input tensor shape. |
| `get_batch_size` | `() -> int` | Returns the batch size (from the model if not dynamic, else from the constructor). |
| `run` | `(input_data) -> list` | Runs session inference and returns the output tensors. |
## Internal Logic

- Provider order: `["CUDAExecutionProvider", "CPUExecutionProvider"]`; ONNX Runtime selects the best available.
- If the model's batch dimension is dynamic (`-1`), the constructor's `batch_size` parameter is used.
- Logs model input metadata and the custom metadata map at init.
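The dynamic-batch rule above can be shown in isolation. `resolve_batch_size` is an illustrative helper name, not a function from this module; it mirrors how ONNX reports dynamic dimensions (as `-1` or a symbolic string in the input shape).

```python
def resolve_batch_size(model_batch_dim, constructor_batch_size: int) -> int:
    """Prefer the model's static batch dimension; fall back to the
    constructor argument when the dimension is dynamic.

    ONNX marks a dynamic dimension as -1 or a symbolic string (e.g. "N"),
    which is how onnxruntime surfaces it in the input shape.
    """
    if isinstance(model_batch_dim, int) and model_batch_dim > 0:
        return model_batch_dim  # static batch baked into the model
    return constructor_batch_size  # dynamic: caller decides


resolve_batch_size(8, 1)    # static dim wins -> 8
resolve_batch_size(-1, 4)   # dynamic -1 -> 4
resolve_batch_size("N", 4)  # symbolic dim -> 4
```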
## Dependencies

- External: `onnxruntime`
- Internal: `inference_engine` (base class), `constants_inf` (logging)
## Consumers

- `inference`: instantiated when no compatible NVIDIA GPU is found.
## Data Models

None (wraps `onnxruntime.InferenceSession`).

## Configuration

None.

## External Integrations

None directly; model bytes are provided by the caller (loaded via `loader_http_client`).

## Security

None.

## Tests

None found.