Add detailed file index and enhance skill documentation for autopilot, decompose, deploy, plan, and research skills. Introduce tests-only mode in decompose skill, clarify required files for deploy and plan skills, and improve prerequisite checks across skills for better user guidance and workflow efficiency.

This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-03-22 16:15:49 +02:00
parent 60ebe686ff
commit 3165a88f0b
60 changed files with 6324 additions and 1550 deletions
@@ -0,0 +1,51 @@
# Module: onnx_engine
## Purpose
ONNX Runtime-based inference engine — CPU/CUDA fallback when TensorRT is unavailable.
## Public Interface
### Class: OnnxEngine (extends InferenceEngine)
| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(model_bytes: bytes, batch_size: int = 1, **kwargs)` | Loads an ONNX model from bytes and creates an `InferenceSession` with CUDA > CPU provider priority. Reads the input shape and batch size from model metadata. |
| `get_input_shape` | `() -> tuple` | Returns `(height, width)` from input tensor shape |
| `get_batch_size` | `() -> int` | Returns batch size (from model if not dynamic, else from constructor) |
| `run` | `(input_data) -> list` | Runs session inference, returns output tensors |
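The shape and batch accessors above can be sketched as small pure-Python helpers. This is a hedged illustration, not the module's actual code: the NCHW input layout and the helper names `input_hw` / `resolve_batch_size` are assumptions, since the document does not show the implementation.

```python
# Hypothetical helpers mirroring get_input_shape / get_batch_size,
# assuming an NCHW input tensor shape as onnxruntime reports it.

def input_hw(input_shape):
    """Return (height, width) from an NCHW shape such as [1, 3, 640, 640]."""
    n, c, h, w = input_shape
    return (h, w)

def resolve_batch_size(input_shape, constructor_batch_size):
    """Use the model's batch dimension unless it is dynamic.

    onnxruntime reports a dynamic dimension as a symbolic string name
    (or -1/None); in that case fall back to the constructor's value.
    """
    batch = input_shape[0]
    if isinstance(batch, int) and batch > 0:
        return batch
    return constructor_batch_size
```

For a fixed-shape detector input `[1, 3, 640, 640]`, `input_hw` yields `(640, 640)` and the model's own batch size of 1 wins; for a symbolic batch dimension the constructor's `batch_size` is used instead.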
## Internal Logic
- Provider order: `["CUDAExecutionProvider", "CPUExecutionProvider"]` — ONNX Runtime selects the best available.
- If the model's batch dimension is dynamic (-1), uses the constructor's `batch_size` parameter.
- Logs model input metadata and custom metadata map at init.
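The provider fallback described above can be sketched as follows. ONNX Runtime selects the first requested provider that is actually installed, so filtering the preferred order against the available set mirrors its behaviour; the function name `select_providers` is a hypothetical helper, not part of the module's documented interface.

```python
# Sketch of the CUDA > CPU provider-priority logic (assumed helper).

PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]

def select_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are available, in priority order."""
    have = set(available)
    return [p for p in preferred if p in have]
```

With a real session this would look something like `ort.InferenceSession(model_bytes, providers=select_providers(ort.get_available_providers()))`, letting onnxruntime fall back to CPU when no CUDA provider is installed.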
## Dependencies
- **External**: `onnxruntime`
- **Internal**: `inference_engine` (base class), `constants_inf` (logging)
## Consumers
- `inference` — instantiated when no compatible NVIDIA GPU is found
## Data Models
None (wraps onnxruntime.InferenceSession).
## Configuration
None.
## External Integrations
None directly — model bytes are provided by caller (loaded via `loader_http_client`).
## Security
None.
## Tests
None found.