gps-denied-onboard/docs/02_components/04_Model_Registry/spec.md
2025-11-19 23:07:29 +02:00

# Model Registry Component
## Detailed Description
The **Model Registry** is a centralized manager for all deep learning models (SuperPoint, LightGlue, AnyLoc, LiteSAM). It abstracts the loading mechanism, supporting both **TensorRT** (for production/GPU) and **PyTorch/ONNX** (for fallback/CPU/Sandbox).
It implements the "Factory" pattern, delivering initialized and ready-to-infer model wrappers to the Layer components. It also manages GPU resource allocation (e.g., memory growth).
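The factory idea can be sketched as follows. This is an illustrative outline, not the real implementation: `ModelRegistry`, `_PyTorchWrapper`, and the `_BUILDERS` table are hypothetical names, and the wrapper's `infer()` is a placeholder for a real forward pass.

```python
from typing import Dict

class _PyTorchWrapper:
    """Illustrative wrapper; a real one would hold a torch.nn.Module."""
    def __init__(self, name: str):
        self.name = name
        self.backend = "pytorch"

    def infer(self, data):
        # Placeholder for a real forward pass.
        return {"model": self.name, "input": data}

class ModelRegistry:
    """Factory that builds, caches, and hands out ready-to-infer wrappers."""
    _BUILDERS = {
        "superpoint": _PyTorchWrapper,
        "lightglue": _PyTorchWrapper,
        "anyloc": _PyTorchWrapper,
        "litesam": _PyTorchWrapper,
    }

    def __init__(self):
        self._loaded: Dict[str, _PyTorchWrapper] = {}

    def load_model(self, model_name: str, backend: str = "auto"):
        if model_name not in self._BUILDERS:
            raise KeyError(f"unknown model: {model_name!r}")
        # Cache so repeated loads return the same ready-to-infer wrapper.
        if model_name not in self._loaded:
            self._loaded[model_name] = self._BUILDERS[model_name](model_name)
        return self._loaded[model_name]
```

Layer components only ever see the wrapper's uniform `infer()` interface, so the backend choice stays an internal detail of the registry.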
## API Methods
### `load_model`
- **Input:** `model_name: str` (e.g., "superpoint"), `backend: str` ("tensorrt" | "pytorch" | "auto")
- **Output:** `ModelWrapper`
- **Description:** Loads the specified model weights.
- If `backend="auto"`, TensorRT is attempted first; if that fails (or no GPU is available), it falls back to PyTorch.
- Returns a wrapper object that exposes a uniform `infer()` method regardless of backend.
- **Test Cases:**
- Load "superpoint", backend="pytorch" -> Success.
- Load invalid name -> Error.
- Load with `backend="tensorrt"` on a CPU-only machine -> Fallback or Error (depending on strictness).
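The `auto` fallback policy above can be sketched as a try-first-then-fall-back helper. The loader callables are stand-ins; in the real registry they would deserialize a TensorRT engine and build a PyTorch module, respectively.

```python
def load_auto(model_name, load_tensorrt, load_pytorch):
    """backend="auto": prefer TensorRT, fall back to PyTorch on any failure."""
    try:
        return load_tensorrt(model_name)
    except Exception:
        # Covers: no GPU present, missing .engine file, TensorRT not installed.
        return load_pytorch(model_name)
```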
### `unload_model`
- **Input:** `model_name: str`
- **Output:** `void`
- **Description:** Frees GPU/RAM resources associated with the model.
- **Test Cases:**
- Unload loaded model -> Memory released.
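A minimal sketch of the unload path, assuming the registry keeps a `loaded` dict of wrappers and each wrapper may expose a `close()` hook for backend-specific cleanup (both assumptions, not the real API):

```python
def unload_model(loaded: dict, model_name: str) -> None:
    """Drop the wrapper and let the backend release its resources."""
    wrapper = loaded.pop(model_name, None)
    if wrapper is not None and hasattr(wrapper, "close"):
        wrapper.close()  # backend-specific cleanup, e.g. destroying a TensorRT
                         # execution context or releasing cached CUDA memory
```

Popping the entry also drops the registry's reference, so Python can garbage-collect the underlying model object.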
### `list_available_models`
- **Input:** `void`
- **Output:** `List[str]`
- **Description:** Returns the list of registered models whose weight files are present on disk.
## Integration Tests
- **Load All:** Iterate through all required models and verify they load successfully in the sandbox environment (likely PyTorch mode).
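The "Load All" check can be written as a single loop that collects failures instead of aborting on the first one, so one missing model does not mask the others. The `load_model` callable here is a stand-in for the registry's real method:

```python
REQUIRED_MODELS = ["superpoint", "lightglue", "anyloc", "litesam"]

def check_all_models_load(load_model):
    """Integration check: every required model must load via backend="auto"."""
    failures = []
    for name in REQUIRED_MODELS:
        try:
            wrapper = load_model(name, backend="auto")
            assert wrapper is not None
        except Exception as exc:
            failures.append((name, exc))
    return failures  # empty list == pass
```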
## Non-functional Tests
- **Warmup Time:** Measure time from `load_model` to first successful inference.
- **Switching Overhead:** Measure latency if models need to be swapped in/out of VRAM (though ideally all stay loaded).
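The warmup measurement can be sketched as follows; `load_model` and the wrapper's `infer()` are stand-ins for the real registry API, and the first inference is included because it may trigger lazy initialization or CUDA kernel compilation:

```python
import time

def measure_warmup(load_model, dummy_input):
    """Seconds from the load_model call to the first successful inference."""
    t0 = time.perf_counter()
    model = load_model()
    model.infer(dummy_input)  # first call may pay one-time lazy-init costs
    return time.perf_counter() - t0
```

In practice this would be run per model and per backend, since TensorRT and PyTorch warmup profiles differ.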