# Azaion.Detections — Solution

## 1. Product Solution Description

Azaion.Detections is a microservice that performs automated object detection on aerial imagery and video. It accepts media over an HTTP API, runs inference through an ONNX Runtime or TensorRT engine, and returns structured detection results (bounding boxes, class labels, confidence scores). Results are returned synchronously for single images and streamed via Server-Sent Events (SSE) for batch and video processing.

```mermaid
graph LR
    Client["Client App"] -->|HTTP| API["FastAPI API"]
    API -->|delegates| INF["Inference Pipeline"]
    INF -->|runs| ENG["ONNX / TensorRT Engine"]
    INF -->|downloads models| LDR["Loader Service"]
    API -->|posts results| ANN["Annotations Service"]
    API -->|streams| SSE["SSE Clients"]
```
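To make the streaming path concrete, here is a minimal sketch of how a batch of detections could be serialized as one SSE message. The `Detection` fields, the `format_sse_event` helper, and the `detections` event name are assumptions for illustration, not the service's actual DTOs; only the SSE wire format (`field: value` lines terminated by a blank line) is standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    """Hypothetical detection record; the field names are assumptions."""
    class_name: str
    confidence: float
    bbox: tuple  # (x, y, width, height), normalized to [0, 1]


def format_sse_event(detections, event="detections"):
    """Serialize a list of detections as a single Server-Sent Events message."""
    payload = json.dumps([asdict(d) for d in detections])
    # An SSE message is a set of "field: value" lines ended by a blank line.
    return f"event: {event}\ndata: {payload}\n\n"


message = format_sse_event([Detection("vehicle", 0.91, (0.40, 0.55, 0.05, 0.03))])
print(message, end="")
```

A client reading the stream would split on blank lines and parse the `data:` field of each message as JSON.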

## 2. Architecture

### Component Architecture

| Component | Modules | Responsibility |
|-----------|---------|---------------|
| Domain | constants_inf, ai_config, ai_availability_status, annotation | Shared data models, constants, logging, class registry |
| Inference Engines | inference_engine, onnx_engine, tensorrt_engine | Pluggable ML backends (Strategy pattern) |
| Inference Pipeline | inference, loader_http_client | Engine lifecycle, preprocessing, postprocessing, media processing |
| API | main | HTTP endpoints, SSE streaming, auth token forwarding |

### Solution Assessment

| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| Cython inference pipeline | Python 3, Cython 3.1.3, OpenCV 4.10 | Near-C performance for tight detection loops while retaining the Python ecosystem | Build complexity, limited IDE/debug support | Compilation step via setup.py | N/A | Low (open-source) | High: critical for postprocessing throughput |
| Dual engine strategy (TensorRT + ONNX) | TensorRT 10.11, ONNX Runtime 1.22 | Maximum GPU speed with CPU fallback; auto-conversion and caching | Two code paths; GPU-specific engine files are not portable | NVIDIA GPU (compute capability ≥ 6.1) for TensorRT | N/A | TensorRT is free to use on NVIDIA GPUs | High: balances performance and portability |
| FastAPI HTTP service | FastAPI, Uvicorn, Pydantic | Async SSE, auto-generated docs, fast development | Sync inference offloaded to ThreadPoolExecutor (2 workers) | Python 3.8+ | Bearer token pass-through | Low (open-source) | High: fits the async streaming + sync inference pattern |
| GSD-based image tiling | OpenCV, NumPy | Preserves small object detail in large aerial images | Complex tile dedup logic; overlap increases compute | Camera metadata (altitude, focal length, sensor width) | N/A | Compute cost scales with image size | High: essential for the aerial imagery use case |
| Lazy engine initialization | pynvml, threading | Fast API startup; background model conversion | First request has high latency; engine may be unavailable | None | N/A | N/A | High: prevents blocking startup on slow model download/conversion |
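The "sync inference offloaded to ThreadPoolExecutor (2 workers)" limitation in the table above follows a common asyncio pattern, sketched below. The `run_inference` and `detect` functions are hypothetical stand-ins; the actual handler code in `main.py` may be organized differently.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Two workers, matching the limit noted in the table above.
executor = ThreadPoolExecutor(max_workers=2)


def run_inference(media_id):
    """Stand-in for the blocking inference call (hypothetical)."""
    time.sleep(0.05)  # simulate synchronous work
    return {"media_id": media_id, "detections": []}


async def detect(media_id):
    """Await blocking inference without stalling the event loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, run_inference, media_id)


async def main():
    # Three requests share two workers, so the third waits for a free thread.
    return await asyncio.gather(*(detect(i) for i in range(3)))


results = asyncio.run(main())
print([r["media_id"] for r in results])  # [0, 1, 2]
```

The event loop stays free to serve SSE clients while inference runs, but at most two inference calls execute at once; further requests queue inside the executor.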

## 3. Testing Strategy

### Current State

The codebase contains no tests: there are no test directories, test frameworks, or test runner configurations.

### Observed Validation Mechanisms

- Detection confidence threshold filtering (`probability_threshold`)
- Overlapping detection removal (containment-biased NMS)
- Physical size filtering via ground sampling distance (GSD) and `max_object_size_meters`
- Tile deduplication via coordinate proximity
- Video annotation validity heuristics (time gap, movement, confidence)
- AI availability status tracking with error states
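Two of the mechanisms above, confidence thresholding and physical size filtering, can be sketched together. This is an illustration under stated assumptions: the `Detection` fields and function names are hypothetical, a single `max_object_size_meters` value is passed in rather than looked up per class, and comparing the larger box side against the limit is an assumed rule. The GSD computation uses the standard formula, altitude × sensor width / (focal length × image width).

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection; box size is in pixels."""
    label: str
    confidence: float
    width_px: float
    height_px: float


def ground_sampling_distance(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Meters of ground covered by one pixel (standard GSD formula)."""
    return (altitude_m * sensor_width_mm) / (focal_length_mm * image_width_px)


def filter_detections(detections, probability_threshold, gsd_m_per_px, max_object_size_meters):
    """Keep detections that are confident enough and physically plausible."""
    kept = []
    for det in detections:
        if det.confidence < probability_threshold:
            continue  # below the confidence threshold
        size_m = max(det.width_px, det.height_px) * gsd_m_per_px
        if size_m > max_object_size_meters:
            continue  # larger than the configured physical limit
        kept.append(det)
    return kept


gsd = ground_sampling_distance(
    altitude_m=400, focal_length_mm=24, sensor_width_mm=24, image_width_px=4000
)  # 0.1 m per pixel
candidates = [
    Detection("vehicle", 0.90, 40, 20),    # 4 m wide: kept
    Detection("vehicle", 0.20, 40, 20),    # low confidence: dropped
    Detection("vehicle", 0.80, 300, 200),  # 30 m wide: dropped
]
kept = filter_detections(candidates, probability_threshold=0.25,
                         gsd_m_per_px=gsd, max_object_size_meters=10)
print(len(kept))  # 1
```

Converting pixel sizes to meters before filtering makes the size limit independent of flight altitude and camera optics.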

## 4. References

| Artifact | Path | Description |
|----------|------|-------------|
| FastAPI application | `main.py` | API endpoints, DTOs, SSE streaming |
| Inference orchestrator | `inference.pyx` / `.pxd` | Core pipeline logic |
| Engine interface | `inference_engine.pyx` / `.pxd` | Abstract base class |
| ONNX engine | `onnx_engine.pyx` | CPU/CUDA inference |
| TensorRT engine | `tensorrt_engine.pyx` / `.pxd` | GPU inference + conversion |
| Detection models | `annotation.pyx` / `.pxd` | Detection and Annotation classes |
| Configuration | `ai_config.pyx` / `.pxd` | AIRecognitionConfig |
| Status tracking | `ai_availability_status.pyx` / `.pxd` | Engine lifecycle status |
| Constants & logging | `constants_inf.pyx` / `.pxd` | Constants, class registry, logging |
| HTTP client | `loader_http_client.py` | Model download/upload |
| Class definitions | `classes.json` | 19 detection classes with metadata |
| Build config | `setup.py` | Cython compilation |
| CPU dependencies | `requirements.txt` | Python package versions |
| GPU dependencies | `requirements-gpu.txt` | TensorRT, PyCUDA additions |