# Component: API

## Overview

**Purpose**: HTTP API layer exposing object detection capabilities via FastAPI — handles request/response serialization, async task management, SSE streaming, and authentication token forwarding.

**Pattern**: Controller layer — a thin API surface that delegates all business logic to the Inference Pipeline.

**Upstream**: Inference Pipeline (`Inference` class), Domain (`constants_inf` for labels).

**Downstream**: None (top-level, client-facing).

## Modules

| Module | Role |
|--------|------|
| `main` | FastAPI app definition, endpoints, DTOs, TokenManager, SSE streaming |

## External API Specification

### GET /health

**Response**: `HealthResponse`

```json
{
  "status": "healthy",
  "aiAvailability": "Enabled",
  "errorMessage": null
}
```

`aiAvailability` values: `None`, `Downloading`, `Converting`, `Uploading`, `Enabled`, `Warning`, `Error`.

### POST /detect

**Input**: Multipart form — `file` (image bytes), optional `config` (JSON string).

**Response**: `list[DetectionDto]`

```json
[
  {
    "centerX": 0.5,
    "centerY": 0.5,
    "width": 0.1,
    "height": 0.1,
    "classNum": 0,
    "label": "ArmorVehicle",
    "confidence": 0.85
  }
]
```

**Errors**: 400 (empty image / invalid data), 422 (runtime error), 503 (engine unavailable).

### POST /detect/{media_id}

**Input**: Path param `media_id`, optional JSON body `AIConfigDto`, headers `Authorization: Bearer {token}` and `x-refresh-token: {token}`.

**Response**: `{"status": "started", "mediaId": "..."}` (202-style).

**Errors**: 409 (duplicate detection for the same `media_id`).

**Side effects**: Starts an async detection task; results are delivered via the SSE stream and/or posted to the Annotations service.

### GET /detect/stream

**Response**: `text/event-stream` (SSE).

```
data: {"annotations": [...], "mediaId": "...", "mediaStatus": "AIProcessing", "mediaPercent": 50}
```

`mediaStatus` values: `AIProcessing`, `AIProcessed`, `Error`.
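The per-client queue fan-out behind `/detect/stream` can be sketched as follows. This is a minimal illustration of the pattern described in this document (one `asyncio.Queue` per connected client, events broadcast to all queues, full queues silently dropping events); the names `broadcast` and `demo` are hypothetical, not taken from `main.py`.

```python
import asyncio
import json

# One queue per connected SSE client, as described above.
_event_queues: list[asyncio.Queue] = []

def broadcast(event: dict) -> None:
    """Push an event to every client's queue; drop it for clients whose
    queue is full rather than block the producer (the documented policy)."""
    payload = json.dumps(event)
    for queue in _event_queues:
        try:
            queue.put_nowait(payload)
        except asyncio.QueueFull:
            pass  # slow client: event is silently dropped

async def demo() -> list[str]:
    # maxsize=100 matches the queue size stated in this document.
    q: asyncio.Queue = asyncio.Queue(maxsize=100)
    _event_queues.append(q)
    broadcast({"mediaId": "abc", "mediaStatus": "AIProcessing", "mediaPercent": 50})
    received = [await q.get()]
    _event_queues.remove(q)
    return received

events = asyncio.run(demo())
```

A real SSE handler would wrap `await queue.get()` in a generator yielding `data: {payload}\n\n` lines; the drop-on-overflow choice trades delivery guarantees for a producer that never stalls on a slow consumer.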
## Data Access Patterns

- In-memory state:
  - `_active_detections: dict[str, bool]` — guards against duplicate processing of the same media
  - `_event_queues: list[asyncio.Queue]` — SSE client queues (`maxsize=100`)
- No database access

## Implementation Details

- `Inference` is lazy-loaded on first use via the `get_inference()` global function
- A `ThreadPoolExecutor(max_workers=2)` runs inference off the async event loop
- SSE: one `asyncio.Queue` per connected client; events are broadcast to all queues; full queues silently drop events
- `TokenManager` decodes the JWT `exp` claim from the base64 payload (no signature verification) and auto-refreshes 60 s before expiry
- `detection_to_dto` maps `Detection` fields to `DetectionDto`, looking up the label in `constants_inf.annotations_dict`
- Annotations are posted to the external service with a base64-encoded frame image

## Caveats

- No CORS middleware configured
- No rate limiting
- No request body size limits beyond FastAPI defaults
- `_active_detections` is an in-memory dict — not persistent across restarts, not distributed
- SSE queue overflow silently drops events (`QueueFull` is caught and ignored)
- JWT token handling has no signature verification — auth relies entirely on the Annotations service
- No graceful shutdown handling for in-progress detections

## Dependency Graph

```mermaid
graph TD
    main --> inference
    main --> constants_inf
    main --> loader_http_client
```

## Logging Strategy

No explicit logging in `main.py` — errors are caught and returned as HTTP responses; logging happens in downstream components.