# Component: API

## Overview

**Purpose**: HTTP API layer exposing object detection capabilities via FastAPI — handles request/response serialization, async task management, SSE streaming, and authentication token forwarding.

**Pattern**: Controller layer — a thin API surface that delegates all business logic to the Inference Pipeline.

**Upstream**: Inference Pipeline (`Inference` class), Domain (`constants_inf` for labels).

**Downstream**: None (top-level, client-facing).

## Modules

| Module | Role |
|--------|------|
| `main` | FastAPI app definition, endpoints, DTOs, TokenManager, SSE streaming |

## External API Specification

### GET /health

**Response**: `HealthResponse`

```json
{
  "status": "healthy",
  "aiAvailability": "Enabled",
  "errorMessage": null
}
```

`aiAvailability` values: `None`, `Downloading`, `Converting`, `Uploading`, `Enabled`, `Warning`, `Error`.

### POST /detect

**Input**: Multipart form — `file` (image bytes), optional `config` (JSON string).

**Response**: `list[DetectionDto]`

```json
[
  {
    "centerX": 0.5,
    "centerY": 0.5,
    "width": 0.1,
    "height": 0.1,
    "classNum": 0,
    "label": "ArmorVehicle",
    "confidence": 0.85
  }
]
```

**Errors**: 400 (empty image / invalid data), 422 (runtime error), 503 (engine unavailable).

### POST /detect/{media_id}

**Input**: Path param `media_id`, optional JSON body `AIConfigDto`, headers `Authorization: Bearer {token}` and `x-refresh-token: {token}`.

**Response**: `{"status": "started", "mediaId": "..."}` (202-style).

**Errors**: 409 (duplicate detection for the same `media_id`).

**Side effects**: Starts an async detection task; results are delivered via the SSE stream and/or posted to the Annotations service.

### GET /detect/stream

**Response**: `text/event-stream` (SSE).

```
data: {"annotations": [...], "mediaId": "...", "mediaStatus": "AIProcessing", "mediaPercent": 50}
```

`mediaStatus` values: `AIProcessing`, `AIProcessed`, `Error`.
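The per-client queue fan-out behind `/detect/stream` can be sketched as follows. This is a minimal illustration of the pattern described in this document (one `asyncio.Queue` per connected client, events broadcast to all queues, full queues silently dropping events); the names `broadcast` and `demo` are hypothetical, not taken from `main.py`.

```python
import asyncio
import json

# One queue per connected SSE client, as described above.
_event_queues: list[asyncio.Queue] = []

def broadcast(event: dict) -> None:
    """Push an event to every client's queue; drop it for clients whose
    queue is full rather than block the producer (the documented policy)."""
    payload = json.dumps(event)
    for queue in _event_queues:
        try:
            queue.put_nowait(payload)
        except asyncio.QueueFull:
            pass  # slow client: event is silently dropped

async def demo() -> list[str]:
    # maxsize=100 matches the queue size stated in this document.
    q: asyncio.Queue = asyncio.Queue(maxsize=100)
    _event_queues.append(q)
    broadcast({"mediaId": "abc", "mediaStatus": "AIProcessing", "mediaPercent": 50})
    received = [await q.get()]
    _event_queues.remove(q)
    return received

events = asyncio.run(demo())
```

A real SSE handler would wrap `await queue.get()` in a generator yielding `data: {payload}\n\n` lines; the drop-on-overflow choice trades delivery guarantees for a producer that never stalls on a slow consumer.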
## Data Access Patterns

- In-memory state:
  - `_active_detections: dict[str, bool]` — guards against duplicate processing of the same media
  - `_event_queues: list[asyncio.Queue]` — SSE client queues (`maxsize=100`)
- No database access

## Implementation Details

- `Inference` is lazy-loaded on first use via the `get_inference()` global function
- A `ThreadPoolExecutor(max_workers=2)` runs inference off the async event loop
- SSE: one `asyncio.Queue` per connected client; events are broadcast to all queues; full queues silently drop events
- `TokenManager` decodes the JWT `exp` claim from the base64 payload (no signature verification) and auto-refreshes 60 s before expiry
- `detection_to_dto` maps `Detection` fields to `DetectionDto`, looking up the label in `constants_inf.annotations_dict`
- Annotations are posted to the external service with a base64-encoded frame image

## Caveats

- No CORS middleware configured
- No rate limiting
- No request body size limits beyond FastAPI defaults
- `_active_detections` is an in-memory dict — not persistent across restarts, not distributed
- SSE queue overflow silently drops events (`QueueFull` is caught and ignored)
- JWT token handling has no signature verification — auth relies entirely on the Annotations service
- No graceful shutdown handling for in-progress detections

## Dependency Graph

```mermaid
graph TD
    main --> inference
    main --> constants_inf
    main --> loader_http_client
```

## Logging Strategy

No explicit logging in `main.py` — errors are caught and returned as HTTP responses; logging happens in downstream components.