Component: API

Overview

Purpose: HTTP API layer exposing object detection capabilities via FastAPI — handles request/response serialization, async task management, SSE streaming, and authentication token forwarding.

Pattern: Controller layer — thin API surface that delegates all business logic to the Inference Pipeline.

Upstream: Inference Pipeline (Inference class), Domain (constants_inf for labels).

Downstream: None (top-level, client-facing).

Modules

Module	Role
`main`	FastAPI app definition, endpoints, DTOs, TokenManager, SSE streaming

External API Specification

GET /health

Response: HealthResponse

{
  "status": "healthy",
  "aiAvailability": "Enabled",
  "errorMessage": null
}

aiAvailability values: None, Downloading, Converting, Uploading, Enabled, Warning, Error.

POST /detect

Input: multipart form with file (image bytes) and an optional config (JSON string).

Response: list[DetectionDto]

[
  {
    "centerX": 0.5,
    "centerY": 0.5,
    "width": 0.1,
    "height": 0.1,
    "classNum": 0,
    "label": "ArmorVehicle",
    "confidence": 0.85
  }
]

Errors: 400 (empty image / invalid data), 422 (runtime error), 503 (engine unavailable).
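The response shape above, together with the detection_to_dto mapping listed under Implementation Details, could be sketched roughly as follows. This is a minimal illustration using standard-library dataclasses rather than the real DTO classes. The Detection field names (x, y, w, h, cls) and the structure of annotations_dict (class number to label string) are assumptions, since the source does not show them.

```python
from dataclasses import asdict, dataclass


@dataclass
class Detection:
    # Hypothetical internal detection; the real field names are not documented here.
    x: float
    y: float
    w: float
    h: float
    cls: int
    confidence: float


@dataclass
class DetectionDto:
    # Field names follow the JSON response of POST /detect.
    centerX: float
    centerY: float
    width: float
    height: float
    classNum: int
    label: str
    confidence: float


# Stand-in for constants_inf.annotations_dict (assumed: class number -> label).
annotations_dict = {0: "ArmorVehicle"}


def detection_to_dto(det: Detection) -> DetectionDto:
    """Map an internal detection to the client-facing DTO."""
    # Falling back to "Unknown" for unmapped classes is an assumption.
    label = annotations_dict.get(det.cls, "Unknown")
    return DetectionDto(
        centerX=det.x,
        centerY=det.y,
        width=det.w,
        height=det.h,
        classNum=det.cls,
        label=label,
        confidence=det.confidence,
    )
```

Calling asdict(detection_to_dto(...)) yields a dict with the same keys as one element of the JSON array shown above.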

POST /detect/{media_id}

Input: path param media_id, optional JSON body AIConfigDto, headers Authorization: Bearer {token} and x-refresh-token: {token}.

Response: {"status": "started", "mediaId": "..."} (202-style).

Errors: 409 (duplicate detection for same media_id).

Side effects: starts an async detection task; results are delivered via the SSE stream and/or posted to the Annotations service.

GET /detect/stream

Response: text/event-stream (SSE).

data: {"annotations": [...], "mediaId": "...", "mediaStatus": "AIProcessing", "mediaPercent": 50}

mediaStatus values: AIProcessing, AIProcessed, Error.
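On the client side, each event arrives as a line beginning with "data:" followed by a JSON object. A minimal parsing sketch, assuming single-line data fields as in the example above (parse_sse_data is a hypothetical helper, not part of the service):

```python
import json
from typing import Optional


def parse_sse_data(line: str) -> Optional[dict]:
    """Return the JSON payload of an SSE data line, or None for other lines."""
    if not line.startswith("data:"):
        return None  # comments, blank separators, or other SSE fields
    return json.loads(line[len("data:"):].strip())


event = parse_sse_data(
    'data: {"annotations": [], "mediaId": "abc", '
    '"mediaStatus": "AIProcessing", "mediaPercent": 50}'
)
```

A client would typically stop reading for a given mediaId once mediaStatus is AIProcessed or Error.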

Data Access Patterns

In-memory state:
- _active_detections: dict[str, bool] — guards against duplicate media processing
- _event_queues: list[asyncio.Queue] — SSE client queues (maxsize=100)
No database access
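The per-client queue fan-out can be illustrated with the standard library alone. This is a sketch under the assumptions stated in this document (one bounded queue per client, maxsize=100, overflow ignored); the subscribe and broadcast function names are hypothetical.

```python
import asyncio

_event_queues: list[asyncio.Queue] = []


def subscribe() -> asyncio.Queue:
    """Register a new SSE client and return its bounded queue."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    _event_queues.append(queue)
    return queue


def broadcast(event: dict) -> int:
    """Offer the event to every client queue; return how many accepted it."""
    delivered = 0
    for queue in _event_queues:
        try:
            queue.put_nowait(event)
            delivered += 1
        except asyncio.QueueFull:
            # A slow client loses this event instead of blocking the others.
            pass
    return delivered
```

The trade-off is that a slow consumer never stalls the producer, at the cost of silently missing events once its queue holds 100 items.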

Implementation Details

- Inference is lazy-loaded on first use via the global get_inference() function
- ThreadPoolExecutor(max_workers=2) runs inference off the async event loop
- SSE: one asyncio.Queue per connected client; events are broadcast to all queues; full queues silently drop events
- TokenManager decodes the JWT exp claim from the base64 payload (no signature verification) and auto-refreshes 60s before expiry
- detection_to_dto maps Detection fields to DetectionDto and looks up the label from constants_inf.annotations_dict
- Annotations are posted to an external service with a base64-encoded frame image
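The expiry check described for TokenManager can be approximated as follows. This sketch reads the exp claim from the JWT payload segment without verifying the signature, then applies a 60-second margin. The function names, and the choice to treat an unreadable token as needing refresh, are assumptions rather than the actual implementation.

```python
import base64
import json
import time
from typing import Optional


def jwt_exp(token: str) -> Optional[float]:
    """Return the exp claim (Unix seconds) without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload_b64 = parts[1]
    # JWT segments omit base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    try:
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    if not isinstance(payload, dict):
        return None
    exp = payload.get("exp")
    return float(exp) if isinstance(exp, (int, float)) else None


def needs_refresh(token: str, now: Optional[float] = None,
                  margin: float = 60.0) -> bool:
    """True when the token expires within `margin` seconds (or is unreadable)."""
    exp = jwt_exp(token)
    if exp is None:
        return True  # assumption: refresh whenever expiry cannot be determined
    current = time.time() if now is None else now
    return current >= exp - margin
```

Since the signature is never checked, this only decides when to refresh; it says nothing about whether the token is authentic.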

Caveats

- No CORS middleware configured
- No rate limiting
- No request body size limits beyond FastAPI defaults
- _active_detections is an in-memory dict: not persistent across restarts, not distributed
- SSE queue overflow silently drops events (QueueFull caught and ignored)
- JWT token handling has no signature verification; it relies entirely on the Annotations service for auth
- No graceful shutdown handling for in-progress detections

Dependency Graph

graph TD
    main --> inference
    main --> constants_inf
    main --> loader_http_client

Logging Strategy

No explicit logging in main.py — errors are caught and returned as HTTP responses. Logging happens in downstream components.
