
Component: API

Overview

Purpose: HTTP API layer exposing object detection capabilities via FastAPI — handles request/response serialization, async task management, SSE streaming, media lifecycle management, DB-driven configuration, and authentication token forwarding.

Pattern: Controller layer — thin API surface that delegates inference to the Inference Pipeline and manages media records via the Annotations service.

Upstream: Inference Pipeline (Inference class), Domain (constants_inf for labels), Annotations service (AI settings, media records). Downstream: None (top-level, client-facing).

Modules

Module             Role
main               FastAPI app definition, endpoints, DTOs, TokenManager, SSE streaming, media lifecycle, DB-driven config resolution
streaming_buffer   File-like object for concurrent write+read — enables true streaming video detection (AZ-178)

External API Specification

GET /health

Response: HealthResponse

{
  "status": "healthy",
  "aiAvailability": "Enabled",
  "engineType": "onnx",
  "errorMessage": null
}

aiAvailability values: None, Downloading, Converting, Uploading, Enabled, Warning, Error.
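The availability values above form a small client-visible state set. A minimal parsing sketch (the enum class and `parse_health` helper are illustrative names, not part of the service's code), using only the values listed above:

```python
from enum import Enum
import json

class AiAvailability(str, Enum):
    NONE = "None"
    DOWNLOADING = "Downloading"
    CONVERTING = "Converting"
    UPLOADING = "Uploading"
    ENABLED = "Enabled"
    WARNING = "Warning"
    ERROR = "Error"

def parse_health(body: str) -> dict:
    """Parse a /health response body, normalising aiAvailability to the enum."""
    data = json.loads(body)
    data["aiAvailability"] = AiAvailability(data["aiAvailability"])
    return data
```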

POST /detect

Input: Multipart form — file (image or video bytes), optional config (JSON string), optional auth headers. Response: list[DetectionDto]

[
  {
    "centerX": 0.5,
    "centerY": 0.5,
    "width": 0.1,
    "height": 0.1,
    "classNum": 0,
    "label": "ArmorVehicle",
    "confidence": 0.85
  }
]

Behavior (AZ-173, AZ-175): Accepts both images and videos. Detects the upload kind by file extension, falling back to content probing. If authenticated: computes a content hash, persists the file to storage, creates a media record, and tracks the status lifecycle (New → AI Processing → AI Processed / Error). Errors: 400 (empty/invalid image data), 422 (runtime error), 503 (engine unavailable).
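The extension-then-probe classification could look like the following sketch. The extension sets, magic numbers, and the `detect_upload_kind` name are illustrative assumptions; the actual lists in `main` may differ:

```python
import os

# Illustrative extension sets -- the real lists in main.py may differ.
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def detect_upload_kind(filename: str, head: bytes) -> str:
    """Classify an upload as 'image' or 'video': extension first, then content probing."""
    ext = os.path.splitext(filename.lower())[1]
    if ext in VIDEO_EXTS:
        return "video"
    if ext in IMAGE_EXTS:
        return "image"
    # Fallback: probe the leading bytes for well-known signatures.
    if head[:2] == b"\xff\xd8" or head[:8] == b"\x89PNG\r\n\x1a\n":
        return "image"  # JPEG / PNG
    if head[4:8] == b"ftyp" or head[:4] == b"\x1a\x45\xdf\xa3":
        return "video"  # ISO BMFF (MP4/MOV) / Matroska-WebM
    raise ValueError("cannot determine upload kind")
```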

POST /detect/video

Input: Raw binary body (not multipart). Headers: X-Filename (e.g. clip.mp4), optional X-Config (JSON), optional Authorization: Bearer {token}, optional X-Refresh-Token. Response: {"status": "started", "mediaId": "..."}

Behavior (AZ-178): True streaming video detection. Bypasses Starlette multipart buffering by accepting the raw body via request.stream(). Creates a StreamingBuffer (temp file), starts the inference thread immediately, and feeds HTTP chunks to the buffer as they arrive. PyAV reads from the buffer concurrently, decoding frames and running inference; detections are broadcast via SSE in real time during the upload. After upload: computes the content hash from the file (3 KB of I/O), renames it to its permanent path, and creates a media record if authenticated. Errors: 400 (non-video extension).

POST /detect/{media_id}

Input: Path param media_id, optional JSON body AIConfigDto, headers Authorization: Bearer {token}, x-refresh-token: {token}. Response: {"status": "started", "mediaId": "..."} (202-style).

Behavior (AZ-174): Resolves the media path and AI settings from the Annotations service, merges DB settings with client overrides, reads file bytes from the resolved path, and dispatches run_detect_image or run_detect_video. Errors: 409 (duplicate detection), 503 (media path not resolved).
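The "merges DB settings with client overrides" step amounts to a recursive dict merge where client values win. A minimal sketch under that assumption (`merge_settings` is an illustrative name; the real merge also normalises key-casing variants, which is omitted here):

```python
def merge_settings(db_settings: dict, overrides: dict) -> dict:
    """Merge client overrides on top of DB-provided AI settings.
    Nested sections are merged key-by-key; scalar client values win.
    Neither input is mutated."""
    merged = dict(db_settings)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged
```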

GET /detect/stream

Response: text/event-stream (SSE).

data: {"annotations": [...], "mediaId": "...", "mediaStatus": "AIProcessing", "mediaPercent": 50}

mediaStatus values: AIProcessing, AIProcessed, Error.
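On the consuming side, each SSE frame above carries one JSON payload after the `data:` prefix. A minimal parsing sketch (`iter_sse_json` is a hypothetical helper; a production client would also handle `event:`/`id:` fields and multi-line data frames):

```python
import json

def iter_sse_json(lines):
    """Yield the parsed JSON payload of each `data:` line from an SSE stream."""
    for line in lines:
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())
```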

Data Access Patterns

  • In-memory state:
    • _active_detections: dict[str, asyncio.Task] — guards against duplicate media processing
    • _event_queues: list[asyncio.Queue] — SSE client queues (maxsize=100)
  • Persistent media storage to VIDEOS_DIR / IMAGES_DIR (AZ-175)
  • No direct database access — media records managed via Annotations service HTTP API
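The fan-out over `_event_queues` with drop-on-overflow can be sketched as follows (the `broadcast` function name is illustrative; the documented behaviour is that a full queue silently drops the event):

```python
import asyncio

def broadcast(queues, event):
    """Fan an event out to every connected SSE client queue."""
    for q in queues:
        try:
            q.put_nowait(event)
        except asyncio.QueueFull:
            pass  # slow client: drop the event rather than block the broadcaster
```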

Implementation Details

  • Inference is lazy-loaded on first use via get_inference() global function
  • ThreadPoolExecutor(max_workers=2) runs inference off the async event loop
  • SSE: one asyncio.Queue per connected client; events broadcast to all queues; full queues silently drop events
  • TokenManager decodes JWT exp from base64 payload (no signature verification), auto-refreshes 60s before expiry
  • TokenManager.decode_user_id extracts user identity from multiple JWT claim formats (sub, userId, nameid, SAML)
  • DB-driven config via _resolve_media_for_detect: fetches AI settings from Annotations, merges nested sections and casing variants
  • Media lifecycle: _post_media_record + _put_media_status manage status transitions via Annotations API
  • Content hashing via compute_media_content_hash (bytes, XxHash64 with sampling) and compute_media_content_hash_from_file (file on disk, 3 KB I/O) for media deduplication
  • StreamingBuffer for /detect/video: concurrent file append + read via threading.Condition, enables PyAV to decode frames as HTTP chunks arrive
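The concurrent write+read idea behind the last bullet can be sketched as below. This is a minimal illustration of the pattern, not the actual streaming_buffer module: one thread (the HTTP handler) appends chunks while another (the decoder) reads, blocking on a threading.Condition until bytes arrive or the writer signals completion.

```python
import os
import tempfile
import threading

class StreamingBuffer:
    """Sketch of a file-like buffer supporting one concurrent writer and reader."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp()
        self._writer = os.fdopen(fd, "wb")
        self._reader = open(self.path, "rb")
        self._cond = threading.Condition()
        self._written = 0
        self._done = False

    def write(self, chunk: bytes) -> None:
        self._writer.write(chunk)
        self._writer.flush()  # make the bytes visible to the reader fd
        with self._cond:
            self._written += len(chunk)
            self._cond.notify_all()

    def close_writer(self) -> None:
        self._writer.close()
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def read(self, size: int = -1) -> bytes:
        # Block until unread bytes exist or the upload is complete.
        with self._cond:
            while self._reader.tell() >= self._written and not self._done:
                self._cond.wait()
        return self._reader.read(size)
```

A reader calling `read` before any chunk has arrived simply blocks, which is what lets PyAV start decoding as soon as the first bytes of the upload land.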

Caveats

  • No CORS middleware configured
  • No rate limiting
  • No request body size limits beyond FastAPI defaults
  • _active_detections is an in-memory dict — not persistent across restarts, not distributed
  • SSE queue overflow silently drops events (QueueFull caught and ignored)
  • JWT token handling has no signature verification — relies entirely on the Annotations service for auth
  • No graceful shutdown handling for in-progress detections
  • Media record creation failures are silently caught (detection proceeds regardless)

Dependency Graph

graph TD
    main --> inference
    main --> constants_inf
    main --> loader_http_client
    main --> media_hash
    main --> streaming_buffer

Logging Strategy

No explicit logging in main.py — errors are caught and returned as HTTP responses. Logging happens in downstream components.