[AZ-178] Implement streaming video detection endpoint

- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
2026-06-21 17:41:08 +00:00 · 2026-04-01 03:11:43 +03:00
parent e65d8da6a3
commit be4cab4fcb
42 changed files with 2983 additions and 29 deletions
@@ -14,6 +14,7 @@
 | Module | Role |
 |--------|------|
 | `main` | FastAPI app definition, endpoints, DTOs, TokenManager, SSE streaming, media lifecycle, DB-driven config resolution |
+| `streaming_buffer` | File-like object for concurrent write+read — enables true streaming video detection (AZ-178) |

 ## External API Specification

@@ -50,6 +51,13 @@
 **Behavior** (AZ-173, AZ-175): Accepts both images and videos. Detects upload kind by extension, falls back to content probing. If authenticated: computes content hash, persists to storage, creates media record, tracks status lifecycle (New → AI Processing → AI Processed / Error).
 **Errors**: 400 (empty/invalid image data), 422 (runtime error), 503 (engine unavailable).

+### POST /detect/video
+
+**Input**: Raw binary body (not multipart). Headers: `X-Filename` (e.g. `clip.mp4`), optional `X-Config` (JSON), optional `Authorization: Bearer {token}`, optional `X-Refresh-Token`.
+**Response**: `{"status": "started", "mediaId": "..."}`
+**Behavior** (AZ-178): True streaming video detection. Bypasses Starlette multipart buffering by reading the raw body via `request.stream()`. Creates a `StreamingBuffer` (backed by a temp file), starts the inference thread immediately, and feeds HTTP chunks into the buffer as they arrive. PyAV reads from the buffer concurrently and decodes frames, which are passed to inference. Detections are broadcast via SSE in real time during the upload. After the upload completes: computes the content hash from the file (3 KB I/O), renames the file to its permanent path, and creates a media record if authenticated.
+**Errors**: 400 (non-video extension).
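
The concurrent write and read described above can be sketched as follows. This is a minimal illustration and not the code from this commit: `StreamingBufferSketch`, `append`, `close_writer`, `consume`, and `demo` are hypothetical names, and the consumer thread stands in for the PyAV decode loop. A real decoder may also call `seek` and `tell`, which are omitted here.

```python
import tempfile
import threading


class StreamingBufferSketch:
    """Hypothetical stand-in for StreamingBuffer: one writer appends, one reader follows."""

    def __init__(self):
        self._cond = threading.Condition()
        self._file = tempfile.TemporaryFile()  # temp file backing the buffer
        self._written = 0      # total bytes appended so far
        self._read_pos = 0     # next byte offset the reader will consume
        self._closed = False   # set once the upload has finished

    def append(self, chunk: bytes) -> None:
        """Writer side: append one HTTP chunk and wake a waiting reader."""
        with self._cond:
            self._file.seek(0, 2)  # move to end of file before writing
            self._file.write(chunk)
            self._written += len(chunk)
            self._cond.notify_all()

    def close_writer(self) -> None:
        """Writer side: signal that no more bytes will arrive."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, size: int = -1) -> bytes:
        """Reader side: block until bytes are available or the writer is done."""
        with self._cond:
            while self._read_pos >= self._written and not self._closed:
                self._cond.wait()
            available = self._written - self._read_pos
            if available <= 0:
                return b""  # writer closed and everything consumed: EOF
            n = available if size < 0 else min(size, available)
            self._file.seek(self._read_pos)
            data = self._file.read(n)
            self._read_pos += len(data)
            return data


def consume(buf: StreamingBufferSketch, out: list) -> None:
    """Stand-in for the decode loop: read until EOF."""
    while True:
        data = buf.read(4)
        if not data:
            break
        out.append(data)


def demo() -> bytes:
    buf = StreamingBufferSketch()
    out = []
    reader = threading.Thread(target=consume, args=(buf, out))
    reader.start()  # the consumer starts before any bytes have arrived
    for chunk in (b"abc", b"defgh", b"ij"):  # simulated HTTP chunks
        buf.append(chunk)
    buf.close_writer()
    reader.join()
    return b"".join(out)
```

In this sketch a single file handle is shared and every seek happens under the condition's lock, which keeps the example short at the cost of serializing reads and writes.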
+
 ### POST /detect/{media_id}

 **Input**: Path param `media_id`, optional JSON body `AIConfigDto`, headers `Authorization: Bearer {token}`, `x-refresh-token: {token}`.
@@ -82,7 +90,8 @@ data: {"annotations": [...], "mediaId": "...", "mediaStatus": "AIProcessing", "m
 - `TokenManager.decode_user_id` extracts user identity from multiple JWT claim formats (sub, userId, nameid, SAML)
 - DB-driven config via `_resolve_media_for_detect`: fetches AI settings from Annotations, merges nested sections and casing variants
 - Media lifecycle: `_post_media_record` + `_put_media_status` manage status transitions via Annotations API
- Content hashing via `compute_media_content_hash` (XxHash64 with sampling) for media deduplication
+- Content hashing via `compute_media_content_hash` (bytes, XxHash64 with sampling) and `compute_media_content_hash_from_file` (file on disk, 3 KB I/O) for media deduplication
+- `StreamingBuffer` for `/detect/video`: concurrent file append + read via `threading.Condition`, enables PyAV to decode frames as HTTP chunks arrive
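
As an illustration of hashing a file on disk with a small, fixed amount of I/O, the sketch below reads three 1 KB samples (head, middle, tail) and mixes in the file size. The sample layout, the use of the file size, and the names `sampled_file_hash` and `demo` are assumptions made for this example. `hashlib.blake2b` with an 8-byte digest is used as a standard-library stand-in for XxHash64, so the output will not match `compute_media_content_hash_from_file`.

```python
import hashlib
import os
import tempfile

SAMPLE = 1024  # assumption: three 1 KB samples, roughly 3 KB of reads in total


def sampled_file_hash(path: str, sample: int = SAMPLE) -> str:
    """Hash the file size plus head/middle/tail samples instead of the whole file."""
    size = os.path.getsize(path)
    # Stand-in for XxHash64: a 64-bit BLAKE2b digest from the standard library.
    h = hashlib.blake2b(digest_size=8)
    h.update(size.to_bytes(8, "little"))
    with open(path, "rb") as f:
        if size <= 3 * sample:
            h.update(f.read())  # small file: hash all of it
        else:
            for offset in (0, (size - sample) // 2, size - sample):
                f.seek(offset)
                h.update(f.read(sample))
    return h.hexdigest()


def demo():
    """Hash a file twice, then again after changing its first byte."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "clip.bin")
        payload = bytearray(bytes(range(256)) * 40)  # 10,240 bytes
        with open(path, "wb") as f:
            f.write(payload)
        first = sampled_file_hash(path)
        second = sampled_file_hash(path)
        payload[0] ^= 0xFF  # flip bits inside the head sample
        with open(path, "wb") as f:
            f.write(payload)
        changed = sampled_file_hash(path)
    return first, second, changed
```

The trade-off of any sampled hash is that a change confined to an unsampled region of a same-sized file goes undetected, which is usually acceptable for deduplication but not for integrity checking.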

 ## Caveats

@@ -103,6 +112,7 @@ graph TD
    main --> constants_inf
    main --> loader_http_client
    main --> media_hash
+    main --> streaming_buffer
 ```

 ## Logging Strategy