[AZ-178] Implement streaming video detection endpoint

- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
Oleksandr Bezdieniezhnykh
2026-04-01 03:11:43 +03:00
parent e65d8da6a3
commit be4cab4fcb
42 changed files with 2983 additions and 29 deletions
@@ -1,67 +0,0 @@
# Remove Redundant Synchronous Video Pre-Write in /detect Endpoint
**Task**: AZ-177_remove_redundant_video_prewrite
**Name**: Remove redundant synchronous video file writes in /detect endpoint
**Description**: The `/detect` endpoint writes video bytes to disk synchronously before calling `run_detect_video`, which writes them again in a background thread concurrently with frame detection. Remove the redundant pre-writes so videos are written only once — by inference.pyx — concurrently with detection, as AZ-173 intended.
**Complexity**: 2 points
**Dependencies**: AZ-173 (stream-based run_detect)
**Component**: Main
**Jira**: AZ-177
**Parent**: AZ-172
## Problem
After AZ-173 implemented simultaneous disk-write + frame-detection in `inference.pyx`, the `/detect` endpoint in `main.py` still writes video bytes to disk **synchronously before** calling `run_detect_video`. Since `run_detect_video` internally spawns a thread to write the same bytes to the same path, every video upload gets written to disk **twice**:
1. **Auth'd path** (lines 394-395): `storage_path` is written synchronously via `open(storage_path, "wb").write(image_bytes)`, then `run_detect_video(..., save_path=storage_path, ...)` writes to the same path again in a background thread.
2. **Non-auth'd path** (lines 427-430): A temp file is created and written synchronously via `tempfile.mkstemp` + `open(tmp_video_path, "wb").write(image_bytes)`, then `run_detect_video(..., save_path=tmp_video_path, ...)` writes to the same path again.
This defeats the purpose of AZ-173's concurrent design: the video data is fully written before detection starts, so there is no actual concurrency between writing and detecting.
### How inference.pyx handles it (correctly)
```python
# inference.pyx run_detect_video:
writer_done = threading.Event()
wt = threading.Thread(target=_write_video_bytes_to_path, args=(save_path, video_bytes, writer_done))
wt.start() # thread A: writes bytes to disk
bio = io.BytesIO(video_bytes)
container = av.open(bio) # thread B (caller): decodes frames via PyAV
self._process_video_pyav(...) # detection happens concurrently with disk write
writer_done.wait() # wait for write to finish
```
## Target State
### Auth'd video uploads
- For videos, do NOT write the file at line 394; write only for images (since `run_detect_image` doesn't persist to disk)
- Pass `storage_path` to `run_detect_video`; let inference handle the concurrent write
### Non-auth'd video uploads
- Do NOT create/write a temp file with `tempfile.mkstemp` + `open().write()`
- Instead, only create a temp file **path** (empty) and pass it to `run_detect_video` which writes the data concurrently with detection
- Alternatively: build the path manually without pre-writing (`tempfile.mktemp` also works but is deprecated and race-prone)
### Image uploads (no change needed)
- `run_detect_image` does NOT write to disk, so the synchronous write at line 394 remains necessary for images
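The split described above can be sketched as follows. This is a minimal illustration, not the actual handler; the function and the `is_video` flag are hypothetical stand-ins for logic that lives inside the `/detect` handler in `src/main.py`:

```python
def store_and_detect(storage_path, media_bytes, is_video, detector):
    """Sketch: write synchronously only for images; for videos, let
    run_detect_video persist to storage_path concurrently with detection."""
    if is_video:
        # no pre-write: inference.pyx's writer thread fills storage_path
        # while frames are decoded and detected
        return detector.run_detect_video(media_bytes, save_path=storage_path)
    # images: run_detect_image does not write to disk, so write here
    with open(storage_path, "wb") as f:
        f.write(media_bytes)
    return detector.run_detect_image(media_bytes)
```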
## Acceptance Criteria
- [ ] Video bytes are written to disk exactly once (by `_write_video_bytes_to_path` in inference.pyx), not twice
- [ ] For videos, disk write and frame detection happen concurrently (not sequentially)
- [ ] Image storage behavior is unchanged (synchronous write before detection)
- [ ] Temp file cleanup in the `finally` block still works correctly
- [ ] Auth'd video uploads: media record and status updates unchanged
- [ ] All existing tests pass
## File Changes
| File | Action | Description |
|------|--------|-------------|
| `src/main.py` | Modified | Split storage write by media kind; remove redundant video pre-writes |
## Technical Notes
- The `finally` block (lines 463-468) cleans up `tmp_video_path` — this must still work after the change. Since `run_detect_video` waits for the writer thread with `writer_done.wait()`, the file will exist when cleanup runs.
- `tempfile.mkstemp` creates and opens the file atomically, so it can still be used for safe path generation — just close the descriptor and skip the `write()` call — or generate the temp path some other way.
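A minimal way to get a safe temp path without pre-writing the payload (a sketch; the helper name and suffix handling are assumptions):

```python
import os
import tempfile

def make_empty_temp_path(suffix=".mp4"):
    """Create a unique temp path without writing the payload.

    mkstemp atomically creates the (empty) file, avoiding the race that
    makes tempfile.mktemp unsafe; we close the fd immediately and let
    run_detect_video's writer thread fill the file later.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # keep the empty file on disk as a placeholder
    return path
```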
@@ -0,0 +1,107 @@
# True Streaming Video Detection
**Task**: AZ-178_true_streaming_video_detect
**Name**: Start inference as upload bytes arrive — no buffering
**Description**: Replace the fully-buffered `/detect` upload flow with a true streaming pipeline where video bytes flow simultaneously to disk and to PyAV for frame decoding + inference. First detection must appear within ~500ms of first decodable frames arriving at the API.
**Complexity**: 5 points
**Dependencies**: AZ-173 (stream-based run_detect)
**Component**: Main, Inference, MediaHash
**Jira**: AZ-178
**Parent**: AZ-172
## Problem
The current `/detect` endpoint has three sequential blocking stages before any detection runs:
1. **Starlette multipart buffering**: `UploadFile = File(...)` causes Starlette to consume the entire HTTP body and spool it to a `SpooledTemporaryFile` before the handler is called. For 2 GB → user waits for full upload.
2. **Full RAM load**: `await file.read()` copies the entire spooled file into a `bytes` object in RAM. For 2 GB → ~2 GB+ allocated.
3. **BytesIO + writer thread**: `run_detect_video(video_bytes, ...)` wraps `bytes` in `io.BytesIO` for PyAV and spawns a separate thread to write the same bytes to disk. For 2 GB → ~4 GB RAM total + double disk write.
Net result: zero detection output until the entire file is uploaded AND loaded into RAM.
## Target State
```
HTTP chunks ──┬──▸ StreamingBuffer (temp file) ──▸ PyAV decode ──▸ inference ──▸ SSE
└──▸ (same temp file serves as permanent storage after rename)
```
- Bytes flow chunk-by-chunk from the network into a `StreamingBuffer`
- PyAV reads from the same buffer concurrently — blocks when ahead of the writer, resumes as new data arrives
- No intermediate `bytes` object holds the full file in RAM
- Peak memory: ~model batch size × frame size (tens of MB), not file size
## Technical Design
### 1. StreamingBuffer (`src/streaming_buffer.py`)
A file-like object backed by a temp file with concurrent append + read:
- `append(data)` — called from the async HTTP handler (via executor); writes to temp file, flushes, notifies readers
- `read(size)` — called by PyAV; blocks via `Condition.wait()` when data not yet available
- `seek(offset, whence)` — supports SEEK_SET/SEEK_CUR normally; SEEK_END blocks until writer signals EOF (graceful degradation for non-faststart MP4)
- `tell()`, `seekable()`, `readable()` — standard file protocol
- `close_writer()` — signals EOF
- Thread-safe via `threading.Condition`
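The protocol above can be sketched as a single-writer/single-reader class. This is an illustration of the blocking read/seek behavior, not the real `src/streaming_buffer.py`, and it assumes one writer and one reader:

```python
import io
import os
import tempfile
import threading

class StreamingBuffer:
    """Sketch: temp-file-backed buffer with concurrent append + read."""

    def __init__(self):
        fd, self._path = tempfile.mkstemp()
        os.close(fd)
        self._w = open(self._path, "wb")
        self._r = open(self._path, "rb")
        self._cond = threading.Condition()
        self._size = 0
        self._eof = False

    def append(self, data):
        self._w.write(data)
        self._w.flush()
        with self._cond:
            self._size += len(data)
            self._cond.notify_all()  # wake readers blocked in read()/seek()

    def close_writer(self):
        self._w.close()
        with self._cond:
            self._eof = True
            self._cond.notify_all()

    def read(self, size=-1):
        with self._cond:
            # block until the requested range is on disk, or EOF
            want = self._r.tell() + (size if size >= 0 else 0)
            while not self._eof and (size < 0 or self._size < want):
                self._cond.wait()
        return self._r.read(size)

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_END:
            # non-faststart MP4: must wait for the complete file
            with self._cond:
                while not self._eof:
                    self._cond.wait()
        return self._r.seek(offset, whence)

    def tell(self):
        return self._r.tell()

    def seekable(self):
        return True

    def readable(self):
        return True
```

A reader calling `read()` before the bytes have arrived simply parks on the condition variable and resumes when `append()` notifies, which is exactly the behavior PyAV needs to decode during upload.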
**Format compatibility:**
- Faststart MP4, MKV, WebM → true streaming (moov/header at start)
- Standard MP4 (moov at end) → SEEK_END blocks until upload completes, then decoding starts (correct, just not streaming)
### 2. `run_detect_video_stream` in `inference.pyx`
New method accepting a file-like `readable` instead of `bytes`:
```python
cpdef run_detect_video_stream(self, object readable, AIRecognitionConfig ai_config,
str media_name, object annotation_callback,
object status_callback=None)
```
- Opens `av.open(readable)` directly — PyAV calls `read()`/`seek()` on the StreamingBuffer
- Reuses existing `_process_video_pyav` for frame decode → batch inference
- No writer thread needed (StreamingBuffer already persists to disk)
### 3. `compute_media_content_hash_from_file` in `media_hash.py`
File-based variant of `compute_media_content_hash` that reads only 3 sampling regions (3 KB) from disk instead of loading the entire file:
```python
def compute_media_content_hash_from_file(path: str) -> str
```
Produces identical hashes to the existing `compute_media_content_hash(data)`.
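A possible sketch of the file-based variant, with a bytes-based stand-in to show the parity requirement. The region layout, sample size, and digest below are assumptions for illustration; the real function must mirror the existing `compute_media_content_hash` exactly:

```python
import hashlib
import os

SAMPLE = 1024  # bytes per sampled region (assumed; real sizes may differ)

def _regions(size):
    # start, middle, end of the file (clamped for small files)
    return (0, max(0, size // 2 - SAMPLE // 2), max(0, size - SAMPLE))

def compute_media_content_hash_from_file(path):
    """Hash 3 sampled regions plus the file size, reading ~3 KB total."""
    size = os.path.getsize(path)
    h = hashlib.sha256()
    h.update(str(size).encode())
    with open(path, "rb") as f:
        for off in _regions(size):
            f.seek(off)
            h.update(f.read(SAMPLE))
    return h.hexdigest()

def compute_media_content_hash(data):
    """Bytes-based stand-in with the same sampling, for parity checks."""
    h = hashlib.sha256()
    h.update(str(len(data)).encode())
    for off in _regions(len(data)):
        h.update(data[off:off + SAMPLE])
    return h.hexdigest()
```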
### 4. `POST /detect/video` endpoint in `main.py`
New endpoint — raw binary body (not multipart), bypassing Starlette's buffering:
- Filename via `X-Filename` header, config via `X-Config` header
- Auth via `Authorization` / `X-Refresh-Token` headers (same as existing)
- Uses `request.stream()` for async chunk iteration
- Creates `StreamingBuffer`, starts inference in executor thread
- Feeds chunks to buffer via `run_in_executor` (non-blocking event loop)
- After upload completes: compute hash from file, rename to permanent path, create media record
- Returns `{"status": "started", "mediaId": "<hash>"}` — inference continues in background
- Detections flow via existing SSE `/detect/stream`
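The ingest loop at the heart of the endpoint can be sketched framework-agnostically. Here `chunk_iter` stands in for Starlette's `request.stream()` and `buffer` for any object with blocking `append()`/`close_writer()` methods; pushing the blocking writes to the default executor keeps the event loop free, as the design above requires:

```python
import asyncio

async def feed_upload(chunk_iter, buffer):
    """Sketch of the /detect/video ingest loop (names are assumptions)."""
    loop = asyncio.get_running_loop()
    total = 0
    try:
        async for chunk in chunk_iter:
            # blocking disk write runs in the executor, not the event loop
            await loop.run_in_executor(None, buffer.append, chunk)
            total += len(chunk)
    finally:
        # always signal EOF so a reader blocked in read()/seek() can finish
        await loop.run_in_executor(None, buffer.close_writer)
    return total
```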
## Acceptance Criteria
- [ ] Video detection starts as soon as first frames are decodable (~500ms for faststart formats)
- [ ] 2 GB video never loads entirely into RAM (peak memory < 100 MB for the streaming pipeline)
- [ ] Video bytes written to disk exactly once (no double-write)
- [ ] Standard MP4 (moov at end) still works correctly (graceful degradation)
- [ ] Detections delivered via SSE in real-time during upload
- [ ] Content hash identical to existing `compute_media_content_hash`
- [ ] All existing tests pass
- [ ] Existing `/detect` endpoint unchanged (images and legacy callers unaffected)
## File Changes
| File | Action | Description |
|------|--------|-------------|
| `src/streaming_buffer.py` | New | StreamingBuffer class |
| `src/inference.pyx` | Modified | Add `run_detect_video_stream` method |
| `src/media_hash.py` | Modified | Add `compute_media_content_hash_from_file` |
| `src/main.py` | Modified | Add `POST /detect/video` endpoint |
| `tests/test_streaming_buffer.py` | New | Unit tests for StreamingBuffer |