[AZ-178] Implement streaming video detection endpoint

- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
Author: Oleksandr Bezdieniezhnykh
Date: 2026-04-01 03:11:43 +03:00
Parent: e65d8da6a3
Commit: be4cab4fcb
42 changed files with 2983 additions and 29 deletions
@@ -28,6 +28,8 @@ cdef class Inference:
annotation_callback, status_callback=None)
cpdef run_detect_video(bytes video_bytes, AIRecognitionConfig ai_config, str media_name,
str save_path, annotation_callback, status_callback=None)
cpdef run_detect_video_stream(object readable, AIRecognitionConfig ai_config, str media_name,
annotation_callback, status_callback=None)
cpdef stop()
# Internal pipeline stages:
@@ -60,6 +62,7 @@ class LoaderHttpClient:
```
def compute_media_content_hash(data: bytes, virtual: bool = False) -> str
def compute_media_content_hash_from_file(path: str, virtual: bool = False) -> str
```
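The diff only shows the new signature, not its body. A minimal sketch of what `compute_media_content_hash_from_file` could look like, assuming SHA-256 and chunked reads to keep memory flat; the `virtual:` prefix behavior is a guess at how the `virtual` flag might namespace hashes, not something the diff confirms:

```python
import hashlib

CHUNK_SIZE = 1 << 20  # 1 MiB reads keep memory usage flat for any file size


def compute_media_content_hash_from_file(path: str, virtual: bool = False) -> str:
    """Hash a media file from disk without loading it fully into memory."""
    h = hashlib.sha256()
    if virtual:
        # Hypothetical: namespace "virtual" media so their hashes
        # never collide with hashes of real content.
        h.update(b"virtual:")
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            h.update(chunk)
    return h.hexdigest()
```

This reads the file in fixed-size chunks, so I/O and memory stay bounded even for multi-gigabyte videos, matching the "minimal I/O" goal stated in the commit message.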
## External API
@@ -70,9 +73,10 @@ None — internal component, consumed by API layer.
- Model bytes downloaded from Loader service (HTTP)
- Converted TensorRT engines uploaded back to Loader for caching
- Video frames decoded from in-memory bytes via PyAV (`av.open(BytesIO)`)
- Video frames decoded from in-memory bytes via PyAV (`av.open(BytesIO)`) — `run_detect_video`
- Video frames decoded from streaming file-like via PyAV (`av.open(readable)`) — `run_detect_video_stream` (AZ-178)
- Images decoded from in-memory bytes via `cv2.imdecode`
- Video bytes concurrently written to persistent storage path in background thread
- Video bytes concurrently written to persistent storage path in background thread (`run_detect_video`) or via StreamingBuffer (`run_detect_video_stream`)
- All inference processing is in-memory
## Implementation Details
@@ -119,6 +123,15 @@ None — internal component, consumed by API layer.
- Annotation validity heuristics: time gap, detection count increase, spatial movement, confidence improvement
- JPEG encoding of valid frames for annotation images
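The four validity heuristics listed above could be combined roughly as follows. This is an illustrative sketch only: the `Annotation` fields, threshold names, and threshold values are assumptions, not taken from the diff.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    timestamp: float      # seconds into the video
    num_detections: int   # detections in this frame
    center: tuple         # (x, y) of the primary detection, in pixels
    confidence: float     # confidence of the primary detection

# Illustrative thresholds -- the real values are not shown in the diff.
MIN_TIME_GAP = 1.0        # seconds since the last kept annotation
MIN_MOVEMENT = 25.0       # pixels of spatial movement
MIN_CONF_GAIN = 0.05      # confidence improvement worth re-annotating


def is_valid(prev: Annotation, cur: Annotation) -> bool:
    """Keep an annotation if any heuristic says the scene changed enough."""
    if cur.timestamp - prev.timestamp >= MIN_TIME_GAP:
        return True
    if cur.num_detections > prev.num_detections:
        return True
    dx = cur.center[0] - prev.center[0]
    dy = cur.center[1] - prev.center[1]
    if (dx * dx + dy * dy) ** 0.5 >= MIN_MOVEMENT:
        return True
    return cur.confidence - prev.confidence >= MIN_CONF_GAIN
```

Treating the heuristics as an OR keeps any frame where at least one signal indicates meaningful change, which matches the intent of filtering near-duplicate annotations.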
### Streaming Video Processing (AZ-178)
- `run_detect_video_stream` accepts a file-like `readable` (e.g. `StreamingBuffer`) instead of `bytes`
- Opens `av.open(readable)` directly — PyAV calls `read()`/`seek()` on the object as needed
- No writer thread — the `StreamingBuffer` already persists data to disk as the HTTP handler feeds it chunks
- Reuses `_process_video_pyav` for all frame decoding, batching, and annotation logic
- For faststart MP4/MKV/WebM: true streaming (~500ms to first frame)
- For standard MP4 (moov at end): graceful degradation — PyAV's seek to SEEK_END blocks until the full upload has arrived
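The diff references `StreamingBuffer` but does not show its implementation. Below is a minimal sketch of the design the bullets imply: the HTTP handler tees chunks to disk while a blocking file-like `read()`/`seek()` interface serves PyAV, and a `SEEK_END` waits for the upload to finish. The class body and the `feed`/`finish` method names are hypothetical.

```python
import os
import threading


class StreamingBuffer:
    """Hypothetical sketch: persist incoming HTTP chunks to disk while
    exposing a blocking file-like interface that PyAV can consume."""

    def __init__(self, save_path: str):
        self._file = open(save_path, "wb+")
        self._cond = threading.Condition()
        self._written = 0      # bytes fed so far
        self._closed = False   # upload finished
        self._pos = 0          # reader position

    def feed(self, chunk: bytes) -> None:
        # Called by the HTTP handler as upload bytes arrive.
        with self._cond:
            self._file.seek(self._written)
            self._file.write(chunk)
            self._written += len(chunk)
            self._cond.notify_all()

    def finish(self) -> None:
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, size: int = -1) -> bytes:
        with self._cond:
            # Block until the requested range has been fed (or upload ended).
            while not self._closed and (size < 0 or self._pos + size > self._written):
                self._cond.wait()
            self._file.seek(self._pos)
            data = self._file.read(size if size >= 0 else None)
            self._pos += len(data)
            return data

    def seek(self, offset: int, whence: int = os.SEEK_SET) -> int:
        with self._cond:
            if whence == os.SEEK_END:
                # A SEEK_END (moov at end of file) must wait for the
                # whole upload -- this is the graceful-degradation path.
                while not self._closed:
                    self._cond.wait()
                self._pos = self._written + offset
            elif whence == os.SEEK_CUR:
                self._pos += offset
            else:
                self._pos = offset
            return self._pos
```

With a faststart file PyAV only issues forward reads, so decoding starts as soon as the first chunks are fed; only a `SEEK_END` forces the reader to wait for the whole upload, which matches the two cases in the bullets above.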
### Callbacks
- `annotation_callback(annotation, percent)` — called per valid annotation