detections/_docs/03_implementation/implementation_report_streaming_video.md
Oleksandr Bezdieniezhnykh be4cab4fcb [AZ-178] Implement streaming video detection endpoint
- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
2026-04-01 03:11:43 +03:00

Implementation Report — True Streaming Video Detection

Date: 2026-04-01
Task: AZ-178
Complexity: 5 points
Parent: AZ-172

Summary

Implemented a true streaming video detection pipeline. The new POST /detect/video endpoint bypasses Starlette's multipart buffering entirely: bytes flow from the HTTP body through a StreamingBuffer into PyAV frame decoding and inference, and are persisted to disk at the same time. For faststart MP4 and for MKV/WebM, first detections appear within ~500 ms of the first frames arriving. Peak memory is bounded by the model batch size, not the file size.

Problem

The existing POST /detect endpoint had three sequential blocking stages for a 2 GB video:

  1. Starlette UploadFile spools entire body to temp file (~2 GB on disk, client waits)
  2. await file.read() loads entire file into RAM (~2 GB)
  3. run_detect_video wraps bytes in BytesIO for PyAV + spawns writer thread (~4 GB peak RAM, double disk write)

No detections were emitted until both the full upload and the full load into RAM had completed.

Solution

| Component | What it does |
|---|---|
| StreamingBuffer | File-like object backed by a temp file. The writer appends chunks; the reader blocks until data arrives. Thread-safe via Condition. |
| run_detect_video_stream | New inference method: calls av.open(readable) on the buffer. Reuses _process_video_pyav. No writer thread needed. |
| compute_media_content_hash_from_file | Reads only 3 KB sampling regions from disk (identical hashes to the bytes-based version). |
| POST /detect/video | Raw binary body via request.stream(). Starts inference immediately, feeds chunks to the buffer, detections stream via SSE. |
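
The writer/reader coordination described for StreamingBuffer can be illustrated with a minimal sketch. This is not the code in src/streaming_buffer.py: the class name StreamingBufferSketch and the methods append and close_writer are hypothetical, and the sketch omits seek/tell, which a real buffer handed to av.open would likely also need.

```python
import tempfile
import threading


class StreamingBufferSketch:
    """Temp-file-backed buffer: one writer appends, one reader blocks for data."""

    def __init__(self):
        self._file = tempfile.TemporaryFile()  # one handle shared by both sides
        self._cond = threading.Condition()
        self._written = 0     # total bytes appended so far
        self._read_pos = 0    # reader's current offset
        self._closed = False  # set once the writer has no more data

    def append(self, chunk: bytes) -> None:
        with self._cond:
            self._file.seek(0, 2)  # always write at the end of the file
            self._file.write(chunk)
            self._written += len(chunk)
            self._cond.notify_all()  # wake a reader waiting for bytes

    def close_writer(self) -> None:
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, size: int = -1) -> bytes:
        with self._cond:
            # Block until unread bytes exist or the writer has finished.
            while self._read_pos >= self._written and not self._closed:
                self._cond.wait()
            available = self._written - self._read_pos
            if available <= 0:
                return b""  # end of stream
            n = available if size < 0 else min(size, available)
            self._file.seek(self._read_pos)
            data = self._file.read(n)
            self._read_pos += len(data)
            return data


def demo() -> bytes:
    """Feed three chunks from a writer thread while the caller reads."""
    buf = StreamingBufferSketch()

    def writer():
        for chunk in (b"abc", b"defg", b"hi"):
            buf.append(chunk)
        buf.close_writer()

    t = threading.Thread(target=writer)
    t.start()
    out = bytearray()
    while True:
        data = buf.read(4)
        if not data:
            break
        out += data
    t.join()
    return bytes(out)
```

Because every append lands in the temp file, the upload is persisted as a side effect of streaming, and the reader only ever holds the chunk it just read.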

File Changes

| File | Action | Lines | Description |
|---|---|---|---|
| src/streaming_buffer.py | New | 84 | StreamingBuffer class |
| src/inference.pyx | Modified | +19 | run_detect_video_stream method |
| src/media_hash.py | Modified | +17 | compute_media_content_hash_from_file function |
| src/main.py | Modified | +130 | POST /detect/video endpoint |
| tests/test_az178_streaming_video.py | New | ~200 | 14 unit tests (StreamingBuffer, hash, endpoint) |
| e2e/tests/test_streaming_video_upload.py | New | ~250 | 2 e2e tests (faststart streaming, non-faststart fallback) |
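
The report states only that the file-based hash in src/media_hash.py reads 3 KB sampling regions and matches the bytes-based version. The sketch below shows one way that property can hold. The region layout (start, middle, end), the use of SHA-256, the size prefix, and the function names are all assumptions for illustration, not the project's actual scheme.

```python
import hashlib
import os

REGION = 3 * 1024  # sampling-region size, taken from the "3 KB" in the report


def _offsets(size: int) -> list[int]:
    # Hypothetical layout: one region at the start, middle, and end.
    if size <= REGION:
        return [0]
    return [0, max(0, size // 2 - REGION // 2), size - REGION]


def hash_from_bytes(data: bytes) -> str:
    # Bytes-based variant: needs the whole payload in memory.
    h = hashlib.sha256(str(len(data)).encode())
    for off in _offsets(len(data)):
        h.update(data[off:off + REGION])
    return h.hexdigest()


def hash_from_file(path: str) -> str:
    # File-based variant: seeks to each region and reads at most REGION bytes.
    size = os.path.getsize(path)
    h = hashlib.sha256(str(size).encode())
    with open(path, "rb") as f:
        for off in _offsets(size):
            f.seek(off)
            h.update(f.read(REGION))
    return h.hexdigest()
```

Both variants feed the same bytes to the hash in the same order, so the digests agree while the file-based one reads roughly 9 KB regardless of file size.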

Documentation Updated

| File | Changes |
|---|---|
| _docs/02_document/system-flows.md | Added Flow F7 with activity diagram, sequence diagram, buffer coordination flowchart, memory profile, format compatibility table |
| _docs/02_document/modules/main.md | Added /detect/video endpoint docs |
| _docs/02_document/modules/inference.md | Added run_detect_video_stream method docs |
| _docs/02_document/modules/media_hash.md | Added compute_media_content_hash_from_file docs |
| _docs/02_document/modules/streaming_buffer.md | New module documentation |
| _docs/02_document/components/04_api/description.md | Added endpoint spec, dependency graph |
| _docs/02_document/components/03_inference_pipeline/description.md | Added streaming video processing section |

Memory Profile (2 GB video)

| Stage | Before (POST /detect) | After (POST /detect/video) |
|---|---|---|
| HTTP buffering | 2 GB (SpooledTempFile) | 0 (raw stream) |
| File → RAM | 2 GB (file.read()) | ~64 KB (one chunk) |
| PyAV input | 2 GB (BytesIO copy) | 0 (reads from buffer) |
| Writer thread | 2 GB (same ref) | 0 (not needed) |
| Peak process RAM | ~4+ GB | ~50 MB (batch × frame) |
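
As a rough check on the batch × frame figure, the arithmetic can be written out. The resolution, channel count, and batch size below are illustrative assumptions; the report does not state the values used by the model.

```python
def batch_bytes(width: int, height: int, channels: int,
                batch_size: int, bytes_per_sample: int = 1) -> int:
    """Size in bytes of one batch of decoded, uncompressed frames."""
    return width * height * channels * bytes_per_sample * batch_size


# Assumed values: 1080p RGB frames, 8 bits per channel, batch of 8.
peak = batch_bytes(1920, 1080, 3, 8)
print(f"{peak / 1e6:.1f} MB")  # 49.8 MB
```

Under these assumptions the decoded batch is the dominant term, and it does not grow with the length or size of the uploaded video.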

Format Behavior

| Format | Behavior |
|---|---|
| MP4 (faststart) | True streaming: ~500 ms to first detection |
| MKV / WebM | True streaming: ~500 ms to first detection |
| MP4 (moov at end) | Graceful degradation: blocks until upload completes, then decodes |
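
Whether an MP4 falls into the faststart or the moov-at-end row depends on the order of its top-level boxes. The sketch below shows how that order could be inspected from the first bytes of an upload. It is an illustration only: the report does not say that the service performs such a check, and the function names are hypothetical.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level MP4 box found in data."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > len(data):
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - pos
        if size < header:
            break  # malformed box, stop scanning
        yield box_type.decode("latin-1"), pos, size
        pos += size


def is_faststart(data: bytes):
    """True if moov precedes mdat, False if mdat comes first, None if unknown."""
    for box_type, _, _ in top_level_boxes(data):
        if box_type == "moov":
            return True
        if box_type == "mdat":
            return False
    return None


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a minimal box (32-bit size + type + payload) for experiments."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload
```

Only box headers are inspected, so a short prefix of the upload is usually enough: the mdat header is readable even when its payload has not arrived yet.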

Test Results

36/36 unit tests passed (18 existing + 18 new). 2 e2e tests added (require deployed service + video fixtures).