mirror of https://github.com/azaion/detections.git (synced 2026-04-22 16:06:31 +00:00)
[AZ-178] Implement streaming video detection endpoint
- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing to include a new function for computing hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
@@ -10,6 +10,7 @@
| F4 | SSE Event Streaming | Client GET /detect/stream | API | Medium |
| F5 | Engine Initialization | First detection request | Inference Pipeline, Engines, Loader | High |
| F6 | TensorRT Background Conversion | No pre-built TensorRT engine | Inference Pipeline, Engines, Loader | Medium |
| F7 | Streaming Video Detection | Client POST /detect/video | API, StreamingBuffer, Inference Pipeline, Engines, Domain, Annotations | High |

## Flow Dependencies

@@ -18,9 +19,10 @@
| F1 | F5 (for meaningful status) | — |
| F2 | F5 (engine must be ready) | Annotations (media lifecycle) |
| F3 | F5 (engine must be ready) | F4 (via SSE event queues), Annotations (settings, media lifecycle) |
| F4 | — | F3 (receives events) |
| F4 | — | F3, F7 (receives events) |
| F5 | — | F6 (triggers conversion if needed) |
| F6 | F5 (triggered by init failure) | F5 (provides converted bytes) |
| F7 | F5 (engine must be ready) | F4 (via SSE event queues), Annotations (media lifecycle) |

---

@@ -317,3 +319,255 @@ sequenceDiagram
    INF->>STATUS: set_status(ENABLED)
    Note over INF: Next init_ai() call will load from _converted_model_bytes
```

---

## Flow F7: Streaming Video Detection (AZ-178)

### Description

Client uploads a video file as raw binary and gets near-real-time detections via SSE as frames are decoded — **during** the upload, not after. The endpoint bypasses FastAPI's multipart buffering entirely, using `request.stream()` to read the HTTP body chunk-by-chunk. Each chunk is simultaneously written to a temp file (via `StreamingBuffer`) and read by PyAV in a background inference thread. First detections appear within ~500ms of the first decodable frames arriving at the API. Peak memory usage is bounded by the model batch size × frame size (tens of MB), regardless of video file size.
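The concurrency shape of that chunk loop can be sketched in miniature. This is an illustrative reduction, not the real handler: a plain `queue.Queue` stands in for `StreamingBuffer`, the `stream` argument stands in for FastAPI's `request.stream()`, and `consume` stands in for the inference thread.

```python
import asyncio
import queue


def consume(q: queue.Queue) -> bytes:
    # Stand-in for the inference thread: drain chunks until the EOF marker.
    total = bytearray()
    while (chunk := q.get()) is not None:
        total += chunk
    return bytes(total)


async def handle_upload(stream) -> bytes:
    loop = asyncio.get_running_loop()
    q: queue.Queue = queue.Queue()                 # stand-in for StreamingBuffer
    # Start the consumer first: inference begins before the upload finishes.
    worker = loop.run_in_executor(None, consume, q)
    try:
        async for chunk in stream:                 # FastAPI: request.stream()
            q.put(chunk)                           # real code: buffer.append(chunk)
    finally:
        q.put(None)                                # real code: buffer.close_writer()
    return await worker
```

Feeding it an async generator of chunks returns the reassembled bytes, with the consumer running concurrently on a worker thread the whole time.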
### Activity Diagram — Full Data Pipeline

```mermaid
flowchart TD
    subgraph CLIENT ["Client (Browser)"]
        C1([Open SSE connection<br/>GET /detect/stream])
        C2([Start upload<br/>POST /detect/video])
        C3([Receive SSE events<br/>during upload])
    end

    subgraph API ["API Layer — main.py (async event loop)"]
        A1[Parse headers:<br/>X-Filename, X-Config, Auth]
        A2{Valid video<br/>extension?}
        A3[Create StreamingBuffer<br/>backed by temp file]
        A4[Start inference thread<br/>via run_in_executor]
        A5["Read chunk from<br/>request.stream()"]
        A6[buffer.append chunk<br/>via run_in_executor]
        A7{More chunks?}
        A8[buffer.close_writer<br/>signal EOF]
        A9[Compute content hash<br/>from temp file on disk<br/>reads only 3 KB]
        A10[Rename temp file →<br/>permanent storage path]
        A11[Create media record<br/>POST /api/media]
        A12["Return {status: started,<br/>mediaId: hash}"]
        A13[Register background task<br/>to await inference completion]
    end

    subgraph BUF ["StreamingBuffer — streaming_buffer.py"]
        B1[/"Temp file on disk<br/>(single file, two handles)"/]
        B2["append(data):<br/>write + flush + notify"]
        B3["read(size):<br/>block if ahead of writer<br/>return available bytes"]
        B4["seek(offset, whence):<br/>SEEK_END blocks until EOF"]
        B5["close_writer():<br/>set EOF flag, notify all"]
    end

    subgraph INF ["Inference Thread — inference.pyx"]
        I1["av.open(buffer)<br/>PyAV reads via buffer.read()"]
        I2{Moov at start?}
        I3[Decode frames immediately<br/>~500ms latency]
        I4["Blocks on seek(0, 2)<br/>until upload completes"]
        I5["Decode batch of frames<br/>(frame_period_recognition sampling)"]
        I6["engine.process_frames(batch)"]
        I7{Detections found?}
        I8["on_annotation callback<br/>→ SSE event broadcast"]
        I9{More frames?}
        I10[send_detection_status]
    end

    C2 --> A1
    A1 --> A2
    A2 -->|No| ERR([400 Bad Request])
    A2 -->|Yes| A3
    A3 --> A4
    A4 --> A5

    A5 --> A6
    A6 --> B2
    B2 --> B1
    A6 --> A7
    A7 -->|Yes| A5
    A7 -->|No| A8
    A8 --> B5

    A8 --> A9
    A9 --> A10
    A10 --> A11
    A11 --> A12
    A12 --> A13

    A4 -.->|background thread| I1
    I1 --> I2
    I2 -->|"Yes (faststart MP4,<br/>MKV, WebM)"| I3
    I2 -->|"No (standard MP4)"| I4
    I4 --> I3
    I3 --> I5
    I5 --> I6
    I6 --> I7
    I7 -->|Yes| I8
    I8 --> C3
    I7 -->|No| I9
    I8 --> I9
    I9 -->|Yes| I5
    I9 -->|No| I10

    B3 -.->|"PyAV calls<br/>read()"| I1

    style BUF fill:#e8f4fd,stroke:#2196F3
    style INF fill:#fce4ec,stroke:#e91e63
    style API fill:#e8f5e9,stroke:#4CAF50
    style CLIENT fill:#fff3e0,stroke:#FF9800
```
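Step A9's minimal-I/O hash ("computing hashes directly from files with minimal I/O", per the commit message) might look like the sketch below. The exact fields the real `compute_media_content_hash_from_file` hashes are not specified here; hashing the first 3 KB plus the file size is an assumption.

```python
import hashlib
import os


def compute_media_content_hash_from_file(path: str, sample_bytes: int = 3072) -> str:
    """Content hash with a single small disk read (sketch; hashed fields assumed)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        h.update(f.read(sample_bytes))                 # one 3 KB read from disk
    h.update(str(os.path.getsize(path)).encode())      # distinguish same-prefix files
    return h.hexdigest()
```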
### Sequence Diagram — Concurrent Timeline

```mermaid
sequenceDiagram
    participant Client
    participant SSE as SSE /detect/stream
    participant API as main.py (async)
    participant BUF as StreamingBuffer
    participant INF as Inference Thread
    participant PyAV
    participant ENG as Engine (ONNX/TRT)
    participant ANN as Annotations Service

    Client->>SSE: GET /detect/stream (open)
    Client->>API: POST /detect/video (raw body, streaming)
    API->>API: Parse X-Filename, X-Config, Auth headers
    API->>BUF: Create StreamingBuffer (temp file)
    API->>INF: Start in executor thread

    par Upload stream (async event loop) and Inference (background thread)
        loop Each HTTP body chunk (~8-64 KB)
            API->>BUF: append(chunk) → write + flush + notify
        end

        INF->>PyAV: av.open(buffer)
        Note over PyAV,BUF: PyAV calls buffer.read().<br/>Blocks when no data yet.<br/>Resumes as chunks arrive.

        loop Each decodable frame batch
            PyAV->>BUF: read(size) → returns available bytes
            BUF-->>PyAV: video data
            PyAV-->>INF: decoded frames (BGR numpy)
            INF->>ENG: process_frames(batch)
            ENG-->>INF: detections
            opt Valid detections
                INF->>SSE: DetectionEvent (via callback)
                SSE-->>Client: data: {...detections...}
            end
        end
    end

    API->>BUF: close_writer() → EOF signal
    Note over INF: PyAV reads remaining frames, finishes

    API->>API: compute_media_content_hash_from_file(temp file) — reads 3 KB
    API->>API: Rename temp file → {hash}{ext}

    opt Authenticated user
        API->>ANN: POST /api/media (create record)
        API->>ANN: PUT /api/media/{id}/status (AI_PROCESSING)
    end

    API-->>Client: {"status": "started", "mediaId": "abc123"}

    Note over API: Background task awaits inference completion

    INF-->>API: Inference completes
    opt Authenticated user
        API->>ANN: PUT /api/media/{id}/status (AI_PROCESSED)
    end
    API->>SSE: DetectionEvent(status=AIProcessed, percent=100)
    SSE-->>Client: data: {...status: AIProcessed...}
```
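On the client side, the events in this timeline arrive as `data:` lines on the already-open `/detect/stream` response. A minimal parser for that wire format (a sketch: real SSE also carries `event:`/`id:` fields and multi-line `data:` values, ignored here, and the payload field names are illustrative):

```python
import json


def iter_sse_data(lines):
    """Yield JSON payloads from the `data:` lines of an SSE text stream."""
    for line in lines:
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


# The kind of frames the diagram shows, as they would appear on the wire:
wire = [
    'data: {"detections": [{"label": "person", "score": 0.91}]}',
    "",
    'data: {"status": "AIProcessed", "percent": 100}',
]
events = list(iter_sse_data(wire))
```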
### Flowchart — StreamingBuffer Read/Write Coordination

```mermaid
flowchart TD
    subgraph WRITER ["Writer (HTTP handler thread)"]
        W1["Receive HTTP chunk"]
        W2["Acquire Condition lock"]
        W3["file.write(chunk) + flush()"]
        W4["_written += len(chunk)"]
        W5["notify_all() → wake reader"]
        W6["Release lock"]
        W7{More chunks?}
        W8["close_writer():<br/>set _eof = True<br/>notify_all()"]
    end

    subgraph READER ["Reader (PyAV / Inference thread)"]
        R1["PyAV calls read(size)"]
        R2["Acquire Condition lock"]
        R3{"_written > pos?"}
        R4["cond.wait()<br/>(releases lock, sleeps)"]
        R5["Calculate to_read =<br/>min(size, available)"]
        R6["Release lock"]
        R7["file.read(to_read)<br/>(outside lock)"]
        R8["Return bytes to PyAV"]
        R9{"_eof and<br/>available == 0?"}
        R10["Return b'' (EOF)"]
    end

    W1 --> W2 --> W3 --> W4 --> W5 --> W6 --> W7
    W7 -->|Yes| W1
    W7 -->|No| W8

    R1 --> R2 --> R3
    R3 -->|Yes| R5
    R3 -->|No| R9
    R9 -->|Yes| R10
    R9 -->|No| R4
    R4 -.->|"Woken by<br/>notify_all()"| R3
    R5 --> R6 --> R7 --> R8

    style WRITER fill:#e8f5e9,stroke:#4CAF50
    style READER fill:#fce4ec,stroke:#e91e63
```
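The coordination above translates almost line-for-line into a `threading.Condition` around two handles on one temp file. A sketch that follows the flowchart (the real `streaming_buffer.py` may differ in details):

```python
import os
import tempfile
import threading


class StreamingBuffer:
    """One temp file, two handles: async writer appends, PyAV reader blocks."""

    def __init__(self) -> None:
        fd, self.path = tempfile.mkstemp(suffix=".part")
        self._wf = os.fdopen(fd, "wb")       # writer handle
        self._rf = open(self.path, "rb")     # independent reader handle
        self._cond = threading.Condition()
        self._written = 0
        self._eof = False

    # Writer side (HTTP handler, via run_in_executor)
    def append(self, data: bytes) -> None:
        with self._cond:
            self._wf.write(data)
            self._wf.flush()                 # make bytes visible to the reader
            self._written += len(data)
            self._cond.notify_all()          # wake a blocked read()/seek()

    def close_writer(self) -> None:
        with self._cond:
            self._eof = True
            self._cond.notify_all()

    # Reader side (PyAV / inference thread)
    def read(self, size: int = -1) -> bytes:
        with self._cond:
            while True:
                available = self._written - self._rf.tell()
                if available > 0 or self._eof:
                    break
                self._cond.wait()            # releases the lock while sleeping
            if available == 0:
                return b""                   # EOF: writer closed, all consumed
            to_read = available if size < 0 else min(size, available)
        return self._rf.read(to_read)        # actual disk read outside the lock

    def seek(self, offset: int, whence: int = os.SEEK_SET) -> int:
        if whence == os.SEEK_END:            # SEEK_END needs the final length,
            with self._cond:                 # so block until the upload ends
                while not self._eof:
                    self._cond.wait()
        return self._rf.seek(offset, whence)
```

PyAV accepts any file-like object exposing `read`/`seek`, so `av.open(buffer)` works directly against this interface; the `seek(0, 2)` branch is what produces the graceful degradation for moov-at-end MP4s.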
### Data Flow

| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Client | API | Raw video bytes (streaming) | HTTP POST body chunks |
| 2 | API | StreamingBuffer | Byte chunks (8-64 KB each) | `append(bytes)` |
| 3 | StreamingBuffer | Temp file | Same chunks | `file.write()` + `flush()` |
| 4 | StreamingBuffer | PyAV (Inference thread) | Byte segments on demand | `read(size)` blocks when ahead |
| 5 | PyAV | Inference | Decoded BGR numpy frames | ndarray |
| 6 | Inference | Engine | Preprocessed batch | ndarray |
| 7 | Engine | Inference | Raw detections | ndarray |
| 8 | Inference | SSE clients | DetectionEvent | SSE JSON via `loop.call_soon_threadsafe` |
| 9 | API | Temp file | Content hash (3 KB read) | `compute_media_content_hash_from_file` |
| 10 | API | Disk | Rename temp → permanent path | `os.rename` |
| 11 | API | Annotations Service | Media record + status | HTTP POST/PUT JSON |
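Step 8 is the only point where the inference thread touches asyncio state, so it has to hop onto the event loop rather than mutate queues directly. A sketch of that callback wiring (the `DetectionEvent` payload shape and per-client queue fan-out are assumptions):

```python
import asyncio


def make_on_annotation(loop: asyncio.AbstractEventLoop, queues: list):
    """Build an inference-thread callback that fans events out to SSE queues."""
    def on_annotation(event: dict) -> None:
        # Called from the worker thread. asyncio.Queue is not thread-safe,
        # so schedule the puts on the loop instead of touching queues here.
        def _fanout() -> None:
            for q in queues:
                q.put_nowait(event)
        loop.call_soon_threadsafe(_fanout)
    return on_annotation
```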
### Memory Profile (2 GB video)

| Stage | Current (F2) | Streaming (F7) |
|-------|-------------|----------------|
| Starlette buffering | 2 GB (SpooledTemporaryFile) | 0 (raw stream) |
| `file.read()` / chunk buffer | 2 GB (full bytes) | ~64 KB (one chunk) |
| BytesIO for PyAV | 2 GB (copy) | 0 (reads from buffer) |
| Writer thread | 2 GB (same ref) | 0 (no separate writer) |
| **Peak process RAM** | **~4+ GB** | **~50 MB** (batch × frame) |
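The ~50 MB bound falls out of batch × frame arithmetic. With an assumed batch of 8 frames at 1080p BGR (the real batch size depends on model configuration):

```python
# Streaming peak memory is bounded by batch × frame, independent of file size.
# Assumed illustrative figures: 8 frames per batch, 1920x1080, 3 bytes/pixel.
batch, width, height, channels = 8, 1920, 1080, 3
frame_bytes = width * height * channels      # 6,220,800 bytes ≈ 6.2 MB per frame
peak_bytes = batch * frame_bytes             # 49,766,400 bytes ≈ 50 MB
print(f"{peak_bytes / 1e6:.1f} MB")          # → 49.8 MB
```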
### Format Compatibility

| Container Format | Moov Location | Streaming Behavior |
|-----------------|--------------|-------------------|
| MP4 (faststart) | Beginning | True streaming — first frame decoded in ~500ms |
| MKV / WebM | Beginning | True streaming — first frame decoded in ~500ms |
| MP4 (standard) | End of file | Graceful degradation — `seek(0, 2)` blocks until upload completes, then decoding starts |
| MOV, AVI | Varies | Depends on header location |
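For MP4-family containers, "moov location" is checkable directly: the file is a sequence of top-level boxes (big-endian 4-byte size + 4-byte type), and streaming works when `moov` precedes `mdat`. A self-contained check (a sketch covering the common 32-bit and 64-bit box sizes, not every MP4 edge case):

```python
import struct


def moov_before_mdat(f):
    """Return True if `moov` precedes `mdat` (streamable / faststart),
    False if `mdat` comes first, None if neither top-level box is found."""
    while True:
        header = f.read(8)
        if len(header) < 8:
            return None
        size, box = struct.unpack(">I4s", header)
        if box == b"moov":
            return True
        if box == b"mdat":
            return False
        if size == 1:                            # 64-bit "largesize" box
            size = struct.unpack(">Q", f.read(8))[0]
            f.seek(size - 16, 1)
        elif size == 0:                          # box runs to end of file
            return None
        else:
            f.seek(size - 8, 1)                  # skip payload to next header
```

`ffmpeg -movflags +faststart` is the standard way to produce MP4s for which this returns True.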
### Error Scenarios

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Non-video extension | API | Extension check | 400 Bad Request |
| Client disconnects mid-upload | `request.stream()` | Exception | `buffer.close_writer()` called in except, inference thread gets EOF |
| Engine unavailable | Inference thread | `engine is None` | Error event via SSE |
| PyAV decode failure | Inference thread | Exception | Error event via SSE, media status set to Error |
| Disk full | `StreamingBuffer.append` | OSError | Propagated to API handler |
| Annotations service down | `_post_media_record` | Exception caught | Silently continues, detections still work |