mirror of
https://github.com/azaion/detections.git
synced 2026-04-22 11:56:31 +00:00
[AZ-178] Implement streaming video detection endpoint
- Added `/detect/video` endpoint for true streaming video detection, allowing inference to start as upload bytes arrive.
- Introduced `run_detect_video_stream` method in the inference module to handle video processing from a file-like object.
- Updated media hashing with a new function that computes hashes directly from files with minimal I/O.
- Enhanced documentation to reflect changes in video processing and API behavior.

Made-with: Cursor
@@ -39,6 +39,7 @@ Core inference orchestrator — manages the AI engine lifecycle, preprocesses me
| `__init__` | `(loader_client)` | public | Initializes state, calls `init_ai()` |
| `run_detect_image` | `(bytes image_bytes, AIRecognitionConfig ai_config, str media_name, annotation_callback, status_callback=None)` | cpdef | Decodes image from bytes, runs tiling + inference + postprocessing |
| `run_detect_video` | `(bytes video_bytes, AIRecognitionConfig ai_config, str media_name, str save_path, annotation_callback, status_callback=None)` | cpdef | Processes video from in-memory bytes via PyAV, concurrently writes to save_path |
| `run_detect_video_stream` | `(object readable, AIRecognitionConfig ai_config, str media_name, annotation_callback, status_callback=None)` | cpdef | Processes video from a file-like readable (e.g. StreamingBuffer) via PyAV — true streaming, no bytes in RAM (AZ-178) |
| `stop` | `()` | cpdef | Sets stop_signal to True |
| `init_ai` | `()` | cdef | Engine initialization: tries TensorRT → falls back to ONNX → background TensorRT conversion |
| `preprocess` | `(frames) -> ndarray` | via engine | OpenCV blobFromImage: resize, normalize to 0..1, swap RGB, stack batch |
@@ -75,6 +76,15 @@ Both `run_detect_image` and `run_detect_video` accept raw bytes instead of file
6. Annotation validity heuristics (time gap, detection count increase, spatial movement, confidence improvement)
7. Valid frames get JPEG-encoded image attached
### Streaming Video Processing (`run_detect_video_stream` — AZ-178)

1. Accepts a file-like `readable` object (e.g. `StreamingBuffer`) instead of `bytes`
2. Opens directly via `av.open(readable)` — PyAV calls `read()`/`seek()` on the object
3. No writer thread needed — the caller (API layer) manages disk persistence via the same buffer
4. Reuses `_process_video_pyav` for frame decoding, batch inference, and annotation delivery
5. For faststart MP4/MKV/WebM: frames are decoded as the bytes stream in (~500 ms latency)
6. For standard MP4 (moov atom at end): PyAV's `seek(0, 2)` blocks until the buffer signals EOF, then decoding starts
### Ground Sampling Distance (GSD)

`GSD = sensor_width * altitude / (focal_length * image_width)` — meters per pixel, used for physical size filtering of aerial detections.
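As a worked example, the formula applies directly; the camera numbers below are hypothetical, roughly matching a small mapping drone:

```python
def gsd_m_per_px(sensor_width_m: float, altitude_m: float,
                 focal_length_m: float, image_width_px: int) -> float:
    """Ground sampling distance in meters per pixel."""
    return sensor_width_m * altitude_m / (focal_length_m * image_width_px)

# Hypothetical camera: 13.2 mm sensor width, 8.8 mm focal length,
# 4000 px image width, flown at 100 m altitude.
gsd = gsd_m_per_px(0.0132, 100.0, 0.0088, 4000)
# gsd = 0.0375 m/px, so a 1 m object spans about 1 / 0.0375 ≈ 27 px
```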
@@ -86,7 +96,7 @@ Both `run_detect_image` and `run_detect_video` accept raw bytes instead of file

## Consumers

- `main` — lazy-initializes Inference, calls `run_detect_image`/`run_detect_video`/`run_detect_video_stream`, reads `ai_availability_status` and `is_engine_ready`

## Data Models

@@ -107,5 +117,6 @@ None.

## Tests

- `tests/test_ai_config_from_dict.py` — tests `ai_config_from_dict` helper
- `tests/test_az178_streaming_video.py` — tests `run_detect_video_stream` via the `/detect/video` endpoint and `StreamingBuffer`
- `e2e/tests/test_video.py` — exercises `run_detect_video` via the full API
- `e2e/tests/test_single_image.py` — exercises `run_detect_image` via the full API
@@ -11,7 +11,8 @@ FastAPI application entry point — exposes HTTP API for object detection on ima

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Returns AI engine availability status |
| POST | `/detect` | Image/video detection with media lifecycle management (buffered) |
| POST | `/detect/video` | Streaming video detection — inference starts as upload bytes arrive (AZ-178) |
| POST | `/detect/{media_id}` | Start async detection on media resolved from Annotations service |
| GET | `/detect/stream` | SSE stream of detection events |
@@ -41,7 +42,8 @@ FastAPI application entry point — exposes HTTP API for object detection on ima
| `_detect_upload_kind` | `(filename, data) -> tuple[str, str]` | Determines if upload is image or video by extension, falls back to content probing (cv2/PyAV) |
| `_post_media_record` | `(payload, bearer) -> bool` | Creates media record via `POST /api/media` on Annotations service |
| `_put_media_status` | `(media_id, status, bearer) -> bool` | Updates media status via `PUT /api/media/{media_id}/status` on Annotations service |
| `compute_media_content_hash` | (imported from `media_hash`) | XxHash64 content hash with sampling (from bytes) |
| `compute_media_content_hash_from_file` | (imported from `media_hash`) | XxHash64 content hash from file on disk — reads only 3 KB |
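A minimal sketch of the extension-first logic behind `_detect_upload_kind`. The extension sets and magic-byte fallback are illustrative assumptions; the real fallback probes content with cv2/PyAV:

```python
from pathlib import Path

# Hypothetical extension sets — stand-ins for whatever the service accepts.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp"}
VIDEO_EXTS = {".mp4", ".mkv", ".webm", ".avi", ".mov"}

def detect_upload_kind(filename: str, data: bytes) -> tuple[str, str]:
    """Returns (kind, extension), where kind is 'image' or 'video'."""
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        return ("image", ext)
    if ext in VIDEO_EXTS:
        return ("video", ext)
    # Fallback: sniff magic bytes (a stand-in for the cv2/PyAV probing).
    if data[:3] == b"\xff\xd8\xff":            # JPEG SOI marker
        return ("image", ".jpg")
    if data[:8] == b"\x89PNG\r\n\x1a\n":       # PNG signature
        return ("image", ".png")
    if data[4:8] == b"ftyp":                   # ISO BMFF (MP4/MOV) box header
        return ("video", ".mp4")
    raise ValueError(f"cannot determine media kind for {filename!r}")
```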

## Internal Logic
@@ -57,9 +59,10 @@ Returns `HealthResponse` with `status="healthy"` always. `aiAvailability` reflec

4. Parses optional JSON config
5. Extracts auth tokens; if authenticated:
   a. Computes XxHash64 content hash
   b. For images: persists file to `IMAGES_DIR` synchronously (since `run_detect_image` does not write to disk)
   c. For videos: file path is prepared but writing is deferred to `run_detect_video`, which writes concurrently with frame detection (AZ-177)
   d. Creates media record via `POST /api/media`
   e. Sets status to `AI_PROCESSING` via `PUT /api/media/{id}/status`
6. Runs `run_detect_image` or `run_detect_video` in a ThreadPoolExecutor
7. On success: sets status to `AI_PROCESSED`
8. On failure: sets status to `ERROR`
@@ -80,6 +83,20 @@ Returns `HealthResponse` with `status="healthy"` always. `aiAvailability` reflec

8. Updates media status via `PUT /api/media/{id}/status`
9. Returns immediately: `{"status": "started", "mediaId": media_id}`
### `/detect/video` (streaming upload — AZ-178)

1. Parses `X-Filename`, `X-Config`, auth headers (no multipart — raw binary body)
2. Validates the video extension
3. Creates a `StreamingBuffer` backed by a temp file in `VIDEOS_DIR`
4. Starts the inference thread via `run_in_executor`: `run_detect_video_stream(buffer, ...)`
5. Reads HTTP body chunks via `request.stream()`, feeds each to `buffer.append()` via the executor
6. The inference thread reads from the same buffer concurrently — PyAV decodes frames as data arrives
7. Detections are broadcast to SSE queues in real time during the upload
8. After the upload completes: signals EOF, computes the content hash from the temp file (3 KB read), renames it to the permanent path
9. If authenticated: creates a media record, tracks the status lifecycle
10. Returns `{"status": "started", "mediaId": "..."}` — inference continues in a background task
11. The background task awaits inference completion, updates status to `AI_PROCESSED` or `ERROR`
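The concurrency in steps 4–6 can be sketched with the stdlib alone. All names here are illustrative: a `queue.Queue` stands in for `StreamingBuffer`, a list of chunks stands in for `request.stream()`, and a byte counter stands in for `run_detect_video_stream`:

```python
import asyncio
import queue

def run_inference(q: "queue.Queue[bytes | None]") -> int:
    """Stand-in consumer: processes chunks as they arrive in a worker thread."""
    total = 0
    while (chunk := q.get()) is not None:  # None is the EOF sentinel
        total += len(chunk)                # real code would decode frames here
    return total

async def handle_upload(chunks: list[bytes]) -> int:
    loop = asyncio.get_running_loop()
    q: "queue.Queue[bytes | None]" = queue.Queue()
    # Step 4: inference starts immediately in a worker thread.
    inference = loop.run_in_executor(None, run_inference, q)
    # Step 5: feed body chunks as they "arrive"; the consumer reads concurrently.
    for chunk in chunks:
        q.put(chunk)
        await asyncio.sleep(0)             # yield to the event loop between chunks
    q.put(None)                            # step 8: signal EOF
    return await inference

total = asyncio.run(handle_upload([b"a" * 1000, b"b" * 500]))
```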
### `/detect/stream` (SSE)

- Creates an `asyncio.Queue` per client (maxsize=100)
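A sketch of the per-client queue pattern; the names are illustrative, and the drop-on-full policy is an assumption about how a bounded queue would be handled here:

```python
import asyncio

clients: "set[asyncio.Queue]" = set()

def broadcast(event: dict) -> None:
    """Push a detection event to every connected SSE client."""
    for q in clients:
        try:
            q.put_nowait(event)
        except asyncio.QueueFull:
            pass  # slow client: drop the event rather than block detection

async def demo() -> dict:
    q: asyncio.Queue = asyncio.Queue(maxsize=100)  # one bounded queue per client
    clients.add(q)
    broadcast({"label": "car", "confidence": 0.91})
    event = await q.get()
    clients.discard(q)
    return event

event = asyncio.run(demo())
```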
@@ -99,7 +116,7 @@ Detections posts results to `POST {ANNOTATIONS_URL}/annotations` during async me

## Dependencies

- **External**: `asyncio`, `base64`, `io`, `json`, `os`, `tempfile`, `time`, `concurrent.futures`, `pathlib`, `typing`, `av`, `cv2`, `numpy`, `requests`, `fastapi`, `pydantic`
- **Internal**: `inference` (lazy import), `constants_inf` (label lookup), `loader_http_client` (client instantiation), `media_hash` (content hashing), `streaming_buffer` (streaming video upload)
## Consumers
@@ -141,4 +158,5 @@ None (entry point).

- `tests/test_az174_db_driven_config.py` — `decode_user_id`, `_merged_annotation_settings_payload`, `_resolve_media_for_detect`
- `tests/test_az175_api_calls.py` — `_post_media_record`, `_put_media_status`
- `tests/test_az177_video_single_write.py` — video single-write, image unchanged, concurrent writer thread, temp cleanup
- `e2e/tests/test_*.py` — full API e2e tests (health, single image, video, async, SSE, negative, security, performance, resilience)
@@ -9,6 +9,7 @@ Content-based hashing for media files using XxHash64 with a deterministic sampli

| Function | Signature | Description |
|----------|-----------|-------------|
| `compute_media_content_hash` | `(data: bytes, virtual: bool = False) -> str` | Returns hex XxHash64 digest of sampled content. If `virtual=True`, prefixes the digest with "V". |
| `compute_media_content_hash_from_file` | `(path: str, virtual: bool = False) -> str` | Same algorithm, but reads the sampling regions directly from a file on disk — only 3 KB of I/O regardless of file size. Produces hashes identical to the bytes-based version. (AZ-178) |
## Internal Logic
@@ -27,7 +28,7 @@ The sampling avoids reading the full file through the hash function while still
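The exact sampling offsets aren't given in this doc, so the sketch below uses hypothetical first/middle/last 1 KB regions, with SHA-256 standing in for XxHash64 (which would need the third-party `xxhash` package). The point is that reading the same regions from in-memory bytes or from a file on disk yields identical digests:

```python
import hashlib
import os
import tempfile

REGION = 1024  # hypothetical region size: 3 regions x 1 KB = 3 KB of I/O

def _offsets(size: int) -> list[int]:
    return [0, size // 2, size - REGION]

def hash_from_bytes(data: bytes) -> str:
    h = hashlib.sha256()
    if len(data) <= 3 * REGION:
        h.update(data)                     # small input: hash it whole
    else:
        for off in _offsets(len(data)):
            h.update(data[off:off + REGION])
    return h.hexdigest()

def hash_from_file(path: str) -> str:
    size = os.path.getsize(path)
    h = hashlib.sha256()
    with open(path, "rb") as f:
        if size <= 3 * REGION:
            h.update(f.read())
        else:
            for off in _offsets(size):     # seek + read only the sampled regions
                f.seek(off)
                h.update(f.read(REGION))
    return h.hexdigest()

data = os.urandom(1_000_000)
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(data)
same = hash_from_bytes(data) == hash_from_file(path)
os.unlink(path)
```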
## Consumers

- `main` — computes content hash for uploaded media in `POST /detect` (bytes version) and `POST /detect/video` (file version) to use as the media record ID and storage filename
## Data Models
@@ -48,3 +49,4 @@ None. The hash is non-cryptographic (fast, not tamper-resistant).

## Tests

- `tests/test_media_hash.py` — covers small files, large files, and virtual-prefix behavior
- `tests/test_az178_streaming_video.py::TestMediaContentHashFromFile` — verifies the file-based hash matches the bytes-based hash for small, large, boundary, and virtual cases
@@ -0,0 +1,82 @@

# Module: streaming_buffer

## Purpose

File-like object backed by a temp file that supports concurrent append (write) and read+seek (read) from separate threads. Designed for true streaming video detection: the HTTP handler appends incoming chunks while the inference thread reads and decodes frames via PyAV — simultaneously, without buffering the entire file in memory.

## Public Interface

### Class: StreamingBuffer

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(temp_dir: str \| None = None)` | Creates a temp file in `temp_dir`; opens separate write and read handles |
| `append` | `(data: bytes) -> None` | Writes data to the temp file, flushes, notifies waiting readers |
| `close_writer` | `() -> None` | Signals EOF — wakes all blocked readers |
| `read` | `(size: int = -1) -> bytes` | Reads up to `size` bytes; blocks if data is not yet available; returns `b""` on EOF |
| `seek` | `(offset: int, whence: int = 0) -> int` | Seeks the reader position; SEEK_END blocks until EOF is signaled |
| `tell` | `() -> int` | Returns the current reader position |
| `readable` | `() -> bool` | Always returns `True` |
| `seekable` | `() -> bool` | Always returns `True` |
| `writable` | `() -> bool` | Always returns `False` |
| `close` | `() -> None` | Closes both file handles |
### Properties

| Property | Type | Description |
|----------|------|-------------|
| `path` | `str` | Absolute path to the backing temp file |
| `written` | `int` | Total bytes appended so far |

## Internal Logic

### Thread Coordination

Uses `threading.Condition` to synchronize one writer (HTTP handler) and one reader (PyAV/inference thread):

- **append()**: acquires lock → writes to file → flushes → increments `_written` → `notify_all()` → releases lock
- **read(size)**: acquires lock → checks whether data is available → if not and not EOF, calls `wait()` (releases lock, sleeps) → woken by `notify_all()` → calculates bytes to read → releases lock → reads from file (outside the lock)
- **seek(0, 2)** (SEEK_END): acquires lock → if EOF is not signaled, calls `wait()` in a loop → once EOF, delegates to `_reader.seek(offset, 2)`

The file read itself happens **outside** the lock, to avoid holding the lock during I/O.
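The coordination described above can be condensed into a sketch. This is a simplified reconstruction from this doc, not the actual module — error handling and the rename flow are abbreviated:

```python
import os
import tempfile
import threading

class StreamingBuffer:
    """Sketch: one writer thread appends while one reader thread reads/seeks."""

    def __init__(self, temp_dir=None):
        fd, self._path = tempfile.mkstemp(dir=temp_dir)
        os.close(fd)
        self._writer = open(self._path, "wb")   # HTTP handler's handle
        self._reader = open(self._path, "rb")   # PyAV's handle
        self._cond = threading.Condition()
        self._written = 0
        self._eof = False

    path = property(lambda self: self._path)
    written = property(lambda self: self._written)

    def append(self, data: bytes) -> None:
        with self._cond:
            self._writer.write(data)
            self._writer.flush()          # make bytes visible to the reader fd
            self._written += len(data)
            self._cond.notify_all()       # wake blocked read()/seek() calls

    def close_writer(self) -> None:
        with self._cond:
            self._eof = True
            self._writer.close()
            self._cond.notify_all()

    def read(self, size: int = -1) -> bytes:
        with self._cond:
            # Block until unread data exists past the reader position, or EOF.
            while not self._eof and self._reader.tell() >= self._written:
                self._cond.wait()
        return self._reader.read(size)    # actual file I/O outside the lock

    def seek(self, offset: int, whence: int = 0) -> int:
        if whence == os.SEEK_END:
            with self._cond:              # SEEK_END waits for the full upload
                while not self._eof:
                    self._cond.wait()
        return self._reader.seek(offset, whence)

    def tell(self) -> int:
        return self._reader.tell()

    def readable(self) -> bool: return True
    def seekable(self) -> bool: return True
    def writable(self) -> bool: return False

    def close(self) -> None:
        if not self._writer.closed:
            self._writer.close()
        self._reader.close()
```

In this sketch `read()` may return fewer than `size` bytes when less data has arrived, which is legal for a file-like object — PyAV simply calls `read()` again.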
### File Handle Separation
Two independent file descriptors on the same temp file:
- `_writer` opened with `"wb"` — append-only, used by the HTTP handler
- `_reader` opened with `"rb"` — seekable, used by PyAV

On POSIX systems, writes flushed through one fd are immediately visible to reads on another fd of the same inode (shared kernel page cache). `os.rename()` on the path while the reader fd is open is safe — the fd retains access to the underlying inode.
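Both properties are easy to verify in a few lines (POSIX-only — on Windows the rename would fail while the handles are open):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
writer = open(path, "wb")
reader = open(path, "rb")

# Flushed writes through one fd are visible to reads through another fd.
writer.write(b"frame-data")
writer.flush()
first = reader.read()

# Renaming the path does not invalidate either open fd (same inode).
new_path = path + ".renamed"
os.rename(path, new_path)
writer.write(b"-more")
writer.flush()
second = reader.read()

writer.close()
reader.close()
os.unlink(new_path)
```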
### SEEK_END Behavior

When PyAV tries to seek to the end of the file (e.g. to locate the MP4 `moov` atom), `seek(0, 2)` blocks until `close_writer()` is called. This provides graceful degradation for non-faststart MP4 files: the decoder waits for the full upload, then processes normally. For faststart MP4/MKV/WebM, SEEK_END is never called and frames are decoded immediately.
## Dependencies

- **External**: `os`, `tempfile`, `threading`
- **Internal**: none (leaf module)
## Consumers

- `main` — creates `StreamingBuffer` in `POST /detect/video`, feeds chunks via `append()`, passes the buffer to inference
## Data Models

None.

## Configuration

None.

## External Integrations

None.

## Security

None. Temp files are created via `tempfile.mkstemp`, which makes them readable and writable only by the creating user.

## Tests

- `tests/test_az178_streaming_video.py::TestStreamingBuffer` — sequential write/read, blocking read, EOF, concurrent chunked read/write, seek set, seek end blocking, tell, file persistence, written property, seekable/readable flags