# Stream-Based run_detect

**Task**: AZ-173_stream_based_run_detect
**Name**: Replace path-based `run_detect` with stream-based API in `inference.pyx`
**Description**: Refactor `run_detect` in `inference.pyx` to accept media bytes/streams instead of a config dict with local file paths. Enable simultaneous disk write and frame-by-frame detection for video.
**Complexity**: 3 points
**Dependencies**: None (core change; other subtasks depend on this)
**Component**: Inference
**Jira**: AZ-173
**Parent**: AZ-172

## Problem

`run_detect` currently takes a `config_dict` containing `paths: list[str]` — local filesystem paths. It iterates over them, guesses the media type via `mimetypes.guess_type`, and opens files with `cv2.VideoCapture` or `cv2.imread`. This breaks when the caller runs on a different device and does not share a local filesystem with the inference service.

## Current State

```python
cpdef run_detect(self, dict config_dict, object annotation_callback, object status_callback=None):
    ai_config = AIRecognitionConfig.from_dict(config_dict)
    for p in ai_config.paths:
        if self.is_video(p):
            videos.append(p)
        else:
            images.append(p)
    self._process_images(ai_config, images)  # cv2.imread(path)
    for v in videos:
        self._process_video(ai_config, v)    # cv2.VideoCapture(path)
```

## Target State

Split `run_detect` into two dedicated methods (a signature sketch follows the list):

- `run_detect_video(self, stream, AIRecognitionConfig ai_config, str media_name, str save_path, ...)` — accepts a video stream/bytes and writes to `save_path` while decoding frames for detection
- `run_detect_image(self, bytes image_bytes, AIRecognitionConfig ai_config, str media_name, ...)` — accepts image bytes and decodes them in memory
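
A minimal signature sketch, written as plain Python rather than Cython for brevity; the helper `_process_image_frame` and the exact parameter order are assumptions, not the final API:

```python
# Sketch only: plain-Python stand-in for the Cython cpdef methods that will
# live in inference.pyx. _process_image_frame is a hypothetical helper.
import cv2
import numpy as np

class InferenceSketch:
    def run_detect_image(self, image_bytes: bytes, ai_config, media_name: str,
                         annotation_callback, status_callback=None):
        # Decode in memory instead of cv2.imread(local_path).
        arr = np.frombuffer(image_bytes, dtype=np.uint8)
        frame = cv2.imdecode(arr, cv2.IMREAD_COLOR)  # BGR ndarray, like imread
        if frame is None:
            raise ValueError(f"could not decode image bytes for {media_name!r}")
        # Hypothetical helper standing in for the existing image pipeline.
        return self._process_image_frame(ai_config, frame, media_name,
                                         annotation_callback, status_callback)

    def run_detect_video(self, stream, ai_config, media_name: str,
                         save_path: str, annotation_callback,
                         status_callback=None):
        # Persist the stream to save_path while feeding frames to detection;
        # the decode side is sketched in the PyAV section below.
        raise NotImplementedError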
```

Remove:

- `is_video(self, str filepath)` method
- `paths` iteration loop in `run_detect`
- Direct `cv2.VideoCapture(local_path)` and `cv2.imread(local_path)` calls

## Video Stream Processing — PyAV (preferred)

Use **PyAV** (the `av` package) for byte-stream video decoding; this is the preferred approach per user decision.

**Why PyAV**: It decodes frames directly from bytes/streams via `av.open(io.BytesIO(data))` without intermediate file I/O. It is the most flexible option for streaming, and no synchronization of concurrent file reads and writes is needed.

**Investigation needed during implementation** (a decode sketch follows this list):
- Verify PyAV can decode frames from a growing `BytesIO` or chunked stream (not just complete files)
- Check whether `av.open()` supports reading from an async generator or pipe for true streaming
- Evaluate whether the simultaneous disk write can use a `tee`-style approach (write bytes to the file while feeding them to PyAV)
- Confirm the PyAV frame format (RGB/BGR/YUV) is compatible with the existing `process_frames` pipeline, which expects BGR numpy arrays
- Check PyAV version compatibility with the project's Python/Cython setup
- Fallback: if PyAV streaming is not viable, use write-then-read with `cv2.VideoCapture` on the completed file
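
A minimal decode sketch under the simplest assumption, namely that the whole buffer is already in memory (the growing/chunked-stream case is exactly what the first bullet must verify); `iter_frames_and_save` is a hypothetical name:

```python
# Sketch only: decode video frames from bytes with PyAV while also writing
# the original bytes to disk. Assumes the complete buffer is in memory.
import io

import av  # PyAV


def iter_frames_and_save(video_bytes: bytes, save_path: str):
    # Tee-style: persist the original bytes, then decode an in-memory copy.
    with open(save_path, "wb") as f:
        f.write(video_bytes)

    with av.open(io.BytesIO(video_bytes)) as container:
        for frame in container.decode(video=0):
            # to_ndarray(format="bgr24") yields the BGR numpy arrays the
            # existing process_frames pipeline expects.
            yield frame.to_ndarray(format="bgr24")
```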

## Acceptance Criteria

- [ ] Video can be processed from bytes/stream without a local file path from the caller
- [ ] Video is simultaneously written to disk and processed frame-by-frame
- [ ] Image can be processed from bytes without a local file path
- [ ] `_process_video_batch` and batch-processing logic are preserved (only the input source changes)
- [ ] All existing detection logic (tile splitting, validation, tracking) is unaffected

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/inference.pyx` | Modified | New stream-based methods; remove path-based `run_detect` |
| `src/main.py` | Modified | Adapt callers to the new method signatures |

# DB-Driven AI Config

**Task**: AZ-174_db_driven_ai_config
**Name**: Fetch `AIRecognitionConfig` from DB by userId instead of UI-passed config
**Description**: Replace UI-passed AI configuration with database-driven config fetched by userId from the annotations service.
**Complexity**: 2 points
**Dependencies**: Annotations service needs a new endpoint: `GET /api/users/{userId}/ai-settings`
**Component**: Configuration
**Jira**: AZ-174
**Parent**: AZ-172

## Problem

`AIRecognitionConfig` is currently built from a dict passed by the caller (UI). In the distributed architecture, the UI should not own or pass detection configuration — it should be stored server-side per user.

## Current State

- `main.py`: `AIConfigDto` Pydantic model with hardcoded defaults, passed as `config_dict`
- `ai_config.pyx`: `AIRecognitionConfig.from_dict(data)` builds the config from a dict with defaults
- Camera settings (`altitude`, `focal_length`, `sensor_width`) are baked into the config DTO
- No DB interaction for config

## Target State

- Extract userId from the JWT (already parsed in `TokenManager._decode_exp`)
- Call the annotations service: `GET /api/users/{userId}/ai-settings`
- The response contains merged `AIRecognitionSettings` + `CameraSettings` fields
- Build `AIRecognitionConfig` from the API response (a fetch sketch follows this list)
- Remove `AIConfigDto` from `main.py` (or keep it as an optional override for testing)
- Remove the `paths` field from `AIRecognitionConfig` entirely
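
A minimal fetch sketch, assuming the `requests` library, a hypothetical `ANNOTATIONS_BASE_URL`, and camelCase response keys; the endpoint path and fallback behavior come from this task, and the default values from the DB tables below:

```python
# Sketch only: fetch per-user AI settings from the annotations service.
# ANNOTATIONS_BASE_URL and the response key names are assumptions; the
# endpoint path and the defaults come from this task description.
import requests

ANNOTATIONS_BASE_URL = "http://annotations:8080"  # hypothetical

DEFAULTS = {
    "framePeriodRecognition": 4,
    "frameRecognitionSeconds": 2,
    "probabilityThreshold": 0.25,
    "altitude": 400.0,      # meters
    "focalLength": 24.0,    # mm
    "sensorWidth": 23.5,    # mm
}


def fetch_ai_config(user_id: str, token: str) -> dict:
    """Return merged AIRecognitionSettings + CameraSettings for user_id,
    falling back to defaults if the annotations service is unreachable."""
    url = f"{ANNOTATIONS_BASE_URL}/api/users/{user_id}/ai-settings"
    try:
        resp = requests.get(url, timeout=5,
                            headers={"Authorization": f"Bearer {token}"})
        resp.raise_for_status()
        return {**DEFAULTS, **resp.json()}
    except requests.RequestException:
        # Acceptance criterion: fall back to sensible defaults.
        return dict(DEFAULTS)
```

`AIRecognitionConfig.from_dict` (or its successor without `paths`) would then consume this dict in place of the UI-passed one.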

## DB Tables (from schema)

These two tables are merged into the single ai-settings response; a model sketch follows the two lists.

**AIRecognitionSettings:**
- FramePeriodRecognition (default 4)
- FrameRecognitionSeconds (default 2)
- ProbabilityThreshold (default 0.25)
- TrackingDistanceConfidence
- TrackingProbabilityIncrease
- TrackingIntersectionThreshold
- ModelBatchSize
- BigImageTileOverlapPercent

**CameraSettings:**
- Altitude (default 400 m)
- FocalLength (default 24 mm)
- SensorWidth (default 23.5 mm)
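
The model sketch referenced above, assuming Pydantic (already used for `AIConfigDto`); field names, casing, and which fields are optional are assumptions derived from the column lists:

```python
# Sketch only: a possible merged response model for
# GET /api/users/{userId}/ai-settings. Field names/casing are assumptions;
# defaults match the values listed in the tables above.
from typing import Optional

from pydantic import BaseModel


class UserAISettings(BaseModel):
    # AIRecognitionSettings
    frame_period_recognition: int = 4
    frame_recognition_seconds: int = 2
    probability_threshold: float = 0.25
    tracking_distance_confidence: Optional[float] = None
    tracking_probability_increase: Optional[float] = None
    tracking_intersection_threshold: Optional[float] = None
    model_batch_size: Optional[int] = None
    big_image_tile_overlap_percent: Optional[float] = None
    # CameraSettings
    altitude: float = 400.0      # meters
    focal_length: float = 24.0   # mm
    sensor_width: float = 23.5   # mm
```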

**Linking:** These tables currently have no FK to Users. The backend needs either:
- Add a `UserId` FK to both tables, or
- Create a `UserAIConfig` join table referencing both

## Backend Dependency

The annotations C# service needs:

1. New endpoint: `GET /api/users/{userId}/ai-settings` returning the merged config
2. On user creation: seed default `AIRecognitionSettings` + `CameraSettings` rows
3. Optional: `PUT /api/users/{userId}/ai-settings` for users to update their config (a client-side sketch follows this list)
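
For item 3, a hypothetical client-side sketch of the update call (the endpoint itself would live in the C# service); the base URL and auth scheme mirror the fetch sketch above and are assumptions:

```python
# Sketch only: client-side use of the optional PUT endpoint from item 3.
# base_url, bearer auth, and the settings payload keys are assumptions.
import requests


def update_ai_settings(base_url: str, user_id: str, token: str,
                       settings: dict) -> None:
    url = f"{base_url}/api/users/{user_id}/ai-settings"
    resp = requests.put(url, json=settings, timeout=5,
                        headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
```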

## Acceptance Criteria

- [ ] Detection endpoint extracts userId from the JWT
- [ ] `AIRecognitionConfig` is fetched from the annotations service by userId
- [ ] Fallback to sensible defaults if the service is unreachable
- [ ] `paths` field removed from `AIRecognitionConfig`
- [ ] Camera settings come from the DB, not the request payload

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/main.py` | Modified | Fetch config from the annotations service via HTTP |
| `src/ai_config.pxd` | Modified | Remove the `paths` field |
| `src/ai_config.pyx` | Modified | Remove `paths` from `__init__` and `from_dict` |
| `src/loader_http_client.pyx` | Modified | Add a method to fetch the user AI config |
| `src/loader_http_client.pxd` | Modified | Declare the new method |