Update autopilot state and dependencies table for architecture shift

- Changed the current step from "Refactor" to "Implement" in the autopilot state, indicating a transition to the next phase of development.
- Updated the dependencies table to reflect the completion of 11 tasks and the addition of 4 new tasks related to the distributed architecture.
- Removed outdated task documentation for AZ-173, AZ-174, AZ-175, and AZ-176 as they are now obsolete following the architectural changes.
- Enhanced the execution order for new tasks, organizing them into batches based on dependencies.

These updates aim to align the project documentation with the current development phase and improve clarity on task management moving forward.
2026-06-22 06:41:09 +00:00 · 2026-03-31 06:08:44 +03:00
parent 8ce40a9385
commit 6547c5903a
6 changed files with 48 additions and 102 deletions
@@ -0,0 +1,68 @@
+# Stream-Based run_detect
+
+**Task**: AZ-173_stream_based_run_detect
+**Name**: Replace path-based run_detect with stream-based API in inference.pyx
+**Description**: Refactor `run_detect` in `inference.pyx` to accept media bytes/stream instead of a config dict with local file paths. Enable simultaneous disk write and frame-by-frame detection for video.
+**Complexity**: 3 points
+**Dependencies**: None (core change, other subtasks depend on this)
+**Component**: Inference
+**Jira**: AZ-173
+**Parent**: AZ-172
+
+## Problem
+
+`run_detect` currently takes a `config_dict` containing `paths: list[str]`, a list of local filesystem paths. It iterates over them, guesses each media type via `mimetypes.guess_type`, and opens files with `cv2.VideoCapture` or `cv2.imread`. This fails when the caller runs on a different device, because those paths do not exist on the detection host.
+
+## Current State
+
+```python
+cpdef run_detect(self, dict config_dict, object annotation_callback, object status_callback=None):
+    ai_config = AIRecognitionConfig.from_dict(config_dict)
+    videos, images = [], []
+    for p in ai_config.paths:
+        if self.is_video(p):                   # mimetypes.guess_type on the path
+            videos.append(p)
+        else:
+            images.append(p)
+    self._process_images(ai_config, images)    # cv2.imread(path)
+    for v in videos:
+        self._process_video(ai_config, v)       # cv2.VideoCapture(path)
+```
+
+## Target State
+
+Split into two dedicated methods:
+
+- `run_detect_video(self, stream, AIRecognitionConfig ai_config, str media_name, str save_path, ...)` — accepts a video stream/bytes, writes to `save_path` while decoding frames for detection
+- `run_detect_image(self, bytes image_bytes, AIRecognitionConfig ai_config, str media_name, ...)` — accepts image bytes, decodes in memory
+
+Remove:
+- `is_video(self, str filepath)` method
+- `paths` iteration loop in `run_detect`
+- Direct `cv2.VideoCapture(local_path)` and `cv2.imread(local_path)` calls
+
+## Video Stream Processing — PyAV (preferred)
+
+Use **PyAV** (`av` package) for byte-stream video decoding. Preferred approach per user decision.
+
+**Why PyAV**: Decode frames directly from bytes/stream via `av.open(io.BytesIO(data))` without intermediate file I/O. Most flexible for streaming, no concurrent file read/write synchronization needed.
+
+**Investigation needed during implementation**:
+- Verify PyAV can decode frames from a growing `BytesIO` or chunked stream (not just complete files)
+- Check if `av.open()` supports reading from an async generator or pipe for true streaming
+- Evaluate whether simultaneous disk write can be done with a `tee`-style approach (write bytes to file while feeding to PyAV)
+- Confirm PyAV frame format (RGB/BGR/YUV) compatibility with existing `process_frames` pipeline (expects BGR numpy arrays)
+- Check PyAV version compatibility with the project's Python/Cython setup
+- Fallback: if PyAV streaming is not viable, use write-then-read with `cv2.VideoCapture` on completed file
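The `tee`-style idea in the list above can be sketched with the standard library alone. This is a minimal sketch, not the project's implementation: `TeeReader` and `copy_through_tee` are hypothetical names, and whether PyAV can decode from such a non-seekable reader for a given container format is exactly the open question listed above.

```python
import io


class TeeReader(io.RawIOBase):
    """Read-only wrapper that copies every chunk it reads into a sink file."""

    def __init__(self, source, sink):
        super().__init__()
        self._source = source  # any object exposing read(n) -> bytes
        self._sink = sink      # binary file opened for writing

    def readable(self):
        return True

    def readinto(self, buffer):
        chunk = self._source.read(len(buffer))
        if not chunk:
            return 0                     # end of stream
        self._sink.write(chunk)          # persist before handing data onward
        buffer[:len(chunk)] = chunk
        return len(chunk)


def copy_through_tee(data, save_path, chunk_size=4096):
    """Consume `data` in chunks through the tee; return total bytes consumed."""
    total = 0
    with open(save_path, "wb") as sink:
        reader = TeeReader(io.BytesIO(data), sink)
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)          # a decoder would consume `chunk` here
    return total
```

In a real pipeline the consumer loop would be replaced by the decoder reading from the `TeeReader`, so the file at `save_path` fills up as frames are decoded. The reader is deliberately not seekable, which may rule out containers that keep their index at the end of the file.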
+
+## Acceptance Criteria
+
+- [ ] Video can be processed from bytes/stream without a local file path from the caller
+- [ ] Video is simultaneously written to disk and processed frame-by-frame
+- [ ] Image can be processed from bytes without a local file path
+- [ ] `_process_video_batch` and batch processing logic preserved (only input source changes)
+- [ ] All existing detection logic (tile splitting, validation, tracking) unaffected
+
+## File Changes
+
+| File | Action | Description |
+|------|--------|-------------|
+| `src/inference.pyx` | Modified | New stream-based methods, remove path-based `run_detect` |
+| `src/main.py` | Modified | Adapt callers to new method signatures |
@@ -0,0 +1,76 @@
+# DB-Driven AI Config
+
+**Task**: AZ-174_db_driven_ai_config
+**Name**: Fetch AIRecognitionConfig from DB by userId instead of UI-passed config
+**Description**: Replace UI-passed AI configuration with database-driven config fetched by userId from the annotations service.
+**Complexity**: 2 points
+**Dependencies**: Annotations service needs new endpoint `GET /api/users/{userId}/ai-settings`
+**Component**: Configuration
+**Jira**: AZ-174
+**Parent**: AZ-172
+
+## Problem
+
+`AIRecognitionConfig` is currently built from a dict passed by the caller (UI). In the distributed architecture, the UI should not own or pass detection configuration — it should be stored server-side per user.
+
+## Current State
+
+- `main.py`: `AIConfigDto` Pydantic model with hardcoded defaults, passed as `config_dict`
+- `ai_config.pyx`: `AIRecognitionConfig.from_dict(data)` builds from dict with defaults
+- Camera settings (`altitude`, `focal_length`, `sensor_width`) baked into the config DTO
+- No DB interaction for config
+
+## Target State
+
+- Extract userId from JWT (already parsed in `TokenManager._decode_exp`)
+- Call annotations service: `GET /api/users/{userId}/ai-settings`
+- Response contains merged `AIRecognitionSettings` + `CameraSettings` fields
+- Build `AIRecognitionConfig` from the API response
+- Remove `AIConfigDto` from `main.py` (or keep as optional override for testing)
+- Remove `paths` field from `AIRecognitionConfig` entirely
+
+## DB Tables (from schema)
+
+**AIRecognitionSettings:**
+- FramePeriodRecognition (default 4)
+- FrameRecognitionSeconds (default 2)
+- ProbabilityThreshold (default 0.25)
+- TrackingDistanceConfidence
+- TrackingProbabilityIncrease
+- TrackingIntersectionThreshold
+- ModelBatchSize
+- BigImageTileOverlapPercent
+
+**CameraSettings:**
+- Altitude (default 400 m)
+- FocalLength (default 24 mm)
+- SensorWidth (default 23.5 mm)
+
+**Linking:** These tables currently have no FK to Users. The backend must either:
+- add a `UserId` FK to both tables, or
+- create a `UserAIConfig` join table referencing both
+
+## Backend Dependency
+
+The annotations C# service needs:
+1. New endpoint: `GET /api/users/{userId}/ai-settings` returning merged config
+2. On user creation: seed default `AIRecognitionSettings` + `CameraSettings` rows
+3. Optional: `PUT /api/users/{userId}/ai-settings` so users can update their config
+
+## Acceptance Criteria
+
+- [ ] Detection endpoint extracts userId from JWT
+- [ ] AIRecognitionConfig is fetched from annotations service by userId
+- [ ] Fallback to sensible defaults if service is unreachable
+- [ ] `paths` field removed from `AIRecognitionConfig`
+- [ ] Camera settings come from DB, not request payload
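One way to satisfy the fallback criterion above is to overlay whatever the annotations service returns on the defaults listed under "DB Tables". The sketch below is illustrative only: `DetectionSettings`, `config_from_response`, and the snake_case key names are assumptions, since the response schema is not defined in this task. Fields without a documented default are left as `None`.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class DetectionSettings:
    # Defaults copied from the schema notes; None means no default is documented.
    frame_period_recognition: int = 4
    frame_recognition_seconds: int = 2
    probability_threshold: float = 0.25
    tracking_distance_confidence: Optional[float] = None
    tracking_probability_increase: Optional[float] = None
    tracking_intersection_threshold: Optional[float] = None
    model_batch_size: Optional[int] = None
    big_image_tile_overlap_percent: Optional[int] = None
    altitude: float = 400.0      # metres
    focal_length: float = 24.0   # millimetres
    sensor_width: float = 23.5   # millimetres


def config_from_response(payload: Optional[Mapping[str, Any]]) -> DetectionSettings:
    """Overlay known, non-null keys on the defaults; no payload yields pure defaults."""
    if not payload:
        return DetectionSettings()  # service unreachable or empty body
    known = {f.name for f in fields(DetectionSettings)}
    overrides = {k: v for k, v in payload.items() if k in known and v is not None}
    return DetectionSettings(**overrides)
```

The caller would pass `None` when the HTTP request fails, so detection can still run with the documented defaults instead of raising.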
+
+## File Changes
+
+| File | Action | Description |
+|------|--------|-------------|
+| `src/main.py` | Modified | Fetch config from annotations service via HTTP |
+| `src/ai_config.pxd` | Modified | Remove `paths` field |
+| `src/ai_config.pyx` | Modified | Remove `paths` from `__init__` and `from_dict` |
+| `src/loader_http_client.pyx` | Modified | Add method to fetch user AI config |
+| `src/loader_http_client.pxd` | Modified | Declare new method |
@@ -0,0 +1,73 @@
+# Media Table Integration
+
+**Task**: AZ-175_media_table_integration
+**Name**: Integrate Media table: create record on upload, store file, track status
+**Description**: When a file is uploaded to the detections API, create a Media record in the DB, store the file at the proper path, and update MediaStatus throughout processing.
+**Complexity**: 2 points
+**Dependencies**: Annotations service needs Media CRUD endpoints
+**Component**: Media Management
+**Jira**: AZ-175
+**Parent**: AZ-172
+
+## Problem
+
+Currently, uploaded files are written to temp files, processed, and deleted. No `Media` record is created in the database. File persistence and status tracking are missing.
+
+## Current State
+
+- `/detect`: writes upload to `tempfile.NamedTemporaryFile`, processes, deletes via `os.unlink`
+- `/detect/{media_id}`: accepts a media_id parameter but doesn't create or manage Media records
+- No XxHash64 ID generation in the detections module
+- No file storage to persistent paths
+
+## Target State
+
+### On Upload
+
+1. Receive file bytes from HTTP upload
+2. Compute XxHash64 of file content using the sampling algorithm
+3. Determine MediaType from file extension (Video or Image)
+4. Store file at proper path (from DirectorySettings: VideosDir or ImagesDir)
+5. Create Media record via annotations service: `POST /api/media`
+   - Id: XxHash64 hex string
+   - Name: original filename
+   - Path: storage path
+   - MediaType: Video|Image
+   - MediaStatus: New (1)
+   - UserId: from JWT
+
+### During Processing
+
+6. Update MediaStatus to AIProcessing (2) via `PUT /api/media/{id}/status`
+7. Run detection (stream-based per AZ-173)
+8. Update MediaStatus to AIProcessed (3) on success, or Error (6) on failure
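The status transitions in steps 6 to 8 can be sketched as a small wrapper around the detection call. The numeric values come from the steps above; `process_with_status` and the `set_status` callback are hypothetical stand-ins for the code that would issue `PUT /api/media/{id}/status`.

```python
from enum import IntEnum


class MediaStatus(IntEnum):
    NEW = 1
    AI_PROCESSING = 2
    AI_PROCESSED = 3
    ERROR = 6


def process_with_status(media_id, detect, set_status):
    """Run `detect()` and report AIProcessing, then AIProcessed or Error."""
    set_status(media_id, MediaStatus.AI_PROCESSING)
    try:
        result = detect()
    except Exception:
        set_status(media_id, MediaStatus.ERROR)  # record the failure, then re-raise
        raise
    set_status(media_id, MediaStatus.AI_PROCESSED)
    return result
```

Re-raising after setting `ERROR` keeps the failure visible to the endpoint handler while still leaving the Media record in a terminal state.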
+
+## XxHash64 Sampling Algorithm
+
+```
+For files >= 3072 bytes:
+  Input = file_size_as_8_bytes + first_1024_bytes + middle_1024_bytes + last_1024_bytes
+  Output = XxHash64(input) as hex string
+
+For files < 3072 bytes:
+  Input = file_size_as_8_bytes + entire_file_content
+  Output = XxHash64(input) as hex string
+```
+
+Virtual hashes (in-memory only) are prefixed with "V".
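The sampling step above can be sketched as a pure function that builds the hash input. Two details are not specified in this task and are assumed here: the 8-byte size prefix is encoded little-endian, and the middle window is centred in the file. Both must match the existing implementation elsewhere in the system for IDs to agree. `build_hash_input` is a hypothetical name.

```python
SAMPLE = 1024
THRESHOLD = 3 * SAMPLE  # 3072 bytes


def build_hash_input(data: bytes) -> bytes:
    """Return the bytes that would be fed to XxHash64 for `data`."""
    size_prefix = len(data).to_bytes(8, "little")  # ASSUMPTION: little-endian
    if len(data) < THRESHOLD:
        return size_prefix + data                  # small file: hash everything
    mid_start = (len(data) - SAMPLE) // 2          # ASSUMPTION: centred window
    return (
        size_prefix
        + data[:SAMPLE]
        + data[mid_start:mid_start + SAMPLE]
        + data[-SAMPLE:]
    )
```

With the `xxhash` package installed, `xxhash.xxh64(build_hash_input(data)).hexdigest()` would produce the hex string used as the Media `Id`.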
+
+## Acceptance Criteria
+
+- [ ] XxHash64 ID computed correctly using the sampling algorithm
+- [ ] Media record created in DB on upload with correct fields
+- [ ] File stored at proper persistent path (not temp)
+- [ ] MediaStatus transitions: New → AIProcessing → AIProcessed (or Error)
+- [ ] UserId correctly extracted from JWT and associated with Media record
+
+## File Changes
+
+| File | Action | Description |
+|------|--------|-------------|
+| `src/main.py` | Modified | Upload handling, Media API calls, status updates |
+| `src/media_hash.py` | New | XxHash64 sampling hash utility |
+| `requirements.txt` | Modified | Add `xxhash` library if not present |
@@ -0,0 +1,65 @@
+# Cleanup Obsolete Path-Based Code
+
+**Task**: AZ-176_cleanup_obsolete_path_code
+**Name**: Clean up obsolete path-based code and old methods
+**Description**: Remove all code that relies on the old co-located architecture where the UI sends local file paths to the detection module.
+**Complexity**: 1 point
+**Dependencies**: AZ-173 (stream-based run_detect), AZ-174 (DB-driven config)
+**Component**: Cleanup
+**Jira**: AZ-176
+**Parent**: AZ-172
+
+## Problem
+
+After implementing stream-based detection and DB-driven config, the old path-based code becomes dead code. It must be removed to avoid confusion and maintenance burden.
+
+## Items to Remove
+
+### `inference.pyx`
+
+| Item | Reason |
+|------|--------|
+| `is_video(self, str filepath)` | Media type comes from upload metadata, not filesystem guessing |
+| `for p in ai_config.paths: ...` loop in `run_detect` | Replaced by stream-based dispatch |
+| `cv2.VideoCapture(<str>video_name)` with local path arg | Replaced by stream-based video processing |
+| `cv2.imread(<str>path)` with local path arg | Replaced by bytes-based image processing |
+| Old `run_detect` signature (if fully replaced) | Replaced by `run_detect_video` / `run_detect_image` |
+
+### `ai_config.pxd`
+
+| Item | Reason |
+|------|--------|
+| `cdef public list[str] paths` | Paths no longer part of config |
+
+### `ai_config.pyx`
+
+| Item | Reason |
+|------|--------|
+| `paths` parameter in `__init__` | Paths no longer part of config |
+| `self.paths = paths` assignment | Paths no longer part of config |
+| `data.get("paths", [])` in `from_dict` | Paths no longer part of config |
+| `paths: {self.paths}` in `__str__` | Paths no longer part of config |
+
+### `main.py`
+
+| Item | Reason |
+|------|--------|
+| `AIConfigDto.paths: list[str]` field | Paths no longer sent by caller |
+| `config_dict["paths"] = [tmp.name]` in `/detect` | Temp file path injection no longer needed |
+
+## Acceptance Criteria
+
+- [ ] No references to `paths` in `AIRecognitionConfig` or its Pydantic DTO
+- [ ] No `cv2.VideoCapture(local_path)` or `cv2.imread(local_path)` calls remain
+- [ ] No `is_video(filepath)` method remains
+- [ ] All tests pass after removal
+- [ ] No dead imports left behind
+
+## File Changes
+
+| File | Action | Description |
+|------|--------|-------------|
+| `src/inference.pyx` | Modified | Remove old methods and path-based logic |
+| `src/ai_config.pxd` | Modified | Remove `paths` field declaration |
+| `src/ai_config.pyx` | Modified | Remove `paths` from `__init__`, `from_dict`, and `__str__` |
+| `src/main.py` | Modified | Remove `AIConfigDto.paths`, path injection |