# Distributed Architecture Adaptation

**Task**: AZ-172_distributed_architecture_adaptation
**Name**: Adapt detections module for distributed architecture: stream-based input & DB-driven AI config
**Description**: Replace the co-located, file-path-based detection flow with stream-based input and DB-driven configuration, enabling the UI to run on a separate device from the detections API.
**Complexity**: 5 points
**Dependencies**: Annotations service (C# backend) needs endpoints for per-user AI config and Media management
**Component**: Architecture
**Jira**: AZ-172

## Problem

The detections module assumes co-located deployment (same machine as the WPF UI). The UI sends local file paths, and inference reads files directly from disk:

- `inference.pyx` → `_process_video()` opens local video via `cv2.VideoCapture(<str>video_name)`
- `inference.pyx` → `_process_images()` reads local images via `cv2.imread(<str>path)`
- `ai_config.pyx` has a `paths: list[str]` field carrying local filesystem paths
- `AIRecognitionConfig` is passed from the UI as a dict (via the `config_dict` parameter in `run_detect`)

In the new distributed architecture, the UI runs on a separate device (laptop, tablet, phone) while the detections module is a standalone API on another machine, so local file paths are meaningless.

## Outcome

- Video detection works with streamed input (no local file paths required)
- Video is simultaneously saved to disk and processed frame-by-frame
- Image detection works with uploaded bytes (no local file paths required)
- AIRecognitionConfig is fetched from DB by userId, not passed from UI
- Media table records created on upload with correct XxHash64 Id, path, type, status
- Old path-based code removed

## Subtasks

| Jira | Summary | Points |
|------|---------|--------|
| AZ-173 | Replace path-based `run_detect` with stream-based API in `inference.pyx` | 3 |
| AZ-174 | Fetch AIRecognitionConfig from DB by userId instead of UI-passed config | 2 |
| AZ-175 | Integrate Media table: create record on upload, store file, track status | 2 |
| AZ-176 | Clean up obsolete path-based code and old methods | 1 |

## Acceptance Criteria

**AC-1: Stream-based video detection**
Given a video is uploaded via HTTP to the detection API
When the detections module processes it
Then frames are decoded and run through inference without requiring a local file path from the caller

**AC-2: Concurrent write and detect for video**
Given a video stream is being received
When the detections module processes it
Then the stream is simultaneously written to persistent storage AND processed frame-by-frame for detection

**AC-3: Stream-based image detection**
Given an image is uploaded via HTTP to the detection API
When the detections module processes it
Then the image bytes are decoded and run through inference without requiring a local file path

**AC-4: DB-driven AI config**
Given a detection request arrives with a userId (from JWT)
When the detections module needs AIRecognitionConfig
Then it fetches AIRecognitionSettings + CameraSettings from the DB via the annotations service, not from the request payload

**AC-5: Default config on user creation**
Given a new user is created in the system
When their account is provisioned
Then default AIRecognitionSettings and CameraSettings rows are created for that user

**AC-6: Media record lifecycle**
Given a file is uploaded for detection
When the upload is received
Then a Media record is created (XxHash64 Id, Name, Path, MediaType, UserId) and MediaStatus transitions through New → AIProcessing → AIProcessed (or Error)

**AC-7: Old code removed**
Given the refactoring is complete
When the codebase is reviewed
Then no references to `paths` in AIRecognitionConfig, no `cv2.VideoCapture(local_path)`, no `cv2.imread(local_path)`, and no `is_video(filepath)` remain

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/inference.pyx` | Modified | Replace `run_detect` with stream-based methods; remove path iteration |
| `src/ai_config.pxd` | Modified | Remove `paths` field |
| `src/ai_config.pyx` | Modified | Remove `paths` field; adapt `from_dict` |
| `src/main.py` | Modified | Fetch config from DB; handle Media records; adapt endpoints |
| `src/loader_http_client.pyx` | Modified | Add method to fetch user AI config from annotations service |

## Technical Notes

- `cv2.VideoCapture` can read from a named pipe or a file being appended to. Alternatives: feed frames via a queue from the HTTP upload handler, or use PyAV for direct byte-stream decoding
- The annotations service (C# backend) owns the DB. Config retrieval requires API endpoints on that service
- The XxHash64 ID generation algorithm is documented in `_docs/00_database_schema.md`
- Token management (JWT refresh) is already implemented in `main.py` via `TokenManager`
- DB tables `AIRecognitionSettings` and `CameraSettings` exist in schema but are not yet linked to `Users`; need FK or join table
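
The "file being appended to" approach from the first note can be sketched as follows. This is a minimal illustration, not code from the repo: the function names, the chunked-upload iterable, and the polling strategy are all assumptions, and the retry loop is exactly the part Risk 1 warns about (some OpenCV backends cache EOF on a partially written file).

```python
import threading
import time

def write_stream_to_file(chunks, save_path):
    """Persist an incoming byte stream (e.g. an HTTP upload) chunk by chunk."""
    with open(save_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            f.flush()  # make bytes visible to a concurrent reader as soon as possible

def detect_while_writing(chunks, save_path, on_frame, poll_interval=0.05):
    """Write the stream to disk in one thread while decoding frames in this one."""
    import cv2  # lazy import; only the reading half needs OpenCV

    writer = threading.Thread(target=write_stream_to_file, args=(chunks, save_path))
    writer.start()
    cap = cv2.VideoCapture(save_path)
    while True:
        ok, frame = cap.read()
        if ok:
            on_frame(frame)            # hand the frame to inference
        elif writer.is_alive():
            time.sleep(poll_interval)  # tail not flushed yet; retry
        else:
            break                      # writer finished and nothing left to decode
    cap.release()
    writer.join()
```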

## Risks & Mitigation

**Risk 1: Concurrent write + read of video file**
- *Risk*: `cv2.VideoCapture` may fail or stall when reading an incomplete file
- *Mitigation*: Use a frame-queue pipeline (one thread writes, another reads) or PyAV for byte-stream decoding

**Risk 2: Annotations service API dependency**
- *Risk*: New endpoints are needed on the C# backend for config retrieval and Media management
- *Mitigation*: Define the API contract upfront; the detections module can fall back to defaults if the service is unreachable

**Risk 3: Config-to-User linking not yet in DB**
- *Risk*: The `AIRecognitionSettings` and `CameraSettings` tables have no FK to `Users`
- *Mitigation*: Add a `UserId` FK or create a `UserAIConfig` join table in the backend migration

---

# Stream-Based run_detect

**Task**: AZ-173_stream_based_run_detect
**Name**: Replace path-based run_detect with stream-based API in inference.pyx
**Description**: Refactor `run_detect` in `inference.pyx` to accept media bytes/stream instead of a config dict with local file paths. Enable simultaneous disk write and frame-by-frame detection for video.
**Complexity**: 3 points
**Dependencies**: None (core change; other subtasks depend on this)
**Component**: Inference
**Jira**: AZ-173
**Parent**: AZ-172

## Problem

`run_detect` currently takes a `config_dict` containing `paths: list[str]` — local filesystem paths. It iterates over them, guesses the media type via `mimetypes.guess_type`, and opens files with `cv2.VideoCapture` or `cv2.imread`. This does not work when the caller is on a different device.

## Current State

```python
cpdef run_detect(self, dict config_dict, object annotation_callback, object status_callback=None):
    ai_config = AIRecognitionConfig.from_dict(config_dict)
    for p in ai_config.paths:
        if self.is_video(p):
            videos.append(p)
        else:
            images.append(p)
    self._process_images(ai_config, images)   # cv2.imread(path)
    for v in videos:
        self._process_video(ai_config, v)     # cv2.VideoCapture(path)
```

## Target State

Split into two dedicated methods:

- `run_detect_video(self, stream, AIRecognitionConfig ai_config, str media_name, str save_path, ...)` — accepts a video stream/bytes, writes to `save_path` while decoding frames for detection
- `run_detect_image(self, bytes image_bytes, AIRecognitionConfig ai_config, str media_name, ...)` — accepts image bytes, decodes in memory

Remove:
- `is_video(self, str filepath)` method
- `paths` iteration loop in `run_detect`
- Direct `cv2.VideoCapture(local_path)` and `cv2.imread(local_path)` calls
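
For the image path, the in-memory decode that a `run_detect_image` implementation might use can be sketched with OpenCV's `imdecode`. The helper names are illustrative, not from the codebase:

```python
import numpy as np

def image_bytes_to_buffer(image_bytes):
    """Wrap raw upload bytes in the uint8 buffer cv2.imdecode expects."""
    return np.frombuffer(image_bytes, dtype=np.uint8)

def decode_image_bytes(image_bytes):
    """Decode an image entirely in memory; returns a BGR array, or None on invalid bytes."""
    import cv2  # lazy import keeps the pure helper above numpy-only
    return cv2.imdecode(image_bytes_to_buffer(image_bytes), cv2.IMREAD_COLOR)
```

The decoded array has the same shape and channel order `cv2.imread` produced, so downstream inference code should be unaffected.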

## Video Stream Processing Options

**Option A: Write-then-read**
Write the entire upload to a temp file, then open it with `cv2.VideoCapture`. Simple, but not real-time.

**Option B: Concurrent pipe**
One thread writes incoming bytes to a file; another reads frames via `cv2.VideoCapture` on the growing file. Requires careful synchronization.

**Option C: PyAV byte-stream decoding**
Use `av.open(io.BytesIO(data))` or a custom `av.InputContainer` to decode frames directly from bytes without file I/O. Most flexible for streaming.
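
A minimal sketch of Option C, assuming PyAV is installed; the function name and the `bgr24` frame format are illustrative choices, not project conventions:

```python
import io

def iter_frames(data):
    """Lazily decode video frames straight from bytes using PyAV.

    `data` may be a bytes object or any file-like object; no file I/O is needed.
    """
    import av  # lazy import: PyAV is an optional alternative to OpenCV here
    container = av.open(io.BytesIO(data) if isinstance(data, bytes) else data)
    try:
        for frame in container.decode(video=0):
            yield frame.to_ndarray(format="bgr24")  # same layout existing cv2 code expects
    finally:
        container.close()
```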

## Acceptance Criteria

- [ ] Video can be processed from bytes/stream without a local file path from the caller
- [ ] Video is simultaneously written to disk and processed frame-by-frame
- [ ] Image can be processed from bytes without a local file path
- [ ] `_process_video_batch` and batch processing logic preserved (only input source changes)
- [ ] All existing detection logic (tile splitting, validation, tracking) unaffected

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/inference.pyx` | Modified | New stream-based methods; remove path-based `run_detect` |
| `src/main.py` | Modified | Adapt callers to new method signatures |

---

# DB-Driven AI Config

**Task**: AZ-174_db_driven_ai_config
**Name**: Fetch AIRecognitionConfig from DB by userId instead of UI-passed config
**Description**: Replace UI-passed AI configuration with database-driven config fetched by userId from the annotations service.
**Complexity**: 2 points
**Dependencies**: Annotations service needs new endpoint `GET /api/users/{userId}/ai-settings`
**Component**: Configuration
**Jira**: AZ-174
**Parent**: AZ-172

## Problem

`AIRecognitionConfig` is currently built from a dict passed by the caller (UI). In the distributed architecture, the UI should not own or pass detection configuration — it should be stored server-side per user.

## Current State

- `main.py`: `AIConfigDto` Pydantic model with hardcoded defaults, passed as `config_dict`
- `ai_config.pyx`: `AIRecognitionConfig.from_dict(data)` builds from dict with defaults
- Camera settings (`altitude`, `focal_length`, `sensor_width`) baked into the config DTO
- No DB interaction for config

## Target State

- Extract userId from JWT (already parsed in `TokenManager._decode_exp`)
- Call annotations service: `GET /api/users/{userId}/ai-settings`
- Response contains merged `AIRecognitionSettings` + `CameraSettings` fields
- Build `AIRecognitionConfig` from the API response
- Remove `AIConfigDto` from `main.py` (or keep as an optional override for testing)
- Remove `paths` field from `AIRecognitionConfig` entirely
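
The fetch-with-fallback flow can be sketched as below, using stdlib `urllib` in place of the project's `loader_http_client`. The endpoint comes from this document; the camelCase response field names and the helper names are assumptions, and the default values echo the schema defaults listed in the DB Tables section:

```python
import json
import urllib.request

# Defaults mirror the schema defaults; keys are an assumed wire format.
DEFAULT_AI_SETTINGS = {
    "framePeriodRecognition": 4,
    "frameRecognitionSeconds": 2,
    "probabilityThreshold": 0.25,
    "altitude": 400.0,    # metres
    "focalLength": 24.0,  # mm
    "sensorWidth": 23.5,  # mm
}

def merge_with_defaults(api_response):
    """Overlay a (possibly partial or None) API response onto the defaults."""
    merged = dict(DEFAULT_AI_SETTINGS)
    merged.update(api_response or {})
    return merged

def fetch_user_ai_settings(base_url, user_id, token):
    """GET /api/users/{userId}/ai-settings, falling back to defaults if unreachable."""
    url = f"{base_url}/api/users/{user_id}/ai-settings"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return merge_with_defaults(json.load(resp))
    except OSError:
        return merge_with_defaults(None)  # service unreachable: sensible defaults
```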

## DB Tables (from schema)

**AIRecognitionSettings:**
- FramePeriodRecognition (default 4)
- FrameRecognitionSeconds (default 2)
- ProbabilityThreshold (default 0.25)
- TrackingDistanceConfidence
- TrackingProbabilityIncrease
- TrackingIntersectionThreshold
- ModelBatchSize
- BigImageTileOverlapPercent

**CameraSettings:**
- Altitude (default 400 m)
- FocalLength (default 24 mm)
- SensorWidth (default 23.5 mm)

**Linking:** These tables currently have no FK to Users. The backend needs either:
- Add a `UserId` FK to both tables, or
- Create a `UserAIConfig` join table referencing both

## Backend Dependency

The annotations C# service needs:
1. New endpoint: `GET /api/users/{userId}/ai-settings` returning the merged config
2. On user creation: seed default `AIRecognitionSettings` + `CameraSettings` rows
3. Optional: `PUT /api/users/{userId}/ai-settings` so a user can update their config

## Acceptance Criteria

- [ ] Detection endpoint extracts userId from JWT
- [ ] AIRecognitionConfig is fetched from the annotations service by userId
- [ ] Fallback to sensible defaults if the service is unreachable
- [ ] `paths` field removed from `AIRecognitionConfig`
- [ ] Camera settings come from the DB, not the request payload

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/main.py` | Modified | Fetch config from annotations service via HTTP |
| `src/ai_config.pxd` | Modified | Remove `paths` field |
| `src/ai_config.pyx` | Modified | Remove `paths` from `__init__` and `from_dict` |
| `src/loader_http_client.pyx` | Modified | Add method to fetch user AI config |
| `src/loader_http_client.pxd` | Modified | Declare new method |

---

# Media Table Integration

**Task**: AZ-175_media_table_integration
**Name**: Integrate Media table: create record on upload, store file, track status
**Description**: When a file is uploaded to the detections API, create a Media record in the DB, store the file at the proper path, and update MediaStatus throughout processing.
**Complexity**: 2 points
**Dependencies**: Annotations service needs Media CRUD endpoints
**Component**: Media Management
**Jira**: AZ-175
**Parent**: AZ-172

## Problem

Currently, uploaded files are written to temp files, processed, and deleted. No `Media` record is created in the database. File persistence and status tracking are missing.

## Current State

- `/detect`: writes the upload to a `tempfile.NamedTemporaryFile`, processes it, deletes it via `os.unlink`
- `/detect/{media_id}`: accepts a media_id parameter but doesn't create or manage Media records
- No XxHash64 ID generation in the detections module
- No file storage to persistent paths

## Target State

### On Upload

1. Receive file bytes from the HTTP upload
2. Compute the XxHash64 of the file content using the sampling algorithm
3. Determine MediaType from the file extension (Video or Image)
4. Store the file at the proper path (from DirectorySettings: VideosDir or ImagesDir)
5. Create a Media record via the annotations service: `POST /api/media`
   - Id: XxHash64 hex string
   - Name: original filename
   - Path: storage path
   - MediaType: Video|Image
   - MediaStatus: New (1)
   - UserId: from JWT

### During Processing

6. Update MediaStatus to AIProcessing (2) via `PUT /api/media/{id}/status`
7. Run detection (stream-based per AZ-173)
8. Update MediaStatus to AIProcessed (3) on success, or Error (6) on failure
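
The upload steps above can be sketched as a pair of pure helpers. The status values and field names are taken from this document; the payload casing on the wire and the helper names are assumptions:

```python
import mimetypes
from enum import IntEnum

class MediaStatus(IntEnum):
    # Numeric values as given in the steps above; other states may exist in the schema.
    NEW = 1
    AI_PROCESSING = 2
    AI_PROCESSED = 3
    ERROR = 6

def media_type_for(filename):
    """Classify an upload as 'Video' or 'Image' from its filename extension."""
    guessed, _ = mimetypes.guess_type(filename)
    if guessed and guessed.startswith("video/"):
        return "Video"
    return "Image"

def media_record(media_id, filename, path, user_id):
    """Shape of the POST /api/media payload described in step 5."""
    return {
        "Id": media_id,          # XxHash64 hex string
        "Name": filename,        # original filename
        "Path": path,            # storage path
        "MediaType": media_type_for(filename),
        "MediaStatus": int(MediaStatus.NEW),
        "UserId": user_id,       # from JWT
    }
```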

## XxHash64 Sampling Algorithm

```
For files >= 3072 bytes:
    Input  = file_size_as_8_bytes + first_1024_bytes + middle_1024_bytes + last_1024_bytes
    Output = XxHash64(input) as hex string

For files < 3072 bytes:
    Input  = file_size_as_8_bytes + entire_file_content
    Output = XxHash64(input) as hex string
```

Virtual hashes (in-memory only) prefixed with "V".

## Acceptance Criteria

- [ ] XxHash64 ID computed correctly using the sampling algorithm
- [ ] Media record created in DB on upload with correct fields
- [ ] File stored at proper persistent path (not temp)
- [ ] MediaStatus transitions: New → AIProcessing → AIProcessed (or Error)
- [ ] UserId correctly extracted from JWT and associated with Media record

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/main.py` | Modified | Upload handling, Media API calls, status updates |
| `src/media_hash.py` | New | XxHash64 sampling hash utility |
| `requirements.txt` | Modified | Add `xxhash` library if not present |

---

# Cleanup Obsolete Path-Based Code

**Task**: AZ-176_cleanup_obsolete_path_code
**Name**: Clean up obsolete path-based code and old methods
**Description**: Remove all code that relies on the old co-located architecture, in which the UI sends local file paths to the detection module.
**Complexity**: 1 point
**Dependencies**: AZ-173 (stream-based run_detect), AZ-174 (DB-driven config)
**Component**: Cleanup
**Jira**: AZ-176
**Parent**: AZ-172

## Problem

After stream-based detection and DB-driven config are implemented, the old path-based code becomes dead code. It must be removed to avoid confusion and maintenance burden.

## Items to Remove

### `inference.pyx`

| Item | Reason |
|------|--------|
| `is_video(self, str filepath)` | Media type comes from upload metadata, not filesystem guessing |
| `for p in ai_config.paths: ...` loop in `run_detect` | Replaced by stream-based dispatch |
| `cv2.VideoCapture(<str>video_name)` with local path arg | Replaced by stream-based video processing |
| `cv2.imread(<str>path)` with local path arg | Replaced by bytes-based image processing |
| Old `run_detect` signature (if fully replaced) | Replaced by `run_detect_video` / `run_detect_image` |

### `ai_config.pxd`

| Item | Reason |
|------|--------|
| `cdef public list[str] paths` | Paths no longer part of config |

### `ai_config.pyx`

| Item | Reason |
|------|--------|
| `paths` parameter in `__init__` | Paths no longer part of config |
| `self.paths = paths` assignment | Paths no longer part of config |
| `data.get("paths", [])` in `from_dict` | Paths no longer part of config |
| `paths: {self.paths}` in `__str__` | Paths no longer part of config |

### `main.py`

| Item | Reason |
|------|--------|
| `AIConfigDto.paths: list[str]` field | Paths no longer sent by caller |
| `config_dict["paths"] = [tmp.name]` in `/detect` | Temp-file path injection no longer needed |

## Acceptance Criteria

- [ ] No references to `paths` in `AIRecognitionConfig` or its Pydantic DTO
- [ ] No `cv2.VideoCapture(local_path)` or `cv2.imread(local_path)` calls remain
- [ ] No `is_video(filepath)` method remains
- [ ] All tests pass after removal
- [ ] No dead imports left behind

## File Changes

| File | Action | Description |
|------|--------|-------------|
| `src/inference.pyx` | Modified | Remove old methods and path-based logic |
| `src/ai_config.pxd` | Modified | Remove `paths` field declaration |
| `src/ai_config.pyx` | Modified | Remove `paths` from `__init__`, `from_dict`, `__str__` |
| `src/main.py` | Modified | Remove `AIConfigDto.paths` and path injection |