Add AIAvailabilityStatus and AIRecognitionConfig classes for AI model management

- Introduced `AIAvailabilityStatus` class to manage the availability status of AI models, including methods for setting status and logging messages.
- Added `AIRecognitionConfig` class to encapsulate configuration parameters for AI recognition, with a static method for creating instances from dictionaries.
- Implemented enums for AI availability states to enhance clarity and maintainability.
- Updated related Cython files to support the new classes and ensure proper type handling.

These changes aim to improve the structure and functionality of the AI model management system, facilitating better status tracking and configuration handling.
Author: Oleksandr Bezdieniezhnykh
Date:   2026-03-31 05:49:51 +03:00
Parent: fc57d677b4
Commit: 8ce40a9385
43 changed files with 1190 additions and 462 deletions
@@ -0,0 +1,52 @@
# Baseline Metrics
**Run**: 01-code-cleanup
**Date**: 2026-03-30
## Code Metrics
| Metric | Value |
|--------|-------|
| Source LOC (pyx + pxd + py) | 1,714 |
| Test LOC (e2e + mocks) | 1,238 |
| Source files | 22 (.pyx: 10, .pxd: 9, .py: 3) |
| Test files | 10 |
| Dependencies (requirements.txt) | 11 packages |
| Dead code items identified | 20 |
## Test Suite
| Metric | Value |
|--------|-------|
| Total tests | 23 |
| Passing | 23 |
| Failing | 0 |
| Skipped | 0 |
| Execution time | 11.93s |
## Functionality Inventory
| Endpoint | Method | Coverage | Status |
|----------|--------|----------|--------|
| /health | GET | Covered | Working |
| /detect | POST | Covered | Working |
| /detect/{media_id} | POST | Covered | Working |
| /detect/stream | GET | Covered | Working |
## File Structure (pre-refactoring)
All source code lives in the repository root — no `src/` separation:
- Root: main.py, setup.py, 8x .pyx, 7x .pxd, classes.json
- engines/: 3x .pyx, 4x .pxd, __init__.py, __init__.pxd
- e2e/: tests, mocks, fixtures, config
## Dead Code Inventory
| Category | Count | Files |
|----------|-------|-------|
| Unused methods | 4 | serialize() x2, from_msgpack(), stop() |
| Unused fields | 3 | file_data, model_batch_size, annotation_name |
| Unused constants | 5 | CONFIG_FILE, QUEUE_CONFIG_FILENAME, CDN_CONFIG, SMALL_SIZE_KB, QUEUE_MAXSIZE |
| Orphaned declarations | 3 | COMMANDS_QUEUE, ANNOTATIONS_QUEUE, weather enum PXD |
| Dead imports | 4 | msgpack x3, typing/numpy in pxd |
| Empty files | 1 | engines/__init__.pxd |
@@ -0,0 +1,193 @@
# Logical Flow Analysis
**Run**: 01-code-cleanup
**Date**: 2026-03-30
Each documented business flow (from `_docs/02_document/system-flows.md`) traced through actual code. Contradictions classified as: Logic Bug, Performance Waste, Design Contradiction, Documentation Drift.
---
## F2: Single Image Detection (`detect_single_image`)
### LF-01: Batch padding wastes compute (Performance Waste)
**Documented**: Client uploads one image → preprocess → engine → postprocess → return detections.
**Actual** (inference.pyx:261-264):
```python
batch_size = self.engine.get_batch_size()
frames = [frame] * batch_size # duplicate frame N times
input_blob = self.preprocess(frames) # preprocess N copies
outputs = self.engine.run(input_blob)  # run inference on N copies
list_detections = self.postprocess(outputs, ai_config)
detections = list_detections[0] # use only first result
```
For TensorRT (batch_size=4): 4x the preprocessing, 4x the inference, 3/4 of results discarded. For CoreML (batch_size=1): no waste. For ONNX: depends on model's batch dimension.
**Impact**: Up to 4x unnecessary GPU/CPU compute per single-image request.
**Fix**: Engine should support running with fewer frames than max batch size. If the engine requires fixed batch, pad only at the engine boundary, not at the preprocessing level.
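A hedged sketch of the "pad only at the engine boundary" option: `run_at_engine_boundary`, the blob layout, and the `run_fn` signature are assumptions for illustration, not the project's actual engine API.

```python
import numpy as np

def run_at_engine_boundary(run_fn, blob, batch_size):
    """Pad a partial batch only at the engine call, then discard the padding.

    run_fn     -- stand-in for the engine's run() taking an (N, ...) blob
    blob       -- preprocessed real frames, shape (n, ...), 1 <= n <= batch_size
    batch_size -- fixed batch size the engine was built with
    """
    n = blob.shape[0]
    if n < batch_size:
        # Duplicate the last frame only here, AFTER preprocessing ran once.
        pad = np.repeat(blob[-1:], batch_size - n, axis=0)
        blob = np.concatenate([blob, pad], axis=0)
    outputs = run_fn(blob)
    return outputs[:n]  # keep results for real frames only
```

Preprocessing then runs once per real frame; only the raw engine call sees duplicates, and only for engines that genuinely require a fixed batch.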
---
## F3: Media Detection — Video Processing (`_process_video`)
### LF-02: Last partial batch silently dropped (Logic Bug / Data Loss)
**Documented** (system-flows.md F3): "loop For each media file → preprocess/batch → engine → postprocess"
**Actual** (inference.pyx:297-340):
```python
while v_input.isOpened() and not self.stop_signal:
ret, frame = v_input.read()
if not ret or frame is None:
break
frame_count += 1
if frame_count % ai_config.frame_period_recognition == 0:
batch_frames.append(frame)
batch_timestamps.append(...)
if len(batch_frames) == self.engine.get_batch_size():
# process batch
...
batch_frames.clear()
batch_timestamps.clear()
v_input.release() # loop ends
self.send_detection_status()
# batch_frames may still have 1..(batch_size-1) unprocessed frames — DROPPED
```
When the video ends, any remaining frames in `batch_frames` (fewer than `batch_size`) are silently lost. For batch_size=4 and frame_period=4: up to 3 sampled frames at the end of every video are never processed.
**Impact**: Detections in the final seconds of every video are potentially missed.
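A minimal sketch of the missing flush, using a plain iterable in place of the `cv2.VideoCapture` read loop; `batched_sampled_frames` is a hypothetical helper, but the sampling and batching parameters mirror the code above.

```python
def batched_sampled_frames(frames, frame_period, batch_size):
    """Yield sampled frames in batches; flush the final partial batch.

    frames       -- any iterable standing in for the video read loop
    frame_period -- sample every Nth frame (frame_period_recognition)
    batch_size   -- engine batch size
    """
    batch = []
    for count, frame in enumerate(frames, start=1):
        if count % frame_period == 0:
            batch.append(frame)
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        # The flush the current loop is missing: without it, up to
        # batch_size - 1 sampled frames at the end of the video are lost.
        yield batch
```

For a 20-frame clip with frame_period=4 and batch_size=4, this yields `[4, 8, 12, 16]` and then flushes `[20]`, the frame the current code drops.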
### LF-03: `split_list_extend` padding is unnecessary and harmful (Design Contradiction + Performance Waste)
**Design intent**: With dynamic batch sizing (agreed upon during engine refactoring in Step 3), engines should accept variable-size inputs.
**Actual** (inference.pyx:208-217):
```python
cdef split_list_extend(self, lst, chunk_size):
chunks = [lst[i:i + chunk_size] for i in range(0, len(lst), chunk_size)]
last_chunk = chunks[len(chunks) - 1]
if len(last_chunk) < chunk_size:
last_elem = last_chunk[len(last_chunk)-1]
while len(last_chunk) < chunk_size:
last_chunk.append(last_elem)
return chunks
```
This duplicates the last element to pad the final chunk to exactly `chunk_size`. Problems:
1. With dynamic batch sizing, this padding is completely unnecessary — just process the smaller batch
2. The duplicated frames go through full preprocessing and inference, wasting compute
3. The duplicated detections from padded frames are processed by `_process_images_inner` and may emit duplicate annotations (the dedup logic only catches tile overlaps, not frame-level duplicates from padding)
**Impact**: Unnecessary compute + potential duplicate detections from padded frames.
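With dynamic batch sizing the whole helper collapses to plain chunking; a sketch of the replacement (the name `split_list` is hypothetical):

```python
def split_list(lst, chunk_size):
    """Plain chunking: the last chunk keeps its natural size, never padded."""
    return [lst[i:i + chunk_size] for i in range(0, len(lst), chunk_size)]
```

`split_list(list(range(10)), 4)` yields chunks of 4, 4, and 2; the 2-element tail goes to the engine as-is instead of being padded with duplicate frames.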
### LF-04: Fixed batch gate `==` should be `>=` or removed entirely (Design Contradiction)
**Actual** (inference.pyx:307):
```python
if len(batch_frames) == self.engine.get_batch_size():
```
This strict equality means: only process when the batch is **exactly** full. Combined with LF-02 (no flush), remaining frames are dropped. With dynamic batch support, this gate is unnecessary — process frames as they accumulate, or at minimum flush remaining frames after the loop.
---
## F3: Media Detection — Image Processing (`_process_images`)
### LF-05: Non-last small images silently dropped (Logic Bug / Data Loss)
**Actual** (inference.pyx:349-379):
```python
for path in image_paths:
frame_data = [] # ← RESET each iteration
frame = cv2.imread(path)
...
frame_data.append(...) # or .extend(...) for tiled images
if len(frame_data) > self.engine.get_batch_size():
for chunk in self.split_list_extend(frame_data, ...):
self._process_images_inner(...)
self.send_detection_status()
# Outside loop: only the LAST image's frame_data survives
for chunk in self.split_list_extend(frame_data, ...):
self._process_images_inner(...)
self.send_detection_status()
```
Walk through with 3 images [A(small), B(small), C(small)] and batch_size=4:
- Iteration A: `frame_data = [(A, ...)]`. `1 > 4` → False. Not processed.
- Iteration B: `frame_data = [(B, ...)]` (A lost!). `1 > 4` → False. Not processed.
- Iteration C: `frame_data = [(C, ...)]` (B lost!). `1 > 4` → False. Not processed.
- After loop: `frame_data = [(C, ...)]` → processed. Only C was ever detected.
**Impact**: In multi-image media detection, all images except the last are silently dropped when each is smaller than the batch size. This is a critical data loss bug.
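A hedged sketch of the corrected control flow: accumulate across all images, guard against unreadable files, and process once after the loop. `read_image` and `process_batch` are stand-ins for `cv2.imread` and the inference call; the tiling step is elided.

```python
def process_images(image_paths, read_image, batch_size, process_batch):
    """Collect frame data from ALL images, then process it exactly once."""
    frame_data = []                      # reset ONCE, before the loop
    for path in image_paths:
        frame = read_image(path)
        if frame is None:                # guard BEFORE touching frame.shape
            continue
        frame_data.append((frame, path))
    # One pass over everything, partial final chunk included, no padding.
    for i in range(0, len(frame_data), batch_size):
        process_batch(frame_data[i:i + batch_size])
```

With the three small images A, B, C from the walkthrough, all three reach `process_batch` in a single call instead of only C.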
### LF-06: Large images double-processed (Logic Bug)
With image D producing 10 tiles and batch_size=4:
- Inside loop: `10 > 4` → True. All 10 tiles processed (3 chunks: 4+4+2, the last padded to 4). `send_detection_status()` called.
- After loop: `frame_data` still contains all 10 tiles. Processed again (3 more chunks). `send_detection_status()` called again.
**Impact**: Large images get inference run twice, producing duplicate detection events.
### LF-07: `frame.shape` before None check (Logic Bug / Crash)
**Actual** (inference.pyx:355-358):
```python
frame = cv2.imread(<str>path)
img_h, img_w, _ = frame.shape # crashes if frame is None
if frame is None: # dead code — never reached
continue
```
**Impact**: Corrupt or missing image file crashes the entire detection pipeline instead of gracefully skipping.
---
## Cross-Cutting: Batch Size Design Contradiction
### LF-08: Entire pipeline assumes fixed batch size (Design Contradiction)
The engine polymorphism (Step 3 refactoring) established that different engines have different batch sizes: TensorRT=4, CoreML=1, ONNX=variable. But the processing pipeline treats batch size as a fixed gate:
| Location | Pattern | Problem |
|----------|---------|---------|
| `detect_single_image:262` | `[frame] * batch_size` | Pads single frame to batch size |
| `_process_video:307` | `== batch_size` | Only processes exact-full batches |
| `_process_images:372` | `> batch_size` | Only processes when exceeding batch |
| `split_list_extend` | Pads last chunk | Duplicates frames to fill batch |
All engines already accept the full batch as a numpy blob. The fix is to make the pipeline batch-agnostic: collect frames, process when you have enough OR when the stream ends. Never pad with duplicates.
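One way to make every call site batch-agnostic is a small accumulator, sketched here as an assumption rather than the project's agreed design:

```python
class BatchAccumulator:
    """Collect items; emit a full batch when one fills, and let the caller
    flush the remainder when the stream ends. Never pads with duplicates."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._items = []

    def add(self, item):
        """Return a full batch when one is ready, else None."""
        self._items.append(item)
        if len(self._items) >= self.batch_size:
            batch, self._items = self._items, []
            return batch
        return None

    def flush(self):
        """Return whatever remains (possibly empty) at end of stream."""
        batch, self._items = self._items, []
        return batch
```

Video and image paths then share one pattern: `add()` inside the read loop, one `flush()` after it.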
---
## Architecture Documentation Drift
### LF-09: Architecture doc lists msgpack as active technology (Documentation Drift)
**Architecture.md** § Technology Stack:
> "Serialization | msgpack | 1.1.1 | Compact binary serialization for annotations and configs"
**Reality**: All `serialize()` and `from_msgpack()` methods are dead code. The system uses Pydantic JSON for API responses and `from_dict()` for config parsing. msgpack is not used by any live code path.
---
## Summary Table
| ID | Flow | Type | Severity | Description |
|----|------|------|----------|-------------|
| LF-01 | F2 | Performance Waste | Medium | Single image duplicated to fill batch — up to 4x wasted compute |
| LF-02 | F3/Video | Data Loss | High | Last partial video batch silently dropped |
| LF-03 | F3/Both | Design Contradiction + Perf | Medium | split_list_extend pads with duplicates instead of processing smaller batch |
| LF-04 | F3/Video | Design Contradiction | High | Fixed `== batch_size` gate prevents partial batch processing |
| LF-05 | F3/Images | Data Loss | Critical | Non-last small images silently dropped in multi-image processing |
| LF-06 | F3/Images | Logic Bug | High | Large images processed twice (inside loop + after loop) |
| LF-07 | F3/Images | Crash | High | frame.shape before None check |
| LF-08 | Cross-cutting | Design Contradiction | High | Entire pipeline assumes fixed batch size vs dynamic engine reality |
| LF-09 | Documentation | Drift | Low | Architecture lists msgpack as active; it's dead |
@@ -0,0 +1,132 @@
# List of Changes
**Run**: 01-code-cleanup
**Mode**: automatic
**Source**: self-discovered
**Date**: 2026-03-30
## Summary
Two tiers: (1) Fix critical logical flow bugs — batch handling, data loss, crash prevention, and remove the fixed-batch-size assumption that contradicts the dynamic engine design. (2) Dead code cleanup, configurable paths, HTTP timeouts, and move source to `src/`.
## Changes
### C01: Move source code to `src/` directory
- **File(s)**: main.py, inference.pyx, constants_inf.pyx, constants_inf.pxd, annotation.pyx, annotation.pxd, ai_config.pyx, ai_config.pxd, ai_availability_status.pyx, ai_availability_status.pxd, loader_http_client.pyx, loader_http_client.pxd, engines/, setup.py, run-tests.sh, e2e/run_local.sh, e2e/docker-compose.test.yml
- **Problem**: All source code is in the repository root, mixed with config, docs, and test infrastructure.
- **Change**: Move all application source files into `src/`. Update setup.py extension paths, run-tests.sh, e2e scripts, and docker-compose volumes. Keep setup.py, requirements, and tests at root.
- **Rationale**: Project convention requires source under `src/`.
- **Risk**: medium
- **Dependencies**: None (do first — all other changes reference new paths)
### C02: Fix `_process_images` — accumulate all images, process once (LF-05, LF-06)
- **File(s)**: src/inference.pyx (`_process_images`)
- **Problem**: `frame_data = []` is reset inside the per-image loop, so only the last image's data survives to the outer processing loop. Non-last small images are silently dropped. Large images that exceed batch_size inside the loop are also re-processed outside the loop (double-processing).
- **Change**: Accumulate frame_data across ALL images (move reset before the loop). Process all accumulated data once after the loop. Remove the inner batch-processing + status call. Each image's tiles/frames should carry their own ground_sampling_distance so mixed-GSD images process correctly.
- **Rationale**: Critical data loss — multi-image requests silently drop all images except the last.
- **Risk**: medium
- **Dependencies**: C01, C04
### C03: Fix `_process_video` — flush remaining frames after loop (LF-02, LF-04)
- **File(s)**: src/inference.pyx (`_process_video`)
- **Problem**: The `if len(batch_frames) == self.engine.get_batch_size()` gate means frames are only processed in exact-batch-size groups. When the video ends with a partial batch (1..batch_size-1 frames), those frames are silently dropped. Detections at the end of every video are potentially missed.
- **Change**: After the video read loop, if `batch_frames` is non-empty, process the remaining frames as a partial batch (no padding). Change the `==` gate to `>=` as a safety measure, though with the flush it's not strictly needed.
- **Rationale**: Silent data loss — last frames of every video are dropped.
- **Risk**: medium
- **Dependencies**: C01, C04
### C04: Remove `split_list_extend` — replace with simple chunking without padding (LF-03, LF-08)
- **File(s)**: src/inference.pyx (`split_list_extend`, `_process_images`, `detect_single_image`)
- **Problem**: `split_list_extend` pads the last chunk by duplicating its final element to fill `batch_size`. This wastes compute (duplicate inference), may produce duplicate detections, and contradicts the dynamic batch design established in Step 3 (engine polymorphism). In `detect_single_image`, `[frame] * batch_size` pads a single frame to batch_size copies — same issue.
- **Change**: Replace `split_list_extend` with plain chunking (no padding). Last chunk keeps its natural size. In `detect_single_image`, pass a single-frame list. Engine `run()` and `preprocess()` must handle variable-size input — verify each engine supports this or add a minimal adapter.
- **Rationale**: Unnecessary compute (up to 4x for TensorRT single-image), potential duplicate detections from padding, contradicts dynamic batch design.
- **Risk**: high
- **Dependencies**: C01
### C05: Fix frame-is-None crash in `_process_images` (LF-07)
- **File(s)**: src/inference.pyx (`_process_images`)
- **Problem**: `frame.shape` is accessed before `frame is None` check. If `cv2.imread` fails, the pipeline crashes instead of skipping the file.
- **Change**: Move the None check before the shape access.
- **Rationale**: Crash prevention for missing/corrupt image files.
- **Risk**: low
- **Dependencies**: C01
### C06: Remove orphaned RabbitMQ declarations from constants_inf.pxd
- **File(s)**: src/constants_inf.pxd
- **Problem**: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, `ANNOTATIONS_QUEUE` are declared but have no implementations. Remnants of previous RabbitMQ architecture.
- **Change**: Remove the three declarations and their comments.
- **Rationale**: Dead declarations mislead about system architecture.
- **Risk**: low
- **Dependencies**: C01
### C07: Remove unused constants from constants_inf
- **File(s)**: src/constants_inf.pxd, src/constants_inf.pyx
- **Problem**: `CONFIG_FILE` (with stale "zmq" comment), `QUEUE_CONFIG_FILENAME`, `CDN_CONFIG`, `SMALL_SIZE_KB` — defined but never referenced.
- **Change**: Remove all four from .pxd and .pyx.
- **Rationale**: Dead constants with misleading comments.
- **Risk**: low
- **Dependencies**: C01
### C08: Remove dead serialize/from_msgpack methods and msgpack imports
- **File(s)**: src/annotation.pyx, src/annotation.pxd, src/ai_availability_status.pyx, src/ai_availability_status.pxd, src/ai_config.pyx, src/ai_config.pxd
- **Problem**: `Annotation.serialize()`, `AIAvailabilityStatus.serialize()`, `AIRecognitionConfig.from_msgpack()` — all dead. Associated `import msgpack` / `from msgpack import unpackb` only serve these dead methods.
- **Change**: Remove all three methods from .pyx and .pxd files. Remove msgpack imports.
- **Rationale**: Legacy queue-era serialization with no callers.
- **Risk**: low
- **Dependencies**: C01
### C09: Remove unused fields (file_data, model_batch_size, annotation_name)
- **File(s)**: src/ai_config.pyx, src/ai_config.pxd, src/annotation.pyx, src/annotation.pxd, src/main.py
- **Problem**: `AIRecognitionConfig.file_data` populated but never read. `AIRecognitionConfig.model_batch_size` parsed but never used (engine owns batch size). `Detection.annotation_name` set but never read.
- **Change**: Remove field declarations from .pxd, remove from constructors and factory methods in .pyx. Remove `file_data` and `model_batch_size` from AIConfigDto in main.py. Remove annotation_name assignment loop in Annotation.__init__.
- **Rationale**: Dead fields that mislead about responsibilities.
- **Risk**: low
- **Dependencies**: C01, C08
### C10: Remove misc dead code (stop no-op, empty pxd, unused pxd imports)
- **File(s)**: src/loader_http_client.pyx, src/loader_http_client.pxd, src/engines/__init__.pxd, src/engines/inference_engine.pxd
- **Problem**: `LoaderHttpClient.stop()` is a no-op. `engines/__init__.pxd` is empty. `inference_engine.pxd` imports `List, Tuple` from typing and `numpy` — both unused.
- **Change**: Remove stop() from .pyx and .pxd. Delete empty __init__.pxd. Remove unused imports from inference_engine.pxd.
- **Rationale**: Dead code noise.
- **Risk**: low
- **Dependencies**: C01
### C11: Remove msgpack from requirements.txt
- **File(s)**: requirements.txt
- **Problem**: `msgpack==1.1.1` has no consumers after C08 removes all msgpack usage.
- **Change**: Remove from requirements.txt.
- **Rationale**: Unused dependency.
- **Risk**: low
- **Dependencies**: C08
### C12: Make classes.json path configurable via env var
- **File(s)**: src/constants_inf.pyx
- **Problem**: `open('classes.json')` is hardcoded, depends on CWD at import time.
- **Change**: Read from `os.environ.get("CLASSES_JSON_PATH", "classes.json")`.
- **Rationale**: Environment-appropriate configuration.
- **Risk**: low
- **Dependencies**: C01
### C13: Make log directory configurable via env var
- **File(s)**: src/constants_inf.pyx
- **Problem**: `sink="Logs/log_inference_..."` is hardcoded.
- **Change**: Read from `os.environ.get("LOG_DIR", "Logs")`.
- **Rationale**: Environment configurability.
- **Risk**: low
- **Dependencies**: C01
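Together, C12 and C13 amount to a few lines in `constants_inf.pyx`; a sketch, where everything beyond the two documented env keys is an assumption:

```python
import os

# Fall back to the previous hardcoded values when the env vars are unset.
CLASSES_JSON_PATH = os.environ.get("CLASSES_JSON_PATH", "classes.json")
LOG_DIR = os.environ.get("LOG_DIR", "Logs")
```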
### C14: Add timeouts to LoaderHttpClient HTTP calls
- **File(s)**: src/loader_http_client.pyx
- **Problem**: No explicit timeout on `requests.post()` calls. Stalled loader hangs detections service.
- **Change**: Add `timeout=120` to load and upload calls.
- **Rationale**: Prevent service hangs.
- **Risk**: low
- **Dependencies**: C01
### C15: Update architecture doc — remove msgpack from tech stack (LF-09)
- **File(s)**: _docs/02_document/architecture.md
- **Problem**: Tech stack lists "msgpack | 1.1.1 | Compact binary serialization for annotations and configs" but msgpack is dead code after this refactoring.
- **Change**: Remove msgpack row from tech stack table.
- **Rationale**: Documentation accuracy.
- **Risk**: low
- **Dependencies**: C08, C11