# List of Changes

**Run**: 01-code-cleanup
**Mode**: automatic
**Source**: self-discovered
**Date**: 2026-03-30

## Summary

Two tiers: (1) Fix critical logical-flow bugs — batch handling, data loss, crash prevention — and remove the fixed-batch-size assumption that contradicts the dynamic engine design. (2) Dead-code cleanup, configurable paths, HTTP timeouts, and moving source to `src/`.

## Changes

### C01: Move source code to `src/` directory

- **File(s)**: main.py, inference.pyx, constants_inf.pyx, constants_inf.pxd, annotation.pyx, annotation.pxd, ai_config.pyx, ai_config.pxd, ai_availability_status.pyx, ai_availability_status.pxd, loader_http_client.pyx, loader_http_client.pxd, engines/, setup.py, run-tests.sh, e2e/run_local.sh, e2e/docker-compose.test.yml
- **Problem**: All source code lives in the repository root, mixed with config, docs, and test infrastructure.
- **Change**: Move all application source files into `src/`. Update setup.py extension paths, run-tests.sh, the e2e scripts, and docker-compose volumes. Keep setup.py, requirements, and tests at the root.
- **Rationale**: Project convention requires source under `src/`.
- **Risk**: medium
- **Dependencies**: None (do first — all other changes reference the new paths)

### C02: Fix `_process_images` — accumulate all images, process once (LF-05, LF-06)

- **File(s)**: src/inference.pyx (`_process_images`)
- **Problem**: `frame_data = []` is reset inside the per-image loop, so only the last image's data survives to the outer processing loop; every earlier small image is silently dropped. Large images that exceed batch_size inside the loop are also re-processed outside the loop (double-processing).
- **Change**: Accumulate frame_data across ALL images (move the reset before the loop). Process all accumulated data once after the loop. Remove the inner batch-processing and status call. Each image's tiles/frames should carry their own ground_sampling_distance so mixed-GSD images process correctly.
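The accumulation fix in C02 can be sketched as follows. This is a minimal illustration, not the real `_process_images`: `tile_image` and `run_batches` are hypothetical stand-ins for the loader/tiler and engine-dispatch code in inference.pyx.

```python
# Sketch of the C02 fix: accumulate tiles from ALL images, then process once.
# `tile_image` and `run_batches` are hypothetical stand-ins, not real code
# from inference.pyx.

def process_images(image_paths, tile_image, run_batches):
    frame_data = []  # reset ONCE, before the loop (the bug was resetting per image)
    for path in image_paths:
        for tile, gsd in tile_image(path):
            # each tile carries its own ground_sampling_distance so
            # mixed-GSD images can be processed in the same run
            frame_data.append((tile, gsd))
    # single processing pass over everything accumulated; no inner per-image
    # batch processing, so nothing is dropped or double-processed
    return run_batches(frame_data)
```

With the reset outside the loop, a two-image request yields tiles from both images in one pass instead of only the last image's tiles.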
- **Rationale**: Critical data loss — multi-image requests silently drop every image except the last.
- **Risk**: medium
- **Dependencies**: C01, C04

### C03: Fix `_process_video` — flush remaining frames after loop (LF-02, LF-04)

- **File(s)**: src/inference.pyx (`_process_video`)
- **Problem**: The `if len(batch_frames) == self.engine.get_batch_size()` gate processes frames only in exact batch-size groups. When the video ends with a partial batch (1..batch_size-1 frames), those frames are silently dropped, so detections at the end of every video can be missed.
- **Change**: After the video read loop, if `batch_frames` is non-empty, process the remaining frames as a partial batch (no padding). Change the `==` gate to `>=` as a safety measure, though with the flush it is not strictly needed.
- **Rationale**: Silent data loss — the last frames of every video are dropped.
- **Risk**: medium
- **Dependencies**: C01, C04

### C04: Remove `split_list_extend` — replace with simple chunking without padding (LF-03, LF-08)

- **File(s)**: src/inference.pyx (`split_list_extend`, `_process_images`, `detect_single_image`)
- **Problem**: `split_list_extend` pads the last chunk by duplicating its final element to fill `batch_size`. This wastes compute (duplicate inference), may produce duplicate detections, and contradicts the dynamic-batch design established in Step 3 (engine polymorphism). In `detect_single_image`, `[frame] * batch_size` pads a single frame to batch_size copies — same issue.
- **Change**: Replace `split_list_extend` with plain chunking (no padding); the last chunk keeps its natural size. In `detect_single_image`, pass a single-frame list. Engine `run()` and `preprocess()` must handle variable-size input — verify each engine supports this or add a minimal adapter.
- **Rationale**: Unnecessary compute (up to 4x for TensorRT single-image inference), potential duplicate detections from padding, and a contradiction of the dynamic-batch design.
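The no-padding chunking proposed in C04 can be sketched as a small helper; the same "process whatever is left" semantics is what the C03 flush gives partial video batches. This is an illustrative sketch, not the actual replacement code, and it assumes engines accept variable-size batches as C04 requires.

```python
# Sketch of the C04 replacement for `split_list_extend`: plain chunking with
# no padding. The last chunk keeps its natural size instead of being padded
# with duplicated frames.

def chunk(items, batch_size):
    """Yield successive chunks of at most `batch_size` items, never padding."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

For `detect_single_image` this means passing `[frame]` (one chunk of size 1) instead of `[frame] * batch_size`, so the engine runs inference once rather than on duplicated copies.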
- **Risk**: high
- **Dependencies**: C01

### C05: Fix frame-is-None crash in `_process_images` (LF-07)

- **File(s)**: src/inference.pyx (`_process_images`)
- **Problem**: `frame.shape` is accessed before the `frame is None` check, so if `cv2.imread` fails, the pipeline crashes instead of skipping the file.
- **Change**: Move the None check before the shape access.
- **Rationale**: Crash prevention for missing or corrupt image files.
- **Risk**: low
- **Dependencies**: C01

### C06: Remove orphaned RabbitMQ declarations from constants_inf.pxd

- **File(s)**: src/constants_inf.pxd
- **Problem**: `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, and `ANNOTATIONS_QUEUE` are declared but have no implementations — remnants of the previous RabbitMQ architecture.
- **Change**: Remove the three declarations and their comments.
- **Rationale**: Dead declarations mislead about the system architecture.
- **Risk**: low
- **Dependencies**: C01

### C07: Remove unused constants from constants_inf

- **File(s)**: src/constants_inf.pxd, src/constants_inf.pyx
- **Problem**: `CONFIG_FILE` (with a stale "zmq" comment), `QUEUE_CONFIG_FILENAME`, `CDN_CONFIG`, and `SMALL_SIZE_KB` are defined but never referenced.
- **Change**: Remove all four from the .pxd and .pyx files.
- **Rationale**: Dead constants with misleading comments.
- **Risk**: low
- **Dependencies**: C01

### C08: Remove dead serialize/from_msgpack methods and msgpack imports

- **File(s)**: src/annotation.pyx, src/annotation.pxd, src/ai_availability_status.pyx, src/ai_availability_status.pxd, src/ai_config.pyx, src/ai_config.pxd
- **Problem**: `Annotation.serialize()`, `AIAvailabilityStatus.serialize()`, and `AIRecognitionConfig.from_msgpack()` are all dead. The associated `import msgpack` / `from msgpack import unpackb` statements only serve these dead methods.
- **Change**: Remove all three methods from the .pyx and .pxd files. Remove the msgpack imports.
- **Rationale**: Legacy queue-era serialization with no callers.
- **Risk**: low
- **Dependencies**: C01

### C09: Remove unused fields (file_data, model_batch_size, annotation_name)

- **File(s)**: src/ai_config.pyx, src/ai_config.pxd, src/annotation.pyx, src/annotation.pxd, src/main.py
- **Problem**: `AIRecognitionConfig.file_data` is populated but never read. `AIRecognitionConfig.model_batch_size` is parsed but never used (the engine owns batch size). `Detection.annotation_name` is set but never read.
- **Change**: Remove the field declarations from the .pxd files and from the constructors and factory methods in the .pyx files. Remove `file_data` and `model_batch_size` from AIConfigDto in main.py. Remove the annotation_name assignment loop in `Annotation.__init__`.
- **Rationale**: Dead fields that mislead about responsibilities.
- **Risk**: low
- **Dependencies**: C01, C08

### C10: Remove misc dead code (stop no-op, empty pxd, unused pxd imports)

- **File(s)**: src/loader_http_client.pyx, src/loader_http_client.pxd, src/engines/__init__.pxd, src/engines/inference_engine.pxd
- **Problem**: `LoaderHttpClient.stop()` is a no-op. `engines/__init__.pxd` is empty. `inference_engine.pxd` imports `List, Tuple` from typing and `numpy` — both unused.
- **Change**: Remove `stop()` from the .pyx and .pxd files. Delete the empty __init__.pxd. Remove the unused imports from inference_engine.pxd.
- **Rationale**: Dead-code noise.
- **Risk**: low
- **Dependencies**: C01

### C11: Remove msgpack from requirements.txt

- **File(s)**: requirements.txt
- **Problem**: `msgpack==1.1.1` has no consumers after C08 removes all msgpack usage.
- **Change**: Remove it from requirements.txt.
- **Rationale**: Unused dependency.
- **Risk**: low
- **Dependencies**: C08

### C12: Make classes.json path configurable via env var

- **File(s)**: src/constants_inf.pyx
- **Problem**: `open('classes.json')` is hardcoded and depends on the CWD at import time.
- **Change**: Read the path from `os.environ.get("CLASSES_JSON_PATH", "classes.json")`.
- **Rationale**: Environment-appropriate configuration.
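The C12 change can be sketched as below. The `load_classes` helper is illustrative (constants_inf.pyx presumably loads the file inline at import time); only the `os.environ.get("CLASSES_JSON_PATH", "classes.json")` lookup is from the plan itself.

```python
# Sketch of the C12 change: resolve classes.json from an environment variable,
# falling back to the old relative path. `load_classes` is a hypothetical
# helper; the env-var lookup is the actual proposed change.
import json
import os

# Resolved once, the same way constants_inf.pyx would at import time.
CLASSES_JSON_PATH = os.environ.get("CLASSES_JSON_PATH", "classes.json")

def load_classes(path=None):
    """Load the class-id -> class-name mapping from JSON."""
    with open(path or CLASSES_JSON_PATH) as f:
        return json.load(f)
```

Deployments can then set `CLASSES_JSON_PATH=/etc/detections/classes.json` (for example) instead of depending on the working directory at import time.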
- **Risk**: low
- **Dependencies**: C01

### C13: Make log directory configurable via env var

- **File(s)**: src/constants_inf.pyx
- **Problem**: `sink="Logs/log_inference_..."` is hardcoded.
- **Change**: Read the directory from `os.environ.get("LOG_DIR", "Logs")`.
- **Rationale**: Environment configurability.
- **Risk**: low
- **Dependencies**: C01

### C14: Add timeouts to LoaderHttpClient HTTP calls

- **File(s)**: src/loader_http_client.pyx
- **Problem**: The `requests.post()` calls have no explicit timeout, so a stalled loader hangs the detections service.
- **Change**: Add `timeout=120` to the load and upload calls.
- **Rationale**: Prevent service hangs.
- **Risk**: low
- **Dependencies**: C01

### C15: Update architecture doc — remove msgpack from tech stack (LF-09)

- **File(s)**: _docs/02_document/architecture.md
- **Problem**: The tech-stack table lists "msgpack | 1.1.1 | Compact binary serialization for annotations and configs", but msgpack is dead code after this refactoring.
- **Change**: Remove the msgpack row from the tech-stack table.
- **Rationale**: Documentation accuracy.
- **Risk**: low
- **Dependencies**: C08, C11
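The timeout change in C14 can be sketched as follows. The URL shape and `upload_annotations` name are illustrative, not the real LoaderHttpClient API; only the `requests.post(..., timeout=120)` pattern is what the plan proposes.

```python
# Sketch of the C14 change: an explicit timeout on the loader HTTP calls so a
# stalled loader cannot hang the detections service indefinitely. The endpoint
# and function name are hypothetical.
import requests

LOADER_TIMEOUT_S = 120  # matches the timeout=120 proposed in C14

def upload_annotations(base_url, payload):
    # requests raises requests.exceptions.Timeout if the connection or a read
    # exceeds the timeout window, instead of blocking forever
    response = requests.post(
        f"{base_url}/annotations", json=payload, timeout=LOADER_TIMEOUT_S
    )
    response.raise_for_status()
    return response
```

Callers that should degrade gracefully can catch `requests.exceptions.Timeout` and report the loader as unavailable rather than propagating the exception.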