Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-06-22 03:51:12 +00:00 · 2026-03-27 18:18:30 +02:00
parent b68c07b540
commit 142c6c4de8
106 changed files with 5706 additions and 654 deletions
@@ -0,0 +1,138 @@
+# Verification Log
+
+## Summary
+
+| Metric | Count |
+|--------|-------|
+| Entities verified | 87 |
+| Entities flagged | 0 |
+| Corrections applied | 0 |
+| Bugs found in code | 5 |
+| Missing modules | 1 |
+| Duplicated code | 1 pattern (3 locations) |
+| Security issues | 5 (3 high, 1 medium, 1 low) |
+| Completeness | 21/21 modules (100%) |
+
+## Entity Verification
+
+All class names, function names, method signatures, and module names referenced in documentation were verified against the actual source code. No hallucinated entities found.
+
+### Verified Entities (key samples)
+
+| Entity | Location | Doc Reference | Status |
+|--------|----------|--------------|--------|
+| `Security.encrypt_to` | security.py:14 | modules/security.md | OK |
+| `Security.decrypt_to` | security.py:28 | modules/security.md | OK |
+| `Security.get_model_encryption_key` | security.py:66 | modules/security.md | OK |
+| `get_hardware_info` | hardware_service.py:5 | modules/hardware_service.md | OK |
+| `CDNManager.upload` | cdn_manager.py:28 | modules/cdn_manager.md | OK |
+| `CDNManager.download` | cdn_manager.py:37 | modules/cdn_manager.md | OK |
+| `ApiClient.login` | api_client.py:43 | modules/api_client.md | OK |
+| `ApiClient.load_bytes` | api_client.py:63 | modules/api_client.md | OK |
+| `ApiClient.upload_big_small_resource` | api_client.py:113 | modules/api_client.md | OK |
+| `Augmentator.augment_annotations` | augmentation.py:125 | modules/augmentation.md | OK |
+| `Augmentator.augment_inner` | augmentation.py:55 | modules/augmentation.md | OK |
+| `InferenceEngine` (ABC) | inference/onnx_engine.py:7 | modules/inference_onnx_engine.md | OK |
+| `OnnxEngine` | inference/onnx_engine.py:25 | modules/inference_onnx_engine.md | OK |
+| `TensorRTEngine` | inference/tensorrt_engine.py:16 | modules/inference_tensorrt_engine.md | OK |
+| `TensorRTEngine.convert_from_onnx` | inference/tensorrt_engine.py:104 | modules/inference_tensorrt_engine.md | OK |
+| `Inference.process` | inference/inference.py:83 | modules/inference_inference.md | OK |
+| `Inference.remove_overlapping_detections` | inference/inference.py:120 | modules/inference_inference.md | OK |
+| `AnnotationQueueHandler.on_message` | annotation-queue/annotation_queue_handler.py:87 | modules/annotation_queue_handler.md | OK |
+| `AnnotationMessage` | annotation-queue/annotation_queue_dto.py:91 | modules/annotation_queue_dto.md | OK |
+| `form_dataset` | train.py:42 | modules/train.md | OK |
+| `train_dataset` | train.py:147 | modules/train.md | OK |
+| `export_onnx` | exports.py:29 | modules/exports.md | OK |
+| `export_rknn` | exports.py:19 | modules/exports.md | OK |
+| `export_tensorrt` | exports.py:45 | modules/exports.md | OK |
+| `upload_model` | exports.py:82 | modules/exports.md | OK |
+| `WeatherMode` | dto/annotationClass.py:6 | modules/dto_annotationClass.md | OK |
+| `AnnotationClass.read_json` | dto/annotationClass.py:18 | modules/dto_annotationClass.md | OK |
+| `ImageLabel.visualize` | dto/imageLabel.py:12 | modules/dto_imageLabel.md | OK |
+| `Dotdict` | utils.py:1 | modules/utils.md | OK |
+
+## Code Bugs Found During Verification
+
+### Bug 1: `augmentation.py` — undefined attribute `total_to_process`
+- **Location**: augmentation.py, line 118
+- **Issue**: References `self.total_to_process` but only `self.total_images_to_process` is defined in `__init__`
+- **Impact**: AttributeError at runtime during progress logging
+- **Documented in**: modules/augmentation.md, components/05_data_pipeline/description.md
+
+### Bug 2: `train.py` `copy_annotations` — reporting bug
+- **Location**: train.py, lines 93 and 99
+- **Issue**: `copied = 0` is declared but never incremented. The global `total_files_copied` is incremented inside the inner function, but the final message prints `copied` instead, so `f'Copied all {copied} annotations'` always reports 0.
+- **Impact**: Incorrect progress reporting (cosmetic)
+- **Documented in**: modules/train.md, components/06_training/description.md
+
+### Bug 3: `exports.py` `upload_model` — stale ApiClient constructor call
+- **Location**: exports.py, line 97
+- **Issue**: `ApiClient(ApiCredentials(api_c.url, api_c.user, api_c.pw, api_c.folder))` — but `ApiClient.__init__` takes no args, and `ApiCredentials.__init__` takes `(url, email, password)`, not `(url, user, pw, folder)`.
+- **Impact**: The `upload_model` function would fail at runtime. It appears to be stale code: the actual upload flow in `train.py:export_current_model` uses the correct `ApiClient()` constructor.
+- **Documented in**: modules/exports.md, components/06_training/description.md
+
+### Bug 4: `inference/tensorrt_engine.py` — potential uninitialized `batch_size`
+- **Location**: inference/tensorrt_engine.py, lines 43–44
+- **Issue**: `self.batch_size` is only set if `engine_input_shape[0] != -1`. If the batch dimension is dynamic (-1), `self.batch_size` is never assigned before being used in `self.input_shape = [self.batch_size, ...]`.
+- **Impact**: AttributeError at runtime for models with a dynamic batch size (unless `batch_size` is passed via kwargs or set elsewhere)
+- **Documented in**: modules/inference_tensorrt_engine.md, components/07_inference/description.md
+
+### Bug 5: `dataset-visualiser.py` — missing import
+- **Location**: dataset-visualiser.py, line 6
+- **Issue**: `from preprocessing import read_labels` — the `preprocessing` module does not exist in the codebase.
+- **Impact**: Script cannot run; ImportError at startup
+- **Documented in**: modules/dataset_visualiser.md, components/05_data_pipeline/description.md
+
+## Missing Modules
+
+| Module | Referenced By | Status |
+|--------|-------------|--------|
+| `preprocessing` | dataset-visualiser.py, tests/imagelabel_visualize_test.py | Not found in codebase |
+
+## Duplicated Code
+
+### AnnotationClass + WeatherMode (3 locations)
+| Location | Differences |
+|----------|-------------|
+| `dto/annotationClass.py` | Standard version. `color_tuple` property strips first 3 chars. |
+| `inference/dto.py` | Adds `opencv_color` BGR field. Same `read_json` logic. |
+| `annotation-queue/annotation_queue_dto.py` | Adds `opencv_color`. Reads `classes.json` from CWD (not relative to package). |
+
+## Security Issues
+
+| Issue | Location | Severity |
+|-------|----------|----------|
+| Hardcoded API credentials | config.yaml (email, password) | High |
+| Hardcoded CDN access keys | cdn.yaml (4 access keys) | High |
+| Hardcoded encryption key | security.py:67 (`get_model_encryption_key`) | High |
+| Queue credentials in plaintext | config.yaml, annotation-queue/config.yaml | Medium |
+| No TLS cert validation in API calls | api_client.py | Low |
+
+## Completeness Check
+
+All 21 source modules are documented. The 8 components together cover all 21 modules with no gaps.
+
+| Component | Modules | Complete |
+|-----------|---------|----------|
+| 01 Core | constants, utils | Yes |
+| 02 Security | security, hardware_service | Yes |
+| 03 API & CDN | api_client, cdn_manager | Yes |
+| 04 Data Models | dto/annotationClass, dto/imageLabel | Yes |
+| 05 Data Pipeline | augmentation, convert-annotations, dataset-visualiser | Yes |
+| 06 Training | train, exports, manual_run | Yes |
+| 07 Inference | inference/dto, onnx_engine, tensorrt_engine, inference, start_inference | Yes |
+| 08 Annotation Queue | annotation_queue_dto, annotation_queue_handler | Yes |
+
+## Consistency Check
+
+- Component docs agree with architecture doc: Yes
+- Flow diagrams match component interfaces: Yes
+- Module dependency graph in discovery matches import analysis: Yes
+- Data model doc matches filesystem layout in architecture: Yes
+
+## Remaining Gaps / Uncertainties
+
+- The `preprocessing` module may have existed previously and been deleted or renamed
+- `exports.upload_model` may be intentionally deprecated in favor of the ApiClient-based flow in train.py
+- `checkpoint.txt` content (`2024-06-27 20:51:35`) suggests training infrastructure was last used in mid-2024
+- The `orangepi5/` shell scripts were not analyzed (bash, not Python) — they appear to be setup/run scripts for edge deployment
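
Bug 4 in the log above (the unassigned `batch_size` for a dynamic batch dimension) can be illustrated with a minimal sketch. It does not reproduce the code in `inference/tensorrt_engine.py` and uses no TensorRT calls. The helper names `resolve_batch_size` and `build_input_shape`, and the fallback default of 1, are hypothetical and chosen only for illustration. The idea is to resolve the batch size on every code path before it is used to build the input shape.

```python
from typing import List, Optional, Sequence


def resolve_batch_size(
    engine_input_shape: Sequence[int],
    requested_batch_size: Optional[int] = None,
    default_batch_size: int = 1,  # assumed fallback, not taken from the project
) -> int:
    """Return a concrete batch size for a static or dynamic batch dimension."""
    static_batch = engine_input_shape[0]
    if static_batch != -1:
        # Static batch dimension: the engine dictates the value.
        return static_batch
    # Dynamic batch dimension (-1): prefer the caller's value, else the default.
    if requested_batch_size is not None:
        return requested_batch_size
    return default_batch_size


def build_input_shape(
    engine_input_shape: Sequence[int],
    requested_batch_size: Optional[int] = None,
) -> List[int]:
    """Build a fully specified input shape; the batch size is always assigned."""
    batch_size = resolve_batch_size(engine_input_shape, requested_batch_size)
    return [batch_size, *engine_input_shape[1:]]


if __name__ == "__main__":
    print(build_input_shape([4, 3, 640, 640]))
    print(build_input_shape([-1, 3, 640, 640], requested_batch_size=8))
    print(build_input_shape([-1, 3, 640, 640]))
```

Keeping the resolution in one function means the dynamic case cannot silently skip the assignment. Whether a default of 1 is appropriate for this engine is a design decision for the maintainers; raising an explicit error when no batch size is supplied would be an equally valid choice.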