Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration
- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-03-27 18:18:30 +02:00

# Verification Log
## Summary
| Metric | Count |
|--------|-------|
| Entities verified | 87 |
| Entities flagged | 0 |
| Corrections applied | 0 |
| Bugs found in code | 5 |
| Missing modules | 1 |
| Duplicated code | 1 pattern (3 locations) |
| Security issues | 5 (3 High, 1 Medium, 1 Low) |
| Completeness | 21/21 modules (100%) |
## Entity Verification
All class names, function names, method signatures, and module names referenced in documentation were verified against the actual source code. No hallucinated entities found.
### Verified Entities (key samples)
| Entity | Location | Doc Reference | Status |
|--------|----------|--------------|--------|
| `Security.encrypt_to` | security.py:14 | modules/security.md | OK |
| `Security.decrypt_to` | security.py:28 | modules/security.md | OK |
| `Security.get_model_encryption_key` | security.py:66 | modules/security.md | OK |
| `get_hardware_info` | hardware_service.py:5 | modules/hardware_service.md | OK |
| `CDNManager.upload` | cdn_manager.py:28 | modules/cdn_manager.md | OK |
| `CDNManager.download` | cdn_manager.py:37 | modules/cdn_manager.md | OK |
| `ApiClient.login` | api_client.py:43 | modules/api_client.md | OK |
| `ApiClient.load_bytes` | api_client.py:63 | modules/api_client.md | OK |
| `ApiClient.upload_big_small_resource` | api_client.py:113 | modules/api_client.md | OK |
| `Augmentator.augment_annotations` | augmentation.py:125 | modules/augmentation.md | OK |
| `Augmentator.augment_inner` | augmentation.py:55 | modules/augmentation.md | OK |
| `InferenceEngine` (ABC) | inference/onnx_engine.py:7 | modules/inference_onnx_engine.md | OK |
| `OnnxEngine` | inference/onnx_engine.py:25 | modules/inference_onnx_engine.md | OK |
| `TensorRTEngine` | inference/tensorrt_engine.py:16 | modules/inference_tensorrt_engine.md | OK |
| `TensorRTEngine.convert_from_onnx` | inference/tensorrt_engine.py:104 | modules/inference_tensorrt_engine.md | OK |
| `Inference.process` | inference/inference.py:83 | modules/inference_inference.md | OK |
| `Inference.remove_overlapping_detections` | inference/inference.py:120 | modules/inference_inference.md | OK |
| `AnnotationQueueHandler.on_message` | annotation-queue/annotation_queue_handler.py:87 | modules/annotation_queue_handler.md | OK |
| `AnnotationMessage` | annotation-queue/annotation_queue_dto.py:91 | modules/annotation_queue_dto.md | OK |
| `form_dataset` | train.py:42 | modules/train.md | OK |
| `train_dataset` | train.py:147 | modules/train.md | OK |
| `export_onnx` | exports.py:29 | modules/exports.md | OK |
| `export_rknn` | exports.py:19 | modules/exports.md | OK |
| `export_tensorrt` | exports.py:45 | modules/exports.md | OK |
| `upload_model` | exports.py:82 | modules/exports.md | OK |
| `WeatherMode` | dto/annotationClass.py:6 | modules/dto_annotationClass.md | OK |
| `AnnotationClass.read_json` | dto/annotationClass.py:18 | modules/dto_annotationClass.md | OK |
| `ImageLabel.visualize` | dto/imageLabel.py:12 | modules/dto_imageLabel.md | OK |
| `Dotdict` | utils.py:1 | modules/utils.md | OK |
## Code Bugs Found During Verification
### Bug 1: `augmentation.py` — undefined attribute `total_to_process`
- **Location**: augmentation.py, line 118
- **Issue**: References `self.total_to_process` but only `self.total_images_to_process` is defined in `__init__`
- **Impact**: AttributeError at runtime during progress logging
- **Documented in**: modules/augmentation.md, components/05_data_pipeline/description.md
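The mismatch can be reproduced in isolation. The sketch below uses a hypothetical `AugmentatorSketch` class, not the real `Augmentator`; the `processed` counter and both method names are assumptions made for illustration. Only the two attribute names come from the finding above.

```python
class AugmentatorSketch:
    """Hypothetical stand-in for Augmentator; only the counter attributes are modeled."""

    def __init__(self, total_images_to_process: int) -> None:
        self.total_images_to_process = total_images_to_process
        self.processed = 0  # assumed progress counter

    def progress_buggy(self) -> str:
        # Mirrors the reported bug: this attribute is never set in __init__,
        # so the lookup raises AttributeError.
        return f"{self.processed}/{self.total_to_process}"

    def progress_fixed(self) -> str:
        # Uses the attribute name that __init__ actually defines.
        return f"{self.processed}/{self.total_images_to_process}"
```

Renaming the reference at the logging call site is likely the smallest fix, since the attribute set in `__init__` may be read elsewhere.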
### Bug 2: `train.py` `copy_annotations` — reporting bug
- **Location**: train.py, lines 93 and 99
- **Issue**: `copied = 0` is declared but never incremented. The inner function increments the global `total_files_copied` instead, so the final message `f'Copied all {copied} annotations'` always prints 0.
- **Impact**: Incorrect progress reporting (cosmetic)
- **Documented in**: modules/train.md, components/06_training/description.md
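One possible fix is to let the inner function update the enclosing counter through `nonlocal`. The sketch below is a simplified, single-threaded stand-in: `copy_annotations_sketch` and `copy_one` are hypothetical names, and the real file copy is omitted. If the actual code copies files from multiple threads, the increment would also need a lock.

```python
def copy_annotations_sketch(files):
    """Simplified stand-in for copy_annotations; no files are actually copied."""
    copied = 0

    def copy_one(name):
        # Without `nonlocal`, the assignment would create a new local variable
        # and the outer counter would stay at 0.
        nonlocal copied
        # The real file copy would happen here.
        copied += 1

    for name in files:
        copy_one(name)

    return f"Copied all {copied} annotations"
```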
### Bug 3: `exports.py` `upload_model` — stale ApiClient constructor call
- **Location**: exports.py, line 97
- **Issue**: The call `ApiClient(ApiCredentials(api_c.url, api_c.user, api_c.pw, api_c.folder))` matches neither constructor: `ApiClient.__init__` takes no arguments, and `ApiCredentials.__init__` takes `(url, email, password)`, not `(url, user, pw, folder)`.
- **Impact**: `upload_model` function would fail at runtime. This function appears to be stale code — the actual upload flow in `train.py:export_current_model` uses the correct `ApiClient()` constructor.
- **Documented in**: modules/exports.md, components/06_training/description.md
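Constructor mismatches of this kind can be checked without executing the constructors by binding the arguments against the signature. In the sketch below, the two classes are minimal stand-ins that model only the signatures described in this bug; `call_is_valid` is a hypothetical helper.

```python
import inspect


class ApiCredentials:
    """Stand-in modeling only the documented signature (url, email, password)."""

    def __init__(self, url, email, password):
        self.url, self.email, self.password = url, email, password


class ApiClient:
    """Stand-in modeling the documented no-argument constructor."""

    def __init__(self):
        pass


def call_is_valid(cls, *args, **kwargs) -> bool:
    """Return True if cls(*args, **kwargs) would bind to the constructor signature."""
    try:
        inspect.signature(cls).bind(*args, **kwargs)
        return True
    except TypeError:
        return False
```

With these stand-ins, `call_is_valid(ApiCredentials, "url", "user", "pw", "folder")` reports the four-argument call as invalid, while the three-argument form and a bare `ApiClient()` both bind.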
### Bug 4: `inference/tensorrt_engine.py` — potential uninitialized `batch_size`
- **Location**: inference/tensorrt_engine.py, lines 43-44
- **Issue**: `self.batch_size` is only set if `engine_input_shape[0] != -1`. If the batch dimension is dynamic (-1), `self.batch_size` is never assigned before being used in `self.input_shape = [self.batch_size, ...]`.
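- **Impact**: AttributeError at runtime for models with dynamic batch size, because reading an unassigned instance attribute raises AttributeError rather than NameError (unless `batch_size` is passed via kwargs or set elsewhere)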
- **Documented in**: modules/inference_tensorrt_engine.md, components/07_inference/description.md
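A defensive way to avoid the unassigned attribute is to resolve the batch size in one place, with an explicit fallback for dynamic shapes. The helper below is a sketch, not the engine's code: `resolve_batch_size`, the `requested` parameter, and the default of 1 are all assumptions.

```python
def resolve_batch_size(engine_input_shape, requested=None, default=1):
    """Pick a batch size for static and dynamic (-1) batch dimensions.

    Hypothetical helper: a static engine dimension wins, then a caller-supplied
    value, then the default.
    """
    static = engine_input_shape[0]
    if static != -1:
        return static
    if requested is not None:
        return requested
    return default
```

Assigning `self.batch_size = resolve_batch_size(...)` unconditionally before building `self.input_shape` would guarantee the attribute exists on every path.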
### Bug 5: `dataset-visualiser.py` — missing import
- **Location**: dataset-visualiser.py, line 6
- **Issue**: `from preprocessing import read_labels` — the `preprocessing` module does not exist in the codebase.
- **Impact**: Script cannot run; ImportError at startup
- **Documented in**: modules/dataset_visualiser.md, components/05_data_pipeline/description.md
## Missing Modules
| Module | Referenced By | Status |
|--------|-------------|--------|
| `preprocessing` | dataset-visualiser.py, tests/imagelabel_visualize_test.py | Not found in codebase |
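A check like this can be automated with the standard library. The sketch below uses `importlib.util.find_spec`, which returns `None` for a top-level module that cannot be located; `missing_modules` is a hypothetical helper name, and the result depends on the interpreter's search path at the time of the call.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running it from the repository root with `["preprocessing"]` would be expected to report the module as missing, provided no unrelated package of that name is installed in the environment.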
## Duplicated Code
### AnnotationClass + WeatherMode (3 locations)
| Location | Differences |
|----------|-------------|
| `dto/annotationClass.py` | Standard version. `color_tuple` property strips first 3 chars. |
| `inference/dto.py` | Adds `opencv_color` BGR field. Same `read_json` logic. |
| `annotation-queue/annotation_queue_dto.py` | Adds `opencv_color`. Reads `classes.json` from CWD (not relative to package). |
## Security Issues
| Issue | Location | Severity |
|-------|----------|----------|
| Hardcoded API credentials | config.yaml (email, password) | High |
| Hardcoded CDN access keys | cdn.yaml (4 access keys) | High |
| Hardcoded encryption key | security.py:67 (`get_model_encryption_key`) | High |
| Queue credentials in plaintext | config.yaml, annotation-queue/config.yaml | Medium |
| No TLS cert validation in API calls | api_client.py | Low |
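A common mitigation for the hardcoded credentials is to read them from the environment and fail fast when they are absent. The sketch below uses only the standard library; the variable names `API_EMAIL` and `API_PASSWORD` and the `load_credentials` helper are assumptions, not names from the codebase.

```python
import os
from typing import Mapping

# Hypothetical variable names; the project may prefer different ones.
REQUIRED_VARS = ("API_EMAIL", "API_PASSWORD")


def load_credentials(env: Mapping[str, str] = os.environ) -> dict:
    """Read required secrets from the environment instead of a committed YAML file."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing the mapping as a parameter keeps the function testable without touching the real process environment.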
## Completeness Check
All 21 source modules documented. All 8 components cover all modules with no gaps.
| Component | Modules | Complete |
|-----------|---------|----------|
| 01 Core | constants, utils | Yes |
| 02 Security | security, hardware_service | Yes |
| 03 API & CDN | api_client, cdn_manager | Yes |
| 04 Data Models | dto/annotationClass, dto/imageLabel | Yes |
| 05 Data Pipeline | augmentation, convert-annotations, dataset-visualiser | Yes |
| 06 Training | train, exports, manual_run | Yes |
| 07 Inference | inference/dto, onnx_engine, tensorrt_engine, inference, start_inference | Yes |
| 08 Annotation Queue | annotation_queue_dto, annotation_queue_handler | Yes |
## Consistency Check
- Component docs agree with architecture doc: Yes
- Flow diagrams match component interfaces: Yes
- Module dependency graph in discovery matches import analysis: Yes
- Data model doc matches filesystem layout in architecture: Yes
## Remaining Gaps / Uncertainties
- The `preprocessing` module may have existed previously and been deleted or renamed
- `exports.upload_model` may be intentionally deprecated in favor of the ApiClient-based flow in train.py
- `checkpoint.txt` content (`2024-06-27 20:51:35`) suggests training infrastructure was last used in mid-2024
- The `orangepi5/` shell scripts were not analyzed (bash, not Python) — they appear to be setup/run scripts for edge deployment