mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-22 22:36:36 +00:00
Verification Log
Summary
| Metric | Count |
|---|---|
| Entities verified | 87 |
| Entities flagged | 0 |
| Corrections applied | 0 |
| Bugs found in code | 5 |
| Missing modules | 1 |
| Duplicated code | 1 pattern (3 locations) |
| Security issues | 5 (3 high, 1 medium, 1 low) |
| Completeness | 21/21 modules (100%) |
Entity Verification
All class names, function names, method signatures, and module names referenced in documentation were verified against the actual source code. No hallucinated entities found.
Verified Entities (key samples)
| Entity | Location | Doc Reference | Status |
|---|---|---|---|
| `Security.encrypt_to` | security.py:14 | modules/security.md | OK |
| `Security.decrypt_to` | security.py:28 | modules/security.md | OK |
| `Security.get_model_encryption_key` | security.py:66 | modules/security.md | OK |
| `get_hardware_info` | hardware_service.py:5 | modules/hardware_service.md | OK |
| `CDNManager.upload` | cdn_manager.py:28 | modules/cdn_manager.md | OK |
| `CDNManager.download` | cdn_manager.py:37 | modules/cdn_manager.md | OK |
| `ApiClient.login` | api_client.py:43 | modules/api_client.md | OK |
| `ApiClient.load_bytes` | api_client.py:63 | modules/api_client.md | OK |
| `ApiClient.upload_big_small_resource` | api_client.py:113 | modules/api_client.md | OK |
| `Augmentator.augment_annotations` | augmentation.py:125 | modules/augmentation.md | OK |
| `Augmentator.augment_inner` | augmentation.py:55 | modules/augmentation.md | OK |
| `InferenceEngine` (ABC) | inference/onnx_engine.py:7 | modules/inference_onnx_engine.md | OK |
| `OnnxEngine` | inference/onnx_engine.py:25 | modules/inference_onnx_engine.md | OK |
| `TensorRTEngine` | inference/tensorrt_engine.py:16 | modules/inference_tensorrt_engine.md | OK |
| `TensorRTEngine.convert_from_onnx` | inference/tensorrt_engine.py:104 | modules/inference_tensorrt_engine.md | OK |
| `Inference.process` | inference/inference.py:83 | modules/inference_inference.md | OK |
| `Inference.remove_overlapping_detections` | inference/inference.py:120 | modules/inference_inference.md | OK |
| `AnnotationQueueHandler.on_message` | annotation-queue/annotation_queue_handler.py:87 | modules/annotation_queue_handler.md | OK |
| `AnnotationMessage` | annotation-queue/annotation_queue_dto.py:91 | modules/annotation_queue_dto.md | OK |
| `form_dataset` | train.py:42 | modules/train.md | OK |
| `train_dataset` | train.py:147 | modules/train.md | OK |
| `export_onnx` | exports.py:29 | modules/exports.md | OK |
| `export_rknn` | exports.py:19 | modules/exports.md | OK |
| `export_tensorrt` | exports.py:45 | modules/exports.md | OK |
| `upload_model` | exports.py:82 | modules/exports.md | OK |
| `WeatherMode` | dto/annotationClass.py:6 | modules/dto_annotationClass.md | OK |
| `AnnotationClass.read_json` | dto/annotationClass.py:18 | modules/dto_annotationClass.md | OK |
| `ImageLabel.visualize` | dto/imageLabel.py:12 | modules/dto_imageLabel.md | OK |
| `Dotdict` | utils.py:1 | modules/utils.md | OK |
Code Bugs Found During Verification
Bug 1: augmentation.py — undefined attribute total_to_process
- Location: augmentation.py, line 118
- Issue: References `self.total_to_process`, but only `self.total_images_to_process` is defined in `__init__`
- Impact: AttributeError at runtime during progress logging
- Documented in: modules/augmentation.md, components/05_data_pipeline/description.md
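
The failure mode in Bug 1 can be reproduced in isolation. The following is a minimal sketch, not the project's code: `AugmentatorSketch`, `processed`, and both `log_progress_*` methods are hypothetical stand-ins, and only the two attribute names are taken from the log.

```python
class AugmentatorSketch:
    """Hypothetical stand-in; only the two attribute names come from the log."""

    def __init__(self, total_images_to_process: int) -> None:
        self.total_images_to_process = total_images_to_process
        self.processed = 0  # assumed progress counter, for illustration only

    def log_progress_buggy(self) -> str:
        # Reads an attribute that __init__ never defines, so this raises AttributeError.
        return f"{self.processed}/{self.total_to_process}"

    def log_progress_fixed(self) -> str:
        # Reads the attribute that __init__ actually sets.
        return f"{self.processed}/{self.total_images_to_process}"
```

Because the attribute is only read during progress logging, the error would likely surface partway through a run rather than at construction time.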
Bug 2: train.py copy_annotations — reporting bug
- Location: train.py, lines 93 and 99
- Issue: `copied = 0` is declared but never incremented. The global `total_files_copied` is incremented inside the inner function, but `copied` is printed in the final message: `f'Copied all {copied} annotations'` always prints 0.
- Impact: Incorrect progress reporting (cosmetic)
- Documented in: modules/train.md, components/06_training/description.md
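
One way to avoid this class of reporting bug is to keep a single counter and increment it where the copy happens. The sketch below is a simplified illustration under assumptions: `copy_annotations_sketch`, `sources`, and `dest_dir` are hypothetical names, and it does not reproduce the inner-function and global-counter structure that the log describes in train.py.

```python
import shutil
from pathlib import Path


def copy_annotations_sketch(sources: list[Path], dest_dir: Path) -> int:
    """Copy each file into dest_dir and return how many were copied."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for src in sources:
        shutil.copy(src, dest_dir / src.name)
        copied += 1  # the increment that the logged code never performs
    print(f"Copied all {copied} annotations")
    return copied
```

Returning the count also lets a caller or test check it directly instead of relying on the printed message.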
Bug 3: exports.py upload_model — stale ApiClient constructor call
- Location: exports.py, line 97
- Issue: `ApiClient(ApiCredentials(api_c.url, api_c.user, api_c.pw, api_c.folder))`, but `ApiClient.__init__` takes no args, and `ApiCredentials.__init__` takes `(url, email, password)`, not `(url, user, pw, folder)`.
- Impact: `upload_model` function would fail at runtime. This function appears to be stale code; the actual upload flow in `train.py:export_current_model` uses the correct `ApiClient()` constructor.
- Documented in: modules/exports.md, components/06_training/description.md
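
The signature mismatch in Bug 3 can be illustrated with stand-in classes. This is a sketch under assumptions: the two classes below are hypothetical stubs that copy only the constructor signatures reported in the log, and `stale_call` is an illustrative name for the stale call shape.

```python
class ApiCredentials:
    """Stub with the three-argument constructor the log reports."""

    def __init__(self, url: str, email: str, password: str) -> None:
        self.url = url
        self.email = email
        self.password = password


class ApiClient:
    """Stub whose constructor takes no arguments, as the log reports."""

    def __init__(self) -> None:
        self.credentials = None


def stale_call() -> ApiClient:
    # Same shape as the stale call: four credential arguments, one client argument.
    return ApiClient(ApiCredentials("url", "user", "pw", "folder"))
```

With these stubs, `stale_call()` raises `TypeError` while constructing `ApiCredentials`, before `ApiClient` is reached, whereas `ApiClient()` succeeds.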
Bug 4: inference/tensorrt_engine.py — potential uninitialized batch_size
- Location: inference/tensorrt_engine.py, lines 43–44
- Issue: `self.batch_size` is only set if `engine_input_shape[0] != -1`. If the batch dimension is dynamic (-1), `self.batch_size` is never assigned before being used in `self.input_shape = [self.batch_size, ...]`.
- Impact: AttributeError at runtime for models with dynamic batch size (unless `batch_size` is passed via kwargs or set elsewhere)
- Documented in: modules/inference_tensorrt_engine.md, components/07_inference/description.md
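
A defensive pattern for Bug 4 is to resolve the batch size in one place so that both the fixed and the dynamic case assign a value. The sketch below is an assumption-laden illustration: `resolve_batch_size`, `requested`, and `default` are hypothetical names, and the fallback of 1 is an arbitrary choice, not something the log specifies.

```python
from typing import Optional, Sequence


def resolve_batch_size(
    engine_input_shape: Sequence[int],
    requested: Optional[int] = None,
    default: int = 1,  # assumed fallback; the real code may want a different value
) -> int:
    """Pick a batch size for a fixed or dynamic (-1) batch dimension."""
    engine_batch = engine_input_shape[0]
    if engine_batch != -1:
        # Fixed batch dimension: the engine dictates the value.
        return engine_batch
    # Dynamic batch dimension: use the caller's value, else the fallback.
    return requested if requested is not None else default
```

For example, `resolve_batch_size([-1, 3, 640, 640], requested=8)` returns the caller's value, while a fixed shape such as `[4, 3, 640, 640]` returns its own first dimension.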
Bug 5: dataset-visualiser.py — missing import
- Location: dataset-visualiser.py, line 6
- Issue: `from preprocessing import read_labels`, but the `preprocessing` module does not exist in the codebase.
- Impact: Script cannot run; ImportError at startup
- Documented in: modules/dataset_visualiser.md, components/05_data_pipeline/description.md
Missing Modules
| Module | Referenced By | Status |
|---|---|---|
| `preprocessing` | dataset-visualiser.py, tests/imagelabel_visualize_test.py | Not found in codebase |
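
A check of this kind can be automated for top-level module names. The following sketch uses the standard library's `importlib.util.find_spec`; the helper name `find_missing_modules` is hypothetical, and the result depends on the import path of the environment in which it runs, so it would need to be run from the repository root to mirror the table above.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the top-level names that cannot be resolved on the current import path."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing
```

Dotted names are out of scope for this sketch, because `find_spec` imports parent packages and raises `ModuleNotFoundError` if a parent is absent.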
Duplicated Code
AnnotationClass + WeatherMode (3 locations)
| Location | Differences |
|---|---|
| `dto/annotationClass.py` | Standard version. `color_tuple` property strips first 3 chars. |
| `inference/dto.py` | Adds `opencv_color` BGR field. Same `read_json` logic. |
| `annotation-queue/annotation_queue_dto.py` | Adds `opencv_color`. Reads `classes.json` from CWD (not relative to package). |
Security Issues
| Issue | Location | Severity |
|---|---|---|
| Hardcoded API credentials | config.yaml (email, password) | High |
| Hardcoded CDN access keys | cdn.yaml (4 access keys) | High |
| Hardcoded encryption key | security.py:67 (`get_model_encryption_key`) | High |
| Queue credentials in plaintext | config.yaml, annotation-queue/config.yaml | Medium |
| No TLS cert validation in API calls | api_client.py | Low |
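
A common mitigation for the hardcoded-credential findings is to read secrets from the environment at startup. The sketch below is one possible approach, not the project's implementation: `load_api_credentials` and the variable names `API_URL`, `API_EMAIL`, and `API_PASSWORD` are hypothetical.

```python
import os
from typing import Mapping


def load_api_credentials(env: Mapping[str, str] = os.environ) -> dict:
    """Read API credentials from the environment instead of a committed YAML file."""
    required = ("API_URL", "API_EMAIL", "API_PASSWORD")  # hypothetical variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail fast with the variable names only; never echo secret values.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "url": env["API_URL"],
        "email": env["API_EMAIL"],
        "password": env["API_PASSWORD"],
    }
```

Accepting the mapping as a parameter keeps the function testable with a plain dictionary. Any credentials already committed to the repository would still need to be rotated, since they remain in the git history.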
Completeness Check
All 21 source modules documented. All 8 components cover all modules with no gaps.
| Component | Modules | Complete |
|---|---|---|
| 01 Core | constants, utils | Yes |
| 02 Security | security, hardware_service | Yes |
| 03 API & CDN | api_client, cdn_manager | Yes |
| 04 Data Models | dto/annotationClass, dto/imageLabel | Yes |
| 05 Data Pipeline | augmentation, convert-annotations, dataset-visualiser | Yes |
| 06 Training | train, exports, manual_run | Yes |
| 07 Inference | inference/dto, onnx_engine, tensorrt_engine, inference, start_inference | Yes |
| 08 Annotation Queue | annotation_queue_dto, annotation_queue_handler | Yes |
Consistency Check
- Component docs agree with architecture doc: Yes
- Flow diagrams match component interfaces: Yes
- Module dependency graph in discovery matches import analysis: Yes
- Data model doc matches filesystem layout in architecture: Yes
Remaining Gaps / Uncertainties
- The `preprocessing` module may have existed previously and been deleted or renamed
- `exports.upload_model` may be intentionally deprecated in favor of the ApiClient-based flow in train.py
- `checkpoint.txt` content (2024-06-27 20:51:35) suggests training infrastructure was last used in mid-2024
- The `orangepi5/` shell scripts were not analyzed (bash, not Python); they appear to be setup/run scripts for edge deployment