ai-training/_docs/02_document/04_verification_log.md
Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration
- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-03-27 18:18:30 +02:00

Verification Log

Summary

| Metric | Count |
|---|---|
| Entities verified | 87 |
| Entities flagged | 0 |
| Corrections applied | 0 |
| Bugs found in code | 5 |
| Missing modules | 1 |
| Duplicated code | 1 pattern (3 locations) |
| Security issues | 5 (3 high, 1 medium, 1 low) |
| Completeness | 21/21 modules (100%) |

Entity Verification

All class names, function names, method signatures, and module names referenced in documentation were verified against the actual source code. No hallucinated entities found.

Verified Entities (key samples)

| Entity | Location | Doc Reference | Status |
|---|---|---|---|
| Security.encrypt_to | security.py:14 | modules/security.md | OK |
| Security.decrypt_to | security.py:28 | modules/security.md | OK |
| Security.get_model_encryption_key | security.py:66 | modules/security.md | OK |
| get_hardware_info | hardware_service.py:5 | modules/hardware_service.md | OK |
| CDNManager.upload | cdn_manager.py:28 | modules/cdn_manager.md | OK |
| CDNManager.download | cdn_manager.py:37 | modules/cdn_manager.md | OK |
| ApiClient.login | api_client.py:43 | modules/api_client.md | OK |
| ApiClient.load_bytes | api_client.py:63 | modules/api_client.md | OK |
| ApiClient.upload_big_small_resource | api_client.py:113 | modules/api_client.md | OK |
| Augmentator.augment_annotations | augmentation.py:125 | modules/augmentation.md | OK |
| Augmentator.augment_inner | augmentation.py:55 | modules/augmentation.md | OK |
| InferenceEngine (ABC) | inference/onnx_engine.py:7 | modules/inference_onnx_engine.md | OK |
| OnnxEngine | inference/onnx_engine.py:25 | modules/inference_onnx_engine.md | OK |
| TensorRTEngine | inference/tensorrt_engine.py:16 | modules/inference_tensorrt_engine.md | OK |
| TensorRTEngine.convert_from_onnx | inference/tensorrt_engine.py:104 | modules/inference_tensorrt_engine.md | OK |
| Inference.process | inference/inference.py:83 | modules/inference_inference.md | OK |
| Inference.remove_overlapping_detections | inference/inference.py:120 | modules/inference_inference.md | OK |
| AnnotationQueueHandler.on_message | annotation-queue/annotation_queue_handler.py:87 | modules/annotation_queue_handler.md | OK |
| AnnotationMessage | annotation-queue/annotation_queue_dto.py:91 | modules/annotation_queue_dto.md | OK |
| form_dataset | train.py:42 | modules/train.md | OK |
| train_dataset | train.py:147 | modules/train.md | OK |
| export_onnx | exports.py:29 | modules/exports.md | OK |
| export_rknn | exports.py:19 | modules/exports.md | OK |
| export_tensorrt | exports.py:45 | modules/exports.md | OK |
| upload_model | exports.py:82 | modules/exports.md | OK |
| WeatherMode | dto/annotationClass.py:6 | modules/dto_annotationClass.md | OK |
| AnnotationClass.read_json | dto/annotationClass.py:18 | modules/dto_annotationClass.md | OK |
| ImageLabel.visualize | dto/imageLabel.py:12 | modules/dto_imageLabel.md | OK |
| Dotdict | utils.py:1 | modules/utils.md | OK |

Code Bugs Found During Verification

Bug 1: augmentation.py — undefined attribute total_to_process

  • Location: augmentation.py, line 118
  • Issue: References self.total_to_process but only self.total_images_to_process is defined in __init__
  • Impact: AttributeError at runtime during progress logging
  • Documented in: modules/augmentation.md, components/05_data_pipeline/description.md
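
The failure mode in Bug 1 can be reproduced in isolation. The class below is a hypothetical stand-in (`AugmentatorSketch`, `processed`, and both `progress_*` methods are illustrative names, not the project's code); it only assumes what the report states, namely that `__init__` defines `total_images_to_process` while the logging code reads `total_to_process`.

```python
class AugmentatorSketch:
    """Hypothetical stand-in; not the project's Augmentator."""

    def __init__(self, total_images_to_process: int) -> None:
        # Only this attribute is defined, as described in the report.
        self.total_images_to_process = total_images_to_process
        self.processed = 0

    def progress_buggy(self) -> str:
        # Reads an attribute that was never assigned -> AttributeError.
        return f"{self.processed}/{self.total_to_process}"

    def progress_fixed(self) -> str:
        # Uses the attribute name that __init__ actually sets.
        return f"{self.processed}/{self.total_images_to_process}"


def raises_attribute_error(fn) -> bool:
    """Return True if calling fn() raises AttributeError."""
    try:
        fn()
    except AttributeError:
        return True
    return False
```

Renaming the reference is the smaller change; renaming the attribute in `__init__` would work equally well as long as every reader of it is updated.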

Bug 2: train.py copy_annotations — reporting bug

  • Location: train.py, lines 93 and 99
  • Issue: copied = 0 is declared but never incremented. The inner function increments the global total_files_copied instead, yet the final message prints copied, so f'Copied all {copied} annotations' always reports 0.
  • Impact: Incorrect progress reporting (cosmetic)
  • Documented in: modules/train.md, components/06_training/description.md
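
One possible shape for the fix is to count in the same scope that prints the total. This is a simplified sketch, not the project's `copy_annotations`: it assumes a sequential copy of `.txt` label files, and `copy_annotations_sketch`, the directory layout, and the sample label content are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_annotations_sketch(src_dir: Path, dst_dir: Path) -> int:
    """Copy every .txt file from src_dir to dst_dir and return the count."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for src in sorted(src_dir.glob("*.txt")):
        shutil.copy(src, dst_dir / src.name)
        copied += 1  # the increment the reported code never performs
    print(f"Copied all {copied} annotations")
    return copied


# Small self-check on throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    for i in range(3):
        (src / f"label_{i}.txt").write_text("0 0.5 0.5 0.1 0.1\n")
    demo_count = copy_annotations_sketch(src, Path(tmp) / "dst")
```

If the real function copies from worker threads, printing the shared counter that is actually incremented (`total_files_copied`) would be the equivalent minimal change.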

Bug 3: exports.py upload_model — stale ApiClient constructor call

  • Location: exports.py, line 97
  • Issue: ApiClient(ApiCredentials(api_c.url, api_c.user, api_c.pw, api_c.folder)) — but ApiClient.__init__ takes no args, and ApiCredentials.__init__ takes (url, email, password), not (url, user, pw, folder).
  • Impact: upload_model function would fail at runtime. This function appears to be stale code — the actual upload flow in train.py:export_current_model uses the correct ApiClient() constructor.
  • Documented in: modules/exports.md, components/06_training/description.md
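
A stale call like this can be detected without executing it by binding the arguments against the callable's signature. The sketch below uses hypothetical stand-in classes whose constructors follow the signatures quoted in the report (`ApiClient()` with no arguments, `ApiCredentials(url, email, password)`); the helper `call_matches_signature` is illustrative and not part of the codebase.

```python
import inspect
from dataclasses import dataclass


@dataclass
class ApiCredentialsSketch:
    """Stand-in with the (url, email, password) signature from the report."""
    url: str
    email: str
    password: str


class ApiClientSketch:
    """Stand-in whose constructor takes no arguments, as reported."""

    def __init__(self) -> None:
        pass


def call_matches_signature(target, *args, **kwargs) -> bool:
    """Return True if target(*args, **kwargs) would bind without a TypeError."""
    try:
        inspect.signature(target).bind(*args, **kwargs)
    except TypeError:
        return False
    return True
```

For example, `call_matches_signature(ApiCredentialsSketch, "url", "user", "pw", "folder")` is false because four positional arguments are passed to a three-parameter constructor, which is the mismatch described above.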

Bug 4: inference/tensorrt_engine.py — potential uninitialized batch_size

  • Location: inference/tensorrt_engine.py, lines 43-44
  • Issue: self.batch_size is only set if engine_input_shape[0] != -1. If the batch dimension is dynamic (-1), self.batch_size is never assigned before being used in self.input_shape = [self.batch_size, ...].
  • Impact: AttributeError at runtime for models with dynamic batch size (unless batch_size is passed via kwargs/set elsewhere)
  • Documented in: modules/inference_tensorrt_engine.md, components/07_inference/description.md
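
A minimal sketch of one way to close this gap: always assign `batch_size`, falling back to a caller-supplied value when the engine reports a dynamic batch dimension. `EngineShapeSketch` and its `batch_size` keyword are assumptions for illustration; the real `TensorRTEngine` constructor may take different parameters.

```python
from typing import Optional, Sequence


class EngineShapeSketch:
    """Hypothetical stand-in for the shape handling, not TensorRTEngine itself."""

    def __init__(
        self,
        engine_input_shape: Sequence[int],
        batch_size: Optional[int] = None,
    ) -> None:
        if engine_input_shape[0] != -1:
            # Static batch dimension: take it from the engine.
            self.batch_size = engine_input_shape[0]
        elif batch_size is not None:
            # Dynamic batch dimension: the caller must choose a size.
            self.batch_size = batch_size
        else:
            raise ValueError(
                "engine has a dynamic batch dimension; pass batch_size explicitly"
            )
        self.input_shape = [self.batch_size, *engine_input_shape[1:]]
```

Raising a `ValueError` with a clear message is a design choice here; a default such as 1 would also avoid the crash but could hide a misconfiguration.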

Bug 5: dataset-visualiser.py — missing import

  • Location: dataset-visualiser.py, line 6
  • Issue: from preprocessing import read_labels — the preprocessing module does not exist in the codebase.
  • Impact: Script cannot run; ImportError at startup
  • Documented in: modules/dataset_visualiser.md, components/05_data_pipeline/description.md

Missing Modules

| Module | Referenced By | Status |
|---|---|---|
| preprocessing | dataset-visualiser.py, tests/imagelabel_visualize_test.py | Not found in codebase |
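
A check of this kind can be scripted with the standard library. The helper below is a sketch (`missing_modules` is a hypothetical name); it reports which module names cannot be resolved in the current environment, so it should be run from the repository root with the project's own interpreter for the result to be meaningful.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be located on the import path."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised for dotted names whose parent package is missing,
            # or for modules with a broken __spec__.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Calling `missing_modules(["preprocessing"])` from the project root would be expected to return the name if the module is indeed absent, though an unrelated installed package with the same name would mask the problem.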

Duplicated Code

AnnotationClass + WeatherMode (3 locations)

| Location | Differences |
|---|---|
| dto/annotationClass.py | Standard version. color_tuple property strips first 3 chars. |
| inference/dto.py | Adds opencv_color BGR field. Same read_json logic. |
| annotation-queue/annotation_queue_dto.py | Adds opencv_color. Reads classes.json from CWD (not relative to package). |
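
The CWD-relative read in the third copy is the difference most likely to cause surprises, since the result depends on where the process is started. A sketch of a package-relative lookup follows; `classes_json_path` and its parameters are hypothetical, and the only assumption about the project is that `classes.json` sits next to the module that reads it.

```python
from pathlib import Path


def classes_json_path(anchor_file: str, filename: str = "classes.json") -> Path:
    """Resolve filename next to anchor_file instead of the working directory."""
    return Path(anchor_file).resolve().parent / filename
```

Inside a module this would typically be called as `classes_json_path(__file__)`. If the three copies were merged into one shared definition, a single lookup like this would replace all three.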

Security Issues

| Issue | Location | Severity |
|---|---|---|
| Hardcoded API credentials | config.yaml (email, password) | High |
| Hardcoded CDN access keys | cdn.yaml (4 access keys) | High |
| Hardcoded encryption key | security.py:67 (get_model_encryption_key) | High |
| Queue credentials in plaintext | config.yaml, annotation-queue/config.yaml | Medium |
| No TLS cert validation in API calls | api_client.py | Low |
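
A common mitigation for the hardcoded credentials is to read them from the environment at startup. The sketch below is illustrative only: the variable names `API_URL`, `API_EMAIL`, and `API_PASSWORD`, the `EnvCredentials` class, and `load_credentials` are assumptions, not names used by the project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical variable names; the project would choose its own.
REQUIRED_VARS = ("API_URL", "API_EMAIL", "API_PASSWORD")


@dataclass(frozen=True)
class EnvCredentials:
    url: str
    email: str
    password: str


def load_credentials(env: Optional[Mapping[str, str]] = None) -> EnvCredentials:
    """Build credentials from environment variables, failing loudly if any is unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return EnvCredentials(
        url=env["API_URL"],
        email=env["API_EMAIL"],
        password=env["API_PASSWORD"],
    )
```

Passing a plain dict as `env` keeps the function testable without touching the real environment. The same pattern could cover the CDN keys and queue credentials; the encryption key would more likely belong in a secrets manager.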

Completeness Check

All 21 source modules documented. All 8 components cover all modules with no gaps.

| Component | Modules | Complete |
|---|---|---|
| 01 Core | constants, utils | Yes |
| 02 Security | security, hardware_service | Yes |
| 03 API & CDN | api_client, cdn_manager | Yes |
| 04 Data Models | dto/annotationClass, dto/imageLabel | Yes |
| 05 Data Pipeline | augmentation, convert-annotations, dataset-visualiser | Yes |
| 06 Training | train, exports, manual_run | Yes |
| 07 Inference | inference/dto, onnx_engine, tensorrt_engine, inference, start_inference | Yes |
| 08 Annotation Queue | annotation_queue_dto, annotation_queue_handler | Yes |

Consistency Check

  • Component docs agree with architecture doc: Yes
  • Flow diagrams match component interfaces: Yes
  • Module dependency graph in discovery matches import analysis: Yes
  • Data model doc matches filesystem layout in architecture: Yes

Remaining Gaps / Uncertainties

  • The preprocessing module may have existed previously and been deleted or renamed
  • exports.upload_model may be intentionally deprecated in favor of the ApiClient-based flow in train.py
  • checkpoint.txt content (2024-06-27 20:51:35) suggests training infrastructure was last used in mid-2024
  • The orangepi5/ shell scripts were not analyzed (bash, not Python) — they appear to be setup/run scripts for edge deployment