mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-22 22:26:36 +00:00
Codebase Discovery
Directory Tree
ai-training/
├── annotation-queue/ # Separate sub-service: annotation message queue consumer
│ ├── annotation_queue_dto.py
│ ├── annotation_queue_handler.py
│ ├── classes.json
│ ├── config.yaml
│ ├── offset.yaml
│ ├── requirements.txt
│ └── run.sh
├── dto/ # Data transfer objects for the training pipeline
│ ├── annotationClass.py
│ ├── annotation_bulk_message.py (empty)
│ ├── annotation_message.py (empty)
│ └── imageLabel.py
├── inference/ # Inference engine subsystem (ONNX + TensorRT)
│ ├── __init__.py (empty)
│ ├── dto.py
│ ├── inference.py
│ ├── onnx_engine.py
│ └── tensorrt_engine.py
├── orangepi5/ # Setup scripts for OrangePi5 edge device
│ ├── 01 install.sh
│ ├── 02 install-inference.sh
│ └── 03 run_inference.sh
├── scripts/
│ └── init-sftp.sh
├── tests/
│ ├── data.yaml
│ ├── imagelabel_visualize_test.py
│ ├── libomp140.x86_64.dll (binary workaround for Windows)
│ └── security_test.py
├── api_client.py # API client for Azaion backend + CDN resource management
├── augmentation.py # Image augmentation pipeline (albumentations)
├── cdn_manager.py # S3-compatible CDN upload/download via boto3
├── cdn.yaml # CDN credentials config
├── checkpoint.txt # Last training checkpoint timestamp
├── classes.json # Annotation class definitions (17 classes + weather modes)
├── config.yaml # Main config (API url, queue, directories)
├── constants.py # Shared path constants and config keys
├── convert-annotations.py # Annotation format converter (Pascal VOC / bbox → YOLO)
├── dataset-visualiser.py # Interactive dataset visualization tool
├── exports.py # Model export (ONNX, TensorRT, RKNN) and upload
├── hardware_service.py # Hardware fingerprinting (CPU/GPU/RAM/drive serial)
├── install.sh # Dependency installation script
├── manual_run.py # Manual training/export entry point
├── requirements.txt # Python dependencies
├── security.py # AES-256-CBC encryption/decryption + key derivation
├── start_inference.py # Inference entry point (downloads model, runs TensorRT)
├── train.py # Main training pipeline (dataset formation → YOLO training → export)
└── utils.py # Utility classes (Dotdict)
Tech Stack Summary
| Category | Technology | Details |
|---|---|---|
| Language | Python 3.10+ | Match statements used (3.10 feature) |
| ML Framework | Ultralytics (YOLO) | YOLO11 object detection model |
| Deep Learning | PyTorch 2.3.0 (CUDA 12.1) | GPU-accelerated training |
| Inference (Primary) | TensorRT | GPU inference with FP16/INT8 support |
| Inference (Fallback) | ONNX Runtime GPU | Cross-platform inference |
| Augmentation | Albumentations | Image augmentation pipeline |
| Computer Vision | OpenCV (cv2) | Image I/O, preprocessing, visualization |
| CDN/Storage | boto3 (S3-compatible) | Model artifact storage |
| Message Queue | RabbitMQ Streams (rstream) | Annotation message consumption |
| Serialization | msgpack | Queue message deserialization |
| Encryption | cryptography (AES-256-CBC) | Model encryption, API resource encryption |
| GPU Management | pycuda, pynvml | CUDA memory management, device queries |
| HTTP | requests | API communication |
| Config | PyYAML | Configuration files |
| Visualization | matplotlib, netron | Annotation display, model graph viewer |
| Edge Deployment | RKNN (RK3588) | OrangePi5 inference target |
Dependency Graph
Internal Module Dependencies (textual)
Leaves (no internal dependencies):
- `constants`: path constants, config keys
- `utils`: Dotdict helper
- `security`: encryption/decryption, key derivation
- `hardware_service`: hardware fingerprinting
- `cdn_manager`: S3-compatible CDN client
- `dto/annotationClass`: annotation class model + JSON reader
- `dto/imageLabel`: image+labels container with visualization
- `inference/dto`: Detection, Annotation, AnnotationClass (inference-specific)
- `inference/onnx_engine`: InferenceEngine ABC + OnnxEngine implementation
- `convert-annotations`: standalone annotation format converter
- `annotation-queue/annotation_queue_dto`: queue message DTOs
Level 1 (depends on leaves):
- `api_client` → constants, cdn_manager, hardware_service, security
- `augmentation` → constants, dto/imageLabel
- `inference/tensorrt_engine` → inference/onnx_engine (InferenceEngine ABC)
- `inference/inference` → inference/dto, inference/onnx_engine
- `annotation-queue/annotation_queue_handler` → annotation_queue_dto
Level 2 (depends on level 1):
- `exports` → constants, api_client, cdn_manager, security, utils
Level 3 (depends on level 2):
- `train` → constants, api_client, cdn_manager, dto/annotationClass, inference/onnx_engine, security, utils, exports
- `start_inference` → constants, api_client, cdn_manager, inference/inference, inference/tensorrt_engine, security, utils
Level 4 (depends on level 3):
- `manual_run` → constants, train, augmentation
Broken dependency:
- `dataset-visualiser` → constants, dto/annotationClass, dto/imageLabel, preprocessing (module not found in codebase)
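A broken dependency of this kind can be caught mechanically. The following is a minimal sketch, not code from the repository: it parses a source string with the standard `ast` module and reports the top-level imported module names that `importlib.util.find_spec` cannot locate. The helper name `unresolved_imports` is hypothetical.

```python
import ast
import importlib.util


def unresolved_imports(source: str) -> list[str]:
    """Return top-level module names imported by `source` that cannot be located."""
    tree = ast.parse(source)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b" -> only the top-level package "a" is checked
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # absolute "from a.b import c"; relative imports are skipped
            names.add(node.module.split(".")[0])

    missing = []
    for name in sorted(names):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# Illustrative input; "no_such_module_xyz" is a made-up name.
sample = "import os\nfrom no_such_module_xyz import helper\n"
print(unresolved_imports(sample))  # ['no_such_module_xyz']
```

To check a real file, read its text and call the function from the repository root so that sibling modules such as `constants` resolve. Under that assumption, `dataset-visualiser.py` would be expected to report `preprocessing`.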
Dependency Graph (Mermaid)
graph TD
constants --> api_client
constants --> augmentation
constants --> exports
constants --> train
constants --> manual_run
constants --> start_inference
constants --> dataset-visualiser
utils --> exports
utils --> train
utils --> start_inference
security --> api_client
security --> exports
security --> train
security --> start_inference
hardware_service --> api_client
cdn_manager --> api_client
cdn_manager --> exports
cdn_manager --> train
cdn_manager --> start_inference
api_client --> exports
api_client --> train
api_client --> start_inference
dto_annotationClass[dto/annotationClass] --> train
dto_annotationClass --> dataset-visualiser
dto_imageLabel[dto/imageLabel] --> augmentation
dto_imageLabel --> dataset-visualiser
inference_dto[inference/dto] --> inference_inference[inference/inference]
inference_onnx[inference/onnx_engine] --> inference_inference
inference_onnx --> inference_trt[inference/tensorrt_engine]
inference_onnx --> train
inference_inference --> start_inference
inference_trt --> start_inference
exports --> train
train --> manual_run
augmentation --> manual_run
aq_dto[annotation-queue/annotation_queue_dto] --> aq_handler[annotation-queue/annotation_queue_handler]
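A graph like the one above can be generated from a dependency mapping instead of being maintained by hand. The sketch below is illustrative only: the function names `node_id` and `to_mermaid` are hypothetical, and the mapping shown is a two-module excerpt of the textual list. Edges point from a dependency to the module that depends on it, matching the direction used in the graph.

```python
import re


def node_id(name: str) -> str:
    """Turn a module path into an identifier without slashes or hyphens."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


def to_mermaid(depends_on: dict[str, list[str]]) -> str:
    """Render 'dependency --> dependent' edges as a Mermaid flowchart."""
    lines = ["graph TD"]
    for module in sorted(depends_on):
        for dep in sorted(depends_on[module]):
            lines.append(
                f"    {node_id(dep)}[{dep}] --> {node_id(module)}[{module}]"
            )
    return "\n".join(lines)


# Excerpt of the module list, used here only as sample input.
example = {
    "manual_run": ["constants", "train", "augmentation"],
    "augmentation": ["constants", "dto/imageLabel"],
}
print(to_mermaid(example))
```

Every node carries an explicit label, so a path such as `dto/imageLabel` keeps its readable name while its identifier becomes `dto_imageLabel`.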
Topological Processing Order
| Batch | Modules |
|---|---|
| 1 (leaves) | constants, utils, security, hardware_service, cdn_manager |
| 2 (leaves) | dto/annotationClass, dto/imageLabel, inference/dto, inference/onnx_engine |
| 3 (level 1) | api_client, augmentation, inference/tensorrt_engine, inference/inference |
| 4 (level 2) | exports, convert-annotations, dataset-visualiser |
| 5 (level 3) | train, start_inference |
| 6 (level 4) | manual_run |
| 7 (separate) | annotation-queue/annotation_queue_dto, annotation-queue/annotation_queue_handler |
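A processing order of this kind can be derived with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The sketch below is an assumption about how such batches might be computed, not the procedure actually used to build the table. The mapping is transcribed from the textual dependency list, the function name `dependency_batches` is hypothetical, and `dataset-visualiser` is left out because one of its imports cannot be resolved.

```python
from graphlib import TopologicalSorter

# module -> modules it depends on (transcribed from the textual list)
DEPENDS_ON = {
    "api_client": ["constants", "cdn_manager", "hardware_service", "security"],
    "augmentation": ["constants", "dto/imageLabel"],
    "inference/tensorrt_engine": ["inference/onnx_engine"],
    "inference/inference": ["inference/dto", "inference/onnx_engine"],
    "exports": ["constants", "api_client", "cdn_manager", "security", "utils"],
    "train": [
        "constants", "api_client", "cdn_manager", "dto/annotationClass",
        "inference/onnx_engine", "security", "utils", "exports",
    ],
    "start_inference": [
        "constants", "api_client", "cdn_manager", "inference/inference",
        "inference/tensorrt_engine", "security", "utils",
    ],
    "manual_run": ["constants", "train", "augmentation"],
}


def dependency_batches(depends_on):
    """Group modules into batches whose dependencies are all in earlier batches."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(dependency_batches(DEPENDS_ON), start=1):
    print(number, ", ".join(batch))
```

Because this groups purely by readiness, the batches need not match the table row for row. For example, `start_inference` becomes ready in the same batch as `exports`, since all of its listed dependencies are leaves or level 1 modules.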
Entry Points
| Entry Point | Description |
|---|---|
| `train.py` (`__main__`) | Main pipeline: form dataset → train YOLO → export + upload ONNX model |
| `augmentation.py` (`__main__`) | Continuous augmentation loop (runs indefinitely) |
| `start_inference.py` (`__main__`) | Download encrypted TensorRT model → run video inference |
| `manual_run.py` (script) | Ad-hoc training/export commands |
| `convert-annotations.py` (`__main__`) | One-shot annotation format conversion |
| `dataset-visualiser.py` (`__main__`) | Interactive annotation visualization |
| `annotation-queue/annotation_queue_handler.py` (`__main__`) | Async queue consumer for annotation CRUD events |
Leaf Modules
constants, utils, security, hardware_service, cdn_manager, dto/annotationClass, dto/imageLabel, inference/dto, inference/onnx_engine, convert-annotations, annotation-queue/annotation_queue_dto
Observations
- Security concern: `config.yaml` and `cdn.yaml` contain hardcoded credentials (API passwords, S3 access keys). These should be moved to environment variables or a secrets manager.
- Missing module: `dataset-visualiser.py` imports from `preprocessing`, which does not exist in the codebase.
- Duplicate code: `AnnotationClass` and `WeatherMode` are defined in three separate locations: `dto/annotationClass.py`, `inference/dto.py`, and `annotation-queue/annotation_queue_dto.py`.
- Empty files: `dto/annotation_bulk_message.py`, `dto/annotation_message.py`, and `inference/__init__.py` are empty.
- Separate sub-service: `annotation-queue/` has its own `requirements.txt` and `config.yaml`, functioning as an independent service.
- Hardcoded encryption key: `security.py` has a hardcoded model encryption key string.
- No formal test framework: tests are script-based, not using pytest/unittest.
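For the first observation, one possible direction is to read credentials from environment variables and fail fast when any is missing. This is a minimal sketch under stated assumptions: the variable names in `REQUIRED` and the helper `load_secrets` are hypothetical, and the actual keys used in `config.yaml` and `cdn.yaml` are not shown in this document.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not set in the environment."""


def load_secrets(names, environ=None):
    """Read required secrets from the environment; raise if any is unset or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise MissingSecretError(
            "missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}


# Hypothetical variable names; the real configuration keys may differ.
REQUIRED = ("AZAION_API_PASSWORD", "CDN_ACCESS_KEY", "CDN_SECRET_KEY")

# Demo with an explicit mapping so the example runs without real credentials.
demo_env = {name: "example-value" for name in REQUIRED}
secrets = load_secrets(REQUIRED, environ=demo_env)
print("loaded", len(secrets), "secrets")
```

In real use the `environ` argument would be omitted so that values come from `os.environ`. The YAML files would then hold only non-sensitive settings such as URLs and directory paths.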