mirror of https://github.com/azaion/ai-training.git synced 2026-04-22 21:46:35 +00:00

Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.

2026-03-27 18:18:30 +02:00

3.2 KiB

Module: train

Purpose

Main training pipeline. Forms YOLO datasets from processed annotations, trains YOLOv11 models, and exports/uploads the trained model.

Public Interface

Function	Signature	Returns	Description
`form_dataset`	`()`	—	Creates train/valid/test split from processed images
`copy_annotations`	`(images, folder: str)`	—	Copies image+label pairs to a dataset split folder (concurrent)
`check_label`	`(label_path: str) -> bool`	bool	Validates YOLO label file (all coords ≤ 1.0)
`create_yaml`	`()`	—	Generates YOLO `data.yaml` with class names from `classes.json`
`resume_training`	`(last_pt_path: str)`	—	Resumes training from a checkpoint
`train_dataset`	`()`	—	Full pipeline: form_dataset → create_yaml → train YOLOv11 → save model
`export_current_model`	`()`	—	Exports current .pt to ONNX, encrypts, uploads as split resource
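As an illustration of the `check_label` contract in the table above, here is a minimal sketch, not the module's actual code. It assumes the standard YOLO label layout (one object per line: a class id followed by normalized coordinates). Treating a missing file or an unparsable value as invalid is an assumption made here.

```python
from pathlib import Path


def check_label(label_path: str) -> bool:
    """Return True if every coordinate in a YOLO label file is <= 1.0."""
    path = Path(label_path)
    if not path.exists():
        return False  # assumption: a missing label file counts as invalid

    for line in path.read_text().splitlines():
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        # parts[0] is the class id; the remaining fields are normalized coords
        try:
            coords = [float(value) for value in parts[1:]]
        except ValueError:
            return False  # assumption: unparsable values count as invalid
        if any(coord > 1.0 for coord in coords):
            return False
    return True
```

A caller could use the boolean to decide whether an image+label pair goes into a split folder or into the corrupted-data directory.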

Internal Logic

Dataset formation: Shuffles all processed images, splits 70/20/10 (train/valid/test). Copies in parallel via ThreadPoolExecutor. Corrupted labels (coords > 1.0) are moved to /azaion/data-corrupted/.
YAML generation: Reads annotation classes from classes.json, builds data.yaml with 80 class names (17 actual + 63 placeholders "Class-N"), sets train/valid/test paths.
Training: YOLOv11 medium (yolo11m.yaml), 120 epochs, batch=11 (tuned for 24 GB VRAM), 1280px input, checkpoint saved every epoch, 24 dataloader workers.
Post-training: Copies results to /azaion/models/{date}/, removes intermediate epoch checkpoints, copies best.pt to CURRENT_PT_MODEL.
Export: Calls export_onnx, reads the ONNX file, encrypts with model key, uploads via upload_big_small_resource.
Dataset naming: azaion-{YYYY-MM-DD} using current date.
__main__: Runs train_dataset() then export_current_model().
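The dataset formation and naming steps above can be sketched as follows. This is a hedged illustration, not the module's implementation: the helper names `split_images` and `dataset_name` and the `seed` parameter are hypothetical, and giving the test split whatever remains after the train and valid slices is an assumption about rounding.

```python
import random
from datetime import date

# Split percentages as documented for the module
TRAIN_SET, VALID_SET, TEST_SET = 70, 20, 10


def split_images(images, seed=None):
    """Shuffle the images and split them 70/20/10 into train/valid/test."""
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)  # seed is only for reproducible demos

    train_end = len(shuffled) * TRAIN_SET // 100
    valid_end = train_end + len(shuffled) * VALID_SET // 100
    return {
        "train": shuffled[:train_end],
        "valid": shuffled[train_end:valid_end],
        "test": shuffled[valid_end:],  # assumption: test takes the remainder
    }


def dataset_name(today=None):
    """Build the azaion-{YYYY-MM-DD} dataset name from a date."""
    today = today or date.today()
    return f"azaion-{today:%Y-%m-%d}"
```

Each split list could then be handed to `copy_annotations` along with its target folder.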

Dependencies

constants — all directory/path constants
api_client — ApiClient for model upload
cdn_manager — CDNCredentials, CDNManager (imported but CDN init done via api_client)
dto/annotationClass — AnnotationClass for class name generation
inference/onnx_engine — OnnxEngine (imported but unused in current code)
security — model encryption key
utils — Dotdict
exports — export_tensorrt, upload_model, export_onnx
ultralytics (external) — YOLO training and export
yaml, concurrent.futures, glob, os, random, shutil, subprocess, datetime, pathlib, time (stdlib)

Consumers

manual_run

Data Models

Uses AnnotationClass for class definitions.
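To illustrate how class definitions could feed the 80-entry name list in `data.yaml`, here is a minimal sketch under stated assumptions. The fields of `AnnotationClass` are not documented here, so plain dicts with hypothetical `id` and `name` keys stand in for it. Numbering placeholders by index ("Class-N" with N as the slot index) and the `images` subfolder layout are also assumptions.

```python
DEFAULT_CLASS_NUM = 80  # documented size of the class-name list


def build_class_names(classes, total=DEFAULT_CLASS_NUM):
    """Fill real class names into a list of 'Class-N' placeholders."""
    names = [f"Class-{i}" for i in range(total)]  # assumption: N is the index
    for cls in classes:
        class_id = cls["id"]  # hypothetical field name
        if not 0 <= class_id < total:
            raise ValueError(f"class id {class_id} is outside 0..{total - 1}")
        names[class_id] = cls["name"]  # hypothetical field name
    return names


def build_data_yaml(dataset_dir, classes):
    """Return a dict shaped like an Ultralytics data.yaml."""
    names = build_class_names(classes)
    return {
        "path": dataset_dir,
        "train": "train/images",  # assumption: images/ subfolder per split
        "val": "valid/images",    # Ultralytics uses the key 'val'
        "test": "test/images",
        "nc": len(names),
        "names": names,
    }
```

The resulting dict could be written out with `yaml.safe_dump` from PyYAML; that step is omitted so the sketch needs only the standard library.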

Configuration

Training hyperparameters are hardcoded: epochs=120, batch=11, imgsz=1280, save_period=1, workers=24
Dataset split ratios: train_set=70, valid_set=20, test_set=10
old_images_percentage=75 (declared but unused)
DEFAULT_CLASS_NUM=80
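For reference, the hardcoded hyperparameters map onto an Ultralytics training call roughly as sketched below. The helper names `build_train_args` and `train` are hypothetical, and this is not the module's actual code. `ultralytics` is imported inside the function so the argument helper can be used without the package installed.

```python
# Hyperparameters as documented for the module
TRAIN_ARGS = {
    "epochs": 120,
    "batch": 11,
    "imgsz": 1280,
    "save_period": 1,  # save a checkpoint every epoch
    "workers": 24,     # dataloader workers
}


def build_train_args(data_yaml, **overrides):
    """Merge the documented defaults with optional overrides."""
    return {"data": data_yaml, **TRAIN_ARGS, **overrides}


def train(data_yaml, model_cfg="yolo11m.yaml", **overrides):
    """Train a model built from a YAML architecture file."""
    from ultralytics import YOLO  # lazy import: only needed when training

    model = YOLO(model_cfg)
    return model.train(**build_train_args(data_yaml, **overrides))
```

Keeping the defaults in one dict makes a smoke run such as `train("data.yaml", epochs=1)` possible without editing the constants.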

External Integrations

Ultralytics YOLOv11 training pipeline
Azaion API + CDN for model upload
Filesystem: /azaion/datasets/, /azaion/models/, /azaion/data-processed/, /azaion/data-corrupted/

Security

Trained models are encrypted before upload
Uses Security.get_model_encryption_key() for encryption

Tests

None.
