# Acceptance Criteria

## Training

- Dataset split: 70% train, 20% validation, 10% test (hardcoded in train.py).
- Training parameters: YOLOv11 medium, 120 epochs, batch size 11, image size 1280px, save_period=1.
- Corrupted labels (bounding box coordinates > 1.0) are filtered to `/azaion/data-corrupted/`.
- Model export to ONNX: 1280px resolution, batch size 4, NMS baked in.
- Trained model encrypted with AES-256-CBC before upload.
- Encrypted model split: small part (≤3KB or 20% of total) → API server; remainder → CDN.
- Post-training: model uploaded to both API and CDN endpoints.
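The 70/20/10 split above can be sketched as follows. This is a hypothetical helper for illustration only; the function name, seed, and boundary arithmetic are assumptions, not the actual code in train.py.

```python
import random

def split_dataset(paths, seed=42):
    """Shuffle file paths and split them 70% / 20% / 10%
    into train / validation / test sets (illustrative sketch)."""
    rng = random.Random(seed)
    shuffled = list(paths)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * 0.7)
    n_val = int(n * 0.2)
    return (
        shuffled[:n_train],                 # 70% train
        shuffled[n_train:n_train + n_val],  # 20% validation
        shuffled[n_train + n_val:],         # 10% test (remainder)
    )
```

Taking the test set as the remainder guarantees no image is dropped when the dataset size is not divisible by 10.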
## Augmentation

- Each validated image produces exactly 8 outputs (1 original + 7 augmented variants).
- Augmentation runs every 5 minutes, processing only unprocessed images.
- Bounding boxes clipped to [0, 1] range; boxes with area < 0.01% of the image are discarded.
- Processing is parallelized per image using ThreadPoolExecutor.
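The clip-and-filter rule above can be sketched as follows, assuming boxes in normalized YOLO center format `(class, cx, cy, w, h)`. The function name and exact structure are assumptions; the real logic in augmentation.py may differ.

```python
def clip_and_filter_boxes(boxes, min_area=0.0001):
    """Clip normalized boxes to [0, 1] and drop boxes whose clipped
    area falls below 0.01% of the image (illustrative sketch)."""
    kept = []
    for cls, cx, cy, w, h in boxes:
        # convert center/size to corners, clip to the image, convert back
        x1 = max(0.0, cx - w / 2)
        y1 = max(0.0, cy - h / 2)
        x2 = min(1.0, cx + w / 2)
        y2 = min(1.0, cy + h / 2)
        nw, nh = x2 - x1, y2 - y1
        if nw <= 0 or nh <= 0 or nw * nh < min_area:
            continue  # degenerate or too small after clipping
        kept.append((cls, (x1 + x2) / 2, (y1 + y2) / 2, nw, nh))
    return kept
```

Clipping before the area check matters: a large box pushed mostly outside the image by an augmentation can shrink below the 0.01% threshold and should then be discarded.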
## Annotation Ingestion

- Created/Edited annotations from Validators/Admins → `/azaion/data/`.
- Created/Edited annotations from Operators → `/azaion/data-seed/`.
- Validated (bulk) events → move from `/data-seed/` to `/data/`.
- Deleted (bulk) events → move to `/data_deleted/`.
- Queue consumer offset persisted to `offset.yaml` after each message.
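The routing rules above can be summarized in one dispatch function. This is a sketch using the paths and event names from this document; the event/role spellings and the consumer's actual dispatch structure are assumptions.

```python
def route_annotation(event, role):
    """Map a queue event plus the author's role to a destination
    directory, following the ingestion rules (illustrative sketch)."""
    if event in ("created", "edited"):
        # Validators/Admins write straight to the training set;
        # Operators land in the seed area pending validation.
        return "/azaion/data/" if role in ("validator", "admin") else "/azaion/data-seed/"
    if event == "validated":
        return "/azaion/data/"          # bulk move out of data-seed
    if event == "deleted":
        return "/azaion/data_deleted/"  # bulk move to the deleted store
    raise ValueError(f"unknown event: {event}")
```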
## Inference

- TensorRT inference: ~54s for a 200s video, ~3.7GB VRAM.
- ONNX inference: ~81s for a 200s video, ~6.3GB VRAM.
- Frame sampling: every 4th frame.
- Batch size: 4 (for both ONNX and TensorRT).
- Confidence threshold: 0.3 (hardcoded in inference/inference.py).
- NMS IoU threshold: 0.3 (hardcoded in inference/inference.py).
- Overlapping detection removal: of two detections with IoU > 0.3, the lower-confidence one is removed.
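The overlap-removal rule above amounts to greedy suppression at IoU 0.3: keep detections in decreasing confidence order, dropping any that overlap an already-kept box. This is a sketch with an assumed detection tuple `(x1, y1, x2, y2, conf)`; inference/inference.py may implement it differently.

```python
def remove_overlaps(dets, iou_thr=0.3):
    """Greedy suppression: keep the higher-confidence detection when
    two boxes have IoU > iou_thr (illustrative sketch)."""
    def iou(a, b):
        # intersection rectangle
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
        inter = iw * ih
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter) if inter else 0.0

    kept = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) <= iou_thr for k in kept):
            kept.append(det)
    return kept
```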
## Security

- API authentication via JWT (email/password login).
- Model encryption: AES-256-CBC with a static key.
- Resource encryption: AES-256-CBC with a hardware-derived key (hash of CPU + GPU + RAM + drive serial identifiers).
- CDN access: separate read/write S3 credentials.
- Split-model storage: prevents model theft if a single storage location is compromised.
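The hardware-derived key can be sketched as a SHA-256 hash over the concatenated identifiers, which yields exactly the 32 bytes AES-256 needs. The identifier sources, separator, and hash choice here are assumptions; only "CPU+GPU+RAM+drive serial hash" is stated in this document.

```python
import hashlib

def derive_resource_key(cpu_id, gpu_id, ram_id, drive_serial):
    """Derive a 32-byte AES-256 key by hashing concatenated hardware
    identifiers (illustrative sketch; real derivation may differ)."""
    material = "|".join([cpu_id, gpu_id, ram_id, drive_serial]).encode()
    return hashlib.sha256(material).digest()  # 32 bytes -> AES-256 key
```

Because the key is recomputed from the machine's own hardware at runtime, encrypted resources copied to different hardware cannot be decrypted.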
## Data Format

- Annotation format: YOLO (`class_id center_x center_y width height`, all normalized 0–1).
- 17 base annotation classes × 3 weather modes = 51 active classes (80 total slots).
- Image format: JPEG.
- Queue message format: msgpack with positional integer keys.
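A YOLO annotation line can be parsed and range-checked as below, mirroring the corrupted-label rule under Training (coordinates > 1.0 are invalid; negatives are rejected here too for safety). The function name and the decision to raise on bad values are assumptions for illustration.

```python
def parse_yolo_line(line):
    """Parse one YOLO annotation line: 'class_id cx cy w h', with all
    coordinates normalized to 0-1 (illustrative sketch)."""
    parts = line.split()
    cls = int(parts[0])
    cx, cy, w, h = map(float, parts[1:5])
    for v in (cx, cy, w, h):
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"coordinate out of range: {v}")
    return cls, cx, cy, w, h
```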