# Problem Statement
## What is this system?
Azaion AI Training is an end-to-end machine learning pipeline for training and deploying object detection models. It detects military and infrastructure objects in aerial and satellite imagery, including vehicles, artillery, personnel, trenches, camouflage, and buildings, under varying weather and lighting conditions.
## What problem does it solve?
Automated detection of military assets and infrastructure from aerial imagery requires:
1. Continuous ingestion of human-annotated training data from the Azaion annotation platform
2. Automated data augmentation to expand limited labeled datasets (8× multiplication)
3. GPU-accelerated model training using state-of-the-art object detection architectures
4. Secure model distribution that prevents model theft and ties deployment to authorized hardware
5. Real-time inference on video feeds with GPU acceleration
6. Edge deployment capability for low-power field devices
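The 8× dataset multiplication mentioned in point 2 can be sketched as a small, fixed set of transforms per original image. One important detail for object detection is that the bounding-box labels must be transformed along with the pixels. The sketch below is a minimal illustration in Python: the particular transforms (flips, 180° rotation, two brightness factors) and the YOLO-style normalized box format are assumptions, not the pipeline's actual configuration.

```python
# Sketch of an 8x augmentation scheme: four geometric variants (identity,
# horizontal flip, vertical flip, 180-degree rotation) crossed with two
# brightness factors (4 x 2 = 8). Boxes are YOLO-style normalized
# (cx, cy, w, h) tuples. All concrete choices here are illustrative.

def hflip_box(box):
    cx, cy, w, h = box
    return (1.0 - cx, cy, w, h)          # mirror around the vertical axis

def vflip_box(box):
    cx, cy, w, h = box
    return (cx, 1.0 - cy, w, h)          # mirror around the horizontal axis

def augment_8x(boxes):
    """Return 8 (variant_name, transformed_boxes) pairs per original label set."""
    geometric = [
        ("orig",   lambda b: b),
        ("hflip",  hflip_box),
        ("vflip",  vflip_box),
        ("rot180", lambda b: vflip_box(hflip_box(b))),  # hflip + vflip = 180 deg
    ]
    out = []
    for name, fn in geometric:
        for factor in (0.8, 1.2):        # brightness jitter leaves boxes unchanged
            out.append((f"{name}_b{factor}", [fn(b) for b in boxes]))
    return out

variants = augment_8x([(0.25, 0.75, 0.1, 0.2)])
print(len(variants))  # 8
```

Photometric transforms (brightness, contrast, noise) leave labels untouched, which is why pairing a few of them with geometric transforms is a cheap way to hit a fixed multiplication factor.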
## Who are the users?
- **Annotators/Operators**: Create annotation data through the Azaion platform. Their annotations flow into the training pipeline via RabbitMQ.
- **Validators/Admins**: Review and approve annotations, promoting them from seed to validated status.
- **ML Engineers**: Configure and run training pipelines, monitor model quality, trigger retraining.
- **Inference Operators**: Deploy and run inference on video feeds using trained models on GPU-equipped machines.
- **Edge Deployment Operators**: Set up and run inference on OrangePi5 edge devices in the field.
## How does it work (high level)?
1. Annotations (images + bounding box labels) arrive via a RabbitMQ stream from the Azaion annotation platform
2. A queue consumer service routes annotations to the filesystem based on user role (operator → seed, validator → validated)
3. An augmentation pipeline continuously processes validated images, producing 8 augmented variants per original
4. A training pipeline assembles datasets (70/20/10 split), trains a YOLOv11 model over ~120 epochs, and exports to ONNX format
5. Trained models are encrypted with AES-256-CBC and split into a small part and a big part, which are uploaded to the Azaion API and to the S3 CDN respectively
6. Inference clients download and reassemble the model, decrypt it using a hardware-bound key, and run real-time detection on video feeds using TensorRT or ONNX Runtime
7. For edge deployment, models are exported to RKNN format for OrangePi5 devices
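The 70/20/10 split in step 4 can be sketched as a deterministic shuffle followed by slicing. The ratios come from the description above; the fixed seed and the "remainder goes to test" convention are assumptions for illustration, not the pipeline's actual logic.

```python
import random

def split_dataset(items, seed=42):
    """Shuffle deterministically, then slice into 70% train / 20% val / 10% test.

    The 70/20/10 ratios match the pipeline description; the seed and the
    convention that the remainder lands in the test split are illustrative.
    """
    items = list(items)
    random.Random(seed).shuffle(items)   # seeded Random keeps the split reproducible
    n = len(items)
    n_train = int(n * 0.7)
    n_val = int(n * 0.2)
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],  # remainder, roughly 10%
    }

splits = split_dataset(range(1000))
print(len(splits["train"]), len(splits["val"]), len(splits["test"]))  # 700 200 100
```

Seeding the shuffle means reruns of the pipeline assign the same images to the same splits, which keeps validation metrics comparable across training runs.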