# System Flows

## Flow 1: Annotation Ingestion (Annotation Queue → Filesystem)

```mermaid
sequenceDiagram
    participant RMQ as RabbitMQ Streams
    participant AQH as AnnotationQueueHandler
    participant FS as Filesystem

    RMQ->>AQH: AMQP message (msgpack)
    AQH->>AQH: Decode message, read AnnotationStatus

    alt Created / Edited
        AQH->>AQH: Parse AnnotationMessage (image + detections)
        alt Validator / Admin role
            AQH->>FS: Write label → /data/labels/{name}.txt
            AQH->>FS: Write image → /data/images/{name}.jpg
        else Operator role
            AQH->>FS: Write label → /data-seed/labels/{name}.txt
            AQH->>FS: Write image → /data-seed/images/{name}.jpg
        end
    else Validated (bulk)
        AQH->>FS: Move images+labels from /data-seed/ → /data/
    else Deleted (bulk)
        AQH->>FS: Move images+labels → /data_deleted/
    end

    AQH->>FS: Persist offset to offset.yaml
```

### Data Flow Table

| Step | Input | Output | Component |
|------|-------|--------|-----------|
| Receive | AMQP message (msgpack) | AnnotationMessage / AnnotationBulkMessage | Annotation Queue |
| Route | AnnotationStatus header | Dispatch to save/validate/delete | Annotation Queue |
| Save | Image bytes + detection JSON | .jpg + .txt files on disk | Annotation Queue |
| Track | Message context offset | offset.yaml | Annotation Queue |

---

## Flow 2: Data Augmentation

```mermaid
sequenceDiagram
    participant FS as Filesystem (/azaion/data/)
    participant AUG as Augmentator
    participant PFS as Filesystem (/azaion/data-processed/)

    loop Every 5 minutes
        AUG->>FS: Scan /data/images/ for unprocessed files
        AUG->>AUG: Filter out already-processed images
        loop Each unprocessed image (parallel)
            AUG->>FS: Read image + labels
            AUG->>AUG: Correct bounding boxes (clip + filter)
            AUG->>AUG: Generate 7 augmented variants
            AUG->>PFS: Write 8 images (original + 7 augmented)
            AUG->>PFS: Write 8 label files
        end
        AUG->>AUG: Sleep 5 minutes
    end
```

---

## Flow 3: Training Pipeline

```mermaid
sequenceDiagram
    participant PFS as Filesystem (/data-processed/)
    participant TRAIN as train.py
    participant DS as Filesystem (/datasets/)
    participant YOLO as Ultralytics YOLO
    participant API as Azaion API
    participant CDN as S3 CDN

    TRAIN->>PFS: Read all processed images
    TRAIN->>TRAIN: Shuffle, split 70/20/10
    TRAIN->>DS: Copy to train/valid/test folders
    Note over TRAIN: Corrupted labels → /data-corrupted/
    TRAIN->>TRAIN: Generate data.yaml (80 class names)
    TRAIN->>YOLO: Train yolo11m (120 epochs, batch=11, 1280px)
    YOLO-->>TRAIN: Training results + best.pt
    TRAIN->>DS: Copy results to /models/{date}/
    TRAIN->>TRAIN: Copy best.pt → /models/azaion.pt
    TRAIN->>TRAIN: Export .pt → .onnx (1280px, batch=4)
    TRAIN->>TRAIN: Read azaion.onnx bytes
    TRAIN->>TRAIN: Encrypt with model key (AES-256-CBC)
    TRAIN->>TRAIN: Split: small (≤3KB or 20%) + big (rest)
    TRAIN->>API: Upload azaion.onnx.small
    TRAIN->>CDN: Upload azaion.onnx.big
```

---

## Flow 4: Model Download & Inference

```mermaid
sequenceDiagram
    participant INF as start_inference.py
    participant API as Azaion API
    participant CDN as S3 CDN
    participant SEC as Security
    participant TRT as TensorRTEngine
    participant VID as Video File
    participant GUI as OpenCV Window

    INF->>INF: Determine GPU-specific engine filename
    INF->>SEC: Get model encryption key
    INF->>API: Login (JWT)
    INF->>API: Download {engine}.small (encrypted)
    INF->>INF: Read {engine}.big from local disk
    INF->>INF: Reassemble: small + big
    INF->>SEC: Decrypt (AES-256-CBC)
    INF->>TRT: Initialize engine from bytes
    TRT->>TRT: Allocate CUDA memory (input + output)

    loop Video frames
        INF->>VID: Read frame (every 4th)
        INF->>INF: Batch frames to batch_size
        INF->>TRT: Preprocess (blob, normalize, resize)
        TRT->>TRT: CUDA memcpy host→device
        TRT->>TRT: Execute inference (async)
        TRT->>TRT: CUDA memcpy device→host
        INF->>INF: Postprocess (confidence filter + NMS)
        INF->>GUI: Draw bounding boxes + display
    end
```

### Data Flow Table

| Step | Input | Output | Component |
|------|-------|--------|-----------|
| Model resolve | GPU compute capability | Engine filename | Inference |
| Download small | API endpoint + JWT | Encrypted small bytes | API & CDN |
| Load big | Local filesystem | Encrypted big bytes | API & CDN |
| Reassemble | small + big bytes | Full encrypted model | API & CDN |
| Decrypt | Encrypted model + key | Raw TensorRT engine | Security |
| Init engine | Engine bytes | CUDA buffers allocated | Inference |
| Preprocess | Video frame | NCHW float32 blob | Inference |
| Inference | Input blob | Raw detection tensor | Inference |
| Postprocess | Raw tensor | List[Detection] | Inference |
| Visualize | Detections + frame | Annotated frame | Inference |

---

## Flow 5: Model Export (Multi-Format)

```mermaid
flowchart LR
    PT[azaion.pt] -->|export_onnx| ONNX[azaion.onnx]
    PT -->|export_tensorrt| TRT[azaion.engine]
    PT -->|export_rknn| RKNN[azaion.rknn]
    ONNX -->|encrypt + split| UPLOAD[API + CDN upload]
    TRT -->|encrypt + split| UPLOAD
```

| Target Format | Resolution | Batch | Precision | Use Case |
|---------------|------------|-------|-----------|----------|
| ONNX | 1280px | 4 | FP32 | Cross-platform inference |
| TensorRT | auto | 4 | FP16 | Production GPU inference |
| RKNN | auto | auto | auto | OrangePi5 edge device |

---

## Error Scenarios

| Flow | Error | Handling |
|------|-------|----------|
| Annotation ingestion | Malformed message | Caught by on_message exception handler, logged |
| Annotation ingestion | Queue disconnect | Process exits (no reconnect logic) |
| Augmentation | Corrupted image | Caught per-thread, logged, skipped |
| Augmentation | Transform failure | Caught per-variant, logged, fewer augmentations produced |
| Training | Corrupted label (coords > 1.0) | Moved to /data-corrupted/ |
| Training | Power outage | save_period=1 enables resume_training from last epoch |
| API download | 401/403 | Auto-relogin + retry |
| API download | 500 | Printed, no retry |
| Inference | CUDA error | RuntimeError raised |
| CDN upload/download | Any exception | Caught, printed, returns False |
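
The split step in the training pipeline and the reassemble step in the inference flow are mirror images of each other, and a small example may make the byte layout easier to follow. The sketch below is illustrative only. It assumes that "small (≤3KB or 20%)" means the small part is the leading slice of the encrypted bytes, sized at whichever is smaller of 3 KB or 20% of the total, and that reassembly is plain concatenation. The names `split_model`, `reassemble_model`, `SMALL_MAX_BYTES`, and `SMALL_DIVISOR` are hypothetical and do not come from the codebase. Encryption is left out, so the input stands in for the already-encrypted model.

```python
# Illustrative sketch only: names and the exact size rule are assumptions,
# not taken from the actual project code.

SMALL_MAX_BYTES = 3 * 1024  # assumed reading of "<=3KB"
SMALL_DIVISOR = 5           # assumed reading of "20%" (one fifth of the total)


def split_model(encrypted: bytes) -> tuple[bytes, bytes]:
    """Split encrypted model bytes into a (small, big) pair.

    The small part is the leading slice, capped at SMALL_MAX_BYTES or
    one fifth of the payload, whichever is smaller.
    """
    small_size = min(SMALL_MAX_BYTES, len(encrypted) // SMALL_DIVISOR)
    return encrypted[:small_size], encrypted[small_size:]


def reassemble_model(small: bytes, big: bytes) -> bytes:
    """Rebuild the full encrypted payload by concatenating small + big."""
    return small + big


if __name__ == "__main__":
    payload = bytes(range(256)) * 100  # stand-in for encrypted model bytes
    small, big = split_model(payload)
    print(len(small), len(big))
    print(reassemble_model(small, big) == payload)
```

Under these assumptions, neither part is usable on its own: the `.small` file comes from the API after login and the `.big` file is read from local disk, so decryption can only start once both have been joined.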