mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-23 00:26:35 +00:00
Refactor constants management to use Pydantic BaseModel for configuration
- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
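The Pydantic-based structure described above might look like the following minimal sketch. The field names (`root`, `models_dir`, `datasets_dir`) and the module-level singleton are illustrative assumptions; the actual constants.py may organize paths differently.

```python
from pathlib import Path

from pydantic import BaseModel


class Config(BaseModel):
    """Centralized path configuration replacing bare module-level variables.

    Field names are hypothetical; the real constants.py may use others.
    """

    root: Path = Path("/azaion")
    models_dir: Path = Path("/azaion/models")
    datasets_dir: Path = Path("/azaion/datasets")


# A module-level instance that train.py, exports.py, etc. can import.
config = Config()
```

Consumers would then reference `config.models_dir` instead of a loose constant, and Pydantic validates and coerces the values (e.g. `str` to `Path`) at construction time.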
# Restrictions
## Hardware
- Training requires NVIDIA GPU with ≥24GB VRAM (validated: RTX 4090). Batch size 11 consumes ~22GB; batch size 12 exceeds 24GB.
- TensorRT inference requires NVIDIA GPU with TensorRT support. Engine files are GPU-architecture-specific (compiled per compute capability).
- ONNX Runtime inference requires NVIDIA GPU with CUDA support (~6.3GB VRAM for 200s video).
- Edge inference requires RK3588 SoC (OrangePi5).
- Hardware fingerprinting reads CPU model, GPU name, RAM total, and drive serial — requires access to these system properties.
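The fingerprinting step above can be sketched as follows. The function name, the join order, and the choice of SHA-256 are assumptions for illustration; the source does not specify how the four properties are combined.

```python
import hashlib


def hardware_fingerprint(cpu_model: str, gpu_name: str,
                         ram_total_bytes: int, drive_serial: str) -> str:
    """Derive a stable ID from the four system properties listed above.

    Hypothetical sketch: concatenate the properties in a fixed order and
    hash them, so the same machine always yields the same fingerprint.
    """
    blob = "|".join([cpu_model, gpu_name, str(ram_total_bytes), drive_serial])
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Any change to one property (e.g. swapping the drive) produces a different digest, which is the behavior a hardware lock relies on.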
## Software
- Python 3.10+ (uses `match` statements).
- CUDA 12.1 with PyTorch 2.3.0.
- TensorRT runtime for production GPU inference.
- ONNX Runtime with CUDAExecutionProvider for cross-platform inference.
- Albumentations for augmentation transforms.
- boto3 for S3-compatible CDN access.
- rstream for RabbitMQ Streams protocol.
- cryptography library for AES-256-CBC encryption.
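The AES-256-CBC usage noted in the last bullet can be sketched with the cryptography library as below. The function names and the IV-prepended-to-ciphertext layout are assumptions; the project's actual framing of the encrypted model file may differ.

```python
import os

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def encrypt_model(data: bytes, key: bytes) -> bytes:
    """AES-256-CBC: 32-byte key, random 16-byte IV prepended to ciphertext."""
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()  # CBC needs block-aligned input
    padded = padder.update(data) + padder.finalize()
    enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    return iv + enc.update(padded) + enc.finalize()


def decrypt_model(blob: bytes, key: bytes) -> bytes:
    """Reverse of encrypt_model: split off the IV, decrypt, strip padding."""
    iv, ct = blob[:16], blob[16:]
    dec = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = dec.update(ct) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()
```

Because CBC with PKCS7 padding provides no integrity check, a corrupted model file may decrypt to garbage rather than fail loudly; that is a property of the mode, not of this sketch.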
## Environment
- Filesystem paths hardcoded to `/azaion/` root (configurable via `config.yaml`).
- Requires network access to Azaion REST API, S3-compatible CDN, and RabbitMQ instance.
- Configuration files (`config.yaml`, `cdn.yaml`) must be present with valid credentials.
- `classes.json` must be present with the 17 annotation class definitions.
- No containerization — processes run directly on host OS.
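A fail-fast check for the `classes.json` requirement above could look like this sketch. The top-level `{"classes": [...]}` layout and the function name are assumptions; only the 17-class count comes from the restriction itself.

```python
import json
from pathlib import Path

EXPECTED_CLASSES = 17  # per the restriction above


def load_classes(path: Path) -> list:
    """Load annotation class definitions, failing fast on a wrong count.

    Assumes a hypothetical top-level {"classes": [...]} file layout.
    """
    classes = json.loads(path.read_text())["classes"]
    if len(classes) != EXPECTED_CLASSES:
        raise ValueError(
            f"classes.json: expected {EXPECTED_CLASSES} classes, "
            f"got {len(classes)}"
        )
    return classes
```

Validating the count at startup turns a silent label-index mismatch during training into an immediate, explicit error.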
## Operational
- Training duration: ~11.5 days for 360K annotations on a single RTX 4090.
- Augmentation runs as an infinite loop with 5-minute sleep intervals.
- Annotation queue consumer runs as a persistent async process.
- TensorRT engine files are GPU-architecture-specific — must be regenerated when moving to a different GPU.
- Model encryption key is hardcoded — changing it invalidates all previously encrypted models.
- No graceful shutdown mechanism for the augmentation process.
- No reconnection logic for the annotation queue consumer on disconnect.
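The augmentation pattern described above (infinite loop, 5-minute sleep, no graceful shutdown) can be sketched as follows. The `sleep` and `max_passes` parameters are hypothetical test hooks, not part of the real process, which runs unbounded until killed externally.

```python
import time

AUGMENTATION_INTERVAL_S = 300  # the 5-minute sleep noted above


def augmentation_loop(run_pass, sleep=time.sleep, max_passes=None):
    """Run augmentation passes forever, sleeping 5 minutes between them.

    No shutdown signal is checked, matching the "no graceful shutdown"
    restriction; sleep/max_passes exist here only to make the sketch
    testable.
    """
    done = 0
    while max_passes is None or done < max_passes:
        run_pass()
        sleep(AUGMENTATION_INTERVAL_S)
        done += 1
```

A process built this way can only be stopped with a signal that kills it mid-pass, which is exactly the operational caveat the bullet list records.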