Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
Oleksandr Bezdieniezhnykh
2026-03-27 18:18:30 +02:00
parent b68c07b540
commit 142c6c4de8
106 changed files with 5706 additions and 654 deletions
@@ -0,0 +1,47 @@
# Input Data Parameters
## Annotation Images
- **Format**: JPEG
- **Naming**: UUID-based (`{uuid}.jpg`)
- **Source**: Azaion annotation platform via RabbitMQ Streams
- **Volume**: More than 360K annotations, as noted in training comments
- **Delivery**: Real-time streaming via annotation queue consumer
## Annotation Labels
- **Format**: YOLO text format (one detection per line)
- **Schema**: `{class_id} {center_x} {center_y} {width} {height}`
- **Coordinate system**: All values normalized to [0, 1] relative to image dimensions
- **Constraints**: Coordinates must be in [0, 1]; labels with coords > 1.0 are treated as corrupted
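The label schema and the range constraint can be illustrated with a small parser. This is a minimal sketch, not the project's actual code: `Detection`, `parse_label_line`, `parse_label_file`, and `is_corrupted` are hypothetical names, and treating negative values as corrupted is an assumption that follows from the [0, 1] constraint rather than from the stated `> 1.0` check.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """One YOLO detection; all coordinates are normalized to [0, 1]."""
    class_id: int
    center_x: float
    center_y: float
    width: float
    height: float


def parse_label_line(line: str) -> Detection:
    """Parse one `{class_id} {center_x} {center_y} {width} {height}` line."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    coords = [float(p) for p in parts[1:]]
    return Detection(class_id, *coords)


def is_corrupted(det: Detection) -> bool:
    """Flag a detection whose coordinates fall outside [0, 1].

    Assumption: values below 0 are rejected as well as values above 1.0.
    """
    return any(not (0.0 <= value <= 1.0) for value in det[1:])


def parse_label_file(text: str) -> List[Detection]:
    """Parse a whole label file, skipping blank lines."""
    return [parse_label_line(line) for line in text.splitlines() if line.strip()]
```

A caller would typically drop or quarantine any label file in which `is_corrupted` returns `True` for at least one detection.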
## Annotation Classes
- **Source file**: `classes.json` (static, 17 entries)
- **Schema per class**: `{ Id: int, Name: str, ShortName: str, Color: hex_str }`
- **Classes**: ArmorVehicle, Truck, Vehicle, Artillery, Shadow, Trenches, MilitaryMan, TyreTracks, AdditArmoredTank, Smoke, Plane, Moto, CamouflageNet, CamouflageBranches, Roof, Building, Caponier
- **Weather expansion**: Each class × 3 modes (Norm offset 0, Wint offset 20, Night offset 40)
- **Total class IDs**: 80 slots (51 used, 29 reserved as placeholders)
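The arithmetic behind the weather expansion can be sketched as follows. This is illustrative only: it assumes the base `Id` values run 0 to 16 in the order listed above (the source does not state the IDs), and the `Name(Mode)` labels, `expanded_class_id`, and `build_class_table` are hypothetical.

```python
from typing import Dict

# Base class names in the order listed above; IDs 0..16 are an assumption.
BASE_CLASSES = [
    "ArmorVehicle", "Truck", "Vehicle", "Artillery", "Shadow", "Trenches",
    "MilitaryMan", "TyreTracks", "AdditArmoredTank", "Smoke", "Plane", "Moto",
    "CamouflageNet", "CamouflageBranches", "Roof", "Building", "Caponier",
]

# Offsets per weather mode, as given in the spec.
WEATHER_OFFSETS = {"Norm": 0, "Wint": 20, "Night": 40}

TOTAL_SLOTS = 80


def expanded_class_id(base_id: int, mode: str) -> int:
    """Return the class ID for a base class in a given weather mode."""
    return base_id + WEATHER_OFFSETS[mode]


def build_class_table() -> Dict[int, str]:
    """Map every used class ID to a (hypothetical) display name."""
    table = {}
    for mode in WEATHER_OFFSETS:
        for base_id, name in enumerate(BASE_CLASSES):
            table[expanded_class_id(base_id, mode)] = f"{name}({mode})"
    return table
```

With 17 base classes and 3 modes, the table has 17 × 3 = 51 entries, which leaves 80 − 51 = 29 placeholder slots.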
## Queue Messages
- **Protocol**: AMQP via RabbitMQ Streams (rstream library)
- **Serialization**: msgpack with positional integer keys
- **Message types**: AnnotationMessage (single), AnnotationBulkMessage (batch validate/delete)
- **Fields**: createdDate, name, originalMediaName, time, imageExtension, detections (JSON string), image (raw bytes), createdRole, createdEmail, source, status
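"Positional integer keys" means the decoded payload is keyed by field position rather than by field name. The sketch below shows one way to turn such a payload back into named fields after msgpack decoding. It assumes that key `i` corresponds to the `i`-th field in the list above, which the source does not confirm; `FIELD_ORDER` and `name_fields` are hypothetical.

```python
import json
from typing import Any, Dict

# Assumed positional order: key 0 -> createdDate, key 1 -> name, and so on.
FIELD_ORDER = [
    "createdDate", "name", "originalMediaName", "time", "imageExtension",
    "detections", "image", "createdRole", "createdEmail", "source", "status",
]


def name_fields(payload: Dict[int, Any]) -> Dict[str, Any]:
    """Convert an integer-keyed payload (already msgpack-decoded) to named fields."""
    message = {
        FIELD_ORDER[key]: value
        for key, value in payload.items()
        if 0 <= key < len(FIELD_ORDER)  # ignore unknown positions
    }
    # `detections` arrives as a JSON string and needs a second decoding step.
    if isinstance(message.get("detections"), str):
        message["detections"] = json.loads(message["detections"])
    return message
```

The `image` field stays as raw bytes, so it can be written directly to a `{uuid}.jpg` file.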
## Configuration Files
| File | Format | Key Contents |
|------|--------|-------------|
| `config.yaml` | YAML | API URL, email, password, queue host/port/username/password, directory paths |
| `cdn.yaml` | YAML | CDN endpoint, read access key/secret, write access key/secret, bucket name |
| `classes.json` | JSON | Annotation class definitions array |
| `checkpoint.txt` | Plain text | Last training run timestamp |
| `offset.yaml` | YAML | Queue consumer offset for resume |
## Video Input (Inference)
- **Format**: Any OpenCV-supported video format
- **Processing**: Every 4th frame sampled, batched in groups of 4
- **Resolution**: Resized to model input size (1280×1280) during preprocessing
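The sampling and batching rule can be sketched independently of OpenCV. This is a minimal illustration, assuming sampling starts at frame 0 and a trailing partial batch is still emitted; neither detail is stated above, and `sample_and_batch` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def sample_and_batch(
    frames: Iterable[T], step: int = 4, batch_size: int = 4
) -> Iterator[List[T]]:
    """Keep every `step`-th frame and yield the kept frames in batches."""
    batch: List[T] = []
    for index, frame in enumerate(frames):
        if index % step != 0:
            continue  # skip frames that are not sampled
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Assumption: the final, possibly shorter, batch is not dropped.
        yield batch
```

In practice `frames` would be an iterator over decoded video frames, with each frame resized to 1280×1280 before the batch is passed to the model.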