ai-training/_docs/02_document/modules/augmentation.md
Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration
- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-03-27 18:18:30 +02:00

# Module: augmentation
## Purpose
Image augmentation pipeline that takes raw annotated images and produces multiple augmented variants for training data expansion. When run as a script, it repeats in an endless loop, sleeping 5 minutes between rounds.
## Public Interface
### Augmentator
| Method | Signature | Returns | Description |
|--------|-----------|---------|-------------|
| `__init__` | `()` | — | Initializes augmentation transforms and counters |
| `augment_annotations` | `(from_scratch: bool = False)` | — | Processes all unprocessed images from `data/images` into `data-processed/images` |
| `augment_annotation` | `(image_file)` | — | Processes a single image file: reads image + labels, augments, saves results |
| `augment_inner` | `(img_ann: ImageLabel) -> list[ImageLabel]` | List of augmented images | Generates 1 original + 7 augmented variants |
| `correct_bboxes` | `(labels) -> list` | Corrected labels | Clips bounding boxes to image boundaries, removes tiny boxes |
| `read_labels` | `(labels_path) -> list[list]` | Parsed YOLO labels | Reads YOLO-format label file into list of [x, y, w, h, class_id] |
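
As a rough illustration of the `read_labels` contract in the table above, the sketch below parses YOLO-format text (one `class_id x_center y_center width height` row per box, the standard YOLO layout) into `[x, y, w, h, class_id]` lists. The name `parse_yolo_labels` and the choice to accept lines rather than a path are assumptions made for illustration; the module's actual implementation may differ.

```python
def parse_yolo_labels(lines):
    """Parse YOLO label rows into [x, y, w, h, class_id] lists.

    Hypothetical helper: the real read_labels takes a file path.
    """
    labels = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed rows
        class_id = int(parts[0])
        x, y, w, h = (float(value) for value in parts[1:])
        # Coordinates first, class id last, as described in the table.
        labels.append([x, y, w, h, class_id])
    return labels


sample = ["0 0.5 0.5 0.2 0.1", "", "3 0.25 0.75 0.1 0.1"]
print(parse_yolo_labels(sample))
```

Moving the class id to the end keeps the four coordinates together, which is the shape bounding-box transforms typically expect.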
## Internal Logic
- **Augmentation pipeline** (albumentations Compose):
1. HorizontalFlip (p=0.6)
2. RandomBrightnessContrast (p=0.4)
3. Affine: scale 0.8–1.2, rotate ±35°, shear ±10° (p=0.8)
4. MotionBlur (p=0.1)
5. HueSaturationValue (p=0.4)
- Each image produces **8 outputs**: 1 original copy + 7 augmented variants
- Naming: `{stem}_{1..7}.jpg` for augmented, original keeps its name
- **Bbox correction**: clips bounding boxes that extend outside image borders, removes boxes smaller than `correct_min_bbox_size` (0.01 of image dimension)
- **Incremental processing**: skips images already present in `processed_images_dir`
- **Concurrent**: uses `ThreadPoolExecutor` for parallel processing
- **Continuous mode**: `__main__` runs augmentation in an infinite loop with 5-minute sleep between rounds
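
The bbox correction rule above can be sketched as follows, assuming labels are normalized YOLO boxes in `[x_center, y_center, width, height, class_id]` form. The function name `clip_yolo_boxes` and the exact clipping arithmetic are assumptions for illustration; only the 0.01 threshold comes from the description of `correct_min_bbox_size`.

```python
MIN_BBOX_SIZE = 0.01  # mirrors correct_min_bbox_size from the text


def clip_yolo_boxes(labels, min_size=MIN_BBOX_SIZE):
    """Clip normalized boxes to the image and drop tiny ones (hypothetical)."""
    corrected = []
    for x, y, w, h, class_id in labels:
        # Convert center/size to edges and clamp each edge to [0, 1].
        left = max(0.0, x - w / 2)
        right = min(1.0, x + w / 2)
        top = max(0.0, y - h / 2)
        bottom = min(1.0, y + h / 2)
        new_w = right - left
        new_h = bottom - top
        # Boxes that end up too small (or entirely outside) are removed.
        if new_w < min_size or new_h < min_size:
            continue
        corrected.append(
            [(left + right) / 2, (top + bottom) / 2, new_w, new_h, class_id]
        )
    return corrected


boxes = [[0.95, 0.5, 0.2, 0.2, 1], [0.5, 0.5, 0.005, 0.005, 0]]
print(clip_yolo_boxes(boxes))
```

In this sketch the first box is trimmed at the right border and the second is dropped for being below the minimum size.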
## Dependencies
- `constants` — directory paths (data_images_dir, data_labels_dir, processed_*)
- `dto/imageLabel` — ImageLabel container class
- `albumentations` (external) — augmentation transforms
- `cv2` (external) — image read/write
- `numpy` (external) — image array handling
- `concurrent.futures`, `os`, `shutil`, `time`, `datetime`, `pathlib` (stdlib)
## Consumers
manual_run
## Data Models
Uses `ImageLabel` from `dto/imageLabel`.
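
To make the "1 original + 7 augmented variants" contract of `augment_inner` concrete, here is a minimal sketch. The `ImageLabelSketch` fields, the `expand_variants` name, the `transform` callable, and the `{stem}_{i}.txt` label file naming are all assumptions for illustration; the real `ImageLabel` class may carry different fields. Only the `{stem}_{1..7}.jpg` image naming comes from this document.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ImageLabelSketch:
    """Hypothetical stand-in for dto/imageLabel.ImageLabel."""
    image_path: str
    image: object  # in practice, presumably a numpy array read via cv2
    labels_path: str
    labels: list = field(default_factory=list)


def expand_variants(original, transform, variants=7):
    """Return the original followed by `variants` augmented copies."""
    image_file = Path(original.image_path)
    label_file = Path(original.labels_path)
    results = [original]  # the original keeps its name
    for index in range(1, variants + 1):
        image, labels = transform(original.image, original.labels)
        results.append(ImageLabelSketch(
            image_path=str(image_file.with_name(f"{image_file.stem}_{index}.jpg")),
            image=image,
            labels_path=str(label_file.with_name(f"{label_file.stem}_{index}.txt")),
            labels=labels,
        ))
    return results


source = ImageLabelSketch("images/frame.jpg", "pixels", "labels/frame.txt")
outputs = expand_variants(source, lambda img, lbl: (img, list(lbl)))
print([Path(item.image_path).name for item in outputs])
```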
## Configuration
Hardcoded augmentation parameters (probabilities, ranges). Directory paths from `constants`.
## External Integrations
Filesystem I/O: reads from `/azaion/data/`, writes to `/azaion/data-processed/`.
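
The incremental and concurrent behaviour described under Internal Logic could look roughly like the sketch below. It assumes an image counts as processed when a file with the same name already exists in the output directory; the function names, the `handler` callable, and the worker count are hypothetical and not taken from the module.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pending_images(source_dir, processed_dir):
    """Return source files with no same-named file in processed_dir."""
    done = {path.name for path in Path(processed_dir).iterdir()}
    return sorted(
        path for path in Path(source_dir).iterdir()
        if path.is_file() and path.name not in done
    )


def process_pending(source_dir, processed_dir, handler, max_workers=4):
    """Run handler on every pending image using a thread pool."""
    todo = pending_images(source_dir, processed_dir)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        list(pool.map(handler, todo))
    return todo
```

A call such as `process_pending(images_dir, processed_dir, augmentator.augment_annotation)` would then touch only new images on each round, with the directory arguments supplied by the caller.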
## Security
None.
## Tests
None.