[AZ-165] [AZ-166] [AZ-167] [AZ-168] [AZ-169] Complete refactoring: delete dead augmentation.py, move tasks to done

- Delete src/augmentation.py (dead code with broken processed_dir refs after AZ-168)
- Remove dead Augmentator import from manual_run.py
- Move all 5 refactoring tasks from todo/ to done/
- Update autopilot state: Step 7 Refactor complete, advance to Step 8 New Task
- Strengthen tracker.mdc: NEVER use ADO MCP

Made-with: Cursor
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-03-28 16:51:14 +02:00
parent cd04f282d0
commit 1e139d7533
9 changed files with 6 additions and 159 deletions
@@ -0,0 +1,54 @@
# Unify Configuration
**Task**: AZ-165_refactor_unify_config
**Name**: Unify configuration — remove annotation-queue/config.yaml
**Description**: Consolidate two config files into one shared Config model
**Complexity**: 3 points
**Dependencies**: None
**Component**: Configuration
**Tracker**: AZ-165
**Epic**: AZ-164
## Problem
Two separate `config.yaml` files exist (root and `src/annotation-queue/`) with overlapping content but different `dirs` values. The annotation queue handler parses YAML manually instead of using the shared `Config` Pydantic model, creating drift risk.
## Outcome
- Single `Config` model in `constants.py` covers all configuration including queue settings
- `annotation_queue_handler.py` uses the shared `Config` instead of parsing its own YAML
- `src/annotation-queue/config.yaml` is deleted
## Scope
### Included
- Add Pydantic models for `ApiConfig`, `QueueConfig`; extend `DirsConfig` with all directory fields (data, data_seed, data_processed, data_deleted, images, labels)
- Add these to the `Config` Pydantic model in `constants.py`
- Refactor `annotation_queue_handler.py` constructor to accept/import the shared Pydantic `Config`
- Delete `src/annotation-queue/config.yaml`
### Excluded
- Changing queue connection logic or message handling
- Modifying the root `config.yaml` structure (it already contains the needed `queue` section)
## Acceptance Criteria
**AC-1: Single config source**
Given the root `config.yaml` contains queue and dirs settings
When `annotation_queue_handler.py` initializes
Then it reads configuration from the shared `Config` model, not a local YAML file
**AC-2: No duplicate config file**
Given the refactoring is complete
When listing `src/annotation-queue/`
Then `config.yaml` does not exist
**AC-3: Annotation queue behavior preserved**
Given the unified configuration
When the annotation queue handler processes messages
Then it uses the correct directory paths from configuration
## Constraints
- Root `config.yaml` already has the `queue` section — reuse it
- `annotation_queue_handler.py` runs as a separate process — config import path must work from its working directory
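The shape of the unified model can be sketched as follows. This is a minimal illustration assuming Pydantic; the individual fields inside `ApiConfig` and `QueueConfig` are hypothetical placeholders (the task only names the models and the `DirsConfig` fields), so the real project's fields will differ:

```python
from pydantic import BaseModel


class ApiConfig(BaseModel):
    host: str = "localhost"  # hypothetical field, for illustration only
    port: int = 8000         # hypothetical field, for illustration only


class QueueConfig(BaseModel):
    name: str = "annotations"  # hypothetical queue setting, for illustration only
    prefetch: int = 1          # hypothetical queue setting, for illustration only


class DirsConfig(BaseModel):
    # directory fields named in the task scope above
    data: str = "data"
    data_seed: str = "data/seed"
    data_processed: str = "data/processed"
    data_deleted: str = "data/deleted"
    images: str = "images"
    labels: str = "labels"


class Config(BaseModel):
    api: ApiConfig = ApiConfig()
    queue: QueueConfig = QueueConfig()
    dirs: DirsConfig = DirsConfig()


# the handler then receives the shared model instead of parsing its own YAML,
# e.g. constructed from the parsed root config.yaml dict:
config = Config(queue={"name": "annotations"})
```

With this in place, `annotation_queue_handler.py` only needs to import `constants.config` rather than locating and parsing a sibling YAML file, which removes the drift risk described above.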
@@ -0,0 +1,56 @@
# Update YOLO Model
**Task**: AZ-166_refactor_yolo_model
**Name**: Update YOLO model to 26m variant (supports both from-scratch and pretrained)
**Description**: Update model references from YOLO11m to YOLO26m; support both training from scratch (`.yaml`) and from pretrained weights (`.pt`)
**Complexity**: 2 points
**Dependencies**: None
**Component**: Training Pipeline
**Tracker**: AZ-166
**Epic**: AZ-164
## Problem
Current `TrainingConfig.model` is set to `yolo11m.yaml`, which defines a YOLO11 architecture. YOLO26m is the latest model variant. The system should support both training modes:
1. **From scratch** — using `yolo26m.yaml` (architecture definition, trains from random weights)
2. **From pretrained** — using `yolo26m.pt` (pretrained weights, faster convergence)
## Outcome
- `TrainingConfig` default model updated to `yolo26m.pt` (pretrained, recommended default)
- `config.yaml` updated to `yolo26m.pt`
- Both `yolo26m.pt` and `yolo26m.yaml` work when set in `config.yaml`
- `train_dataset()` and `resume_training()` work with either model reference
## Scope
### Included
- Update `TrainingConfig.model` default from `yolo11m.yaml` to `yolo26m.pt`
- Update `config.yaml` training.model from `yolo11m.yaml` to `yolo26m.pt`
- Verify `train_dataset()` works with both `.pt` and `.yaml` model values
### Excluded
- Changing training hyperparameters (epochs, batch, imgsz)
- Updating ultralytics library version
## Acceptance Criteria
**AC-1: Default model config updated**
Given the training configuration
When reading `TrainingConfig.model`
Then the default value is `yolo26m.pt`
**AC-2: config.yaml updated**
Given the root `config.yaml`
When reading `training.model`
Then the value is `yolo26m.pt`
**AC-3: From-scratch training supported**
Given `config.yaml` sets `training.model: yolo26m.yaml`
When `YOLO(constants.config.training.model)` is called
Then a YOLO26m model is built from the architecture definition
**AC-4: Pretrained training supported**
Given `config.yaml` sets `training.model: yolo26m.pt`
When `YOLO(constants.config.training.model)` is called
Then a YOLO26m model is loaded from pretrained weights
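The two modes in AC-3/AC-4 are selected purely by the file suffix of the configured model reference, mirroring how ultralytics treats `.yaml` (build architecture, random weights) versus `.pt` (load pretrained checkpoint). A small stdlib sketch of that dispatch — the helper name is illustrative, not part of the codebase:

```python
from pathlib import Path


def training_mode(model_ref: str) -> str:
    """Classify a YOLO model reference by suffix (illustrative helper).

    - '*.yaml' -> architecture definition, trains from random weights
    - '*.pt'   -> pretrained checkpoint, fine-tunes from learned weights
    """
    suffix = Path(model_ref).suffix
    if suffix == ".yaml":
        return "from-scratch"
    if suffix == ".pt":
        return "pretrained"
    raise ValueError(f"unsupported model reference: {model_ref}")


print(training_mode("yolo26m.pt"))    # pretrained
print(training_mode("yolo26m.yaml"))  # from-scratch
```

Because the dispatch lives in the model reference itself, `train_dataset()` and `resume_training()` need no branching: passing `constants.config.training.model` straight to `YOLO()` covers both acceptance criteria.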
@@ -0,0 +1,55 @@
# Replace External Augmentation with YOLO Built-in
**Task**: AZ-167_refactor_builtin_augmentation
**Name**: Replace external augmentation with YOLO built-in
**Description**: Remove albumentations pipeline and use YOLO model.train() built-in augmentation parameters
**Complexity**: 3 points
**Dependencies**: AZ-166_refactor_yolo_model
**Component**: Training Pipeline
**Tracker**: AZ-167
**Epic**: AZ-164
## Problem
`augmentation.py` uses the `albumentations` library to augment images into a `processed_dir` before training. This creates a separate processing step, uses extra disk space (8x original), and adds complexity. YOLO's built-in augmentation applies on-the-fly during training.
## Outcome
- `train_dataset()` passes augmentation parameters directly to `model.train()`
- Each augmentation parameter is on its own line with a descriptive comment
- The external augmentation step is removed from the training pipeline
- `augmentation.py` is no longer called during training
## Scope
### Included
- Add YOLO built-in augmentation parameters to `model.train()` call in `train_dataset()`
- Parameters to add: hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, fliplr, mosaic (each with comment)
- Remove call to augmentation from training flow
### Excluded
- Deleting `augmentation.py` file (may still be useful standalone)
- Changing training hyperparameters unrelated to augmentation
## Acceptance Criteria
**AC-1: Built-in augmentation parameters with comments**
Given the `train_dataset()` function
When `model.train()` is called
Then every parameter (including augmentation: hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, fliplr, mosaic, and training: data, epochs, batch, imgsz, etc.) is on its own line with an inline comment explaining what the parameter controls
**AC-2: No external augmentation in training flow**
Given the training pipeline
When `train_dataset()` runs
Then it does not call `augment_annotations()` or any albumentations-based augmentation
## Constraints
- Every parameter row in the `model.train()` call MUST have an inline comment describing what it does (e.g. `hsv_h=0.015, # hue shift fraction of the color wheel`)
- This applies to ALL parameters, not just augmentation — training params (data, epochs, batch, imgsz, save_period, workers) also need comments
- Augmentation parameter values should approximate the current albumentations settings:
- fliplr=0.6 (was HorizontalFlip p=0.6)
- degrees=35.0 (was Affine rotate=(-35,35))
- shear=10.0 (was Affine shear=(-10,10))
- hsv_h=0.015, hsv_s=0.7, hsv_v=0.4 (approximate HSV shifts)
- mosaic=1.0 (new YOLO built-in, recommended default)
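The constraint can be illustrated with the augmentation portion of the call, written as a kwargs dict so each value sits on its own commented line. Values come from the mapping above; `translate` and `scale` are not pinned by this task, so the ultralytics library defaults are shown and flagged as assumptions:

```python
# augmentation kwargs for model.train(), one per line with an inline comment
augmentation_kwargs = dict(
    hsv_h=0.015,    # hue shift, fraction of the color wheel
    hsv_s=0.7,      # saturation shift, fraction of full range
    hsv_v=0.4,      # value (brightness) shift, fraction of full range
    degrees=35.0,   # max rotation in degrees (was Affine rotate=(-35, 35))
    translate=0.1,  # translation, fraction of image size (library default, assumed)
    scale=0.5,      # scale gain (library default, assumed; not pinned by this task)
    shear=10.0,     # max shear in degrees (was Affine shear=(-10, 10))
    fliplr=0.6,     # horizontal flip probability (was HorizontalFlip p=0.6)
    mosaic=1.0,     # mosaic augmentation probability (YOLO recommended default)
)

# the actual training call then unpacks these alongside the (also commented)
# training parameters:
# model.train(data=..., epochs=..., batch=..., imgsz=..., **augmentation_kwargs)
```

Whether the real code uses a dict or writes the kwargs inline in `model.train()` is a style choice; the constraint only requires one parameter per line with a descriptive comment.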
@@ -0,0 +1,60 @@
# Remove Processed Directory
**Task**: AZ-168_refactor_remove_processed_dir
**Name**: Remove processed directory — use data dir directly
**Description**: Eliminate processed_dir concept from Config and all consumers; read from data dir directly; update e2e test fixture
**Complexity**: 3 points
**Dependencies**: AZ-167_refactor_builtin_augmentation
**Component**: Training Pipeline, Data Utilities
**Tracker**: AZ-168
**Epic**: AZ-164
## Problem
`Config` exposes `processed_dir`, `processed_images_dir`, `processed_labels_dir` properties. Multiple files read from the processed directory: `train.py::form_dataset()`, `exports.py::form_data_sample()`, `dataset-visualiser.py::visualise_processed_folder()`. With built-in augmentation, the processed directory is no longer populated.
The e2e test fixture (`tests/test_training_e2e.py`) currently copies images to both `data_images_dir` and `processed_images_dir` as a workaround — this needs cleanup once `form_dataset()` reads from data dirs.
## Outcome
- `Config` no longer has `processed_dir`/`processed_images_dir`/`processed_labels_dir` properties
- `form_dataset()` reads images/labels from `data_images_dir`/`data_labels_dir`
- `form_data_sample()` reads from `data_images_dir`
- `visualise_processed_folder()` reads from `data_images_dir`/`data_labels_dir`
- E2e test fixture copies images only to `data_images_dir`/`data_labels_dir` (no more processed dir population)
## Scope
### Included
- Remove `processed_dir`, `processed_images_dir`, `processed_labels_dir` from `Config`
- Update `form_dataset()` in `train.py` to use `data_images_dir` and `data_labels_dir`
- Update `copy_annotations()` in `train.py` to look up labels from `data_labels_dir` instead of `processed_labels_dir`
- Update `form_data_sample()` in `exports.py` to use `data_images_dir`
- Update `visualise_processed_folder()` in `dataset-visualiser.py`
- Update `tests/test_training_e2e.py` e2e fixture: remove processed dir population (only copy to data dirs)
### Excluded
- Removing `augmentation.py` file
- Changing `corrupted_dir` handling
## Acceptance Criteria
**AC-1: No processed dir in Config**
Given the `Config` class
When inspecting its properties
Then `processed_dir`, `processed_images_dir`, `processed_labels_dir` do not exist
**AC-2: Dataset formation reads data dir**
Given images and labels in `data_images_dir` / `data_labels_dir`
When `form_dataset()` runs
Then it reads from `data_images_dir` and validates labels from `data_labels_dir`
**AC-3: Data sample reads data dir**
Given images in `data_images_dir`
When `form_data_sample()` runs
Then it reads from `data_images_dir`
**AC-4: E2e test uses data dirs only**
Given the e2e test fixture
When setting up test data
Then it copies images/labels only to `data_images_dir`/`data_labels_dir` (no processed dir)
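The direction of AC-2 — read images from `data_images_dir`, validate that a matching label exists in `data_labels_dir` — can be sketched as below. The helper name `pair_annotations` is illustrative; the real logic lives inside `form_dataset()`:

```python
from pathlib import Path


def pair_annotations(data_images_dir: Path,
                     data_labels_dir: Path) -> list[tuple[Path, Path]]:
    """Pair each image in data_images_dir with its YOLO label file.

    Illustrative sketch: images without a matching .txt label are skipped,
    which is the validation step AC-2 describes.
    """
    pairs = []
    for image in sorted(data_images_dir.iterdir()):
        if image.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        label = data_labels_dir / f"{image.stem}.txt"
        if label.exists():  # validate labels from data_labels_dir
            pairs.append((image, label))
    return pairs
```

Once `form_dataset()` works this way, the e2e fixture's double copy (to both data and processed dirs) becomes dead setup code, which is why AC-4 removes it.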
@@ -0,0 +1,42 @@
# Use Hard Links for Dataset
**Task**: AZ-169_refactor_hard_symlinks
**Name**: Use hard links instead of file copies for dataset formation
**Description**: Replace shutil.copy() with os.link() in dataset split creation to save disk space
**Complexity**: 2 points
**Dependencies**: AZ-168_refactor_remove_processed_dir
**Component**: Training Pipeline
**Tracker**: AZ-169
**Epic**: AZ-164
## Problem
`copy_annotations()` in `train.py` uses `shutil.copy()` to duplicate images and labels into train/valid/test splits. For large datasets this wastes significant disk space.
## Outcome
- Dataset formation uses `os.link()` (hard links) instead of `shutil.copy()`
- Fallback to `shutil.copy()` when hard links fail (cross-filesystem)
- No change in training behavior — YOLO reads hard-linked files identically
## Scope
### Included
- Replace `shutil.copy()` with `os.link()` in `copy_annotations()` inner `copy_image()` function
- Add try/except fallback to `shutil.copy()` for `OSError` (cross-filesystem)
### Excluded
- Changing `form_data_sample()` in exports.py (separate utility, lower priority)
- Changing corrupted file handling
## Acceptance Criteria
**AC-1: Hard links used**
Given images and labels in the data directory
When `copy_annotations()` creates train/valid/test splits
Then files are hard-linked via `os.link()`, not copied
**AC-2: Fallback on failure**
Given a cross-filesystem scenario where `os.link()` raises `OSError`
When `copy_annotations()` encounters the error
Then it falls back to `shutil.copy()` transparently
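The replacement described above can be sketched as a small helper; the function name `link_or_copy` is illustrative (the real change lands inside the inner `copy_image()` function of `copy_annotations()`):

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> None:
    """Hard-link src to dst, falling back to a plain copy on failure.

    Hard links share the same inode, so a file linked into the
    train/valid/test splits consumes no extra disk space.
    """
    try:
        os.link(src, dst)  # hard link: same inode, no data duplicated
    except OSError:
        # os.link raises OSError when src and dst are on different
        # filesystems (EXDEV) or the filesystem lacks hard-link support
        shutil.copy(src, dst)
```

Because a hard link is indistinguishable from the original file at read time, YOLO's dataset loader needs no changes — which is exactly the "no change in training behavior" outcome above.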