[AZ-165] [AZ-166] [AZ-167] [AZ-168] [AZ-169] Complete refactoring: delete dead augmentation.py, move tasks to done

- Delete src/augmentation.py (dead code with broken processed_dir refs after AZ-168)
- Remove dead Augmentator import from manual_run.py
- Move all 5 refactoring tasks from todo/ to done/
- Update autopilot state: Step 7 Refactor complete, advance to Step 8 New Task
- Strengthen tracker.mdc: NEVER use ADO MCP

Made-with: Cursor
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-03-28 16:51:14 +02:00
parent cd04f282d0
commit 1e139d7533
9 changed files with 6 additions and 159 deletions
@@ -0,0 +1,54 @@
# Unify Configuration
**Task**: AZ-165_refactor_unify_config
**Name**: Unify configuration — remove annotation-queue/config.yaml
**Description**: Consolidate two config files into one shared Config model
**Complexity**: 3 points
**Dependencies**: None
**Component**: Configuration
**Tracker**: AZ-165
**Epic**: AZ-164
## Problem
Two separate `config.yaml` files exist (root and `src/annotation-queue/`) with overlapping content but different `dirs` values. The annotation queue handler parses YAML manually instead of using the shared `Config` Pydantic model, creating drift risk.
## Outcome
- Single `Config` model in `constants.py` covers all configuration including queue settings
- `annotation_queue_handler.py` uses the shared `Config` instead of parsing its own YAML
- `src/annotation-queue/config.yaml` is deleted
## Scope
### Included
- Add Pydantic models for `ApiConfig`, `QueueConfig`; extend `DirsConfig` with all directory fields (data, data_seed, data_processed, data_deleted, images, labels)
- Add these to the `Config` Pydantic model in `constants.py`
- Refactor `annotation_queue_handler.py` constructor to accept/import the shared Pydantic `Config`
- Delete `src/annotation-queue/config.yaml`
### Excluded
- Changing queue connection logic or message handling
- Modifying the root `config.yaml` structure (it already contains the needed `queue` section)
## Acceptance Criteria
**AC-1: Single config source**
Given the root `config.yaml` contains queue and dirs settings
When `annotation_queue_handler.py` initializes
Then it reads configuration from the shared `Config` model, not a local YAML file
**AC-2: No duplicate config file**
Given the refactoring is complete
When listing `src/annotation-queue/`
Then `config.yaml` does not exist
**AC-3: Annotation queue behavior preserved**
Given the unified configuration
When the annotation queue handler processes messages
Then it uses the correct directory paths from configuration
## Constraints
- Root `config.yaml` already has the `queue` section — reuse it
- `annotation_queue_handler.py` runs as a separate process — config import path must work from its working directory
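The shape of the unified model can be sketched as follows. This is a minimal illustration assuming Pydantic; the individual fields inside `ApiConfig` and `QueueConfig` are hypothetical placeholders (the task only names the models and the `DirsConfig` fields), so the real project's fields will differ:

```python
from pydantic import BaseModel


class ApiConfig(BaseModel):
    host: str = "localhost"  # hypothetical field, for illustration only
    port: int = 8000         # hypothetical field, for illustration only


class QueueConfig(BaseModel):
    name: str = "annotations"  # hypothetical queue setting, for illustration only
    prefetch: int = 1          # hypothetical queue setting, for illustration only


class DirsConfig(BaseModel):
    # directory fields named in the task scope above
    data: str = "data"
    data_seed: str = "data/seed"
    data_processed: str = "data/processed"
    data_deleted: str = "data/deleted"
    images: str = "images"
    labels: str = "labels"


class Config(BaseModel):
    api: ApiConfig = ApiConfig()
    queue: QueueConfig = QueueConfig()
    dirs: DirsConfig = DirsConfig()


# the handler then receives the shared model instead of parsing its own YAML,
# e.g. constructed from the parsed root config.yaml dict:
config = Config(queue={"name": "annotations"})
```

With this in place, `annotation_queue_handler.py` only needs to import `constants.config` rather than locating and parsing a sibling YAML file, which removes the drift risk described above.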
@@ -0,0 +1,56 @@
# Update YOLO Model
**Task**: AZ-166_refactor_yolo_model
**Name**: Update YOLO model to 26m variant (supports both from-scratch and pretrained)
**Description**: Update model references from YOLO11m to YOLO26m; support both training from scratch (`.yaml`) and from pretrained weights (`.pt`)
**Complexity**: 2 points
**Dependencies**: None
**Component**: Training Pipeline
**Tracker**: AZ-166
**Epic**: AZ-164
## Problem
Current `TrainingConfig.model` is set to `yolo11m.yaml`, which defines a YOLO11 architecture. YOLO26m is the latest model variant. The system should support both training modes:
1. **From scratch** — using `yolo26m.yaml` (architecture definition, trains from random weights)
2. **From pretrained** — using `yolo26m.pt` (pretrained weights, faster convergence)
## Outcome
- `TrainingConfig` default model updated to `yolo26m.pt` (pretrained, recommended default)
- `config.yaml` updated to `yolo26m.pt`
- Both `yolo26m.pt` and `yolo26m.yaml` work when set in `config.yaml`
- `train_dataset()` and `resume_training()` work with either model reference
## Scope
### Included
- Update `TrainingConfig.model` default from `yolo11m.yaml` to `yolo26m.pt`
- Update `config.yaml` training.model from `yolo11m.yaml` to `yolo26m.pt`
- Verify `train_dataset()` works with both `.pt` and `.yaml` model values
### Excluded
- Changing training hyperparameters (epochs, batch, imgsz)
- Updating ultralytics library version
## Acceptance Criteria
**AC-1: Default model config updated**
Given the training configuration
When reading `TrainingConfig.model`
Then the default value is `yolo26m.pt`
**AC-2: config.yaml updated**
Given the root `config.yaml`
When reading `training.model`
Then the value is `yolo26m.pt`
**AC-3: From-scratch training supported**
Given `config.yaml` sets `training.model: yolo26m.yaml`
When `YOLO(constants.config.training.model)` is called
Then a YOLO26m model is built from the architecture definition
**AC-4: Pretrained training supported**
Given `config.yaml` sets `training.model: yolo26m.pt`
When `YOLO(constants.config.training.model)` is called
Then a YOLO26m model is loaded from pretrained weights
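The two modes in AC-3/AC-4 are selected purely by the file suffix of the configured model reference, mirroring how ultralytics treats `.yaml` (build architecture, random weights) versus `.pt` (load pretrained checkpoint). A small stdlib sketch of that dispatch — the helper name is illustrative, not part of the codebase:

```python
from pathlib import Path


def training_mode(model_ref: str) -> str:
    """Classify a YOLO model reference by suffix (illustrative helper).

    - '*.yaml' -> architecture definition, trains from random weights
    - '*.pt'   -> pretrained checkpoint, fine-tunes from learned weights
    """
    suffix = Path(model_ref).suffix
    if suffix == ".yaml":
        return "from-scratch"
    if suffix == ".pt":
        return "pretrained"
    raise ValueError(f"unsupported model reference: {model_ref}")


print(training_mode("yolo26m.pt"))    # pretrained
print(training_mode("yolo26m.yaml"))  # from-scratch
```

Because the dispatch lives in the model reference itself, `train_dataset()` and `resume_training()` need no branching: passing `constants.config.training.model` straight to `YOLO()` covers both acceptance criteria.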
@@ -0,0 +1,55 @@
# Replace External Augmentation with YOLO Built-in
**Task**: AZ-167_refactor_builtin_augmentation
**Name**: Replace external augmentation with YOLO built-in
**Description**: Remove albumentations pipeline and use YOLO model.train() built-in augmentation parameters
**Complexity**: 3 points
**Dependencies**: AZ-166_refactor_yolo_model
**Component**: Training Pipeline
**Tracker**: AZ-167
**Epic**: AZ-164
## Problem
`augmentation.py` uses the `albumentations` library to augment images into a `processed_dir` before training. This creates a separate processing step, uses extra disk space (8x original), and adds complexity. YOLO's built-in augmentation applies on-the-fly during training.
## Outcome
- `train_dataset()` passes augmentation parameters directly to `model.train()`
- Each augmentation parameter is on its own line with a descriptive comment
- The external augmentation step is removed from the training pipeline
- `augmentation.py` is no longer called during training
## Scope
### Included
- Add YOLO built-in augmentation parameters to `model.train()` call in `train_dataset()`
- Parameters to add: hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, fliplr, mosaic (each with comment)
- Remove call to augmentation from training flow
### Excluded
- Deleting `augmentation.py` file (may still be useful standalone)
- Changing training hyperparameters unrelated to augmentation
## Acceptance Criteria
**AC-1: Built-in augmentation parameters with comments**
Given the `train_dataset()` function
When `model.train()` is called
Then every parameter (including augmentation: hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, fliplr, mosaic, and training: data, epochs, batch, imgsz, etc.) is on its own line with an inline comment explaining what the parameter controls
**AC-2: No external augmentation in training flow**
Given the training pipeline
When `train_dataset()` runs
Then it does not call `augment_annotations()` or any albumentations-based augmentation
## Constraints
- Every parameter row in the `model.train()` call MUST have an inline comment describing what it does (e.g. `hsv_h=0.015, # hue shift fraction of the color wheel`)
- This applies to ALL parameters, not just augmentation — training params (data, epochs, batch, imgsz, save_period, workers) also need comments
- Augmentation parameter values should approximate the current albumentations settings:
- fliplr=0.6 (was HorizontalFlip p=0.6)
- degrees=35.0 (was Affine rotate=(-35,35))
- shear=10.0 (was Affine shear=(-10,10))
- hsv_h=0.015, hsv_s=0.7, hsv_v=0.4 (approximate HSV shifts)
- mosaic=1.0 (new YOLO built-in, recommended default)
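The constraint can be illustrated with the augmentation portion of the call, written as a kwargs dict so each value sits on its own commented line. Values come from the mapping above; `translate` and `scale` are not pinned by this task, so the ultralytics library defaults are shown and flagged as assumptions:

```python
# augmentation kwargs for model.train(), one per line with an inline comment
augmentation_kwargs = dict(
    hsv_h=0.015,    # hue shift, fraction of the color wheel
    hsv_s=0.7,      # saturation shift, fraction of full range
    hsv_v=0.4,      # value (brightness) shift, fraction of full range
    degrees=35.0,   # max rotation in degrees (was Affine rotate=(-35, 35))
    translate=0.1,  # translation, fraction of image size (library default, assumed)
    scale=0.5,      # scale gain (library default, assumed; not pinned by this task)
    shear=10.0,     # max shear in degrees (was Affine shear=(-10, 10))
    fliplr=0.6,     # horizontal flip probability (was HorizontalFlip p=0.6)
    mosaic=1.0,     # mosaic augmentation probability (YOLO recommended default)
)

# the actual training call then unpacks these alongside the (also commented)
# training parameters:
# model.train(data=..., epochs=..., batch=..., imgsz=..., **augmentation_kwargs)
```

Whether the real code uses a dict or writes the kwargs inline in `model.train()` is a style choice; the constraint only requires one parameter per line with a descriptive comment.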
@@ -0,0 +1,60 @@
# Remove Processed Directory
**Task**: AZ-168_refactor_remove_processed_dir
**Name**: Remove processed directory — use data dir directly
**Description**: Eliminate processed_dir concept from Config and all consumers; read from data dir directly; update e2e test fixture
**Complexity**: 3 points
**Dependencies**: AZ-167_refactor_builtin_augmentation
**Component**: Training Pipeline, Data Utilities
**Tracker**: AZ-168
**Epic**: AZ-164
## Problem
`Config` exposes `processed_dir`, `processed_images_dir`, `processed_labels_dir` properties. Multiple files read from the processed directory: `train.py::form_dataset()`, `exports.py::form_data_sample()`, `dataset-visualiser.py::visualise_processed_folder()`. With built-in augmentation, the processed directory is no longer populated.
The e2e test fixture (`tests/test_training_e2e.py`) currently copies images to both `data_images_dir` and `processed_images_dir` as a workaround — this needs cleanup once `form_dataset()` reads from data dirs.
## Outcome
- `Config` no longer has `processed_dir`/`processed_images_dir`/`processed_labels_dir` properties
- `form_dataset()` reads images/labels from `data_images_dir`/`data_labels_dir`
- `form_data_sample()` reads from `data_images_dir`
- `visualise_processed_folder()` reads from `data_images_dir`/`data_labels_dir`
- E2e test fixture copies images only to `data_images_dir`/`data_labels_dir` (no more processed dir population)
## Scope
### Included
- Remove `processed_dir`, `processed_images_dir`, `processed_labels_dir` from `Config`
- Update `form_dataset()` in `train.py` to use `data_images_dir` and `data_labels_dir`
- Update `copy_annotations()` in `train.py` to look up labels from `data_labels_dir` instead of `processed_labels_dir`
- Update `form_data_sample()` in `exports.py` to use `data_images_dir`
- Update `visualise_processed_folder()` in `dataset-visualiser.py`
- Update `tests/test_training_e2e.py` e2e fixture: remove processed dir population (only copy to data dirs)
### Excluded
- Removing `augmentation.py` file
- Changing `corrupted_dir` handling
## Acceptance Criteria
**AC-1: No processed dir in Config**
Given the `Config` class
When inspecting its properties
Then `processed_dir`, `processed_images_dir`, `processed_labels_dir` do not exist
**AC-2: Dataset formation reads data dir**
Given images and labels in `data_images_dir` / `data_labels_dir`
When `form_dataset()` runs
Then it reads from `data_images_dir` and validates labels from `data_labels_dir`
**AC-3: Data sample reads data dir**
Given images in `data_images_dir`
When `form_data_sample()` runs
Then it reads from `data_images_dir`
**AC-4: E2e test uses data dirs only**
Given the e2e test fixture
When setting up test data
Then it copies images/labels only to `data_images_dir`/`data_labels_dir` (no processed dir)
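The direction of AC-2 — read images from `data_images_dir`, validate that a matching label exists in `data_labels_dir` — can be sketched as below. The helper name `pair_annotations` is illustrative; the real logic lives inside `form_dataset()`:

```python
from pathlib import Path


def pair_annotations(data_images_dir: Path,
                     data_labels_dir: Path) -> list[tuple[Path, Path]]:
    """Pair each image in data_images_dir with its YOLO label file.

    Illustrative sketch: images without a matching .txt label are skipped,
    which is the validation step AC-2 describes.
    """
    pairs = []
    for image in sorted(data_images_dir.iterdir()):
        if image.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        label = data_labels_dir / f"{image.stem}.txt"
        if label.exists():  # validate labels from data_labels_dir
            pairs.append((image, label))
    return pairs
```

Once `form_dataset()` works this way, the e2e fixture's double copy (to both data and processed dirs) becomes dead setup code, which is why AC-4 removes it.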
@@ -0,0 +1,42 @@
# Use Hard Links for Dataset
**Task**: AZ-169_refactor_hard_symlinks
**Name**: Use hard links instead of file copies for dataset formation
**Description**: Replace shutil.copy() with os.link() in dataset split creation to save disk space
**Complexity**: 2 points
**Dependencies**: AZ-168_refactor_remove_processed_dir
**Component**: Training Pipeline
**Tracker**: AZ-169
**Epic**: AZ-164
## Problem
`copy_annotations()` in `train.py` uses `shutil.copy()` to duplicate images and labels into train/valid/test splits. For large datasets this wastes significant disk space.
## Outcome
- Dataset formation uses `os.link()` (hard links) instead of `shutil.copy()`
- Fallback to `shutil.copy()` when hard links fail (cross-filesystem)
- No change in training behavior — YOLO reads hard-linked files identically
## Scope
### Included
- Replace `shutil.copy()` with `os.link()` in `copy_annotations()` inner `copy_image()` function
- Add try/except fallback to `shutil.copy()` for `OSError` (cross-filesystem)
### Excluded
- Changing `form_data_sample()` in exports.py (separate utility, lower priority)
- Changing corrupted file handling
## Acceptance Criteria
**AC-1: Hard links used**
Given images and labels in the data directory
When `copy_annotations()` creates train/valid/test splits
Then files are hard-linked via `os.link()`, not copied
**AC-2: Fallback on failure**
Given a cross-filesystem scenario where `os.link()` raises `OSError`
When `copy_annotations()` encounters the error
Then it falls back to `shutil.copy()` transparently
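The replacement described above can be sketched as a small helper; the function name `link_or_copy` is illustrative (the real change lands inside the inner `copy_image()` function of `copy_annotations()`):

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> None:
    """Hard-link src to dst, falling back to a plain copy on failure.

    Hard links share the same inode, so a file linked into the
    train/valid/test splits consumes no extra disk space.
    """
    try:
        os.link(src, dst)  # hard link: same inode, no data duplicated
    except OSError:
        # os.link raises OSError when src and dst are on different
        # filesystems (EXDEV) or the filesystem lacks hard-link support
        shutil.copy(src, dst)
```

Because a hard link is indistinguishable from the original file at read time, YOLO's dataset loader needs no changes — which is exactly the "no change in training behavior" outcome above.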