Update configuration and test structure for improved clarity and functionality

- Modified `.gitignore` to include test fixture data while excluding test results.
- Updated `config.yaml` to change the model from 'yolo11m.yaml' to 'yolo26m.pt'.
- Enhanced `.cursor/rules/coderule.mdc` with additional guidelines for test environment consistency and infrastructure handling.
- Revised autopilot state management in `_docs/_autopilot_state.md` to reflect current progress and tasks.
- Removed outdated augmentation tests and adjusted dataset formation tests to align with the new structure.

These changes streamline the configuration and testing processes, ensuring better organization and clarity in the project.
2026-06-21 09:01:13 +00:00 · 2026-03-28 06:11:55 +02:00
parent cdcd1f6ea7
commit a47fa135de
119 changed files with 824 additions and 774 deletions
@@ -0,0 +1,33 @@
+# Refactoring Roadmap
+
+**Run**: 01-code-improvements
+**Date**: 2026-03-28
+
+## Execution Order
+
+All 5 changes are grouped into a single phase (straightforward, low-to-medium risk).
+
+| Priority | Change | Risk | Effort |
+|----------|--------|------|--------|
+| 1 | C05: Unify configuration | medium | 3 pts |
+| 2 | C01: Update YOLO model | medium | 2 pts |
+| 3 | C02: Replace external augmentation | medium | 3 pts |
+| 4 | C03: Remove processed directory | medium | 3 pts |
+| 5 | C04: Hard links | low | 2 pts |
+
+**Total estimated effort**: 13 points across 5 tasks
+
+## Dependency Graph
+
+```
+C05 (config unification) ─── independent
+C01 (YOLO update) ← C02 (built-in aug) ← C03 (remove processed dir) ← C04 (hard links)
+```
+
+C05 can be done in parallel with the C01→C04 chain.
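The ordering above can be checked mechanically. The following is a minimal sketch, assuming each `←` in the graph means "the change on the right depends on the change on the left"; the `DEPENDS_ON` mapping and the `execution_batches` helper are illustrative names, not part of the repository.

```python
from graphlib import TopologicalSorter

# Each change maps to the set of changes it depends on.
# Assumption: "C01 <- C02" in the graph means C02 depends on C01.
DEPENDS_ON = {
    "C05": set(),
    "C01": set(),
    "C02": {"C01"},
    "C03": {"C02"},
    "C04": {"C03"},
}


def execution_batches(deps):
    """Group changes into batches; changes in one batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(execution_batches(DEPENDS_ON), start=1):
        print(number, batch)
```

Under that reading, C05 lands in the first batch alongside C01, which matches the statement that it can proceed in parallel with the chain.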
+
+## Risk Mitigation
+
+- Existing test suite (83 tests) provides safety net
+- Each change committed separately for easy rollback
+- C02 is the highest-risk change (training pipeline behavior change) — validate by running a short training sanity check after implementation
@@ -0,0 +1,50 @@
+# Research Findings
+
+**Run**: 01-code-improvements
+**Date**: 2026-03-28
+
+## Current State Analysis
+
+### Training Pipeline
+- Uses `yolo11m.yaml` (architecture-only config, trains from scratch)
+- External augmentation via `albumentations` library in `src/augmentation.py`
+- Two-step process: augment → form dataset → train
+- Dataset formation copies files with `shutil.copy()`, duplicating ~8x storage
+
+### Configuration
+- Two config files: `config.yaml` (root) and `src/annotation-queue/config.yaml`
+- Annotation queue handler parses YAML manually instead of using shared `Config` model
+- Config drift risk between the two files
+
+## YOLO26 Model Update
+
+Ultralytics YOLO26 is the latest model family. The medium variant `yolo26m.pt` replaces `yolo11m.yaml`:
+- Uses pretrained weights (`.pt`) rather than architecture-only (`.yaml`)
+- Faster convergence with transfer learning
+- Improved accuracy on detection benchmarks
+
+## Built-in Augmentation Parameters
+
+YOLO's `model.train()` supports the following augmentation parameters that replace the external `albumentations` pipeline:
+
+| Parameter | Default | Equivalent to current external aug |
+|-----------|---------|-----------------------------------|
+| `hsv_h` | 0.015 | HueSaturationValue(hue_shift_limit=10) |
+| `hsv_s` | 0.7 | HueSaturationValue(sat_shift_limit=10) |
+| `hsv_v` | 0.4 | RandomBrightnessContrast + HSV |
+| `degrees` | 0.0 | Affine(rotate=(-35,35)) → set to 35.0 |
+| `translate` | 0.1 | Default is sufficient |
+| `scale` | 0.5 | Affine(scale=(0.8,1.2)) → default covers this |
+| `shear` | 0.0 | Affine(shear=(-10,10)) → set to 10.0 |
+| `fliplr` | 0.5 | HorizontalFlip(p=0.6) → set to 0.6 |
+| `flipud` | 0.0 | Not used currently |
+| `mosaic` | 1.0 | New — YOLO built-in |
+| `mixup` | 0.0 | New — optional |
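As a rough sketch of how the table could translate into a training call: only the three parameters the table says to change are overridden, and everything else stays at the Ultralytics default. The `AUG_OVERRIDES` and `build_train_kwargs` names, the `data.yaml` path, and the epoch count are assumptions for illustration; the actual training script in the repository may be organized differently.

```python
# Overrides taken from the table above; parameters not listed here
# (hsv_h, hsv_s, hsv_v, translate, scale, flipud, mosaic, mixup) keep their defaults.
AUG_OVERRIDES = {
    "degrees": 35.0,  # replaces Affine(rotate=(-35, 35))
    "shear": 10.0,    # replaces Affine(shear=(-10, 10))
    "fliplr": 0.6,    # replaces HorizontalFlip(p=0.6)
}


def build_train_kwargs(data_yaml, epochs=1):
    """Assemble keyword arguments for model.train() (hypothetical helper)."""
    return {"data": data_yaml, "epochs": epochs, **AUG_OVERRIDES}


def run_sanity_training(data_yaml, weights="yolo26m.pt", epochs=1):
    """Short training run; needs the ultralytics package and a real dataset."""
    from ultralytics import YOLO  # imported lazily so the helpers above work without it

    model = YOLO(weights)
    return model.train(**build_train_kwargs(data_yaml, epochs))


if __name__ == "__main__":
    # "data.yaml" is a placeholder path, not a file from this repository.
    print(build_train_kwargs("data.yaml"))
```

Keeping the overrides in one dictionary makes it easy to compare them against the old `albumentations` settings during the short sanity run suggested for C02.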
+
+## Hard Links
+
+`os.link()` creates hard links sharing the same inode. Benefits:
+- Zero additional disk usage for dataset splits
+- Same read performance as regular files
+- Works only when source and destination are on the same filesystem (which is the case here: everything is under `/azaion/`)
+- Fallback to `shutil.copy()` for cross-filesystem edge cases
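The link-then-fallback behaviour described in this list could look roughly like the sketch below. The function name `link_or_copy` and its return values are assumptions for illustration, not the project's actual dataset-formation code.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src, dst):
    """Hard-link src to dst, copying instead if linking fails.

    Returns "link" or "copy" so callers can log which path was taken.
    """
    src, dst = Path(src), Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.unlink(missing_ok=True)  # os.link() fails if the destination already exists
    try:
        os.link(src, dst)  # both names now share one inode
        return "link"
    except OSError:
        # Raised for cross-filesystem links or filesystems without hard link support.
        shutil.copy(src, dst)
        return "copy"
```

One point to keep in mind with this design: because a hard link shares its data with the original, modifying a linked file in place also changes the source, so dataset splits built this way should be treated as read-only.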