ai-training/_docs/04_refactoring/01-code-improvements/list-of-changes.md
Oleksandr Bezdieniezhnykh a47fa135de Update configuration and test structure for improved clarity and functionality
- Modified `.gitignore` to include test fixture data while excluding test results.
- Updated `config.yaml` to change the model from 'yolo11m.yaml' to 'yolo26m.pt'.
- Enhanced `.cursor/rules/coderule.mdc` with additional guidelines for test environment consistency and infrastructure handling.
- Revised autopilot state management in `_docs/_autopilot_state.md` to reflect current progress and tasks.
- Removed outdated augmentation tests and adjusted dataset formation tests to align with the new structure.

These changes streamline the configuration and testing processes, ensuring better organization and clarity in the project.
2026-03-28 06:11:55 +02:00


List of Changes

Run: 01-code-improvements
Mode: guided
Source: _docs/02_document/refactoring_notes.md
Date: 2026-03-28

Summary

Apply 5 improvements from documentation review: update YOLO model, switch to built-in augmentation, remove processed directory, use hard links for dataset formation, and unify configuration files.

Changes

C01: Update YOLO model to 26m variant

  • File(s): src/constants.py, src/train.py
  • Problem: Current model config uses yolo11m.yaml, which builds the model from a YAML architecture definition instead of loading pretrained weights
  • Change: Update TrainingConfig.model to the YOLO 26m pretrained checkpoint (yolo26m.pt); ensure train_dataset() uses the updated model reference
  • Rationale: Adopts the requested model version; starting from pretrained weights typically improves convergence
  • Risk: medium
  • Dependencies: None
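
The difference between the two model references in C01 can be sketched as follows. This is a minimal illustration, not the project's actual TrainingConfig: only the `model` field comes from the change description, and the `starts_from_pretrained_weights` property is a hypothetical helper added here to make the distinction explicit.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TrainingConfig:
    # Only `model` is named in the change list; the real class has other fields.
    model: str = "yolo26m.pt"

    @property
    def starts_from_pretrained_weights(self) -> bool:
        """True for a .pt checkpoint, False for a .yaml architecture definition."""
        return Path(self.model).suffix == ".pt"


# In train_dataset(), the reference would then be handed to Ultralytics, e.g.:
#   from ultralytics import YOLO
#   model = YOLO(TrainingConfig().model)
old = TrainingConfig(model="yolo11m.yaml")
new = TrainingConfig()
print(old.starts_from_pretrained_weights, new.starts_from_pretrained_weights)
```

A check like this could guard against accidentally reverting to a YAML-only reference, though whether such a guard is wanted is a project decision.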

C02: Replace external augmentation with YOLO built-in

  • File(s): src/train.py, src/augmentation.py
  • Problem: augmentation.py uses albumentations to augment images into a separate processed_dir before training, which adds complexity, disk usage, and a separate processing step
  • Change: Remove the augment_annotations() call from the training pipeline; add YOLO built-in augmentation parameters (hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, flipud, fliplr, mosaic, mixup) to the model.train() call in train_dataset(), each on its own line with a descriptive comment; augmentation.py remains in the codebase but is no longer called during training
  • Rationale: YOLO's built-in augmentation applies on-the-fly during training, eliminating the pre-processing step and processed directory
  • Risk: medium
  • Dependencies: C01
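
The shape of the updated model.train() call described in C02 might look like the sketch below. The numeric values are illustrative placeholders to be tuned per dataset, and the `data_yaml` and `epochs` parameters are assumptions about the signature of train_dataset(); only the augmentation parameter names come from the change description.

```python
from typing import Any


def train_dataset(model: Any, data_yaml: str, epochs: int = 100) -> Any:
    """Train with YOLO's on-the-fly augmentation instead of a pre-processing step."""
    return model.train(
        data=data_yaml,
        epochs=epochs,
        hsv_h=0.015,    # hue jitter (fraction of the hue range)
        hsv_s=0.7,      # saturation jitter
        hsv_v=0.4,      # brightness (value) jitter
        degrees=0.0,    # maximum random rotation, in degrees
        translate=0.1,  # maximum random shift, as a fraction of image size
        scale=0.5,      # random scaling gain
        shear=0.0,      # maximum shear, in degrees
        flipud=0.0,     # probability of a vertical flip
        fliplr=0.5,     # probability of a horizontal flip
        mosaic=1.0,     # probability of mosaic (combining four images)
        mixup=0.0,      # probability of mixup (blending two images)
    )
```

Here `model` is expected to be an Ultralytics YOLO instance. Because the function only forwards keyword arguments, it can be exercised in tests with a stub object that records what train() receives.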

C03: Remove processed directory — use data dir directly

  • File(s): src/constants.py, src/train.py, src/exports.py, src/dataset-visualiser.py
  • Problem: The processed_dir, processed_images_dir, and processed_labels_dir properties in Config are no longer needed once built-in augmentation is used, yet form_dataset(), form_data_sample(), and visualise_processed_folder() all still read from the processed dir
  • Change: Remove the processed_dir/processed_images_dir/processed_labels_dir properties from Config; update form_dataset(), form_data_sample(), and visualise_processed_folder() to read from data_images_dir/data_labels_dir
  • Rationale: Processed directory is unnecessary without external augmentation step
  • Risk: medium
  • Dependencies: C02
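
Reading directly from the data directory, as C03 proposes, could look roughly like this. The sketch assumes a dataclass-based Config with a `data_dir` field and `images`/`labels` subdirectories; those details, and the helper `list_labelled_images()`, are hypothetical stand-ins for whatever form_dataset() actually does.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Config:
    # Hypothetical field; the real Config in src/constants.py may differ.
    data_dir: Path

    @property
    def data_images_dir(self) -> Path:
        return self.data_dir / "images"  # assumed subdirectory name

    @property
    def data_labels_dir(self) -> Path:
        return self.data_dir / "labels"  # assumed subdirectory name


def list_labelled_images(config: Config) -> list[tuple[Path, Path]]:
    """Pair each image with its same-stem .txt label, skipping unlabelled images."""
    pairs = []
    for image in sorted(config.data_images_dir.iterdir()):
        label = config.data_labels_dir / f"{image.stem}.txt"
        if image.is_file() and label.is_file():
            pairs.append((image, label))
    return pairs
```

With no processed_* properties left on Config, any remaining reference to them fails with an AttributeError rather than silently reading stale data.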

C04: Use hard links for dataset formation

  • File(s): src/train.py
  • Problem: copy_annotations() uses shutil.copy() to duplicate images and labels into train/valid/test splits, which wastes disk space on large datasets
  • Change: Replace shutil.copy() with os.link() to create hard links; fall back to shutil.copy() when linking fails, e.g. when source and destination are on different filesystems
  • Rationale: Hard links share the same inode, saving disk space while maintaining independent directory entries
  • Risk: low
  • Dependencies: C03
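
The link-with-fallback behaviour described for copy_annotations() can be sketched as below. The helper name `link_or_copy()` and its return value are assumptions made for illustration; the actual change may inline this logic differently.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> str:
    """Hard-link src to dst, copying instead if linking fails.

    Returns "link" or "copy" so callers can log which path was taken.
    """
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        dst.unlink()  # os.link() refuses to overwrite an existing path
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        # Raised e.g. when src and dst are on different filesystems,
        # or when the filesystem does not support hard links.
        shutil.copy(src, dst)
        return "copy"
```

One consequence worth keeping in mind: a hard-linked file in a split directory is the same file as the original, so editing it in place also changes the source data, whereas a copy would not.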

C05: Unify configuration — remove annotation-queue/config.yaml

  • File(s): src/constants.py, src/annotation-queue/annotation_queue_handler.py, src/annotation-queue/config.yaml
  • Problem: src/annotation-queue/config.yaml duplicates root config.yaml with different dirs values; annotation_queue_handler.py parses config manually via yaml.safe_load instead of using the shared Config model
  • Change: Extend Config in constants.py to include queue and annotation-queue directory settings; refactor annotation_queue_handler.py to accept a Config instance (or import from constants); delete src/annotation-queue/config.yaml
  • Rationale: Single source of truth for configuration eliminates drift risk and inconsistency
  • Risk: medium
  • Dependencies: None
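
One possible shape for the unified configuration in C05 is sketched below. It assumes a dataclass-based Config built from an already-parsed root config.yaml mapping; the `QueueConfig` fields, the `dirs.data` and `queue` keys, and the `AnnotationQueueHandler` class name are all hypothetical and stand in for whatever the real files define.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping


@dataclass(frozen=True)
class QueueConfig:
    # Hypothetical fields; the real queue settings may differ.
    host: str
    port: int
    queue_name: str


@dataclass(frozen=True)
class Config:
    data_dir: Path
    queue: QueueConfig

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "Config":
        """Build the shared Config from the parsed root config.yaml."""
        return cls(
            data_dir=Path(raw["dirs"]["data"]),
            queue=QueueConfig(**raw["queue"]),
        )


class AnnotationQueueHandler:
    """Receives the shared Config instead of parsing its own YAML file."""

    def __init__(self, config: Config) -> None:
        self.config = config


raw = {
    "dirs": {"data": "/data"},
    "queue": {"host": "localhost", "port": 5552, "queue_name": "annotations"},
}
handler = AnnotationQueueHandler(Config.from_mapping(raw))
```

Passing the Config instance in, rather than importing a module-level singleton, keeps the handler easy to test with a hand-built configuration.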