mirror of https://github.com/azaion/ai-training.git synced 2026-04-22 21:56:36 +00:00


Oleksandr Bezdieniezhnykh a47fa135de Update configuration and test structure for improved clarity and functionality

- Modified `.gitignore` to include test fixture data while excluding test results.
- Updated `config.yaml` to change the model from 'yolo11m.yaml' to 'yolo26m.pt'.
- Enhanced `.cursor/rules/coderule.mdc` with additional guidelines for test environment consistency and infrastructure handling.
- Revised autopilot state management in `_docs/_autopilot_state.md` to reflect current progress and tasks.
- Removed outdated augmentation tests and adjusted dataset formation tests to align with the new structure.

These changes streamline the configuration and testing processes, ensuring better organization and clarity in the project.

2026-03-28 06:11:55 +02:00


List of Changes

Run: 01-code-improvements
Mode: guided
Source: _docs/02_document/refactoring_notes.md
Date: 2026-03-28

Summary

Apply five improvements from the documentation review: update the YOLO model, switch to built-in augmentation, remove the processed directory, use hard links for dataset formation, and unify the configuration files.

Changes

C01: Update YOLO model to 26m variant

File(s): src/constants.py, src/train.py
Problem: The current model config uses yolo11m.yaml, which builds the model from a YAML architecture definition and therefore trains from scratch, without pretrained weights
Change: Update TrainingConfig.model to the YOLO 26m variant (yolo26m.pt); ensure train_dataset() uses the updated model reference
Rationale: Use the updated model version as requested; starting from pretrained weights improves convergence
Risk: medium
Dependencies: None
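
A minimal sketch of what C01 amounts to, assuming TrainingConfig is a simple dataclass with a model field (its real shape in src/constants.py is not shown here). The helper starts_from_pretrained() and the function load_model() are hypothetical names used only for illustration. The ultralytics import is deferred so that the suffix check can be used without the package installed.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TrainingConfig:
    # Assumed field name, taken from "TrainingConfig.model" in the change description.
    model: str = "yolo26m.pt"


def starts_from_pretrained(model_ref: str) -> bool:
    """Return True for a weights file (.pt), False for an architecture file (.yaml)."""
    return Path(model_ref).suffix == ".pt"


def load_model(cfg: TrainingConfig):
    """Hypothetical loader: build the Ultralytics model from the configured reference."""
    # Imported lazily; requires the ultralytics package.
    from ultralytics import YOLO

    # A .pt reference loads pretrained weights; a .yaml reference builds an untrained model.
    return YOLO(cfg.model)
```

With this shape, train_dataset() would only need to read cfg.model, so switching the variant is a one-line config change.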

C02: Replace external augmentation with YOLO built-in

File(s): src/train.py, src/augmentation.py
Problem: augmentation.py uses albumentations to augment images into a separate processed_dir before training — adds complexity, disk usage, and a separate processing step
Change: Remove the augment_annotations() call from the training pipeline; add the YOLO built-in augmentation parameters (hsv_h, hsv_s, hsv_v, degrees, translate, scale, shear, flipud, fliplr, mosaic, mixup) to the model.train() call in train_dataset(), each on its own line with a descriptive comment; augmentation.py remains in the codebase but is no longer called during training
Rationale: YOLO's built-in augmentation applies on-the-fly during training, eliminating the pre-processing step and processed directory
Risk: medium
Dependencies: C01
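
The model.train() call described in C02 might look roughly like the sketch below. The train_dataset() signature, the data_yaml and epochs arguments, and every numeric value are assumptions for illustration; the project's actual settings are not given in this document. The model argument is expected to be an Ultralytics YOLO instance.

```python
def train_dataset(model, data_yaml: str, epochs: int = 100):
    """Hypothetical signature: train with on-the-fly augmentation only."""
    return model.train(
        data=data_yaml,
        epochs=epochs,
        # Illustrative augmentation values, not the project's tuned settings.
        hsv_h=0.015,    # hue jitter, as a fraction of the colour wheel
        hsv_s=0.7,      # saturation jitter, as a fraction
        hsv_v=0.4,      # brightness (value) jitter, as a fraction
        degrees=0.0,    # maximum random rotation, in degrees
        translate=0.1,  # maximum random shift, as a fraction of image size
        scale=0.5,      # random scale gain
        shear=0.0,      # maximum random shear, in degrees
        flipud=0.0,     # probability of a vertical flip
        fliplr=0.5,     # probability of a horizontal flip
        mosaic=1.0,     # probability of combining four images into a mosaic
        mixup=0.0,      # probability of blending two images and their labels
    )
```

Because these transforms are applied per batch at training time, nothing is written to disk beforehand, which is what makes the processed directory redundant in C03.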

C03: Remove processed directory — use data dir directly

File(s): src/constants.py, src/train.py, src/exports.py, src/dataset-visualiser.py
Problem: processed_dir, processed_images_dir, processed_labels_dir properties in Config are no longer needed when built-in augmentation is used; form_dataset() reads from processed dir; form_data_sample() reads from processed dir; visualise_processed_folder() reads from processed dir
Change: Remove processed_dir/processed_images_dir/processed_labels_dir properties from Config; update form_dataset() to read from data_images_dir/data_labels_dir; update form_data_sample() similarly; update visualise_processed_folder() similarly
Rationale: Processed directory is unnecessary without external augmentation step
Risk: medium
Dependencies: C02

C04: Use hard links instead of file copies for dataset

File(s): src/train.py
Problem: copy_annotations() uses shutil.copy() to duplicate images and labels into train/valid/test splits — wastes disk space on large datasets
Change: Replace shutil.copy() with os.link() to create hard links; add fallback to shutil.copy() for cross-filesystem scenarios
Rationale: Hard links share the same inode, saving disk space while maintaining independent directory entries
Risk: low
Dependencies: C03
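
One way the link-with-fallback behaviour from C04 could be written is sketched below. The helper name link_or_copy() and its return value are hypothetical; copy_annotations() would call something like it once per image and label file. Removing an existing destination first is a choice made in this sketch so that reruns do not fail, not something the change description specifies.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> str:
    """Hard-link src to dst; fall back to a copy. Returns "link" or "copy"."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        # Sketch-level choice: drop a stale entry so the function can be rerun.
        dst.unlink()
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        # Raised, for example, when src and dst are on different filesystems
        # or the filesystem does not support hard links.
        shutil.copy(src, dst)
        return "copy"
```

Note that a hard-linked file shares its content with the original, so editing it in place through either path changes both; that is fine as long as the split directories are treated as read-only.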

C05: Unify configuration — remove annotation-queue/config.yaml

File(s): src/constants.py, src/annotation-queue/annotation_queue_handler.py, src/annotation-queue/config.yaml
Problem: src/annotation-queue/config.yaml duplicates root config.yaml with different dirs values; annotation_queue_handler.py parses config manually via yaml.safe_load instead of using the shared Config model
Change: Extend Config in constants.py to include queue and annotation-queue directory settings; refactor annotation_queue_handler.py to accept a Config instance (or import from constants); delete src/annotation-queue/config.yaml
Rationale: Single source of truth for configuration eliminates drift risk and inconsistency
Risk: medium
Dependencies: None
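
The refactoring in C05 could take roughly the following shape. Everything here is a sketch under assumptions: QueueConfig, its fields and defaults, the data_dir field, the "data" and "queue" keys, and the from_mapping() and describe() helpers are all hypothetical. Only the existence of a shared Config model and a dirs section comes from the description above. The mapping passed to from_mapping() stands in for the parsed root config.yaml.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class QueueConfig:
    # Hypothetical queue settings; the real field names are not given in this document.
    host: str = "localhost"
    port: int = 5672
    queue_name: str = "annotations"


@dataclass(frozen=True)
class Config:
    data_dir: str
    queue: QueueConfig = field(default_factory=QueueConfig)

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "Config":
        """Build the shared config from an already-parsed YAML mapping."""
        return cls(
            data_dir=raw["dirs"]["data"],
            queue=QueueConfig(**raw.get("queue", {})),
        )


class AnnotationQueueHandler:
    """Receives the shared Config instead of parsing its own YAML file."""

    def __init__(self, config: Config) -> None:
        self.config = config

    def describe(self) -> str:
        q = self.config.queue
        return f"{q.queue_name}@{q.host}:{q.port} -> {self.config.data_dir}"
```

Passing the Config instance in, rather than importing a module-level singleton, keeps the handler easy to test with a hand-built configuration.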
