mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-22 21:56:36 +00:00
a47fa135de
- Modified `.gitignore` to include test fixture data while excluding test results.
- Updated `config.yaml` to change the model from 'yolo11m.yaml' to 'yolo26m.pt'.
- Enhanced `.cursor/rules/coderule.mdc` with additional guidelines for test environment consistency and infrastructure handling.
- Revised autopilot state management in `_docs/_autopilot_state.md` to reflect current progress and tasks.
- Removed outdated augmentation tests and adjusted dataset formation tests to align with the new structure.

These changes streamline the configuration and testing processes, ensuring better organization and clarity in the project.
3.8 KiB
List of Changes
Run: 01-code-improvements
Mode: guided
Source: _docs/02_document/refactoring_notes.md
Date: 2026-03-28
Summary
Apply 5 improvements from the documentation review: update the YOLO model, switch to built-in augmentation, remove the processed directory, use hard links for dataset formation, and unify the configuration files.
Changes
C01: Update YOLO model to 26m variant
- File(s): `src/constants.py`, `src/train.py`
- Problem: The current model config uses `yolo11m.yaml`, which trains from a YAML architecture definition rather than from pretrained weights
- Change: Update `TrainingConfig.model` to the YOLO 26m variant; ensure `train_dataset()` uses the updated model reference
- Rationale: Use the updated model version as requested; pretrained weights improve convergence
- Risk: medium
- Dependencies: None
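
A minimal sketch of the distinction C01 relies on, assuming the project loads models through the Ultralytics `YOLO` class. The `TrainingConfig` shape, `uses_pretrained_weights()`, and `load_model()` are hypothetical; only the `model` field is named in this document, and `yolo26m.pt` is the value recorded in the commit note for `config.yaml`.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    # Hypothetical stand-in for the project's TrainingConfig; only `model`
    # is named in this document. A `.pt` reference loads pretrained weights,
    # a `.yaml` reference builds an untrained model from the architecture.
    model: str = "yolo26m.pt"


def uses_pretrained_weights(model_ref: str) -> bool:
    """Return True when the reference is a weights file, not an architecture YAML."""
    return model_ref.lower().endswith(".pt")


def load_model(config: TrainingConfig):
    # Imported lazily so the config can be inspected without ultralytics installed.
    from ultralytics import YOLO

    return YOLO(config.model)
```

A check like `uses_pretrained_weights()` could guard `train_dataset()` against silently falling back to a from-scratch run if the config is later edited.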
C02: Replace external augmentation with YOLO built-in
- File(s): `src/train.py`, `src/augmentation.py`
- Problem: `augmentation.py` uses albumentations to augment images into a separate `processed_dir` before training, which adds complexity, disk usage, and a separate processing step
- Change: Remove the `augment_annotations()` call from the training pipeline; add YOLO built-in augmentation parameters (`hsv_h`, `hsv_s`, `hsv_v`, `degrees`, `translate`, `scale`, `shear`, `flipud`, `fliplr`, `mosaic`, `mixup`) to the `model.train()` call in `train_dataset()`, each on its own line with a descriptive comment; `augmentation.py` remains in the codebase but is no longer called during training
- Rationale: YOLO's built-in augmentation applies on-the-fly during training, eliminating the pre-processing step and processed directory
- Risk: medium
- Dependencies: C01
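
The C02 change could look roughly like the sketch below. The parameter names are the ones listed above; the values are illustrative starting points, not settings tuned for this dataset, and the `train_dataset()` signature shown here is an assumption.

```python
from typing import Any

# One parameter per line, each with a short description, as C02 requests.
AUGMENTATION_ARGS: dict[str, float] = {
    "hsv_h": 0.015,    # hue jitter, as a fraction of the colour wheel
    "hsv_s": 0.7,      # saturation jitter fraction
    "hsv_v": 0.4,      # brightness (value) jitter fraction
    "degrees": 0.0,    # maximum random rotation, in degrees
    "translate": 0.1,  # maximum random shift, as a fraction of image size
    "scale": 0.5,      # random scaling gain
    "shear": 0.0,      # maximum random shear, in degrees
    "flipud": 0.0,     # probability of a vertical flip
    "fliplr": 0.5,     # probability of a horizontal flip
    "mosaic": 1.0,     # probability of combining four images into a mosaic
    "mixup": 0.0,      # probability of blending two images
}


def train_dataset(model: Any, data_yaml: str, epochs: int = 100):
    """Hypothetical signature: train `model` with on-the-fly augmentation."""
    # No augment_annotations() call here; augmentation happens inside train().
    return model.train(data=data_yaml, epochs=epochs, **AUGMENTATION_ARGS)
```

Keeping the arguments in one mapping makes it easy to log or override them per run; writing them inline in the `model.train()` call works equally well.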
C03: Remove processed directory — use data dir directly
- File(s): `src/constants.py`, `src/train.py`, `src/exports.py`, `src/dataset-visualiser.py`
- Problem: The `processed_dir`, `processed_images_dir`, and `processed_labels_dir` properties in `Config` are no longer needed once built-in augmentation is used, yet `form_dataset()`, `form_data_sample()`, and `visualise_processed_folder()` still read from the processed dir
- Change: Remove the `processed_dir`/`processed_images_dir`/`processed_labels_dir` properties from `Config`; update `form_dataset()`, `form_data_sample()`, and `visualise_processed_folder()` to read from `data_images_dir`/`data_labels_dir`
- Rationale: The processed directory is unnecessary without the external augmentation step
- Risk: medium
- Dependencies: C02
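
As a rough illustration of C03, the sketch below shows a reduced `Config` exposing only the data directories, plus a reader that pairs images with labels. The `images`/`labels` subdirectory names and the `list_annotated_images()` helper are assumptions for illustration; the real `Config` has more fields.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    # Hypothetical minimal Config: no processed_* properties remain.
    data_dir: Path

    @property
    def data_images_dir(self) -> Path:
        # Assumed layout: <data_dir>/images
        return self.data_dir / "images"

    @property
    def data_labels_dir(self) -> Path:
        # Assumed layout: <data_dir>/labels
        return self.data_dir / "labels"


def list_annotated_images(config: Config) -> list[Path]:
    """Return images in data_images_dir that have a same-named .txt label."""
    images = sorted(p for p in config.data_images_dir.iterdir() if p.is_file())
    return [
        p for p in images
        if (config.data_labels_dir / f"{p.stem}.txt").exists()
    ]
```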
C04: Use hard links instead of file copies for dataset
- File(s): `src/train.py`
- Problem: `copy_annotations()` uses `shutil.copy()` to duplicate images and labels into train/valid/test splits, which wastes disk space on large datasets
- Change: Replace `shutil.copy()` with `os.link()` to create hard links; add a fallback to `shutil.copy()` for cross-filesystem scenarios, where `os.link()` raises `OSError`
- Rationale: Hard links share the same inode, saving disk space while maintaining independent directory entries
- Risk: low
- Dependencies: C03
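
A minimal sketch of the link-with-fallback behaviour described in C04. The helper name `link_or_copy()` is hypothetical; how it is wired into `copy_annotations()` is left to the implementation.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> str:
    """Hard-link src to dst, falling back to a copy; return which one happened."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    # os.link() fails if the target already exists, so clear a stale entry first.
    if dst.exists():
        dst.unlink()
    try:
        os.link(src, dst)  # new directory entry pointing at the same inode
        return "linked"
    except OSError:
        # Raised when src and dst are on different filesystems, or when the
        # filesystem does not support hard links.
        shutil.copy(src, dst)
        return "copied"
```

Because a hard link and its source are the same file on disk, editing one in place changes the other. That is harmless for read-only training data, but worth remembering if any later step rewrites labels inside the split directories.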
C05: Unify configuration — remove annotation-queue/config.yaml
- File(s): `src/constants.py`, `src/annotation-queue/annotation_queue_handler.py`, `src/annotation-queue/config.yaml`
- Problem: `src/annotation-queue/config.yaml` duplicates the root `config.yaml` with different `dirs` values; `annotation_queue_handler.py` parses config manually via `yaml.safe_load` instead of using the shared `Config` model
- Change: Extend `Config` in `constants.py` to include queue and annotation-queue directory settings; refactor `annotation_queue_handler.py` to accept a `Config` instance (or import from constants); delete `src/annotation-queue/config.yaml`
- Rationale: A single source of truth for configuration eliminates drift risk and inconsistency
- Risk: medium
- Dependencies: None
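
One way the C05 refactor could be shaped is sketched below. Every field name here (`host`, `port`, `annotation_queue_dir`) and the `AnnotationQueueHandler` class are hypothetical placeholders; the actual settings are whatever the two existing YAML files define. Plain dataclasses are used here, which may differ from how the real `Config` model is built.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class QueueConfig:
    # Hypothetical queue connection settings.
    host: str = "localhost"
    port: int = 5552


@dataclass
class Config:
    # Hypothetical extension of the shared Config with queue-related settings.
    queue: QueueConfig = field(default_factory=QueueConfig)
    annotation_queue_dir: str = "data/annotation-queue"

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "Config":
        """Build a Config from the already-parsed root config.yaml mapping."""
        raw = dict(raw)  # avoid mutating the caller's mapping
        queue = QueueConfig(**raw.pop("queue", {}))
        return cls(queue=queue, **raw)


class AnnotationQueueHandler:
    """Receives the shared Config instead of parsing its own YAML file."""

    def __init__(self, config: Config) -> None:
        self.config = config

    def describe(self) -> str:
        q = self.config.queue
        return f"{q.host}:{q.port} -> {self.config.annotation_queue_dir}"
```

Passing the `Config` instance into the handler keeps YAML parsing in one place, so the handler can be tested with an in-memory config rather than a file on disk.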