Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
Oleksandr Bezdieniezhnykh
2026-03-27 18:18:30 +02:00
parent b68c07b540
commit 142c6c4de8
106 changed files with 5706 additions and 654 deletions
@@ -0,0 +1,83 @@
# Augmentation Blackbox Tests
**Task**: AZ-153_test_augmentation
**Name**: Augmentation Blackbox Tests
**Description**: Implement 8 blackbox tests for the augmentation pipeline — output count, naming, bbox validation, edge cases, filesystem integration
**Complexity**: 3 points
**Dependencies**: AZ-152_test_infrastructure
**Component**: Blackbox Tests
**Jira**: AZ-153
**Epic**: AZ-151
## Problem
The augmentation pipeline transforms each annotated image into 8 variants. Tests must verify output count, naming conventions, bounding box validity, edge cases, and filesystem integration without referencing internals.
## Outcome
- 8 passing pytest tests in `tests/test_augmentation.py`
- Covers: single-image augmentation, naming convention, bbox range, bbox clipping, tiny bbox removal, empty labels, full pipeline, skip-already-processed
## Scope
### Included
- BT-AUG-01: Single image → 8 outputs
- BT-AUG-02: Augmented filenames follow naming convention
- BT-AUG-03: All output bounding boxes in valid range [0,1]
- BT-AUG-04: Bounding box correction clips edge bboxes
- BT-AUG-05: Tiny bounding boxes removed after correction
- BT-AUG-06: Empty label produces 8 outputs with empty labels
- BT-AUG-07: Full augmentation pipeline (filesystem, 5 images → 40 outputs)
- BT-AUG-08: Augmentation skips already-processed images
### Excluded
- Performance tests (separate task)
- Resilience tests (separate task)
## Acceptance Criteria
**AC-1: Output count**
Given 1 image + 1 valid label
When augment_inner() runs
Then exactly 8 ImageLabel objects are returned
**AC-2: Naming convention**
Given image with stem "test_image"
When augment_inner() runs
Then outputs named test_image.jpg, test_image_1.jpg through test_image_7.jpg with matching .txt labels
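
The naming rule in AC-2 can be written down as a small helper that a test compares actual output names against. This is only a sketch: `expected_output_names` is a hypothetical helper, not part of the project, and it assumes the `.jpg` and `.txt` extensions shown in AC-2.

```python
def expected_output_names(stem: str, variants: int = 8, image_ext: str = ".jpg") -> list[tuple[str, str]]:
    """Return (image_name, label_name) pairs: the bare stem first, then _1 .. _{variants-1}."""
    stems = [stem] + [f"{stem}_{i}" for i in range(1, variants)]
    return [(f"{s}{image_ext}", f"{s}.txt") for s in stems]


# Hypothetical usage: build the 8 expected pairs for the stem used in AC-2.
names = expected_output_names("test_image")
```

A test could then compare this list against the names carried by the returned ImageLabel objects, however those objects expose them.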
**AC-3: Bbox validity**
Given 1 image + label with multiple bboxes
When augment_inner() runs
Then every bbox coordinate in every output is in [0.0, 1.0]
**AC-4: Edge bbox clipping**
Given label with bbox near edge (x=0.99, w=0.2)
When correct_bboxes() runs
Then width is reduced to fit within bounds; every coordinate stays within [margin, 1-margin]
**AC-5: Tiny bbox removal**
Given label with bbox whose area falls below 0.01 after clipping
When correct_bboxes() runs
Then bbox is removed from output
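
To make the behaviour in AC-4 and AC-5 concrete, the sketch below shows one way clipping and tiny-box removal could work. It is not the project's `correct_bboxes()`. It assumes normalized center-format boxes `(x_center, y_center, width, height)`, a `margin` that defaults to 0.0, and an area threshold of 0.01 taken from AC-5. The name `clip_box` and its parameters are hypothetical.

```python
from typing import Optional, Tuple

# Assumed layout: (x_center, y_center, width, height), all normalized to [0, 1].
Box = Tuple[float, float, float, float]


def clip_box(box: Box, margin: float = 0.0, min_area: float = 0.01) -> Optional[Box]:
    """Clip a box to [margin, 1 - margin]; return None if it ends up too small."""
    x, y, w, h = box
    lo, hi = margin, 1.0 - margin
    # Convert to corner form and clamp each edge to the allowed range.
    x_min, x_max = max(x - w / 2, lo), min(x + w / 2, hi)
    y_min, y_max = max(y - h / 2, lo), min(y + h / 2, hi)
    new_w, new_h = x_max - x_min, y_max - y_min
    if new_w <= 0 or new_h <= 0 or new_w * new_h < min_area:
        return None  # box vanished or is below the area threshold
    # Convert back to center form.
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, new_w, new_h)


# Box from AC-4 (x=0.99, w=0.2); the y and h values are made up for illustration.
kept = clip_box((0.99, 0.5, 0.2, 0.5))
# A box that is large enough before clipping but too small afterwards.
dropped = clip_box((1.0, 0.5, 0.1, 0.15))
```

Under these assumptions the first box is narrowed so its right edge sits at 1.0, and the second is discarded. A blackbox test would assert the same outcomes on the output of `correct_bboxes()` without depending on how it computes them.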
**AC-6: Empty label**
Given 1 image + empty label file
When augment_inner() runs
Then 8 ImageLabel objects returned, all with empty labels lists
**AC-7: Full pipeline**
Given 5 images + labels in data/ directory
When augment_annotations() runs with patched paths
Then 40 images in processed images dir, 40 matching labels
**AC-8: Skip already-processed**
Given 5 images in data/, 3 already in processed/
When augment_annotations() runs
Then only 2 new images processed (16 new outputs), existing 3 untouched
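
The counts in AC-7 and AC-8 follow from 8 outputs per image. The sketch below works through that arithmetic. It assumes the pipeline identifies already-processed images by file name, which the spec does not state, and all file names are made up.

```python
VARIANTS_PER_IMAGE = 8  # from the spec: each image yields 8 outputs


def pending_images(source: set[str], processed: set[str]) -> set[str]:
    """Images that still need augmentation (assumed: matched by file name)."""
    return source - processed


# AC-8 scenario: 5 source images, 3 of them already processed.
source = {f"img_{i}.jpg" for i in range(5)}
processed = {"img_0.jpg", "img_1.jpg", "img_2.jpg"}

todo = pending_images(source, processed)
expected_new_outputs = len(todo) * VARIANTS_PER_IMAGE

# AC-7 scenario: nothing processed yet, so every source image is augmented.
expected_full_outputs = len(pending_images(source, set())) * VARIANTS_PER_IMAGE
```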
## Constraints
- Must patch constants.py paths to use tmp_path
- Fixture images from _docs/00_problem/input_data/dataset/
- Each test operates in isolated tmp_path
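
The path-patching constraint could be met along the lines below. This is a standard-library sketch: `constants` here is a stand-in object, and the attribute names `data_images_dir` and `processed_images_dir` are hypothetical, so the real module's names may differ.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the project's constants module.
constants = SimpleNamespace(
    data_images_dir=Path("/srv/data/images"),
    processed_images_dir=Path("/srv/processed/images"),
)


def isolated_dirs(root: Path) -> dict:
    """Create per-test directories under root and return the attribute overrides."""
    overrides = {
        "data_images_dir": root / "data" / "images",
        "processed_images_dir": root / "processed" / "images",
    }
    for path in overrides.values():
        path.mkdir(parents=True, exist_ok=True)
    return overrides


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Inside the block the paths point at the temporary tree.
    with mock.patch.multiple(constants, **isolated_dirs(root)):
        patched_inside = constants.data_images_dir
        inside_is_isolated = root in patched_inside.parents
    # After the block the original values are restored.
    restored = constants.data_images_dir
```

In a pytest test, the same overrides could be applied with `monkeypatch.setattr` and the `tmp_path` fixture, which undo the patch and clean up automatically after each test.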