Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-04-22 22:56:34 +00:00 · 2026-03-27 18:18:30 +02:00
parent b68c07b540
commit 142c6c4de8
106 changed files with 5706 additions and 654 deletions
@@ -0,0 +1,83 @@
+# Augmentation Blackbox Tests
+
+**Task**: AZ-153_test_augmentation
+**Name**: Augmentation Blackbox Tests
+**Description**: Implement 8 blackbox tests for the augmentation pipeline, covering output count, naming, bbox validation, edge cases, and filesystem integration
+**Complexity**: 3 points
+**Dependencies**: AZ-152_test_infrastructure
+**Component**: Blackbox Tests
+**Jira**: AZ-153
+**Epic**: AZ-151
+
+## Problem
+
+The augmentation pipeline transforms annotated images into 8 variants each. Tests must verify output count, naming conventions, bounding box validity, edge cases, and filesystem integration without referencing internals.
+
+## Outcome
+
+- 8 passing pytest tests in `tests/test_augmentation.py`
+- Covers: single-image augmentation, naming convention, bbox range, bbox clipping, tiny bbox removal, empty labels, full pipeline, skip-already-processed
+
+## Scope
+
+### Included
+- BT-AUG-01: Single image → 8 outputs
+- BT-AUG-02: Augmented filenames follow naming convention
+- BT-AUG-03: All output bounding boxes in valid range [0,1]
+- BT-AUG-04: Bounding box correction clips edge bboxes
+- BT-AUG-05: Tiny bounding boxes removed after correction
+- BT-AUG-06: Empty label produces 8 outputs with empty labels
+- BT-AUG-07: Full augmentation pipeline (filesystem, 5 images → 40 outputs)
+- BT-AUG-08: Augmentation skips already-processed images
+
+### Excluded
+- Performance tests (separate task)
+- Resilience tests (separate task)
+
+## Acceptance Criteria
+
+**AC-1: Output count**
+Given 1 image + 1 valid label
+When augment_inner() runs
+Then exactly 8 ImageLabel objects are returned
+
+**AC-2: Naming convention**
+Given image with stem "test_image"
+When augment_inner() runs
+Then outputs named test_image.jpg, test_image_1.jpg through test_image_7.jpg with matching .txt labels
+
+**AC-3: Bbox validity**
+Given 1 image + label with multiple bboxes
+When augment_inner() runs
+Then every bbox coordinate in every output is in [0.0, 1.0]
+
+**AC-4: Edge bbox clipping**
+Given label with bbox near edge (x=0.99, w=0.2)
+When correct_bboxes() runs
+Then the width is reduced to fit within bounds; every coordinate stays within [margin, 1-margin]
+
+**AC-5: Tiny bbox removal**
+Given label with a bbox whose area falls below 0.01 after clipping
+When correct_bboxes() runs
+Then bbox is removed from output
+
+**AC-6: Empty label**
+Given 1 image + empty label file
+When augment_inner() runs
+Then 8 ImageLabel objects returned, all with empty labels lists
+
+**AC-7: Full pipeline**
+Given 5 images + labels in data/ directory
+When augment_annotations() runs with patched paths
+Then 40 images in processed images dir, 40 matching labels
+
+**AC-8: Skip already-processed**
+Given 5 images in data/, 3 already in processed/
+When augment_annotations() runs
+Then only 2 new images processed (16 new outputs), existing 3 untouched
+
+## Constraints
+
+- Must patch constants.py paths to use tmp_path
+- Fixture images from _docs/00_problem/input_data/dataset/
+- Each test operates in isolated tmp_path
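
The clipping and tiny-box behaviour described in AC-4 and AC-5 can be illustrated with a small stand-alone sketch. This is not the project's `correct_bboxes()`; the function name `clip_bboxes`, the `margin` and `min_area` parameters, and the assumption that boxes are normalised `(x_center, y_center, width, height)` tuples are all hypothetical, chosen only to show the kind of behaviour the tests would check.

```python
from typing import List, Tuple

# Assumed layout (hypothetical): normalised (x_center, y_center, width, height).
BBox = Tuple[float, float, float, float]


def clip_bboxes(
    bboxes: List[BBox], margin: float = 0.0, min_area: float = 0.01
) -> List[BBox]:
    """Clip boxes to [margin, 1 - margin] and drop those that end up tiny.

    Illustrative stand-in only; the real correct_bboxes() may differ.
    """
    kept: List[BBox] = []
    for xc, yc, w, h in bboxes:
        # Convert to corner coordinates and clamp each edge to the allowed range.
        x_min = max(xc - w / 2, margin)
        x_max = min(xc + w / 2, 1 - margin)
        y_min = max(yc - h / 2, margin)
        y_max = min(yc + h / 2, 1 - margin)
        new_w = x_max - x_min
        new_h = y_max - y_min
        # Discard boxes that collapsed or whose clipped area is below the threshold.
        if new_w <= 0 or new_h <= 0 or new_w * new_h < min_area:
            continue
        # Convert back to centre format.
        kept.append(((x_min + x_max) / 2, (y_min + y_max) / 2, new_w, new_h))
    return kept


# AC-4 style case: box near the right edge (x=0.99, w=0.2) is narrowed.
edge_result = clip_bboxes([(0.99, 0.5, 0.2, 0.5)])

# AC-5 style case: box whose area drops below the threshold after clipping.
tiny_result = clip_bboxes([(0.999, 0.5, 0.3, 0.05)])
```

A blackbox test against the real function would presumably make the same kind of assertions on its return value: the edge box survives with a smaller width and no edge outside the allowed range, and the tiny box is absent from the output.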