Update configuration and test structure for improved clarity and functionality

- Modified `.gitignore` to include test fixture data while excluding test results.
- Updated `config.yaml` to change the model from 'yolo11m.yaml' to 'yolo26m.pt'.
- Enhanced `.cursor/rules/coderule.mdc` with additional guidelines for test environment consistency and infrastructure handling.
- Revised autopilot state management in `_docs/_autopilot_state.md` to reflect current progress and tasks.
- Removed outdated augmentation tests and adjusted dataset formation tests to align with the new structure.

These changes streamline the configuration and testing processes, ensuring better organization and clarity in the project.
2026-04-22 13:06:36 +00:00 · 2026-03-28 06:11:55 +02:00
parent cdcd1f6ea7
commit a47fa135de
119 changed files with 824 additions and 774 deletions
@@ -1,61 +1,9 @@
 # Blackbox Test Scenarios

-## BT-AUG: Augmentation Pipeline
-
-### BT-AUG-01: Single image produces 8 outputs
- **Input**: 1 image + 1 valid label from fixture dataset
- **Action**: Run `Augmentator.augment_inner()` on the image
- **Expected**: Returns list of exactly 8 ImageLabel objects
- **Traces**: AC: 8× augmentation ratio
-
-### BT-AUG-02: Augmented filenames follow naming convention
- **Input**: Image with stem "test_image"
- **Action**: Run `augment_inner()`
- **Expected**: Output filenames: `test_image.jpg`, `test_image_1.jpg` through `test_image_7.jpg`; matching `.txt` labels
- **Traces**: AC: Augmentation output format
-
-### BT-AUG-03: All output bounding boxes in valid range
- **Input**: 1 image + label with multiple bboxes
- **Action**: Run `augment_inner()`
- **Expected**: Every bbox coordinate in every output label is in [0.0, 1.0]
- **Traces**: AC: Bounding boxes clipped to [0, 1]
-
-### BT-AUG-04: Bounding box correction clips edge bboxes
- **Input**: Label with bbox near edge: `0 0.99 0.5 0.2 0.1`
- **Action**: Run `correct_bboxes()`
- **Expected**: Width reduced so bbox fits within [margin, 1-margin]; no coordinate exceeds bounds
- **Traces**: AC: Bounding boxes clipped to [0, 1]
-
-### BT-AUG-05: Tiny bounding boxes removed after correction
- **Input**: Label with tiny bbox that becomes < 0.01 after clipping
- **Action**: Run `correct_bboxes()`
- **Expected**: Bbox removed from output (area < correct_min_bbox_size)
- **Traces**: AC: Bounding boxes with area < 0.01% discarded
-
-### BT-AUG-06: Empty label produces 8 outputs with empty labels
- **Input**: 1 image + empty label file
- **Action**: Run `augment_inner()`
- **Expected**: 8 ImageLabel objects returned; all have empty labels lists
- **Traces**: AC: Augmentation handles empty annotations
-
-### BT-AUG-07: Full augmentation pipeline (filesystem integration)
- **Input**: 5 images + labels copied to data/ directory in tmp_path
- **Action**: Run `augment_annotations()` with patched paths
- **Expected**: 40 images (5 × 8) in processed images dir; 40 matching labels in processed labels dir
- **Traces**: AC: 8× augmentation, filesystem output
-
-### BT-AUG-08: Augmentation skips already-processed images
- **Input**: 5 images in data/; 3 already present in processed/ dir
- **Action**: Run `augment_annotations()`
- **Expected**: Only 2 new images processed (16 new outputs); existing 3 untouched
- **Traces**: AC: Augmentation processes only unprocessed images
-
---
-
 ## BT-DSF: Dataset Formation

 ### BT-DSF-01: 70/20/10 split ratio
- **Input**: 100 images + labels in processed/ dir
+- **Input**: 100 images + labels in data/ dir
 - **Action**: Run `form_dataset()` with patched paths
 - **Expected**: train: 70 images+labels, valid: 20, test: 10
 - **Traces**: AC: Dataset split 70/20/10
@@ -1,18 +1,5 @@
 # Performance Test Scenarios

-## PT-AUG-01: Augmentation throughput
- **Input**: 10 images from fixture dataset
- **Action**: Run `augment_annotations()`, measure wall time
- **Expected**: Completes within 60 seconds (10 images × 8 outputs = 80 files)
- **Traces**: Restriction: Augmentation runs continuously
- **Note**: Threshold is generous; actual performance depends on CPU
-
-## PT-AUG-02: Parallel augmentation speedup
- **Input**: 10 images from fixture dataset
- **Action**: Run with ThreadPoolExecutor vs sequential, compare times
- **Expected**: Parallel is ≥ 1.5× faster than sequential
- **Traces**: AC: Parallelized per-image processing
-
 ## PT-DSF-01: Dataset formation throughput
 - **Input**: 100 images + labels
 - **Action**: Run `form_dataset()`, measure wall time
@@ -1,25 +1,7 @@
 # Resilience Test Scenarios

-## RT-AUG-01: Augmentation handles corrupted image gracefully
- **Input**: 1 valid image + 1 corrupted image file (truncated JPEG) in data/ dir
- **Action**: Run `augment_annotations()`
- **Expected**: Valid image produces 8 outputs; corrupted image skipped without crashing pipeline; total output: 8 files
- **Traces**: Restriction: Augmentation exception handling per-image
-
-## RT-AUG-02: Augmentation handles missing label file
- **Input**: 1 image with no matching label file
- **Action**: Run `augment_annotation()` on the image
- **Expected**: Exception caught per-thread; does not crash pipeline
- **Traces**: Restriction: Augmentation exception handling
-
-## RT-AUG-03: Augmentation transform failure produces fewer variants
- **Input**: 1 image + label that causes some transforms to fail (extremely narrow bbox)
- **Action**: Run `augment_inner()`
- **Expected**: Returns 1-8 ImageLabel objects (original always present; failed variants skipped); no crash
- **Traces**: Restriction: Transform failure handling
-
-## RT-DSF-01: Dataset formation with empty processed directory
- **Input**: Empty processed images dir
+## RT-DSF-01: Dataset formation with empty data directory
+- **Input**: Empty data images dir
 - **Action**: Run `form_dataset()`
 - **Expected**: Creates empty train/valid/test directories; no crash
 - **Traces**: Restriction: Edge case handling
@@ -1,11 +1,5 @@
 # Resource Limit Test Scenarios

-## RL-AUG-01: Augmentation output count bounded
- **Input**: 1 image
- **Action**: Run `augment_inner()`
- **Expected**: Returns exactly 8 outputs (never more, even with retries)
- **Traces**: AC: 8× augmentation ratio (1 original + 7 augmented)
-
 ## RL-DSF-01: Dataset split ratios sum to 100%
 - **Input**: Any number of images
 - **Action**: Check `train_set + valid_set + test_set`
@@ -4,8 +4,8 @@

 | ID | Data Item | Source | Format | Preparation |
 |----|-----------|--------|--------|-------------|
-| FD-01 | Annotated images (100) | `_docs/00_problem/input_data/dataset/images/` | JPEG | Copy subset to tmp_path at test start |
-| FD-02 | YOLO labels (100) | `_docs/00_problem/input_data/dataset/labels/` | TXT | Copy subset to tmp_path at test start |
+| FD-01 | Annotated images (20) | `tests/data/images/` | JPEG | Copy subset to tmp_path at test start |
+| FD-02 | YOLO labels (20) | `tests/data/labels/` | TXT | Copy subset to tmp_path at test start |
 | FD-03 | ONNX model | `_docs/00_problem/input_data/azaion.onnx` | ONNX | Read bytes at test start |
 | FD-04 | Class definitions | `classes.json` (project root) | JSON | Copy to tmp_path at test start |
 | FD-05 | Corrupted labels | Generated at test time | TXT | Create labels with coords > 1.0 |
@@ -4,15 +4,6 @@

 | AC / Restriction | Test IDs | Coverage |
 |------------------|----------|----------|
-| 8× augmentation ratio | BT-AUG-01, BT-AUG-06, BT-AUG-07, RL-AUG-01 | Full |
-| Augmentation naming convention | BT-AUG-02 | Full |
-| Bounding boxes clipped to [0,1] | BT-AUG-03, BT-AUG-04 | Full |
-| Tiny bboxes (< 0.01) discarded | BT-AUG-05 | Full |
-| Augmentation skips already-processed | BT-AUG-08 | Full |
-| Augmentation parallelized | PT-AUG-02 | Full |
-| Augmentation handles corrupted images | RT-AUG-01 | Full |
-| Augmentation handles missing labels | RT-AUG-02 | Full |
-| Transform failure graceful | RT-AUG-03 | Full |
 | Dataset split 70/20/10 | BT-DSF-01, RL-DSF-01 | Full |
 | Dataset directory structure | BT-DSF-02 | Full |
 | Dataset integrity (no data loss) | BT-DSF-03, RL-DSF-02 | Full |
@@ -34,6 +25,17 @@
 | Static model encryption key | ST-ENC-03 | Full |
 | Random IV per encryption | ST-ENC-01 | Full |

+## Removed (augmentation now built into YOLO training)
+
+The following tests were removed because external augmentation (`augmentation.py`) is no longer part of the training pipeline. YOLO's built-in augmentation replaces it.
+
+| Removed Test IDs | Reason |
+|-------------------|--------|
+| BT-AUG-01 to BT-AUG-08 | External augmentation replaced by YOLO built-in |
+| PT-AUG-01, PT-AUG-02 | Augmentation performance no longer relevant |
+| RT-AUG-01 to RT-AUG-03 | Augmentation resilience no longer relevant |
+| RL-AUG-01 | Augmentation resource limits no longer relevant |
+
 ## Uncovered (Require External Services)

 | AC / Restriction | Reason |
@@ -50,18 +52,18 @@

 | Metric | Value |
 |--------|-------|
-| Total AC + Restrictions | 36 |
-| Covered by tests | 29 |
+| Total AC + Restrictions | 27 |
+| Covered by tests | 20 |
 | Uncovered (external deps) | 7 |
-| **Coverage** | **80.6%** |
+| **Coverage** | **74.1%** |

 ## Test Count Summary

 | Category | Count |
 |----------|-------|
-| Blackbox tests | 32 |
-| Performance tests | 5 |
-| Resilience tests | 6 |
+| Blackbox tests | 21 |
+| Performance tests | 3 |
+| Resilience tests | 3 |
 | Security tests | 7 |
-| Resource limit tests | 5 |
-| **Total** | **55** |
+| Resource limit tests | 4 |
+| **Total** | **38** |
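
The updated summary figures can be cross-checked with a few lines of arithmetic. The sketch below is not part of the commit; the variable and function names are illustrative assumptions, and the numbers are copied from the `+` rows of the two summary tables above.

```python
# Figures copied from the updated ("+") rows of the summary tables.
# All names here are hypothetical; they do not exist in the repository.
total_items = 27   # Total AC + Restrictions
covered = 20       # Covered by tests
uncovered = 7      # Uncovered (external deps)

test_counts = {
    "Blackbox": 21,
    "Performance": 3,
    "Resilience": 3,
    "Security": 7,
    "Resource limit": 4,
}


def coverage_percent(covered_items: int, all_items: int) -> float:
    """Return coverage as a percentage rounded to one decimal place."""
    return round(100 * covered_items / all_items, 1)


# Covered and uncovered items should account for every AC/restriction.
assert covered + uncovered == total_items

print(f"Coverage: {coverage_percent(covered, total_items)}%")
print(f"Total tests: {sum(test_counts.values())}")
```

Running the same check against the old values (29 of 36 items; 32 + 5 + 6 + 7 + 5 tests) reproduces the removed rows, 80.6% and 55, so both versions of the tables are internally consistent.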