Refactor task management structure and update documentation

- Changed the directory structure for task specifications to include a dedicated `todo/` folder within `_docs/02_tasks/` for tasks ready for implementation.
- Updated references in various skills and documentation to reflect the new task lifecycle, including changes in the `implementer` and `decompose` skills.
- Enhanced the README and flow documentation to clarify the new task organization and its implications for the implementation process.

These updates improve task management clarity and streamline the implementation workflow.
2026-04-22 22:46:35 +00:00 · 2026-03-28 01:17:45 +02:00
parent 8c665bd0a4
commit cbf370c765
35 changed files with 1348 additions and 58 deletions
@@ -0,0 +1,92 @@
+# Single Image Detection Tests
+
+**Task**: AZ-140_test_single_image
+**Name**: Single Image Detection Tests
+**Description**: Implement E2E tests verifying single image detection, confidence filtering, overlap deduplication, physical size filtering, and weather mode classes
+**Complexity**: 3 points
+**Dependencies**: AZ-138_test_infrastructure
+**Component**: Integration Tests
+**Jira**: AZ-140
+**Epic**: AZ-137
+
+## Problem
+
+Single image detection is the core functionality of the system. Tests must verify that detections are returned with correct structure, confidence filtering works at different thresholds, overlapping detections are deduplicated, physical size filtering removes implausible detections, and weather mode class variants are recognized.
+
+## Outcome
+
+- Detection response structure validated (x, y, width, height, label, confidence)
+- Confidence threshold filtering confirmed at multiple thresholds
+- Overlap deduplication verified with configurable containment ratio
+- Physical size filtering validated against MaxSizeM from classes.json
+- Weather mode class variants (Norm, Wint, Night) recognized correctly
+
+## Scope
+
+### Included
+- FT-P-03: Single image detection returns detections
+- FT-P-05: Detection confidence filtering respects threshold
+- FT-P-06: Overlapping detections are deduplicated
+- FT-P-07: Physical size filtering removes oversized detections
+- FT-P-13: Weather mode class variants
+
+### Excluded
+- Large image tiling (covered in tiling tests)
+- Async/video detection (covered in async and video tests)
+- Negative input validation (covered in negative tests)
+
+## Acceptance Criteria
+
+**AC-1: Detection response structure**
+Given an initialized engine and a valid small image
+When POST /detect is called with the image
+Then response is 200 with an array of DetectionDto objects containing x, y, width, height, label, confidence fields with coordinates in 0.0-1.0 range
+
+**AC-2: Confidence filtering**
+Given an initialized engine
+When POST /detect is called with probability_threshold 0.8
+Then all returned detections have confidence >= 0.8
+And calling with threshold 0.1 returns at least as many detections as calling with threshold 0.8
+
+**AC-3: Overlap deduplication**
+Given an initialized engine and a scene with clustered objects
+When POST /detect is called with tracking_intersection_threshold 0.6
+Then no two detections of the same class overlap by more than 60%
+And lower threshold (0.01) produces fewer or equal detections
+
+**AC-4: Physical size filtering**
+Given an initialized engine and known GSD parameters
+When POST /detect is called with altitude, focal_length, sensor_width config
+Then no detection's computed physical size exceeds the MaxSizeM for its class
+
+**AC-5: Weather mode classes**
+Given an initialized engine with classes.json including weather variants
+When POST /detect is called
+Then all returned labels are valid entries from the 19-class x 3-mode registry
+
+## Non-Functional Requirements
+
+**Performance**
+- Single image detection completes within 30 s (including potential engine initialization)
+
+## Integration Tests
+
+| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
+|--------|------------------------|-------------|-------------------|----------------|
+| AC-1 | Engine warm, small-image | POST /detect response structure | Array of DetectionDto, coords 0.0-1.0 | Max 30s |
+| AC-2 | Engine warm, small-image | Two thresholds (0.8 vs 0.1) | Higher threshold = fewer or equal detections | Max 30s |
+| AC-3 | Engine warm, small-image | Two containment thresholds (0.6 vs 0.01) | Lower threshold = fewer or equal detections (more dedup) | Max 30s |
+| AC-4 | Engine warm, small-image, GSD config | Physical size vs MaxSizeM | No oversized detections returned | Max 30s |
+| AC-5 | Engine warm, small-image | Detection label validation | Labels match classes.json entries | Max 30s |
+
+## Constraints
+
+- Deduplication verification requires the test image to produce overlapping detections
+- Physical size filtering requires correct GSD parameters matching the fixture image
+- Weather mode verification depends on classes.json fixture content
+
+## Risks & Mitigation
+
+**Risk 1: Insufficient detections from test image**
+- *Risk*: Small test image may not produce enough detections for meaningful filtering/dedup tests
+- *Mitigation*: Use an image with known dense object content; accept >= 1 detection as valid
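
The checks behind AC-1 through AC-4 above can be sketched as pure Python helpers that a test could apply to a parsed `/detect` response. This is a minimal sketch under stated assumptions, not the project's implementation: the `Detection` class is a hypothetical mirror of `DetectionDto`, `x` and `y` are assumed to be the box centre, "overlap" is assumed to mean intersection area divided by the smaller box's area, and physical size is assumed to be the longer box side multiplied by the standard ground sample distance (GSD). All helper names are illustrative.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    """Hypothetical mirror of DetectionDto; x, y are ASSUMED to be the box centre."""
    x: float
    y: float
    width: float
    height: float
    label: str
    confidence: float


def in_unit_range(det):
    """AC-1: normalized coordinates must lie in 0.0-1.0."""
    return all(0.0 <= v <= 1.0 for v in (det.x, det.y, det.width, det.height))


def all_above_threshold(dets, threshold):
    """AC-2: every returned detection meets the probability threshold."""
    return all(d.confidence >= threshold for d in dets)


def containment_ratio(a, b):
    """Intersection area over the smaller box's area (ASSUMED overlap definition)."""
    ax1, ax2 = a.x - a.width / 2, a.x + a.width / 2
    ay1, ay2 = a.y - a.height / 2, a.y + a.height / 2
    bx1, bx2 = b.x - b.width / 2, b.x + b.width / 2
    by1, by2 = b.y - b.height / 2, b.y + b.height / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    smaller = min(a.width * a.height, b.width * b.height)
    return (inter_w * inter_h) / smaller if smaller > 0 else 0.0


def max_same_class_overlap(dets):
    """AC-3: largest containment ratio among same-label pairs (0.0 if none)."""
    ratios = [
        containment_ratio(a, b)
        for a, b in combinations(dets, 2)
        if a.label == b.label
    ]
    return max(ratios, default=0.0)


def ground_sample_distance_m(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Metres per pixel from the standard GSD formula."""
    return (altitude_m * sensor_width_mm) / (focal_length_mm * image_width_px)


def physical_size_m(det, gsd_m, image_width_px, image_height_px):
    """AC-4: longer box side in metres (ASSUMED size measure)."""
    width_px = det.width * image_width_px
    height_px = det.height * image_height_px
    return max(width_px, height_px) * gsd_m
```

In a test, AC-3 would then read as `max_same_class_overlap(dets) <= 0.6`, and AC-4 as `physical_size_m(...) <= max_size_m` for each detection, where `max_size_m` is looked up from the `classes.json` fixture for the detection's label.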