Refactor autopilot workflows and documentation: Update .gitignore to include binary and media file types, enhance agent command references in documentation, and modify annotation class for improved accessibility. Adjust inference processing to handle batch sizes and streamline test specifications for clarity and consistency across the system.

This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-03-25 05:26:19 +02:00
parent a5fc4fe073
commit 4afa1a4eec
29 changed files with 447 additions and 362 deletions
Binary file not shown.
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1,104 @@
# Expected Results
This document maps every input data item to its quantifiable expected result.
Tests use this mapping to compare actual system output against known-correct answers.
## Coordinate System
All bounding box coordinates are **normalized to 0.0–1.0** relative to the full image/frame dimensions, matching the API response format:
| Field | Meaning |
|-------|---------|
| `center_x` | Horizontal center of bounding box (0.0 = left edge, 1.0 = right edge) |
| `center_y` | Vertical center of bounding box (0.0 = top edge, 1.0 = bottom edge) |
| `width` | Bounding box width as fraction of image width |
| `height` | Bounding box height as fraction of image height |
| `label` | Class name from `classes.json` (e.g., `ArmorVehicle`, `Car`, `Person`) |
| `confidence_min` | Minimum acceptable confidence for this detection (threshold comparison, `≥`) |
For videos, one additional field is included:
| Field | Meaning |
|-------|---------|
| `time_sec` | Timestamp in seconds from video start when this detection is visible |
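For sanity-checking a row by eye, the normalized center-format values can be converted to pixel corners. A minimal sketch (the helper name is illustrative, not part of the service API):

```python
def to_pixel_corners(center_x, center_y, width, height, img_w, img_h):
    """Convert a normalized center-format box to pixel (x1, y1, x2, y2)."""
    x1 = (center_x - width / 2) * img_w
    y1 = (center_y - height / 2) * img_h
    x2 = (center_x + width / 2) * img_w
    y2 = (center_y + height / 2) * img_h
    return x1, y1, x2, y2

# A box centered in the frame spanning half of each dimension:
# on a 1280x720 image this yields (320.0, 180.0, 960.0, 540.0)
corners = to_pixel_corners(0.5, 0.5, 0.5, 0.5, 1280, 720)
```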
## Global Tolerances
| Parameter | Tolerance | Comparison Method |
|-----------|-----------|-------------------|
| Bounding box coordinates (center_x, center_y, width, height) | ± 0.05 | `numeric_tolerance` |
| Detection count | ± 2 | `numeric_tolerance` |
| Confidence | ≥ `confidence_min` value per row | `threshold_min` |
| Label | exact match | `exact` |
| Video time_sec | ± 1.0s | `numeric_tolerance` |
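The three comparison methods reduce to simple predicates; a sketch of how a harness might apply them (function names mirror the table, the real test code may differ):

```python
def numeric_tolerance(actual, expected, tol):
    """Pass when the absolute difference is within the tolerance."""
    return abs(actual - expected) <= tol

def threshold_min(actual, minimum):
    """Pass when the actual value meets or exceeds the minimum (confidence checks)."""
    return actual >= minimum

def exact(actual, expected):
    """Pass only on strict equality (labels)."""
    return actual == expected

# e.g. a detected center_x of 0.48 against an expected 0.45 (tol 0.05) passes
assert numeric_tolerance(0.48, 0.45, 0.05)
```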
## Input → Expected Result Mapping
### Images
| # | Input File | Description | Expected Result File | Expected Detection Count | Notes |
|---|------------|-------------|---------------------|-------------------------|-------|
| 1 | `image_small.jpg` | 1280×720 aerial, contains detectable objects | `image_small_expected.csv` | ? | Primary test image for single-frame detection |
| 2 | `image_large.JPG` | 6252×4168 aerial, triggers GSD-based tiling | `image_large_expected.csv` | ? | Coordinates normalized to full image (not tile) |
| 3 | `image_dense01.jpg` | 1280×720 dense scene, many clustered objects | `image_dense01_expected.csv` | ? | Used for dedup and max-detection-cap tests |
| 4 | `image_dense02.jpg` | 1920×1080 dense scene variant | `image_dense02_expected.csv` | ? | Borderline tiling, dedup variant |
| 5 | `image_different_types.jpg` | 900×1600, varied object classes | `image_different_types_expected.csv` | ? | Must contain multiple distinct class labels |
| 6 | `image_empty_scene.jpg` | 1920×1080, no detectable objects | `image_empty_scene_expected.csv` | 0 | CSV has headers only — zero detections expected |
### Videos
| # | Input File | Description | Expected Result File | Notes |
|---|------------|-------------|---------------------|-------|
| 7 | `video_short01.mp4` | Standard test video | `video_short01_expected.csv` | Primary async/SSE/video test. List key-frame detections. |
| 8 | `video_short02.mp4` | Video variant | `video_short02_expected.csv` | Used for resilience and concurrent tests |
| 9 | `video_long03.mp4` | Long video (288MB), generates >100 SSE events | `video_long03_expected.csv` | SSE overflow test. Only key-frame samples needed. |
## How to Fill
### Images
1. Run the model on each image (or use the detection service)
2. Record every detection the model returns
3. Fill one row per detection in the CSV:
```
center_x,center_y,width,height,label,confidence_min
0.45,0.32,0.08,0.12,Car,0.25
0.71,0.55,0.06,0.09,Person,0.25
```
4. For `image_empty_scene_expected.csv` — leave only the header row (0 detections)
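A filled file can then be loaded with the standard library for comparison against API output; a minimal sketch (`load_expected` is an illustrative helper, not an existing module):

```python
import csv

def load_expected(path):
    """Read an expected-results CSV into a list of typed detection dicts."""
    with open(path, newline="") as f:
        return [
            {
                "center_x": float(r["center_x"]),
                "center_y": float(r["center_y"]),
                "width": float(r["width"]),
                "height": float(r["height"]),
                "label": r["label"],
                "confidence_min": float(r["confidence_min"]),
            }
            for r in csv.DictReader(f)
        ]
```

A header-only file such as `image_empty_scene_expected.csv` loads as an empty list, which lines up with the expected detection count of 0.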
### Videos
1. Run the model on the video (or use the detection service with `frame_period_recognition: 1`)
2. For key frames where detections appear, record the timestamp and detections
3. Fill one row per detection per timestamp:
```
time_sec,center_x,center_y,width,height,label,confidence_min
2.0,0.45,0.32,0.08,0.12,Car,0.25
2.0,0.71,0.55,0.06,0.09,Person,0.25
4.0,0.46,0.33,0.08,0.12,Car,0.25
```
4. You don't need every single frame — sample at key moments (e.g., every 24 seconds) to validate detection presence and approximate positions
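When comparing video output, the expected rows can be grouped by sampled timestamp and matched within the ±1.0 s tolerance from the table above; a sketch (helper names are illustrative):

```python
import csv
from collections import defaultdict

def load_video_expected(path):
    """Group expected detections by their sampled time_sec value."""
    by_time = defaultdict(list)
    with open(path, newline="") as f:
        for r in csv.DictReader(f):
            by_time[float(r["time_sec"])].append(r)
    return dict(by_time)

def nearest_sample(actual_sec, sampled_times, tol=1.0):
    """Return the sampled timestamp within tol of the actual time, or None."""
    best = min(sampled_times, key=lambda t: abs(t - actual_sec), default=None)
    return best if best is not None and abs(best - actual_sec) <= tol else None

# A detection observed at t=2.6s matches the 2.0s sample (within 1.0s)
assert nearest_sample(2.6, [2.0, 4.0]) == 2.0
```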
## Non-Detection Expected Results
The following test scenarios have expected results that are not per-file detections. These are defined inline in the test specs and do not need CSV files:
| Scenario | Expected Result | Comparison | Defined In |
|----------|----------------|------------|------------|
| Empty image (FT-N-01) | HTTP 400, `"Image is empty"` | exact | `blackbox-tests.md` |
| Corrupt image (FT-N-02) | HTTP 400 or 422 | exact | `blackbox-tests.md` |
| Engine unavailable (FT-N-03) | HTTP 503 or 422, not 500 | exact | `blackbox-tests.md` |
| Duplicate media_id (FT-N-04) | HTTP 409 | exact | `blackbox-tests.md` |
| Missing classes.json (FT-N-05) | Service fails or empty detections | exact | `blackbox-tests.md` |
| Health pre-init (FT-P-01) | `aiAvailability: "None"` | exact | `blackbox-tests.md` |
| Health post-init (FT-P-02) | `aiAvailability` not "None"/"Downloading" | exact | `blackbox-tests.md` |
| Async start (FT-P-08) | `{"status": "started"}`, response < 1s | exact + threshold_max | `blackbox-tests.md` |
| SSE completion (FT-P-09) | Final event: `mediaStatus: "AIProcessed"`, `percent: 100` | exact | `blackbox-tests.md` |
| Max detections (NFT-RES-LIM-03) | `len(detections) ≤ 300` | threshold_max | `resource-limit-tests.md` |
| Single image latency (NFT-PERF-01) | p95 < 5000ms (ONNX CPU) | threshold_max | `performance-tests.md` |
| Log file naming (NFT-RES-LIM-04) | `log_inference_YYYYMMDD.txt` exists | regex | `resource-limit-tests.md` |
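The log-naming check (NFT-RES-LIM-04) reduces to a filename regex; a sketch following the `log_inference_YYYYMMDD.txt` convention stated in the table (the helper name is illustrative):

```python
import re

LOG_NAME = re.compile(r"^log_inference_\d{8}\.txt$")

def is_inference_log(filename):
    """True when the filename matches log_inference_YYYYMMDD.txt."""
    return LOG_NAME.match(filename) is not None

assert is_inference_log("log_inference_20260325.txt")
assert not is_inference_log("log_inference_2026-03-25.txt")
```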
@@ -0,0 +1 @@
time_sec,center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
time_sec,center_x,center_y,width,height,label,confidence_min
@@ -0,0 +1 @@
time_sec,center_x,center_y,width,height,label,confidence_min