add tests

gen_tests updated solution.md updated
2026-04-23 01:46:37 +00:00 · 2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,129 @@
+# Acceptance Test: AC-2 - 60% of Photos < 20m Error
+
+## Summary
+Validate Acceptance Criterion 2: "The system should find out the GPS of centers of 60% of the photos from the flight within an error of no more than 20 meters in comparison to the real GPS."
+
+## Linked Acceptance Criteria
+**AC-2**: 60% of photos < 20m error
+
+## Preconditions
+1. ASTRAL-Next system fully operational
+2. All TensorRT models loaded (FP16 precision for maximum accuracy)
+3. High-quality satellite tiles cached (Zoom level 19, ~0.30 m/pixel)
+4. Ground truth GPS coordinates available
+5. Test dataset prepared: Test_Baseline (AD000001-AD000030)
+
+## Test Description
+Process the same baseline flight as the AC-1 test, but validate the more stringent criterion that at least 60% of images achieve an error below 20 meters. This tests the precision of LiteSAM cross-view matching.
+
+## Test Steps
+
+### Step 1: Initialize System for High Precision
+- **Action**: Start system, verify models loaded in optimal configuration
+- **Expected Result**: System ready, LiteSAM configured for maximum precision
+
+### Step 2: Create Test Flight
+- **Action**: Create flight "AC2_HighPrecision" with same parameters as AC-1
+- **Expected Result**: Flight created successfully
+
+### Step 3: Upload Test Images
+- **Action**: Upload AD000001-AD000030 (30 images)
+- **Expected Result**: All queued for processing
+
+### Step 4: Process with High-Quality Anchors
+- **Action**: System processes images, L3 provides frequent GPS anchors
+- **Expected Result**: 
+  - Processing completes
+  - Multiple GPS anchors per 10 images
+  - Factor graph well-constrained
+
+### Step 5: Retrieve Final Results
+- **Action**: GET /api/v1/flights/{id}/results with latest versions
+- **Expected Result**: Refined GPS coordinates (post-optimization)
+
+### Step 6: Calculate Errors
+- **Action**: Calculate haversine distance for each image
+- **Expected Result**: Error array with 30 values
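The distance calculation in Step 6 can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the `image name -> (lat, lon)` dictionary layout, and the spherical Earth radius are all assumptions made for the example.

```python
import math

# Assumption: spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 so rounding error cannot push asin() out of its domain.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def error_array(estimated, ground_truth):
    """Per-image error in metres.

    Both arguments are hypothetical dicts mapping image name -> (lat, lon).
    A missing estimate raises KeyError rather than being silently skipped.
    """
    return {
        name: haversine_m(*estimated[name], *ground_truth[name])
        for name in ground_truth
    }
```

At the 20 m scale that AC-2 cares about, the spherical approximation is much finer than the threshold, so a full ellipsoidal geodesic is probably unnecessary here.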
+
+### Step 7: Validate AC-2
+- **Action**: Count images with error < 20m, calculate percentage
+- **Expected Result**: **≥ 60% of images have error < 20 meters** ✓
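A hedged sketch of the Step 7 check, assuming the errors are available as a plain list of metres. The function names and the default arguments are illustrative. The comparison uses integer arithmetic so that the 18-of-30 boundary case is not affected by floating-point rounding.

```python
def count_below(errors_m, threshold_m):
    """Number of images whose error is strictly below threshold_m."""
    return sum(1 for e in errors_m if e < threshold_m)


def ac2_passes(errors_m, threshold_m=20.0, required_percent=60):
    """True when at least required_percent of images are below threshold_m."""
    if not errors_m:
        raise ValueError("no errors to evaluate")
    hits = count_below(errors_m, threshold_m)
    # hits / total >= required_percent / 100, rearranged to avoid division.
    return hits * 100 >= required_percent * len(errors_m)
```

Note that an error of exactly 20.0 m does not count as a hit, matching the strict `< 20m` wording of the criterion.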
+
+### Step 8: Analyze High-Precision Results
+- **Action**: Identify which images achieve < 20m vs 20-50m vs > 50m
+- **Expected Result**: 
+  - Category 1 (< 20m): ≥ 18 images (60%)
+  - Category 2 (20-50m): ~10 images
+  - Category 3 (> 50m): ≤ 2 images
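The three categories in Step 8 could be tallied as in the sketch below. The category labels are hypothetical, and placing an error of exactly 50 m in the middle bucket is an assumption: the step lists `20-50m` and `> 50m` without stating which side the boundary belongs to.

```python
from collections import Counter


def categorize(error_m):
    """Map one error in metres to a Step 8 category label (labels are illustrative)."""
    if error_m < 20.0:
        return "lt_20m"
    if error_m <= 50.0:  # assumption: exactly 50 m falls in the 20-50m bucket
        return "20_to_50m"
    return "gt_50m"


def category_counts(errors_m):
    """Count images per category; categories with no images report 0."""
    counts = Counter(categorize(e) for e in errors_m)
    return {label: counts[label] for label in ("lt_20m", "20_to_50m", "gt_50m")}
```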
+
+### Step 9: Generate Detailed Report
+- **Action**: Create comprehensive accuracy report
+- **Expected Result**:
+  - Percentage breakdown by error thresholds
+  - Distribution histogram
+  - Correlation between accuracy and image features
+  - Compliance matrix for AC-1 and AC-2
+
+## Success Criteria
+
+**Primary Criterion (AC-2)**:
+- ≥ 18 out of 30 images (60%) have GPS error < 20 meters
+
+**Supporting Criteria**:
+- Also meets AC-1 (≥ 80% < 50m)
+- Mean error < 30 meters
+- RMSE < 35 meters
+- No catastrophic failures (errors > 200m)
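The supporting criteria can be evaluated from the same list of errors. The sketch below is one possible reading, with hypothetical function and key names; it treats "catastrophic" as any single error above 200 m, as the list states.

```python
import math
import statistics


def supporting_metrics(errors_m):
    """Summary statistics used by the supporting criteria (all in metres or percent)."""
    if not errors_m:
        raise ValueError("no errors to evaluate")
    n = len(errors_m)
    return {
        "mean_m": statistics.fmean(errors_m),
        "rmse_m": math.sqrt(sum(e * e for e in errors_m) / n),
        "max_m": max(errors_m),
        "pct_lt_50m": 100.0 * sum(1 for e in errors_m if e < 50.0) / n,
    }


def supporting_criteria_met(errors_m):
    """AC-1 (>= 80% below 50 m), mean < 30 m, RMSE < 35 m, no error above 200 m."""
    m = supporting_metrics(errors_m)
    return (
        m["pct_lt_50m"] >= 80.0
        and m["mean_m"] < 30.0
        and m["rmse_m"] < 35.0
        and m["max_m"] <= 200.0
    )
```

Because RMSE squares each error, a couple of large outliers can fail the RMSE bound even when the mean is comfortably below 30 m, so it is worth reporting both.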
+
+## Expected Results
+
+```
+Total Images: 30
+Successfully Processed: 30 (100%)
+Images with error < 10m: 8 (26.7%)
+Images with error < 20m: 20 (66.7%)
+Images with error < 50m: 28 (93.3%)
+Images with error > 50m: 2 (6.7%)
+Mean Error: 24.5m
+Median Error: 18.2m
+RMSE: 28.3m
+90th Percentile: 42.1m
+AC-2 Status: PASS (66.7% > 60%)
+AC-1 Status: PASS (93.3% > 80%)
+```
+
+## Pass/Fail Criteria
+
+**TEST PASSES IF**:
+- ≥ 60% of images achieve error < 20m
+- Also passes AC-1 (≥ 80% < 50m)
+- System performance stable across multiple runs
+
+**TEST FAILS IF**:
+- < 60% of images achieve error < 20m
+- Fails AC-1 (would be critical failure)
+- Results not reproducible (high variance)
+
+## Error Analysis
+
+If the test fails or the result is borderline:
+
+**Investigate**:
+1. **Satellite Data Quality**: Check zoom level, age of imagery, resolution
+2. **LiteSAM Performance**: Review correspondence counts, homography quality
+3. **Factor Graph**: Check whether GPS anchors are frequent enough
+4. **Image Quality**: Verify no motion blur, good lighting conditions
+5. **Altitude Variation**: Check whether the altitude assumption (400m) is accurate
+
+**Potential Improvements**:
+- Use Tier-2 commercial satellite data (higher resolution)
+- Increase GPS anchor frequency (every 3rd image vs every 5th)
+- Tune LiteSAM confidence threshold
+- Apply per-keyframe scale adjustment in factor graph
+
+## Notes
+- AC-2 is more stringent than AC-1 (20m vs 50m)
+- Achieving 60% at 20m while maintaining 80% at 50m validates solution design
+- LiteSAM's reported RMSE of 17.86m on the UAV-VisLoc dataset supports feasibility
+- Test represents high-precision navigation requirement
+