add tests

gen_tests updated solution.md updated
2026-04-23 05:26:37 +00:00 · 2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,148 @@
+# Integration Test: Global Place Recognition
+
+## Summary
+Validate that the Layer 2 (L2) Global Place Recognition component, which uses AnyLoc (DINOv2 + VLAD), retrieves the matching satellite tiles when UAV tracking is lost.
+
+## Component Under Test
+**Component**: Global Place Recognition (L2)
+**Location**: `gps_denied_08_global_place_recognition`
+**Dependencies**:
+- Model Manager (DINOv2 model)
+- Satellite Data Manager (pre-cached satellite tiles)
+- Coordinate Transformer
+- Faiss index database
+
+## Detailed Description
+This test validates that the Global Place Recognition component can:
+1. Extract DINOv2 features from UAV images
+2. Aggregate features using VLAD into compact descriptors
+3. Query Faiss index to retrieve top-K similar satellite tiles
+4. Handle "kidnapped robot" scenarios (zero overlap with previous frame)
+5. Work with potentially outdated satellite imagery
+
+The component recovers the UAV's location after sharp turns or tracking loss, where sequential frame-to-frame matching fails.
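
Steps 2 and 3 of the list above can be illustrated with a minimal pure-Python sketch. It is not the AnyLoc implementation: the cluster centers, the 2-D toy descriptors, the tile names, and the brute-force cosine search (standing in for the Faiss index) are all assumptions chosen to keep the example self-contained.

```python
import math

def l2_normalize(vec):
    """Scale vec to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def vlad(local_descriptors, centers):
    """Aggregate local descriptors into one VLAD vector of length K * D."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for desc in local_descriptors:
        # Hard-assign the descriptor to its nearest cluster center.
        nearest = min(
            range(len(centers)),
            key=lambda k: sum((d - c) ** 2 for d, c in zip(desc, centers[k])),
        )
        for i in range(dim):
            residuals[nearest][i] += desc[i] - centers[nearest][i]
    # Normalize each cluster's residual, then the concatenated vector.
    flat = [x for r in residuals for x in l2_normalize(r)]
    return l2_normalize(flat)

def top_k(query, index, k=5):
    """Return the k (tile_id, cosine similarity) pairs closest to query."""
    scored = [
        (tile_id, sum(q * v for q, v in zip(query, vec)))
        for tile_id, vec in index.items()
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy data (hypothetical): two cluster centers and two "satellite tiles".
centers = [[0.0, 0.0], [10.0, 10.0]]
index = {
    "tile_a": vlad([[1.0, 0.0], [11.0, 10.0]], centers),
    "tile_b": vlad([[0.0, 1.0], [10.0, 11.0]], centers),
}
query = vlad([[0.9, 0.1], [11.2, 10.0]], centers)
results = top_k(query, index, k=1)
```

Because the descriptors are unit length, the dot product in `top_k` equals cosine similarity. A real index would replace the dictionary scan with a Faiss query.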
+
+## Input Data
+
+### Test Case 1: Normal Flight Recovery
+- **UAV Image**: AD000001.jpg
+- **Ground truth GPS**: 48.275292, 37.385220
+- **Expected**: Should retrieve the satellite tile containing this location within the top-5 results
+- **Satellite reference**: AD000001_gmaps.png
+
+### Test Case 2: After Sharp Turn
+- **UAV Image**: AD000044.jpg (after skipping AD000043)
+- **Ground truth GPS**: 48.251489, 37.343079
+- **Context**: Simulates zero-overlap scenario
+- **Expected**: Should relocalize despite no sequential context
+
+### Test Case 3: Maximum Distance Jump
+- **UAV Image**: AD000048.jpg (268.6m from previous)
+- **Ground truth GPS**: 48.249114, 37.346895
+- **Context**: Largest outlier in dataset
+- **Expected**: Should retrieve correct region
+
+### Test Case 4: Middle of Route
+- **UAV Image**: AD000030.jpg
+- **Ground truth GPS**: 48.259677, 37.352165
+- **Expected**: Accurate retrieval for interior route point
+
+### Test Case 5: Route Start vs End
+- **UAV Images**: AD000001.jpg and AD000060.jpg
+- **Ground truth GPS**:
+  - AD000001: 48.275292, 37.385220
+  - AD000060: 48.256246, 37.357485
+- **Expected**: Both should retrieve distinct correct regions
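
The ground-truth points above can be collected into a small fixture and checked against the satellite coverage area before any query runs. The sketch below is one possible shape for that check; the `GroundTruth` class and `outside_coverage` helper are hypothetical names, and both bounding boxes are illustrative inputs. A southern bound of 48.25°N would exclude AD000048.jpg, whose ground-truth latitude is 48.249114, so the sketch also checks a box extended to 48.24°N.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GroundTruth:
    image: str
    lat: float
    lon: float

# Ground-truth points copied from the test cases above.
GROUND_TRUTH = [
    GroundTruth("AD000001.jpg", 48.275292, 37.385220),
    GroundTruth("AD000030.jpg", 48.259677, 37.352165),
    GroundTruth("AD000044.jpg", 48.251489, 37.343079),
    GroundTruth("AD000048.jpg", 48.249114, 37.346895),
    GroundTruth("AD000060.jpg", 48.256246, 37.357485),
]

def outside_coverage(points, lat_min, lat_max, lon_min, lon_max):
    """Return the images whose ground truth falls outside the bounding box."""
    return [
        p.image
        for p in points
        if not (lat_min <= p.lat <= lat_max and lon_min <= p.lon <= lon_max)
    ]

# A box whose southern edge is 48.25 N misses one ground-truth point.
missing_narrow = outside_coverage(GROUND_TRUTH, 48.25, 48.28, 37.34, 37.39)
# Extending the southern edge to 48.24 N covers every point.
missing_wide = outside_coverage(GROUND_TRUTH, 48.24, 48.28, 37.34, 37.39)
```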
+
+## Expected Output
+
+For each test case:
+```json
+{
+  "success": true/false,
+  "query_image": "AD000001.jpg",
+  "top_k_tiles": [
+    {
+      "tile_id": "tile_xyz",
+      "center_gps": [lat, lon],
+      "similarity_score": <float 0-1>,
+      "distance_to_gt_m": <float>
+    }
+  ],
+  "top1_correct": true/false,
+  "top5_correct": true/false,
+  "processing_time_ms": <float>
+}
+```
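
One way to produce a record with this shape is a pair of dataclasses serialized with the standard `json` module. This is a sketch only: the class names are hypothetical, and the tile ID, similarity score, distance, and timing in the example are placeholder values, not measured results.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class TileMatch:
    tile_id: str
    center_gps: tuple          # (lat, lon)
    similarity_score: float    # expected range 0-1
    distance_to_gt_m: float

@dataclass
class QueryResult:
    success: bool
    query_image: str
    top_k_tiles: list = field(default_factory=list)
    top1_correct: bool = False
    top5_correct: bool = False
    processing_time_ms: float = 0.0

    def to_json(self):
        # asdict() recurses into the nested TileMatch entries.
        return json.dumps(asdict(self), indent=2)

# Placeholder values for illustration only.
result = QueryResult(
    success=True,
    query_image="AD000001.jpg",
    top_k_tiles=[TileMatch("tile_xyz", (48.2753, 37.3852), 0.8, 12.5)],
    top1_correct=True,
    top5_correct=True,
    processing_time_ms=150.0,
)
payload = json.loads(result.to_json())
```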
+
+## Success Criteria
+
+**Per Test Case**:
+- Either top1_correct = true (best match within 200m of ground truth) or top5_correct = true (at least one of the top-5 within 200m of ground truth)
+- processing_time_ms < 200
+- similarity_score of the correct match > 0.6
+
+**Test Case Specific**:
+- **Test Case 1**: top1_correct = true (reference image available)
+- **Test Cases 2-4**: top5_correct = true
+- **Test Case 5**: Both images should have top5_correct = true
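
The 200m criteria can be evaluated with a great-circle distance between each retrieved tile center and the ground truth. The sketch below uses the haversine formula with a mean Earth radius of 6,371,000 m. Both that radius and the choice to measure from the tile center (rather than the tile footprint) are assumptions. The tile coordinates in the example are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def evaluate(tile_centers, gt, threshold_m=200.0, k=5):
    """Return (top1_correct, topk_correct, distances) for ranked tile centers."""
    distances = [haversine_m(lat, lon, gt[0], gt[1]) for lat, lon in tile_centers[:k]]
    top1 = bool(distances) and distances[0] <= threshold_m
    topk = any(d <= threshold_m for d in distances)
    return top1, topk, distances

# Ground truth of Test Case 1; the ranked tile centers are hypothetical.
gt = (48.275292, 37.385220)
ranked = [(48.259677, 37.352165), (48.276292, 37.385220)]
top1_correct, top5_correct, distances = evaluate(ranked, gt)
```

In this example the best-ranked tile is far from the ground truth while the second lies about 111 m away, so only the top-5 criterion is met.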
+
+## Maximum Expected Time
+- **Per query**: < 200ms (on RTX 3070)
+- **Per query**: < 300ms (on RTX 2060)
+- **Faiss index initialization**: < 5 seconds
+- **Total test suite**: < 10 seconds
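
These budgets can be checked with a wall-clock timer around each query. The sketch below uses `time.perf_counter`; the helper names are hypothetical. If the model runs on a GPU under PyTorch, which this document does not state, the GPU would need to be synchronized (for example with `torch.cuda.synchronize()`) before the clock is read, otherwise asynchronous kernels may make the query look faster than it is.

```python
import statistics
import time

def time_query_ms(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

def check_latency(samples_ms, per_query_budget_ms=200.0):
    """Summarize latency samples against a per-query budget."""
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "max_ms": max(samples_ms),
        "within_budget": all(t < per_query_budget_ms for t in samples_ms),
    }

# Stand-in workload; a real test would time the retrieval call instead.
value, elapsed = time_query_ms(sum, [1, 2, 3])
summary_ok = check_latency([120.0, 180.0])
summary_slow = check_latency([120.0, 250.0])
```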
+
+## Test Execution Steps
+
+1. **Setup Phase**:
+   a. Initialize Satellite Data Manager
+   b. Load or create Faiss index with satellite tile descriptors
+   c. Verify satellite coverage for test area (48.24-48.28°N, 37.34-37.39°E)
+   d. Load DINOv2 model via Model Manager
+
+2. **Execution Phase**:
+   For each test case:
+   a. Load UAV image from test data
+   b. Extract DINOv2 features
+   c. Aggregate to VLAD descriptor
+   d. Query Faiss index for top-5 matches
+   e. Calculate distance from retrieved tiles to ground truth GPS
+   f. Record timing and accuracy metrics
+
+3. **Validation Phase**:
+   a. Verify top-K accuracy for each test case
+   b. Check processing time constraints
+   c. Validate similarity scores are reasonable
+   d. Ensure no duplicate tiles in top-K results
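
Validation steps 3c and 3d can be expressed as a small check over the ranked matches. In this sketch "reasonable" similarity scores are taken to mean values in [0, 1] that do not increase with rank. That reading is an assumption, as are the `validate_top_k` name and the `(tile_id, score)` tuple layout.

```python
def validate_top_k(matches, k=5):
    """Check a ranked list of (tile_id, similarity_score) pairs."""
    tile_ids = [tile_id for tile_id, _ in matches]
    scores = [score for _, score in matches]
    return {
        "count_ok": len(matches) <= k,
        "no_duplicates": len(tile_ids) == len(set(tile_ids)),
        "scores_in_range": all(0.0 <= s <= 1.0 for s in scores),
        "scores_descending": all(a >= b for a, b in zip(scores, scores[1:])),
    }

# Hypothetical retrieval outputs.
good = validate_top_k([("tile_a", 0.91), ("tile_b", 0.74), ("tile_c", 0.62)])
bad = validate_top_k([("tile_a", 0.91), ("tile_a", 0.95)])
```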
+
+## Pass/Fail Criteria
+
+**Overall Test Passes If**:
+- At least 4 out of 5 test cases meet success criteria (80% pass rate)
+- Average processing time < 200ms
+- No crashes or exceptions
+- Top-5 recall rate > 85%
+
+**Test Fails If**:
+- More than 1 test case fails
+- Any processing time exceeds 500ms
+- Faiss index fails to load or query
+- Memory usage exceeds 8GB
+- Top-5 recall rate < 70%
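
The two lists above can be combined into a single verdict function. They do not cover every outcome: a top-5 recall of 80%, for example, satisfies neither the pass condition (> 85%) nor the fail condition (< 70%). The sketch below therefore returns a third state, "inconclusive", for such runs. That third state and the function signature are assumptions, not part of the specification.

```python
def overall_verdict(case_passed, times_ms, top5_recall,
                    crashed=False, index_ok=True, peak_memory_gb=0.0):
    """Return 'pass', 'fail', or 'inconclusive' for one test-suite run."""
    failed_cases = case_passed.count(False)
    mean_ms = sum(times_ms) / len(times_ms)

    # Any single fail condition fails the run.
    if (failed_cases > 1
            or max(times_ms) > 500.0
            or not index_ok
            or peak_memory_gb > 8.0
            or top5_recall < 0.70):
        return "fail"

    # All pass conditions must hold together.
    if (sum(case_passed) >= 4
            and mean_ms < 200.0
            and not crashed
            and top5_recall > 0.85):
        return "pass"

    return "inconclusive"

# Hypothetical runs.
v_pass = overall_verdict([True] * 5, [150.0] * 5, 1.0)
v_gap = overall_verdict([True, True, True, True, False], [150.0] * 5, 0.80)
v_fail = overall_verdict([True, True, True, False, False], [150.0] * 5, 0.60)
v_slow = overall_verdict([True] * 5, [150.0, 150.0, 150.0, 150.0, 600.0], 1.0)
```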
+
+## Additional Validation
+
+**Robustness Tests**:
+- Query with rotated image (90°, 180°, 270°) - should still retrieve correct tile
+- Query with brightness-adjusted image (±30%) - should maintain similarity score > 0.5
+- Sequential queries should maintain consistent results (deterministic)
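
The rotated and brightness-adjusted query variants can be generated from a single source image. The sketch below works on a grayscale image stored as a list of rows of 8-bit values so that it needs no imaging library. That representation, the clockwise rotation direction, and the helper names are assumptions; a real test would more likely use an image library on the JPEG files.

```python
def rotate90(image):
    """Rotate a row-major 2-D image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def adjust_brightness(image, factor):
    """Scale 8-bit pixel values by factor, clamping to [0, 255]."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]

def make_variants(image):
    """Build the rotation and brightness variants used for robustness queries."""
    out = {}
    current = image
    for angle in (90, 180, 270):
        current = rotate90(current)
        out[f"rot{angle}"] = current
    out["bright+30"] = adjust_brightness(image, 1.3)
    out["bright-30"] = adjust_brightness(image, 0.7)
    return out

# Tiny hypothetical 2x2 image.
image = [[10, 200], [30, 40]]
variants = make_variants(image)
```

Each variant would then be passed through the same query path as the original image, and the retrieved tiles and similarity scores compared against the unmodified query.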
+
+**Performance Metrics to Report**:
+- Top-1 Recall@200m: percentage where best match is within 200m
+- Top-5 Recall@200m: percentage where any of top-5 within 200m
+- Mean Average Precision (mAP)
+- Average query latency
+- Memory footprint of Faiss index
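
The recall and precision metrics can be computed from the per-query lists of ranked distances to ground truth. In this sketch a retrieved tile counts as relevant when it lies within 200m, and average precision is normalized by the number of relevant tiles actually retrieved rather than by all relevant tiles in the index. That normalization is an assumption, and the distances in the example are made up.

```python
def recall_at_k(distances_per_query, k, threshold_m=200.0):
    """Fraction of queries with at least one of the top-k tiles within threshold."""
    hits = sum(
        any(d <= threshold_m for d in dists[:k]) for dists in distances_per_query
    )
    return hits / len(distances_per_query)

def average_precision(distances, threshold_m=200.0):
    """Mean of precision@rank over the ranks at which a relevant tile appears."""
    hits, precisions = 0, []
    for rank, d in enumerate(distances, start=1):
        if d <= threshold_m:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(distances_per_query, threshold_m=200.0):
    """Average of average_precision over all queries."""
    total = sum(average_precision(d, threshold_m) for d in distances_per_query)
    return total / len(distances_per_query)

# Hypothetical ranked distances (meters) for three queries.
queries = [[50.0, 900.0, 1200.0], [800.0, 150.0, 120.0], [900.0, 950.0, 1000.0]]
r1 = recall_at_k(queries, k=1)
r5 = recall_at_k(queries, k=5)
m_ap = mean_average_precision(queries)
```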
+