Integration Test: Global Place Recognition
Summary
Validate the Layer 2 (L2) Global Place Recognition component, which uses AnyLoc (DINOv2 + VLAD) to retrieve matching satellite tiles when UAV tracking is lost.
Component Under Test
Component: Global Place Recognition (L2)
Location: gps_denied_08_global_place_recognition
Dependencies:
- Model Manager (DINOv2 model)
- Satellite Data Manager (pre-cached satellite tiles)
- Coordinate Transformer
- Faiss index database
Detailed Description
This test validates that the Global Place Recognition component can:
- Extract DINOv2 features from UAV images
- Aggregate features using VLAD into compact descriptors
- Query Faiss index to retrieve top-K similar satellite tiles
- Handle "kidnapped robot" scenarios (zero overlap with previous frame)
- Work with potentially outdated satellite imagery
The component solves the critical problem of recovering location after sharp turns or tracking loss where sequential matching fails.
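The aggregation step listed above can be illustrated with a minimal VLAD sketch. It uses plain Python lists as stand-ins for DINOv2 patch features and for a learned visual vocabulary. The function name `vlad_aggregate`, the toy centroids, and the choice of a single global L2 normalization are assumptions made for illustration, not the component's actual implementation.

```python
import math


def vlad_aggregate(features, centroids):
    """Aggregate local features into one VLAD descriptor.

    features:  list of D-dim local descriptors (stand-ins for DINOv2 patch features)
    centroids: list of K D-dim cluster centers (hypothetical visual vocabulary)
    returns:   flat list of length K * D, L2-normalized
    """
    k, d = len(centroids), len(centroids[0])
    residual_sums = [[0.0] * d for _ in range(k)]
    for f in features:
        # Hard-assign the feature to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda c: sum((f[i] - centroids[c][i]) ** 2 for i in range(d)),
        )
        # Accumulate the residual (feature minus centroid) for that cluster.
        for i in range(d):
            residual_sums[nearest][i] += f[i] - centroids[nearest][i]
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy data: two clusters in 2-D, three local features.
centroids = [[0.0, 0.0], [10.0, 10.0]]
features = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
descriptor = vlad_aggregate(features, centroids)
print(len(descriptor))  # 4 (K * D)
```

The descriptor length depends only on the vocabulary size and feature dimension, not on the number of local features, which is what lets every tile and query be stored and compared as a fixed-size vector in the Faiss index.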
Input Data
Test Case 1: Normal Flight Recovery
- UAV Image: AD000001.jpg
- Ground truth GPS: 48.275292, 37.385220
- Expected: Should retrieve satellite tile containing this location in top-5
- Satellite reference: AD000001_gmaps.png
Test Case 2: After Sharp Turn
- UAV Image: AD000044.jpg (after skipping AD000043)
- Ground truth GPS: 48.251489, 37.343079
- Context: Simulates zero-overlap scenario
- Expected: Should relocalize despite no sequential context
Test Case 3: Maximum Distance Jump
- UAV Image: AD000048.jpg (268.6 m from the previous image)
- Ground truth GPS: 48.249114, 37.346895
- Context: Largest outlier in dataset
- Expected: Should retrieve correct region
Test Case 4: Middle of Route
- UAV Image: AD000030.jpg
- Ground truth GPS: 48.259677, 37.352165
- Expected: Accurate retrieval for interior route point
Test Case 5: Route Start vs End
- UAV Images: AD000001.jpg and AD000060.jpg
- Ground truth GPS:
- AD000001: 48.275292, 37.385220
- AD000060: 48.256246, 37.357485
- Expected: Both should retrieve distinct correct regions
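One way to keep the cases above in a single place is to encode them as data that a parametrized test runner can iterate over. The sketch below is only an illustration: the class names `Query` and `PlaceRecognitionCase` and the `require_top1` flag are hypothetical, while the image names and coordinates are copied from the list above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Query:
    image: str
    gt_lat: float
    gt_lon: float


@dataclass(frozen=True)
class PlaceRecognitionCase:
    name: str
    queries: Tuple[Query, ...]
    require_top1: bool = False  # hypothetical flag: case demands a top-1 hit


TEST_CASES = [
    PlaceRecognitionCase(
        "Normal Flight Recovery",
        (Query("AD000001.jpg", 48.275292, 37.385220),),
        require_top1=True,
    ),
    PlaceRecognitionCase(
        "After Sharp Turn",
        (Query("AD000044.jpg", 48.251489, 37.343079),),
    ),
    PlaceRecognitionCase(
        "Maximum Distance Jump",
        (Query("AD000048.jpg", 48.249114, 37.346895),),
    ),
    PlaceRecognitionCase(
        "Middle of Route",
        (Query("AD000030.jpg", 48.259677, 37.352165),),
    ),
    PlaceRecognitionCase(
        "Route Start vs End",
        (
            Query("AD000001.jpg", 48.275292, 37.385220),
            Query("AD000060.jpg", 48.256246, 37.357485),
        ),
    ),
]

total_queries = sum(len(case.queries) for case in TEST_CASES)
print(len(TEST_CASES), total_queries)  # 5 6
```

Note that five test cases yield six queries, because Test Case 5 uses two images.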
Expected Output
For each test case:
{
  "success": true/false,
  "query_image": "AD000001.jpg",
  "top_k_tiles": [
    {
      "tile_id": "tile_xyz",
      "center_gps": [lat, lon],
      "similarity_score": <float 0-1>,
      "distance_to_gt_m": <float>
    }
  ],
  "top1_correct": true/false,
  "top5_correct": true/false,
  "processing_time_ms": <float>
}
Success Criteria
Per Test Case:
- top1_correct = true (best match within 200 m of ground truth), or top5_correct = true (at least one of the top-5 within 200 m of ground truth)
- processing_time_ms < 200 ms
- similarity_score of the correct match > 0.6
Test Case Specific:
- Test Case 1: top1_correct = true (reference image available)
- Test Cases 2-4: top5_correct = true
- Test Case 5: Both images should have top5_correct = true
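The 200 m criteria can be checked with a great-circle distance between each retrieved tile center and the ground-truth GPS fix. The following is a sketch under stated assumptions: the helper names `haversine_m` and `evaluate_query` are hypothetical, the Earth is treated as a sphere of radius 6,371 km, and "within 200 m" is read as less than or equal to 200 m. The two tile centers in the example are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def evaluate_query(top_k_centers, gt, radius_m=200.0):
    """Compute distance_to_gt_m per tile plus the top1/top5 flags.

    top_k_centers: ranked list of (lat, lon) tile centers, best match first
    gt:            (lat, lon) ground-truth position
    """
    distances = [haversine_m(lat, lon, gt[0], gt[1]) for lat, lon in top_k_centers]
    return {
        "distances_m": distances,
        "top1_correct": bool(distances) and distances[0] <= radius_m,
        "top5_correct": any(d <= radius_m for d in distances[:5]),
    }


# Ground truth of AD000001.jpg; the tile centers below are invented for the example.
gt = (48.275292, 37.385220)
tiles = [(48.280000, 37.385220), (48.275500, 37.385300)]
result = evaluate_query(tiles, gt)
print(result["top1_correct"], result["top5_correct"])  # False True
```

In this example the best-ranked tile is roughly 520 m away, so only the top-5 criterion is met.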
Maximum Expected Time
- Per query: < 200ms (on RTX 3070)
- Per query: < 300ms (on RTX 2060)
- Faiss index initialization: < 5 seconds
- Total test suite: < 10 seconds
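A small timing helper is enough to compare measured latency against these budgets. This sketch assumes wall-clock timing with `time.perf_counter`; the names `timed_ms`, `within_budget`, and `LATENCY_BUDGET_MS` are hypothetical. When the timed function launches GPU work asynchronously, the caller would also need to synchronize the device before the clock is read, otherwise the measurement may cover only the kernel launch.

```python
import time

# Per-query budgets from the list above, keyed by GPU model.
LATENCY_BUDGET_MS = {"RTX 3070": 200.0, "RTX 2060": 300.0}


def timed_ms(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def within_budget(elapsed_ms, gpu):
    """True if the measured latency is strictly below the budget for this GPU."""
    return elapsed_ms < LATENCY_BUDGET_MS[gpu]


# Stand-in workload; a real test would time the place recognition query instead.
value, elapsed = timed_ms(sum, [1, 2, 3])
print(value, within_budget(elapsed, "RTX 3070"))
```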
Test Execution Steps
- Setup Phase:
  a. Initialize Satellite Data Manager
  b. Load or create Faiss index with satellite tile descriptors
  c. Verify satellite coverage for test area (48.25-48.28°N, 37.34-37.39°E)
  d. Load DINOv2 model via Model Manager
- Execution Phase (for each test case):
  a. Load UAV image from test data
  b. Extract DINOv2 features
  c. Aggregate to VLAD descriptor
  d. Query Faiss index for top-5 matches
  e. Calculate distance from retrieved tiles to ground truth GPS
  f. Record timing and accuracy metrics
- Validation Phase:
  a. Verify top-K accuracy for each test case
  b. Check processing time constraints
  c. Validate similarity scores are reasonable
  d. Ensure no duplicate tiles in top-K results
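The structural checks of the validation phase can be sketched as a function over one result record in the Expected Output shape. The function name `validate_result` is hypothetical. Reading "reasonable" similarity scores as values in [0, 1], and expecting tiles sorted by descending similarity, are assumptions rather than requirements stated in this document.

```python
def validate_result(result, k=5):
    """Return a list of problems found in one result record (empty if none)."""
    problems = []
    tiles = result["top_k_tiles"]
    if len(tiles) > k:
        problems.append(f"more than {k} tiles returned")
    ids = [t["tile_id"] for t in tiles]
    if len(ids) != len(set(ids)):
        problems.append("duplicate tile_id in top-K")
    scores = [t["similarity_score"] for t in tiles]
    if any(not (0.0 <= s <= 1.0) for s in scores):
        problems.append("similarity_score outside [0, 1]")
    # Assumption: results are ranked best-first by similarity.
    if scores != sorted(scores, reverse=True):
        problems.append("tiles not sorted by descending similarity")
    return problems


example = {
    "top_k_tiles": [
        {"tile_id": "tile_a", "similarity_score": 0.82},
        {"tile_id": "tile_b", "similarity_score": 0.75},
        {"tile_id": "tile_a", "similarity_score": 0.61},
    ]
}
print(validate_result(example))  # ['duplicate tile_id in top-K']
```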
Pass/Fail Criteria
Overall Test Passes If:
- At least 4 out of 5 test cases meet success criteria (80% pass rate)
- Average processing time < 200ms
- No crashes or exceptions
- Top-5 recall rate > 85%
Test Fails If:
- More than 1 test case fails
- Any processing time exceeds 500ms
- Faiss index fails to load or query
- Memory usage exceeds 8GB
- Top-5 recall rate < 70%
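The two lists above can be combined into a single verdict function. The sketch below is an assumption about how to combine them: any fail condition yields "fail", all pass conditions yield "pass", and anything in between (for example a top-5 recall of 80%) is reported as "inconclusive", a label this document does not define. The function and parameter names are hypothetical.

```python
from statistics import mean


def overall_verdict(cases_passed, latencies_ms, top5_recall, peak_memory_gb,
                    index_ok=True, had_exception=False, total_cases=5):
    """Apply the fail conditions first, then the pass conditions."""
    failed_cases = total_cases - cases_passed
    worst_ms = max(latencies_ms, default=0.0)
    if (failed_cases > 1 or worst_ms > 500 or not index_ok
            or peak_memory_gb > 8 or top5_recall < 0.70):
        return "fail"
    avg_ms = mean(latencies_ms) if latencies_ms else 0.0
    if (failed_cases <= 1 and avg_ms < 200
            and not had_exception and top5_recall > 0.85):
        return "pass"
    # Neither list applies fully, e.g. recall between 70% and 85%.
    return "inconclusive"


latencies = [120.0, 150.0, 180.0, 90.0, 110.0, 130.0]
print(overall_verdict(5, latencies, top5_recall=1.0, peak_memory_gb=3.2))  # pass
print(overall_verdict(4, latencies, top5_recall=0.8, peak_memory_gb=3.2))  # inconclusive
```

With only six queries, a single top-5 miss gives a recall of about 83%, which is below the 85% pass threshold even though four of five test cases succeed.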
Additional Validation
Robustness Tests:
- Query with a rotated image (90°, 180°, 270°): should still retrieve the correct tile
- Query with a brightness-adjusted image (±30%): should maintain similarity score > 0.5
- Sequential queries should maintain consistent results (deterministic)
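The image perturbations for these robustness checks can be generated as follows. This is a toy sketch on a grayscale image stored as nested lists of 0-255 values. A real test would more likely use an image library, and the helper names `rotate90_cw` and `adjust_brightness` are hypothetical.

```python
def rotate90_cw(img):
    """Rotate a 2-D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def adjust_brightness(img, factor):
    """Scale every pixel by factor and clamp to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


image = [[1, 2], [3, 4]]

# Build the 90, 180 and 270 degree variants by repeated quarter turns.
rotations = {}
current = image
for angle in (90, 180, 270):
    current = rotate90_cw(current)
    rotations[angle] = current

brighter = adjust_brightness([[100, 200]], 1.3)  # +30%
darker = adjust_brightness([[100, 200]], 0.7)    # -30%

print(rotations[90])   # [[3, 1], [4, 2]]
print(rotations[180])  # [[4, 3], [2, 1]]
```

Each variant would then be passed through the same query path as the unmodified image and compared against the same ground truth. Determinism can be checked by running an identical query twice and comparing the returned tile IDs.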
Performance Metrics to Report:
- Top-1 Recall@200m: percentage of queries where the best match is within 200 m
- Top-5 Recall@200m: percentage of queries where any of the top-5 matches is within 200 m
- Mean Average Precision (MAP)
- Average query latency
- Memory footprint of Faiss index
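The recall and precision metrics above can be computed from per-query hit lists, where each entry records whether the tile at that rank lies within 200 m of ground truth. This is a sketch under one stated assumption: average precision is normalized by the number of relevant tiles actually retrieved, since the total number of relevant tiles in the index is not specified here. The function names and the example hit lists are hypothetical.

```python
def recall_at_k(ranked_hits, k):
    """Fraction of queries with at least one hit among the first k results."""
    return sum(any(hits[:k]) for hits in ranked_hits) / len(ranked_hits)


def average_precision(hits):
    """Mean of precision@rank over the ranks where a hit occurs (0.0 if no hits)."""
    found, total = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            found += 1
            total += found / rank
    return total / found if found else 0.0


def mean_average_precision(ranked_hits):
    return sum(average_precision(h) for h in ranked_hits) / len(ranked_hits)


# Three invented queries: hit at rank 1, hit at rank 3, no hit in the top-5.
hits = [
    [True, False, False, False, False],
    [False, False, True, False, False],
    [False, False, False, False, False],
]
print(round(recall_at_k(hits, 1), 3))           # 0.333
print(round(recall_at_k(hits, 5), 3))           # 0.667
print(round(mean_average_precision(hits), 3))   # 0.444
```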