gps-denied-onboard/docs/03_tests/02_global_place_recognition_integration_spec.md
Oleksandr Bezdieniezhnykh 4f8c18a066 add tests
gen_tests updated
solution.md updated
2025-11-24 22:57:46 +02:00

Integration Test: Global Place Recognition

Summary

Validate the Layer 2 (L2) Global Place Recognition component using AnyLoc (DINOv2 + VLAD) for retrieving matching satellite tiles when UAV tracking is lost.

Component Under Test

Component: Global Place Recognition (L2)
Location: gps_denied_08_global_place_recognition
Dependencies:

  • Model Manager (DINOv2 model)
  • Satellite Data Manager (pre-cached satellite tiles)
  • Coordinate Transformer
  • Faiss index database

Detailed Description

This test validates that the Global Place Recognition component can:

  1. Extract DINOv2 features from UAV images
  2. Aggregate features using VLAD into compact descriptors
  3. Query Faiss index to retrieve top-K similar satellite tiles
  4. Handle "kidnapped robot" scenarios (zero overlap with previous frame)
  5. Work with potentially outdated satellite imagery

The component solves the critical problem of recovering location after sharp turns or tracking loss where sequential matching fails.
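
The aggregation in step 2 can be illustrated with a minimal NumPy sketch of VLAD. This is not the component's code: the function name `vlad_descriptor`, the feature dimension, and the cluster count are assumptions, and random arrays stand in for DINOv2 patch features and for a learned cluster vocabulary. Hard assignment with per-cluster normalization is one common VLAD variant; the component may use a different one.

```python
import numpy as np

def vlad_descriptor(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Aggregate local features (N, D) into one VLAD vector of length K * D."""
    # Squared distance from every feature to every cluster center: (N, K).
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assignment = d2.argmin(axis=1)  # hard assignment to the nearest center

    k, d = centers.shape
    vlad = np.zeros((k, d), dtype=np.float64)
    for c in range(k):
        members = features[assignment == c]
        if len(members):
            # Sum of residuals between the members and their center.
            vlad[c] = (members - centers[c]).sum(axis=0)

    # Normalize each cluster block, then the flattened vector.
    block_norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(block_norms, 1e-12)
    flat = vlad.ravel()
    return flat / max(np.linalg.norm(flat), 1e-12)

# Stand-in data (assumption): 256 patch features of dimension 32, 8 clusters.
rng = np.random.default_rng(0)
patch_features = rng.normal(size=(256, 32))
cluster_centers = rng.normal(size=(8, 32))
descriptor = vlad_descriptor(patch_features, cluster_centers)
```

Because the result is unit-length, an inner product between two such descriptors is their cosine similarity, which is convenient for an index that ranks by inner product.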

Input Data

Test Case 1: Normal Flight Recovery

  • UAV Image: AD000001.jpg
  • Ground truth GPS: 48.275292, 37.385220
  • Expected: Should retrieve satellite tile containing this location in top-5
  • Satellite reference: AD000001_gmaps.png

Test Case 2: After Sharp Turn

  • UAV Image: AD000044.jpg (after skipping AD000043)
  • Ground truth GPS: 48.251489, 37.343079
  • Context: Simulates zero-overlap scenario
  • Expected: Should relocalize despite no sequential context

Test Case 3: Maximum Distance Jump

  • UAV Image: AD000048.jpg (268.6m from the previous image)
  • Ground truth GPS: 48.249114, 37.346895
  • Context: Largest outlier in dataset
  • Expected: Should retrieve correct region

Test Case 4: Middle of Route

  • UAV Image: AD000030.jpg
  • Ground truth GPS: 48.259677, 37.352165
  • Expected: Accurate retrieval for interior route point

Test Case 5: Route Start vs End

  • UAV Images: AD000001.jpg and AD000060.jpg
  • Ground truth GPS:
    • AD000001: 48.275292, 37.385220
    • AD000060: 48.256246, 37.357485
  • Expected: Both should retrieve distinct correct regions

Expected Output

For each test case:

{
  "success": true/false,
  "query_image": "AD000001.jpg",
  "top_k_tiles": [
    {
      "tile_id": "tile_xyz",
      "center_gps": [lat, lon],
      "similarity_score": <float 0-1>,
      "distance_to_gt_m": <float>
    }
  ],
  "top1_correct": true/false,
  "top5_correct": true/false,
  "processing_time_ms": <float>
}
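
One way to produce a record of this shape is sketched below with standard-library dataclasses. The class names `TileMatch` and `QueryResult` are hypothetical, and every value in the example is a placeholder, not a measured result.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class TileMatch:
    tile_id: str
    center_gps: tuple          # (lat, lon)
    similarity_score: float    # expected range 0-1
    distance_to_gt_m: float

@dataclass
class QueryResult:
    success: bool
    query_image: str
    top_k_tiles: list
    top1_correct: bool
    top5_correct: bool
    processing_time_ms: float

# Placeholder values for illustration only.
example = QueryResult(
    success=True,
    query_image="AD000001.jpg",
    top_k_tiles=[TileMatch("tile_xyz", (48.2753, 37.3852), 0.82, 35.0)],
    top1_correct=True,
    top5_correct=True,
    processing_time_ms=120.0,
)

# asdict() recurses into the nested TileMatch entries.
payload = json.dumps(asdict(example), indent=2)
```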

Success Criteria

Per Test Case:

  • top1_correct = true (best match within 200m of ground truth) or top5_correct = true (at least one of the top-5 within 200m of ground truth)
  • processing_time_ms < 200 (RTX 3070 budget; see Maximum Expected Time for other hardware)
  • similarity_score of the correct match > 0.6

Test Case Specific:

  • Test Case 1: top1_correct = true (reference image available)
  • Test Cases 2-4: top5_correct = true
  • Test Case 5: Both images should have top5_correct = true
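
The 200m checks above can be sketched with a haversine distance between the ground truth and each retrieved tile center. The function names and the two tile centers are hypothetical; the ground truth is the Test Case 1 coordinate, and measuring to the tile center (rather than the tile footprint) is an assumption.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def evaluate_query(ground_truth, ranked_tile_centers, threshold_m=200.0):
    """Return distances and top-1 / top-5 correctness for one ranked result list."""
    distances = [haversine_m(*ground_truth, *center) for center in ranked_tile_centers]
    return {
        "distances_m": distances,
        "top1_correct": bool(distances) and distances[0] <= threshold_m,
        "top5_correct": any(d <= threshold_m for d in distances[:5]),
    }

ground_truth = (48.275292, 37.385220)   # Test Case 1
ranked_centers = [
    (48.285292, 37.385220),  # hypothetical tile about 1.1 km north
    (48.276192, 37.385220),  # hypothetical tile about 100 m north
]
outcome = evaluate_query(ground_truth, ranked_centers)
```

In this made-up ranking the best match is too far away but the second one is close enough, so only the top-5 criterion holds.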

Maximum Expected Time

  • Per query: < 200ms (on RTX 3070)
  • Per query: < 300ms (on RTX 2060)
  • Faiss index initialization: < 5 seconds
  • Total test suite: < 10 seconds
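
A small timing helper along these lines could enforce the per-query budgets. The names `QUERY_BUDGET_MS`, `timed_call`, and `within_budget` are hypothetical, and `sum` merely stands in for the real query function. When the timed call runs on a GPU, it would need to block until the result is actually available, otherwise the measurement only covers the kernel launch.

```python
import time

# Per-query budgets from the list above, keyed by GPU name.
QUERY_BUDGET_MS = {"RTX 3070": 200.0, "RTX 2060": 300.0}

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

def within_budget(elapsed_ms, gpu):
    """True when the measured latency is strictly below the budget for that GPU."""
    return elapsed_ms < QUERY_BUDGET_MS[gpu]

# Stand-in workload (assumption): replace with the actual query call.
result, elapsed_ms = timed_call(sum, range(1000))
```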

Test Execution Steps

  1. Setup Phase:
     a. Initialize Satellite Data Manager
     b. Load or create Faiss index with satellite tile descriptors
     c. Verify satellite coverage for test area (48.25-48.28°N, 37.34-37.39°E)
     d. Load DINOv2 model via Model Manager

  2. Execution Phase: For each test case:
     a. Load UAV image from test data
     b. Extract DINOv2 features
     c. Aggregate to VLAD descriptor
     d. Query Faiss index for top-5 matches
     e. Calculate distance from retrieved tiles to ground truth GPS
     f. Record timing and accuracy metrics

  3. Validation Phase:
     a. Verify top-K accuracy for each test case
     b. Check processing time constraints
     c. Validate similarity scores are reasonable
     d. Ensure no duplicate tiles in top-K results
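
The top-5 query and the duplicate check can be illustrated without the real index. The sketch below uses a brute-force NumPy inner-product search as a stand-in for Faiss (an exact inner-product index such as `faiss.IndexFlatIP` would return the same ranking); the helper name `top_k_tiles`, the tile IDs, and the random descriptors are assumptions.

```python
import numpy as np

def top_k_tiles(query, tile_descriptors, tile_ids, k=5):
    """Rank tiles by inner product with the query and return the best k."""
    scores = tile_descriptors @ query
    order = np.argsort(-scores)[:k]          # highest score first
    return [(tile_ids[i], float(scores[i])) for i in order]

def unit_rows(matrix):
    """L2-normalize each row so inner products equal cosine similarity."""
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

# Stand-in database (assumption): 20 tiles with 64-dimensional descriptors.
rng = np.random.default_rng(1)
tile_descriptors = unit_rows(rng.normal(size=(20, 64)))
tile_ids = [f"tile_{i}" for i in range(20)]

# Simulated UAV query: a noisy view of tile 7.
noisy = tile_descriptors[7] + 0.05 * rng.normal(size=64)
query = noisy / np.linalg.norm(noisy)

matches = top_k_tiles(query, tile_descriptors, tile_ids, k=5)
retrieved_ids = [tile_id for tile_id, _ in matches]
has_duplicates = len(set(retrieved_ids)) != len(retrieved_ids)
```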

Pass/Fail Criteria

Overall Test Passes If:

  • At least 4 out of 5 test cases meet success criteria (80% pass rate)
  • Average processing time < 200ms
  • No crashes or exceptions
  • Top-5 recall rate > 85%

Test Fails If:

  • More than 1 test case fails
  • Any processing time exceeds 500ms
  • Faiss index fails to load or query
  • Memory usage exceeds 8GB
  • Top-5 recall rate < 70%
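
The two lists above can be combined into a single verdict function, sketched below. The name `overall_verdict`, its parameters, and the `crashed` flag (covering exceptions and Faiss load or query failures) are assumptions. Results that satisfy neither list are reported as "undetermined" rather than forced into pass or fail.

```python
def overall_verdict(case_passed, times_ms, top5_hits, peak_memory_gb, crashed=False):
    """Classify a test run as 'pass', 'fail', or 'undetermined'."""
    n = len(case_passed)
    failed_cases = n - sum(case_passed)
    recall = sum(top5_hits) / n
    average_ms = sum(times_ms) / n

    # Any single fail condition is decisive.
    if (crashed or failed_cases > 1 or max(times_ms) > 500.0
            or peak_memory_gb > 8.0 or recall < 0.70):
        return "fail"
    # All pass conditions must hold together.
    if sum(case_passed) >= 4 and average_ms < 200.0 and recall > 0.85:
        return "pass"
    return "undetermined"

# Illustrative inputs, not measured data.
clean_run = overall_verdict([True] * 5, [150.0] * 5, [True] * 5, 4.0)
slow_run = overall_verdict([True] * 5, [150.0, 150.0, 150.0, 150.0, 600.0], [True] * 5, 4.0)
one_miss = overall_verdict([True, True, True, True, False], [150.0] * 5,
                           [True, True, True, True, False], 4.0)
```

With five queries, top-5 recall can only be a multiple of 20%, so a single miss (80%) falls between the 85% pass threshold and the 70% fail threshold; the sketch returns "undetermined" for that case.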

Additional Validation

Robustness Tests:

  • Query with a rotated image (90°, 180°, 270°): should still retrieve the correct tile
  • Query with a brightness-adjusted image (+/-30%): should maintain similarity score > 0.5
  • Sequential queries should maintain consistent results (deterministic)
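
The perturbed query images could be generated as in the following sketch. The helper name `robustness_variants` and the tiny synthetic image are assumptions, and scaling pixel values by 1.3 and 0.7 is only one possible reading of "+/-30%" brightness.

```python
import numpy as np

def robustness_variants(image: np.ndarray) -> dict:
    """Build rotated and brightness-scaled copies of an (H, W, 3) uint8 image."""
    def scale_brightness(factor):
        scaled = np.rint(image.astype(np.float64) * factor)
        return np.clip(scaled, 0, 255).astype(np.uint8)

    return {
        "rot90": np.rot90(image, k=1),    # rotates the first two axes
        "rot180": np.rot90(image, k=2),
        "rot270": np.rot90(image, k=3),
        "bright+30": scale_brightness(1.3),
        "bright-30": scale_brightness(0.7),
    }

# Tiny synthetic image (assumption) standing in for a UAV frame.
image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
variants = robustness_variants(image)
```

Each variant would then be run through the same query path as the original image, and repeating a query twice on identical input covers the determinism check.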

Performance Metrics to Report:

  • Top-1 Recall@200m: percentage where best match is within 200m
  • Top-5 Recall@200m: percentage where any of top-5 within 200m
  • Mean Average Precision (mAP)
  • Average query latency
  • Memory footprint of Faiss index
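
The recall and precision metrics can be computed from the per-query distance lists, as in this sketch. The function names are hypothetical, the distances are invented for illustration, and average precision is normalized here by the number of relevant tiles actually retrieved, which is an assumption since the total number of relevant tiles per query is not specified.

```python
def recall_at_k(distance_lists, k, threshold_m=200.0):
    """Fraction of queries with at least one of the top-k tiles within threshold_m."""
    hits = sum(any(d <= threshold_m for d in ranked[:k]) for ranked in distance_lists)
    return hits / len(distance_lists)

def average_precision(ranked_distances, threshold_m=200.0):
    """Mean of precision values taken at each rank where a relevant tile appears."""
    hits = 0
    precision_sum = 0.0
    for rank, distance in enumerate(ranked_distances, start=1):
        if distance <= threshold_m:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(distance_lists, threshold_m=200.0):
    scores = [average_precision(ranked, threshold_m) for ranked in distance_lists]
    return sum(scores) / len(scores)

# Invented distances (meters) from ground truth, in ranked order, for three queries.
distance_lists = [
    [50.0, 900.0, 1200.0],    # correct at rank 1
    [800.0, 150.0, 1300.0],   # correct at rank 2
    [700.0, 900.0, 1000.0],   # never correct
]
top1_recall = recall_at_k(distance_lists, k=1)
top5_recall = recall_at_k(distance_lists, k=5)
map_score = mean_average_precision(distance_lists)
```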