mirror of https://github.com/azaion/gps-denied-desktop.git synced 2026-04-22 22:26:37 +00:00

Oleksandr Bezdieniezhnykh 4f8c18a066 add tests

gen_tests updated
solution.md updated

2025-11-24 22:57:46 +02:00

4.8 KiB

Integration Test: Global Place Recognition

Summary

Validate the Layer 2 (L2) Global Place Recognition component, which uses AnyLoc (DINOv2 + VLAD) to retrieve matching satellite tiles when UAV tracking is lost.

Component Under Test

Component: Global Place Recognition (L2)
Location: gps_denied_08_global_place_recognition
Dependencies:

Model Manager (DINOv2 model)
Satellite Data Manager (pre-cached satellite tiles)
Coordinate Transformer
Faiss index database

Detailed Description

This test validates that the Global Place Recognition component can:

Extract DINOv2 features from UAV images
Aggregate features using VLAD into compact descriptors
Query Faiss index to retrieve top-K similar satellite tiles
Handle "kidnapped robot" scenarios (zero overlap with previous frame)
Work with potentially outdated satellite imagery

The component solves the critical problem of recovering the UAV's location after sharp turns or tracking loss, where sequential frame-to-frame matching fails.
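The feature aggregation step listed above can be illustrated with a minimal NumPy sketch of VLAD. It assumes hard assignment to the nearest cluster center, intra-normalization per cluster, and a final L2 normalization; the function name `vlad_descriptor`, the random stand-in features, and the vocabulary size are hypothetical and do not come from the component's code.

```python
import numpy as np

def vlad_descriptor(local_feats: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Aggregate N local features (N, D) into one VLAD vector of length K*D."""
    # Squared distances from every feature to every cluster center: (N, K).
    d2 = ((local_feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)  # hard assignment to the nearest center

    k, d = centers.shape
    vlad = np.zeros((k, d), dtype=np.float64)
    for c in range(k):
        members = local_feats[assign == c]
        if len(members):
            # Sum of residuals between the features and their center.
            vlad[c] = (members - centers[c]).sum(axis=0)

    # Intra-normalization: L2-normalize each cluster's residual block.
    norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(norms, 1e-12)

    # Global L2 normalization of the flattened descriptor.
    flat = vlad.ravel()
    return flat / max(np.linalg.norm(flat), 1e-12)

rng = np.random.default_rng(0)
feats = rng.normal(size=(256, 32))   # stand-in for DINOv2 patch features (assumption)
centers = rng.normal(size=(8, 32))   # stand-in for a learned VLAD vocabulary (assumption)
desc = vlad_descriptor(feats, centers)
```

Because the result is unit-length, an inner product between two such descriptors can be read as a cosine similarity.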

Input Data

Test Case 1: Normal Flight Recovery

UAV Image: AD000001.jpg
Ground truth GPS: 48.275292, 37.385220
Expected: Should retrieve satellite tile containing this location in top-5
Satellite reference: AD000001_gmaps.png

Test Case 2: After Sharp Turn

UAV Image: AD000044.jpg (after skipping AD000043)
Ground truth GPS: 48.251489, 37.343079
Context: Simulates zero-overlap scenario
Expected: Should relocalize despite no sequential context

Test Case 3: Maximum Distance Jump

UAV Image: AD000048.jpg (268.6 m from previous)
Ground truth GPS: 48.249114, 37.346895
Context: Largest outlier in dataset
Expected: Should retrieve correct region

Test Case 4: Middle of Route

UAV Image: AD000030.jpg
Ground truth GPS: 48.259677, 37.352165
Expected: Accurate retrieval for interior route point

Test Case 5: Route Start vs End

UAV Images: AD000001.jpg and AD000060.jpg
Ground truth GPS:
- AD000001: 48.275292, 37.385220
- AD000060: 48.256246, 37.357485
Expected: Both should retrieve distinct correct regions

Expected Output

For each test case:

{
  "success": true/false,
  "query_image": "AD000001.jpg",
  "top_k_tiles": [
    {
      "tile_id": "tile_xyz",
      "center_gps": [lat, lon],
      "similarity_score": <float 0-1>,
      "distance_to_gt_m": <float>
    }
  ],
  "top1_correct": true/false,
  "top5_correct": true/false,
  "processing_time_ms": <float>
}
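As a rough sketch of how the distance and correctness fields of this record could be filled in, the following uses the haversine formula with a mean Earth radius of 6,371 km. The helper names `haversine_m` and `build_result`, the tile tuples, and the 200 m default are illustrative assumptions; the `success` field is left out because it depends on the Success Criteria rather than on distances alone.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def build_result(query_image, gt, tiles, processing_time_ms, radius_m=200.0):
    """tiles: ranked list of (tile_id, (lat, lon), similarity_score)."""
    top_k = [
        {
            "tile_id": tile_id,
            "center_gps": [lat, lon],
            "similarity_score": score,
            "distance_to_gt_m": haversine_m(gt[0], gt[1], lat, lon),
        }
        for tile_id, (lat, lon), score in tiles
    ]
    hits = [t["distance_to_gt_m"] <= radius_m for t in top_k]
    return {
        "query_image": query_image,
        "top_k_tiles": top_k,
        "top1_correct": bool(hits) and hits[0],
        "top5_correct": any(hits[:5]),
        "processing_time_ms": processing_time_ms,
    }

gt = (48.275292, 37.385220)  # ground truth for AD000001.jpg
tiles = [                    # hypothetical retrieval output
    ("tile_a", (48.275292, 37.385220), 0.83),
    ("tile_b", (48.285292, 37.385220), 0.61),
]
result = build_result("AD000001.jpg", gt, tiles, processing_time_ms=120.0)
```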

Success Criteria

Per Test Case:

top1_correct = true (best match within 200m of ground truth) OR
top5_correct = true (at least one of top-5 within 200m of ground truth)
processing_time_ms < 200ms
similarity_score of correct match > 0.6

Test Case Specific:

Test Case 1: top1_correct = true (reference image available)
Test Cases 2-4: top5_correct = true
Test Case 5: Both images should have top5_correct = true
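One possible reading of these criteria, written as a small check over a result record: a case passes when some top-5 tile lies within 200 m, the best similarity among those correct tiles exceeds 0.6, and the query took under 200 ms, with an optional stricter top-1 requirement for Test Case 1. The function name `meets_criteria` and the example record are hypothetical.

```python
def meets_criteria(result, require_top1=False, max_ms=200.0,
                   min_similarity=0.6, radius_m=200.0):
    """Apply the per-test-case success criteria to one result record."""
    top5 = result["top_k_tiles"][:5]
    correct = [t for t in top5 if t["distance_to_gt_m"] <= radius_m]
    if not correct:
        return False  # top5_correct would be false
    if require_top1 and top5[0]["distance_to_gt_m"] > radius_m:
        return False  # top1_correct would be false
    # "similarity_score of correct match": taken here as the best-scoring
    # tile among those within the radius (an interpretation, not a given).
    best_similarity = max(t["similarity_score"] for t in correct)
    return result["processing_time_ms"] < max_ms and best_similarity > min_similarity

example = {  # hypothetical record: the correct tile is ranked second
    "processing_time_ms": 142.0,
    "top_k_tiles": [
        {"tile_id": "tile_a", "similarity_score": 0.71, "distance_to_gt_m": 640.0},
        {"tile_id": "tile_b", "similarity_score": 0.68, "distance_to_gt_m": 85.0},
    ],
}
ok_top5 = meets_criteria(example)                     # Test Cases 2-5 style
ok_top1 = meets_criteria(example, require_top1=True)  # Test Case 1 style
```

With this record the top-5 check passes while the top-1 check does not, since the nearest tile is ranked second.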

Maximum Expected Time

Per query: < 200ms (on RTX 3070)
Per query: < 300ms (on RTX 2060)
Faiss index initialization: < 5 seconds
Total test suite: < 10 seconds
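A simple way to collect these timings is a wall-clock harness around the query call. The sketch below uses `time.perf_counter`; the helper name `time_query_ms` and the placeholder workload are assumptions. If the model runs on a GPU through PyTorch, asynchronous kernels would likely need a synchronization call such as `torch.cuda.synchronize()` before the clock is read, which this sketch does not do.

```python
import statistics
import time

def time_query_ms(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times and return per-run wall-clock times in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

# Placeholder workload standing in for one place-recognition query.
samples = time_query_ms(sum, range(10_000))
report = {"mean_ms": statistics.mean(samples), "max_ms": max(samples)}
```

Reporting both the mean and the maximum matches the two kinds of limits used in this document: an average budget and a hard per-query ceiling.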

Test Execution Steps

Setup Phase:
a. Initialize Satellite Data Manager
b. Load or create Faiss index with satellite tile descriptors
c. Verify satellite coverage for test area (48.25-48.28°N, 37.34-37.39°E)
d. Load DINOv2 model via Model Manager

Execution Phase (for each test case):
a. Load UAV image from test data
b. Extract DINOv2 features
c. Aggregate to VLAD descriptor
d. Query Faiss index for top-5 matches
e. Calculate distance from retrieved tiles to ground truth GPS
f. Record timing and accuracy metrics

Validation Phase:
a. Verify top-K accuracy for each test case
b. Check processing time constraints
c. Validate similarity scores are reasonable
d. Ensure no duplicate tiles in top-K results
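The query and duplicate-check steps can be sketched without the index library by using an exact inner-product search in NumPy. This is a stand-in, not the component's implementation: a flat inner-product Faiss index (`faiss.IndexFlatIP`) performs the same kind of exact search, but the index type actually used is not stated here. The descriptors, tile IDs, and function name `top_k_tiles` are synthetic.

```python
import numpy as np

def top_k_tiles(query, tile_descs, tile_ids, k=5):
    """Exact inner-product search; equals cosine similarity on unit vectors."""
    scores = tile_descs @ query
    order = np.argsort(-scores)[:k]  # indices of the k highest scores
    return [(tile_ids[i], float(scores[i])) for i in order]

rng = np.random.default_rng(0)
tile_descs = rng.normal(size=(20, 64))  # synthetic tile descriptors
tile_descs /= np.linalg.norm(tile_descs, axis=1, keepdims=True)
tile_ids = [f"tile_{i}" for i in range(20)]

# Synthetic query: a slightly perturbed copy of tile 3's descriptor.
query = tile_descs[3] + 0.05 * rng.normal(size=64)
query /= np.linalg.norm(query)

matches = top_k_tiles(query, tile_descs, tile_ids)
ids = [tile_id for tile_id, _ in matches]
no_duplicates = len(ids) == len(set(ids))  # validation step d
```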

Pass/Fail Criteria

Overall Test Passes If:

At least 4 out of 5 test cases meet success criteria (80% pass rate)
Average processing time < 200ms
No crashes or exceptions
Top-5 recall rate > 85%

Test Fails If:

More than 1 test case fails
Any processing time exceeds 500ms
Faiss index fails to load or query
Memory usage exceeds 8GB
Top-5 recall rate < 70%
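The two lists above are not exact complements (for example, a top-5 recall between 70% and 85% satisfies neither), so the sketch below returns a third outcome, "inconclusive", for results that match neither list. That third state, the function name `suite_verdict`, and its parameters are assumptions made for illustration.

```python
def suite_verdict(case_passed, times_ms, top5_recall, peak_memory_gb,
                  index_ok=True, crashed=False):
    """Combine per-case results into 'pass', 'fail', or 'inconclusive'."""
    failed = case_passed.count(False)
    # Conditions from the "Test Fails If" list.
    if (failed > 1 or max(times_ms) > 500.0 or not index_ok
            or peak_memory_gb > 8.0 or top5_recall < 0.70):
        return "fail"
    # Conditions from the "Overall Test Passes If" list.
    mean_ms = sum(times_ms) / len(times_ms)
    passed_all = (sum(case_passed) >= 0.8 * len(case_passed)
                  and mean_ms < 200.0
                  and not crashed
                  and top5_recall > 0.85)
    return "pass" if passed_all else "inconclusive"

times = [120.0, 150.0, 130.0, 140.0, 160.0]  # hypothetical per-query times
verdict = suite_verdict([True] * 5, times, top5_recall=1.0, peak_memory_gb=3.0)
```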

Additional Validation

Robustness Tests:

Query with rotated image (90°, 180°, 270°): should still retrieve the correct tile
Query with brightness-adjusted image (±30%): should maintain similarity score > 0.5
Repeated queries with the same image should return identical results (deterministic)
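The perturbed inputs for these robustness checks could be generated along the following lines. This is a sketch assuming images are held as uint8 NumPy arrays of shape (H, W, 3); the helper names `rotations` and `adjust_brightness` are hypothetical, and brightness is modeled as a plain multiplicative scale, which may differ from how the test data is actually prepared.

```python
import numpy as np

def rotations(img):
    """Return {angle: image} for 90, 180 and 270 degree rotations."""
    return {90 * k: np.rot90(img, k) for k in (1, 2, 3)}

def adjust_brightness(img, factor):
    """Scale uint8 pixel values by `factor` (e.g. 0.7 or 1.3), clipped to [0, 255]."""
    out = np.rint(img.astype(np.float32) * factor)
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((4, 6, 3), 100, dtype=np.uint8)  # synthetic stand-in image
img[0, 0] = 200

rotated = rotations(img)
brighter = adjust_brightness(img, 1.3)  # +30%
darker = adjust_brightness(img, 0.7)    # -30%
```

Each variant would then be passed through the same query path as the unmodified image and compared against the original retrieval result.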

Performance Metrics to Report:

Top-1 Recall@200m: percentage of queries whose best match is within 200 m
Top-5 Recall@200m: percentage of queries with at least one of the top-5 matches within 200 m
Mean Average Precision (mAP)
Average query latency
Memory footprint of Faiss index
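The recall and precision metrics can be computed from the ranked distance lists recorded per query. The sketch below treats a tile as relevant when it lies within 200 m of ground truth and normalizes average precision by the number of relevant tiles found in the ranked list; that normalization, the function names, and the sample distances are assumptions.

```python
def recall_at_k(ranked_distances, k, radius_m=200.0):
    """Fraction of queries with at least one of the top-k tiles within radius_m."""
    hits = sum(any(d <= radius_m for d in dists[:k]) for dists in ranked_distances)
    return hits / len(ranked_distances)

def mean_average_precision(ranked_distances, radius_m=200.0):
    """Mean over queries of average precision with binary relevance."""
    aps = []
    for dists in ranked_distances:
        relevant = [d <= radius_m for d in dists]
        n_relevant = sum(relevant)
        if n_relevant == 0:
            aps.append(0.0)
            continue
        found, total = 0, 0.0
        for rank, is_relevant in enumerate(relevant, start=1):
            if is_relevant:
                found += 1
                total += found / rank  # precision at this rank
        aps.append(total / n_relevant)
    return sum(aps) / len(aps)

# Hypothetical distances (m) of ranked tiles to ground truth, one list per query.
ranked = [[50.0, 900.0, 1200.0], [800.0, 120.0, 1500.0], [700.0, 900.0, 1100.0]]
top1 = recall_at_k(ranked, k=1)
top5 = recall_at_k(ranked, k=5)
m_ap = mean_average_precision(ranked)
```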
