add tests

gen_tests updated
solution.md updated
Oleksandr Bezdieniezhnykh
2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,157 @@
# Acceptance Test: AC-4 - Sharp Turn Recovery
## Summary
Validate Acceptance Criterion 4: "System should correctly continue the work even during sharp turns, where the next photo doesn't overlap at all, or overlaps in less than 5%. The next photo should be in less than 200m drift and at an angle of less than 70°"
## Linked Acceptance Criteria
**AC-4**: Handle sharp turns with <5% overlap
## Preconditions
1. System operational with L2 (Global Place Recognition) enabled
2. AnyLoc model and Faiss index ready
3. Test datasets:
- Dataset A: AD000042, AD000044, AD000045, AD000046 (skip AD000043)
- Dataset B: AD000003, AD000009 (5-frame gap)
4. Ground truth available
## Test Description
Tests the system's ability to recover from "kidnapped robot" scenarios, where sequential tracking fails because consecutive frames have zero overlap. Validates L2 global place recognition.
## Test Steps
### Step 1: Create Sharp Turn Flight (Dataset A)
- **Action**: Create flight with AD000042, AD000044, AD000045, AD000046
- **Expected Result**: Flight created, gap in sequence detected
### Step 2: Process Through L1
- **Action**: L1 processes AD000042
- **Expected Result**: AD000042 processed successfully
### Step 3: Attempt Sequential Tracking (L1 Failure Expected)
- **Action**: L1 attempts AD000042 → AD000044 (skip AD000043)
- **Expected Result**:
- L1 fails (overlap < 5% or zero)
- Low inlier count (< 10 matches)
- System triggers L2 recovery
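The hand-over condition in Step 3 can be sketched as a small predicate. This is a minimal illustration, not the system's actual code: `L1Result`, `needs_l2_recovery`, and the constant names are hypothetical, and only the thresholds (fewer than 10 inliers, overlap below 5%) come from this test.

```python
from dataclasses import dataclass

# Thresholds taken from this test; the names are illustrative only.
MIN_INLIERS = 10    # Step 3: "Low inlier count (< 10 matches)"
MIN_OVERLAP = 0.05  # AC-4: overlap below 5% counts as a sharp turn


@dataclass
class L1Result:
    """Hypothetical container for one sequential-tracking attempt."""
    inlier_count: int
    overlap: float  # estimated fraction of shared image area, 0.0 to 1.0


def needs_l2_recovery(result: L1Result) -> bool:
    """Return True when sequential tracking should hand over to L2."""
    return result.inlier_count < MIN_INLIERS or result.overlap < MIN_OVERLAP
```

Treating either condition as sufficient is an assumption; a real implementation might combine them differently.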
### Step 4: L2 Global Relocalization
- **Action**: L2 (AnyLoc) queries AD000044 against satellite database
- **Expected Result**:
- L2 retrieves correct satellite tile region
- Coarse location found (within 200m of ground truth per AC-4)
- Top-5 retrieval succeeds (correct tile is among the 5 best candidates)
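The top-5 check in Step 4 amounts to a nearest-neighbour search over tile descriptors. The sketch below uses a brute-force NumPy search as a stand-in for the Faiss index, so it runs without extra dependencies. The function names and the descriptor layout (one row per satellite tile) are assumptions made for illustration.

```python
import numpy as np


def top_k_tiles(query: np.ndarray, tile_descriptors: np.ndarray, k: int = 5) -> list[int]:
    """Return indices of the k tiles closest to the query (Euclidean distance)."""
    distances = np.linalg.norm(tile_descriptors - query, axis=1)
    return np.argsort(distances)[:k].tolist()


def top_k_hit(query: np.ndarray, tile_descriptors: np.ndarray,
              correct_tile: int, k: int = 5) -> bool:
    """True when the ground-truth tile is among the k retrieved candidates."""
    return correct_tile in top_k_tiles(query, tile_descriptors, k)
```

With a real index the search call would differ, but the pass condition (ground-truth tile among the first five results) stays the same.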
### Step 5: L3 Metric Refinement
- **Action**: L3 (LiteSAM) refines location using satellite tile
- **Expected Result**:
- Precise GPS estimate (< 50m error)
- High confidence score
### Step 6: Continue Processing
- **Action**: Process AD000045, AD000046
- **Expected Result**:
- Processing continues normally
- Sequential tracking may work for AD000044 → AD000045
- All images completed
### Step 7: Validate Recovery Success
- **Action**: Check GPS estimates for all 4 images
- **Expected Result**:
- AD000042: Accurate
- AD000044: Recovered via L2/L3, error < 200m (AC-4), preferably < 50m
- AD000045, AD000046: Accurate
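One way to compute the per-image error in Step 7 is the great-circle distance between the estimate and the ground truth. The sketch below uses the haversine formula and grades the result against the two thresholds in this test (200m from AC-4, 50m as the preferred target). The function names and label strings are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify_error(error_m: float) -> str:
    """Grade a position error against the thresholds used in this test."""
    if error_m < 50:
        return "within target (< 50m)"
    if error_m < 200:
        return "within AC-4 (< 200m)"
    return "violates AC-4"
```

At these distances a local planar approximation would give nearly the same numbers; haversine is used here only because it needs no projection setup.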
### Step 8: Test Dataset B (Larger Gap)
- **Action**: Repeat test with AD000003, AD000009 (5-frame gap)
- **Expected Result**: Similar recovery, L2 successfully relocalizes
## Success Criteria
**Primary Criterion (AC-4)**:
- System recovers from zero-overlap scenarios
- Relocalized image within 200m of ground truth (AC-4 requirement)
- Processing continues without manual intervention
**Supporting Criteria**:
- L1 failure detected appropriately
- L2 retrieves correct region (top-5 accuracy)
- L3 refines to < 50m accuracy
- All images in sequence eventually processed
## Expected Results
**Dataset A**:
```
Images: AD000042, AD000044, AD000045, AD000046
Gap: AD000043 skipped (simulates sharp turn)
Results:
- AD000042: L1 tracking, Error 21.3m ✓
- AD000043: SKIPPED (not in dataset)
- AD000044: L2 recovery, Error 38.7m ✓
- L1 failed (overlap ~0%)
- L2 top-1 retrieval: correct tile
- L3 refined GPS
- AD000045: L1/L3, Error 19.2m ✓
- AD000046: L1/L3, Error 23.8m ✓
L1 Failure Detected: Yes (AD000042 → AD000044)
L2 Recovery Success: Yes
Images < 50m: 4/4 (100%)
Images < 200m: 4/4 (100%) per AC-4
AC-4 Status: PASS
```
**Dataset B**:
```
Images: AD000003, AD000009
Gap: 5 frames (AD000004-008 skipped)
Results:
- AD000003: Error 24.1m ✓
- AD000009: L2 recovery, Error 42.3m ✓
L2 Recovery Success: Yes
AC-4 Status: PASS
```
## Pass/Fail Criteria
**TEST PASSES IF**:
- L1 failure detected when overlap < 5%
- L2 successfully retrieves correct region (top-5)
- Recovered image within 200m of ground truth (AC-4)
- Preferably < 50m (demonstrates high accuracy)
- Processing continues after recovery
**TEST FAILS IF**:
- L2 retrieves wrong region (relocalization fails)
- Recovered image error exceeds 200m (violates AC-4)
- System halts processing after L1 failure
- Multiple recovery attempts fail
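The criteria above can be folded into a single verdict function for an automated run. This is a sketch under assumptions: `RecoveryOutcome` and its field names are hypothetical, and the "preferably < 50m" target is deliberately left out because it is not a hard requirement.

```python
from dataclasses import dataclass

AC4_MAX_ERROR_M = 200.0  # hard limit from AC-4


@dataclass
class RecoveryOutcome:
    """Hypothetical summary of one sharp-turn recovery attempt."""
    l1_failure_detected: bool
    l2_top5_hit: bool
    recovered_error_m: float
    processing_continued: bool


def ac4_verdict(outcome: RecoveryOutcome) -> tuple[bool, list[str]]:
    """Return (passed, reasons); reasons lists every criterion that failed."""
    reasons = []
    if not outcome.l1_failure_detected:
        reasons.append("L1 failure not detected")
    if not outcome.l2_top5_hit:
        reasons.append("L2 retrieved wrong region")
    if outcome.recovered_error_m > AC4_MAX_ERROR_M:
        reasons.append("recovered error above 200m")
    if not outcome.processing_continued:
        reasons.append("processing halted after L1 failure")
    return (not reasons, reasons)
```

Returning the list of failed criteria, and not only a boolean, makes a failing run easier to diagnose.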
## Analysis
**Sharp Turn Characteristics** (per AC-4):
- Next photo overlap < 5% or zero
- Distance < 200m (banking turn, not long-distance jump)
- Angle < 70° (heading change)
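The three limits above can be checked together to decide whether a frame pair falls inside the AC-4 envelope. The helper names below are hypothetical; the heading difference is wrapped so that, for example, 350° to 30° counts as a 40° change.

```python
def heading_change_deg(prev_heading: float, next_heading: float) -> float:
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    return abs((next_heading - prev_heading + 180.0) % 360.0 - 180.0)


def is_ac4_sharp_turn(overlap: float, drift_m: float, angle_deg: float) -> bool:
    """True when a frame pair matches the sharp-turn envelope described by AC-4."""
    return overlap < 0.05 and drift_m < 200.0 and angle_deg < 70.0
```

Frame pairs outside this envelope (for instance a jump of more than 200m) are not covered by AC-4 and would need a separate test.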
**Recovery Pipeline**:
1. L1 detects failure (low inlier count, < 10 matches)
2. L2 (AnyLoc) global place recognition
3. L3 (LiteSAM) metric refinement
4. Factor graph incorporates GPS anchor
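The control flow of this pipeline can be sketched with the four stages passed in as callables. Everything here is illustrative: `localize_frame`, the callable signatures, and the source labels are assumptions, and the real system may also run L3 after a successful L1 step. The fallback to user input follows the AC-6 note in this document.

```python
from typing import Any, Callable, Optional, Tuple

LatLon = Tuple[float, float]


def localize_frame(
    frame: Any,
    track_l1: Callable[[Any], Optional[LatLon]],
    retrieve_l2: Callable[[Any], Optional[Any]],
    refine_l3: Callable[[Any, Any], LatLon],
    add_gps_anchor: Callable[[Any, LatLon], None],
) -> Tuple[Optional[LatLon], str]:
    """Return (position, source) for one frame, falling back from L1 to L2/L3."""
    position = track_l1(frame)           # 1. sequential tracking; None signals failure
    if position is not None:
        return position, "L1"
    tile = retrieve_l2(frame)            # 2. global place recognition
    if tile is None:
        return None, "user-input"        # L2 failed: request user input (AC-6)
    position = refine_l3(frame, tile)    # 3. metric refinement on the retrieved tile
    add_gps_anchor(frame, position)      # 4. hand the fix to the factor graph
    return position, "L2/L3"
```

Injecting the stages as callables keeps the sketch testable with stubs and avoids assuming anything about the real module interfaces.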
**Why L2 Works**:
- DINOv2 features capture semantic layout (roads, field patterns)
- VLAD aggregation creates robust place descriptor
- Faiss index enables fast retrieval
- Works despite view angle differences
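For readers unfamiliar with VLAD, the aggregation step can be sketched in NumPy: each local descriptor is assigned to its nearest cluster center, the residuals are summed per cluster, and the result is normalized into one vector per image. This is a generic illustration of VLAD, not the AnyLoc implementation; the normalization scheme and the function name are assumptions.

```python
import numpy as np


def vlad(descriptors: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors (N, D) into one VLAD vector of length K * D."""
    # Squared distance from every descriptor to every cluster center: shape (N, K).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assignment = d2.argmin(axis=1)

    residuals = np.zeros(centers.shape, dtype=np.float64)
    for k in range(centers.shape[0]):
        members = descriptors[assignment == k]
        if len(members):
            residuals[k] = (members - centers[k]).sum(axis=0)

    # Normalize each cluster's residual, then the whole vector (L2 norm).
    norms = np.linalg.norm(residuals, axis=1, keepdims=True)
    residuals = np.divide(residuals, norms, out=np.zeros_like(residuals), where=norms > 0)
    flat = residuals.ravel()
    total = np.linalg.norm(flat)
    return flat / total if total > 0 else flat
```

Two images of the same place then yield nearby vectors, which is what the nearest-neighbour lookup in the index relies on.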
## Notes
- AC-4 specifies < 200m, but system targets < 50m for high quality
- Sharp turns are common in wing-type UAV flight (banking maneuvers)
- L2 is a critical component: if L2 fails, the system requests user input per AC-6
- Test validates "kidnapped robot" problem solution