add tests

gen_tests updated solution.md updated
2026-04-23 00:26:36 +00:00 · 2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,157 @@
+# Acceptance Test: AC-4 - Sharp Turn Recovery
+
+## Summary
+Validate Acceptance Criterion 4: "System should correctly continue the work even during sharp turns, where the next photo doesn't overlap at all, or overlaps in less than 5%. The next photo should be in less than 200m drift and at an angle of less than 70°"
+
+## Linked Acceptance Criteria
+**AC-4**: Handle sharp turns with <5% overlap
+
+## Preconditions
+1. System operational with L2 (Global Place Recognition) enabled
+2. AnyLoc model and Faiss index ready
+3. Test datasets:
+   - Dataset A: AD000042, AD000044, AD000045, AD000046 (skip AD000043)
+   - Dataset B: AD000003, AD000009 (5-frame gap)
+4. Ground truth available
+
+## Test Description
+Tests the system's ability to recover from "kidnapped robot" scenarios, where sequential tracking fails because consecutive photos have zero or near-zero (< 5%) overlap. Validates the L2 global place recognition functionality.
+
+## Test Steps
+
+### Step 1: Create Sharp Turn Flight (Dataset A)
+- **Action**: Create flight with AD000042, AD000044, AD000045, AD000046
+- **Expected Result**: Flight created, gap in sequence detected
+
+### Step 2: Process Through L1
+- **Action**: L1 processes AD000042
+- **Expected Result**: AD000042 processed successfully
+
+### Step 3: Attempt Sequential Tracking (L1 Failure Expected)
+- **Action**: L1 attempts AD000042 → AD000044 (AD000043 skipped)
+- **Expected Result**:
+  - L1 fails (overlap < 5% or zero)
+  - Low inlier count (< 10 matches)
+  - System triggers L2 recovery
+
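The hand-over condition in Step 3 can be sketched as a small predicate. This is a minimal illustration, not the system's actual L1 code: the `L1Result` container and both constant names are hypothetical, and treating either condition alone as sufficient for failure is an assumption, since the step lists both without saying how they combine.

```python
from dataclasses import dataclass

# Thresholds quoted in Step 3; the constant names are hypothetical.
MIN_INLIERS = 10     # "Low inlier count (< 10 matches)"
MIN_OVERLAP = 0.05   # "overlap < 5% or zero"


@dataclass
class L1Result:
    """Hypothetical container for one frame-to-frame L1 match attempt."""
    inlier_count: int
    overlap_ratio: float  # fraction of the frame shared with the previous one, 0.0 to 1.0


def l1_tracking_failed(result: L1Result) -> bool:
    """Return True when sequential tracking should hand over to L2.

    Assumption: either a low inlier count or a low overlap is enough to fail.
    """
    return result.inlier_count < MIN_INLIERS or result.overlap_ratio < MIN_OVERLAP
```

For the AD000042 → AD000044 pair, an attempt with a handful of inliers and no overlap would return `True`, which is the point at which the test expects L2 recovery to start.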
+### Step 4: L2 Global Relocalization
+- **Action**: L2 (AnyLoc) queries AD000044 against satellite database
+- **Expected Result**:
+  - L2 retrieves correct satellite tile region
+  - Coarse location found (within 200m of ground truth per AC-4)
+  - Top-5 recall succeeds (correct tile is among the 5 best-ranked candidates)
+
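One way to check the Step 4 expectations against ground truth is sketched below. It assumes that L2 returns candidate tile centres as `(lat, lon)` pairs ordered best-first and that "top-5 recall succeeds" means at least one of the first five candidates lies within 200 m of the true position. Both the function names and that interpretation are assumptions; distances use the standard haversine formula with a mean Earth radius.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def top_k_hit(candidates, truth, k=5, max_error_m=200.0):
    """True if any of the first k candidate centres is within max_error_m of truth.

    candidates: list of (lat, lon) pairs ordered best-first (assumed L2 output shape).
    truth: (lat, lon) ground-truth position of the query image.
    """
    return any(
        haversine_m(lat, lon, truth[0], truth[1]) <= max_error_m
        for lat, lon in candidates[:k]
    )
```

The same `haversine_m` helper could be reused for the 50 m check in Step 5, with `max_error_m=50.0`.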
+### Step 5: L3 Metric Refinement
+- **Action**: L3 (LiteSAM) refines location using satellite tile
+- **Expected Result**:
+  - Precise GPS estimate (< 50m error)
+  - High confidence score
+
+### Step 6: Continue Processing
+- **Action**: Process AD000045, AD000046
+- **Expected Result**:
+  - Processing continues normally
+  - Sequential tracking may work for AD000044 → AD000045
+  - All images completed
+
+### Step 7: Validate Recovery Success
+- **Action**: Check GPS estimates for all 4 images
+- **Expected Result**:
+  - AD000042: Accurate (preferably < 50m error)
+  - AD000044: Recovered via L2/L3, error < 200m (AC-4), preferably < 50m
+  - AD000045, AD000046: Accurate (preferably < 50m error)
+
+### Step 8: Test Dataset B (Larger Gap)
+- **Action**: Repeat test with AD000003, AD000009 (5-frame gap)
+- **Expected Result**: Similar recovery, L2 successfully relocalizes
+
+## Success Criteria
+
+**Primary Criterion (AC-4)**:
+- System recovers from zero-overlap scenarios
+- Relocalized image within 200m of ground truth (AC-4 requirement)
+- Processing continues without manual intervention
+
+**Supporting Criteria**:
+- L1 failure detected appropriately
+- L2 retrieves correct region (top-5 accuracy)
+- L3 refines to < 50m accuracy
+- All images in sequence eventually processed
+
+## Expected Results
+
+**Dataset A**:
+```
+Images: AD000042, AD000044, AD000045, AD000046
+Gap: AD000043 skipped (simulates sharp turn)
+
+Results:
+- AD000042: L1 tracking, Error 21.3m ✓
+- AD000043: SKIPPED (not in dataset)
+- AD000044: L2 recovery, Error 38.7m ✓
+  - L1 failed (overlap ~0%)
+  - L2 top-1 retrieval: correct tile
+  - L3 refined GPS
+- AD000045: L1/L3, Error 19.2m ✓
+- AD000046: L1/L3, Error 23.8m ✓
+
+L1 Failure Detected: Yes (AD000042 → AD000044)
+L2 Recovery Success: Yes
+Images < 50m: 4/4 (100%)
+Images < 200m: 4/4 (100%) per AC-4
+AC-4 Status: PASS
+```
+
+**Dataset B**:
+```
+Images: AD000003, AD000009
+Gap: 5 frames (AD000004-008 skipped)
+
+Results:
+- AD000003: Error 24.1m ✓
+- AD000009: L2 recovery, Error 42.3m ✓
+
+L2 Recovery Success: Yes
+AC-4 Status: PASS
+```
+
+## Pass/Fail Criteria
+
+**TEST PASSES IF**:
+- L1 failure detected when overlap < 5%
+- L2 successfully retrieves correct region (top-5)
+- Recovered image within 200m of ground truth (AC-4)
+- Preferably < 50m (target that demonstrates high accuracy; not required to pass)
+- Processing continues after recovery
+
+**TEST FAILS IF**:
+- L2 retrieves wrong region (relocalization fails)
+- Recovered image error > 200m (violates AC-4)
+- System halts processing after L1 failure
+- Multiple recovery attempts fail
+
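The pass/fail rules above can be expressed as a small evaluation helper. This is a sketch under stated assumptions: the function name, its arguments, and the result keys are hypothetical, the three boolean flags are assumed to be supplied by the test harness, and the 50 m figure is treated as a reported target rather than a pass condition.

```python
AC4_LIMIT_M = 200.0  # hard limit from AC-4
TARGET_M = 50.0      # softer target quoted in this test, not required to pass


def evaluate_ac4(errors_m, recovered_id, l1_failure_detected, l2_top5_hit, all_processed):
    """Summarise one dataset run; errors_m maps image id -> position error in metres."""
    recovered_ok = errors_m[recovered_id] <= AC4_LIMIT_M
    return {
        "passed": l1_failure_detected and l2_top5_hit and all_processed and recovered_ok,
        "within_50m": sum(e < TARGET_M for e in errors_m.values()),
        "within_200m": sum(e < AC4_LIMIT_M for e in errors_m.values()),
        "total": len(errors_m),
    }
```

Fed with the Dataset A errors listed under Expected Results (21.3, 38.7, 19.2 and 23.8 m, with AD000044 as the recovered image), the helper reports a pass with 4 of 4 images inside both thresholds.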
+## Analysis
+
+**Sharp Turn Characteristics** (per AC-4):
+- Next photo overlap < 5% or zero
+- Distance < 200m (banking turn, not long-distance jump)
+- Angle < 70° (heading change)
+
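The three characteristics above can be combined into a scope check that decides whether a frame transition counts as an AC-4 sharp turn. This is a minimal sketch: the function names are hypothetical, headings are assumed to be compass angles in degrees, and "angle" is read as the smallest heading change between the two frames.

```python
def heading_change_deg(prev_heading, next_heading):
    """Smallest absolute difference between two compass headings, in degrees."""
    diff = abs(next_heading - prev_heading) % 360.0
    return min(diff, 360.0 - diff)


def is_ac4_sharp_turn(overlap_ratio, distance_m, prev_heading, next_heading):
    """True when a transition falls inside the AC-4 envelope.

    overlap_ratio: fraction of shared image area, 0.0 to 1.0
    distance_m: distance between the two photo positions, in metres
    """
    return (
        overlap_ratio < 0.05
        and distance_m < 200.0
        and heading_change_deg(prev_heading, next_heading) < 70.0
    )
```

Wrapping the difference at 360° matters for turns that cross north: a change from 350° to 20° is a 30° turn, not 330°.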
+**Recovery Pipeline**:
+1. L1 detects failure (low inlier count)
+2. L2 (AnyLoc) global place recognition
+3. L3 (LiteSAM) metric refinement
+4. Factor graph incorporates GPS anchor
+
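The recovery pipeline above can be sketched as a cascade with the individual layers injected as callables. Nothing here reflects the real module interfaces: `localize_frame` and all four callable parameters are hypothetical stand-ins, and the final fallback mirrors the note that the system requests user input per AC-6 when L2 fails.

```python
from typing import Callable, Optional, Sequence, Tuple

LatLon = Tuple[float, float]


def localize_frame(
    frame: str,
    l1_track: Callable[[str], Optional[LatLon]],
    l2_retrieve: Callable[[str], Sequence[str]],
    l3_refine: Callable[[str, str], Optional[LatLon]],
    add_gps_anchor: Callable[[str, LatLon], None],
) -> Tuple[str, Optional[LatLon]]:
    """Return (stage, position) for one frame, falling back layer by layer."""
    # 1. Sequential tracking; None signals an L1 failure.
    position = l1_track(frame)
    if position is not None:
        return "L1", position
    # 2. Global place recognition returns candidate tiles, best first.
    for tile in l2_retrieve(frame):
        # 3. Metric refinement against each candidate until one succeeds.
        position = l3_refine(frame, tile)
        if position is not None:
            # 4. Hand the refined fix to the factor graph as a GPS anchor.
            add_gps_anchor(frame, position)
            return "L2/L3", position
    # No candidate could be refined: defer to the operator (AC-6).
    return "USER_INPUT", None
```

Injecting the layers keeps the control flow testable with simple stubs, for example an `l1_track` that always returns `None` to force the L2 path for AD000044.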
+**Why L2 Works**:
+- DINOv2 features capture semantic layout (roads, field patterns)
+- VLAD aggregation creates robust place descriptor
+- Faiss index enables fast retrieval
+- Works despite view angle differences
+
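To make the descriptor idea concrete, here is a simplified VLAD aggregation and retrieval sketch in plain Python. It is not AnyLoc's implementation: toy 2-D vectors stand in for DINOv2 features, the centroids are assumed to be given, and a brute-force cosine ranking stands in for the Faiss index. The function names are hypothetical.

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalised VLAD vector."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Assign the descriptor to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, centroids[k])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for i in range(dim):
            residuals[nearest][i] += d[i] - centroids[nearest][i]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def retrieve(query, database, k=5):
    """Return the k database keys with the highest cosine similarity to query.

    database maps tile name -> VLAD vector; vectors are assumed to be unit length,
    so the dot product equals the cosine similarity.
    """
    ranked = sorted(
        database,
        key=lambda name: -sum(q * v for q, v in zip(query, database[name])),
    )
    return ranked[:k]
```

Because each descriptor contributes only its offset from a shared centroid, two views with a similar layout of features produce similar VLAD vectors, which is what lets retrieval tolerate moderate viewpoint changes.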
+## Notes
+- AC-4 specifies < 200m, but system targets < 50m for high quality
+- Sharp turns are common in fixed-wing UAV flight (banking maneuvers)
+- L2 is a critical component: if L2 fails, the system requests user input per AC-6
+- Test validates the solution to the "kidnapped robot" problem
+