gps-denied-desktop/_docs/03_tests/46_registration_rate_challenging_spec.md
Oleksandr Bezdieniezhnykh abc26d5c20 initial structure implemented
docs -> _docs
2025-12-01 14:20:56 +02:00

Acceptance Test: Image Registration Rate >95% - Challenging Conditions

Summary

Validate AC-9 requirement (≥95% registration rate) under challenging conditions including multiple sharp turns, outliers, repetitive textures, and degraded satellite data.

Linked Acceptance Criteria

AC-9: Image Registration Rate > 95%. System maintains high registration rate even under adverse conditions that stress all three localization layers.

Preconditions

  • ASTRAL-Next system operational
  • Multi-layer architecture robust to individual layer failures
  • Challenging test scenarios prepared
  • Registration fallback mechanisms active

Challenging Conditions Tested

  1. Multiple sharp turns (5 turns >200m in 60 images)
  2. Large outlier (268.6m jump)
  3. Repetitive agricultural texture (aliasing risk)
  4. Degraded satellite data (simulated staleness)
  5. Seasonal mismatch (summer satellite, autumn flight)
  6. Clustered failures (consecutive difficult frames)

Test Data

  • Full Flight: AD000001-AD000060 (contains all 5 sharp turns + outlier)
  • Stress Test: AD000042-AD000048 (clustered challenges)
  • Expected: ≥95% registration despite challenges

Test Steps

Step 1: Multi-Sharp-Turn Scenario

Action: Process flight segment with 5 sharp turns (>200m jumps)

Expected Result:

Sharp turn frames: 5
  - AD000003→004 (202.2m)
  - AD000032→033 (220.6m)
  - AD000042→043 (234.2m)
  - AD000044→045 (230.2m)
  - AD000047→048 (268.6m)

L1 failures at turns: 5 (expected)
L2 activations: 5
L2 successes: 4 (80%)
L2 failures: 1 (AD000048, largest jump)
L3 attempted on L2 failure: 1
L3 success: 0 (cross-view difficult)

Registration success: 4/5 sharp turn frames (80%)
Overall impact on AC-9: 1 failed frame out of 60 (~1.7% total failure rate)
Status: SHARP_TURNS_MOSTLY_HANDLED
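
The tally above can be reproduced with a small sketch. The jump distances, the 200 m threshold, and the failing frame (AD000048) come from this step. The names `JUMPS_M`, `SHARP_TURN_THRESHOLD_M`, and `L2_FAILED_TARGETS` are hypothetical and not part of ASTRAL-Next.

```python
# Frame-to-frame jumps in metres, copied from the Step 1 expected result.
JUMPS_M = {
    ("AD000003", "AD000004"): 202.2,
    ("AD000032", "AD000033"): 220.6,
    ("AD000042", "AD000043"): 234.2,
    ("AD000044", "AD000045"): 230.2,
    ("AD000047", "AD000048"): 268.6,
}

SHARP_TURN_THRESHOLD_M = 200.0      # ">200m" per the step description
L2_FAILED_TARGETS = {"AD000048"}    # target frame where L2 is expected to fail

# A pair counts as a sharp turn when its jump exceeds the threshold.
sharp_turns = [pair for pair, dist in JUMPS_M.items() if dist > SHARP_TURN_THRESHOLD_M]

# A sharp turn is recovered when its target frame is not in the L2 failure set.
recovered = [pair for pair in sharp_turns if pair[1] not in L2_FAILED_TARGETS]

largest = max(JUMPS_M, key=JUMPS_M.get)
rate = len(recovered) / len(sharp_turns)

print(f"sharp turns: {len(sharp_turns)}, recovered: {len(recovered)} ({rate:.0%})")
print(f"largest jump: {largest[0]} -> {largest[1]} ({JUMPS_M[largest]} m)")
```

The single unrecovered pair is also the largest jump, which matches the expectation that AD000048 is the hardest frame in the set.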

Step 2: Clustered Difficulty Scenario

Action: Process AD000042-048 (2 sharp turns + outlier in 7 frames)

Expected Result:

Total frames: 7
Normal frames: 4 (042, 046, 047, 048 target frames)
Challenging frames: 3 (043 gap, 044 pre-turn, 045 post-turn)

L1 successes: 3/6 frame pairs (50%, expected low)
L2 activations: 3
L2 successes: 2
Combined registration: 5/7 (71%)

Observation: Clustered challenges stress system
Mitigation: Multi-layer fallback prevents catastrophic failure
Status: CLUSTERED_CHALLENGES_SURVIVED

Step 3: Repetitive Texture Stress Test

Action: Process agricultural field segment (AD000015-025)

Expected Result:

Frames: 11
Texture: Highly repetitive crop rows
Traditional SIFT/ORB: Would fail (>50% outliers)
SuperPoint+LightGlue: Succeeds (semantic features)

L1 successes: 10/10 frame pairs (100%)
SuperPoint feature quality: High (field boundaries prioritized)
LightGlue outlier rejection: Effective (dustbin mechanism)
Registration rate: 100%
Status: REPETITIVE_TEXTURE_HANDLED

Step 4: Degraded Satellite Data Simulation

Action: Simulate stale satellite data (2-3 years old, terrain changes)

Expected Result:

Scenario: 20% of satellite tiles outdated
L2 retrieval attempts: 10
L2 correct tile (outdated): 8
L2 wrong tile: 2

L3 refinement on outdated tiles:
  - DINOv2 semantic features: Robust to changes
  - Structural matching: 6/8 succeed (75%)
  
Combined L2+L3 success: 6/10 (60%)
Impact on overall registration: Moderate
Fallback to L1 trajectory: Maintains continuity
Overall registration rate: >95% maintained
Status: DEGRADED_DATA_TOLERATED
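
The combined L2+L3 figure is the product of the two stage rates, since L3 refinement only runs on tiles that L2 retrieved correctly. A minimal arithmetic sketch, using only the counts listed above (variable names are illustrative):

```python
from fractions import Fraction

l2_attempts = 10          # L2 retrieval attempts
l2_correct_tile = 8       # correct (but outdated) tile retrieved
l3_refined = 6            # structural matching succeeds on those tiles

retrieval_rate = Fraction(l2_correct_tile, l2_attempts)   # 8/10
refinement_rate = Fraction(l3_refined, l2_correct_tile)   # 6/8

# L3 only runs after a correct retrieval, so the stage rates multiply.
combined = retrieval_rate * refinement_rate

print(f"retrieval {float(retrieval_rate):.0%}, "
      f"refinement {float(refinement_rate):.0%}, "
      f"combined {float(combined):.0%}")
```

The remaining 4 of 10 attempts are the ones the spec expects the L1 trajectory fallback to carry.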

Step 5: Seasonal Mismatch Test

Action: Process with summer satellite tiles, autumn UAV imagery

Expected Result:

Visual differences: Vegetation color, field state
Traditional methods: Significant accuracy loss
AnyLoc (DINOv2): Semantic invariance active

L2 retrieval (color-invariant): 85% success
L3 cross-view matching: 70% success (view angle + season)
Registration maintained: Yes (structure-based features)
Status: SEASONAL_ROBUSTNESS_VERIFIED

Step 6: Calculate Challenging Conditions Registration Rate

Action: Process full 60-image flight with all challenges, calculate final rate

Expected Result:

Total images: 60
Challenging frames: 15 (25% of flight)
  - Sharp turns: 5
  - Outlier: 1  
  - Repetitive texture: 11 (overlapping with others)

L1 success rate: 86.4% (51/59 pairs)
L2 success rate (when L1 fails): 75% (6/8)
L3 success rate (when L1+L2 fail): 50% (1/2)

Total registered: 58/60
Registration failures: 2
Registration rate: 96.7%

AC-9 Requirement: >95%
Actual (challenging): 96.7%
Status: AC-9 PASS under stress
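
The final rate can be checked with a short sketch. It uses only the image-level totals from this step (60 images, 2 failures). Note that the per-layer rates above are counted over 59 frame pairs, while the final rate is counted over 60 images. The function name `registration_rate` and the constant `AC9_THRESHOLD` are hypothetical.

```python
AC9_THRESHOLD = 0.95  # AC-9: registration rate must exceed 95%


def registration_rate(total_images: int, failures: int) -> float:
    """Fraction of images that were registered."""
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    if not 0 <= failures <= total_images:
        raise ValueError("failures must be between 0 and total_images")
    return (total_images - failures) / total_images


rate = registration_rate(total_images=60, failures=2)
l1_pair_rate = 51 / 59  # L1 success over consecutive frame pairs

print(f"L1 pair success: {l1_pair_rate:.1%}")
print(f"Registration rate: {rate:.1%}")
print("AC-9:", "PASS" if rate > AC9_THRESHOLD else "FAIL")
```

With 60 images the budget is tight: a third failure gives 57/60 = 95.0%, which no longer exceeds the threshold when AC-9 is read as a strict inequality.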

Pass/Fail Criteria

PASS if:

  • Registration rate ≥95% despite multiple challenges
  • System demonstrates graceful degradation (challenges reduce but don't eliminate registration)
  • Multi-layer fallback working across all challenge types
  • No catastrophic failures (system crashes, infinite loops)
  • Clustered challenges cause fewer than 3 consecutive failures

FAIL if:

  • Registration rate <95% under challenging conditions
  • Single challenge type causes >10% failure rate
  • Multi-layer fallback not activating appropriately
  • Catastrophic failure on any challenge type
  • Clustered failures >5 consecutive frames
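
The two quantitative criteria (registration rate and consecutive failures) could be checked roughly as follows. This is a sketch only: the function names are hypothetical, and the non-numeric criteria (crashes, fallback activation, per-challenge failure rate) are not modelled. The lists above do not classify runs of 3 to 5 consecutive failures, so the sketch reports those as `INCONCLUSIVE`, which is an assumption rather than part of the spec.

```python
from itertools import groupby


def longest_failure_run(registered: list[bool]) -> int:
    """Length of the longest run of consecutive unregistered frames."""
    return max(
        (sum(1 for _ in group) for ok, group in groupby(registered) if not ok),
        default=0,
    )


def verdict(registered: list[bool], min_rate: float = 0.95) -> str:
    """Apply the rate and clustering criteria to per-frame results."""
    if not registered:
        raise ValueError("no frames to evaluate")
    rate = sum(registered) / len(registered)
    run = longest_failure_run(registered)
    if rate < min_rate or run > 5:
        return "FAIL"
    if run < 3:
        return "PASS"
    return "INCONCLUSIVE"  # 3-5 consecutive failures: not covered by the lists


# Hypothetical per-frame outcome: 60 frames, 2 isolated failures.
frames = [True] * 60
frames[20] = False
frames[47] = False
print(verdict(frames))
```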

Resilience Analysis

Without Multi-Layer Architecture

L1 only (sequential tracking):
  Sharp turns: 100% failure (0% overlap)
  Expected registration: 55/60 (91.7%)
  Result: FAILS AC-9

With Multi-Layer Architecture

L1 + L2 + L3 (proposed ASTRAL-Next):
  L1 handles: 86.4% of cases
  L2 recovers: 10.2% of cases (when L1 fails)
  L3 refines: 1.7% of cases (when L1+L2 fail)
  Expected registration: 58/60 (96.7%)
  Result: PASSES AC-9

Robustness Multiplier

Multi-layer provides a ~5 percentage point improvement in registration rate (91.7% → 96.7%)
These 5 points are what lifts the system over the AC-9 threshold
Justifies architectural complexity
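
The comparison can be worked through from the two expected registration counts (55/60 for L1 only, 58/60 for all three layers). The sketch below separates the absolute gain in percentage points from the relative gain, since the two are easy to confuse. Variable names are illustrative.

```python
TOTAL_IMAGES = 60
AC9_THRESHOLD = 0.95

l1_only_registered = 55      # L1 only: the 5 sharp-turn frames are lost
multi_layer_registered = 58  # L1 + L2 + L3

l1_only_rate = l1_only_registered / TOTAL_IMAGES
multi_layer_rate = multi_layer_registered / TOTAL_IMAGES

# Absolute gain in percentage points: 3 extra frames out of 60.
gain_pp = (multi_layer_registered - l1_only_registered) / TOTAL_IMAGES * 100

# Relative gain over the L1-only baseline.
gain_relative = (multi_layer_registered / l1_only_registered - 1) * 100

print(f"L1 only:     {l1_only_rate:.1%} (meets AC-9: {l1_only_rate > AC9_THRESHOLD})")
print(f"Multi-layer: {multi_layer_rate:.1%} (meets AC-9: {multi_layer_rate > AC9_THRESHOLD})")
print(f"Gain: {gain_pp:.1f} percentage points, {gain_relative:.1f}% relative")
```

The gain amounts to three recovered frames, which is enough to move the result from below the threshold to above it.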

Failure Mode Analysis

Acceptable Failures (Within 5% Budget)

  • Extreme outliers (>300m, view completely different)
  • Satellite data completely missing (coverage gap)
  • UAV imagery corrupted (motion blur, exposure)
  • Location highly ambiguous (identical fields for km)

Unacceptable Failures (System Defects)

  • Crashes on difficult frames
  • L2 not activating when L1 fails
  • Infinite loops in matching algorithms
  • Memory exhaustion on challenging scenarios

Recovery Mechanisms Tested

  1. L1→L2 Fallback: Automatic when match count <50
  2. L2→L3 Refinement: Triggered on low retrieval confidence
  3. Multi-Map (Atlas): New map started if all layers fail
  4. User Input (AC-6): Requested after 3 consecutive failures
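
The four mechanisms above form a cascade, which can be sketched as follows. Only the L1 match-count threshold (50) and the 3-consecutive-failure trigger come from this document. The L2 confidence threshold, the `FrameObservation` type, the outcome labels, and the choice to emit the user-input request once when the counter reaches 3 are all assumptions made for illustration, not the ASTRAL-Next implementation.

```python
from dataclasses import dataclass

L1_MIN_MATCHES = 50       # from the spec: fall back to L2 when match count < 50
L2_MIN_CONFIDENCE = 0.5   # ASSUMPTION: the spec only says "low retrieval confidence"
USER_INPUT_AFTER = 3      # from the spec: request input after 3 consecutive failures


@dataclass
class FrameObservation:
    """Hypothetical per-frame signals from the three layers."""
    l1_matches: int
    l2_confidence: float
    l3_success: bool


def resolve(obs: FrameObservation) -> str:
    """Return which mechanism handles the frame."""
    if obs.l1_matches >= L1_MIN_MATCHES:
        return "L1"
    if obs.l2_confidence >= L2_MIN_CONFIDENCE:
        return "L2"
    if obs.l3_success:
        return "L3"
    return "NEW_MAP"  # all layers failed: start a new map (Atlas)


def run(observations: list[FrameObservation]) -> list[str]:
    """Process frames in order and emit outcomes plus user-input requests."""
    events: list[str] = []
    consecutive_failures = 0
    for obs in observations:
        outcome = resolve(obs)
        events.append(outcome)
        if outcome == "NEW_MAP":
            consecutive_failures += 1
            if consecutive_failures == USER_INPUT_AFTER:
                events.append("REQUEST_USER_INPUT")
        else:
            consecutive_failures = 0
    return events


demo = [
    FrameObservation(120, 0.9, False),  # normal frame, L1 tracks
    FrameObservation(10, 0.8, False),   # sharp turn, L2 recovers
    FrameObservation(10, 0.2, True),    # low L2 confidence, L3 succeeds
    FrameObservation(5, 0.1, False),    # three frames where every layer fails
    FrameObservation(5, 0.1, False),
    FrameObservation(5, 0.1, False),
]
print(run(demo))
```

Resetting the failure counter on any successful registration keeps isolated failures from accumulating toward a user-input request.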

Notes

  • Challenging conditions test validates real-world operational robustness
  • 96.7% rate with challenges provides confidence in production deployment
  • Multi-layer architecture justification demonstrated empirically
  • 5% failure budget accommodates genuinely impossible registration cases
  • System designed for graceful degradation, not brittle all-or-nothing behavior