Acceptance Test: Outlier Anchor Detection (<10%)
Summary
Validate the AC-5 requirement that the system detects and rejects outlier global anchor points (bad satellite matches from L3), keeping the outlier rate below 10%.
Linked Acceptance Criteria
AC-5: Less than 10% outlier anchors. The system can tolerate some incorrect global anchor points (bad satellite matches) but must keep them below the 10% threshold through validation and rejection mechanisms.
Preconditions
- ASTRAL-Next system operational
- L3 LiteSAM cross-view matching active
- Factor graph with robust M-estimation
- Validation mechanisms enabled (geometric consistency, residual analysis)
Test Data
- Dataset: AD000001-AD000060 (60 images)
- Expected Anchors: ~20-30 global anchor attempts (not every frame needs L3)
- Acceptable Outliers: <3 outlier anchors (<10% of 30)
- Challenge: Potential satellite data staleness, seasonal differences
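The outlier budget follows from the strict <10% bound. The sketch below works out the largest admissible outlier count for a given number of anchor attempts; `max_outliers` is a hypothetical helper name, and integer percentages are used so that a rate of exactly 10% is never mistaken for a pass.

```python
def max_outliers(total_attempts: int, limit_percent: int = 10) -> int:
    """Largest outlier count whose rate stays strictly below limit_percent.

    Solves k * 100 < limit_percent * total_attempts for the largest integer k.
    """
    if total_attempts <= 0:
        return 0
    return (limit_percent * total_attempts - 1) // 100


if __name__ == "__main__":
    for attempts in (20, 25, 30):
        print(attempts, "attempts -> at most", max_outliers(attempts), "outliers")
```

For 30 attempts this gives 2, which matches the "<3 outlier anchors" bound above.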
Test Steps
Step 1: Process Flight with L3 Anchoring
Action: Process full flight with L3 metric refinement active
Expected Result:
- L2 retrieves satellite tiles for keyframes
- L3 LiteSAM performs cross-view matching
- Global anchor factors added to factor graph
- Anchor count: 20-30 across 60 images
- Status: PROCESSING_WITH_ANCHORS
Step 2: Monitor Anchor Quality Metrics
Action: Track L3 matching confidence and geometric consistency
Expected Result:
- Each anchor has confidence score (0-1)
- Each anchor has initial residual error
- Anchors with confidence <0.3 flagged as suspicious
- Anchors with residual >3σ flagged as outliers
- Status: MONITORING_QUALITY
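The two flagging rules in this step can be sketched as follows. This is an illustration only: the `Anchor` record and `flag_anchors` function are hypothetical names, and ">3σ" is read here as "more than three standard deviations above the mean anchor residual", computed over all anchors in the batch.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Anchor:
    frame_id: str
    confidence: float   # L3 matching confidence, 0-1
    residual_m: float   # initial residual error in metres (assumed unit)


def flag_anchors(anchors, conf_threshold=0.3, sigma_factor=3.0):
    """Return (suspicious_ids, outlier_ids) for a batch of anchors."""
    residuals = [a.residual_m for a in anchors]
    mu = mean(residuals)
    sigma = pstdev(residuals)  # population standard deviation of the batch

    suspicious = [a.frame_id for a in anchors if a.confidence < conf_threshold]
    outliers = [
        a.frame_id
        for a in anchors
        if sigma > 0 and a.residual_m > mu + sigma_factor * sigma
    ]
    return suspicious, outliers
```

Note that a large outlier inflates both the mean and σ of the batch it belongs to, so with only a handful of anchors a 3σ rule may never fire; a median-based spread estimate is a common alternative.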
Step 3: Identify Potential Outlier Anchors
Action: Analyze anchors that conflict with trajectory consensus
Expected Result:
Total anchors: 25 (example)
High confidence (>0.7): 20
Medium confidence (0.4-0.7): 3
Low confidence (<0.4): 2
Flagged as outliers: 2 (<10%)
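A summary like the one above could be produced from per-anchor confidence scores roughly as follows. The bucket boundaries are taken from the example (high >0.7, medium 0.4-0.7, low <0.4); the function names are hypothetical.

```python
from collections import Counter


def confidence_bucket(confidence: float) -> str:
    """Map an L3 confidence score to the bucket used in the summary."""
    if confidence > 0.7:
        return "high"
    if confidence >= 0.4:
        return "medium"
    return "low"


def summarize_anchors(confidences, flagged_count):
    """Count anchors per bucket and compute the flagged-outlier rate."""
    counts = Counter(confidence_bucket(c) for c in confidences)
    total = len(confidences)
    return {
        "total": total,
        "high": counts["high"],
        "medium": counts["medium"],
        "low": counts["low"],
        "outlier_rate": flagged_count / total if total else 0.0,
    }
```

With 25 anchors and 2 flagged, the rate is 2/25 = 8%, which is inside the AC-5 bound.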
Step 4: Validate Outlier Rejection Mechanism
Action: Verify factor graph handling of outlier anchors
Expected Result:
- Outlier anchors automatically down-weighted by robust kernel
- Outlier anchor residuals remain high (not dragging trajectory)
- Non-outlier anchors maintain weight ~1.0
- Factor graph converges despite outlier anchors present
- Status: OUTLIERS_HANDLED
Step 5: Test Explicit Outlier Anchor Scenario
Action: Manually inject known bad anchor (simulated wrong satellite tile match)
Expected Result:
- Bad anchor creates large residual (>100m error)
- Geometric validation detects inconsistency
- Robust cost function down-weights bad anchor
- Bad anchor does NOT corrupt trajectory
- Status: SYNTHETIC_OUTLIER_REJECTED
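The expected behaviour can be illustrated with a one-dimensional stand-in for the factor graph. The sketch below is not the ASTRAL-Next optimizer: it estimates a single position from anchor measurements by iteratively reweighted least squares with a Cauchy weight, and the scale `c = 10.0` and the 500 m injected error are assumptions chosen for the demonstration.

```python
def cauchy_weight(residual: float, c: float) -> float:
    """IRLS weight of the Cauchy kernel: near 1 for small residuals, near 0 for large."""
    return 1.0 / (1.0 + (residual / c) ** 2)


def robust_position(measurements, c=10.0, iterations=20):
    """Estimate a 1-D position; returns (estimate, final weights)."""
    # Start from the median so the initial guess is not dominated by an outlier.
    estimate = sorted(measurements)[len(measurements) // 2]
    for _ in range(iterations):
        weights = [cauchy_weight(m - estimate, c) for m in measurements]
        estimate = sum(w * m for w, m in zip(weights, measurements)) / sum(weights)
    weights = [cauchy_weight(m - estimate, c) for m in measurements]
    return estimate, weights


if __name__ == "__main__":
    good = [-2.0, -1.0, 0.0, 1.0, 2.0, -1.5, 0.5, 1.5, -0.5]  # metres around truth 0
    injected = good + [500.0]                                  # synthetic bad anchor
    plain = sum(injected) / len(injected)
    robust, weights = robust_position(injected)
    print(f"plain mean: {plain:.1f} m, robust estimate: {robust:.2f} m")
    print(f"bad anchor weight: {weights[-1]:.4f}, residual: {abs(500.0 - robust):.0f} m")
```

The unweighted mean is pulled tens of metres toward the bad anchor, while the robust estimate stays close to the good measurements and the bad anchor keeps a large residual with a near-zero weight.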
Step 6: Calculate Final Anchor Statistics
Action: Analyze all anchor attempts and outcomes
Expected Result:
Total anchor attempts: 25-30
Successful anchors: 23-27 (90-95%)
Outlier anchors: ≤2 (<10%)
Outlier detection rate: 100% (all caught)
False positive rate: <5% (good anchors not rejected)
Trajectory accuracy: Improved by valid anchors
AC-5 Status: PASS
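These statistics could be derived from per-anchor outcomes as sketched below. It assumes each anchor attempt has a ground-truth label (outlier or valid) and a record of whether the system flagged it; the function name and record layout are hypothetical.

```python
def anchor_statistics(records):
    """Compute summary rates from (is_outlier, was_flagged) pairs, one per attempt."""
    total = len(records)
    outliers = sum(1 for is_outlier, _ in records if is_outlier)
    valid = total - outliers
    caught = sum(1 for is_outlier, flagged in records if is_outlier and flagged)
    false_positives = sum(
        1 for is_outlier, flagged in records if not is_outlier and flagged
    )
    return {
        "total_attempts": total,
        "outlier_rate": outliers / total if total else 0.0,
        # With no outliers present there is nothing to miss.
        "detection_rate": caught / outliers if outliers else 1.0,
        "false_positive_rate": false_positives / valid if valid else 0.0,
    }
```

For example, 25 attempts with 2 true outliers (both flagged) and 1 valid anchor wrongly flagged give an outlier rate of 8%, a detection rate of 100%, and a false positive rate of 1/23, about 4.3%.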
Pass/Fail Criteria
PASS if:
- Outlier anchor rate <10% of total anchor attempts
- All significant outliers (>100m error) detected and down-weighted
- Factor graph converges with mean reprojection error (MRE) <1.5px
- Valid anchors improve trajectory accuracy vs L1-only
- No trajectory corruption from outlier anchors
FAIL if:
- Outlier anchor rate ≥10%
- >1 outlier anchor corrupts trajectory (causes >50m error propagation)
- Outlier detection fails (outliers not flagged)
- Factor graph diverges due to conflicting anchors
- Valid anchors incorrectly rejected (>10% false positive rate)
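The criteria above can be collected into a single check, sketched below. All field names are hypothetical, a rate of exactly 10% is treated as a failure because PASS requires strictly less than 10%, and trajectory corruption is reduced to one number (the worst error propagated by any outlier anchor) for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Ac5Measurements:
    outlier_rate: float                # fraction of anchor attempts that are outliers
    missed_significant_outliers: int   # outliers with >100 m error that were not flagged
    converged: bool                    # factor graph converged
    mre_px: float                      # MRE in pixels
    error_with_anchors_m: float        # trajectory error with L3 anchors
    error_l1_only_m: float             # trajectory error with L1 only
    max_propagated_error_m: float      # worst error propagated by an outlier anchor
    false_positive_rate: float         # valid anchors incorrectly rejected


def evaluate_ac5(m: Ac5Measurements):
    """Return a list of failure reasons; an empty list means PASS."""
    failures = []
    if m.outlier_rate >= 0.10:
        failures.append("outlier anchor rate not below 10%")
    if m.missed_significant_outliers > 0:
        failures.append("significant outlier not detected")
    if not m.converged or m.mre_px >= 1.5:
        failures.append("factor graph did not converge with MRE <1.5px")
    if m.error_with_anchors_m >= m.error_l1_only_m:
        failures.append("valid anchors did not improve accuracy vs L1-only")
    if m.max_propagated_error_m > 50.0:
        failures.append("outlier anchor corrupted trajectory (>50m)")
    if m.false_positive_rate > 0.10:
        failures.append("false positive rate above 10%")
    return failures
```

Returning the reasons rather than a bare boolean makes a failed run easier to diagnose in the test report.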
Outlier Detection Mechanisms Tested
Geometric Consistency Check
- Compare anchor position with L1 trajectory estimate
- Flag if discrepancy >100m
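A minimal sketch of this check, assuming both the anchor and the L1 estimate are available as WGS-84 latitude/longitude in degrees (the document does not state the coordinate frame). The haversine great-circle distance with a mean Earth radius is used as an approximation, which is adequate at the 100 m scale; function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_geometrically_inconsistent(anchor_latlon, l1_latlon, max_discrepancy_m=100.0):
    """Flag an anchor whose position disagrees with the L1 estimate by more than the limit."""
    distance = haversine_m(*anchor_latlon, *l1_latlon)
    return distance > max_discrepancy_m
```

If the pipeline already works in a local metric frame (for example ENU), a plain Euclidean distance would replace the haversine call.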
Residual Analysis
- Monitor residual error in factor graph optimization
- Flag if residual >3σ from mean anchor residual
Confidence Thresholding
- L3 LiteSAM outputs matching confidence
- Reject anchors with confidence <0.2
Robust M-Estimation
- Cauchy/Huber kernel automatically down-weights high-residual anchors
- Prevents outliers from corrupting optimization
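The down-weighting can be made concrete with the standard IRLS weight functions of the two kernels. The thresholds `k` and `c` below are illustrative values in metres, not settings taken from the system.

```python
def huber_weight(residual: float, k: float = 10.0) -> float:
    """Huber: full weight inside the threshold, k/|r| beyond it."""
    r = abs(residual)
    return 1.0 if r <= k else k / r


def cauchy_weight(residual: float, c: float = 10.0) -> float:
    """Cauchy: weight decays as 1 / (1 + (r/c)^2), vanishing for gross outliers."""
    return 1.0 / (1.0 + (residual / c) ** 2)


if __name__ == "__main__":
    for r in (1.0, 10.0, 50.0, 100.0, 500.0):
        print(f"residual {r:6.1f} m  huber {huber_weight(r):.4f}  cauchy {cauchy_weight(r):.4f}")
```

Huber only reduces an outlier's influence to a constant, whereas Cauchy drives it toward zero, so a gross error such as a wrong satellite tile is suppressed much more strongly under Cauchy. Anchors with small residuals keep a weight close to 1.0 under both.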
Technical Validation Metrics
- Anchor Attempt Rate: 30-50% of frames (keyframes only)
- Anchor Success Rate: 90-95%
- Outlier Rate: <10% (AC-5 requirement)
- Detection Sensitivity: >95% (outliers caught)
- Detection Specificity: >90% (valid anchors retained)
Failure Modes Tested
- Wrong Satellite Tile: L2 retrieves incorrect location
- Stale Satellite Data: Terrain changed significantly
- Seasonal Mismatch: Summer satellite vs winter UAV imagery
- Rotation Error: L3 estimates incorrect rotation
Notes
- AC-5 is critical for hybrid localization reliability
- 10% outlier tolerance allows graceful degradation
- Robust M-estimation is the primary outlier defense
- Multiple validation layers provide defense-in-depth
- Valid anchors significantly improve absolute accuracy