
Reasoning Chain

Dimension 1: VO Matcher Selection

Fact Confirmation

Draft04 uses SuperPoint+LightGlue for VO at 150-200ms/frame (Fact #1). XFeat achieves AUC@10° of 65.4 vs SuperPoint's 50.1, runs ~5x faster (~15ms on GPU), and is validated for UAV VO by SatLoc-Fusion (Facts #2, #3).

Reference Comparison

SuperPoint+LightGlue provides higher-quality matching for wide-baseline cross-view pairs (satellite matching). However, for consecutive-frame VO with 60-80% overlap and mostly translational motion, XFeat's quality is sufficient; it actually outperforms SuperPoint on MegaDepth.

Conclusion

The VO matcher should be reverted to XFeat. The regression was unintentional (it does not appear in the draft04 assessment findings). XFeat is roughly 10x faster per frame (~15ms vs 150-200ms) with comparable-or-better quality for the VO use case. SuperPoint+LightGlue should be retained only as a fallback option, not as the primary VO matcher.

Confidence

High — XFeat superiority for this use case is supported by both benchmarks and a published UAV system (SatLoc-Fusion).


Dimension 2: LiteSAM Maturity Risk

Fact Confirmation

LiteSAM has 5 GitHub stars, 0 forks, 4 commits, no license, no issues, and no independent reproduction (Facts #5, #14). Its base model, EfficientLoFTR, has 964 stars and a CVPR 2024 publication (Fact #8).

Reference Comparison

For a production system, relying on a model with no community adoption, no license, and single-point-of-failure weight hosting (Google Drive) is risky. EfficientLoFTR is proven and mature but has 2.4x the parameter count (15.05M vs 6.31M).

Conclusion

Keep LiteSAM as the primary satellite fine matcher (it IS better on benchmarks) but add EfficientLoFTR as a proven fallback. Add startup validation: verify the weight checksum, run test inference on a reference pair, and if LiteSAM fails any check, log a warning and auto-switch to EfficientLoFTR. This hedges the maturity risk while preserving the performance advantage.
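
The check-then-fallback sequence above can be sketched as follows. `load_litesam`, `load_eloftr`, and `smoke_test` are hypothetical callables standing in for the real model loaders and the reference-pair inference; none of these names come from the draft.

```python
import hashlib
import logging
from pathlib import Path

log = logging.getLogger("fine_matcher")

def sha256_of(path: Path) -> str:
    """Stream the file so large checkpoints are not read into RAM at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def select_fine_matcher(weights: Path, expected_sha256: str,
                        load_litesam, load_eloftr, smoke_test):
    """Return ("litesam", model) if every check passes, else fall back."""
    try:
        if sha256_of(weights) != expected_sha256:
            raise ValueError("weight checksum mismatch")
        model = load_litesam(weights)
        if not smoke_test(model):  # inference on a stored reference pair
            raise ValueError("reference-pair smoke test failed")
        return "litesam", model
    except Exception as exc:
        log.warning("LiteSAM validation failed (%s); using EfficientLoFTR", exc)
        return "eloftr", load_eloftr()
```

Running this once at startup (rather than per frame) keeps the hedge essentially free at inference time.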

Confidence

High — maturity metrics are objective; fallback strategy is standard engineering practice.


Dimension 3: Hit Rate Claim Accuracy

Fact Confirmation

Draft04 states "77.3% hit rate in Hard conditions on satellite-aerial benchmarks." The paper shows 77.3% is on the self-made dataset (Harbin/Qiqihar). On UAV-VisLoc Hard, LiteSAM achieves 61.65% (Fact #4).

Reference Comparison

61.65% on UAV-VisLoc Hard is still better than SuperPoint+LightGlue's estimated 54-58%, but the gap is much narrower than 77.3% suggests.

Conclusion

Correct the hit-rate claim in the draft. Report both numbers: 61.65% on UAV-VisLoc Hard and 77.3% on the self-made dataset. The improvement over SP+LG is real but more modest (~4-7pp on UAV-VisLoc) than the draft implies (~19pp).

Confidence

High — numbers directly from the paper.


Dimension 4: Model Loading Security

Fact Confirmation

CVE-2025-32434 (PyTorch ≤2.5.1) and CVE-2026-24747 (PyTorch <2.10.0) both allow code execution through torch.load even with weights_only=True (Fact #7). LiteSAM weights are on Google Drive with no integrity verification (Fact #6).

Reference Comparison

All other models (SuperPoint, DINOv2) come from official registries (torch.hub, official repos). LiteSAM is the only model from an unverified source.

Conclusion

Pin PyTorch to ≥2.10.0. Add SHA256 checksum verification for ALL model weights, especially LiteSAM: download the weights once, compute the checksum, store it in configuration, and verify on every load. Prefer the safetensors format where available (DINOv2 from HuggingFace supports it).
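
The version pin is a one-line change in the dependency spec (`torch>=2.10.0`). The per-model checksum store could look like the sketch below; the filename `weights_manifest.yaml`, the source labels, and all paths are hypothetical, and the sha256 fields are deliberately left empty to be filled after the first verified download.

```yaml
# weights_manifest.yaml (hypothetical layout)
# Fill each sha256 after the first verified download, then fail closed
# on any mismatch at load time.
litesam:
  source: google-drive            # the only non-registry source in the pipeline
  path: weights/litesam.ckpt
  sha256: ""                      # placeholder
efficientloftr:
  source: official-repo
  path: weights/eloftr.ckpt
  sha256: ""
superpoint:
  source: torch.hub
  path: weights/superpoint.pth
  sha256: ""
dinov2:
  source: huggingface
  format: safetensors             # preferred: no pickle deserialization surface
  path: weights/dinov2.safetensors
  sha256: ""
```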

Confidence

High — CVEs are documented, mitigation is standard practice.


Dimension 5: VRAM Budget

Fact Confirmation

With SuperPoint+LightGlue for VO, peak VRAM is ~1.6GB. With XFeat, it drops to ~900MB (Fact #15). RTX 2060 has 6GB total, with ~500MB system overhead.

Reference Comparison

Both fit under 6GB, but XFeat provides 700MB more headroom for PyTorch CUDA allocator overhead, batch processing, and unexpected spikes.

Conclusion

Reverting to XFeat for VO improves VRAM headroom from ~4.4GB to ~5.1GB (total minus pipeline peak). No further action needed on VRAM; both configurations are safe.
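
A quick sanity check of the arithmetic (figures from Fact #15; note the 4.4/5.1GB headroom numbers are total minus pipeline peak, before subtracting the ~500MB system overhead):

```python
TOTAL_GB = 6.0      # RTX 2060 VRAM
OVERHEAD_GB = 0.5   # approximate system/driver allocations

def headroom_gb(peak_gb, include_overhead=False):
    """Headroom left after the pipeline's peak (optionally minus overhead)."""
    used = peak_gb + (OVERHEAD_GB if include_overhead else 0.0)
    return round(TOTAL_GB - used, 1)

print(headroom_gb(1.6), headroom_gb(0.9))              # 4.4 5.1, as quoted
print(headroom_gb(1.6, True), headroom_gb(0.9, True))  # 3.9 4.6 usable
```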

Confidence

⚠️ Medium — VRAM estimates are approximate; actual measurement needed.


Dimension 6: GTSAM Robustness

Fact Confirmation

iSAM2 can throw IndeterminantLinearSystemException (Fact #13). No error handling is specified in draft04.

Reference Comparison

This is a standard GTSAM failure mode. Production systems must handle it.

Conclusion

Wrap iSAM2.update() in try/except. On exception: log the error, skip the problematic factor, and retry with a relaxed noise model (10x sigma). If it still fails, mark the current position as VO-only. Never crash the pipeline on an optimizer failure.
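
A minimal guard for this pattern, written with gtsam-independent stubs so the policy is visible on its own: `update_fn` stands in for the normal isam2.update(...) call and `relax_fn` for the 10x-sigma retry (both hypothetical names). In the Python bindings the C++ IndeterminantLinearSystemException typically surfaces as a RuntimeError, so the wrapper catches broadly.

```python
import logging

log = logging.getLogger("fusion")

def safe_isam2_update(update_fn, relax_fn):
    """Try the normal update, then a relaxed-noise retry; never raise.

    Returns "ok", "relaxed", or "vo_only" so the caller can mark the
    current position estimate accordingly.
    """
    try:
        update_fn()
        return "ok"
    except Exception as exc:  # e.g. IndeterminantLinearSystemException
        log.warning("iSAM2 update failed (%s); retrying with 10x sigma", exc)
    try:
        relax_fn()
        return "relaxed"
    except Exception as exc:
        log.error("relaxed update failed (%s); falling back to VO-only", exc)
        return "vo_only"
```

The string result, rather than an exception, keeps the pipeline's control flow explicit: the caller decides what "VO-only" means for the current pose.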

Confidence

High — standard GTSAM robustness pattern.


Dimension 7: Satellite Imagery Freshness

Fact Confirmation

Google Maps imagery for eastern Ukraine conflict zones is 1-3 years old and intentionally kept outdated (Fact #12). This can significantly degrade feature matching accuracy.

Reference Comparison

DINOv2 coarse retrieval is robust to seasonal changes (semantic matching). Fine matching (LiteSAM/SuperPoint) is more sensitive to structural changes (destroyed buildings, new construction in the conflict zone).

Conclusion

Add imagery age awareness:
1) log satellite tile age when available,
2) increase satellite match noise sigma for known-outdated regions,
3) lower confidence thresholds for matches in areas with known imagery staleness,
4) document Maxar (paid, fresh) and user-provided tiles as higher-priority alternatives for conflict zones.
The existing multi-provider architecture already supports this; it just needs tuning.
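
The noise-sigma scaling could be tuned with something like the following sketch. The breakpoints (full trust under one year, linear ramp to a 3x sigma cap at three years, unknown age treated as old) are illustrative assumptions, not numbers from the draft.

```python
def satellite_noise_sigma(base_sigma_m, tile_age_years=None):
    """Inflate satellite-match measurement noise for stale imagery.

    base_sigma_m: noise sigma (meters) for fresh imagery.
    tile_age_years: tile age if the provider reports a capture date,
    else None (treated conservatively as 3 years old).
    """
    if tile_age_years is None:          # provider gave no capture date
        tile_age_years = 3.0
    if tile_age_years <= 1.0:
        scale = 1.0                     # fresh tile: trust as-is
    elif tile_age_years <= 3.0:
        scale = 1.0 + (tile_age_years - 1.0)  # linear ramp from 1x to 3x
    else:
        scale = 3.0                     # cap; staler tiles gate elsewhere
    return base_sigma_m * scale
```

Feeding the inflated sigma into the factor-graph noise model lets stale imagery down-weight itself instead of being rejected outright.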

Confidence

High — Google's policy is documented; impact on matching is well-understood.