8.8 KiB
Reasoning Chain
Dimension 1: Core Architecture
Fact Confirmation
Fixed-wing monocular VO accumulates drift because scale and terrain assumptions are imperfect (Fact #1). Aerial VPR can provide absolute anchors but has weather, scale, and repetition failure modes (Fact #2).
Reference Comparison
Pure VO/IMU cannot satisfy the long 8-hour mission. Pure satellite matching cannot run every frame inside the latency budget and will fail in turns, stale imagery, and repetitive fields.
Conclusion
Select a hybrid architecture: steady-state VO/IMU propagation, conditional satellite/VPR anchoring, and ESKF covariance gates.
Confidence
High. Supported by Sources #1 and #2.
Dimension 2: VO / VIO Dependency
Fact Confirmation
OpenVINS and ORB-SLAM3 both support visual-inertial estimation patterns, but both are GPL-family production dependency risks (Facts #4 and #5).
Reference Comparison
The project needs a production system with custom mode labels, covariance propagation, MAVLink-specific output behavior, and no hidden GPL obligation. A full SLAM stack also adds initialization and map-management complexity that the ACs do not require.
Conclusion
Use a custom VO/IMU propagation and ESKF mode machine for production. Use OpenVINS/ORB-SLAM3 only as benchmarks/reference algorithms.
Confidence
High for licensing/scope decision; medium for final estimator performance until prototype profiling.
Dimension 3: Satellite Retrieval And Local Matching
Fact Confirmation
Aerial VPR benefits from scale/overlap-aware chunks and retrieval + local alignment (Fact #2). LightGlue provides keypoint correspondences/scores from local descriptors (Fact #7). FAISS provides top-K retrieval over descriptors (Fact #12).
Reference Comparison
Global DINOv2/VLAD retrieval alone gives candidate chunks but not enough precision for AC-2.2. Local matching alone over the whole map is too expensive. The two-stage retrieval+alignment structure matches the operational need.
Conclusion
Use DINOv2-VLAD descriptors with FAISS top-K for conditional candidate retrieval, followed by DISK/ALIKED+LightGlue and OpenCV RANSAC homography for local alignment.
Confidence
Medium-high. API fit is strong; embedded runtime must be profiled.
Dimension 4: Latency And Memory
Fact Confirmation
Some aerial VPR re-ranking methods are too slow for the steady-state path (Fact #3). Jetson Orin Nano Super has 8 GB shared memory and 25 W power mode (Fact #18), with thermal throttling risk (Fact #19).
Reference Comparison
At 3 fps and <400 ms p95, the system can process every frame through lightweight VO/IMU and use heavier VPR only on triggers. Running DINOv2/LightGlue across many candidates per frame would violate the AC unless proven otherwise.
Conclusion
Make VPR conditional, cap top-K by covariance/sector, run descriptor extraction on downsampled/orthorectified crops, precompute cache descriptors offline, and add performance regression tests.
Confidence
High for architecture direction; exact model sizes need profiling.
Dimension 5: Cache Format
Fact Confirmation
COG supports tiled compressed rasters and geospatial tooling (Fact #21). PMTiles is read-efficient but read-only (Fact #20).
Reference Comparison
The onboard system must both read service tiles and write new generated tiles in-flight. A read-only archive is a poor primary mutable store.
Conclusion
Use COG files plus STAC-like manifests/sidecars for imagery and metadata; use FAISS sidecar indexes for descriptors. PMTiles may be an export/snapshot format, not the live mutable cache.
Confidence
High.
Dimension 6: Autopilot Integration
Fact Confirmation
ArduPilot MAVLink GPS input requires GPS1_TYPE=14 (Fact #15). GPS_INPUT carries the fields needed for WGS84 position and accuracy (Fact #16). MAVSDK covers telemetry but raw GPS_INPUT emission still needs pymavlink/raw MAVLink (Fact #14).
Reference Comparison
MAVSDK-only output would not satisfy AC-4.3. Raw pymavlink-only telemetry is possible but gives up MAVSDK's high-level subscriptions.
Conclusion
Use MAVSDK for telemetry subscriptions and pymavlink for GPS_INPUT emission. Verify all failsafe/spoofing behavior in ArduPilot Plane SITL.
Confidence
High for interface; medium for exact Plane failsafe timing until SITL.
Dimension 7: Validation
Fact Confirmation
AerialVL provides aerial localization/reference-map data (Fact #22). EuRoC provides camera+IMU+ground truth but not fixed-wing nadir imagery (Fact #23). The current sample set lacks IMU/ground truth.
Reference Comparison
No single public dataset proves every AC. Public datasets can de-risk components, while final acceptance requires representative synchronized flight/replay data.
Conclusion
Use layered validation: EuRoC for VIO sanity, AerialVL/VPAir-style data for VPR/anchor tests, ArduPilot Plane SITL for MAVLink/failsafe/spoofing, and a final representative flight/replay rig for AC proof.
Confidence
High.
Dimension 8: OpenVINS vs Project-Owned Estimator
Fact Confirmation
OpenVINS is a strong monocular+IMU EKF/MSCKF VIO reference with covariance-aware estimation (Facts #4 and #30). OpenCV supplies calibration, undistortion, homography, RANSAC/USAC, and reprojection-error primitives, but it is not a full estimator by itself (Facts #6 and #31).
Reference Comparison
If the alternative is a naive OpenCV-only VIO stack, OpenVINS is the better technical starting point. The project's actual production choice is different: it needs a project-owned ESKF/mode machine that fuses VO/IMU, accepts/rejects satellite anchors, emits GPS_INPUT, labels every estimate, handles spoofing/blackout, and gates cache write-back.
Conclusion
Keep OpenVINS as a mandatory benchmark/reference implementation, not as the default production dependency. The production estimator remains project-owned, with OpenCV as the geometry utility layer.
Confidence
High. The technical comparison favors OpenVINS over naive custom VIO, while the product fit favors the project-owned estimator.
Dimension 9: Satellite Retrieval And Anchor Verification
Fact Confirmation
DINOv2-VLAD/AnyLoc-style retrieval has strong VPR evidence but descriptor/model size and TensorRT fidelity must be validated (Facts #33 and #36). Aerial VPR survey evidence emphasizes tile scale, overlap, season/weather shifts, repetitive patterns, and re-ranking cost (Fact #34). LightGlue supports ALIKED/DISK/SIFT feature matching and returns correspondences/scores suitable for RANSAC verification, but Jetson latency must be profiled (Facts #7, #8, #35).
Reference Comparison
Classical SIFT/ORB is simpler and cheap but weaker for cross-domain UAV-to-satellite matching. SuperPoint+LightGlue is technically strong but remains license-gated. Pure global retrieval without local verification is unsafe because repetitive farmland and stale imagery can produce plausible but wrong candidates.
Conclusion
Use DINOv2-VLAD + CPU-first FAISS as the triggered global retriever, then verify bounded top-K candidates with ALIKED/LightGlue + OpenCV RANSAC. Keep SIFT/ORB as a regression baseline and SuperPoint only after legal approval.
Confidence
High for architecture and interfaces; medium for final runtime until Jetson profiling.
Dimension 10: BASALT vs Kimera-VIO vs OpenVINS
Fact Confirmation
BASALT has strong published EuRoC evidence and production-friendly licensing (Fact #37). OpenVINS has the clearest EKF covariance API and consistency tooling, but GPLv3 remains a production constraint (Fact #38). Kimera-VIO is BSD-friendly but heavier and has documented mono-inertial caveats (Fact #39). All three require calibrated camera-to-IMU extrinsics; none has a special fixed-wing nadir mode (Fact #40).
Reference Comparison
For a single fixed downward camera, the selection criterion is not just benchmark ATE. The project needs a VIO core that can run on Jetson, tolerate calibrated nadir geometry, and be wrapped by project-specific satellite-anchor, confidence, MAVLink, and safety logic. OpenVINS is attractive for confidence/covariance but problematic as a shipped component. Kimera is acceptable as a BSD fallback, but mono-inertial risk makes it weaker as the first pick. BASALT provides the best production trade-off if its uncertainty can be calibrated and wrapped.
Conclusion
Select BASALT as the production VIO candidate. Keep OpenVINS as a reference/covariance baseline and Kimera-VIO as a backup candidate. The project-owned safety/anchor wrapper remains mandatory around BASALT because BASALT alone does not satisfy GPS-denied source labels, satellite anchors, false-position budgets, cache-write gates, or GPS_INPUT behavior.
Confidence
Medium-high. Library ranking is well supported; final acceptance still depends on representative fixed-wing nadir replay data.