# Question Decomposition

## Classification

- **Original question**: Design a GPS-denied onboard localization system for a fixed-wing UAV using a nadir camera, IMU, preloaded satellite imagery, and ArduPilot `GPS_INPUT`.
- **Active mode**: Mode A Phase 2, initial solution research.
- **Research output class**: Technical-component selection.
- **Question type**: Decision support with knowledge organization.
- **Timeliness sensitivity**: High for VPR, embedded AI inference, and MAVLink/ArduPilot integration; medium for geometry and filtering fundamentals.

## Research Boundary

| Dimension | Boundary |
|-----------|----------|
| Population | Fixed-wing UAV missions; not multirotor hover workflows. |
| Geography | Eastern/southern Ukraine operational areas east/left of the Dnipro River. |
| Timeframe | Current implementation target with 2024-2026 component evidence where possible. |
| Level | Onboard real-time production system, not offline post-processing. |
| Operating context | 8 h flight, 60 km/h, <=1 km AGL, 3 fps nav camera, Jetson Orin Nano Super, GPS denied/spoofed. |
| Required interfaces | Offline Satellite Service cache in; MAVLink `GPS_INPUT`, QGC telemetry, FDR records, and object-coordinate API out. |
| Non-functional envelope | <400 ms p95, <8 GB shared memory, 10 GB persistent cache target, 64 GB FDR cap, safety covariance and false-position budgets. |

## Project Constraint Matrix Summary

| Constraint Area | Binding Constraint |
|-----------------|-------------------|
| Camera | ADTi 20MP 20L V1, APS-C, ~5472 x 3648, fixed nadir, no gimbal stabilization. |
| Sensors | FC IMU/attitude/airspeed/altitude available over MAVLink; original still-image sample lacks synchronized IMU, while Derkachi replay data now provides synchronized IMU and `GLOBAL_POSITION_INT` trajectory. |
| Reference imagery | Offline cache only, 0.5 m/px minimum and 0.3 m/px ideal, freshness gates, no in-flight provider fetch. |
| Runtime | Jetson Orin Nano Super, CUDA/TensorRT available, 25 W thermal envelope. |
| Autopilot | ArduPilot only, v1 emits `GPS_INPUT` only; ODOMETRY intentionally disabled. |
| Storage | No raw frame retention; tiles + FDR only. Descriptor/index storage must be budgeted. |
| Safety | Reject weak anchors, never under-report covariance, fail/degrade honestly in blackout and spoofing. |
| Hard disqualifiers | Per-frame heavy VPR without profiling, runtime dependence on external network, stale-tile confident anchors, GPL production dependency unless licensing is accepted. |

## Perspectives

| Perspective | Focus |
|-------------|-------|
| Operator / mission user | Does the system keep the UAV navigable and report honest confidence under spoofing/blackout? |
| Embedded implementer | Can the pipeline fit <400 ms p95 and <8 GB on Jetson with maintainable interfaces? |
| Safety reviewer | Are false-position and cache-poisoning paths gated before they can steer the FC or poison future caches? |
| Field practitioner | Will seasonal agricultural repetition, turns, haze/smoke, and stale imagery break the architecture? |
| Contrarian | Which attractive libraries or SOTA models fail because of licensing, memory, latency, or input mismatch? |

## Sub-Questions And Query Variants

1. What architecture bounds drift while GPS is denied?
   - fixed-wing UAV GPS-denied satellite image matching visual odometry
   - visual odometry satellite imagery accumulated error fixed wing UAV
   - monocular VIO aerial navigation scale ambiguity satellite anchor
   - GPS spoofed UAV visual inertial navigation covariance failover
2. Which VO/VIO approach fits one nadir camera + IMU?
   - OpenVINS monocular visual inertial odometry Jetson
   - ORB-SLAM3 monocular inertial Jetson UAV limitations
   - VINS-Fusion fixed wing monocular IMU outdoor aerial
   - homography visual odometry nadir UAV IMU fusion
3. Which satellite retrieval and matching approach fits offline cache + <400 ms?
   - aerial visual place recognition survey DINOv2 FAISS
   - DINOv2 VLAD aerial VPR embedded memory
   - LightGlue SuperPoint DISK ALIKED TensorRT Jetson
   - cross-view UAV satellite matching failure modes farmland
4. How should the estimator and safety modes work?
   - ESKF visual inertial GPS denied UAV covariance
   - GPS_INPUT horiz_accuracy covariance external GPS ArduPilot
   - visual blackout IMU dead reckoning UAV covariance growth
   - false position rejection Mahalanobis gate visual localization
5. What cache format and data contract fit the onboard/Satellite Service boundary?
   - COG PMTiles MBTiles offline raster cache embedded
   - satellite tile descriptor index storage FAISS PMTiles
   - cloud optimized geotiff local update limitations
   - PMTiles read only update PostgreSQL/PostGIS-backed raster cache
6. How should MAVLink output integrate with ArduPilot Plane?
   - ArduPilot GPS_INPUT GPS1_TYPE 14 Plane SITL
   - pymavlink gps_input_send external GPS example
   - MAVSDK GPS_INPUT support raw MAVLink
   - ArduPilot EKF GPS glitch spoof failsafe Plane parameters
7. What validation datasets and tests are needed?
   - AerialVL UAV satellite visual localization dataset
   - VPAir aerial visual place recognition dataset
   - EuRoC MAV visual inertial odometry dataset
   - ArduPilot Plane SITL fake GPS spoofing simulation

## Component Option Search Plan

| Component Area | Option Families / Candidates | Evidence Needed |
|----------------|------------------------------|-----------------|
| Camera calibration and geometry | OpenCV calibration/homography; custom NumPy geometry; ROS camera pipeline | Official API for intrinsics, distortion, homography, RANSAC; permissive licensing; Jetson compatibility. |
| VO / VIO propagation | OpenVINS, ORB-SLAM3, VINS-Fusion, custom homography+IMU ESKF | Exact monocular+IMU input fit, output pose/covariance, licensing, runtime, initialization behavior. |
| VPR global retrieval | DINOv2-VLAD/AnyLoc, MixVPR/SALAD/SelaVPR, classical NetVLAD/BoW | Aerial benchmark evidence, descriptor size, offline index fit, embedded feasibility. |
| Local cross-domain matching | LightGlue + DISK/ALIKED, SuperPoint+LightGlue, LoFTR/XFeat, SIFT/ORB baseline | Inputs/outputs, match coordinates, license, runtime knobs, TensorRT/Jetson feasibility. |
| Vector index | FAISS CPU/GPU, PostgreSQL/pgvector metadata-assisted search, Annoy/HNSWLIB | Top-K retrieval, saved index, memory/compression knobs, ARM/Jetson feasibility. |
| Estimator | Custom ESKF, factor graph, robot_localization | Covariance output, mode labels, Mahalanobis gates, source-specific update control. |
| Cache/storage | COG, PostgreSQL/PostGIS manifest, PMTiles, MBTiles, raw tile folders | Offline read/update behavior, storage efficiency, metadata/manifest support. |
| MAVLink integration | pymavlink, MAVSDK, MAVProxy bridge | `GPS_INPUT` support, ArduPilot `GPS1_TYPE=14`, telemetry subscriptions, QGC status. |
| FDR | PostgreSQL event index, Parquet export, CBOR segment files | Streaming writes, rollover, compact typed records, replayability. |

## Completeness Audit

- **Cost/resources**: covered by Jetson, cache, thermal, and descriptor storage constraints.
- **Legal/licensing**: covered; GPL VIO/SLAM tools are not selected for production.
- **Dependencies**: Satellite Service cache contract, ArduPilot Plane SITL, and synchronized validation data are explicit dependencies.
- **Operating environment**: fixed-wing, altitude, terrain, seasonal/visibility classes, and blackout cases covered.
- **Failure modes**: VO failure, stale tiles, spoofing, blackout, thermal throttling, false anchors, cache poisoning covered.
- **Practitioner concerns**: real-time embedded performance and dataset mismatch covered through survey and benchmark sources.
- **Change over time**: DINOv2/VPR models and Jetson/TensorRT assumptions require version-pinned profiling during implementation.
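
The safety constraint ("reject weak anchors, never under-report covariance") and the Mahalanobis-gate query under sub-question 4 can be illustrated with a small sketch. This is not the project's estimator: the function names, the 2D local-metre frame, the 99% chi-square threshold, and the choice of the largest covariance eigenvalue as the reported horizontal accuracy are all assumptions made for illustration.

```python
import math

# Assumption: 99% chi-square quantile for 2 degrees of freedom, -2 * ln(0.01).
CHI2_2DOF_99 = -2.0 * math.log(0.01)


def mahalanobis_sq(innovation, cov):
    """Squared Mahalanobis distance of a 2D innovation under a 2x2 covariance."""
    dx, dy = innovation
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    # Closed-form inverse of the 2x2 matrix applied to the innovation.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def accept_anchor(predicted_xy, anchor_xy, state_cov, anchor_cov, gate=CHI2_2DOF_99):
    """Gate a satellite anchor against the predicted position (hypothetical API).

    Positions are (east, north) in metres; covariances are 2x2 nested lists.
    Returns (accepted, squared_distance).
    """
    innovation = (anchor_xy[0] - predicted_xy[0], anchor_xy[1] - predicted_xy[1])
    # Innovation covariance: predicted-state uncertainty plus anchor uncertainty.
    s = [[state_cov[i][j] + anchor_cov[i][j] for j in range(2)] for i in range(2)]
    d2 = mahalanobis_sq(innovation, s)
    return d2 <= gate, d2


def horiz_accuracy_m(cov):
    """Conservative 1-sigma horizontal accuracy: sqrt of the largest eigenvalue."""
    (a, b), (_, d) = cov
    mean = 0.5 * (a + d)
    spread = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return math.sqrt(mean + spread)
```

Taking the largest eigenvalue rather than an average of the diagonal is one way to avoid under-reporting uncertainty along the worst axis; whether that value maps directly onto the `horiz_accuracy` field of `GPS_INPUT` would need to be checked against ArduPilot's handling of that field.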
## Mode B Round 2 Addendum: User-Requested Technology Check

### Research Output Class

Technical-component selection. The addendum verifies two implementation choices before autodev proceeds to planning:

1. Whether OpenVINS should replace the custom OpenCV-based VO/ESKF direction.
2. Whether DINOv2-VLAD + ALIKED/LightGlue is still the right satellite retrieval and anchor-verification stack.

### Boundary Clarification

"Custom OpenCV" is treated as OpenCV for calibration, undistortion, feature geometry, homography/RANSAC, and MRE measurement, plus a project-owned ESKF/mode machine. It is not treated as a naive OpenCV-only replacement for VIO.

### Additional Query Variants Executed

- OpenVINS GPL-3 license MSCKF visual inertial odometry documentation monocular IMU 2026
- OpenVINS visual inertial odometry GPS denied UAV MSCKF limitations monocular high altitude nadir camera
- why not use OpenVINS production GPL ROS dependency visual inertial odometry limitations
- OpenCV license BSD 3-Clause camera calibration findHomography RANSAC documentation 4.x
- custom visual odometry OpenCV homography IMU EKF fixed wing UAV satellite imagery GPS denied 2024
- DINOv2 VLAD AnyLoc visual place recognition aerial satellite retrieval benchmark 2024 2025
- DINOv2 VLAD limitations visual place recognition storage compute AnyLoc limitations
- DINOv2 TensorRT Jetson performance issue embedding accuracy visual place recognition
- ALIKED LightGlue license local feature matching aerial image registration 2024 2025
- ALIKED LightGlue ONNX TensorRT Jetson performance benchmark local feature matching
- aerial visual place recognition survey 2024 runtime memory re-ranking SuperGlue LightGlue satellite UAV retrieval

### Addendum Conclusion

OpenVINS is better than a pure custom OpenCV-only VIO implementation, but the production architecture should keep OpenCV as the utility layer and keep the project-owned ESKF/mode machine as the shipped estimator. OpenVINS becomes a mandatory benchmark/reference rather than a shipped component, because it does not own the satellite anchor, spoofing/blackout, source-label, cache-write, and MAVLink semantics required by the acceptance criteria, and GPLv3 remains a production dependency blocker.

DINOv2-VLAD + CPU-first FAISS + ALIKED/LightGlue remains the preferred anchor stack, with two non-negotiable constraints: retrieval is trigger-based rather than per-frame, and TensorRT/ONNX optimizations are accepted only after descriptor-fidelity and Jetson latency tests.
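
The second constraint, accepting a TensorRT/ONNX-optimized model only after descriptor-fidelity and latency tests, could take the form of a small acceptance check like the one below. It is a sketch under stated assumptions: the function names are hypothetical, cosine similarity against reference descriptors is one possible fidelity metric, the `min_cosine=0.99` threshold is a placeholder, and the 400 ms default simply reuses the document's end-to-end p95 envelope rather than a per-component budget.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two descriptor vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def accept_optimized_engine(ref_descs, opt_descs, latencies_ms,
                            min_cosine=0.99, p95_budget_ms=400.0):
    """Compare optimized-engine descriptors with reference descriptors.

    ref_descs / opt_descs: descriptors for the same images from the reference
    model and the optimized engine. latencies_ms: measured per-call latencies.
    Both thresholds are placeholders to be set from profiling.
    """
    if not ref_descs or len(ref_descs) != len(opt_descs) or not latencies_ms:
        raise ValueError("need matching descriptor sets and latency samples")
    # Use the worst case, not the mean: one badly distorted descriptor can
    # still produce a wrong retrieval.
    worst = min(cosine_similarity(r, o) for r, o in zip(ref_descs, opt_descs))
    p95 = percentile(latencies_ms, 95)
    return {
        "worst_cosine": worst,
        "p95_ms": p95,
        "accepted": worst >= min_cosine and p95 <= p95_budget_ms,
    }
```

A fuller test would also compare top-K retrieval results between the two engines on the validation set, since small descriptor drift matters only when it changes which tiles are retrieved.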