Files
gps-denied-onboard/_docs/01_solution/solution_draft02.md
T
2026-04-29 17:03:57 +03:00

15 KiB

Solution Draft

Assessment Findings

Old Component Solution Weak Point (functional/security/performance) New Solution
SuperPoint + LightGlue-style local matching Official Magic Leap SuperPoint pretrained weights are noncommercial research-only, so they are not a selectable product dependency by default. Use a local-verification abstraction: LightGlue only with a license-cleared extractor, plus SIFT/AKAZE/classical matching as the legal v1 baseline.
AnyLoc/DINOv2-VLAD VPR chunks Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze.
10 GB persistent satellite cache 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery.
Public datasets for validation AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance.
cuVSLAM rejection rationale The first draft treated the stereo mismatch as enough; official docs also show IMU-only degraded tracking is short-duration. Keep cuVSLAM rejected for v1 product use, but retain it as a benchmark/reference if future hardware adds stereo.
GPS_INPUT + ODOMETRY hybrid Richer ODOMETRY semantics are attractive, but source-fusion behavior is version-sensitive. v1 emits GPS_INPUT only. ODOMETRY remains a v1.1 item gated by exact ArduPilot release and SITL proof.

Product Solution Description

Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible GPS_INPUT messages with calibrated confidence.

Nav camera + FC IMU/attitude/altitude
  -> frame ingest + timestamp sync + calibration
  -> planar VO/IMU relative motion
  -> conditional compressed-descriptor VPR over preloaded satellite chunks
  -> license-cleared local geometric verification
  -> ESKF state + covariance + source label
  -> pymavlink GPS_INPUT + local API + FDR

The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization.

Architecture

Component: Frame Ingest, Calibration, and Time Sync

Solution Tools Advantages Limitations Requirements Security Performance Fit
Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment OpenCV, GStreamer, NumPy, calibration YAML Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned Driver and hardware timestamp behavior are module-specific Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence 3 Hz full-res ingest; processing may downsample/ROI before hot path Selected

Exact-fit evidence:

  • Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
  • Evidence: Facts #4, #9, #20.
  • Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.

Component: Relative Motion Estimation

Solution Tools Advantages Limitations Requirements Security Performance Fit
Custom planar VO/IMU module OpenCV, Eigen/SciPy, optional C++ hot path Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude Needs careful calibration, rolling-shutter assessment, and covariance model Camera intrinsics, altitude, FC attitude/IMU, frame timestamps Reject low-inlier/high-innovation updates Must stay within steady-state budget after downsampling/ROI Selected
NVIDIA cuVSLAM Isaac ROS/cuVSLAM Strong Jetson ecosystem and IMU fallback Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration Stereo or documented exact monocular path ROS 2 surface area Good Jetson acceleration, wrong v1 input fit Rejected for v1
ORB-SLAM3 / VINS-Fusion Research SLAM/VIO stacks Mono-IMU capability GPL-family licensing and product integration risk Legal approval, ROS/C++ integration Larger dependency surface Benchmark/offline only Experimental only

Exact-fit evidence:

  • Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
  • Evidence: Facts #7, #8, #20.
  • Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.

Component: Satellite Cache and Preprocessing

Solution Tools Advantages Limitations Requirements Security Performance Fit
Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars GDAL/Rasterio, SQLite, local manifest schema Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile Signed manifests, checksums, immutable service-source tiles, stale-tile rejection Cache-packing benchmark must include descriptors and generated-tile sidecars Selected with storage gate

Exact-fit evidence:

  • Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
  • Evidence: Facts #13, #16, #21.
  • Disqualifiers: zoom level alone cannot define physical resolution or storage cost.

Component: Visual Place Recognition

Solution Tools Advantages Limitations Requirements Security Performance Fit
AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost Raw 49,152-dimensional descriptors can violate memory/cache budgets Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window Retrieval is candidate generation only; never trusted without local verification VPR invoked only on relocalization triggers; descriptor compression/index-size gate required Selected with compression gate
FAISS GPU/cuVS FAISS source build or cuVS Potential lower query latency ARM64 GPU deployment must be proven; not assumed Jetson source build and benchmark Same candidate-only trust model Optimization path only Experimental only

Exact-fit evidence:

  • Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
  • Evidence: Facts #5, #6, #10, #15, #19.
  • Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.

Component: Local Satellite/UAV Geometric Verification

Solution Tools Advantages Limitations Requirements Security Performance Fit
License-cleared feature extractor + LightGlue/classical matching + RANSAC homography/geodesic projection LightGlue where licensed, SIFT/AKAZE fallback, OpenCV, TensorRT if applicable Keeps geometric proof stage without depending on noncommercial weights Exact extractor accuracy and Jetson speed must be measured on steppe/agricultural imagery Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates Inline matcher target <=200 ms/pair; fallback relocalization can use longer budget Selected
Official Magic Leap SuperPoint pretrained weights SuperPoint Technically strong local features Noncommercial research license blocks product use by default Separate commercial license License noncompliance risk Not product path Rejected for v1

Exact-fit evidence:

  • Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
  • Evidence: Facts #10, #17, #18.
  • Disqualifiers: official SuperPoint weights are not selected unless licensing changes.

Component: State Estimator and Confidence

Solution Tools Advantages Limitations Requirements Security Performance Fit
Error-state Kalman filter in local NED/ENU NumPy/SciPy prototype, C++ Eigen if profiling requires Owns covariance, source labels, anchor gating, and output smoothing Requires calibration, Monte Carlo validation, and conservative covariance floors IMU propagation, VO deltas, satellite-anchor measurements, innovation gates Reject overconfident anchors; log every gate decision Bounded CPU path; hot path may move to C++ only if measured Selected

Exact-fit evidence:

  • Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
  • Evidence: Facts #1, #2, #9, #10.
  • Disqualifiers: direct matcher-to-GPS output is rejected.

Component: Flight Controller and Ground Station Interface

Solution Tools Advantages Limitations Requirements Security Performance Fit
v1 GPS_INPUT emitter pymavlink Matches GPS-replacement framing and ArduPilot MAVLink GPS input path Less expressive than full external-nav ODOMETRY; fields must be honest raw-GPS-sensor values ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields Validate rate, sequence, fix_type, and fail-closed behavior 5-10 Hz output; never batch frame-center outputs Selected
ODOMETRY auxiliary pymavlink Better covariance/yaw semantics EKF source-fusion and source-switching risk by ArduPilot version Version-pinned SITL and source-switch tests Avoid double-fusion v1.1 only Deferred

Exact-fit evidence:

  • Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
  • Evidence: Facts #1, #2, #3.
  • Disqualifiers: v1 emits no ODOMETRY.

Component: Local API, Object Localization, and FDR

Solution Tools Advantages Limitations Requirements Security Performance Fit
FastAPI local service + segmented FDR writer FastAPI, Pydantic, SQLite/Parquet/JSONL segments OpenAPI docs, health/session/object endpoints, replayable FDR Must stay outside hot frame path Local-first API, object pixel validation, rollover schema, no raw frame retention Bind localhost by default; JWT/API key for network exposure; segment checksums 1-2 Hz GCS summary; high-rate data local only Selected

Exact-fit evidence:

  • Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
  • Evidence: Fact #14 and Context7 FastAPI docs.
  • Disqualifiers: API cannot block GPS_INPUT emission.

Testing Strategy

Integration / Functional Tests

  • Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
  • Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
  • Run ArduPilot SITL with v1 parameters and assert accepted GPS_INPUT messages at 5-10 Hz, no ODOMETRY emission, correct fix_type transitions, and QGC downsampled telemetry.
  • Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
  • Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.

Non-Functional Tests

  • Jetson benchmark: capture-to-GPS_INPUT p95 <400 ms with VO, VPR triggers, local matcher, ESKF, API, and FDR active.
  • VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
  • Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
  • Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
  • Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
  • License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.

References

  • ArduPilot MAVProxy GPSInput documentation: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
  • MAVLink GPS_INPUT message spec: https://mavlink.io/en/messages/common.html#GPS_INPUT
  • NVIDIA Isaac ROS cuVSLAM docs: https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html
  • Magic Leap SuperPoint pretrained network license: https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE
  • LightGlue license and extractor-license issue: https://github.com/cvg/LightGlue/blob/main/LICENSE, https://github.com/cvg/LightGlue/issues/38
  • AnyLoc/DINO repository: https://github.com/AnyLoc/DINO
  • GDAL COG driver and OGC COG standard: https://gdal.org/en/stable/drivers/raster/cog.html, http://www.opengis.net/doc/is/COG/1.0
  • AerialVL dataset: https://github.com/hmf21/AerialVL
  • UAV-VisLoc paper: https://arxiv.org/html/2405.11936v1
  • AC assessment: _docs/00_research/00_ac_assessment.md
  • Question decomposition: _docs/00_research/00_question_decomposition.md
  • Source registry: _docs/00_research/01_source_registry.md
  • Fact cards: _docs/00_research/02_fact_cards.md
  • Component fit matrix: _docs/00_research/06_component_fit_matrix.md
  • Tech stack evaluation: _docs/01_solution/tech_stack.md
  • Security analysis: _docs/01_solution/security_analysis.md