Files
gps-denied-onboard/_docs/01_solution/solution_draft03.md
T
2026-04-29 17:03:57 +03:00

17 KiB

Solution Draft

Assessment Findings

Old Component Solution Weak Point (functional/security/performance) New Solution
"License-cleared extractor" in the local matcher Too abstract for planning; tasks need concrete candidates and legal baselines. Use ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback.
Image pipeline without an explicit scheduler A FIFO frame queue can violate <400 ms p95 latency even when individual stages are fast, because frames arrive every ~333 ms at 3 Hz. Add a bounded latest-frame scheduler: camera queue size 1, explicit frame-drop accounting, deadline-aware VPR/matching, and timestamp-correct GPS_INPUT.
SuperPoint + LightGlue-style local matching Official Magic Leap SuperPoint pretrained weights are noncommercial research-only. Reject official SuperPoint weights for product v1 unless a commercial license is obtained.
AnyLoc/DINOv2-VLAD VPR chunks Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze.
10 GB persistent satellite cache 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery.
Public datasets for validation AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance.
GPS_INPUT + ODOMETRY hybrid Richer ODOMETRY semantics are attractive, but source-fusion behavior is version-sensitive. v1 emits GPS_INPUT only. ODOMETRY remains a v1.1 item gated by exact ArduPilot release and SITL proof.

Product Solution Description

Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible GPS_INPUT messages with calibrated confidence.

Nav camera + FC IMU/attitude/altitude
  -> bounded latest-frame scheduler + timestamp sync
  -> calibration + frame normalization
  -> planar VO/IMU relative motion
  -> conditional compressed-descriptor VPR over preloaded satellite chunks
  -> ALIKED/LightGlue or SIFT/AKAZE local geometric verification
  -> ESKF state + covariance + source label
  -> pymavlink GPS_INPUT + local API + FDR

The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization. The scheduler owns frame freshness: if processing pressure rises, it drops stale frames instead of letting a FIFO backlog delay flight-controller output.

Architecture

Component: Real-Time Frame Scheduler

Solution Tools Advantages Limitations Requirements Security Performance Fit
Bounded latest-frame scheduler Python/C++ worker loop, monotonic timestamps, metrics counters Makes latency/drop behavior explicit and prevents stale FIFO backlog Requires careful timestamp ownership across camera, IMU, VO, and MAVLink output Camera queue size 1, drop accounting, deadline-aware VPR/matching, IMU propagation between image fixes Logs every drop and stale-frame rejection to FDR Supports <400 ms p95 and <=10% frame drops without batching Selected

Exact-fit evidence:

  • Project constraints checked: 3 Hz camera input, AC-4.1 latency/drop budget, AC-4.4 no batching, GPS_INPUT timestamp correctness.
  • Evidence: Fact #27.
  • Disqualifiers: unbounded FIFO image queues are rejected.

Component: Frame Ingest, Calibration, and Time Sync

Solution Tools Advantages Limitations Requirements Security Performance Fit
Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment OpenCV, GStreamer, NumPy, calibration YAML Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned Driver and hardware timestamp behavior are module-specific Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence 3 Hz full-res ingest; hot path may downsample/ROI Selected

Exact-fit evidence:

  • Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
  • Evidence: Facts #4, #9, #20.
  • Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.

Component: Relative Motion Estimation

Solution Tools Advantages Limitations Requirements Security Performance Fit
Custom planar VO/IMU module OpenCV, Eigen/SciPy, optional C++ hot path Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude Needs calibration, rolling-shutter assessment, and covariance model Camera intrinsics, altitude, FC attitude/IMU, frame timestamps Reject low-inlier/high-innovation updates Must stay within steady-state deadline after downsampling/ROI Selected
NVIDIA cuVSLAM Isaac ROS/cuVSLAM Strong Jetson ecosystem and IMU fallback Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration Stereo or documented exact monocular path ROS 2 surface area Good Jetson acceleration, wrong v1 input fit Rejected for v1
ORB-SLAM3 / VINS-Fusion Research SLAM/VIO stacks Mono-IMU capability GPL-family licensing and product integration risk Legal approval, ROS/C++ integration Larger dependency surface Benchmark/offline only Experimental only

Exact-fit evidence:

  • Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
  • Evidence: Facts #7, #8, #20.
  • Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.

Component: Satellite Cache and Preprocessing

Solution Tools Advantages Limitations Requirements Security Performance Fit
Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars GDAL/Rasterio, SQLite, local manifest schema Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile Signed manifests, checksums, immutable service-source tiles, stale-tile rejection Cache-packing benchmark must include descriptors and generated-tile sidecars Selected with storage gate

Exact-fit evidence:

  • Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
  • Evidence: Facts #13, #16, #21.
  • Disqualifiers: zoom level alone cannot define physical resolution or storage cost.

Component: Visual Place Recognition

Solution Tools Advantages Limitations Requirements Security Performance Fit
AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost Raw 49,152-dimensional descriptors can violate memory/cache budgets Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window Retrieval is candidate generation only; never trusted without local verification VPR invoked only on relocalization triggers; descriptor compression/index-size gate required Selected with compression gate
FAISS GPU/cuVS FAISS source build or cuVS Potential lower query latency ARM64 GPU deployment must be proven; not assumed Jetson source build and benchmark Same candidate-only trust model Optimization path only Experimental only

Exact-fit evidence:

  • Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
  • Evidence: Facts #5, #6, #10, #15, #19.
  • Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.

Component: Local Satellite/UAV Geometric Verification

Solution Tools Advantages Limitations Requirements Security Performance Fit
ALIKED + LightGlue + RANSAC LightGlue, ALIKED, OpenCV, ONNX/TensorRT path Concrete learned-feature candidate without the official SuperPoint license blocker Jetson speed and sparse-steppe accuracy must be measured Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates Inline matcher target <=200 ms/pair Selected candidate
OpenCV SIFT/AKAZE + classical matching OpenCV Commercial-safe legal baseline and regression target May be weaker on cross-domain imagery and sparse fields Same geometric verification inputs Same verification gates CPU/GPU baseline before learned extractor optimization Selected baseline
DeDoDe DeDoDe, ONNX/TensorRT ports MIT-licensed learned-feature fallback Model size, DINOv2-related variants, and Jetson runtime need validation Model artifact approval and benchmark Same verification gates Fallback if ALIKED/SIFT miss robustness targets Experimental only
Official Magic Leap SuperPoint pretrained weights SuperPoint Technically strong local features Noncommercial research license blocks product use by default Separate commercial license License noncompliance risk Not product path Rejected for v1

Exact-fit evidence:

  • Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
  • Evidence: Facts #10, #17, #18, #24, #25, #26.
  • Disqualifiers: official SuperPoint weights are not selected unless licensing changes.

Component: State Estimator and Confidence

Solution Tools Advantages Limitations Requirements Security Performance Fit
Error-state Kalman filter in local NED/ENU NumPy/SciPy prototype, C++ Eigen if profiling requires Owns covariance, source labels, anchor gating, and output smoothing Requires calibration, Monte Carlo validation, and conservative covariance floors IMU propagation, VO deltas, satellite-anchor measurements, innovation gates Reject overconfident anchors; log every gate decision Bounded CPU path; hot path may move to C++ only if measured Selected

Exact-fit evidence:

  • Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
  • Evidence: Facts #1, #2, #9, #10.
  • Disqualifiers: direct matcher-to-GPS output is rejected.

Component: Flight Controller and Ground Station Interface

Solution Tools Advantages Limitations Requirements Security Performance Fit
v1 GPS_INPUT emitter pymavlink Matches GPS-replacement framing and ArduPilot MAVLink GPS input path Less expressive than full external-nav ODOMETRY; fields must be honest raw-GPS-sensor values ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields Validate rate, sequence, fix_type, and fail-closed behavior 5-10 Hz output; freshest estimator timestamp only Selected
ODOMETRY auxiliary pymavlink Better covariance/yaw semantics EKF source-fusion and source-switching risk by ArduPilot version Version-pinned SITL and source-switch tests Avoid double-fusion v1.1 only Deferred

Exact-fit evidence:

  • Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
  • Evidence: Facts #1, #2, #3.
  • Disqualifiers: v1 emits no ODOMETRY.

Component: Local API, Object Localization, and FDR

Solution Tools Advantages Limitations Requirements Security Performance Fit
FastAPI local service + segmented FDR writer FastAPI, Pydantic, SQLite/Parquet/JSONL segments OpenAPI docs, health/session/object endpoints, replayable FDR Must stay outside hot frame path Local-first API, object pixel validation, rollover schema, no raw frame retention Bind localhost by default; JWT/API key for network exposure; segment checksums 1-2 Hz GCS summary; high-rate data local only Selected

Exact-fit evidence:

  • Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
  • Evidence: Fact #14 and Context7 FastAPI docs.
  • Disqualifiers: API cannot block GPS_INPUT emission.

Testing Strategy

Integration / Functional Tests

  • Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
  • Simulate heavy VPR/local-matching frames and assert stale frames are dropped, not queued, with <=10% drops under the defined sustained-load scenario.
  • Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
  • Run ArduPilot SITL with v1 parameters and assert accepted GPS_INPUT messages at 5-10 Hz, no ODOMETRY emission, correct fix_type transitions, and QGC downsampled telemetry.
  • Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
  • Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.

Non-Functional Tests

  • Jetson benchmark: capture-to-GPS_INPUT p95 <400 ms with scheduler, VO, VPR triggers, local matcher, ESKF, API, and FDR active.
  • Local matcher bake-off: ALIKED + LightGlue, SIFT/AKAZE, and DeDoDe on sample/public datasets, measuring accuracy, false positives, latency, memory, and license status.
  • VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
  • Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
  • Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
  • Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
  • License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.

References

  • ArduPilot MAVProxy GPSInput documentation: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
  • MAVLink GPS_INPUT message spec: https://mavlink.io/en/messages/common.html#GPS_INPUT
  • NVIDIA Isaac ROS cuVSLAM docs: https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html
  • Magic Leap SuperPoint pretrained network license: https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE
  • LightGlue repository: https://github.com/cvg/LightGlue
  • DeDoDe repository: https://github.com/Parskatt/DeDoDe
  • OpenCV SIFT source: https://github.com/opencv/opencv/blob/4.x/modules/features2d/src/sift.dispatch.cpp
  • AnyLoc/DINO repository: https://github.com/AnyLoc/DINO
  • GDAL COG driver and OGC COG standard: https://gdal.org/en/stable/drivers/raster/cog.html, http://www.opengis.net/doc/is/COG/1.0
  • AerialVL dataset: https://github.com/hmf21/AerialVL
  • UAV-VisLoc paper: https://arxiv.org/html/2405.11936v1
  • AC assessment: _docs/00_research/00_ac_assessment.md
  • Question decomposition: _docs/00_research/00_question_decomposition.md
  • Source registry: _docs/00_research/01_source_registry.md
  • Fact cards: _docs/00_research/02_fact_cards.md
  • Component fit matrix: _docs/00_research/06_component_fit_matrix.md
  • Tech stack evaluation: _docs/01_solution/tech_stack.md
  • Security analysis: _docs/01_solution/security_analysis.md