Solution Draft

Assessment Findings

| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
| --- | --- | --- |
| DINOv2-VLAD with possible TensorRT optimization | TensorRT conversion may produce limited speedup and can alter embedding distances on Jetson-class deployments. | Keep DINOv2-VLAD, but require descriptor-fidelity tests against PyTorch/ONNX before TensorRT descriptors are accepted. |
| FAISS CPU/GPU optional | FAISS GPU is not a safe default on Jetson ARM64/aarch64 packaging. | Pin FAISS as CPU-first on Jetson; use PQ/IVF and top-K caps before considering custom GPU builds. |
| LightGlue local matcher | SuperPoint path has license risk and community confusion. | Keep DISK/ALIKED+LightGlue as production default; SuperPoint remains license-gated benchmark/fallback only. |
| COG cache | "COG cache" could be misread as mutable in-place raster updates. | Use write-new COG tile objects plus manifest versioning and sidecars; never mutate COGs in place. |
| GPS_INPUT output | ArduPilot velocity ignore flags have reported EKF3 pitfalls. | SITL must validate velocity source parameters, ignore flags, and whether zero velocity is ever fused accidentally. |
| Visual/satellite anchoring | Draft did not emphasize adversarial/cache integrity enough. | Add signed cache manifests, tile provenance, freshness gates, anchor consistency checks, and FDR audit trail. |

Product Solution Description

Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot GPS_INPUT estimates with calibrated covariance and source labels.

The production architecture is a trigger-based hybrid estimator:

Nav camera + FC telemetry
        |
        v
Image quality + calibration + orthorectification
        |
        +--> Hot path: VO/IMU propagation --> custom ESKF --> GPS_INPUT + QGC + FDR
        |
        +--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> DISK/ALIKED+LightGlue --> RANSAC --> ESKF anchor
        |
        +--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync

Heavy retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
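
The trigger conditions listed above can be sketched as a small policy function. This is a minimal illustration, not the project's implementation: the `EstimatorStatus` fields, the `TriggerLimits` thresholds, and the reason strings are all hypothetical names and placeholder values that would need tuning on replay data.

```python
from dataclasses import dataclass


@dataclass
class EstimatorStatus:
    """Snapshot of estimator health. All field names are hypothetical."""
    has_anchor: bool         # False until the first accepted anchor (cold start)
    vo_healthy: bool         # False when VO tracking has failed
    segment_connected: bool  # False when the frame cannot be chained to the previous one
    yaw_rate_dps: float      # yaw rate in degrees per second
    horiz_sigma_m: float     # horizontal 1-sigma position uncertainty in metres
    anchor_age_s: float      # seconds since the last accepted anchor


@dataclass
class TriggerLimits:
    """Placeholder thresholds, not values taken from the draft."""
    max_yaw_rate_dps: float = 30.0
    max_horiz_sigma_m: float = 50.0
    max_anchor_age_s: float = 20.0


def retrieval_triggers(status, limits=None):
    """Return the reasons (possibly none) to run the heavy trigger path."""
    limits = limits or TriggerLimits()
    reasons = []
    if not status.has_anchor:
        reasons.append("cold_start")
    if not status.vo_healthy:
        reasons.append("vo_failure")
    if abs(status.yaw_rate_dps) > limits.max_yaw_rate_dps:
        reasons.append("sharp_turn")
    if not status.segment_connected:
        reasons.append("disconnected_segment")
    if status.horiz_sigma_m > limits.max_horiz_sigma_m:
        reasons.append("covariance_growth")
    if status.anchor_age_s > limits.max_anchor_age_s:
        reasons.append("stale_anchor")
    return reasons
```

An empty list means the frame stays on the hot path only; returning the reasons rather than a bare boolean keeps the decision available for logging.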

Architecture

Component: Camera Ingest, Calibration, And Geometry

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, RANSAC homography, MRE measurement | Mature, exact fit, permissive | Not a full estimator | Checkerboard calibration, fixed extrinsics, lens/FOV selection | Local-only | Fast enough for hot-path utility use | MVE in 02_fact_cards.md; Source #5 | Selected |

Component: VO / IMU Propagation And Estimator

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Custom VO/IMU ESKF | OpenCV + custom estimator | Nadir VO/homography + FC IMU/attitude/altitude fused in ESKF with mode labels | Owns covariance, source labels, blackout/spoofing behavior | More implementation effort | Synchronized frames/IMU, calibration, replay tests | No network dependency | Hot path is lightweight | Facts #1, #16 | Selected |
| OpenVINS | OpenVINS | Monocular+IMU reference runs | Strong EKF/MSCKF reference | GPL-3 production risk | Replay adapter | Local only | Benchmark only | Source #3 | Reference only |
| ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Mature benchmark | GPLv3 and heavier SLAM lifecycle | Calibration/vocabulary/runtime tuning | Local only | Riskier on embedded | Source #4 | Rejected for production |

Component: Satellite Retrieval And Anchor Verification

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DINOv2-VLAD + CPU FAISS + DISK/ALIKED+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; local match on candidates; TensorRT only after fidelity check | Strong retrieval+geometry structure; avoids per-frame map search | Requires profiling and representative data | Descriptor cache, dynamic K, freshness, RANSAC, Mahalanobis gates | Signed manifests, provenance, stale-tile rejection | Trigger path only; top-K capped | MVE blocks in 02_fact_cards.md; Sources #6-#9, #21-#25 | Selected with runtime/fidelity gates |
| SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | Strong technical baseline | SuperPoint license risk | Legal review | Local only | Benchmark only | Sources #6, #23 | Needs user decision |
| Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Simple and cheap | Weak cross-domain robustness | Feature-rich scenes | Local only | Fast | Source #5 | Regression baseline |
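
The "TensorRT only after fidelity check" gate for the selected retrieval stack could look roughly like the following. It is a pure-Python sketch under stated assumptions: descriptors are plain lists of floats, the database is small enough to scan linearly, and the acceptance thresholds (`min_cosine`, `min_overlap`) are placeholders rather than values from the draft. A real check would run over a representative query set and the actual FAISS index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-norm descriptor")
    return dot / (norm_a * norm_b)


def top_k(query, database, k):
    """Indices of the k database descriptors most similar to the query."""
    order = sorted(range(len(database)),
                   key=lambda i: cosine(query, database[i]),
                   reverse=True)
    return order[:k]


def fidelity_report(ref_query, opt_query, database, k):
    """Compare a reference descriptor with its optimized-engine counterpart."""
    ref_rank = top_k(ref_query, database, k)
    opt_rank = top_k(opt_query, database, k)
    return {
        "cosine": cosine(ref_query, opt_query),
        "top_k_overlap": len(set(ref_rank) & set(opt_rank)) / k,
        "top1_match": ref_rank[0] == opt_rank[0],
    }


def passes_fidelity(report, min_cosine=0.99, min_overlap=0.9):
    """Placeholder acceptance rule; thresholds are assumptions."""
    return (report["cosine"] >= min_cosine
            and report["top_k_overlap"] >= min_overlap
            and report["top1_match"])
```

Checking ranking overlap as well as raw cosine similarity matters because a small embedding drift can still reorder near-tied candidates, which is what the downstream local matcher actually sees.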

Component: Cache And Tile Lifecycle

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| COG tile objects + manifest + sidecars | GDAL COG, manifest DB/JSON, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by manifest | Geospatial standard, supports provenance and quality metadata | Descriptor budget pressure | CRS/date/source/m/px/freshness, sidecar hashes | Signed manifests, tile provenance, hash verification | Efficient local reads | Source #18; Facts #21, #29 | Selected |
| PMTiles | PMTiles | Read-only archive snapshot | Compact read package | Cannot update in place | Archive rebuild | Hash archive | Good for read-only export | Source #17 | Rejected for live cache |
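
The write-new rule for the selected COG cache can be illustrated with a content-addressed file plus an atomically replaced JSON manifest. This is a sketch only: the manifest layout, the `manifest.json` name, and the function names are hypothetical, the tile payload is treated as opaque bytes (no GDAL validation), and manifest signing is left out.

```python
import hashlib
import json
import os
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name


def load_manifest(cache_dir):
    path = Path(cache_dir) / MANIFEST_NAME
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def save_manifest(cache_dir, manifest):
    """Write to a temp file, then swap it in with an atomic rename."""
    path = Path(cache_dir) / MANIFEST_NAME
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    os.replace(tmp, path)


def publish_tile(cache_dir, tile_id, cog_bytes, sidecar):
    """Store a tile as a new object and point the manifest at it."""
    digest = hashlib.sha256(cog_bytes).hexdigest()
    tile_path = Path(cache_dir) / f"{tile_id}.{digest[:16]}.tif"
    if not tile_path.exists():
        # Mode "x" refuses to overwrite, so existing tiles are never mutated.
        with open(tile_path, "xb") as fh:
            fh.write(cog_bytes)
    manifest = load_manifest(cache_dir)
    entry = manifest.setdefault(tile_id, {"active": None, "versions": []})
    if all(v["sha256"] != digest for v in entry["versions"]):
        entry["versions"].append(
            {"file": tile_path.name, "sha256": digest, "sidecar": sidecar})
    entry["active"] = digest
    save_manifest(cache_dir, manifest)
    return digest


def rollback_tile(cache_dir, tile_id):
    """Re-activate the version recorded just before the active one."""
    manifest = load_manifest(cache_dir)
    entry = manifest[tile_id]
    digests = [v["sha256"] for v in entry["versions"]]
    index = digests.index(entry["active"])
    if index == 0:
        raise ValueError("no earlier version to roll back to")
    entry["active"] = digests[index - 1]
    save_manifest(cache_dir, manifest)
    return entry["active"]
```

Because old tile files are never touched, rollback is just a manifest edit, and the stored SHA-256 gives the hash-verification hook that the Security column asks for.
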

Component: Telemetry And GPS_INPUT Output

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MAVSDK telemetry + pymavlink GPS_INPUT | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits GPS_INPUT; Plane SITL validates GPS1_TYPE=14, velocity source params, ignore flags, fix types, accuracy fields | Exact output control with good telemetry ergonomics | SITL required to prove Plane behavior | ArduPilot Plane params, QGC, tlog/FDR | Link/source validation, status audit | Light CPU load | Sources #10-#12, #24 | Selected |
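
One way to make the ignore-flag handling explicit is to build the GPS_INPUT fields in a single place, so that an absent velocity estimate always sets the velocity ignore bits instead of silently sending zeros. The sketch below uses only the standard library; the bit values follow the MAVLink `GPS_INPUT_IGNORE_FLAGS` enum, while `build_gps_input`, its parameters, and the 0.0 placeholders are hypothetical. Time fields, `gps_id`, and `satellites_visible` are omitted for brevity. Whether ArduPilot honors these flags in the chosen EKF source configuration is exactly what the SITL validation has to establish.

```python
# Bit values of the MAVLink GPS_INPUT_IGNORE_FLAGS enum.
IGNORE_ALT = 1
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32
IGNORE_HORIZONTAL_ACCURACY = 64
IGNORE_VERTICAL_ACCURACY = 128


def build_gps_input(lat_deg, lon_deg, alt_m, fix_type, horiz_accuracy_m,
                    vert_accuracy_m, velocity_ned=None, speed_accuracy_mps=None):
    """Assemble a subset of GPS_INPUT fields as a plain dict."""
    # This sketch never computes dilution of precision, so always ignore it.
    ignore_flags = IGNORE_HDOP | IGNORE_VDOP
    if velocity_ned is None:
        # No velocity estimate: flag it rather than let zeros look like data.
        ignore_flags |= IGNORE_VEL_HORIZ | IGNORE_VEL_VERT
        vn = ve = vd = 0.0
    else:
        vn, ve, vd = (float(v) for v in velocity_ned)
    if velocity_ned is None or speed_accuracy_mps is None:
        ignore_flags |= IGNORE_SPEED_ACCURACY
        speed_accuracy = 0.0
    else:
        speed_accuracy = float(speed_accuracy_mps)
    return {
        "ignore_flags": ignore_flags,
        "fix_type": fix_type,
        "lat": int(round(lat_deg * 1e7)),  # degrees * 1e7, as GPS_INPUT expects
        "lon": int(round(lon_deg * 1e7)),
        "alt": float(alt_m),               # metres
        "hdop": 0.0,                       # placeholder, flagged as ignored
        "vdop": 0.0,                       # placeholder, flagged as ignored
        "vn": vn, "ve": ve, "vd": vd,      # m/s, NED frame
        "speed_accuracy": speed_accuracy,
        "horiz_accuracy": float(horiz_accuracy_m),
        "vert_accuracy": float(vert_accuracy_m),
    }
```

A SITL test can then assert on the `ignore_flags` value that was sent and compare it with what the EKF actually fused in the tlog.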

Component: Security And Safety Controls

| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Consistency-gated anchor acceptance | Custom ESKF gates, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Prevents confident false fixes | Needs calibrated thresholds | Representative replay and Monte Carlo | Rejects stale/poisoned/low-confidence anchors | Lightweight after candidate generation | Facts #16, #17, #28 | Selected |
| FDR audit trail | Segmented logs + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Supports incident analysis and cache-poisoning audits | Schema work | 64 GB rollover | Tamper-evident hashes recommended | Sequential writes | AC-NEW-3 | Selected |
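
A reduced version of the consistency gate might combine the provenance, freshness, RANSAC, and Mahalanobis checks as follows. It is a sketch under simplifying assumptions: position is 2D, the innovation covariance is taken as S = P + R, the chi-square threshold is the 99% quantile for two degrees of freedom, and `min_inlier_ratio` and `max_tile_age_days` are placeholder values. The temporal-consistency check named in the table is not covered here.

```python
CHI2_2DOF_99 = 9.21  # 99% quantile of chi-square with 2 degrees of freedom


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance of a 2D innovation under covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    x, y = innovation
    # Expansion of v^T S^-1 v using the closed-form 2x2 inverse.
    return (d * x * x - (b + c) * x * y + a * y * y) / det


def gate_anchor(predicted_xy, anchor_xy, P, R, inlier_ratio, tile_age_days,
                hash_ok, min_inlier_ratio=0.3, max_tile_age_days=365.0):
    """Return (accepted, reasons); any failed gate rejects the anchor."""
    reasons = []
    if not hash_ok:
        reasons.append("provenance")
    if tile_age_days > max_tile_age_days:
        reasons.append("freshness")
    if inlier_ratio < min_inlier_ratio:
        reasons.append("ransac")
    S = [[P[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    innovation = (anchor_xy[0] - predicted_xy[0],
                  anchor_xy[1] - predicted_xy[1])
    if mahalanobis_sq_2d(innovation, S) > CHI2_2DOF_99:
        reasons.append("mahalanobis")
    return (not reasons, reasons)
```

Returning every failed gate, instead of stopping at the first, gives the FDR audit trail a full record of why an anchor was rejected.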

Runtime Modes

| Mode | Trigger | Behavior | GPS_INPUT / Telemetry |
| --- | --- | --- | --- |
| satellite_anchored | VPR + local match passes all gates | ESKF absolute update; tile write eligible only if sigma gate passes | 3D fix, horiz_accuracy >= 95% covariance semi-major axis |
| vo_extrapolated | VO healthy and anchor age/covariance within bounds | VO/IMU propagation; covariance grows | 3D/2D depending on covariance threshold |
| dead_reckoned | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC VISUAL_BLACKOUT_IMU_ONLY |
| failsafe/no-fix | covariance >500 m or blackout >30 s | stop pretending position is valid | fix_type=0, horiz_accuracy=999.0, QGC VISUAL_BLACKOUT_FAILSAFE |
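
The mode table can be read as a mapping from estimator state to GPS_INPUT fields. The sketch below is one possible reading, not the specified behavior. It assumes the "covariance >500 m" limit is compared against the 95% semi-major axis of the 2D position covariance, that "visual blackout" is equivalent to VO being unhealthy, that the degraded fix type is 2, and that the 3D/2D switch happens at a hypothetical `two_d_threshold_m`. Anchor-age bounds are omitted.

```python
import math

CHI2_2DOF_95 = 5.991  # 95% quantile of chi-square with 2 degrees of freedom


def semi_major_95_m(P):
    """95% confidence-ellipse semi-major axis of a symmetric 2x2 covariance."""
    a, b, d = P[0][0], P[0][1], P[1][1]
    lam_max = 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return math.sqrt(CHI2_2DOF_95 * lam_max)


def select_output(P, anchor_accepted, vo_healthy, blackout_s,
                  two_d_threshold_m=100.0):
    """Pick the runtime mode and the GPS_INPUT fix type and accuracy."""
    accuracy_m = semi_major_95_m(P)
    if accuracy_m > 500.0 or blackout_s > 30.0:
        # Failsafe row of the table: report no fix with a sentinel accuracy.
        return {"mode": "failsafe/no-fix", "fix_type": 0,
                "horiz_accuracy": 999.0}
    if anchor_accepted:
        mode, fix_type = "satellite_anchored", 3
    elif vo_healthy:
        mode = "vo_extrapolated"
        fix_type = 3 if accuracy_m <= two_d_threshold_m else 2
    else:
        mode, fix_type = "dead_reckoned", 2  # assumed degraded fix type
    return {"mode": mode, "fix_type": fix_type, "horiz_accuracy": accuracy_m}
```

Evaluating the failsafe condition first means a stale or blacked-out estimator can never report a valid fix, whatever the other flags say.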

Testing Strategy

Integration / Functional Tests

  • VO replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs.
  • Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
  • DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
  • FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
  • LightGlue extractor matrix: DISK vs ALIKED vs SuperPoint benchmark; SuperPoint output excluded from production unless license approved.
  • COG cache lifecycle: write-new generated tile, update manifest, verify active version and rollback.
  • GPS_INPUT SITL: validate fix type, horiz_accuracy, velocity fields, ignore flags, EK3_SRC1_* parameters, QGC behavior.
  • Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.

Non-Functional Tests

  • Jetson latency and memory: <400 ms p95, <8 GB shared memory, no 25 W thermal throttle.
  • Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
  • FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
  • Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
  • Cold boot: first valid GPS_INPUT <30 s p95 across 50 runs.
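
The p95 budgets in this list (latency under 400 ms, cold boot under 30 s across 50 runs) need a fixed percentile definition to be reproducible. A minimal sketch using the nearest-rank method follows; the function names are hypothetical and the choice of nearest-rank over interpolation is an assumption, not something the draft specifies.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples, budget):
    """True when the p95 of the samples is strictly below the budget."""
    return p95(samples) < budget
```

With 50 cold-boot runs the nearest rank is 48, so up to two runs may exceed the budget before the check fails.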

References

Detailed source registry: _docs/00_research/01_source_registry.md.

Key added Mode B sources:

  • Tech stack evaluation: _docs/01_solution/tech_stack.md
  • Component fit matrix: _docs/00_research/06_component_fit_matrix.md
  • Fact cards: _docs/00_research/02_fact_cards.md