# Solution Draft ## Assessment Findings | Old Component Solution | Weak Point (functional/security/performance) | New Solution | |------------------------|----------------------------------------------|-------------| | DINOv2-VLAD with possible TensorRT optimization | TensorRT conversion may produce limited speedup and can alter embedding distances on Jetson-class deployments. | Keep DINOv2-VLAD, but require descriptor-fidelity tests against PyTorch/ONNX before TensorRT descriptors are accepted. | | FAISS CPU/GPU optional | FAISS GPU is not a safe default on Jetson ARM64/aarch64 packaging. | Pin FAISS as CPU-first on Jetson; use PQ/IVF and top-K caps before considering custom GPU builds. | | LightGlue local matcher | SuperPoint path has license risk and community confusion. | Keep DISK/ALIKED+LightGlue as production default; SuperPoint remains license-gated benchmark/fallback only. | | COG cache | "COG cache" could be misread as mutable in-place raster updates. | Use write-new COG tile objects plus manifest versioning and sidecars; never mutate COGs in place. | | `GPS_INPUT` output | ArduPilot velocity ignore flags have reported EKF3 pitfalls. | SITL must validate velocity source parameters, ignore flags, and whether zero velocity is ever fused accidentally. | | Visual/satellite anchoring | Draft did not emphasize adversarial/cache integrity enough. | Add signed cache manifests, tile provenance, freshness gates, anchor consistency checks, and FDR audit trail. | ## Product Solution Description Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels. The production architecture is a trigger-based hybrid estimator: ```text Nav camera + FC telemetry | v Image quality + calibration + orthorectification | +--> Hot path: VO/IMU propagation --> custom ESKF --> GPS_INPUT + QGC + FDR | +--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> DISK/ALIKED+LightGlue --> RANSAC --> ESKF anchor | +--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync ``` Heavy retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age. ## Architecture ### Component: Camera Ingest, Calibration, And Geometry | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, RANSAC homography, MRE measurement | Mature, exact fit, permissive | Not a full estimator | Checkerboard calibration, fixed extrinsics, lens/FOV selection | Local-only | Fast enough for hot-path utility use | MVE in `02_fact_cards.md`; Source #5 | Selected | ### Component: VO / IMU Propagation And Estimator | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | Custom VO/IMU ESKF | OpenCV + custom estimator | Nadir VO/homography + FC IMU/attitude/altitude fused in ESKF with mode labels | Owns covariance, source labels, blackout/spoofing behavior | More implementation effort | Synchronized frames/IMU, calibration, replay tests | No network dependency | Hot path is lightweight | Facts #1, #16 | Selected | | OpenVINS | OpenVINS | Monocular+IMU reference runs | Strong EKF/MSCKF reference | GPL-3 production risk | Replay adapter | Local only | Benchmark only | Source #3 | Reference only | | ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Mature benchmark | GPLv3 and heavier SLAM lifecycle | Calibration/vocabulary/runtime tuning | Local only | Riskier on embedded | Source #4 | Rejected for production | ### Component: Satellite Retrieval And Anchor Verification | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | DINOv2-VLAD + CPU FAISS + DISK/ALIKED+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; local match on candidates; TensorRT only after fidelity check | Strong retrieval+geometry structure; avoids per-frame map search | Requires profiling and representative data | Descriptor cache, dynamic K, freshness, RANSAC, Mahalanobis gates | Signed manifests, provenance, stale-tile rejection | Trigger path only; top-K capped | MVE blocks in `02_fact_cards.md`; Sources #6-#9, #21-#25 | Selected with runtime/fidelity gates | | SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | Strong technical baseline | SuperPoint license risk | Legal review | Local only | Benchmark only | Sources #6, #23 | Needs user decision | | Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Simple and cheap | Weak cross-domain robustness | Feature-rich scenes | Local only | Fast | Source #5 | Regression baseline | ### Component: Cache And Tile Lifecycle | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | COG tile objects + manifest + sidecars | GDAL COG, manifest DB/JSON, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by manifest | Geospatial standard, supports provenance and quality metadata | Descriptor budget pressure | CRS/date/source/m/px/freshness, sidecar hashes | Signed manifests, tile provenance, hash verification | Efficient local reads | Source #18; Facts #21, #29 | Selected | | PMTiles | PMTiles | Read-only archive snapshot | Compact read package | Cannot update in place | Archive rebuild | Hash archive | Good for read-only export | Source #17 | Rejected for live cache | ### Component: MAVLink Integration | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Exact output control with good telemetry ergonomics | SITL required to prove Plane behavior | ArduPilot Plane params, QGC, tlog/FDR | Link/source validation, status audit | Light CPU load | Sources #10-#12, #24 | Selected | ### Component: Security And Safety Controls | Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit | |----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----| | Consistency-gated anchor acceptance | Custom ESKF gates, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Prevents confident false fixes | Needs calibrated thresholds | Representative replay and Monte Carlo | Rejects stale/poisoned/low-confidence anchors | Lightweight after candidate generation | Facts #16, #17, #28 | Selected | | FDR audit trail | Segmented logs + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Supports incident analysis and cache-poisoning audits | Schema work | 64 GB rollover | Tamper-evident hashes recommended | Sequential writes | AC-NEW-3 | Selected | ## Runtime Modes | Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry | |------|---------|----------|--------------------------| | `satellite_anchored` | VPR + local match passes all gates | ESKF absolute update; tile write eligible only if sigma gate passes | 3D fix, `horiz_accuracy` >= 95% covariance semi-major axis | | `vo_extrapolated` | VO healthy and anchor age/covariance within bounds | VO/IMU propagation; covariance grows | 3D/2D depending covariance threshold | | `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` | | failsafe/no-fix | covariance >500 m or blackout >30 s | stop pretending position is valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` | ## Testing Strategy ### Integration / Functional Tests - VO replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs. - Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels. - DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines. - FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64. - LightGlue extractor matrix: DISK vs ALIKED vs SuperPoint benchmark; SuperPoint output excluded from production unless license approved. - COG cache lifecycle: write-new generated tile, update manifest, verify active version and rollback. - `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior. - Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout. ### Non-Functional Tests - Jetson latency and memory: <400 ms p95, <8 GB shared memory, no 25 W thermal throttle. - Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget. - FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss. - Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7. - Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs. ## References Detailed source registry: `_docs/00_research/01_source_registry.md`. Key added Mode B sources: - DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348 - DINOv2 Jetson forum issue: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251 - LightGlue license discussion: https://github.com/cvg/LightGlue/issues/120 - ArduPilot GPS_INPUT velocity issue: https://github.com/ArduPilot/ardupilot/issues/19633 - FAISS install docs: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md - Orthophoto visual geolocalization: https://ar5iv.labs.arxiv.org/html/2103.14381 ## Related Artifacts - Tech stack evaluation: `_docs/01_solution/tech_stack.md` - Component fit matrix: `_docs/00_research/06_component_fit_matrix.md` - Fact cards: `_docs/00_research/02_fact_cards.md`