mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 09:51:13 +00:00
184 lines
17 KiB
Markdown
184 lines
17 KiB
Markdown
# Solution Draft
|
|
|
|
## Assessment Findings
|
|
|
|
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|
|
|------------------------|----------------------------------------------|--------------|
|
|
| "License-cleared extractor" in the local matcher | Too abstract for planning; tasks need concrete candidates and legal baselines. | Use ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback. |
|
|
| Image pipeline without an explicit scheduler | A FIFO frame queue can violate <400 ms p95 latency even when individual stages are fast, because frames arrive every ~333 ms at 3 Hz. | Add a bounded latest-frame scheduler: camera queue size 1, explicit frame-drop accounting, deadline-aware VPR/matching, and timestamp-correct `GPS_INPUT`. |
|
|
| SuperPoint + LightGlue-style local matching | Official Magic Leap SuperPoint pretrained weights are noncommercial research-only. | Reject official SuperPoint weights for product v1 unless a commercial license is obtained. |
|
|
| AnyLoc/DINOv2-VLAD VPR chunks | Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. | Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze. |
|
|
| 10 GB persistent satellite cache | 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. | Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery. |
|
|
| Public datasets for validation | AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. | Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance. |
|
|
| `GPS_INPUT` + `ODOMETRY` hybrid | Richer `ODOMETRY` semantics are attractive, but source-fusion behavior is version-sensitive. | v1 emits `GPS_INPUT` only. `ODOMETRY` remains a v1.1 item gated by exact ArduPilot release and SITL proof. |
|
|
|
|
## Product Solution Description
|
|
|
|
Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible `GPS_INPUT` messages with calibrated confidence.
|
|
|
|
```text
|
|
Nav camera + FC IMU/attitude/altitude
|
|
-> bounded latest-frame scheduler + timestamp sync
|
|
-> calibration + frame normalization
|
|
-> planar VO/IMU relative motion
|
|
-> conditional compressed-descriptor VPR over preloaded satellite chunks
|
|
-> ALIKED/LightGlue or SIFT/AKAZE local geometric verification
|
|
-> ESKF state + covariance + source label
|
|
-> pymavlink GPS_INPUT + local API + FDR
|
|
```
|
|
|
|
The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization. The scheduler owns frame freshness: if processing pressure rises, it drops stale frames instead of letting a FIFO backlog delay flight-controller output.
|
|
|
|
## Architecture
|
|
|
|
### Component: Real-Time Frame Scheduler
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| Bounded latest-frame scheduler | Python/C++ worker loop, monotonic timestamps, metrics counters | Makes latency/drop behavior explicit and prevents stale FIFO backlog | Requires careful timestamp ownership across camera, IMU, VO, and MAVLink output | Camera queue size 1, drop accounting, deadline-aware VPR/matching, IMU propagation between image fixes | Logs every drop and stale-frame rejection to FDR | Supports <400 ms p95 and <=10% frame drops without batching | Selected |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: 3 Hz camera input, AC-4.1 latency/drop budget, AC-4.4 no batching, GPS_INPUT timestamp correctness.
|
|
- Evidence: Fact #27.
|
|
- Disqualifiers: unbounded FIFO image queues are rejected.
|
|
|
|
### Component: Frame Ingest, Calibration, and Time Sync
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment | OpenCV, GStreamer, NumPy, calibration YAML | Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned | Driver and hardware timestamp behavior are module-specific | Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream | Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence | 3 Hz full-res ingest; hot path may downsample/ROI | Selected |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
|
|
- Evidence: Facts #4, #9, #20.
|
|
- Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.
|
|
|
|
### Component: Relative Motion Estimation
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| Custom planar VO/IMU module | OpenCV, Eigen/SciPy, optional C++ hot path | Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude | Needs calibration, rolling-shutter assessment, and covariance model | Camera intrinsics, altitude, FC attitude/IMU, frame timestamps | Reject low-inlier/high-innovation updates | Must stay within steady-state deadline after downsampling/ROI | Selected |
|
|
| NVIDIA cuVSLAM | Isaac ROS/cuVSLAM | Strong Jetson ecosystem and IMU fallback | Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration | Stereo or documented exact monocular path | ROS 2 surface area | Good Jetson acceleration, wrong v1 input fit | Rejected for v1 |
|
|
| ORB-SLAM3 / VINS-Fusion | Research SLAM/VIO stacks | Mono-IMU capability | GPL-family licensing and product integration risk | Legal approval, ROS/C++ integration | Larger dependency surface | Benchmark/offline only | Experimental only |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
|
|
- Evidence: Facts #7, #8, #20.
|
|
- Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.
|
|
|
|
### Component: Satellite Cache and Preprocessing
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars | GDAL/Rasterio, SQLite, local manifest schema | Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup | The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together | 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile | Signed manifests, checksums, immutable service-source tiles, stale-tile rejection | Cache-packing benchmark must include descriptors and generated-tile sidecars | Selected with storage gate |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
|
|
- Evidence: Facts #13, #16, #21.
|
|
- Disqualifiers: zoom level alone cannot define physical resolution or storage cost.
|
|
|
|
### Component: Visual Place Recognition
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks | PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization | Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost | Raw 49,152-dimensional descriptors can violate memory/cache budgets | Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window | Retrieval is candidate generation only; never trusted without local verification | VPR invoked only on relocalization triggers; descriptor compression/index-size gate required | Selected with compression gate |
|
|
| FAISS GPU/cuVS | FAISS source build or cuVS | Potential lower query latency | ARM64 GPU deployment must be proven; not assumed | Jetson source build and benchmark | Same candidate-only trust model | Optimization path only | Experimental only |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
|
|
- Evidence: Facts #5, #6, #10, #15, #19.
|
|
- Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.
|
|
|
|
### Component: Local Satellite/UAV Geometric Verification
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| ALIKED + LightGlue + RANSAC | LightGlue, ALIKED, OpenCV, ONNX/TensorRT path | Concrete learned-feature candidate without the official SuperPoint license blocker | Jetson speed and sparse-steppe accuracy must be measured | Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata | Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates | Inline matcher target <=200 ms/pair | Selected candidate |
|
|
| OpenCV SIFT/AKAZE + classical matching | OpenCV | Commercial-safe legal baseline and regression target | May be weaker on cross-domain imagery and sparse fields | Same geometric verification inputs | Same verification gates | CPU/GPU baseline before learned extractor optimization | Selected baseline |
|
|
| DeDoDe | DeDoDe, ONNX/TensorRT ports | MIT-licensed learned-feature fallback | Model size, DINOv2-related variants, and Jetson runtime need validation | Model artifact approval and benchmark | Same verification gates | Fallback if ALIKED/SIFT miss robustness targets | Experimental only |
|
|
| Official Magic Leap SuperPoint pretrained weights | SuperPoint | Technically strong local features | Noncommercial research license blocks product use by default | Separate commercial license | License noncompliance risk | Not product path | Rejected for v1 |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
|
|
- Evidence: Facts #10, #17, #18, #24, #25, #26.
|
|
- Disqualifiers: official SuperPoint weights are not selected unless licensing changes.
|
|
|
|
### Component: State Estimator and Confidence
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| Error-state Kalman filter in local NED/ENU | NumPy/SciPy prototype, C++ Eigen if profiling requires | Owns covariance, source labels, anchor gating, and output smoothing | Requires calibration, Monte Carlo validation, and conservative covariance floors | IMU propagation, VO deltas, satellite-anchor measurements, innovation gates | Reject overconfident anchors; log every gate decision | Bounded CPU path; hot path may move to C++ only if measured | Selected |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
|
|
- Evidence: Facts #1, #2, #9, #10.
|
|
- Disqualifiers: direct matcher-to-GPS output is rejected.
|
|
|
|
### Component: Flight Controller and Ground Station Interface
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| v1 `GPS_INPUT` emitter | pymavlink | Matches GPS-replacement framing and ArduPilot MAVLink GPS input path | Less expressive than full external-nav `ODOMETRY`; fields must be honest raw-GPS-sensor values | ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields | Validate rate, sequence, fix_type, and fail-closed behavior | 5-10 Hz output; freshest estimator timestamp only | Selected |
|
|
| `ODOMETRY` auxiliary | pymavlink | Better covariance/yaw semantics | EKF source-fusion and source-switching risk by ArduPilot version | Version-pinned SITL and source-switch tests | Avoid double-fusion | v1.1 only | Deferred |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
|
|
- Evidence: Facts #1, #2, #3.
|
|
- Disqualifiers: v1 emits no `ODOMETRY`.
|
|
|
|
### Component: Local API, Object Localization, and FDR
|
|
|
|
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
|
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
|
| FastAPI local service + segmented FDR writer | FastAPI, Pydantic, SQLite/Parquet/JSONL segments | OpenAPI docs, health/session/object endpoints, replayable FDR | Must stay outside hot frame path | Local-first API, object pixel validation, rollover schema, no raw frame retention | Bind localhost by default; JWT/API key for network exposure; segment checksums | 1-2 Hz GCS summary; high-rate data local only | Selected |
|
|
|
|
**Exact-fit evidence**:
|
|
- Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
|
|
- Evidence: Fact #14 and Context7 FastAPI docs.
|
|
- Disqualifiers: API cannot block GPS_INPUT emission.
|
|
|
|
## Testing Strategy
|
|
|
|
### Integration / Functional Tests
|
|
|
|
- Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
|
|
- Simulate heavy VPR/local-matching frames and assert stale frames are dropped, not queued, with <=10% drops under the defined sustained-load scenario.
|
|
- Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
|
|
- Run ArduPilot SITL with v1 parameters and assert accepted `GPS_INPUT` messages at 5-10 Hz, no `ODOMETRY` emission, correct fix_type transitions, and QGC downsampled telemetry.
|
|
- Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
|
|
- Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.
|
|
|
|
### Non-Functional Tests
|
|
|
|
- Jetson benchmark: capture-to-`GPS_INPUT` p95 <400 ms with scheduler, VO, VPR triggers, local matcher, ESKF, API, and FDR active.
|
|
- Local matcher bake-off: ALIKED + LightGlue, SIFT/AKAZE, and DeDoDe on sample/public datasets, measuring accuracy, false positives, latency, memory, and license status.
|
|
- VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
|
|
- Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
|
|
- Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
|
|
- Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
|
|
- License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.
|
|
|
|
## References
|
|
|
|
- ArduPilot MAVProxy GPSInput documentation: `https://ardupilot.org/mavproxy/docs/modules/GPSInput.html`
|
|
- MAVLink `GPS_INPUT` message spec: `https://mavlink.io/en/messages/common.html#GPS_INPUT`
|
|
- NVIDIA Isaac ROS cuVSLAM docs: `https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html`
|
|
- Magic Leap SuperPoint pretrained network license: `https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE`
|
|
- LightGlue repository: `https://github.com/cvg/LightGlue`
|
|
- DeDoDe repository: `https://github.com/Parskatt/DeDoDe`
|
|
- OpenCV SIFT source: `https://github.com/opencv/opencv/blob/4.x/modules/features2d/src/sift.dispatch.cpp`
|
|
- AnyLoc/DINO repository: `https://github.com/AnyLoc/DINO`
|
|
- GDAL COG driver and OGC COG standard: `https://gdal.org/en/stable/drivers/raster/cog.html`, `http://www.opengis.net/doc/is/COG/1.0`
|
|
- AerialVL dataset: `https://github.com/hmf21/AerialVL`
|
|
- UAV-VisLoc paper: `https://arxiv.org/html/2405.11936v1`
|
|
|
|
## Related Artifacts
|
|
|
|
- AC assessment: `_docs/00_research/00_ac_assessment.md`
|
|
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
|
|
- Source registry: `_docs/00_research/01_source_registry.md`
|
|
- Fact cards: `_docs/00_research/02_fact_cards.md`
|
|
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
|
|
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
|
|
- Security analysis: `_docs/01_solution/security_analysis.md`
|