# Solution Draft
## Assessment Findings
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|------------------------|----------------------------------------------|-------------|
| DINOv2-VLAD with possible TensorRT optimization | TensorRT conversion may produce limited speedup and can alter embedding distances on Jetson-class deployments. | Keep DINOv2-VLAD, but require descriptor-fidelity tests against PyTorch/ONNX before TensorRT descriptors are accepted. |
| FAISS CPU/GPU optional | FAISS GPU is not a safe default on Jetson because of ARM64/aarch64 packaging constraints. | Pin FAISS as CPU-first on Jetson; use PQ/IVF and top-K caps before considering custom GPU builds. |
| LightGlue local matcher | The SuperPoint path carries license risk, and there is ongoing community confusion about its terms. | Keep DISK/ALIKED+LightGlue as production default; SuperPoint remains license-gated benchmark/fallback only. |
| COG cache | "COG cache" could be misread as mutable in-place raster updates. | Use write-new COG tile objects plus manifest versioning and sidecars; never mutate COGs in place. |
| `GPS_INPUT` output | ArduPilot velocity ignore flags have reported EKF3 pitfalls. | SITL must validate velocity source parameters, ignore flags, and whether zero velocity is ever fused accidentally. |
| Visual/satellite anchoring | The draft under-emphasized adversarial robustness and cache integrity. | Add signed cache manifests, tile provenance, freshness gates, anchor consistency checks, and FDR audit trail. |
## Product Solution Description
Build an onboard GPS-denied localization service that runs on the Jetson companion computer, uses the fixed downward navigation camera and flight-controller inertial telemetry, and emits ArduPilot `GPS_INPUT` estimates with calibrated covariance and source labels.
The production architecture is a trigger-based hybrid estimator:
```text
Nav camera + FC telemetry
        |
        v
Image quality + calibration + orthorectification
        |
        +--> Hot path: VO/IMU propagation --> custom ESKF --> GPS_INPUT + QGC + FDR
        |
        +--> Trigger path: DINOv2-VLAD query --> CPU FAISS top-K --> DISK/ALIKED+LightGlue --> RANSAC --> ESKF anchor
        |
        +--> Tile path: new COG tile + quality/provenance sidecar --> manifest update --> post-flight Satellite Service sync
```
Heavy retrieval and local matching are not steady-state per-frame dependencies. They run on cold start, VO failure, sharp turns, disconnected segments, covariance growth, or stale-anchor age.
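
The trigger conditions listed above can be sketched as a single predicate that the hot path evaluates each frame. This is a minimal illustration, not the project's implementation: the `TriggerInputs` fields, the `retrieval_triggers` name, and all three threshold values are hypothetical, since this draft does not specify them.

```python
from dataclasses import dataclass


@dataclass
class TriggerInputs:
    cold_start: bool          # no anchor accepted since boot
    vo_healthy: bool          # VO tracking status reported by the hot path
    yaw_rate_dps: float       # absolute yaw rate in deg/s
    segment_connected: bool   # current frame overlaps the previous VO segment
    horiz_sigma_m: float      # horizontal 1-sigma from the ESKF covariance
    anchor_age_s: float       # seconds since the last accepted anchor


# Hypothetical thresholds, chosen only for illustration.
MAX_YAW_RATE_DPS = 30.0
MAX_HORIZ_SIGMA_M = 25.0
MAX_ANCHOR_AGE_S = 20.0


def retrieval_triggers(x: TriggerInputs) -> list[str]:
    """Return the reasons to run the trigger path; an empty list means stay on the hot path."""
    reasons = []
    if x.cold_start:
        reasons.append("cold_start")
    if not x.vo_healthy:
        reasons.append("vo_failure")
    if abs(x.yaw_rate_dps) > MAX_YAW_RATE_DPS:
        reasons.append("sharp_turn")
    if not x.segment_connected:
        reasons.append("disconnected_segment")
    if x.horiz_sigma_m > MAX_HORIZ_SIGMA_M:
        reasons.append("covariance_growth")
    if x.anchor_age_s > MAX_ANCHOR_AGE_S:
        reasons.append("stale_anchor")
    return reasons
```

Returning the list of reasons rather than a bare boolean lets the FDR record why each retrieval ran.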
## Architecture
### Component: Camera Ingest, Calibration, And Geometry
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| OpenCV geometry utility layer | OpenCV 4.x | Calibration, undistortion, RANSAC homography, MRE measurement | Mature, exact fit, permissive | Not a full estimator | Checkerboard calibration, fixed extrinsics, lens/FOV selection | Local-only | Fast enough for hot-path utility use | MVE in `02_fact_cards.md`; Source #5 | Selected |
### Component: VO / IMU Propagation And Estimator
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| Custom VO/IMU ESKF | OpenCV + custom estimator | Nadir VO/homography + FC IMU/attitude/altitude fused in ESKF with mode labels | Owns covariance, source labels, blackout/spoofing behavior | More implementation effort | Synchronized frames/IMU, calibration, replay tests | No network dependency | Hot path is lightweight | Facts #1, #16 | Selected |
| OpenVINS | OpenVINS | Monocular+IMU reference runs | Strong EKF/MSCKF reference | GPL-3 production risk | Replay adapter | Local only | Benchmark only | Source #3 | Reference only |
| ORB-SLAM3 | ORB-SLAM3 | Monocular-inertial SLAM | Mature benchmark | GPLv3 and heavier SLAM lifecycle | Calibration/vocabulary/runtime tuning | Local only | Riskier on embedded | Source #4 | Rejected for production |
### Component: Satellite Retrieval And Anchor Verification
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| DINOv2-VLAD + CPU FAISS + DISK/ALIKED+LightGlue | DINOv2/AnyLoc-style descriptors, FAISS CPU, LightGlue, OpenCV RANSAC | Offline VPR chunk descriptors; conditional query descriptor; CPU FAISS top-K; local match on candidates; TensorRT only after fidelity check | Strong retrieval+geometry structure; avoids per-frame map search | Requires profiling and representative data | Descriptor cache, dynamic K, freshness, RANSAC, Mahalanobis gates | Signed manifests, provenance, stale-tile rejection | Trigger path only; top-K capped | MVE blocks in `02_fact_cards.md`; Sources #6-#9, #21-#25 | Selected with runtime/fidelity gates |
| SuperPoint+LightGlue | SuperPoint, LightGlue | Same matcher with SuperPoint features | Strong technical baseline | SuperPoint license risk | Legal review | Local only | Benchmark only | Sources #6, #23 | Needs user decision |
| Classical SIFT/ORB | OpenCV | Handcrafted features + homography | Simple and cheap | Weak cross-domain robustness | Feature-rich scenes | Local only | Fast | Source #5 | Regression baseline |
### Component: Cache And Tile Lifecycle
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| COG tile objects + manifest + sidecars | GDAL COG, manifest DB/JSON, FAISS index files | Service tiles and generated tiles are write-new COG objects; active version selected by manifest | Geospatial standard, supports provenance and quality metadata | Descriptor budget pressure | CRS/date/source/m/px/freshness, sidecar hashes | Signed manifests, tile provenance, hash verification | Efficient local reads | Source #18; Facts #21, #29 | Selected |
| PMTiles | PMTiles | Read-only archive snapshot | Compact read package | Cannot update in place | Archive rebuild | Hash archive | Good for read-only export | Source #17 | Rejected for live cache |
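
The write-new lifecycle for the selected option can be sketched with the standard library alone. This is an illustration under stated assumptions: the file naming (`<tile_id>.v<N>.tif`), the `manifest.json` layout, and the function names are hypothetical; the tile payload is treated as opaque bytes (no GDAL call is made); and manifest signing is left out.

```python
import hashlib
import json
import os
from pathlib import Path


def _load_manifest(cache_dir: Path) -> dict:
    path = cache_dir / "manifest.json"
    return json.loads(path.read_text()) if path.exists() else {"tiles": {}}


def _save_manifest(cache_dir: Path, manifest: dict) -> None:
    # Write to a temp file, then rename, so readers never see a partial manifest.
    tmp = cache_dir / "manifest.json.tmp"
    tmp.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    os.replace(tmp, cache_dir / "manifest.json")


def publish_tile(cache_dir: Path, tile_id: str, cog_bytes: bytes, provenance: dict) -> int:
    """Store a tile as a new object and make it the active version."""
    manifest = _load_manifest(cache_dir)
    entry = manifest["tiles"].setdefault(tile_id, {"active": None, "versions": {}})
    version = len(entry["versions"]) + 1
    name = f"{tile_id}.v{version}.tif"
    # Mode "xb" fails if the file already exists, so an object is never overwritten.
    with open(cache_dir / name, "xb") as f:
        f.write(cog_bytes)
    digest = hashlib.sha256(cog_bytes).hexdigest()
    sidecar = {"tile": name, "sha256": digest, "provenance": provenance}
    (cache_dir / f"{name}.json").write_text(json.dumps(sidecar, sort_keys=True))
    entry["versions"][str(version)] = {"file": name, "sha256": digest}
    entry["active"] = version
    _save_manifest(cache_dir, manifest)
    return version


def rollback_tile(cache_dir: Path, tile_id: str, version: int) -> None:
    """Point the manifest back at an earlier version; no tile file is touched."""
    manifest = _load_manifest(cache_dir)
    entry = manifest["tiles"][tile_id]
    if str(version) not in entry["versions"]:
        raise KeyError(f"{tile_id} has no version {version}")
    entry["active"] = version
    _save_manifest(cache_dir, manifest)


def verify_active(cache_dir: Path, tile_id: str) -> bool:
    """Recompute the active tile's hash and compare it with the manifest record."""
    entry = _load_manifest(cache_dir)["tiles"][tile_id]
    record = entry["versions"][str(entry["active"])]
    data = (cache_dir / record["file"]).read_bytes()
    return hashlib.sha256(data).hexdigest() == record["sha256"]
```

Because only the manifest changes when a version is activated or rolled back, a hash mismatch on read points to tampering or corruption rather than to a legitimate update.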
### Component: MAVLink Integration
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| MAVSDK telemetry + pymavlink `GPS_INPUT` | MAVSDK, pymavlink | MAVSDK subscriptions; pymavlink emits `GPS_INPUT`; Plane SITL validates `GPS1_TYPE=14`, velocity source params, ignore flags, fix types, accuracy fields | Exact output control with good telemetry ergonomics | SITL required to prove Plane behavior | ArduPilot Plane params, QGC, tlog/FDR | Link/source validation, status audit | Light CPU load | Sources #10-#12, #24 | Selected |
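
As a rough sketch of the `GPS_INPUT` side, the helper below assembles a subset of the message fields and sets the velocity ignore flags whenever no velocity estimate is supplied, so that zeros are not presented as a measurement. The flag values and field names follow the MAVLink common message set; the function names, the choice to always ignore HDOP/VDOP, and the use of the 95% covariance semi-major axis for `horiz_accuracy` are assumptions for illustration. Timestamps, GPS week fields, and satellite count are omitted.

```python
import math

# GPS_INPUT_IGNORE_FLAGS bit values from the MAVLink common message set.
IGNORE_ALT = 1
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32
IGNORE_HORIZONTAL_ACCURACY = 64
IGNORE_VERTICAL_ACCURACY = 128

CHI2_2DOF_95 = 5.991  # 95% quantile of a chi-square distribution with 2 DOF


def semi_major_95(pxx: float, pxy: float, pyy: float) -> float:
    """Semi-major axis (m) of the 95% ellipse for a 2x2 position covariance (m^2)."""
    mean = 0.5 * (pxx + pyy)
    lam_max = mean + math.hypot(0.5 * (pxx - pyy), pxy)  # largest eigenvalue
    return math.sqrt(CHI2_2DOF_95 * lam_max)


def gps_input_fields(lat_deg, lon_deg, alt_m, cov_xy, vert_sigma_m, fix_type,
                     velocity_ned=None, speed_sigma=0.0):
    """Build a dict keyed by GPS_INPUT field names (subset only)."""
    ignore = IGNORE_HDOP | IGNORE_VDOP  # assumption: DOP values are not estimated
    if velocity_ned is None:
        # No velocity estimate: flag it as ignored instead of sending zeros as data.
        ignore |= IGNORE_VEL_HORIZ | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY
        vn = ve = vd = 0.0
    else:
        vn, ve, vd = velocity_ned
    return {
        "gps_id": 0,
        "ignore_flags": ignore,
        "fix_type": fix_type,
        "lat": int(round(lat_deg * 1e7)),  # degE7
        "lon": int(round(lon_deg * 1e7)),  # degE7
        "alt": float(alt_m),               # metres, MSL
        "vn": vn, "ve": ve, "vd": vd,      # m/s, NED
        "speed_accuracy": speed_sigma,
        "horiz_accuracy": semi_major_95(*cov_xy),
        "vert_accuracy": vert_sigma_m,
    }
```

Setting the flags on the companion side does not prove the autopilot honors them, which is why the SITL validation in the table remains necessary.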
### Component: Security And Safety Controls
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|-------------|-------------------------|-----|
| Consistency-gated anchor acceptance | Custom ESKF gates, cache manifest verification | Anchor accepted only if freshness, provenance, RANSAC, covariance, Mahalanobis, and temporal consistency pass | Prevents confident false fixes | Needs calibrated thresholds | Representative replay and Monte Carlo | Rejects stale/poisoned/low-confidence anchors | Lightweight after candidate generation | Facts #16, #17, #28 | Selected |
| FDR audit trail | Segmented logs + hashes | Logs estimates, inputs, emitted GPS_INPUT, health, tile writes, anchor decisions | Supports incident analysis and cache-poisoning audits | Schema work | 64 GB rollover | Tamper-evident hashes recommended | Sequential writes | AC-NEW-3 | Selected |
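
A minimal sketch of consistency-gated anchor acceptance follows. It covers four of the listed gates (freshness, hash/provenance, RANSAC inlier ratio, and a 2D Mahalanobis test on the ESKF innovation); covariance and temporal-consistency checks are omitted. The `AnchorCandidate` fields, the age and inlier thresholds, and the 99% chi-square gate are hypothetical choices, not values taken from this draft.

```python
from dataclasses import dataclass

CHI2_2DOF_99 = 9.210  # 99% quantile of a chi-square distribution with 2 DOF

# Hypothetical thresholds; real values need the replay and Monte Carlo calibration.
MAX_TILE_AGE_DAYS = 365.0
MIN_INLIER_RATIO = 0.30


@dataclass
class AnchorCandidate:
    tile_age_days: float
    tile_hash_ok: bool                          # manifest hash matched on read
    inlier_ratio: float                         # RANSAC inliers / total matches
    innovation_m: tuple[float, float]           # anchor minus ESKF prediction (east, north)
    innovation_cov: tuple[float, float, float]  # (sxx, sxy, syy) of S = H P H^T + R


def mahalanobis_sq(innov, cov) -> float:
    """Squared Mahalanobis distance d^T S^-1 d for a symmetric 2x2 S."""
    dx, dy = innov
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def evaluate_anchor(c: AnchorCandidate) -> tuple[bool, list[str]]:
    """Return (accepted, rejection_reasons); every failed gate is reported."""
    reasons = []
    if c.tile_age_days > MAX_TILE_AGE_DAYS:
        reasons.append("stale_tile")
    if not c.tile_hash_ok:
        reasons.append("hash_mismatch")
    if c.inlier_ratio < MIN_INLIER_RATIO:
        reasons.append("low_inlier_ratio")
    if mahalanobis_sq(c.innovation_m, c.innovation_cov) > CHI2_2DOF_99:
        reasons.append("mahalanobis_gate")
    return (not reasons, reasons)
```

Evaluating all gates instead of stopping at the first failure gives the audit trail a complete rejection record per candidate.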
## Runtime Modes
| Mode | Trigger | Behavior | `GPS_INPUT` / Telemetry |
|------|---------|----------|--------------------------|
| `satellite_anchored` | VPR + local match passes all gates | ESKF absolute update; tile write eligible only if sigma gate passes | 3D fix; `horiz_accuracy` >= semi-major axis of the 95% covariance ellipse |
| `vo_extrapolated` | VO healthy and anchor age/covariance within bounds | VO/IMU propagation; covariance grows | 3D or 2D fix, depending on covariance threshold |
| `dead_reckoned` | visual blackout or no accepted anchor | IMU-only propagation, monotonic covariance growth | degraded fix type; QGC `VISUAL_BLACKOUT_IMU_ONLY` |
| failsafe/no-fix | covariance-derived position uncertainty >500 m or blackout >30 s | stop reporting the position as valid | `fix_type=0`, `horiz_accuracy=999.0`, QGC `VISUAL_BLACKOUT_FAILSAFE` |
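
The mode table can be read as a priority-ordered selection, sketched below. The 500 m and 30 s failsafe limits come from the table; the evaluation order (failsafe first, then anchor, then VO), the interpretation of the 500 m figure as a post-update horizontal uncertainty in metres, and the `select_mode` signature are assumptions.

```python
from enum import Enum


class Mode(Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VO_EXTRAPOLATED = "vo_extrapolated"
    DEAD_RECKONED = "dead_reckoned"
    NO_FIX = "no_fix"


FAILSAFE_UNCERTAINTY_M = 500.0  # from the runtime mode table
FAILSAFE_BLACKOUT_S = 30.0      # from the runtime mode table


def select_mode(anchor_accepted: bool, vo_healthy: bool, anchor_within_bounds: bool,
                uncertainty_m: float, blackout_s: float) -> Mode:
    """Pick the runtime mode for the current estimator state."""
    # Failsafe wins over everything: never report a fix past these limits.
    if uncertainty_m > FAILSAFE_UNCERTAINTY_M or blackout_s > FAILSAFE_BLACKOUT_S:
        return Mode.NO_FIX
    if anchor_accepted:
        return Mode.SATELLITE_ANCHORED
    if vo_healthy and anchor_within_bounds:
        return Mode.VO_EXTRAPOLATED
    # Visual blackout, or VO running without a usable anchor: IMU-only propagation.
    return Mode.DEAD_RECKONED
```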
## Testing Strategy
### Integration / Functional Tests
- VO replay: assert AC-2.1a and AC-2.2 VO MRE on overlapping frame pairs.
- Satellite anchor replay: assert AC-1.1/1.2, AC-2.2 cross-domain MRE, freshness rejection, and source labels.
- DINOv2 descriptor fidelity: compare PyTorch/ONNX/TensorRT embeddings and retrieval rankings before accepting optimized engines.
- FAISS CPU index tests: top-K recall, query latency, index size, save/load behavior on Jetson ARM64.
- LightGlue extractor matrix: DISK vs ALIKED vs SuperPoint benchmark; SuperPoint output excluded from production unless license approved.
- COG cache lifecycle: write-new generated tile, update manifest, verify active version and rollback.
- `GPS_INPUT` SITL: validate fix type, `horiz_accuracy`, velocity fields, ignore flags, `EK3_SRC1_*` parameters, QGC behavior.
- Security gates: stale tile, mismatched tile hash, low inlier ratio, impossible velocity jump, and spoofed GPS during blackout.
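
The descriptor-fidelity test in the list above could take roughly the following shape: compare a reference embedding (for example from PyTorch) with the optimized one by cosine similarity, and compare the top-K retrieval sets they produce against the same database. This sketch uses plain Python lists and brute-force search so it stays self-contained; the function names and both default thresholds are hypothetical, and a real test would run against the FAISS index and representative imagery.

```python
import math


def cosine_similarity(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        raise ValueError("zero-norm descriptor")
    return dot / (na * nb)


def top_k(query, database, k: int) -> list[int]:
    """Indices of the k database descriptors most similar to the query."""
    order = sorted(range(len(database)),
                   key=lambda i: cosine_similarity(query, database[i]),
                   reverse=True)
    return order[:k]


def rank_overlap(ref_query, opt_query, database, k: int) -> float:
    """Fraction of the reference top-k that the optimized descriptor also retrieves."""
    ref = set(top_k(ref_query, database, k))
    opt = set(top_k(opt_query, database, k))
    return len(ref & opt) / k


def fidelity_ok(ref_query, opt_query, database, k: int,
                min_cosine: float = 0.99, min_overlap: float = 0.9) -> bool:
    """Accept the optimized engine only if both embedding and ranking stay close."""
    return (cosine_similarity(ref_query, opt_query) >= min_cosine
            and rank_overlap(ref_query, opt_query, database, k) >= min_overlap)
```

Checking the ranking as well as the raw similarity matters because a small embedding shift can still reorder near-tied candidates.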
### Non-Functional Tests
- Jetson latency and memory: <400 ms p95 latency, <8 GB shared memory, no thermal throttling in the 25 W power mode.
- Cache budget: 400 km² imagery + manifests + descriptors fits budget or reports explicit split budget.
- FDR 8-hour load: <=64 GB, rollover logged, no silent payload loss.
- Monte Carlo false-position and cache-poisoning tests for AC-NEW-4 and AC-NEW-7.
- Cold boot: first valid `GPS_INPUT` <30 s p95 across 50 runs.
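
Several of these budgets are stated as p95 limits, so the test harness needs an agreed percentile definition. The sketch below uses the nearest-rank method, which is one common convention and an assumption here; the function names are hypothetical.

```python
def percentile_nearest_rank(samples, pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def meets_budget(samples, budget: float, pct: int = 95) -> bool:
    """True if the pct-th percentile is strictly below the budget."""
    return percentile_nearest_rank(samples, pct) < budget
```

With 50 cold-boot runs, this definition takes the 48th-fastest time as p95, so at most two runs may exceed the 30 s limit.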
## References
Detailed source registry: `_docs/00_research/01_source_registry.md`.
Key added Mode B sources:
- DINOv2 TensorRT issue: https://github.com/NVIDIA/TensorRT/issues/4348
- DINOv2 Jetson forum issue: https://forums.developer.nvidia.com/t/dinov2-tensorrt-model-performance-issue/312251
- LightGlue license discussion: https://github.com/cvg/LightGlue/issues/120
- ArduPilot GPS_INPUT velocity issue: https://github.com/ArduPilot/ardupilot/issues/19633
- FAISS install docs: https://github.com/facebookresearch/faiss/blob/main/INSTALL.md
- Orthophoto visual geolocalization: https://ar5iv.labs.arxiv.org/html/2103.14381
## Related Artifacts
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- Fact cards: `_docs/00_research/02_fact_cards.md`