GPS-Denied Onboard Localization — Architecture
Architecture Vision
Build a Jetson-hosted onboard localization pipeline for fixed-wing GPS-denied flight. The hot path fuses fixed nadir camera frames and FC telemetry through OpenCV geometry, BASALT VIO, and a project-owned safety/anchor wrapper that emits calibrated GPS_INPUT estimates and QGC/FDR status. A triggered satellite-anchor path uses DINOv2-VLAD, CPU FAISS, ALIKED/DISK+LightGlue, and RANSAC against the offline cache; generated tiles are written back only with strict provenance and covariance gates.
Components / Responsibilities
- Camera ingest/calibration: load frames, apply intrinsics/extrinsics, validate image quality.
- VIO adapter: produce relative camera+IMU motion from synchronized nav frames and FC IMU.
- Safety/anchor wrapper: own covariance calibration, source labels, degraded modes, anchor fusion, and GPS_INPUT.
- Satellite Service: sync mission cache packages before flight, upload generated-tile packages after flight, and serve local VPR candidate retrieval from the offline cache.
- Anchor verification: run local matching/RANSAC and reject unsafe anchors.
- Tile Manager: manage COGs, manifests, freshness/provenance, orthorectified generated tiles, and local tile metadata.
- MAVLink/GCS integration: consume FC telemetry and emit GPS_INPUT/QGC status.
- FDR/observability: record replayable mission evidence under storage caps.
- Validation harness: run still-image, public dataset, SITL, Jetson, and representative replay tests.
Principles / Non-Negotiables
- No in-flight satellite-provider or Satellite Service calls; runtime uses offline cache only.
- BASALT is a VIO component, not the safety authority.
- Confidence must be honest; covariance must grow in degraded modes.
- Heavy VPR/local matching is trigger-based, not per-frame.
- Raw nav/AI frames are not retained in normal operation.
- GPL VIO libraries remain reference-only unless explicitly approved.
- Plane SITL and Jetson hardware are release gates.
- Public datasets can de-risk, but representative synchronized flight data is required for final acceptance.
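The trigger-based principle above can be sketched as a small policy function. This is a minimal illustration, not the project's trigger logic: `TriggerPolicy`, `should_trigger_relocalization`, and all threshold values are hypothetical names and numbers chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerPolicy:
    """Hypothetical thresholds; real values would come from profiling and replay."""
    max_horiz_sigma_m: float = 50.0   # trigger when horizontal uncertainty exceeds this
    max_anchor_age_s: float = 120.0   # trigger when the last accepted anchor is older than this
    min_interval_s: float = 5.0       # rate limit so heavy matching never runs per frame


def should_trigger_relocalization(
    horiz_sigma_m: float,
    anchor_age_s: float,
    vio_healthy: bool,
    since_last_attempt_s: float,
    policy: TriggerPolicy = TriggerPolicy(),
) -> bool:
    """Return True when a satellite-anchor attempt should be scheduled."""
    # The rate limit is checked first so no condition can force per-frame VPR.
    if since_last_attempt_s < policy.min_interval_s:
        return False
    return (
        not vio_healthy
        or horiz_sigma_m > policy.max_horiz_sigma_m
        or anchor_age_s > policy.max_anchor_age_s
    )
```

Checking the rate limit before any other condition keeps the heavy path bounded even when VIO stays unhealthy for a long stretch.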
1. System Context
Problem being solved: During fixed-wing flight, GPS may be denied or spoofed. The onboard system must estimate WGS84 coordinates for navigation-camera frame centers and detected objects, stream GPS_INPUT to ArduPilot Plane, report confidence honestly, and maintain safety during VO failure, stale imagery, spoofing, and visual blackout.
System boundaries:
- In scope: onboard localization runtime, offline cache consumption, BASALT VIO integration, satellite anchor verification, MAVLink output, QGC status, FDR, generated tile metadata, and a separate e2e/black-box test suite.
- Out of scope: upstream commercial satellite-provider sourcing, Satellite Service ingest implementation, AI mission-camera detection itself, PX4 support, raw-frame retention as a normal operating mode.
External systems:
| System | Integration Type | Direction | Purpose |
|---|---|---|---|
| ArduPilot Plane FC | MAVLink | Inbound/Outbound | FC telemetry in, GPS_INPUT and status out |
| QGroundControl | MAVLink telemetry | Outbound | Downsampled operator status and failsafe messages |
| Azaion Suite Satellite Service | Offline file/cache sync | Inbound before flight, outbound after landing | Provides mission cache packages and receives generated-tile packages; never called mid-flight |
| Public/replay datasets | File/rosbag/fixture | Inbound to validation | De-risk BASALT, VPR, and anchor logic |
2. Technology Stack
| Layer | Technology | Version / Mode | Rationale |
|---|---|---|---|
| OS / GPU stack | JetPack Ubuntu + CUDA/TensorRT/ONNX Runtime | Jetson Orin Nano Super target | Required for production hardware profiling |
| Runtime language | Python + C++ | Python orchestration; C++ for BASALT/hot vision paths | Fits MAVLink/test tooling and native VIO dependencies |
| Geometry | OpenCV 4.x | Calibration, undistortion, homography, RANSAC/USAC | Mature utility layer |
| VIO | BASALT | Production candidate | BSD-friendly, strong benchmark evidence |
| VIO reference | OpenVINS | Reference/covariance baseline only | Strong EKF covariance story; GPLv3 risk |
| Backup VIO | Kimera-VIO | Backup candidate | BSD-friendly fallback with mono caveats |
| Local matching | ALIKED/DISK + LightGlue | Anchor verification and optional VO fallback | Strong learned correspondences; profile before hot-path use |
| Retrieval | DINOv2-VLAD + CPU FAISS | Triggered VPR only | Robust candidate retrieval under cache/offline constraints |
| Structured metadata DB | PostgreSQL + PostGIS | Onboard/local deployment | Spatial cache manifests, mission state, generated-tile metadata, and FDR event indexes |
| Cache imagery | COG + PostgreSQL/PostGIS manifest + signed JSON sidecars | Write-new COG objects | Efficient geospatial rasters with queryable spatial metadata and auditable sidecars |
| FDR | PostgreSQL event index + CBOR segment payloads, optional Parquet export | Per-flight rollover | Queryable event metadata with compact bounded payload segments |
| MAVLink | MAVSDK + pymavlink | MAVSDK telemetry, pymavlink GPS_INPUT | Exact output control |
Key constraints from restrictions.md:
- Jetson has 8 GB shared memory and 25 W thermal envelope, so heavy VPR/local matching cannot run every frame.
- Runtime must be offline with respect to satellite providers, so all imagery and descriptors are preloaded.
- The camera is fixed nadir; all VO choices must be validated against low-parallax/planar terrain.
- ADTi public specs conflict with current assumptions on resolution, continuous FPS, and operating temperature; manufacturer specs must be pinned before implementation.
3. Deployment Model
Environments: Development replay, public-dataset replay, Jetson hardware validation, Plane SITL, representative flight/replay rig.
Infrastructure:
- Onboard production runtime runs on the Jetson companion computer, not in cloud.
- Replay/test infrastructure may use Docker for deterministic fixture tests.
- Release gates require local Jetson hardware and ArduPilot Plane SITL.
Environment-specific configuration:
| Config | Development | Production |
|---|---|---|
| Satellite cache | Small fixture cache | Preloaded operational-area cache |
| Descriptor index | Fixture FAISS index | CPU-first FAISS index with PQ/IVF if needed |
| Secrets/signing | Local test keys | Mission/cache signing keys from Suite process |
| FDR | Local temp output | Per-flight bounded NVMe storage |
| MAVLink | SITL/replay | Physical FC telemetry link |
4. Data Model Overview
Core entities:
| Entity | Description | Owned By Component |
|---|---|---|
| FrameRecord | Navigation-camera frame metadata, total-occlusion status, and processing status | Camera ingest/calibration |
| TelemetrySample | FC IMU, attitude, airspeed, altitude, GPS health | MAVLink/GCS integration |
| VioState | Backend-relative pose/velocity/bias output and quality metadata | VIO adapter |
| PositionEstimate | WGS84 estimate, covariance, source label, fix type, anchor age | Safety/anchor wrapper |
| VprChunk | Retrieval unit over cache imagery and descriptors | Satellite Service |
| AnchorCandidate | Retrieved tile/chunk with local-match and RANSAC evidence | Anchor verification |
| CacheTile | COG tile plus manifest and sidecar metadata | Tile Manager |
| GeneratedTile | In-flight orthorectified tile with trust/provenance metadata | Tile Manager |
| FdrSegment | Bounded replayable log segment | FDR/observability |
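As an illustration of how a PositionEstimate might map onto the MAVLink GPS_INPUT message, the sketch below builds the message fields as a plain dictionary. The `PositionEstimate` attribute names and the helper `to_gps_input_fields` are assumptions made for this example. The dictionary keys follow the GPS_INPUT field names in the MAVLink common message set, and the zeroed GPS time fields and `satellites_visible` value are placeholders, not recommendations.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Bits from MAVLink GPS_INPUT_IGNORE_FLAGS: this sketch reports accuracies, not DOPs.
IGNORE_HDOP = 2
IGNORE_VDOP = 4


@dataclass(frozen=True)
class PositionEstimate:
    """Hypothetical field layout for the wrapper's output entity."""
    lat_deg: float
    lon_deg: float
    alt_m: float
    vel_ned_mps: Tuple[float, float, float]
    horiz_sigma_m: float
    vert_sigma_m: float
    speed_sigma_mps: float
    source: str          # source label, e.g. "vio" or "anchor" (illustrative values)
    fix_type: int
    anchor_age_s: float


def to_gps_input_fields(
    est: PositionEstimate, time_usec: int, gps_id: int = 0
) -> Dict[str, Union[int, float]]:
    """Convert an estimate into GPS_INPUT field values (lat/lon in degE7)."""
    if not (-90.0 <= est.lat_deg <= 90.0 and -180.0 <= est.lon_deg <= 180.0):
        raise ValueError("latitude/longitude out of range")
    vn, ve, vd = est.vel_ned_mps
    return {
        "time_usec": time_usec,
        "gps_id": gps_id,
        "ignore_flags": IGNORE_HDOP | IGNORE_VDOP,
        "time_week_ms": 0,   # placeholder: GPS time derivation is out of scope here
        "time_week": 0,      # placeholder
        "fix_type": est.fix_type,
        "lat": int(round(est.lat_deg * 1e7)),
        "lon": int(round(est.lon_deg * 1e7)),
        "alt": est.alt_m,
        "hdop": 0.0,         # ignored via ignore_flags
        "vdop": 0.0,         # ignored via ignore_flags
        "vn": vn,
        "ve": ve,
        "vd": vd,
        "speed_accuracy": est.speed_sigma_mps,
        "horiz_accuracy": est.horiz_sigma_m,
        "vert_accuracy": est.vert_sigma_m,
        "satellites_visible": 0,  # placeholder: no real satellites back this estimate
    }
```

Keeping the conversion in a pure function makes it testable in replay without a MAVLink connection. The resulting values could then be passed to the sender used by the MAVLink/GCS integration component.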
Data flow summary:
- Frame quality/total-occlusion gate + telemetry -> BASALT VIO when usable, or IMU-only degraded mode when not -> safety/anchor wrapper -> GPS_INPUT, QGC, FDR.
- Relocalization trigger -> DINOv2-VLAD/FAISS -> ALIKED/DISK+LightGlue/RANSAC -> accepted/rejected anchor.
- High-confidence pose + frame -> generated tile -> manifest/sidecar -> post-flight Satellite Service sync.
5. Integration Points
Internal Communication
| From | To | Protocol | Pattern | Notes |
|---|---|---|---|---|
| Camera ingest/calibration | VIO adapter | In-process queue or shared frame bus | Streaming | Timestamp discipline is critical |
| MAVLink telemetry | VIO adapter | In-process telemetry buffer | Streaming | IMU/attitude/altitude sync |
| VIO adapter | Safety/anchor wrapper | Typed state messages | Streaming | Wrapper calibrates confidence |
| Safety/anchor wrapper | Satellite Service | Command | Triggered local request | Uses only preloaded cache/index data during flight |
| Satellite Service | Anchor verification | Candidate list | Request-response | Dynamic top-K |
| Anchor verification | Safety/anchor wrapper | Anchor decision | Request-response | Includes MRE/inliers/provenance |
| Safety/anchor wrapper | MAVLink/GCS integration | Position/status DTO | Streaming | GPS_INPUT emitted frame-by-frame |
| Safety/anchor wrapper | FDR/observability | Append-only events | Streaming | Bounded segments |
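The anchor decision passed from Anchor verification to the wrapper could look like the following sketch. It assumes MRE means mean reprojection error in pixels. The names `AnchorEvidence`, `AnchorGate`, `decide_anchor`, and every threshold are hypothetical and would need calibration on representative data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AnchorEvidence:
    """Hypothetical evidence bundle produced by local matching + RANSAC."""
    tile_id: str
    inliers: int
    mre_px: float          # assumed: mean reprojection error in pixels
    provenance_ok: bool    # manifest/signature/provenance checks passed
    tile_age_days: float


@dataclass(frozen=True)
class AnchorGate:
    """Illustrative thresholds only."""
    min_inliers: int = 30
    max_mre_px: float = 3.0
    max_tile_age_days: float = 365.0


def decide_anchor(ev: AnchorEvidence, gate: AnchorGate = AnchorGate()) -> Tuple[bool, List[str]]:
    """Return (accepted, rejection_reasons); reasons are kept for the FDR."""
    reasons: List[str] = []
    if not ev.provenance_ok:
        reasons.append("provenance")
    if ev.inliers < gate.min_inliers:
        reasons.append("inliers")
    if ev.mre_px > gate.max_mre_px:
        reasons.append("mre")
    if ev.tile_age_days > gate.max_tile_age_days:
        reasons.append("stale_tile")
    return (not reasons, reasons)
```

Returning every failed check, rather than stopping at the first, gives the FDR a complete record of why an anchor was rejected.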
External Integrations
| External System | Protocol | Auth | Failure Mode |
|---|---|---|---|
| ArduPilot Plane | MAVLink | Source/system ID allowlist | Degrade/failsafe; never trust spoofed GPS blindly |
| QGroundControl | MAVLink | FC telemetry path | Downsampled status may be delayed but local FDR remains authoritative |
| Azaion Suite Satellite Service | Offline package sync | Signed manifests/sidecars | Missing/stale cache causes degraded mode, not mid-flight network fetch |
| Public datasets | File/rosbag | License constraints | Not final acceptance unless representative and license-compatible |
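The source/system ID allowlist for ArduPilot Plane input might be sketched as below. The class name `SourceAllowlist` and the allowed pair `(1, 1)` are assumptions for illustration; the real IDs depend on the FC configuration. With pymavlink, the IDs of a received message can be read via `msg.get_srcSystem()` and `msg.get_srcComponent()`.

```python
from collections import Counter
from typing import FrozenSet, Tuple

SourceId = Tuple[int, int]  # (system_id, component_id)


class SourceAllowlist:
    """Accept messages only from expected MAVLink sources and count rejections."""

    def __init__(self, allowed: FrozenSet[SourceId]) -> None:
        self._allowed = allowed
        self.rejected: Counter = Counter()  # per-source rejection counts for the FDR

    def accept(self, system_id: int, component_id: int) -> bool:
        source = (system_id, component_id)
        if source in self._allowed:
            return True
        self.rejected[source] += 1
        return False


# Assumed example: a single autopilot at system 1, component 1.
allowlist = SourceAllowlist(frozenset({(1, 1)}))
```

An ID allowlist only filters unexpected senders. It does not authenticate the link, so the spoofing defenses listed in the failure-mode column are still needed.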
6. Non-Functional Requirements
| Requirement | Target | Measurement | Priority |
|---|---|---|---|
| Frame latency | <400 ms p95 | Capture/replay timestamp to emitted estimate | High |
| Memory | <8 GB shared | Jetson monitoring | High |
| First fix | <30 s p95 | 50 cold starts | High |
| Thermal | No throttle at 25 W / +50 C | 8-hour hot-soak | High |
| FDR storage | <=64 GB/flight | 8-hour synthetic load | High |
| Cache storage | ~10 GB persistent budget | Full mission cache accounting | High |
| False position | P(error > 500 m) < 0.1%; P(error > 1 km) < 0.01% | Monte Carlo/replay | High |
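A p95 target such as the frame-latency requirement can be checked in a replay harness with a nearest-rank percentile. The sketch below is one possible way to compute it; the function names are hypothetical, and the choice of nearest-rank over interpolated percentiles is an assumption that the validation plan would need to pin down.

```python
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: Sequence[float], target_ms: float = 400.0) -> bool:
    """True when the p95 latency is strictly below the target."""
    return percentile_nearest_rank(latencies_ms, 95) < target_ms
```

Nearest-rank always returns an observed sample, which keeps the result easy to trace back to a specific frame in the FDR.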
7. Security Architecture
Authentication / trust boundary:
- Runtime accepts only local cache files with valid manifest/signature/provenance.
- MAVLink input is filtered by expected source/system IDs and FC health semantics.
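A cache sidecar check could be sketched as follows. The document does not specify the signature scheme, so this example substitutes HMAC-SHA256 from the Python standard library as a stand-in; a production design may well use asymmetric signatures instead. The sidecar keys `sha256` and `signature` and the function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_bytes(sidecar: Dict[str, Any]) -> bytes:
    """Serialize the sidecar deterministically, excluding the signature itself."""
    body = {k: v for k, v in sidecar.items() if k != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_sidecar(sidecar: Dict[str, Any], key: bytes) -> Dict[str, Any]:
    """Return a copy of the sidecar with an HMAC-SHA256 signature attached."""
    signed = dict(sidecar)
    signed["signature"] = hmac.new(key, canonical_bytes(sidecar), hashlib.sha256).hexdigest()
    return signed


def verify_sidecar(sidecar: Dict[str, Any], tile_bytes: bytes, key: bytes) -> bool:
    """Accept a tile only if both the signature and the content hash match."""
    signature = sidecar.get("signature")
    content_hash = sidecar.get("sha256")
    if not isinstance(signature, str) or not isinstance(content_hash, str):
        return False
    expected = hmac.new(key, canonical_bytes(sidecar), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    actual_hash = hashlib.sha256(tile_bytes).hexdigest()
    return hmac.compare_digest(content_hash.encode(), actual_hash.encode())
```

Checking the content hash in addition to the signature ties the signed metadata to the exact tile bytes on disk, so a valid sidecar cannot be paired with a swapped raster.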
Data protection:
- At rest: FDR and cache sidecars should be integrity protected; mission secrets/signing keys are not stored in code.
- In transit: no in-flight satellite-provider or Satellite Service network dependency; MAVLink link security depends on FC/GCS deployment.
Audit logging:
- FDR records estimates, covariance, anchors, rejected anchors, cache validation failures, spoofing/blackout transitions, emitted GPS_INPUT, resource health, and tile-write decisions.
8. Key Architectural Decisions
ADR-001: BASALT As Production VIO Candidate
Context: A naive OpenCV-only VIO implementation is risky, while OpenVINS has GPLv3 production constraints.
Decision: Use BASALT as the production relative VIO candidate and keep OpenVINS as covariance/reference baseline.
Alternatives considered:
- OpenVINS as production core: rejected by default because of GPLv3 and generic VIO ownership.
- Kimera-VIO: retained as backup because of its BSD license, despite mono-inertial caveats.
- Fully custom OpenCV/ESKF: fallback only, because the implementation burden is high.
Consequences: The safety/anchor wrapper must calibrate confidence around BASALT and prove it on representative data.
ADR-002: ALIKED-LightGlue Role
Context: ALIKED-LightGlue can produce strong local correspondences and can support frame-to-frame homography/pose estimation.
Decision: Use ALIKED/DISK+LightGlue for satellite-anchor verification and evaluate it as an optional VO fallback/keyframe-assist path, not as the default BASALT replacement.
Alternatives considered:
- Per-frame ALIKED-LightGlue VO hot path — deferred until Jetson profiling proves latency/memory fit.
- SIFT/ORB-only matching — retained as regression baseline, weaker under cross-domain conditions.
- SuperPoint+LightGlue — license-gated.
Consequences: Implementation tasks must benchmark ALIKED-LightGlue on frame-to-frame VO and cross-domain anchor workloads separately.
ADR-003: Cache Metadata Format
Context: JSON is simple and auditable, but operational cache queries need spatial indexing, freshness filters, update safety, and integration with the project PostgreSQL database.
Decision: Use PostgreSQL with PostGIS as the primary cache manifest/index database, with signed JSON sidecars for each tile/generated tile for auditability and interchange.
Alternatives considered:
- JSON-only manifest — simpler, but weak for query/update scale, spatial search, and consistency.
- Embedded single-file metadata DB — efficient for small deployments, but rejected because the project will use PostgreSQL/PostGIS.
Consequences: The Tile Manager owns PostgreSQL migrations, PostGIS indexes, signature checks, generated-tile orthorectification metadata, and sidecar/db consistency.
ADR-004: FDR Format
Context: The FDR must be compact, bounded, replayable, and exportable for analysis.
Decision: Use PostgreSQL for FDR event indexes and mission-query metadata, with CBOR-backed segment payloads for bounded append-heavy runtime data and optional Parquet export after flight.
Alternatives considered:
- Plain CSV: rejected for weak type safety, size overhead, and poor support for complex payloads.
- Parquet as primary onboard format: good for analytics, but less suited to the runtime append/rollover path.
Consequences: FDR implementation must define PostgreSQL tables/indexes, CBOR segment schema, rollover behavior, and export tooling.
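The rollover behavior named in the consequences could be prototyped as below. This is an in-memory sketch under stated assumptions: `SegmentWriter` is a hypothetical name, segments are held as lists rather than files, and compact JSON is used as a stand-in encoder because CBOR is not in the Python standard library. A CBOR encoder could be passed in through the `encode` parameter.

```python
import json
from typing import Any, Callable, Dict, List


def _json_encode(event: Dict[str, Any]) -> bytes:
    """Stand-in encoder; the architecture calls for CBOR payloads."""
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


class SegmentWriter:
    """Append events to bounded segments, starting a new one when the cap is hit."""

    def __init__(
        self,
        max_segment_bytes: int,
        encode: Callable[[Dict[str, Any]], bytes] = _json_encode,
    ) -> None:
        if max_segment_bytes <= 0:
            raise ValueError("max_segment_bytes must be positive")
        self.max_segment_bytes = max_segment_bytes
        self.encode = encode
        self.segments: List[List[bytes]] = [[]]
        self._current_bytes = 0

    def append(self, event: Dict[str, Any]) -> int:
        """Store one event and return the index of the segment it landed in."""
        payload = self.encode(event)
        # Roll over before writing so no segment exceeds the cap, unless a
        # single payload is itself larger than the cap.
        if self._current_bytes and self._current_bytes + len(payload) > self.max_segment_bytes:
            self.segments.append([])
            self._current_bytes = 0
        self.segments[-1].append(payload)
        self._current_bytes += len(payload)
        return len(self.segments) - 1
```

A real implementation would also record each segment's index range in the PostgreSQL event index and enforce the per-flight storage cap. Both are omitted here.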
ADR-006: Total Occlusion Before VIO
Context: BASALT should not receive frames that are completely unusable because of lens cover, cloud/whiteout, decode failure, extreme exposure, or other total visual blackout.
Decision: Camera ingest performs a pre-VIO total-occlusion/blackout check. Total occlusion bypasses BASALT for that frame, sends a total_occlusion or visual_blackout degradation signal to the safety wrapper, and continues IMU-only propagation from the last trusted state.
Alternatives considered:
- Let BASALT detect every visual failure — rejected because total occlusion is cheaper and safer to catch before the VIO hot path.
- Drop frames silently — rejected because the wrapper must grow covariance and emit honest degraded output.
Consequences: The camera component must expose occlusion_status, and tests must assert mode transition to dead_reckoned/failsafe under total blackout.
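The mode transition that the tests must assert can be illustrated with a small state step. This is a simplified sketch: the linear growth of the horizontal sigma, the `growth_mps` and `failsafe_after_s` values, the `WrapperState` fields, and the `"clear"` status value are all assumptions made for the example. Only the `total_occlusion`, `visual_blackout`, `dead_reckoned`, and `failsafe` labels come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WrapperState:
    """Hypothetical minimal wrapper state."""
    mode: str = "vio"
    horiz_sigma_m: float = 5.0
    blackout_s: float = 0.0


def step(
    state: WrapperState,
    occlusion_status: str,
    dt_s: float,
    growth_mps: float = 2.0,        # assumed sigma growth rate during blackout
    failsafe_after_s: float = 30.0,  # assumed blackout duration before failsafe
) -> WrapperState:
    """Advance the wrapper by one frame interval."""
    if occlusion_status == "clear":
        # Visual input is usable again; sigma is not reduced here because only
        # new VIO/anchor evidence should shrink the reported uncertainty.
        return WrapperState("vio", state.horiz_sigma_m, 0.0)
    # total_occlusion / visual_blackout: bypass VIO and grow covariance.
    blackout_s = state.blackout_s + dt_s
    sigma = state.horiz_sigma_m + growth_mps * dt_s
    mode = "failsafe" if blackout_s >= failsafe_after_s else "dead_reckoned"
    return WrapperState(mode, sigma, blackout_s)
```

For example, ten one-second steps under `total_occlusion` leave the state in `dead_reckoned` with a larger sigma, and thirty such steps reach `failsafe`.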
ADR-005: Public Dataset Strategy
Context: The original still-image sample lacks synchronized IMU and ground-truth trajectory. The Derkachi fixture adds cropped nadir video synchronized with IMU and GLOBAL_POSITION_INT trajectory, but camera intrinsics, distortion, and camera-to-body calibration remain pending.
Decision: Prioritize MUN-FRL for synchronized nadir camera + IMU + GNSS/ground truth; use ALTO for aerial localization/VPR and long nadir trajectories; investigate Kagaru/EPFL for fixed-wing/farmland relevance; use EuRoC/UZH FPV only as VIO proxies if license-compatible.
Consequences: Public datasets de-risk components but do not replace representative target flight data for final acceptance.