Solution Draft
Assessment Findings
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|---|---|---|
| FastAPI + SSE as primary output | Functional: New AC requires MAVLink GPS_INPUT to flight controller, not REST/SSE. The system must act as a GPS replacement module. SSE is wrong output channel. | Replace with pymavlink GPS_INPUT sender. Send GPS_INPUT at 5-10Hz to flight controller via UART. Retain minimal FastAPI only for local IPC (object localization API). |
| No ground station integration | Functional: New AC requires streaming position+confidence to ground station and receiving re-localization commands via telemetry. Draft02 had no telemetry. | MAVLink telemetry integration: GPS data forwarded automatically by flight controller. Custom data via NAMED_VALUE_FLOAT (confidence, drift). Re-localization hints via COMMAND_LONG listener. |
| MAVSDK library (per restriction) | Functional: MAVSDK-Python v3.15.3 cannot send GPS_INPUT messages. Feature requested since 2021, still unresolved. This is a blocking limitation for the core output function. | Use pymavlink for all MAVLink communication. pymavlink provides gps_input_send() and full MAVLink v2 access. Note conflict with restriction — pymavlink is the only viable option. |
| 3fps camera → ~3Hz output | Performance: ArduPilot GPS_RATE_MS minimum is 5Hz (200ms). 3Hz camera output is below minimum. Flight controller EKF may not fuse properly. | IMU-interpolated 5-10Hz GPS_INPUT: ESKF prediction runs at 100+Hz internally. Emit predicted state as GPS_INPUT at 5-10Hz. Camera corrections arrive at 3Hz within this stream. |
| No startup/failsafe procedures | Functional: New AC requires init from last GPS, reboot recovery, IMU-only fallback. Draft02 assumed position was already known. | Full lifecycle management: (1) Boot → read GPS from flight controller → init ESKF. (2) Reboot → read IMU-extrapolated position → re-init. (3) N-second failure → stop GPS_INPUT → autopilot falls back to IMU. |
| Basic object localization (nadir only) | Functional: New AC adds AI camera with configurable angle and zoom. Nadir pixel-to-GPS is insufficient. | Trigonometric projection for oblique camera: ground_distance = alt × tan(tilt), bearing = heading + pan + pixel offset. Local API for AI system requests. |
| No thermal management | Performance: Jetson Orin Nano Super throttles at 80°C (GPU drops 1GHz→300MHz = 3x slowdown). Could blow 400ms budget. | Thermal monitoring + adaptive pipeline: Use 25W mode. Monitor via tegrastats. If temp >75°C → reduce satellite matching frequency. If >80°C → VO+IMU only. |
| ESKF covariance without explicit drift budget | Functional: New AC requires max 100m cumulative VO drift between satellite anchors. Draft02 uses covariance for keyframe selection but no explicit budget. | Drift budget tracker: √(σ_x² + σ_y²) from ESKF as drift estimate. When approaching 100m → force every-frame satellite matching. Report via horiz_accuracy in GPS_INPUT. |
| No satellite imagery validation | Functional: New AC requires ≥0.5 m/pixel, <2 years old. Draft02 didn't validate. | Preprocessing validation step: Check zoom 19 availability (0.3 m/pixel). Fall back to zoom 18 (0.6 m/pixel). Flag stale tiles. |
| "Ask user via API" for re-localization | Functional: New AC says send re-localization request to ground station via telemetry link, not REST API. Operator sends hint via telemetry. | MAVLink re-localization protocol: On 3 consecutive failures → send STATUSTEXT alert to ground station. Operator sends COMMAND_LONG with approximate lat/lon. System uses hint to constrain tile search. |
Product Solution Description
A real-time GPS-denied visual navigation system for fixed-wing UAVs, running on a Jetson Orin Nano Super (8GB). The system replaces the GPS module for the flight controller by sending MAVLink GPS_INPUT messages via pymavlink over UART. Position is determined by fusing: (1) CUDA-accelerated visual odometry (cuVSLAM), (2) absolute position corrections from satellite image matching, and (3) IMU data from the flight controller. GPS_INPUT is sent at 5-10Hz, with camera-based corrections at 3Hz and IMU prediction filling the gaps.
Hard constraint: Camera shoots at ~3fps (333ms interval). The full VO+ESKF pipeline must complete within 400ms per frame. GPS_INPUT output rate: 5-10Hz, with 5Hz as the floor (ArduPilot EKF requirement).
Output architecture:
- Primary: pymavlink → GPS_INPUT to flight controller via UART (replaces GPS module)
- Telemetry: Flight controller auto-forwards GPS data to ground station. Custom NAMED_VALUE_FLOAT for confidence/drift at 1Hz
- Commands: Ground station → COMMAND_LONG → flight controller → pymavlink listener on companion computer
- Local IPC: Minimal FastAPI on localhost for object localization requests from AI systems
┌─────────────────────────────────────────────────────────────────────┐
│ OFFLINE (Before Flight) │
│ Satellite Tiles → Download & Validate → Pre-resize → Store │
│ (Google Maps) (≥0.5m/px, <2yr) (matcher res) (GeoHash) │
│ Copy to Jetson storage │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ ONLINE (During Flight) │
│ │
│ STARTUP: │
│ pymavlink → read GLOBAL_POSITION_INT → init ESKF → start cuVSLAM │
│ │
│ EVERY FRAME (3fps, 333ms interval): │
│ ┌──────────────────────────────────────┐ │
│ │ Nav Camera → Downsample (CUDA ~2ms) │ │
│ │ → cuVSLAM VO+IMU (~9ms) │ │
│ │ → ESKF measurement update │ │
│ └──────────────────────────────────────┘ │
│ │
│ 5-10Hz CONTINUOUS (between camera frames): │
│ ┌──────────────────────────────────────┐ │
│ │ ESKF IMU prediction → GPS_INPUT send │──→ Flight Controller │
│ │ (pymavlink, every 100-200ms) │ (GPS1_TYPE=14) │
│ └──────────────────────────────────────┘ │
│ │
│ KEYFRAMES (every 3-10 frames, async): │
│ ┌──────────────────────────────────────┐ │
│ │ Satellite match (CUDA stream B) │──→ ESKF correction │
│ │ LiteSAM TRT FP16 or XFeat │ │
│ └──────────────────────────────────────┘ │
│ │
│ TELEMETRY (1Hz): │
│ ┌──────────────────────────────────────┐ │
│ │ NAMED_VALUE_FLOAT: confidence, drift │──→ Ground Station │
│ │ STATUSTEXT: alerts, re-loc requests │ (via telemetry radio) │
│ └──────────────────────────────────────┘ │
│ │
│ COMMANDS (from ground station): │
│ ┌──────────────────────────────────────┐ │
│ │ Listen COMMAND_LONG: re-loc hint │←── Ground Station │
│ │ (lat/lon from operator) │ (via telemetry radio) │
│ └──────────────────────────────────────┘ │
│ │
│ LOCAL IPC: │
│ ┌──────────────────────────────────────┐ │
│ │ FastAPI localhost:8000 │←── AI Detection System │
│ │ POST /localize (object GPS calc) │ │
│ │ GET /status (system health) │ │
│ └──────────────────────────────────────┘ │
│ │
│ IMU: 100+Hz from flight controller → ESKF prediction │
│ TILES: ±2km preloaded in RAM from flight plan │
│ THERMAL: Monitor via tegrastats, adaptive pipeline throttling │
└─────────────────────────────────────────────────────────────────────┘
Speed Optimization Techniques
1. cuVSLAM for Visual Odometry (~9ms/frame)
NVIDIA's CUDA-accelerated VO library (PyCuVSLAM v15.0.0, March 2026) achieves 116fps on Jetson Orin Nano 8GB at 720p. It supports monocular camera + IMU natively and provides automatic fallback to IMU when visual tracking fails, loop closure, and Python and C++ APIs.
CRITICAL: cuVSLAM is at risk on low-texture terrain (agricultural fields, water). It uses Shi-Tomasi corners + Lucas-Kanade optical flow (classical features), so on uniform agricultural terrain:
- Few corners detected → sparse/unreliable tracking
- Frequent keyframe creation → heavier compute
- Tracking loss → IMU fallback (~1s) → constant-velocity integrator (~0.5s)
- cuVSLAM does NOT guarantee pose recovery after tracking loss
Mitigation:
- Increase satellite matching frequency when cuVSLAM keypoint count drops
- IMU dead-reckoning bridge via ESKF (continues GPS_INPUT output during tracking loss)
- Accept higher drift in featureless segments — report via horiz_accuracy
- Keypoint density monitoring triggers adaptive satellite matching
2. Keyframe-Based Satellite Matching
Not every frame needs satellite matching:
- cuVSLAM provides VO at every frame (~9ms)
- Satellite matching triggers on keyframes selected by:
- Fixed interval: every 3-10 frames
- ESKF covariance exceeds threshold (drift approaching budget)
- VO failure: cuVSLAM reports tracking loss
- Thermal: reduce frequency if temperature high
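The trigger list above can be sketched as a single predicate. This is a minimal illustration, not the project's implementation: the `KeyframePolicy` fields, the 80m "approaching" margin, and the relaxed interval used when hot are all assumed values chosen from within the ranges stated in this document. The hard cut-off above 80°C is deliberately left to the thermal manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyframePolicy:
    # All values below are illustrative assumptions, not tuned parameters.
    interval_frames: int = 5        # fixed interval, within the 3-10 frame range
    hot_interval_frames: int = 10   # relaxed interval when the module runs hot
    hot_temp_c: float = 75.0        # temperature at which the interval is relaxed
    drift_trigger_m: float = 80.0   # "approaching" margin for the 100 m drift budget


def should_run_satellite_match(frames_since_match, drift_m, tracking_lost,
                               temp_c, policy=KeyframePolicy()):
    """Return True if the current frame should be flagged as a keyframe."""
    if tracking_lost:                       # VO failure always triggers a match
        return True
    if drift_m >= policy.drift_trigger_m:   # covariance-based drift trigger
        return True
    # Otherwise fall back to the fixed interval, stretched when hot.
    hot = temp_c >= policy.hot_temp_c
    interval = policy.hot_interval_frames if hot else policy.interval_frames
    return frames_since_match >= interval
```

Keeping the decision in one pure function makes it easy to unit-test against recorded flight logs before it is wired into the frame loop.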
3. Satellite Matcher Selection (Benchmark-Driven)
Context: Our UAV-to-satellite matching is nadir-to-nadir (both top-down). Challenges are season/lighting differences and temporal changes, not extreme viewpoint gaps.
Candidate A: LiteSAM (opt) TRT FP16 @ 1280px — Best satellite-aerial accuracy (RMSE@30 = 17.86m on UAV-VisLoc). 6.31M params. TensorRT FP16 with reparameterized MobileOne. Estimated ~165-330ms on Orin Nano Super with TRT FP16.
Candidate B: XFeat semi-dense — ~50-100ms on Orin Nano Super. Fastest option. General-purpose but our nadir-nadir gap is small.
Decision rule (day-one on Orin Nano Super):
- Export LiteSAM (opt) to TensorRT FP16
- Benchmark at 1280px
- If ≤200ms → LiteSAM at 1280px
- If >200ms → XFeat
4. TensorRT FP16 Optimization
LiteSAM's MobileOne backbone is reparameterizable — multi-branch collapses to single feed-forward at inference. INT8 safe only for MobileOne CNN layers, NOT for TAIFormer transformer components.
5. CUDA Stream Pipelining
- Stream A: cuVSLAM VO for current frame (~9ms) + ESKF fusion (~1ms)
- Stream B: Satellite matching for previous keyframe (async, does not block VO)
- CPU: GPS_INPUT output loop, NAMED_VALUE_FLOAT, command listener, tile management
6. Proactive Tile Loading
Preload tiles within ±2km of flight plan into RAM at startup. For a 50km route, ~2000 tiles at zoom 19 ≈ ~200MB. Eliminates disk I/O during flight.
On VO failure / expanded search:
- Compute IMU dead-reckoning position
- Rank preloaded tiles by distance to predicted position
- Try top 3 tiles, then expand
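The ranking step can be sketched with a great-circle distance over tile centers. The tile record layout `(tile_id, center_lat, center_lon)` is a hypothetical stand-in for whatever the GeoHash index actually stores.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rank_tiles(tiles, lat, lon, top_n=3):
    """Return the ids of the top_n tiles closest to the predicted position.

    tiles: iterable of (tile_id, center_lat, center_lon); hypothetical layout.
    """
    ranked = sorted(tiles, key=lambda t: haversine_m(lat, lon, t[1], t[2]))
    return [t[0] for t in ranked[:top_n]]
```

For example, `rank_tiles(preloaded, dr_lat, dr_lon)` would give the first three candidates; calling it again with a larger `top_n` is one way to expand the search.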
7. 5-10Hz GPS_INPUT Output Loop
Dedicated thread/coroutine sends GPS_INPUT at fixed rate (5-10Hz):
- Read current ESKF state (position, velocity, covariance)
- Compute horiz_accuracy from √(σ_x² + σ_y²)
- Set fix_type from the current confidence level (3=satellite-corrected or recently corrected VO, 2=VO-only without a recent correction, 1=IMU-only)
- Send via `mav.gps_input_send()`
- Sleep until next interval
This decouples camera frame rate (3fps) from GPS_INPUT rate (5-10Hz).
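A minimal sketch of such a fixed-rate loop follows. It is written against an injected `send` callable so it runs without hardware. The field order follows the MAVLink GPS_INPUT message definition, while the placeholder values for `hdop`, `vdop`, `speed_accuracy`, the GPS week fields, and the reuse of the horizontal sigma for `vert_accuracy` are assumptions made only for illustration.

```python
import math
import time


def gps_input_fields(lat_deg, lon_deg, alt_m, vn, ve, vd,
                     sigma_x, sigma_y, fix_type, satellites_visible):
    """Build GPS_INPUT fields in message field order from an ESKF snapshot."""
    horiz_accuracy = math.sqrt(sigma_x ** 2 + sigma_y ** 2)
    return (
        int(time.time() * 1e6),      # time_usec
        0,                           # gps_id
        0,                           # ignore_flags (assumption: nothing ignored)
        0,                           # time_week_ms (placeholder)
        0,                           # time_week (placeholder)
        fix_type,                    # fix_type
        int(round(lat_deg * 1e7)),   # lat, degE7
        int(round(lon_deg * 1e7)),   # lon, degE7
        float(alt_m),                # alt, meters
        1.0,                         # hdop (placeholder)
        1.0,                         # vdop (placeholder)
        vn, ve, vd,                  # NED velocity, m/s
        0.5,                         # speed_accuracy (placeholder)
        horiz_accuracy,              # horiz_accuracy, meters
        horiz_accuracy,              # vert_accuracy (assumption: reuse horizontal)
        satellites_visible,          # synthetic satellite count
    )


def run_output_loop(read_state, send, rate_hz=5.0, max_messages=None,
                    sleep=time.sleep, clock=time.monotonic):
    """Send GPS_INPUT at a fixed rate; deadlines are absolute to avoid drift."""
    period = 1.0 / rate_hz
    next_deadline = clock()
    sent = 0
    while max_messages is None or sent < max_messages:
        send(*gps_input_fields(**read_state()))
        sent += 1
        next_deadline += period
        sleep(max(0.0, next_deadline - clock()))
    return sent
```

With pymavlink, `send` would typically be `master.mav.gps_input_send` for a connection opened through `mavutil.mavlink_connection(...)`; the argument order should be checked against the installed dialect before relying on it. `read_state` is a hypothetical accessor returning the keyword arguments of `gps_input_fields` from the shared ESKF state.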
Existing/Competitor Solutions Analysis
| Solution | Approach | Accuracy | Hardware | Limitations |
|---|---|---|---|---|
| Mateos-Ramirez et al. (2024) | VO (ORB) + satellite keypoint correction + Kalman | 142m mean / 17km (0.83%) | Orange Pi class | No re-localization; ORB only; 1000m+ altitude |
| SatLoc (2025) | DinoV2 + XFeat + optical flow + adaptive fusion | <15m, >90% coverage | Edge (unspecified) | Paper not fully accessible |
| LiteSAM (2025) | MobileOne + TAIFormer + MinGRU subpixel refinement | RMSE@30 = 17.86m on UAV-VisLoc | RTX 3090 (62ms), AGX Orin (497ms@1184px) | Not tested on Orin Nano |
| cuVSLAM (NVIDIA, 2025-2026) | CUDA-accelerated VO+SLAM, mono/stereo/IMU | <1% trajectory error (KITTI) | Jetson Orin Nano (116fps) | VO only, no satellite matching |
| EfficientLoFTR (CVPR 2024) | Aggregated attention + adaptive token selection | Competitive with LiteSAM | TRT available | 15.05M params, heavier |
| STHN (IEEE RA-L 2024) | Deep homography estimation | 4.24m at 50m range | Lightweight | Needs RGB retraining |
| JointLoc (IROS 2024) | Retrieval + VO fusion, adaptive weighting | 0.237m RMSE over 1km | Open-source | Planetary, needs adaptation |
Architecture
Component: Flight Controller Integration (NEW)
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|---|---|---|---|---|---|
| pymavlink GPS_INPUT | pymavlink | Full MAVLink v2 access, GPS_INPUT support, pure Python, aarch64 compatible | Lower-level API, manual message handling | ~1ms per send | ✅ Best |
| MAVSDK-Python TelemetryServer | MAVSDK v3.15.3 | Higher-level API, aarch64 wheels | NO GPS_INPUT support, no custom messages | N/A — missing feature | ❌ Blocked |
| MAVSDK C++ MavlinkDirect | MAVSDK v4 (future) | Custom message support planned | Not available in Python wrapper yet | N/A — not released | ❌ Not available |
| MAVROS (ROS) | ROS + MAVROS | Full GPS_INPUT support, ROS ecosystem | Heavy ROS dependency, complex setup, unnecessary overhead | ~5ms overhead | ⚠️ Overkill |
Selected: pymavlink — only viable Python library for GPS_INPUT. Pure Python, works on aarch64, full MAVLink v2 message set.
Restriction note: restrictions.md specifies "MAVSDK library" but MAVSDK-Python cannot send GPS_INPUT (confirmed: Issue #320, open since 2021). pymavlink is the necessary alternative.
Configuration:
- Connection: UART (`/dev/ttyTHS0` or `/dev/ttyTHS1` on Jetson, 115200-921600 baud)
- Flight controller: GPS1_TYPE=14, SERIAL2_PROTOCOL=2 (MAVLink2)
- GPS_INPUT rate: 5-10Hz (dedicated output thread)
- Heartbeat: 1Hz to maintain connection
Component: Visual Odometry
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|---|---|---|---|---|---|
| cuVSLAM (mono+IMU) | PyCuVSLAM v15.0.0 | 116fps on Orin Nano, NVIDIA-optimized, loop closure, IMU fallback | Closed-source, low-texture terrain risk | ~9ms/frame | ✅ Best |
| XFeat frame-to-frame | XFeatTensorRT | Open-source, learned features | No IMU integration, ~30-50ms | ~30-50ms/frame | ⚠️ Fallback |
| ORB-SLAM3 | OpenCV + custom | Well-understood, open-source | CPU-heavy, ~30fps | ~33ms/frame | ⚠️ Slower |
Selected: cuVSLAM (mono+IMU mode) — 116fps, purpose-built for Jetson.
Component: Satellite Image Matching
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|---|---|---|---|---|---|
| LiteSAM (opt) TRT FP16 @ 1280px | TensorRT | Best satellite-aerial accuracy, 6.31M params | Untested on Orin Nano Super TRT | Est. ~165-330ms TRT FP16 | ✅ If ≤200ms |
| XFeat semi-dense | XFeatTensorRT | ~50-100ms, Jetson-proven, fastest | General-purpose | ~50-100ms | ✅ Fallback |
Selection: Day-one benchmark. LiteSAM TRT FP16 at 1280px → if ≤200ms → LiteSAM. If >200ms → XFeat.
Component: Sensor Fusion
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|---|---|---|---|---|---|
| ESKF (custom) | Python/C++ | Lightweight, multi-rate, well-understood | Linear approximation | <1ms/step | ✅ Best |
| Hybrid ESKF/UKF | Custom | 49% better accuracy | More complex | ~2-3ms/step | ⚠️ Upgrade path |
Selected: ESKF with adaptive measurement noise. State vector: [position(3), velocity(3), orientation_quat(4), accel_bias(3), gyro_bias(3)] = 16 states.
Output rates:
- IMU prediction: 100+Hz (from flight controller IMU via pymavlink)
- cuVSLAM VO update: ~3Hz
- Satellite update: ~0.3-1Hz (keyframes, async)
- GPS_INPUT output: 5-10Hz (ESKF predicted state)
Drift budget: Track √(σ_x² + σ_y²) from ESKF covariance. When approaching 100m → force every-frame satellite matching.
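The drift budget check can be sketched as below. The assumption that indices 0 and 1 of the covariance matrix are the horizontal position states, and the 80% "approaching" fraction, are illustrative choices rather than values taken from the ESKF implementation.

```python
import math

DRIFT_BUDGET_M = 100.0  # max cumulative VO drift between satellite anchors


def horizontal_sigma_m(P):
    """sqrt(sigma_x^2 + sigma_y^2) from a covariance matrix.

    Assumption: P[0][0] and P[1][1] are the x and y position variances (m^2).
    """
    return math.sqrt(P[0][0] + P[1][1])


def drift_action(P, warn_fraction=0.8):
    """Return (drift_m, action) where action drives the keyframe scheduler."""
    drift = horizontal_sigma_m(P)
    if drift >= DRIFT_BUDGET_M:
        return drift, "exceeded"
    if drift >= warn_fraction * DRIFT_BUDGET_M:
        return drift, "force_matching"   # every-frame satellite matching
    return drift, "normal"
```

The same `drift` value can be passed through as `horiz_accuracy` in GPS_INPUT, so the flight controller and the scheduler see one consistent uncertainty figure.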
Component: Ground Station Telemetry (NEW)
| Solution | Tools | Advantages | Limitations | Performance | Fit |
|---|---|---|---|---|---|
| MAVLink auto-forwarding + NAMED_VALUE_FLOAT | pymavlink | Standard MAVLink, no custom protocol, works with all GCS (Mission Planner, QGC) | Limited bandwidth (~12kbit/s), NAMED_VALUE_FLOAT name limited to 10 chars | ~50 bytes/msg | ✅ Best |
| Custom MAVLink dialect messages | pymavlink + custom XML | Full flexibility | Requires custom GCS plugin, non-standard | ~50 bytes/msg | ⚠️ Complex |
| Separate telemetry channel | TCP/UDP over separate radio | Full bandwidth | Extra hardware, extra radio | N/A | ❌ Not available |
Selected: Standard MAVLink forwarding + NAMED_VALUE_FLOAT
Telemetry data sent to ground station:
- GPS position: auto-forwarded by flight controller from GPS_INPUT data
- Confidence score: NAMED_VALUE_FLOAT `gps_conf` at 1Hz (values: 1=HIGH, 2=MEDIUM, 3=LOW, 4=VERY_LOW)
- Drift estimate: NAMED_VALUE_FLOAT `gps_drift` at 1Hz (meters)
- Matching status: NAMED_VALUE_FLOAT `sat_match` at 1Hz (0=inactive, 1=matching, 2=failed)
- Alerts: STATUSTEXT for critical events (re-localization request, system failure)
Re-localization from ground station:
- Operator sees drift/failure alert in GCS
- Sends COMMAND_LONG (MAV_CMD_USER_1) with lat/lon in param5/param6
- Companion computer listens for COMMAND_LONG with target component ID
- Receives hint → constrains tile search → attempts satellite matching near hint coordinates
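A hedged sketch of the hint validation step is shown below. It takes the COMMAND_LONG fields as plain arguments so it can be tested without a MAVLink connection. Accepting target component 0 as a broadcast and the range checks are assumptions about how the listener should behave.

```python
import math

MAV_CMD_USER_1 = 31010  # command id from the MAVLink common dialect


def parse_reloc_hint(command, target_component, param5, param6, my_component_id):
    """Return (lat, lon) in degrees if the command is a valid hint, else None."""
    if command != MAV_CMD_USER_1:
        return None
    # Assumption: component id 0 is treated as a broadcast to all components.
    if target_component not in (0, my_component_id):
        return None
    lat, lon = float(param5), float(param6)
    if math.isnan(lat) or math.isnan(lon):
        return None
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon
```

In a pymavlink receive loop the arguments would come from the `command`, `target_component`, `param5`, and `param6` fields of a received COMMAND_LONG message. A returned hint would then be used to constrain the tile search.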
Component: Startup & Lifecycle (NEW)
Startup sequence:
- Boot Jetson → start GPS-Denied service (systemd)
- Connect to flight controller via pymavlink on UART
- Wait for heartbeat from flight controller
- Read GLOBAL_POSITION_INT → extract lat, lon, alt
- Initialize ESKF state with this position (high confidence if real GPS available)
- Start cuVSLAM with first camera frames
- Begin GPS_INPUT output loop at 5-10Hz
- Preload satellite tiles within ±2km of flight plan into RAM
- System ready — GPS-Denied active
GPS denial detection: Not required, because the system always outputs GPS_INPUT. If real GPS is available, the flight controller uses whichever GPS source has better accuracy (configurable GPS blending or priority). When real GPS degrades or is lost, the flight controller seamlessly switches to our GPS_INPUT.
Failsafe:
- If no valid position estimate for N seconds (configurable, e.g., 10s): stop sending GPS_INPUT
- Flight controller detects GPS timeout → falls back to IMU-only dead reckoning
- System logs failure, continues attempting recovery (VO + satellite matching)
- When recovery succeeds: resume GPS_INPUT output
Reboot recovery:
- Jetson reboots → re-establish pymavlink connection
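- Read GLOBAL_POSITION_INT (the flight controller's fused position estimate, now IMU-extrapolated since GPS_INPUT stopped); GPS_RAW_INT carries only raw GPS sensor data and would not reflect the extrapolation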
- Initialize ESKF with this position (low confidence, horiz_accuracy=100m+)
- Resume cuVSLAM + satellite matching → accuracy improves over time
- Resume GPS_INPUT output
Component: Object Localization (UPDATED)
Two modes:
Mode 1: Navigation camera (nadir)
Frame-center GPS comes from the ESKF. For any object in the navigation camera frame:
- Pixel offset from center: (dx_px, dy_px)
- Convert to meters: dx_m = dx_px × GSD, dy_m = dy_px × GSD
- Rotate by heading (yaw from IMU)
- Convert meter offset to lat/lon delta, add to frame-center GPS
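The four steps above can be sketched as follows. The image axis convention (image "up" aligned with the aircraft's forward axis, `dy_px` growing downward) and the spherical-Earth conversion are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # spherical Earth, adequate for short offsets


def nadir_pixel_to_gps(center_lat, center_lon, dx_px, dy_px, gsd_m, heading_deg):
    """Project a pixel offset from frame center to lat/lon for a nadir camera."""
    # Assumption: +dx_px is to the right, +dy_px is downward in the image,
    # and image "up" points along the aircraft's forward axis.
    right_m = dx_px * gsd_m
    forward_m = -dy_px * gsd_m
    # Rotate body-frame offset into north/east using heading (clockwise from north).
    h = math.radians(heading_deg)
    north_m = forward_m * math.cos(h) - right_m * math.sin(h)
    east_m = forward_m * math.sin(h) + right_m * math.cos(h)
    # Convert meter offsets to degree deltas around the frame-center latitude.
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(center_lat))))
    return center_lat + dlat, center_lon + dlon
```

If the camera is mounted with a different orientation, only the two lines that define `right_m` and `forward_m` need to change.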
Mode 2: AI camera (configurable angle and zoom)
- Get current UAV position from ESKF
- Get AI camera params: tilt_angle (from vertical), pan_angle (from heading), zoom (effective focal length)
- Get pixel coordinates of detected object in AI camera frame
- Compute bearing: bearing = heading + pan_angle + atan2(dx_px × pixel_pitch, focal_eff), where pixel_pitch = sensor_width / image_width_px
- Compute ground distance: with dy_angle = atan2(dy_px × pixel_pitch, focal_eff), for flat terrain slant_range = altitude / cos(tilt_angle + dy_angle) and ground_range = slant_range × sin(tilt_angle + dy_angle)
- Convert bearing + ground_range to lat/lon offset
- Return GPS coordinates with accuracy estimate
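A minimal flat-terrain sketch of Mode 2 is given below. It expresses the focal length in pixels (`focal_px`, that is, focal_eff divided by pixel pitch), assumes `dy_px` grows downward so that pixels above center look farther away, and treats pan and tilt as decoupled, matching the approximation in the steps above. All parameter names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # spherical Earth approximation


def oblique_pixel_to_gps(lat, lon, alt_agl_m, heading_deg, tilt_deg, pan_deg,
                         dx_px, dy_px, focal_px):
    """Return (lat, lon, ground_range_m), or None if the ray misses the ground."""
    dx_angle = math.atan2(dx_px, focal_px)
    dy_angle = math.atan2(-dy_px, focal_px)   # assumption: image y grows downward
    off_nadir = math.radians(tilt_deg) + dy_angle
    if not -math.pi / 2 < off_nadir < math.pi / 2:
        return None                            # at or above the horizon
    # altitude / cos * sin simplifies to altitude * tan for flat terrain.
    ground_range_m = alt_agl_m * math.tan(off_nadir)
    bearing = math.radians(heading_deg + pan_deg) + dx_angle
    north_m = ground_range_m * math.cos(bearing)
    east_m = ground_range_m * math.sin(bearing)
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon, ground_range_m
```

Because the range grows with tan(off_nadir), small angle errors produce large position errors near the horizon, so the accuracy estimate returned to the caller should widen as the off-nadir angle increases.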
Local API (FastAPI on localhost:8000):
- `POST /localize`: accepts pixel_x, pixel_y, camera_id ("nav" or "ai"), ai_camera_params (tilt, pan, zoom) → returns lat, lon, accuracy_m
- `GET /status`: returns system state, confidence, drift, uptime
Component: Satellite Tile Preprocessing (Offline)
Selected: GeoHash-indexed tile pairs on disk + RAM preloading.
Pipeline:
- Define operational area from flight plan
- Download satellite tiles from Google Maps Tile API at zoom 19 (0.3 m/pixel)
- If zoom 19 is unavailable: fall back to zoom 18 (0.6 m/pixel at the equator, finer at higher latitudes; confirm it satisfies the 0.5 m/pixel requirement at the operating latitude)
- Validate: resolution ≥0.5 m/pixel, check imagery staleness where possible
- Pre-resize each tile to matcher input resolution
- Store: original + resized + metadata (GPS bounds, zoom, GSD, download date) in GeoHash-indexed structure
- Copy to Jetson storage before flight
- At startup: preload tiles within ±2km of flight plan into RAM
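The resolution check in the validation step can be sketched with the standard Web Mercator ground-resolution formula for 256-pixel tiles. The sketch reads the requirement as "GSD no coarser than 0.5 m/pixel", which is an interpretation, and `capture_age_years` is a hypothetical input that may be unknown for some tiles.

```python
import math

# Ground resolution at zoom 0 on the equator for 256-px Web Mercator tiles.
EQUATOR_M_PER_PX_Z0 = 156543.03392


def gsd_m_per_px(zoom, lat_deg, tile_px=256):
    """Ground sample distance of a Web Mercator tile at a given latitude."""
    scale = 256.0 / tile_px
    return EQUATOR_M_PER_PX_Z0 * scale * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def validate_tile(zoom, lat_deg, capture_age_years=None,
                  max_gsd_m=0.5, max_age_years=2.0):
    """Return per-check results; age_ok is None when the capture date is unknown."""
    gsd = gsd_m_per_px(zoom, lat_deg)
    age_ok = None if capture_age_years is None else capture_age_years < max_age_years
    return {"gsd_m": gsd, "gsd_ok": gsd <= max_gsd_m, "age_ok": age_ok}
```

Under this formula zoom 18 is about 0.60 m/pixel at the equator but about 0.39 m/pixel at 49° latitude, so whether the fallback zoom passes depends on where the flight takes place.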
Component: Re-localization (Disconnected Segments)
When cuVSLAM reports tracking loss (sharp turn, no features):
- Flag next frame as keyframe → trigger satellite matching
- Compute IMU dead-reckoning position since last known position
- Rank preloaded tiles by distance to dead-reckoning position
- Try top 3 tiles sequentially
- If match found: position recovered, new segment begins
- If 3 consecutive keyframe failures: send STATUSTEXT alert to ground station ("RE-LOC REQUEST: position uncertain, drift Xm")
- While waiting for operator hint: continue VO/IMU dead reckoning, report low confidence via horiz_accuracy
- If operator sends COMMAND_LONG with lat/lon hint: constrain tile search to ±500m of hint
- If still no match after operator hint: continue dead reckoning, log failure
Component: Thermal Management (NEW)
Power mode: 25W (stable sustained performance)
Monitoring: Read GPU/CPU temperature via tegrastats or sysfs thermal zones at 1Hz.
Adaptive pipeline:
- Normal (<70°C): Full pipeline — cuVSLAM every frame + satellite match every 3-10 frames
- Warm (70-75°C): Reduce satellite matching to every 5-10 frames
- Hot (75-80°C): Reduce satellite matching to every 10-15 frames
- Throttling (>80°C): Disable satellite matching entirely, VO+IMU only (cuVSLAM ~9ms is very light). Report LOW confidence. Resume satellite matching when temp drops below 75°C
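The four bands, including the "resume below 75°C" hysteresis, can be sketched as a small state function. The mode names and the interval table are illustrative labels for the bands listed above.

```python
# Satellite-matching interval (min, max frames) per thermal mode; None = disabled.
SAT_MATCH_INTERVAL = {
    "normal": (3, 10),
    "warm": (5, 10),
    "hot": (10, 15),
    "throttling": None,
}


def thermal_mode(temp_c, previous_mode="normal"):
    """Map a temperature reading to a pipeline mode with hysteresis."""
    # Once throttling, stay there until the temperature drops below 75 C.
    if previous_mode == "throttling" and temp_c >= 75.0:
        return "throttling"
    if temp_c > 80.0:
        return "throttling"
    if temp_c >= 75.0:
        return "hot"
    if temp_c >= 70.0:
        return "warm"
    return "normal"
```

Called once per 1Hz temperature sample with the previous result, this keeps satellite matching off while the module cools from above 80°C back to 75°C, instead of toggling around the 80°C threshold.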
Hardware requirement: Active cooling fan (5V) mandatory for UAV companion computer enclosure.
Processing Time Budget (per frame, 333ms interval)
Normal Frame (non-keyframe, ~60-80% of frames)
| Step | Time | Notes |
|---|---|---|
| Image capture + transfer | ~10ms | CSI/USB3 |
| Downsample (for cuVSLAM) | ~2ms | OpenCV CUDA |
| cuVSLAM VO+IMU | ~9ms | NVIDIA CUDA-optimized, 116fps |
| ESKF measurement update | ~1ms | NumPy |
| Total per camera frame | ~22ms | Well within 333ms |
GPS_INPUT output runs independently at 5-10Hz (every 100-200ms):
| Step | Time | Notes |
|---|---|---|
| Read ESKF state | <0.1ms | Shared state |
| Compute horiz_accuracy | <0.1ms | √(σ²) |
| pymavlink gps_input_send | ~1ms | UART write |
| Total per GPS_INPUT | ~1ms | Negligible overhead |
Keyframe Satellite Matching (async, every 3-10 frames)
Runs on separate CUDA stream — does NOT block VO or GPS_INPUT.
Path A — LiteSAM TRT FP16 at 1280px (if ≤200ms benchmark):
| Step | Time | Notes |
|---|---|---|
| Downsample to 1280px | ~1ms | OpenCV CUDA |
| Load satellite tile | ~1ms | Pre-loaded in RAM |
| LiteSAM (opt) TRT FP16 | ≤200ms | Go/no-go threshold |
| Geometric pose (RANSAC) | ~5ms | Homography |
| ESKF satellite update | ~1ms | Delayed measurement |
| Total | ≤210ms | Async |
Path B — XFeat (if LiteSAM >200ms):
| Step | Time | Notes |
|---|---|---|
| XFeat extraction + matching | ~50-80ms | TensorRT FP16 |
| Geometric verification (RANSAC) | ~5ms | |
| ESKF satellite update | ~1ms | |
| Total | ~60-90ms | Async |
Memory Budget (Jetson Orin Nano Super, 8GB shared)
| Component | Memory | Notes |
|---|---|---|
| OS + runtime | ~1.5GB | JetPack 6.2 + Python |
| cuVSLAM | ~200-500MB | CUDA library + map state (configure pruning for 3000 frames) |
| Satellite matcher TensorRT | ~50-100MB | LiteSAM FP16 or XFeat FP16 |
| Preloaded satellite tiles | ~200MB | ±2km of flight plan |
| pymavlink + MAVLink runtime | ~20MB | Lightweight |
| FastAPI (local IPC) | ~50MB | Minimal, localhost only |
| Current frame buffer | ~2MB | |
| ESKF state + buffers | ~10MB | |
| Total | ~2.0-2.4GB | Sum of the rows above; ~25-30% of 8GB, comfortable |
Confidence Scoring → GPS_INPUT Mapping
| Level | Condition | horiz_accuracy (m) | fix_type | GPS_INPUT satellites_visible |
|---|---|---|---|---|
| HIGH | Satellite match succeeded + cuVSLAM consistent | 10-20 | 3 (3D) | 12 |
| MEDIUM | cuVSLAM VO only, recent satellite correction (<500m travel) | 20-50 | 3 (3D) | 8 |
| LOW | cuVSLAM VO only, no recent correction, OR high thermal throttling | 50-100 | 2 (2D) | 4 |
| VERY LOW | IMU dead-reckoning only | 100-500 | 1 (no fix) | 1 |
| MANUAL | Operator-provided re-localization hint | 200 | 3 (3D) | 6 |
Note: satellites_visible is synthetic — used to influence EKF weighting. ArduPilot gives more weight to GPS with higher satellite count and lower horiz_accuracy.
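The mapping table can be encoded directly as a lookup. Clamping the ESKF drift estimate into each level's horiz_accuracy range is an assumption about how the range would be applied, not something the table itself specifies.

```python
# level: (horiz_accuracy min, horiz_accuracy max, fix_type, satellites_visible)
CONFIDENCE_MAP = {
    "HIGH": (10.0, 20.0, 3, 12),
    "MEDIUM": (20.0, 50.0, 3, 8),
    "LOW": (50.0, 100.0, 2, 4),
    "VERY_LOW": (100.0, 500.0, 1, 1),
    "MANUAL": (200.0, 200.0, 3, 6),
}


def quality_fields(level, drift_m):
    """Return the GPS_INPUT quality fields for a confidence level.

    Assumption: the reported horiz_accuracy is the drift estimate clamped
    into the range the table assigns to the level.
    """
    lo, hi, fix_type, sats = CONFIDENCE_MAP[level]
    return {
        "horiz_accuracy": min(max(drift_m, lo), hi),
        "fix_type": fix_type,
        "satellites_visible": sats,
    }
```

For example, `quality_fields("LOW", 70.0)` reports 70m with fix_type 2 and 4 synthetic satellites.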
Key Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| MAVSDK cannot send GPS_INPUT | CONFIRMED | Must use pymavlink (conflicts with restriction) | Use pymavlink. Document restriction conflict. No alternative in Python. |
| cuVSLAM fails on low-texture agricultural terrain | HIGH | Frequent tracking loss, degraded VO | Increase satellite matching frequency. IMU dead-reckoning bridge. Accept higher drift. |
| Jetson UART instability with ArduPilot | MEDIUM | MAVLink connection drops | Test thoroughly. Use USB serial adapter if UART unreliable. Add watchdog reconnect. |
| Thermal throttling blows satellite matching budget | MEDIUM | Miss keyframe windows | Adaptive pipeline: reduce/skip satellite matching at high temp. Active cooling mandatory. |
| LiteSAM TRT FP16 >200ms at 1280px | MEDIUM | Must use XFeat | Day-one benchmark. XFeat fallback. |
| XFeat cross-view accuracy insufficient | MEDIUM | Satellite corrections less accurate | Multi-tile consensus, strict RANSAC, increase keyframe frequency. |
| cuVSLAM map memory growth on long flights | MEDIUM | Memory pressure | Configure map pruning, max keyframes. |
| Google Maps satellite quality in conflict zone | HIGH | Satellite matching fails | Accept VO+IMU with higher drift. Alternative providers. |
| GPS_INPUT at 3Hz too slow for ArduPilot EKF | HIGH | Poor EKF fusion, position jumps | 5-10Hz output with IMU interpolation between camera frames. |
| Companion computer reboot mid-flight | LOW | ~30-60s GPS gap | Flight controller IMU fallback. Automatic recovery on restart. |
| Telemetry bandwidth saturation | LOW | Custom messages compete with autopilot telemetry | Limit NAMED_VALUE_FLOAT to 1Hz. Keep messages compact. |
Testing Strategy
Integration / Functional Tests
- End-to-end: camera → cuVSLAM → ESKF → GPS_INPUT → verify flight controller receives valid position
- Compare computed positions against ground truth GPS from coordinates.csv
- Measure: percentage within 50m (target: 80%), percentage within 20m (target: 60%)
- Test GPS_INPUT rate: verify 5-10Hz output to flight controller
- Test sharp-turn handling: verify satellite re-localization after 90-degree heading change
- Test disconnected segments: simulate 3+ route breaks, verify all segments connected
- Test re-localization: simulate 3 consecutive failures → verify STATUSTEXT sent → inject COMMAND_LONG hint → verify recovery
- Test object localization: send POST /localize with known AI camera params → verify GPS accuracy
- Test startup: verify ESKF initializes from flight controller GPS
- Test reboot recovery: kill process → restart → verify reconnection and position recovery
- Test failsafe: simulate total failure → verify GPS_INPUT stops → verify flight controller IMU fallback
- Test cuVSLAM map memory: run 3000-frame session, monitor memory growth
Non-Functional Tests
- Day-one satellite matcher benchmark: LiteSAM TRT FP16 at 1280px on Orin Nano Super
- cuVSLAM benchmark: verify 116fps monocular+IMU on Orin Nano Super
- cuVSLAM terrain stress test: urban, agricultural, water, forest
- UART reliability test: sustained pymavlink communication over 1+ hour
- Thermal endurance test: run full pipeline for 30+ minutes, measure GPU temp, verify no throttling with active cooling
- Per-frame latency: must be <400ms for VO pipeline
- GPS_INPUT latency: measure time from camera capture to GPS_INPUT send
- Memory: peak usage during 3000-frame session (must stay <8GB)
- Drift budget: verify ESKF covariance tracks cumulative drift, triggers satellite matching before 100m
- Telemetry bandwidth: measure total MAVLink bandwidth used by companion computer
References
- pymavlink GPS_INPUT example: https://webperso.ensta.fr/lebars/Share/GPS_INPUT_pymavlink.py
- pymavlink mavgps.py: https://github.com/ArduPilot/pymavlink/blob/master/examples/mavgps.py
- ArduPilot GPS Input module: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
- MAVLink GPS_INPUT message spec: https://mavlink.io/en/messages/common.html#GPS_INPUT
- MAVSDK-Python GPS_INPUT limitation: https://github.com/mavlink/MAVSDK-Python/issues/320
- MAVSDK-Python custom message limitation: https://github.com/mavlink/MAVSDK-Python/issues/739
- ArduPilot companion computer setup: https://ardupilot.org/dev/docs/raspberry-pi-via-mavlink.html
- Jetson Orin UART with ArduPilot: https://forums.developer.nvidia.com/t/uart-connection-between-jetson-nano-orin-and-ardupilot/325416
- MAVLink NAMED_VALUE_FLOAT: https://mavlink.io/en/messages/common.html#NAMED_VALUE_FLOAT
- MAVLink STATUSTEXT: https://mavlink.io/en/messages/common.html#STATUSTEXT
- MAVLink telemetry bandwidth: https://github.com/mavlink/mavlink/issues/1605
- JetPack 6.2 Super Mode: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
- Jetson Orin Nano power consumption: https://edgeaistack.app/blog/jetson-orin-nano-power-consumption/
- UAV target geolocation: https://www.mdpi.com/1424-8220/22/5/1903
- LiteSAM (2025): https://www.mdpi.com/2072-4292/17/19/3349
- LiteSAM code: https://github.com/boyagesmile/LiteSAM
- cuVSLAM (2025-2026): https://github.com/NVlabs/PyCuVSLAM
- PyCuVSLAM API: https://nvlabs.github.io/PyCuVSLAM/api.html
- Intermodalics cuVSLAM benchmark: https://www.intermodalics.ai/blog/nvidia-isaac-ros-in-depth-cuvslam-and-the-dp3-1-release
- XFeat (CVPR 2024): https://arxiv.org/abs/2404.19174
- XFeat TensorRT for Jetson: https://github.com/PranavNedunghat/XFeatTensorRT
- EfficientLoFTR (CVPR 2024): https://github.com/zju3dv/EfficientLoFTR
- STHN (IEEE RA-L 2024): https://github.com/arplaboratory/STHN
- JointLoc (IROS 2024): https://github.com/LuoXubo/JointLoc
- Hybrid ESKF/UKF: https://arxiv.org/abs/2512.17505
- Google Maps Tile API: https://developers.google.com/maps/documentation/tile/satellite
- ArduPilot EKF Source Selection: https://ardupilot.org/copter/docs/common-ekf-sources.html
- Mateos-Ramirez et al. (2024): https://www.mdpi.com/2076-3417/14/16/7420
- SatLoc (2025): https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f
Related Artifacts
- AC Assessment: `_docs/00_research/gps_denied_nav/00_ac_assessment.md`
- Research artifacts: `_docs/00_research/gps_denied_nav_v3/`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
- Security analysis: `_docs/01_solution/security_analysis.md`