mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-04-22 22:26:38 +00:00

Oleksandr Bezdieniezhnykh 531a1301d5 Revise skills documentation to incorporate updated directory structure and terminology. Replace references to integration tests with blackbox tests in SKILL.md files and templates. Adjust paths in planning and deployment documentation to align with the new _docs/02_document/ structure, ensuring consistency and clarity throughout the documentation.

2026-03-25 06:35:41 +02:00

Solution Draft

Assessment Findings

Old Component Solution	Weak Point (functional/security/performance)	New Solution
ESKF described as "16-state vector, ~10MB" with no mathematical specification	Functional: No state vector, no process model (F,Q), no measurement models (H for VO, H for satellite), no noise parameters, no scale observability analysis. Impossible to implement or validate accuracy claims.	Define complete ESKF specification: 15-state error vector, IMU-driven prediction, dual measurement models (VO relative pose, satellite absolute position), initial Q/R values, scale constraint via altitude + satellite corrections.
GPS_INPUT at 5-10Hz via pymavlink — no field mapping	Functional: GPS_INPUT requires 15+ fields (velocity, accuracy, hdop, fix_type, GPS time). No specification of how ESKF state maps to these fields. ArduPilot requires minimum 5Hz.	Define GPS_INPUT population spec: velocity from ESKF, accuracy from covariance, fix_type from confidence tier, GPS time from system clock conversion, synthesized hdop/vdop.
Confidence scoring "unchanged from draft03" — not in draft05	Functional: Draft05 is supposed to be self-contained. Confidence scoring determines GPS_INPUT accuracy fields and fix_type — directly affects how ArduPilot EKF weights the position data.	Define confidence scoring inline: 3 tiers (satellite-anchored, VO-tracked, IMU-only) mapping to fix_type + accuracy values.
Coordinate transformations not defined	Functional: No pixel→camera→body→NED→WGS84 chain. Camera is not autostabilized, so body attitude matters. Satellite match → WGS84 conversion undefined. Object localization impossible without these transforms.	Define coordinate transformation chain: camera intrinsics K, camera-to-body extrinsic T_cam_body, body-to-NED from ESKF attitude, NED origin at mission start point.
Disconnected route segments — "satellite re-localization" mentioned but no algorithm	Functional: AC requires handling as "core to the system." Multiple disconnected segments expected. No tracking-loss detection, no re-localization trigger, no ESKF re-initialization, no cuVSLAM restart procedure.	Define re-localization pipeline: detect cuVSLAM tracking loss → IMU-only ESKF prediction → trigger satellite match on every frame → on match success: ESKF position reset + cuVSLAM restart → on 3 consecutive failures: operator re-localization request.
No startup handoff from GPS to GPS-denied	Functional: System reads GLOBAL_POSITION_INT at startup but no protocol for when GPS is lost/spoofed vs system start. No validation of initial position.	Define handoff protocol: system runs continuously, FC receives both real GPS and GPS_INPUT. GPS-denied system always provides its estimate; FC selects best source. Initial position validated against first satellite match.
No mid-flight reboot recovery	Functional: AC requires: "re-initialize from flight controller's current IMU-extrapolated position." No procedure defined. Recovery time estimation missing.	Define reboot recovery sequence: read FC position → init ESKF with high uncertainty → load TRT engines → start cuVSLAM → immediate satellite match. Estimated recovery: ~35-70s. Document as known limitation.
3-consecutive-failure re-localization request undefined	Functional: AC requires ground station re-localization request. No message format, no operator workflow, no system behavior while waiting.	Define re-localization protocol: detect 3 failures → send custom MAVLink message with last known position + uncertainty → operator provides approximate coordinates → system uses as ESKF measurement with high covariance.
Object localization — "trigonometric calculation" with no details	Functional: No math, no API, no Viewpro gimbal integration, no accuracy propagation. Other onboard systems cannot use this component as specified.	Define object localization: pixel→ray using Viewpro intrinsics + gimbal angles → body frame → NED → ray-ground intersection → WGS84. FastAPI endpoint: POST /objects/locate. Accuracy propagated from UAV position + gimbal uncertainty.
Satellite matching — GSD normalization and tile selection unspecified	Functional: Camera GSD ~15.9 cm/px at 600m vs satellite ~0.3 m/px at zoom 19. The "pre-resize" step is mentioned but not specified. Tile selection radius based on ESKF uncertainty not defined.	Define GSD handling: downsample camera frame to match satellite GSD. Define tile selection: ESKF position ± 3σ_horizontal → select tiles covering that area. Assemble tile mosaic for matching.
Satellite tile storage requirements not calculated	Functional: "±2km" preload mentioned but no storage estimate. At zoom 19: a 200km path with ±2km buffer requires ~130K tiles (~2.5GB).	Calculate tile storage: specify zoom level (18 preferred: 0.6m/px, 4× fewer tiles), estimate storage per mission profile, define maximum mission area by storage limit.
FastAPI endpoints not in solution draft	Functional: Endpoints only in security_analysis.md. No request/response schemas. No SSE event format. No object localization endpoint.	Consolidate API spec in solution: define all endpoints, SSE event schema, object localization endpoint. Reference security_analysis.md for auth.
cuVSLAM configuration missing (calibration, IMU params, mode)	Functional: No camera calibration procedure, no IMU noise parameters, no T_imu_rig extrinsic, no mode selection (Mono vs Inertial).	Define cuVSLAM configuration: use Inertial mode, specify required calibration data (camera intrinsics, distortion, IMU noise params from datasheet, T_imu_rig from physical measurement), define calibration procedure.
tech_stack.md inconsistent with draft05	Functional: tech_stack.md says 3fps (should be 0.7fps), LiteSAM at 480px (should be 1280px), missing EfficientLoFTR.	Flag for update: tech_stack.md must be synchronized with draft05 corrections. Not addressed in this draft — separate task.

Overall Maturity Assessment

Category	Maturity (1-5)	Assessment
Hardware & Platform Selection	3.5	UAV airframe, cameras, Jetson, batteries — well-researched with specs, weight budget, endurance calculations. Ready for procurement.
Core Algorithm Selection	3.0	cuVSLAM, LiteSAM/XFeat, ESKF — components selected with comparison tables, fallback chains, decision trees. Day-one benchmarks defined.
AI Inference Runtime	3.5	TRT Engine migration thoroughly analyzed. Conversion workflows, memory savings, performance estimates. Code wrapper provided.
Sensor Fusion (ESKF)	1.5	Mentioned but not specified. No implementable detail. Blocker for coding.
System Integration	1.5	GPS_INPUT, coordinate transforms, inter-component data flow — all under-specified.
Edge Cases & Resilience	1.0	Disconnected segments, reboot recovery, re-localization — acknowledged but no algorithms.
Operational Readiness	0.5	No pre-flight procedures, no in-flight monitoring, no failure response.
Security	3.0	Comprehensive threat model, OP-TEE analysis, LUKS, secure boot. Well-researched.
Overall TRL	~2.5	Technology concept formulated + some component validation. Not implementation-ready.

The solution is at approximately TRL 3 (proof of concept) for hardware/algorithm selection and TRL 1-2 (basic concept) for system integration, ESKF, and operational procedures.

Product Solution Description

A real-time GPS-denied visual navigation system for fixed-wing UAVs, running on a Jetson Orin Nano Super (8GB). All AI model inference uses native TensorRT Engine files. The system replaces the GPS module by sending MAVLink GPS_INPUT messages via pymavlink over UART at 5-10Hz.

Position is determined by fusing: (1) CUDA-accelerated visual odometry (cuVSLAM in Inertial mode) from ADTI 20L V1 at 0.7 fps sustained, (2) absolute position corrections from satellite image matching (LiteSAM or XFeat — TRT Engine FP16) using keyframes from the same ADTI image stream, and (3) IMU data from the flight controller via ESKF. Viewpro A40 Pro is reserved for AI object detection only.

The ESKF is the central state estimator with 15-state error vector. It fuses:

IMU prediction at 5-10Hz (high-frequency pose propagation)
cuVSLAM VO measurement at 0.7Hz (relative pose correction)
Satellite matching measurement at ~0.07-0.14Hz (absolute position correction)

GPS_INPUT messages carry position, velocity, and accuracy derived from the ESKF state and covariance.

Hard constraint: ADTI 20L V1 shoots at 0.7 fps sustained (1430ms interval). Full VO+ESKF pipeline within 400ms per frame. Satellite matching async on keyframes (every 5-10 camera frames). GPS_INPUT at 5-10Hz (ESKF IMU prediction fills gaps between camera frames).

┌─────────────────────────────────────────────────────────────────────┐
│                    OFFLINE (Before Flight)                           │
│  1. Satellite Tiles → Download & Validate → Pre-resize → Store      │
│     (Google Maps)     (≥0.5m/px, <2yr)     (matcher res)  (GeoHash)│
│  2. TRT Engine Build (one-time per model version):                  │
│     PyTorch model → reparameterize → ONNX export → trtexec --fp16  │
│     Output: litesam.engine, xfeat.engine                            │
│  3. Camera + IMU calibration (one-time per hardware unit)           │
│  4. Copy tiles + engines + calibration to Jetson storage            │
└─────────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    ONLINE (During Flight)                            │
│                                                                     │
│  STARTUP:                                                           │
│  1. pymavlink → read GLOBAL_POSITION_INT → init ESKF state         │
│  2. Load TRT engines + allocate GPU buffers                         │
│  3. Load camera calibration + IMU calibration                       │
│  4. Start cuVSLAM (Inertial mode) with ADTI 20L V1                 │
│  5. Preload satellite tiles ±2km into RAM                           │
│  6. First satellite match → validate initial position               │
│  7. Begin GPS_INPUT output loop at 5-10Hz                           │
│                                                                     │
│  EVERY CAMERA FRAME (0.7fps from ADTI 20L V1):                     │
│  ┌──────────────────────────────────────┐                           │
│  │ ADTI 20L V1 → Downsample (CUDA)     │                           │
│  │             → cuVSLAM VO+IMU (~9ms)  │ ← CUDA Stream A          │
│  │             → ESKF VO measurement    │                           │
│  └──────────────────────────────────────┘                           │
│                                                                     │
│  5-10Hz CONTINUOUS (IMU-driven between camera frames):              │
│  ┌──────────────────────────────────────┐                           │
│  │ IMU data → ESKF prediction           │                           │
│  │ ESKF state → GPS_INPUT fields        │                           │
│  │ GPS_INPUT → Flight Controller (UART) │                           │
│  └──────────────────────────────────────┘                           │
│                                                                     │
│  KEYFRAMES (every 5-10 camera frames, async):                       │
│  ┌──────────────────────────────────────┐                           │
│  │ Camera frame → GSD downsample        │                           │
│  │ Select satellite tile (ESKF pos±3σ)  │                           │
│  │ TRT inference (Stream B): LiteSAM/   │                           │
│  │   XFeat → correspondences            │                           │
│  │ RANSAC → homography → WGS84 position │                           │
│  │ ESKF satellite measurement update    │──→ Position correction    │
│  └──────────────────────────────────────┘                           │
│                                                                     │
│  TRACKING LOSS (cuVSLAM fails — sharp turn / featureless):         │
│  ┌──────────────────────────────────────┐                           │
│  │ ESKF → IMU-only prediction (growing  │                           │
│  │   uncertainty)                        │                           │
│  │ Satellite match on EVERY frame       │                           │
│  │ On match success → ESKF reset +      │                           │
│  │   cuVSLAM restart                    │                           │
│  │ 3 consecutive failures → operator    │                           │
│  │   re-localization request            │                           │
│  └──────────────────────────────────────┘                           │
│                                                                     │
│  TELEMETRY (1Hz):                                                   │
│  ┌──────────────────────────────────────┐                           │
│  │ NAMED_VALUE_FLOAT: confidence, drift │──→ Ground Station         │
│  └──────────────────────────────────────┘                           │
└─────────────────────────────────────────────────────────────────────┘

Architecture

Component: ESKF Sensor Fusion (NEW — previously unspecified)

Error-State Kalman Filter fusing IMU, visual odometry, and satellite matching.

Nominal state vector (propagated by IMU):

State	Symbol	Size	Description
Position	p	3	NED position relative to mission origin (meters)
Velocity	v	3	NED velocity (m/s)
Attitude	q	4	Unit quaternion (body-to-NED rotation)
Accel bias	b_a	3	Accelerometer bias (m/s²)
Gyro bias	b_g	3	Gyroscope bias (rad/s)

Error-state vector (estimated by ESKF): δx = [δp, δv, δθ, δb_a, δb_g]ᵀ ∈ ℝ¹⁵ where δθ ∈ so(3) is the 3D rotation error.

Prediction step (IMU at 5-10Hz from flight controller):

Input: accelerometer a_m, gyroscope ω_m, dt
Propagate nominal state: p += v·dt, v += (R(q)·(a_m - b_a) + g)·dt with g = [0, 0, 9.81]ᵀ m/s² in NED (Down positive), q ⊗= Exp((ω_m - b_g)·dt)
Propagate error covariance: P = F·P·Fᵀ + Q
F is the 15×15 error-state transition matrix (standard ESKF formulation)
Q: process noise diagonal, initial values from IMU datasheet noise densities
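
The nominal-state propagation above can be sketched in NumPy as follows. This is a minimal illustration, not the project's implementation: the function names, the [w, x, y, z] quaternion storage order, and the convention that gravity is +9.81 m/s² along Down are assumptions made for the example. Covariance propagation (F, Q) is left out because the draft does not spell out F.

```python
import numpy as np

# Assumption: NED frame with Z down, so gravity points along +Z.
GRAVITY_NED = np.array([0.0, 0.0, 9.81])

def quat_mul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw*bw - ax*bx - ay*by - az*bz,
        aw*bx + ax*bw + ay*bz - az*by,
        aw*by - ax*bz + ay*bw + az*bx,
        aw*bz + ax*by - ay*bx + az*bw,
    ])

def quat_exp(rotvec):
    """Exp map: rotation vector (rad) -> unit quaternion."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = rotvec / angle
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def quat_to_rot(q):
    """Body-to-NED rotation matrix from a unit quaternion [w, x, y, z]."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def predict_nominal(p, v, q, b_a, b_g, a_m, w_m, dt):
    """One IMU step: integrate position, velocity and attitude."""
    # Accelerometer measures specific force; add gravity to get acceleration.
    accel_ned = quat_to_rot(q) @ (a_m - b_a) + GRAVITY_NED
    p_new = p + v * dt
    v_new = v + accel_ned * dt
    q_new = quat_mul(q, quat_exp((w_m - b_g) * dt))
    return p_new, v_new, q_new / np.linalg.norm(q_new)
```

In level, unaccelerated flight the accelerometer reads roughly [0, 0, -9.81] in the body frame, so the gravity term cancels it and the velocity stays constant, which is a quick sanity check for the sign convention.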

VO measurement update (0.7Hz from cuVSLAM):

cuVSLAM outputs relative pose: ΔR, Δt (camera frame)
Transform to NED: Δp_ned = R_body_ned · T_cam_body · Δt
Innovation: z = Δp_ned_measured - Δp_ned_predicted
Observation matrix H_vo maps error state to relative position change
R_vo: measurement noise, initial ~0.1-0.5m (from cuVSLAM precision at 600m+ altitude)
Kalman update: K = P·Hᵀ·(H·P·Hᵀ + R)⁻¹, δx = K·z, P = (I - K·H)·P

Satellite measurement update (0.07-0.14Hz, async):

Satellite matching outputs absolute position: lat_sat, lon_sat in WGS84
Convert to NED relative to mission origin
Innovation: z = p_satellite - p_predicted
H_sat = [I₃, 0, 0, 0, 0] (directly observes position)
R_sat: measurement noise, from matching confidence (~5-20m based on RANSAC inlier ratio)
Provides absolute position correction — bounds drift accumulation
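
As an illustration of the satellite update, the sketch below applies the Kalman update equations from the VO section with H_sat = [I₃, 0, 0, 0, 0]. The function name and the isotropic measurement noise (a single σ for all three axes) are assumptions for the example. Injecting δx into the nominal state and resetting the error state are not shown.

```python
import numpy as np

def satellite_update(P, p_predicted, p_satellite, sigma_sat_m):
    """Return (error-state correction dx, updated covariance) for a position fix.

    P is the 15x15 error covariance ordered [dp, dv, dtheta, db_a, db_g].
    """
    H = np.hstack([np.eye(3), np.zeros((3, 12))])  # observes position only
    R = np.eye(3) * sigma_sat_m**2                  # assumed isotropic noise
    z = p_satellite - p_predicted                   # innovation
    S = H @ P @ H.T + R                             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain
    dx = K @ z
    P_new = (np.eye(15) - K @ H) @ P
    return dx, P_new
```

With a position variance of 100 m² and a 10 m measurement sigma, the gain on the position block is 0.5, so the filter moves halfway towards the satellite fix and halves the position variance.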

Scale observability:

Monocular cuVSLAM has scale ambiguity during constant-velocity flight
Scale is constrained by: (1) satellite matching absolute positions (primary), (2) known flight altitude from barometer + predefined mission altitude, (3) IMU accelerometer during maneuvers
During long straight segments without satellite correction, scale drift is possible. Satellite corrections every ~7-14s re-anchor scale.

Tuning approach: Start with IMU datasheet noise values for Q. Start with conservative R values (high measurement noise). Tune on flight test data by comparing ESKF output to known GPS ground truth.

Solution	Tools	Advantages	Limitations	Performance	Fit
Custom ESKF (Python/NumPy)	NumPy, SciPy	Full control, minimal dependencies, well-understood algorithm	Implementation effort, tuning required	<1ms per step	✅ Selected
FilterPy ESKF	FilterPy v1.4.5	Reference implementation, less code	Less flexible for multi-rate fusion	<1ms per step	⚠️ Fallback

Component: Coordinate System & Transformations (NEW — previously undefined)

Reference frames:

Camera frame (C): origin at camera optical center, Z forward, X right, Y down (OpenCV convention)
Body frame (B): origin at UAV CG, X forward (nose), Y right (starboard), Z down
NED frame (N): North-East-Down, origin at mission start point
WGS84: latitude, longitude, altitude (output format)

Transformation chain:

Pixel → Camera ray: p_cam = K⁻¹ · [u, v, 1]ᵀ where K = camera intrinsic matrix (ADTI 20L V1: fx, fy from 16mm lens + APS-C sensor)
Camera → Body: p_body = T_cam_body · p_cam where T_cam_body is the fixed mounting rotation (camera points nadir, so the camera Z-axis lies along body +Z, which is Down)
Body → NED: p_ned = R_body_ned(q) · p_body where q is the ESKF quaternion attitude estimate
NED → WGS84: lat = lat_origin + p_north / R_earth, lon = lon_origin + p_east / (R_earth · cos(lat_origin)) where (lat_origin, lon_origin) is the mission start GPS position
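
The NED → WGS84 step uses a local flat-earth approximation in which the offsets divided by the Earth radius are angles in radians. A minimal sketch of both directions is below. The function names and the use of the WGS84 equatorial radius as a spherical R_earth are assumptions for the example.

```python
import math

# Assumption: spherical Earth with the WGS84 equatorial radius.
R_EARTH_M = 6_378_137.0

def ned_to_wgs84(p_north, p_east, lat_origin_deg, lon_origin_deg):
    """Convert NED offsets (meters) from the mission origin to lat/lon degrees."""
    lat0 = math.radians(lat_origin_deg)
    dlat = p_north / R_EARTH_M                     # radians
    dlon = p_east / (R_EARTH_M * math.cos(lat0))   # radians
    return lat_origin_deg + math.degrees(dlat), lon_origin_deg + math.degrees(dlon)

def wgs84_to_ned(lat_deg, lon_deg, lat_origin_deg, lon_origin_deg):
    """Inverse mapping, used to express a satellite fix in the NED frame."""
    lat0 = math.radians(lat_origin_deg)
    p_north = math.radians(lat_deg - lat_origin_deg) * R_EARTH_M
    p_east = math.radians(lon_deg - lon_origin_deg) * R_EARTH_M * math.cos(lat0)
    return p_north, p_east
```

The approximation degrades with distance from the origin. Whether it is accurate enough over a 200km path would need checking against the accuracy requirements.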

Camera intrinsic matrix K (ADTI 20L V1 + 16mm lens):

Sensor: 23.2 × 15.4 mm, Resolution: 5456 × 3632
fx = fy = focal_mm × width_px / sensor_width_mm = 16 × 5456 / 23.2 = 3763 pixels
cx = 2728, cy = 1816 (sensor center)
Distortion: Brown model (k1, k2, p1, p2 from calibration)
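
The intrinsic matrix and the pixel → camera ray step can be reproduced from the sensor figures above. This is a sketch that ignores lens distortion (the Brown coefficients come from calibration and are not known here). The function names are hypothetical.

```python
import numpy as np

def intrinsic_matrix(focal_mm=16.0, sensor_width_mm=23.2,
                     width_px=5456, height_px=3632):
    """Pinhole K from lens and sensor specs, principal point at image center."""
    f_px = focal_mm * width_px / sensor_width_mm
    return np.array([
        [f_px, 0.0,  width_px / 2],
        [0.0,  f_px, height_px / 2],
        [0.0,  0.0,  1.0],
    ])

def pixel_to_ray(K, u, v):
    """Unit ray in the camera frame (Z forward) through pixel (u, v)."""
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))  # K^-1 · [u, v, 1]^T
    return ray / np.linalg.norm(ray)
```

The principal point maps to the optical axis [0, 0, 1]. Calibrated values of fx, fy, cx, cy should replace the nominal ones once the checkerboard calibration has been run.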

T_cam_body (camera mount):

Navigation camera is fixed, pointing nadir (downward), not autostabilized
R_cam_body = R_z(0°) = I (camera Z-axis aligned with body +Z, which already points down, so no 180° flip is needed; camera X with body X)
Translation: offset from CG to camera mount (measured during assembly, typically <0.3m)

Satellite match → WGS84:

Feature correspondences between camera frame and geo-referenced satellite tile
Homography H maps camera pixels to satellite tile pixels
Satellite tile pixel → WGS84 via tile's known georeference (zoom level + tile x,y → lat,lon)
Camera center projects to satellite pixel (cx_sat, cy_sat) via H
Convert (cx_sat, cy_sat) to WGS84 using tile georeference
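
The last two steps can be sketched as below, assuming the tiles follow the common Web Mercator XYZ ("slippy map") scheme with 256-pixel tiles. That scheme, the function names, and the convention that tile pixel (0, 0) is the tile's north-west corner are assumptions. If the stored tiles carry a different georeference, the second function would change accordingly.

```python
import math
import numpy as np

TILE_SIZE = 256  # pixels per tile side (assumption)

def apply_homography(H, u, v):
    """Map camera pixel (u, v) to satellite-tile pixel coordinates via H."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

def tile_pixel_to_wgs84(tile_x, tile_y, zoom, px, py):
    """Convert a pixel inside XYZ tile (tile_x, tile_y) to (lat, lon) degrees."""
    n = 2 ** zoom
    lon = (tile_x + px / TILE_SIZE) / n * 360.0 - 180.0
    merc = math.pi * (1.0 - 2.0 * (tile_y + py / TILE_SIZE) / n)
    lat = math.degrees(math.atan(math.sinh(merc)))
    return lat, lon
```

Usage would be `tile_pixel_to_wgs84(x, y, 18, *apply_homography(H, 2728, 1816))` for the camera center, where H comes from the RANSAC fit. For a tile mosaic, the pixel offset of the mosaic origin has to be accounted for first.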

Component: GPS_INPUT Message Population (NEW — previously undefined)

GPS_INPUT Field	Source	Computation
lat, lon	ESKF position (NED)	NED → WGS84 conversion using mission origin
alt	ESKF position (Down) + mission origin altitude	alt = alt_origin - p_down
vn, ve, vd	ESKF velocity state	Direct from ESKF v[0], v[1], v[2]
fix_type	Confidence tier	3 (3D fix) for the HIGH and MEDIUM tiers. 2 (2D fix) for LOW (cuVSLAM lost, IMU-only). 0 (no fix) for FAILED (3+ consecutive failures). See the confidence tier table below.
hdop	ESKF horizontal covariance	hdop = sqrt(P[0,0] + P[1,1]) / 5.0 (approximate CEP→HDOP mapping)
vdop	ESKF vertical covariance	vdop = sqrt(P[2,2]) / 5.0
horiz_accuracy	ESKF horizontal covariance	horiz_accuracy = sqrt(P[0,0] + P[1,1]) meters
vert_accuracy	ESKF vertical covariance	vert_accuracy = sqrt(P[2,2]) meters
speed_accuracy	ESKF velocity covariance	speed_accuracy = sqrt(P[3,3] + P[4,4]) m/s
time_week, time_week_ms	System time	Convert Unix time to GPS time (GPS epoch = 1980-01-06; GPS time runs ahead of UTC, so add the GPS-UTC leap-second offset, 18 s since 2017)
satellites_visible	Constant	10 (synthetic — prevents satellite-count failsafes in ArduPilot)
gps_id	Constant	0
ignore_flags	Constant	0 (provide all fields)
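
The time and accuracy rows of the table can be sketched as plain functions. The names are hypothetical, and the leap-second constant is an assumption that must be updated if a new leap second is announced. GPS time is ahead of UTC, so the offset is added when starting from Unix time.

```python
import math

GPS_EPOCH_UNIX_S = 315_964_800  # 1980-01-06T00:00:00Z expressed as Unix time
GPS_UTC_LEAP_S = 18             # GPS-UTC offset since 2017 (assumption: still current)
SECONDS_PER_WEEK = 604_800

def unix_to_gps_time(unix_s):
    """Return (time_week, time_week_ms) for a Unix timestamp in seconds."""
    gps_s = unix_s - GPS_EPOCH_UNIX_S + GPS_UTC_LEAP_S
    week = int(gps_s // SECONDS_PER_WEEK)
    week_ms = int((gps_s - week * SECONDS_PER_WEEK) * 1000)
    return week, week_ms

def accuracy_fields(P):
    """Accuracy-related GPS_INPUT fields from the ESKF covariance.

    P is indexed [row][col] with position in rows 0-2 and velocity in 3-5.
    """
    horiz = math.sqrt(P[0][0] + P[1][1])
    vert = math.sqrt(P[2][2])
    return {
        "hdop": horiz / 5.0,
        "vdop": vert / 5.0,
        "horiz_accuracy": horiz,
        "vert_accuracy": vert,
        "speed_accuracy": math.sqrt(P[3][3] + P[4][4]),
    }
```

The resulting values map one-to-one onto the GPS_INPUT fields of the same names. Unit scaling for lat/lon (degrees × 10⁷ in MAVLink) is not shown here.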

Confidence tiers mapping to GPS_INPUT:

Tier	Condition	fix_type	horiz_accuracy	Rationale
HIGH	Satellite match <30s ago, ESKF covariance < 400m²	3 (3D fix)	From ESKF P (typically 5-20m)	Absolute position anchor recent
MEDIUM	cuVSLAM tracking OK, no recent satellite match	3 (3D fix)	From ESKF P (typically 20-50m)	Relative tracking valid, drift growing
LOW	cuVSLAM lost, IMU-only	2 (2D fix)	From ESKF P (50-200m+, growing)	Only IMU dead reckoning, rapid drift
FAILED	3+ consecutive total failures	0 (no fix)	999.0	System cannot determine position
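
The tier table can be read as a small decision function. The sketch below is one possible reading. The evaluation order (FAILED, then LOW, then HIGH, else MEDIUM), the interpretation of "ESKF covariance" as the horizontal position variance P[0,0] + P[1,1], and all names are assumptions not fixed by the table.

```python
import math
from dataclasses import dataclass

@dataclass
class GpsQuality:
    tier: str
    fix_type: int
    horiz_accuracy_m: float

def classify(sat_match_age_s, vo_tracking, horiz_var_m2, consecutive_failures):
    """Map ESKF/VO status to a confidence tier and GPS_INPUT quality fields."""
    horiz_acc = math.sqrt(horiz_var_m2)
    if consecutive_failures >= 3:
        return GpsQuality("FAILED", 0, 999.0)      # no fix, sentinel accuracy
    if not vo_tracking:
        return GpsQuality("LOW", 2, horiz_acc)     # IMU-only dead reckoning
    if sat_match_age_s < 30.0 and horiz_var_m2 < 400.0:
        return GpsQuality("HIGH", 3, horiz_acc)    # recent absolute anchor
    return GpsQuality("MEDIUM", 3, horiz_acc)      # VO valid, drift growing
```

For example, a satellite match 8.3 s old with VO tracking and a 225 m² horizontal variance falls in the HIGH tier with a reported accuracy of 15 m.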

Component: Disconnected Route Segment Handling (NEW — previously undefined)

Trigger: cuVSLAM reports tracking_lost OR tracking confidence drops below threshold

Algorithm:

STATE: TRACKING_NORMAL
  cuVSLAM provides relative pose
  ESKF VO measurement updates at 0.7Hz
  Satellite matching on keyframes (every 5-10 frames)

STATE: TRACKING_LOST (enter when cuVSLAM reports loss)
  1. ESKF continues with IMU-only prediction (no VO updates)
     → uncertainty grows rapidly (~1-5 m/s drift with consumer IMU)
  2. Switch satellite matching to EVERY frame (not just keyframes)
     → maximize chances of getting absolute correction
  3. For each camera frame:
     a. Attempt satellite match using ESKF predicted position ± 3σ for tile selection
     b. If match succeeds (RANSAC inlier ratio > 30%):
        → ESKF measurement update with satellite position
        → Restart cuVSLAM with current frame as new origin
        → Transition to TRACKING_NORMAL
        → Reset failure counter
     c. If match fails:
        → Increment failure_counter
        → Continue IMU-only ESKF prediction
  4. If failure_counter >= 3:
     → Send re-localization request to ground station
     → GPS_INPUT fix_type = 0 (no fix), horiz_accuracy = 999.0
     → Continue attempting satellite matching on each frame
  5. If operator sends re-localization hint (approximate lat,lon):
     → Use as ESKF measurement with high covariance (~500m)
     → Attempt satellite match in that area
     → On success: transition to TRACKING_NORMAL

STATE: SEGMENT_DISCONNECT
  After re-localization following tracking loss:
  → New cuVSLAM track is independent of previous track
  → ESKF maintains global NED position continuity via satellite anchor
  → No need to "connect" segments at the cuVSLAM level
  → ESKF already handles this: satellite corrections keep global position consistent
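
The state logic above can be sketched as a small per-frame state machine. This is an illustrative skeleton only: the class name, the action strings, and the choice to return actions instead of calling cuVSLAM or the ESKF directly are assumptions. Only the 3-failure limit and the 30% inlier threshold come from the algorithm. The operator hint path (step 5) is omitted.

```python
from enum import Enum

class TrackState(Enum):
    TRACKING_NORMAL = "tracking_normal"
    TRACKING_LOST = "tracking_lost"

class Relocalizer:
    MAX_FAILURES = 3
    MIN_INLIER_RATIO = 0.30

    def __init__(self):
        self.state = TrackState.TRACKING_NORMAL
        self.failure_counter = 0
        self.reloc_requested = False

    def on_frame(self, vo_ok, inlier_ratio=None):
        """Process one camera frame and return the actions to perform.

        inlier_ratio is the RANSAC inlier ratio of the satellite match
        attempted for this frame, or None if no match was produced.
        """
        actions = []
        if self.state is TrackState.TRACKING_NORMAL:
            if vo_ok:
                return ["eskf_vo_update"]
            self.state = TrackState.TRACKING_LOST
            actions.append("notify_tracking_lost")
        # TRACKING_LOST: a satellite match is attempted on every frame.
        if inlier_ratio is not None and inlier_ratio > self.MIN_INLIER_RATIO:
            self.state = TrackState.TRACKING_NORMAL
            self.failure_counter = 0
            self.reloc_requested = False
            return actions + ["eskf_satellite_update", "restart_cuvslam"]
        self.failure_counter += 1
        actions.append("eskf_imu_only")
        if self.failure_counter >= self.MAX_FAILURES and not self.reloc_requested:
            self.reloc_requested = True
            actions.append("send_reloc_request")
        return actions
```

Returning action names keeps the transition logic testable without hardware. The caller would map each action to the corresponding ESKF, cuVSLAM, or MAVLink call.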

Component: Satellite Image Matching Pipeline (UPDATED — added GSD + tile selection details)

GSD normalization:

Camera GSD at 600m: ~15.9 cm/pixel (ADTI 20L V1 + 16mm)
Satellite tile GSD at zoom 18: ~0.6 m/pixel
Scale ratio: ~3.8:1
Downsample camera image to satellite GSD before matching: resize from 5456×3632 to ~1440×960 (matching zoom 18 GSD)
This is close to LiteSAM's 1280px input — use 1280px with minor GSD mismatch acceptable for matching
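
The GSD figures follow from the pinhole relation GSD = altitude × pixel pitch / focal length. A short sketch with hypothetical function names, assuming a nadir view over flat terrain:

```python
def ground_sample_distance_m(altitude_m, focal_mm, sensor_width_mm, width_px):
    """Ground footprint of one pixel (meters) for a nadir-pointing camera."""
    pixel_pitch_mm = sensor_width_mm / width_px
    return altitude_m * pixel_pitch_mm / focal_mm

def downsampled_size(width_px, height_px, camera_gsd_m, satellite_gsd_m):
    """Image size after resampling the camera frame to the satellite GSD."""
    scale = camera_gsd_m / satellite_gsd_m
    return round(width_px * scale), round(height_px * scale)

gsd = ground_sample_distance_m(600.0, 16.0, 23.2, 5456)
size = downsampled_size(5456, 3632, gsd, 0.6)
```

With the ADTI 20L V1 numbers this gives a GSD of about 0.159 m and a target size of 1450×965, consistent with the ~1440×960 estimate above. Because GSD scales linearly with altitude, the resize factor should be recomputed from the current altitude rather than hard-coded.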

Tile selection:

Input: ESKF position estimate (lat, lon) + horizontal covariance σ_h
Search radius: max(3·σ_h, 500m) — at least 500m to handle initial uncertainty
Compute geohash for center position → load tiles covering the search area
Assemble tile mosaic if needed (typically 2×2 to 4×4 tiles for adequate coverage)
If ESKF uncertainty > 2km: tile selection unreliable, fall back to wider search or request operator input
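
A possible shape for the tile selection is sketched below. It assumes tiles are addressable by Web Mercator XYZ indices. The draft stores tiles by geohash, so the index function would be swapped for a geohash lookup in practice. The function names and the spherical Earth radius are also assumptions.

```python
import math

R_EARTH_M = 6_378_137.0  # assumption: spherical Earth

def search_radius_m(sigma_h_m):
    """Search radius: 3-sigma horizontal uncertainty, at least 500 m."""
    return max(3.0 * sigma_h_m, 500.0)

def latlon_to_tile(lat_deg, lon_deg, zoom):
    """XYZ tile index containing the given WGS84 point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y

def tiles_for_search(lat_deg, lon_deg, sigma_h_m, zoom=18):
    """All tile indices covering the square that bounds the search circle."""
    r = search_radius_m(sigma_h_m)
    dlat = math.degrees(r / R_EARTH_M)
    dlon = dlat / math.cos(math.radians(lat_deg))
    x_min, y_min = latlon_to_tile(lat_deg + dlat, lon_deg - dlon, zoom)  # NW corner
    x_max, y_max = latlon_to_tile(lat_deg - dlat, lon_deg + dlon, zoom)  # SE corner
    return [(x, y) for x in range(x_min, x_max + 1)
                   for y in range(y_min, y_max + 1)]
```

The returned list grows quadratically with the radius, so the >2km fallback rule above also acts as a cap on mosaic size.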

Tile storage calculation (zoom 18 — 0.6 m/pixel):

Each 256×256 tile covers ~153m × 153m
Flight path 200km with ±2km buffer: area ≈ 200km × 4km = 800 km²
Tiles needed: 800,000,000 / (153 × 153) ≈ 34,200 tiles
Storage: ~10-15KB per JPEG tile → ~340-510 MB
With zoom 19 overlap tiles for higher precision: ×4 = ~1.4-2.0 GB
Recommended: zoom 18 primary + zoom 19 for ±500m along flight path → ~500-800 MB total
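
The storage arithmetic can be parameterized so other mission profiles are easy to evaluate. This is a sketch with a hypothetical function name. It treats the corridor as a simple rectangle, uses decimal megabytes, and takes the 10-15KB per tile range from the estimate above.

```python
def tile_storage_estimate(path_km, buffer_km, gsd_m,
                          tile_px=256, kb_per_tile=(10.0, 15.0)):
    """Return (tile_count, (min_mb, max_mb)) for a straight corridor."""
    tile_side_m = tile_px * gsd_m
    area_m2 = (path_km * 1000.0) * (2.0 * buffer_km * 1000.0)
    tiles = area_m2 / tile_side_m**2
    storage_mb = tuple(tiles * kb / 1000.0 for kb in kb_per_tile)
    return round(tiles), storage_mb

count, (mb_low, mb_high) = tile_storage_estimate(200.0, 2.0, 0.6)
```

For the 200km path with a ±2km buffer this yields roughly 34,000 tiles and 340-510 MB, in line with the zoom 18 figures above. The small difference comes from using 153.6m instead of the rounded 153m tile side.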

Solution	Tools	Advantages	Limitations	Performance (est. Orin Nano Super TRT FP16)	Params	Fit
LiteSAM (opt) TRT Engine FP16 @ 1280px	trtexec + tensorrt Python	Best satellite-aerial accuracy (RMSE@30=17.86m UAV-VisLoc), 6.31M params	MinGRU TRT export needs verification (LOW-MEDIUM risk)	Est. ~165-330ms	6.31M	✅ Primary
EfficientLoFTR TRT Engine FP16	trtexec + tensorrt Python	Proven TRT path (Coarse_LoFTR_TRT). Semi-dense. CVPR 2024.	2.4x more params than LiteSAM.	Est. ~200-400ms	15.05M	✅ Fallback if LiteSAM TRT fails
XFeat TRT Engine FP16	trtexec + tensorrt Python	Fastest. Proven TRT implementation.	General-purpose, not designed for cross-view gap.	Est. ~50-100ms	<5M	✅ Speed fallback

Component: cuVSLAM Configuration (NEW — previously undefined)

Mode: Inertial (mono camera + IMU)

Camera configuration (ADTI 20L V1 + 16mm lens):

Model: Brown distortion
fx = fy = 3763 px (16mm on 23.2mm sensor at 5456px width)
cx = 2728 px, cy = 1816 px
Distortion coefficients: from calibration (k1, k2, p1, p2)
Border: 50px (ignore lens edge distortion)

IMU configuration (Pixhawk 6x IMU — ICM-42688-P):

Gyroscope noise density: 3.0 × 10⁻³ °/s/√Hz
Gyroscope random walk: 5.0 × 10⁻⁵ °/s²/√Hz
Accelerometer noise density: 70 µg/√Hz
Accelerometer random walk: ~2.0 × 10⁻³ m/s³/√Hz
IMU frequency: 200 Hz (from flight controller via MAVLink)
T_imu_rig: measured transformation from Pixhawk IMU to camera center (translation + rotation)

cuVSLAM settings:

OdometryMode: INERTIAL
MulticameraMode: PRECISION (favor accuracy over speed — we have 1430ms budget)
Input resolution: downsample to 1280×852 (or 720p) for processing speed
async_bundle_adjustment: True

Initialization:

cuVSLAM initializes automatically when it receives the first camera frame + IMU data
First few frames used for feature initialization and scale estimation
First satellite match validates and corrects the initial position

Calibration procedure (one-time per hardware unit):

Camera intrinsics: checkerboard calibration with OpenCV (or use manufacturer data if available)
Camera-IMU extrinsic (T_imu_rig): Kalibr tool with checkerboard + IMU data
IMU noise parameters: Allan variance analysis or use datasheet values
Store calibration files on Jetson storage

Component: AI Model Inference Runtime (UNCHANGED)

Native TRT Engine — optimal performance and memory on fixed NVIDIA hardware. See draft05 for full comparison table and conversion workflow.

Component: Visual Odometry (UNCHANGED)

cuVSLAM in Inertial mode, fed by ADTI 20L V1 at 0.7 fps sustained. See draft05 for feasibility analysis at 0.7fps.

Component: Flight Controller Integration (UPDATED — added GPS_INPUT field spec)

pymavlink over UART at 5-10Hz. GPS_INPUT field population defined above.

ArduPilot configuration:

GPS1_TYPE = 14 (MAVLink)
GPS_RATE = 5 (minimum, matching our 5-10Hz output)
EK3_SRC1_POSXY = 3 (GPS), EK3_SRC1_VELXY = 3 (GPS), so the EKF uses GPS_INPUT as its position/velocity source

Component: Object Localization (NEW — previously undefined)

Input: pixel coordinates (u, v) in Viewpro A40 Pro image, current gimbal angles (pan_deg, tilt_deg), zoom factor, UAV position from GPS-denied system, UAV altitude

Process:

Pixel → camera ray: ray_cam = K_viewpro⁻¹(zoom) · [u, v, 1]ᵀ
Camera → gimbal frame: ray_gimbal = R_gimbal(pan, tilt) · ray_cam
Gimbal → body: ray_body = T_gimbal_body · ray_gimbal
Body → NED: ray_ned = R_body_ned(q) · ray_body
Ray-ground intersection: assuming flat terrain and UAV height above ground h (Down is positive in NED, so a ray that reaches the ground has ray_ned[2] > 0): t = h / ray_ned[2], p_ground_ned = p_uav_ned + t · ray_ned
NED → WGS84: convert to lat, lon

Output: { lat, lon, accuracy_m, confidence }

accuracy_m propagated from: UAV position accuracy (from ESKF) + gimbal angle uncertainty + altitude uncertainty
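
The ray-ground intersection step can be sketched as follows, under the flat-terrain assumption. The function name and the horizon guard threshold are hypothetical. The height is taken as positive above ground and the NED Down axis as positive, so a ray that hits the ground has a positive third component.

```python
import numpy as np

def intersect_ground(p_uav_ned, ray_ned, height_agl_m):
    """Intersect a NED ray from the UAV with flat ground below it.

    Returns the ground point in NED, or None if the ray points at or
    above the horizon and never reaches the ground.
    """
    if ray_ned[2] <= 1e-6:
        return None
    t = height_agl_m / ray_ned[2]  # distance along the ray to the ground plane
    return p_uav_ned + t * ray_ned
```

A ray pointing 45° forward and down from 600 m lands 600 m ahead of the UAV. Near-horizontal rays amplify gimbal angle errors strongly, which is why accuracy_m has to include the gimbal uncertainty.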

API endpoint: POST /objects/locate

Request: { pixel_x, pixel_y, gimbal_pan_deg, gimbal_tilt_deg, zoom_factor }
Response: { lat, lon, alt, accuracy_m, confidence, uav_position: {lat, lon, alt}, timestamp }

Component: Startup, Handoff & Failsafe (UPDATED — added handoff + reboot + re-localization)

GPS-denied handoff protocol:

GPS-denied system runs continuously from companion computer boot
Reads initial position from FC (GLOBAL_POSITION_INT) — this may be real GPS or last known
First satellite match validates the initial position
FC receives both real GPS (if available) and GPS_INPUT; FC EKF selects best source based on accuracy
No explicit "switch" — the GPS-denied system is a secondary GPS source

Startup sequence (expanded from draft05):

Boot Jetson → start GPS-Denied service (systemd)
Connect to flight controller via pymavlink on UART
Wait for heartbeat
Initialize PyCUDA context
Load TRT engines: litesam.engine + xfeat.engine (~1-3s each)
Allocate GPU I/O buffers
Create CUDA streams: Stream A (cuVSLAM), Stream B (satellite matching)
Load camera calibration + IMU calibration files
Read GLOBAL_POSITION_INT → set mission origin (NED reference point) → init ESKF
Start cuVSLAM (Inertial mode) with ADTI 20L V1 camera stream
Preload satellite tiles within ±2km into RAM
Trigger first satellite match → validate initial position
Begin GPS_INPUT output loop at 5-10Hz
System ready

Mid-flight reboot recovery:

Jetson boots (~30-60s)
GPS-Denied service starts, connects to FC
Read GLOBAL_POSITION_INT (FC's current IMU-extrapolated position)
Init ESKF with this position + HIGH uncertainty covariance (σ = 200m)
Load TRT engines (~2-6s total)
Start cuVSLAM (fresh, no prior map)
Immediate satellite matching on first camera frame
On satellite match success: ESKF corrected, uncertainty drops
Estimated total recovery: ~35-70s
During recovery: FC uses IMU-only dead reckoning (at 70 km/h the UAV covers ~700-1400m with no position corrections)
Known limitation: recovery time is dominated by Jetson boot time

3-consecutive-failure re-localization:

Trigger: VO lost + satellite match failed × 3 consecutive camera frames
Action: send re-localization request via MAVLink STATUSTEXT or custom message
Message content: "RELOC_REQ: last_lat={lat} last_lon={lon} uncertainty={σ}m"
Operator response: MAVLink COMMAND_LONG with approximate lat/lon
System: use operator position as ESKF measurement with R = diag(500², 500², 100²) meters²
System continues satellite matching with updated search area
While waiting: GPS_INPUT fix_type=0, IMU-only ESKF prediction continues
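
If STATUSTEXT is used for the request, note that its text field holds at most 50 characters, so the message template above would need a compact encoding. One possible format is sketched below. The abbreviated layout and the function name are assumptions, not a defined protocol.

```python
STATUSTEXT_MAX_CHARS = 50  # size of the MAVLink STATUSTEXT text field

def format_reloc_request(lat_deg, lon_deg, sigma_m):
    """Build a compact re-localization request that fits in one STATUSTEXT."""
    text = f"RELOC_REQ:{lat_deg:.5f},{lon_deg:.5f},{sigma_m:.0f}m"
    if len(text) > STATUSTEXT_MAX_CHARS:
        raise ValueError(f"message too long for STATUSTEXT: {len(text)} chars")
    return text
```

Five decimal places of latitude/longitude resolve to about 1 m, which is far finer than the uncertainty being reported. With pymavlink the string would typically be passed to `statustext_send` together with a severity level; that wiring is left out of the sketch.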

Component: Ground Station Telemetry (UPDATED — added re-localization)

MAVLink messages to ground station:

Message	Rate	Content
NAMED_VALUE_FLOAT "gps_conf"	1Hz	Confidence score (0.0-1.0)
NAMED_VALUE_FLOAT "gps_drift"	1Hz	Estimated drift from last satellite anchor (meters)
NAMED_VALUE_FLOAT "gps_hacc"	1Hz	Horizontal accuracy (meters, from ESKF)
STATUSTEXT	On event	"RELOC_REQ: ..." for re-localization request
STATUSTEXT	On event	Tracking loss / recovery notifications

Component: Thermal Management (UNCHANGED)

Same adaptive pipeline from draft05. Active cooling required at 25W. Throttling at 80°C SoC junction.

Component: API & Inter-System Communication (NEW — consolidated)

FastAPI (Uvicorn) running locally on Jetson for inter-process communication with other onboard systems.

Endpoint	Method	Purpose	Auth
/sessions	POST	Start GPS-denied session	JWT
/sessions/{id}/stream	GET (SSE)	Real-time position + confidence stream	JWT
/sessions/{id}/anchor	POST	Operator re-localization hint	JWT
/sessions/{id}	DELETE	End session	JWT
/objects/locate	POST	Object GPS from pixel coordinates	JWT
/health	GET	System health + memory + thermal	None

SSE event schema (1Hz):

{
  "type": "position",
  "timestamp": "2026-03-17T12:00:00.000Z",
  "lat": 48.123456,
  "lon": 37.654321,
  "alt": 600.0,
  "accuracy_h": 15.2,
  "accuracy_v": 8.1,
  "confidence": "HIGH",
  "drift_from_anchor": 12.5,
  "vo_status": "tracking",
  "last_satellite_match_age_s": 8.3
}
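
On the wire, a Server-Sent Events message is a block of `field: value` lines terminated by a blank line. A minimal serializer for the schema above might look like this. Using the payload's "type" as the SSE event name is an assumption, and the function name is hypothetical.

```python
import json

def sse_event(payload):
    """Serialize one position update as a Server-Sent Events message."""
    data = json.dumps(payload, separators=(",", ":"))
    # "event:" names the event type, "data:" carries the JSON body,
    # and the blank line marks the end of the message.
    return f"event: {payload['type']}\ndata: {data}\n\n"
```

A client can then subscribe with `EventSource` and listen for `position` events. Compact JSON keeps each message on a single `data:` line, which avoids multi-line framing.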

UAV Platform

Unchanged from draft05. See draft05 for: airframe configuration (3.5m S-2 composite, 12.5kg AUW), flight performance (3.4h endurance at 50 km/h), camera specifications (ADTI 20L V1 + 16mm, Viewpro A40 Pro), ground coverage calculations.

Speed Optimization Techniques

Unchanged from draft05. Key points: cuVSLAM ~9ms/frame, native TRT Engine (no ONNX RT), dual CUDA streams, 5-10Hz GPS_INPUT from ESKF IMU prediction.

Processing Time Budget

Unchanged from draft05. VO frame: ~17-22ms. Satellite matching: ≤210ms async. Well within 1430ms frame interval.

Memory Budget (Jetson Orin Nano Super, 8GB shared)

Component	Memory	Notes
OS + runtime	~1.5GB	JetPack 6.2 + Python
cuVSLAM	~200-500MB	CUDA library + map
LiteSAM TRT engine	~50-80MB	If LiteSAM fails: EfficientLoFTR ~100-150MB
XFeat TRT engine	~30-50MB
Preloaded satellite tiles	~200MB	±2km of flight plan
pymavlink + MAVLink	~20MB
FastAPI (local IPC)	~50MB
ESKF + buffers	~10MB
Total	~2.1-2.9GB	26-36% of 8GB

Key Risks and Mitigations

Risk	Likelihood	Impact	Mitigation
LiteSAM MinGRU ops unsupported in TRT 10.3	LOW-MEDIUM	LiteSAM TRT export fails	Day-one verification. Fallback: EfficientLoFTR TRT → XFeat TRT.
cuVSLAM fails on low-texture terrain at 0.7fps	HIGH	Frequent tracking loss	Satellite matching corrections bound drift. Re-localization pipeline handles tracking loss. IMU bridges short gaps.
Google Maps satellite quality in conflict zone	HIGH	Satellite matching fails, outdated imagery	Pre-flight tile validation. Consider alternative providers (Bing, Mapbox). Robust to seasonal appearance changes via feature-based matching.
ESKF scale drift during long constant-velocity segments	MEDIUM	Position error exceeds 100m between satellite anchors	Satellite corrections every 7-14s re-anchor. Altitude constraint from barometer. Monitor drift rate — if >50m between corrections, increase satellite matching frequency.
Monocular scale ambiguity	MEDIUM	Metric scale lost during constant-velocity flight	Satellite absolute corrections provide scale. Known altitude constrains vertical scale. IMU acceleration during turns provides observability.
AUW exceeds AT4125 recommended range	MEDIUM	Reduced endurance, motor thermal stress	12.5 kg vs 8-10 kg recommended. Monitor motor temps. Weight optimization.
ADTI mechanical shutter lifespan	MEDIUM	Replacement needed periodically	~8,800 actuations/flight at 0.7fps. Estimated 11-57 flights before replacement. Budget as consumable.
Mid-flight companion computer failure	LOW	~35-70s position gap	Reboot recovery procedure defined. FC uses IMU dead reckoning during gap. Known limitation.
Thermal throttling on Jetson	MEDIUM	Satellite matching latency increases	Active cooling required. Monitor SoC temp. Throttling at 80°C. Our workload ~8-15W typical — well under 25W TDP.
Engine incompatibility after JetPack update	MEDIUM	Must rebuild engines	Include engine rebuild in update procedure.
TRT engine build OOM on 8GB	LOW	Cannot build on target	Models are small (6.31M and <5M parameters). Reduce --memPoolSize if needed.
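The shutter-lifespan row in the table above rests on simple arithmetic: actuations per flight follow from capture rate and flight time, and the flight count from a rated shutter life. A minimal sketch is given below. The rated-life values of 100,000 and 500,000 actuations are hypothetical, back-solved from the table's 11-57 flight range rather than taken from a manufacturer figure.

```python
def actuations_per_flight(capture_rate_fps: float, flight_hours: float) -> float:
    """Shutter actuations in one flight at a constant capture rate."""
    return capture_rate_fps * flight_hours * 3600.0


def flights_before_replacement(rated_actuations: float,
                               per_flight: float = 8800.0) -> float:
    """Flights until the rated shutter life is used up.

    per_flight defaults to the ~8,800 figure from the risk table.
    """
    if per_flight <= 0:
        raise ValueError("per_flight must be positive")
    return rated_actuations / per_flight


if __name__ == "__main__":
    # 0.7fps over the 3.4h endurance figure, close to the table's ~8,800.
    print(round(actuations_per_flight(0.7, 3.4)))
    # Hypothetical rated lives (assumptions, not manufacturer data).
    for rated in (100_000, 500_000):
        print(rated, round(flights_before_replacement(rated), 1))
```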

Testing Strategy

Integration / Functional Tests

ESKF correctness: Feed recorded IMU + synthetic VO/satellite data → verify output matches reference ESKF implementation
GPS_INPUT field validation: Send GPS_INPUT to SITL ArduPilot → verify EKF accepts and uses the data correctly
Coordinate transform chain: Known GPS → NED → pixel → back to GPS — verify round-trip error <0.1m
Disconnected segment handling: Simulate tracking loss → verify satellite re-localization triggers → verify cuVSLAM restarts → verify ESKF position continuity
3-consecutive-failure: Simulate VO + satellite failures → verify re-localization request sent → verify operator hint accepted
Object localization: Known object at known GPS → verify computed GPS matches within camera accuracy
Mid-flight reboot: Kill GPS-denied process → restart → verify recovery within expected time → verify position accuracy after recovery
TRT engine load test: Verify engines load successfully on Jetson
TRT inference correctness: Compare TRT output vs PyTorch reference (max L1 error < 0.01)
CUDA Stream pipelining: Verify Stream B satellite matching does not block Stream A VO
ADTI sustained capture rate: Verify 0.7fps sustained >30 min without buffer overflow
Confidence tier transitions: Verify fix_type and accuracy change correctly across HIGH → MEDIUM → LOW → FAILED transitions
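The coordinate round-trip check in the list above can be prototyped before the full chain exists. The sketch below covers only the GPS → NED → GPS leg and uses a flat-earth local-tangent approximation around the NED origin; the pixel step is omitted. The approximation and all function names are assumptions for illustration, and a real implementation would likely use a proper geodetic conversion.

```python
import math

EARTH_RADIUS_M = 6_378_137.0  # WGS84 semi-major axis


def geodetic_to_ned(lat, lon, alt, lat0, lon0, alt0):
    """Approximate NED offset (metres) of a point from the origin."""
    north = math.radians(lat - lat0) * EARTH_RADIUS_M
    east = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    down = alt0 - alt  # NED: positive down
    return north, east, down


def ned_to_geodetic(north, east, down, lat0, lon0, alt0):
    """Inverse of geodetic_to_ned under the same approximation."""
    lat = lat0 + math.degrees(north / EARTH_RADIUS_M)
    lon = lon0 + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(lat0)))
    )
    alt = alt0 - down
    return lat, lon, alt


def round_trip_error_m(lat, lon, alt, origin):
    """Distance in metres between a point and its GPS -> NED -> GPS image."""
    ned = geodetic_to_ned(lat, lon, alt, *origin)
    lat2, lon2, alt2 = ned_to_geodetic(*ned, *origin)
    ned2 = geodetic_to_ned(lat2, lon2, alt2, *origin)
    return math.dist(ned, ned2)


if __name__ == "__main__":
    origin = (48.123456, 37.654321, 0.0)  # NED origin at mission start
    err = round_trip_error_m(48.133456, 37.664321, 600.0, origin)
    assert err < 0.1, err  # threshold from the test list above
```

Because the inverse uses the same approximation, this test only catches sign, axis-order, and unit mistakes. It does not measure the error of the approximation itself against true geodesy.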

Non-Functional Tests

End-to-end accuracy (primary validation): Fly with real GPS recording → run GPS-denied system in parallel → compare estimated vs real positions → verify 80% within 50m, 60% within 20m
VO drift rate: Measure cuVSLAM drift over 1km straight segment without satellite correction
Satellite matching accuracy: Compare satellite-matched position vs real GPS at known locations
Processing time: Verify end-to-end per-frame <400ms
Memory usage: Monitor over 30-min session → verify <8GB, no leaks
Thermal: Sustained 30-min run → verify no throttling
GPS_INPUT rate: Verify consistent 5-10Hz delivery to FC
Tile storage: Validate calculated storage matches actual for test mission area
MinGRU TRT compatibility (day-one blocker): Clone LiteSAM → ONNX export → polygraphy → trtexec
Flight endurance: Ground-test full system power draw against 267W estimate
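The end-to-end accuracy criterion above (80% within 50m, 60% within 20m) reduces to computing per-sample horizontal error and two threshold fractions. A minimal sketch follows, assuming estimated and real tracks are already time-aligned lists of (lat, lon) pairs; the haversine distance on a spherical Earth and the function names are illustrative choices.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in metres on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lam = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lam / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def fraction_within(errors_m, threshold_m):
    """Share of samples whose error is at or below the threshold."""
    if not errors_m:
        raise ValueError("no samples")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy_targets(estimated, truth):
    """Check the 80%-within-50m and 60%-within-20m criteria."""
    if len(estimated) != len(truth):
        raise ValueError("tracks must be time-aligned and equal length")
    errors = [
        haversine_m(e[0], e[1], t[0], t[1])
        for e, t in zip(estimated, truth)
    ]
    return (fraction_within(errors, 50.0) >= 0.80
            and fraction_within(errors, 20.0) >= 0.60)
```

For example, `fraction_within([10, 15, 18, 30, 60], 20)` counts three of five samples, which sits exactly on the 60% boundary.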

References

ArduPilot GPS_RATE parameter: https://github.com/ArduPilot/ardupilot/pull/15980
MAVLink GPS_INPUT message: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
pymavlink GPS_INPUT example: https://webperso.ensta.fr/lebars/Share/GPS_INPUT_pymavlink.py
ESKF reference (fixed-wing UAV): https://github.com/ludvigls/ESKF
ROS ESKF multi-sensor: https://github.com/EliaTarasov/ESKF
Range-VIO scale observability: https://arxiv.org/abs/2103.15215
NaviLoc trajectory-level localization: https://www.mdpi.com/2504-446X/10/2/97
SatLoc-Fusion hierarchical framework: https://www.scilit.com/publications/e5cafaf875a49297a62b298a89d5572f
Auterion GPS-denied workflow: https://docs.auterion.com/vehicle-operation/auterion-mission-control/useful-resources/operations/gps-denied-workflow
PX4 GNSS-denied flight: https://docs.px4.io/main/en/advanced_config/gnss_degraded_or_denied_flight.html
ArduPilot GPS_INPUT advanced usage: https://discuss.ardupilot.org/t/advanced-usage-of-gps-type-mav-14/99406
Google Maps Ukraine imagery: https://newsukraine.rbc.ua/news/google-maps-has-surprise-for-satellite-imagery-1727182380.html
Jetson Orin Nano Super thermal: https://edgeaistack.app/blog/jetson-orin-nano-power-consumption/
GSD matching research: https://www.kjrs.org/journal/view.html?pn=related&uid=756&vmd=Full
VO+satellite matching pipeline: https://polen.itu.edu.tr/items/1fe1e872-7cea-44d8-a8de-339e4587bee6
PyCuVSLAM docs: https://wiki.seeedstudio.com/pycuvslam_recomputer_robotics/
Pixhawk 6x IMU (ICM-42688-P) datasheet: https://invensense.tdk.com/products/motion-tracking/6-axis/icm-42688-p/
All references from solution_draft05.md

AC Assessment: _docs/00_research/gps_denied_nav/00_ac_assessment.md
Completeness assessment research: _docs/00_research/solution_completeness_assessment/
Previous research: _docs/00_research/trt_engine_migration/
Tech stack evaluation: _docs/01_solution/tech_stack.md (needs sync with draft05 corrections)
Security analysis: _docs/01_solution/security_analysis.md
Previous draft: _docs/01_solution/solution_draft05.md
