mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-04-27 16:46:36 +00:00
- Introduced a new document detailing the current state of the autodev process, including steps, status, and findings.
- Revised acceptance criteria in the acceptance_criteria.md file to clarify metrics and expectations, including updates to GPS accuracy and image processing quality.
- Enhanced restrictions documentation to reflect operational parameters and constraints for UAV flights, including camera specifications and satellite imagery usage.
- Added new research documents for acceptance criteria assessment and question decomposition to support ongoing project evaluation and decision-making.
# Solution Draft 03

> **Mode**: B (Solution Assessment of `solution_draft02.md`).
> **Inputs**: `solution_draft02.md` (Mode B round 1) + `_docs/00_research/{03_mode_b_decomposition_round2,04_reasoning_chain_mode_b_round2,05_validation_log_mode_b_round2}.md` + Mode B round-2 sources S58–S77 in `01_source_registry.md` + Mode B round-2 fact cards M-22..M-35 in `02_fact_cards.md`.
> **Date**: 2026-04-26 (Mode B round 2).
> **Self-contained**: yes; supersedes `solution_draft02.md`.
>
> **What changed in round 2** (driven by user-explicit asks: VO, matcher, EKF/ESKF, ortho-tile generator + thorough sweep):
>
> - **Component 4 (VO)**: replace draft02's *custom 2-frame homography VO via SP+LG* with **cuVSLAM** (NVIDIA, CUDA-accelerated, drop-in via `isaac_ros_visual_slam`) in monocular + IMU mode (M-22, M-23, S60, S64).
> - **Component 5 (Fusion)**: **drop the companion-side EKF entirely for v1**. Replace with a lightweight **covariance calibrator + Mahalanobis outlier gate + source-label producer**, with no state propagation and no IMU integration on the companion (M-26). Let ArduPilot EKF3 do the actual fusion. The "EKF vs ESKF" question becomes: *if* we re-introduce a companion filter in v1.x, use vanilla ESKF (S68, S69); for v1 the question is moot.
> - **Component 5 (Hybrid output)**: walk back round-1 M-1's "emit BOTH GPS_INPUT AND ODOMETRY in parallel for the same axis", which triggers ArduPilot EKF3 double-fusion bugs (S65, S66, S67). v1 ships **GPS_INPUT only** (Option A in M-30); ODOMETRY-primary mode is v1.1 territory.
> - **Component 3 (Matcher)**: **SP+LG (TRT FP16/INT8) remains the inline matcher**; **LiteSAM (S58) added in three non-inline roles**: re-localization fallback (cold start, σ_xy > 50 m), validation oracle, distillation teacher (M-24). RoMa v2 (S63), MASt3R-SLAM (S62), MapGlue, MATCHA added to the matcher bench-off as **offline ceiling references** (M-25).
> - **Component 1b (Ortho-Tile Generator)**: replace draft02's hand-rolled "pinhole projection on per-sector DEM" with **Orthority** (S59): Python library, frame + RPC camera, GeoTIFF DEM, pip-installable. Documented fall-back to `cv2.warpPerspective + bilinear DEM` if F-T14 latency measurement fails (M-27).
> - **Component 9 (Software platform)**: **ROS 2 Humble + Isaac ROS 3.2** chosen (Q6 → A, locked 2026-04-26). Natural pair for cuVSLAM and a published reference architecture on Orin Nano Super (S64, S77, M-29). DDS overhead (~2–5 % CPU, ~200 MB image growth) accepted in exchange for free integration of `isaac_ros_visual_slam`, MAVROS, and `ros2 bag` / `rqt_*` observability tooling.
> - **Component 1 (Tile storage)**, **C-2 (VPR)**, **C-6 (MAVLink)**, **C-7/C-8/C-10/C-11**: unchanged from draft02 (M-28, M-31, M-33).
>
> **Locked-in user decisions carried over from round 1** (unchanged):
>
> - **Q1** → A: GPS_INPUT primary channel (now: ONLY channel for v1; see M-30 above).
> - **Q2** → A: distinct system-IDs via ArduPilot native MAVLink routing; **no `mavlink-router` daemon**.
> - **Q3** → A: AC-NEW-7 thresholds confirmed at P(>30 m)<1 %, P(>100 m)<0.1 % per flight.
> - **Q4** → A: TartanAir V2 included as early-stage synthetic baseline.
> - **Q5** → B (round 1): proceed to Plan in fresh conversation. **Round 2 was triggered after rollback for additional component-replacement investigation.**
> - Camera spec → ADTi 20MP 20L V1 APS-C; storage zoom → z=20.
>
> **Round-2 user decisions locked-in (2026-04-26)**:
>
> - **Q6** → A: **ROS 2 Humble + Isaac ROS 3.2** as the v1 orchestrator (M-29). DIY Python orchestrator dropped. Codified in Component 9.
> - **Q7** → A: **MAVLink `RAW_IMU` / `SCALED_IMU` from FC** (path a) as the v1 IMU source for cuVSLAM (M-35). Dedicated companion IMU is a v1.1 hardware revision triggered only if F-T1c shows sync-jitter problems. Codified in Component 4.

---

## Assessment Findings (Round 2 additions)

The round-1 findings table (15 rows: M-1 … M-21, including addenda M-19/M-20/M-21) carries forward unchanged. **Round 2 adds the following findings, with the same `old → weak → new` pattern**:

| Old Component Solution (round 1) | Weak Point (round 2 evidence) | New Solution (round 2) |
|----------------------------------|-------------------------------|------------------------|
| **C-4 (round 1)**: "custom 2-frame VO via SuperPoint+LightGlue / GIM-LightGlue homography." | **Functional, high (M-22)**. Custom 2-frame homography skips loop closure, sparse bundle adjustment, and keyframe-based local mapping, i.e. every mechanism that bounds drift in production VO/SLAM. AFIT thesis (S52) shows even ORB-SLAM2/SVO/DSO struggle on real fixed-wing flights; a hand-rolled 2-frame variant will be strictly worse. At 1 km AGL motion parallax shrinks ~10–25× per frame vs 100 m AGL, further degrading monocular VO. | **Replace with cuVSLAM** (NVIDIA, CUDA-accelerated, Apache-2.0; S60, S64). Monocular + IMU mode, drop-in via `isaac_ros_visual_slam` ROS 2 wrapper. <1 % ATE on KITTI / <5 cm on EuRoC. Fixed-wing 1 km AGL behaviour is empirically TBD; bench-off in F-T1b mandatory before AC-1.3 lock. |
| **C-4 (round 1)**: same row, alternatives. | **Functional (M-23)**. Deep-VO alternatives evaluated for Orin Nano Super: DPVO/DPV-SLAM (S61, S73) extrapolate to 4–15 FPS (borderline for our 10 Hz target); MASt3R-SLAM (S62) is sub-1 Hz on Orin Nano Super (infeasible); VINS-Fusion / OpenVINS / BASALT / SVO Pro (S71) require non-trivial integration cost with no accuracy advantage over cuVSLAM. | cuVSLAM is **lead**; DPV-SLAM / VINS-Fusion / OpenVINS retained as **bench-off fall-backs** if cuVSLAM underperforms on fixed-wing 1 km AGL. MASt3R-SLAM / RoMa v2 reserved for **offline ceiling references**. |
| **C-3 (round 1)**: "SP+LG (TRT FP16) lead, GIM-LightGlue peer, RoMa/DKM bench-off, MASt3R dropped." | **Functional, positive (M-24)**. LiteSAM (S58, MDPI Oct 2025) is purpose-built for satellite↔aerial AVL: 6.31 M params (2.4× smaller than EfficientLoFTR), RMSE@30 = 17.86 m on UAV-VisLoc, beats EfficientLoFTR. **But on Jetson Orin Nano Super, extrapolated latency is ~1500–2000 ms / pair** (AGX Orin → Orin Nano Super 4× scaling), outside our 400 ms p95 budget for inline use. | **Add LiteSAM in three non-inline roles**: (a) re-localization fallback (cold start, σ_xy > 50 m, 1.5–2 s tolerable); (b) validation oracle for offline regression bench; (c) distillation teacher to train a satellite-aerial-specialised student model that fits the inline budget. **Inline matcher remains SP+LG / GIM-LG.** |
| **C-3 (round 1)**: same row, ceilings. | **Functional, positive (M-25)**. RoMa v2 (S63, Nov 2025): SOTA dense matcher with frozen DINOv3 backbone + custom CUDA + predictive covariance; best published pose-estimation accuracy. MASt3R-SLAM (S62), MapGlue, MATCHA: cross-modal/multimodal matchers with strong specialisation. All GPU-class compute. | **Add RoMa v2, MASt3R-SLAM, MapGlue, MATCHA to the matcher bench-off as offline ceiling references** so we know how much accuracy we trade by using SP+LG inline. None becomes inline candidate. |
| **C-5 (round 1, M-1)**: "Onboard loosely-coupled EKF emits two parallel MAVLink streams: GPS_INPUT (primary) AND ODOMETRY (auxiliary, when available) for the same axis." | **Functional, safety, high (M-26, M-30)**. ArduPilot ExtNav best practice (S65, S66, S67): **only one position source per axis at a time**. Open issues #30076 and #32506 document concrete EKF3 misbehaviours when both ExtNav (ODOMETRY) and GPS (GPS_INPUT) are fed for overlapping axes, including unstable position with high variances and Z-axis snap-to-ODOMETRY. The "emit both in parallel" framing was a misconfiguration, not a feature. | **v1 ships GPS_INPUT only** (Option A in M-30). ODOMETRY emission disabled in v1. ArduPilot configured `EK3_SRC1_*=GPS+Compass`; failover via `EK3_SRC2_*`. **Option B (ODOMETRY-primary) is v1.1 work** once F-T9 SITL confirms PR #30080-class source-switching is clean. |
| **C-5 (round 1)**: "loosely-coupled EKF in our process." | **Architectural (M-26)**. The companion-side EKF was always going to feed the FC's own EKF3 → double-fusion. Visual fix → companion EKF → ArduPilot EKF3 stacks two filters on overlapping observations, breaks the single-source-per-axis invariant, and risks the same instability documented in #30076/#32506. | **Drop the companion-side EKF for v1.** Component 5 becomes a **"covariance calibrator + Mahalanobis outlier gate + source-label producer"** with no state propagation and no IMU integration. Each upstream (matcher, cuVSLAM) emits a hypothesis with covariance; outliers are gated; covariances are re-scaled if empirical residuals show over- or under-confidence; results are emitted on the appropriate MAVLink channel. **If v1.x evidence demands a companion-side filter**, use vanilla **ESKF** (S68, S69), the right family for orientation correctness. |
| **C-1b (round 1)**: "Pinhole projection on per-sector DEM (flat-Earth in flat sectors; SRTM-30 m DEM lookup in moderate sectors)." | **Engineering (M-27)**. Implicit hand-rolled implementation reinvents distortion handling, RPC refinement, DEM bilinear lookup, and projection, all of which exist in the **Orthority** Python library (S59) under MIT-class licence, pip-installable. | **Use Orthority for per-frame ortho** (frame-camera mode). Falls back to `cv2.warpPerspective + bilinear DEM` (~5–20 ms estimated) if F-T14 measurement shows Orthority's per-frame latency on Orin Nano Super exceeds the 50 ms allotted to ortho. |
| **C-9 (round 1)**: "Single Python process (asyncio) on CPython 3.11/3.12; TRT subprocess workers." | **Architectural (M-29)**. With cuVSLAM adoption (M-23), the natural integration path is `isaac_ros_visual_slam` (ROS 2 wrapper) → MAVROS → FC. Re-exporting cuVSLAM into a custom asyncio orchestrator is high-friction. **ROS 2 Humble + JetPack 6 + Isaac ROS 3.2 is a published, working reference design on the exact hardware target** (S64, S77). | **Raised as Q6**: ROS 2 Humble + Isaac ROS 3.2 vs. DIY Python orchestrator. ROS 2 cost: ~2–5 % CPU (DDS + topic serialisation), ~200 MB image growth, learning curve. ROS 2 benefit: free integration of cuVSLAM, MAVROS, observability via `ros2 bag` / `rqt_*`. **Resolved: Q6 → A (ROS 2 Humble + Isaac ROS 3.2), locked 2026-04-26; see Component 9.** |

(Round-1 findings M-1 through M-21, including the Phase-1-correction addenda, remain unchanged in their original form; round-2 supersedes only the rows above. Full round-1 rationale lives in `solution_draft02.md` for traceability and `_docs/00_research/02_fact_cards.md`.)

---

## Product Solution Description (Revised)

A companion-computer software stack that runs on the **Jetson Orin Nano Super** alongside an **ArduPilot 4.5+** flight controller and provides **GPS-equivalent position fixes** to the autopilot when real GPS is jammed, spoofed, or denied.

**Localization pipeline (per frame at 3 fps nav cam):**

1. **cuVSLAM** (monocular + IMU from FC `RAW_IMU` MAVLink stream) provides drift-bounded **relative pose** with keyframe-based local mapping + sparse bundle adjustment + loop closure.
2. **VPR** (DINOv2 SALAD/BoQ chosen by bench-off; AnyLoc fallback) narrows the satellite basemap to a top-K candidate-chunk shortlist on re-localization triggers (cold start, sharp turn, σ_xy > 50 m); **conditional invocation** keeps cruise overhead near zero.
3. **Cross-view matcher** (SP+LG TRT FP16 inline; GIM-LightGlue peer in the bench-off; LiteSAM as **re-loc fallback**) produces sub-pixel keypoint correspondences against the candidate chunks; PnP yields an **absolute pose** + covariance.
4. **Component 5** (**covariance calibrator + Mahalanobis outlier gate + source-label producer**, *not* an EKF) consumes the absolute pose + cuVSLAM relative pose; rejects outliers; re-scales covariances; emits result on the appropriate MAVLink channel.
5. **GPS_INPUT** (`GPS1_TYPE=14`, MAVLink2-signed, pymavlink) is sent to the FC. ArduPilot EKF3 (24-state classical EKF, 400 Hz) does the actual fusion of our GPS-equivalent fix with its own IMU, baro, compass.

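As an illustration of step 5, the sketch below packs one calibrated fix into the field set of the MAVLink `GPS_INPUT` message. It is a minimal sketch, not the project's sender: the function name `gps_input_fields`, the fixed `hdop` / `vdop` / `satellites_visible` placeholders, and the caller-supplied leap-second offset are all assumptions made for illustration. With pymavlink the resulting dict would presumably be handed to the generated `gps_input_send` method; that call is not exercised here and should be checked against the installed dialect.

```python
# Hypothetical helper: builds the GPS_INPUT field dict from a calibrated fix.
# Field names follow the MAVLink common-dialect GPS_INPUT message; the
# placeholder values (hdop, vdop, satellites_visible) are assumptions.

GPS_EPOCH_UNIX_S = 315964800   # 1980-01-06T00:00:00Z expressed as Unix time
SECONDS_PER_WEEK = 604800


def gps_input_fields(unix_time_s, leap_seconds, lat_deg, lon_deg, alt_m,
                     vn, ve, vd, sigma_xy_m, sigma_z_m, sigma_v_mps):
    """Return GPS_INPUT fields for one fix (position in degrees / metres)."""
    gps_s = unix_time_s - GPS_EPOCH_UNIX_S + leap_seconds
    week, sec_of_week = divmod(gps_s, SECONDS_PER_WEEK)
    return {
        "time_usec": int(unix_time_s * 1_000_000),
        "gps_id": 0,
        "ignore_flags": 0,                    # all fields valid
        "time_week_ms": int(sec_of_week * 1000),
        "time_week": int(week),
        "fix_type": 3,                        # 3D fix
        "lat": int(round(lat_deg * 1e7)),     # degE7
        "lon": int(round(lon_deg * 1e7)),     # degE7
        "alt": float(alt_m),                  # metres AMSL
        "hdop": 1.0,                          # placeholder, assumption
        "vdop": 1.0,                          # placeholder, assumption
        "vn": float(vn), "ve": float(ve), "vd": float(vd),   # NED, m/s
        "speed_accuracy": float(sigma_v_mps),
        "horiz_accuracy": float(sigma_xy_m),  # calibrated sigma_xy
        "vert_accuracy": float(sigma_z_m),
        "satellites_visible": 10,             # placeholder, assumption
    }
```

The accuracy fields are where the calibrated covariance from Component 5 reaches EKF3, so they should carry the re-scaled sigmas rather than the raw matcher output.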
**Tile generation** (in-flight, asynchronous):

1. Per-frame eligibility check (σ_xy ≤ 5 m hard gate, terrain class flat/moderate, EKF source = `satellite_anchored`).
2. **Orthorectification via Orthority** (frame-camera model + per-sector DEM from SRTM 30 m).
3. Quality scoring + dedup against existing tile cache (service-tile immutability respected).
4. Write to MBTiles SQLite cache (WAL + connection pool + transaction batching) with `parent_pose_sigma_xy`, `terrain_class`, `trust_level`.
5. **Post-flight**: tiles uploaded to **Suite Service candidate pool**; **2-flight voting** at Service ingest promotes onboard tiles to trusted basemap.

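Step 4 can be sketched with the standard-library `sqlite3` module. This is a minimal illustration under stated assumptions, not the cache implementation: the `tiles` table follows the MBTiles layout, while the `tile_sidecar` table, the `TileRecord` type, and the function names are hypothetical. `INSERT OR IGNORE` is used here so that an existing tile is never overwritten, as one possible reading of service-tile immutability; the actual dedup decision happens upstream.

```python
import os
import sqlite3
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class TileRecord:            # hypothetical record type
    z: int
    x: int
    y: int                   # MBTiles uses TMS row numbering
    data: bytes
    parent_pose_sigma_xy: float
    terrain_class: str
    trust_level: str


def open_tile_cache(path):
    """Open (or create) the cache with write-ahead logging enabled."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tiles (zoom_level INTEGER, tile_column INTEGER,"
        " tile_row INTEGER, tile_data BLOB,"
        " PRIMARY KEY (zoom_level, tile_column, tile_row))")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tile_sidecar (zoom_level INTEGER,"
        " tile_column INTEGER, tile_row INTEGER, parent_pose_sigma_xy REAL,"
        " terrain_class TEXT, trust_level TEXT,"
        " PRIMARY KEY (zoom_level, tile_column, tile_row))")
    conn.commit()
    return conn


def write_batch(conn, records):
    """Write many tiles in a single transaction (commit or roll back as a unit)."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO tiles VALUES (?, ?, ?, ?)",
            [(r.z, r.x, r.y, r.data) for r in records])
        conn.executemany(
            "INSERT OR IGNORE INTO tile_sidecar VALUES (?, ?, ?, ?, ?, ?)",
            [(r.z, r.x, r.y, r.parent_pose_sigma_xy, r.terrain_class,
              r.trust_level) for r in records])


if __name__ == "__main__":
    cache = open_tile_cache(os.path.join(tempfile.mkdtemp(), "cache.mbtiles"))
    write_batch(cache, [TileRecord(20, 1, 2, b"\x00", 3.2, "flat", "candidate")])
    print(cache.execute("SELECT COUNT(*) FROM tiles").fetchone()[0])
```

Batching many tiles per transaction keeps the number of WAL commits low, which is the point of the "transaction batching" item above.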
**Object localization** (separate path, AI camera): trig + airframe-attitude fusion via FC `ATTITUDE` MAVLink stream; unchanged from round 1.

**MAVLink endpoint**: shared between MAVSDK (telemetry, sysid=10) and pymavlink (GPS_INPUT, sysid=11) via **distinct system-IDs through ArduPilot's native MAVLink routing**, with no `mavlink-router` daemon. **MAVLink2 signing mandatory in v1**.

```
Pre-flight (ground)
  ┌────────────────────────────────────────────────┐
  │ Azaion Suite Satellite Service                 │
  │ (sources commercial / agency imagery;          │
  │  ingests onboard tiles via candidate pool +    │
  │  2-flight voting layer)                        │
  └──────────────┬───────────────────┬─────────────┘
                 │ sync down         │ upload back (post-flight)
                 ▼                   ▲
  ┌─────────────────┐
  │ DEM (SRTM 30 m) │ ─────► sector classification
  └─────────────────┘

Onboard (in-flight)
  Nav Cam: ADTi 20MP, 3 fps                       AI Cam (gimbal+zoom, on-demand)
        │                                                │
        ▼                                                ▼
  ┌────────────────────────────────────────────┐  ┌────────────────────┐
  │ ROS 2 Humble + Isaac ROS 3.2 (Q6 → A)      │  │ Object Geo-Locator │
  │  ┌──────────────────────────┐              │  │ (pinhole+ATTITUDE) │
  │  │ cuVSLAM (mono + IMU)     │←──FC RAW_IMU │  └──────┬─────────────┘
  │  │ → keyframe pose + cov    │              │         │
  │  └────────────┬─────────────┘              │         │
  │               ▼                            │         │
  │  ┌──────────────────────────┐              │         │
  │  │ VPR (SALAD/BoQ/AnyLoc)   │←─ re-loc     │         │
  │  │ on demand only           │   triggers   │         │
  │  └────────────┬─────────────┘              │         │
  │               ▼                            │         │
  │  ┌──────────────────────────┐              │         │
  │  │ Cross-view Matcher       │              │         │
  │  │ inline: SP+LG / GIM-LG   │              │         │
  │  │ re-loc: LiteSAM (rare)   │              │         │
  │  └────────────┬─────────────┘              │         │
  │               ▼                            │         │
  │  ┌──────────────────────────┐              │         │
  │  │ PnP → absolute pose + Σ  │              │         │
  │  └────────────┬─────────────┘              │         │
  │               ▼                            │         │
  │  ┌──────────────────────────────────────┐  │         │
  │  │ Component 5 (NOT an EKF)             │  │         │
  │  │ - covariance calibrator              │  │         │
  │  │ - Mahalanobis outlier gate           │  │         │
  │  │ - source-label producer              │  │         │
  │  └────────────┬─────────────────────────┘  │         │
  │               ▼                            │         │
  │  ┌──────────────────────────────────────┐  │         │
  │  │ Ortho-Tile Generator (Orthority)     │  │         │
  │  │ → MBTiles+WAL Tile Cache             │  │         │
  │  └──────────────────────────────────────┘  │         │
  └────────────────┬───────────────────────────┘         │
                   ▼                                     │
  GPS_INPUT (pymavlink, signed) ──► ArduPilot            │
  (GPS1_TYPE=14, EK3_SRC1_POSXY=GPS, EK3_SRC2=GPS)       │
                   │ (ODOMETRY disabled for v1; v1.1+)   │
                   ▼                                     │
  Telemetry summary 1–2 Hz ──────► QGroundControl        │
                   │                                     │
                   ▼                                     │
  Flight Data Recorder (NVMe, 64 GB cap, no raw frames)
```

---

## Architecture

### Overall principles (revised vs draft02)

1. **Pipeline = stages with explicit confidence**. Each stage emits a pose hypothesis + covariance + categorical label. **Component 5 calibrates and gates; ArduPilot EKF3 fuses.** *(Revised: M-26.)*
2. **All heavy NN inference runs on GPU via TensorRT** (FP16, INT8 where validated). Pre-extract satellite-tile descriptors offline (AC-8.3). *(Unchanged.)*
3. **Orchestration**: **ROS 2 Humble + Isaac ROS 3.2** (Q6 → A, locked). cuVSLAM consumed via `isaac_ros_visual_slam`; MAVROS bridges ROS 2 ↔ MAVLink for the FC. Our matcher / VPR / ortho / Component-5 calibrator / FDR / uploader run as `rclpy` Python nodes. CPython 3.11 / 3.12 inside the nodes; TensorRT engines + CUDA contexts owned per-node. *(Revised: M-29.)*
4. **Persistent satellite cache** across flights (~10 GB for 400 km²); per-flight FDR is separate. *(Unchanged.)*
5. **Every output to the FC carries a covariance**, expressed in GPS_INPUT through its accuracy fields (`horiz_accuracy`, `vert_accuracy`, `speed_accuracy`). ODOMETRY emission disabled for v1 (Option A in M-30). *(Revised: M-30.)*
6. **Service tiles are basemap truth**; onboard tiles go through Service-side voting before promotion (M-9). *(Unchanged.)*
7. **MAVLink2 signing on every companion↔FC link** (M-7). USB bypasses signing, so USB is bench-only access. *(Unchanged.)*
8. **No companion-side state propagation**: the FC's EKF3 is the only filter. Any future companion-side filter (v1.x) will be an **ESKF** (S69), not a regular EKF. *(New: M-26.)*

---

### Component 1: Satellite Tile Cache & Descriptor Index

**Unchanged from draft02 / Mode B round 1**: MBTiles SQLite + WAL + connection pool + transaction batching; FAISS IVF over per-chunk DINOv2-VLAD vectors (chunk-decoupled per M-16); `terrain_class` and `trust_level` sidecar. (M-28: COG + PMTiles considered and rejected for our use case.)

---

### Component 1b: Ortho-Tile Generator *(REVISED: M-27)*

**Library**: **Orthority** (S59, Python, MIT-class), using its frame-camera model with GeoTIFF DEM lookup. Pip-installable: `pip install orthority`. Replaces draft02's hand-rolled "pinhole projection on per-sector DEM".

**Pipeline per frame** (eligibility / quality / dedup logic unchanged from draft02; only the *projection step* is replaced):

1. **Eligibility check** (unchanged from draft02 / M-9 hard gate): skip when EKF source is `dead_reckoned`, σ_xy > 5 m, roll/pitch > 10°, no inliers, or sector is `rugged`. Sectors classified `moderate` get `terrain_uncertainty=true` sidecar flag.
2. **Orthorectification (revised)**: call `orthority.Ortho(frame, dem, camera_model).process()` with the frame-camera model populated from FC `ATTITUDE` (gimbal pitch / roll / yaw) + companion-resolved position + airframe altitude. SRTM-30 m DEM tile pre-loaded for the operational area.
3. **Resampling to basemap projection** (unchanged): EPSG:3857 z=20.
4. **Quality scoring** (unchanged from draft02): sharpness + coverage + match_inliers + parent_pose_sigma_xy + glare/cloud flag.
5. **Deduplication / write decision** (unchanged from draft02: M-9 service-tile-immutability + soft/candidate gates).
6. **Sidecar metadata** (unchanged): `parent_pose_sigma_xy`, `terrain_class`, `trust_level`.

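The eligibility check in step 1 is simple enough to sketch directly. The thresholds below come from the step text (σ_xy > 5 m, roll/pitch > 10°, no inliers, `rugged` sector, `dead_reckoned` source); the `FrameState` type, the function name, and the returned sidecar dict are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass

SIGMA_XY_HARD_GATE_M = 5.0   # M-9 hard gate from step 1
MAX_TILT_DEG = 10.0          # roll / pitch limit from step 1


@dataclass(frozen=True)
class FrameState:            # hypothetical per-frame summary
    ekf_source: str          # e.g. "satellite_anchored", "dead_reckoned"
    sigma_xy_m: float
    roll_deg: float
    pitch_deg: float
    match_inliers: int
    terrain_class: str       # "flat", "moderate" or "rugged"


def tile_eligibility(frame):
    """Return (eligible, sidecar_flags) for one nav-cam frame."""
    skip = (
        frame.ekf_source == "dead_reckoned"
        or frame.sigma_xy_m > SIGMA_XY_HARD_GATE_M
        or abs(frame.roll_deg) > MAX_TILT_DEG
        or abs(frame.pitch_deg) > MAX_TILT_DEG
        or frame.match_inliers <= 0
        or frame.terrain_class == "rugged"
    )
    if skip:
        return False, {}
    # Moderate terrain is still eligible but flagged in the sidecar.
    return True, {"terrain_uncertainty": frame.terrain_class == "moderate"}
```

Running this gate before orthorectification means rejected frames never consume the per-frame ortho budget.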
**Latency budget**: F-T14 (revised) measures Orthority's per-frame latency on Orin Nano Super. **Budget: ≤50 ms / frame.** Documented fall-back if exceeded: `cv2.warpPerspective` + bilinear DEM lookup (~5–20 ms estimated).

---

### Component 2: Visual Place Recognition (Global Retrieval)

**Unchanged from draft02 / Mode B round 1.** AnyLoc + SALAD + BoQ + MixVPR shortlist; conditional invocation (M-17); chunk-based retrieval unit (M-16); expanding-window retry (M-18); multi-scale chunks + OSM road-overlay + sector-volatility-driven K (M-19); active-conflict scene-change mitigations stand. (M-33: no new VPR backbone in 2025 displaces this.)

---

### Component 3: Cross-View Matching & PnP *(REVISED: M-24, M-25)*

**Inline lead**: **SuperPoint + LightGlue (TRT FP16/INT8)**, unchanged. Feasibility re-confirmed: ~50–200 ms / pair on Orin Nano Super FP16 at 320×240 → 640×480 (RTX 3080 baseline 0.96 + 2.54 ms scaled by Orin Nano Super throughput ratio; cross-validated by S76 YOLO26 reference points).

**Inline peer**: **GIM-LightGlue**, unchanged from draft02 (M-3, S48). +8.4–18.1 % zero-shot vs LightGlue baseline.

**Embedded fallback**: **XFeat (sparse + semi-dense)**, unchanged.

**Re-localization fallback** *(new: M-24)*: **LiteSAM** (S58). Invoked rarely (cold start, σ_xy > 50 m, sharp turn after cuVSLAM tracking loss). Latency budget: 1.5–2 s on Orin Nano Super. Accepted because re-loc events are rare and AC-NEW-1 cold-start budget is 30 s.

**Validation oracle** *(new: M-24)*: **LiteSAM run offline on bench data** for ground-truth-quality matches. Used to score the inline matcher's recall@30m on a per-flight basis without needing manual annotation.

**Distillation teacher** *(new: M-24)*: train a satellite-aerial-specialised student model (target ≤5 M params, ≤100 ms / pair) using LiteSAM-supervised correspondences on TartanAir V2 + AerialExtreMatch + UAV-VisLoc. Output is a candidate inline matcher for v1.x.

**Offline ceiling references** *(new: M-25)*: **RoMa v2** (S63), **MASt3R-SLAM** (S62), **MapGlue**, **MATCHA**, included in the matcher bench-off so we know how much accuracy we trade by using SP+LG inline. None becomes inline candidate.

**Bench-off scope (revised)** for the deferred research item:

- Inline candidates (must fit in 200 ms / pair on Orin Nano Super @ 25 W): SP+LG, GIM-LightGlue, XFeat (sparse), XFeat (semi-dense).
- Re-loc candidates (must fit in 2 s / pair): LiteSAM.
- Offline ceilings: RoMa v2, MASt3R-SLAM, MapGlue, MATCHA.

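The scope rules above amount to bucketing each candidate by its measured p95 latency. The following is a small sketch of that bookkeeping, assuming a nearest-rank percentile; the function names are hypothetical and any latency samples fed to it would come from the bench-off itself, not from this document.

```python
import math

INLINE_BUDGET_MS = 200.0    # inline budget per pair, from the scope list
RELOC_BUDGET_MS = 2000.0    # re-loc budget per pair, from the scope list


def p95_nearest_rank(samples_ms):
    """95th percentile using the nearest-rank definition."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))   # 1-based rank
    return ordered[rank - 1]


def assign_role(p95_latency_ms):
    """Map a p95 latency onto the bench-off role it can still compete for."""
    if p95_latency_ms <= INLINE_BUDGET_MS:
        return "inline"
    if p95_latency_ms <= RELOC_BUDGET_MS:
        return "re-loc"
    return "offline-ceiling"
```

A candidate that misses the inline budget is not discarded; it simply moves to the re-loc or offline-ceiling bucket, which matches the "accuracy-vs-inline-feasibility frontier" scoring used in the bench-off.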
**Bench-off targets** (unchanged from draft02): AerialVL, UAV-VisLoc, AerialExtreMatch, 2chADCNN season set, TartanAir V2, internal Mavic, first internal fixed-wing flight.

**Score on**: AC-1.1 / AC-1.2 / AC-2.2 / p95 latency on Orin Nano Super 25 W / sustained 30-min thermal stability / peak GPU memory / **plus seasonal-robustness score** / **plus accuracy-vs-inline-feasibility frontier (re-loc role only for >200 ms candidates)**.

**PnP & projection**: unchanged from draft02.

**Input downsampling**: unchanged starting points (1024×768 for SP+LG / GIM-LG; 640×480 for XFeat sparse).

---

### Component 4: Visual Odometry *(REVISED: M-22, M-23)*

**v1 choice**: **cuVSLAM** (NVIDIA, CUDA-accelerated, Apache-2.0; S60). Monocular + IMU mode. Drop-in via `isaac_ros_visual_slam` ROS 2 wrapper (S64). Replaces draft02's "custom 2-frame VO via SP+LG / GIM-LG homography".

**Why cuVSLAM**:

- Production-grade VO/SLAM with keyframe-based local mapping + sparse bundle adjustment + loop closure, which bounds drift, unlike a 2-frame homography.
- CUDA-accelerated, optimized for Jetson. Reference designs on Orin Nano (S64, S77) confirm runtime feasibility.
- <1 % ATE on KITTI / <5 cm on EuRoC.
- Minimal integration cost via the ROS 2 wrapper.

**Why not the alternatives**:

- DPVO / DPV-SLAM (S61, S73): extrapolated 4–15 FPS on Orin Nano Super, borderline for 10 Hz target. Reserved as bench-off fall-back.
- MASt3R-SLAM (S62): sub-1 Hz on Orin Nano Super, infeasible inline.
- VINS-Fusion / OpenVINS / BASALT / SVO Pro (S71): non-trivial integration cost; no accuracy advantage. Reserved as bench-off fall-backs.
- Custom 2-frame homography VO (draft02): wrong design (M-22).

**IMU source for cuVSLAM** (Q7 → A, locked, M-35): **MAVLink `RAW_IMU` / `SCALED_IMU` from FC** at ~200–400 Hz (path a). Subscribed inside the cuVSLAM node via MAVROS. **F-T1c** (new field test) measures sync-jitter under flight load; if it fails the threshold (TBD by cuVSLAM tolerance), v1.1 adds a dedicated companion IMU (BNO055 / ICM-42688P / BMI270) over SPI as a hardware revision.

**Camera intrinsics**: nav cam (ADTi 20MP APS-C) calibrated pre-flight via standard checkerboard (M-34). cuVSLAM consumes the `camera_info` topic at start-up.

**Risk R8 reframed**: cuVSLAM's high-altitude fixed-wing performance is empirically unproven (its published benchmarks are urban driving + indoor MAV). **F-T1b (revised) bench-off mandatory before AC-1.3 lock**.

**Fall-back path**: if cuVSLAM underperforms on AerialVL fixed-wing trajectories, use a properly-scoped VO (DPV-SLAM with keyframe + bundle adjustment + loop closure, not 2-frame homography) as the v1.1 candidate. Custom 2-frame VO never comes back.

---

### Component 5: Companion-Side Output Stage *(REVISED: M-26, M-30)*

**Renamed**: was "IMU + Visual EKF Fusion" in draft02. Now: **"Companion-Side Output Stage: Covariance Calibrator + Outlier Gate + Source-Label Producer"**.

**Responsibility (v1)**:

1. Consume cuVSLAM relative-pose + cross-view matcher absolute-pose hypotheses.
2. Run a Mahalanobis outlier gate to drop fixes whose innovation w.r.t. cuVSLAM relative pose exceeds a threshold (computed against AC-NEW-4 false-position safety budget).
3. Re-scale covariances using empirical residuals (online, exponentially-weighted) to correct for systematic over- / under-confidence in the matcher / VPR / VO outputs.
4. Tag the result with a categorical source label: `satellite_anchored / vo_extrapolated / dead_reckoned`.
5. Emit on the appropriate MAVLink channel (GPS_INPUT for v1, Option A in M-30).

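Steps 2 and 3 can be illustrated with a stateless-per-fix gate plus a single running scale factor. This is a sketch under assumptions, not the calibrator design: it works on the horizontal (2-D) innovation only, defines the innovation as the matcher's absolute fix minus the position predicted from the last accepted fix plus the cuVSLAM relative displacement, uses the 99 % chi-square quantile for 2 degrees of freedom as a stand-in for the AC-NEW-4-derived threshold, and re-scales with one scalar. The class and parameter names are hypothetical.

```python
CHI2_2DOF_99 = 9.21   # 0.99 quantile of chi-square with 2 dof (assumed gate)


def mahalanobis_sq_2d(innovation, cov):
    """Squared Mahalanobis distance of a 2-D innovation under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    dx, dy = innovation
    # inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det


class CovarianceCalibrator:
    """Outlier gate plus exponentially-weighted covariance re-scaling."""

    def __init__(self, alpha=0.05, gate=CHI2_2DOF_99, min_scale=0.1):
        self.alpha = alpha          # EW smoothing factor
        self.gate = gate
        self.min_scale = min_scale  # keeps the scale from collapsing to zero
        self.scale = 1.0            # multiplies the reported covariance

    def process(self, innovation, cov):
        """Return the re-scaled covariance, or None if the fix is gated out."""
        d2_raw = mahalanobis_sq_2d(innovation, cov)
        if d2_raw / self.scale > self.gate:
            return None             # outlier: dropped, scale left untouched
        # For a consistent 2-D Gaussian E[d2] = 2, so d2_raw / 2 estimates
        # how much the reported covariance under- or over-states the error.
        updated = (1.0 - self.alpha) * self.scale + self.alpha * (d2_raw / 2.0)
        self.scale = max(self.min_scale, updated)
        return [[self.scale * v for v in row] for row in cov]
```

No state is propagated between fixes other than the scalar `scale`, which is consistent with the "not an EKF" constraint: the filter proper stays in ArduPilot EKF3.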
**Explicitly NOT in v1**:

- ❌ State propagation (no `x_{k+1} = f(x_k, u_k) + w_k`).
- ❌ IMU integration (the FC's EKF3 does this with the FC's own IMU at 400 Hz).
- ❌ ODOMETRY emission (Option B in M-30, v1.1+).

**ESKF question resolved**: ArduPilot EKF3 is a regular EKF (24-state); we cannot swap the FC filter (S65, S66, S67, S68). The EKF-vs-ESKF debate applies only to a hypothetical companion-side filter, which we drop for v1. **If v1.x evidence (F-T9 SITL) demands a companion-side filter, use vanilla ESKF** (S69), the right family for orientation correctness, with tangent-space covariance on SO(3).

**Hybrid-output channel split (M-30)**:

| Mode | `EK3_SRC1_*` configuration | Channel emission | Status |
|------|---------------------------|------------------|--------|
| **Option A (v1 default)** | `POSXY=GPS, VELXY=GPS, YAW=GPS+Compass`. `EK3_SRC2_*=GPS` for failover. | GPS_INPUT only (`GPS1_TYPE=14`). ODOMETRY disabled. | Ships in v1. |
| **Option B (v1.1+)** | `POSXY=ExternalNav, YAW=ExternalNav`. `EK3_SRC2_POSXY=GPS` for failover. | ODOMETRY primary; GPS_INPUT held in reserve, not actively fused while ODOMETRY healthy. | Requires PR #30080 fix; gated on F-T9 SITL pass. |

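As a rough illustration of the Option A row, the parameter set can be written down as data and checked for the single-source-per-axis invariant before it is pushed to the FC. The numeric values are assumptions taken from a reading of the ArduPilot `EK3_SRCn_*` parameter enumerations (3 = GPS for position and velocity, 3 = GPS with compass fallback for yaw, 6 = ExternalNav) and from `GPS1_TYPE=14` in the table; they should be verified against the target firmware. Interpreting "YAW=GPS+Compass" as value 3 is likewise an assumption, and the helper names and the `NAME,VALUE` line format are hypothetical.

```python
EXTERNAL_NAV = 6   # assumed EK3_SRCn_* enum value for ExternalNav

# Option A (v1 default); numeric values are assumptions to verify on the FC.
OPTION_A_PARAMS = {
    "GPS1_TYPE": 14,         # MAVLink GPS_INPUT as the GPS driver
    "EK3_SRC1_POSXY": 3,     # GPS
    "EK3_SRC1_VELXY": 3,     # GPS
    "EK3_SRC1_YAW": 3,       # GPS with compass fallback
    "EK3_SRC2_POSXY": 3,     # failover source set, GPS
    "EK3_SRC2_VELXY": 3,
}


def external_nav_axes(params):
    """List EK3 source parameters that select ExternalNav (none in Option A)."""
    return sorted(name for name, value in params.items()
                  if name.startswith("EK3_SRC") and value == EXTERNAL_NAV)


def to_param_lines(params):
    """Render the set as NAME,VALUE lines for a parameter file."""
    return [f"{name},{value}" for name, value in sorted(params.items())]
```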
---

### Component 6: MAVLink Integration & Source Promotion

**Unchanged from draft02 / round 1.** MAVSDK (telemetry, sysid=10) + pymavlink (GPS_INPUT, sysid=11), distinct system-IDs sharing the serial port via ArduPilot's native MAVLink routing. **No mavlink-router daemon.** **MAVLink2 signing mandatory**, per-airframe key in FC FRAM. Source-promotion logic and AC-NEW-2 (<3 s spoofing-promotion latency) carry forward unchanged. (M-31: sysid collision-check added to deploy runbook.)

---

### Component 7: Failsafe, Health & Re-Localization

Unchanged from draft02.

---

### Component 8: Object Localization (AI Camera)

Unchanged from draft02.

---

### Component 9: Software Platform & Process Topology *(LOCKED: Q6 → A, M-29)*

**v1 choice**: **ROS 2 Humble + Isaac ROS 3.2 on JetPack 6 / Ubuntu 22.04** (S64, S77).

**Process topology**:

- **C++ Isaac ROS node**: cuVSLAM via `isaac_ros_visual_slam` (consumes `camera_info` + image stream + IMU; publishes `nav_msgs/Odometry`).
- **C++ MAVROS node**: bridges ROS 2 ↔ MAVLink for the FC. `RAW_IMU` / `SCALED_IMU` subscribed by the cuVSLAM node; FC `ATTITUDE` consumed by Component 1b ortho node; `GPS_INPUT` published by Component 5 calibrator node.
- **Python `rclpy` nodes**: matcher (SP+LG TRT FP16/INT8), VPR (SALAD/BoQ on demand), Component 1b ortho generator (Orthority), Component 5 calibrator + outlier gate, FDR writer, Suite-Service uploader.
- **TensorRT engines + CUDA contexts** owned per-node (no shared CUDA context). Engines loaded at node start-up; warm-up inference at boot.

**Stack details (locked)**:

- CPython **3.11 or 3.12** inside `rclpy` nodes (free-threaded 3.13 deferred to v1.x; M-32, M-33).
- TensorRT **FP16 default**, INT8 where validated by the matcher bench-off.
- **numba JIT** for the calibrator's hot path (Mahalanobis distance + covariance re-scale).
- Configuration via YAML; structured-JSON logging to FDR; `ros2 bag` for in-flight telemetry capture.

**Cost / benefit reaffirmed**:

- **Cost**: ~2–5 % CPU for DDS + topic serialisation; ~200 MB extra deployment-image footprint; learning curve (mitigated by published reference designs in S64, S77).
- **Benefit**: drop-in `isaac_ros_visual_slam` for cuVSLAM, drop-in MAVROS for the FC bridge, free observability via `ros2 bag` and `rqt_*`, battle-tested by the wider robotics community.

**Reference designs**: S64 (Hackster.io GPS-Denied Drone), S77 (thomasthelliez ROS 2 / Isaac ROS guide), `bandofpv/VSLAM-UAV` (PX4 + ROS 2 reference), `sidharthmohannair/ros2-ardupilot-sitl-hardware` (ArduPilot + ROS 2 reference).

---

### Component 10: Flight Data Recorder

Unchanged from draft02 / round 1.

---

### Component 11: Confidence Score (cross-cutting)

Unchanged from draft02 / round 1.

---

## Testing Strategy
|
||||
|
||||
### Functional / Integration
|
||||
|
||||
- **F-T1** Tile cache load/lookup *(unchanged)*.
|
||||
- **F-T1b** *(REVISED — M-22, M-23, R8 reframed)* AC-1.3 drift regression: run **cuVSLAM** on AerialVL fixed-wing trajectories (70 km of real flight). Pass = drift ≤ 100 m mono-only / ≤ 50 m mono+IMU between satellite anchors at 95th percentile. **Gates AC-1.3 lock.** If cuVSLAM fails: fall back to DPV-SLAM bench / VINS-Fusion bench.
|
||||
- **F-T1c** *(new — M-22, M-23)* Compare cuVSLAM mono vs cuVSLAM mono+IMU on the same AerialVL trajectories — quantifies IMU contribution given MAVLink-rated IMU rate (path (a) of M-35).
|
||||
- **F-T2** Tile generation + dedup *(extended — M-9 + M-27)*: replay a recorded flight; assert (a) ≤1 tile per ground sector covered ≥2× by nav cam; (b) tile has `parent_pose_sigma_xy` ≤ hard gate; (c) service tiles never overwritten within freshness budget; **(d) Orthority output equivalent to ground-truth ortho (RMSE < 1 px on synthetic frame with known DEM)**.
- **F-T3** Tile uploader → candidate pool *(unchanged from draft02)*.
- **F-T4** End-to-end against AerialVL.
- **F-T5** End-to-end against UAV-VisLoc.
- **F-T5b** End-to-end against AerialExtreMatch *(unchanged from draft02 — M-14)*.
- **F-T5c** Season-robustness regression against 2chADCNN season set *(unchanged from draft02 — M-14)*.
- **F-T6** End-to-end against internal Mavic flight footage.
- **F-T7** Sharp-turn handling *(extended — M-24)*: assert LiteSAM re-loc fallback recovers within 2 s on post-turn frames where SP+LG inline matcher fails.
- **F-T8** Disconnected-segment re-localization *(extended — M-24)*: include LiteSAM re-loc in the test matrix.
- **F-T9** ArduPilot SITL: full MAVLink loop *(REVISED — M-30)*. Test matrix:
  - **Option A mode** (v1 default): GPS_INPUT only; verify EKF3 fuses correctly; verify failover to backup GPS via `EK3_SRC2_*`.
  - **Option B mode** (v1.1 candidate): ODOMETRY-primary; verify PR #30080-class source-switching is clean; verify GPS_INPUT held in reserve does not double-fuse (issues #30076 / #32506 regression test).
  - Source switching: jam-onset → our channel; spoofed-real-GPS recovery → operator-confirmed source-restore.
  - **MAVLink2 signing on**: assert injection refused on signing failure; assert acceptance on valid signing.
- **F-T10** Operator re-loc workflow via QGC `STATUSTEXT` *(unchanged)*.
- **F-T11** Cold-start TTFF <30 s (AC-NEW-1) *(extended — M-24)*: include LiteSAM as the cold-start re-loc path.
- **F-T12** Spoofing-promotion <3 s (AC-NEW-2) *(unchanged)*.
- **F-T13** Object localization with airframe-attitude fusion *(unchanged)*.
- **F-T14** *(REVISED — M-27)* Per-sector DEM classification + **Orthority per-frame latency**: load SRTM-30 m for the operational area; assert sector classes (`flat`, `moderate`, `rugged`) match ground-truth DEM amplitudes; **measure Orthority per-frame ortho latency on Orin Nano Super @ 25 W**; assert ≤ 50 ms / frame budget. If exceeded: switch to `cv2.warpPerspective + bilinear DEM` fall-back.
- **F-T15** VPR retrieval-unit bench *(unchanged from draft02 — M-16/17/18)*.
- **F-T16** Synthetic cloud-occlusion injection *(unchanged from draft02)*.
- **F-T17** Mission replay assertion *(unchanged from draft02 — M-17)*.
- **F-T18** *(new — M-26)* Companion-side calibrator regression: replay a recorded flight; assert the calibrator's empirical residuals lie within the configured Mahalanobis gate; assert no state-propagation logic is invoked; assert ArduPilot EKF3 receives well-calibrated covariances (post-flight comparison of `h_acc` reported vs measured residual).
- **F-T19** *(new — Q6)* If Q6.A is chosen: ROS 2 topic-rate sanity test — assert all ROS 2 topics meet expected publish rates under simulated load.
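
The pass criteria in F-T1b (95th-percentile drift between satellite anchors) and F-T18 (reported `h_acc` versus measured residual) could be evaluated offline roughly as follows. This is a minimal sketch, not the project's test harness: the function names are hypothetical, and the factor `k = 2.448` assumes `h_acc` is read as a per-axis 1-sigma of an isotropic 2-D Gaussian (so that 95 % of horizontal errors fall inside `k * h_acc`). A different reading of `h_acc` needs a different factor.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def drift_gate(drifts_m, limit_m, q=95.0):
    """True when the q-th percentile of inter-anchor drift is within limit_m."""
    return percentile(drifts_m, q) <= limit_m


def coverage(residuals_m, h_acc_m, k=2.448):
    """Fraction of horizontal residuals that fall inside k * reported h_acc.

    ASSUMPTION: h_acc is a per-axis 1-sigma; k = sqrt(-2 ln 0.05) is the
    95 % radius of an isotropic 2-D Gaussian in units of that sigma.
    """
    inside = sum(1 for r, s in zip(residuals_m, h_acc_m) if r <= k * s)
    return inside / len(residuals_m)
```

With these helpers, F-T1b would call `drift_gate(drifts, 50.0)` for mono+IMU and `drift_gate(drifts, 100.0)` for mono-only, and F-T18 would compare `coverage(...)` against 0.95. A coverage far above 0.95 would indicate over-claimed uncertainty, which AC-NEW-9 also rules out.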
### Non-Functional
- **NF-T1** Latency p95 <400 ms on Orin Nano Super 25 W (AC-4.1) *(unchanged)*.
- **NF-T2** Memory <8 GB shared (AC-4.2) *(extended — Q6)*: ROS 2 + Isaac ROS deployment image must fit; reserve ≥1 GB for matcher + VPR engines.
- **NF-T3** Thermal: 8 h sustained 25 W (AC-NEW-5) *(unchanged)*.
- **NF-T4** False-position safety budget (AC-NEW-4) *(extended — M-26)*: Monte Carlo with synthetic over-confidence injection; verify Component 5's outlier gate rejects bad fixes BEFORE they reach ArduPilot EKF3 (companion-side gate; FC EKF3 gate is a second line of defence).
- **NF-T4b** AC-NEW-7 cache-poisoning safety budget *(unchanged — M-9)*.
- **NF-T5** Storage: 64 GB FDR cap with rollover *(unchanged)*.
- **NF-T6** Imagery freshness gate (AC-NEW-6) *(unchanged)*.
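
The Monte Carlo in NF-T4 can be prototyped before any flight data exists. The sketch below is an illustration under stated assumptions, not Component 5's actual gate: it assumes a 2-D position fix compared against an independent predicted position, an isotropic innovation covariance built from the *claimed* fix sigma plus the prediction sigma, and a chi-square threshold at the 99 % quantile for 2 degrees of freedom. All names and parameter values are hypothetical.

```python
import random

CHI2_2DOF_99 = 9.21  # 99 % quantile of chi-square with 2 degrees of freedom


def gate_accepts(innovation_xy, claimed_sigma, prior_sigma,
                 threshold=CHI2_2DOF_99):
    """Mahalanobis gate on a 2-D innovation with isotropic covariance."""
    s = claimed_sigma ** 2 + prior_sigma ** 2  # per-axis innovation variance
    d2 = (innovation_xy[0] ** 2 + innovation_xy[1] ** 2) / s
    return d2 <= threshold


def rejection_rate(n, true_sigma, claimed_sigma, prior_sigma, seed=0):
    """Fraction of simulated fixes rejected by the gate.

    Over-confidence is injected by drawing the fix error with true_sigma
    while the gate only sees the (smaller) claimed_sigma.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n):
        fix = (rng.gauss(0.0, true_sigma), rng.gauss(0.0, true_sigma))
        pred = (rng.gauss(0.0, prior_sigma), rng.gauss(0.0, prior_sigma))
        innovation = (fix[0] - pred[0], fix[1] - pred[1])
        if not gate_accepts(innovation, claimed_sigma, prior_sigma):
            rejected += 1
    return rejected / n
```

Comparing `rejection_rate(2000, 5.0, 5.0, 5.0)` (honest covariance) with `rejection_rate(2000, 50.0, 5.0, 5.0)` (tenfold over-confidence) shows how much of the bad-fix population the companion-side gate removes. The over-confident fixes that still pass are the ones the AC-NEW-4 budget has to account for.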
### Security
- **S-T1** … **S-T5** *(unchanged from draft02)*.
### Field
- **FT-1** … **FT-3** *(unchanged from draft02)*.
---
## Key Risks & Open Items (carried into Plan step)
| ID | Risk | Severity | Mitigation |
|----|------|----------|------------|
| R1 | Imagery licensing lead time (Service-side) | Med | Suite Service procurement |
| R2 | Latency budget on Orin Nano Super at 1024×768 | Med | Empirical bench-off in week 1 of impl |
| R3 | Cross-view accuracy at 1 km AGL with Ukrainian seasonal change | Med | 50 %@20 m hard floor; bench-off includes SALAD/BoQ/GIM-LG/2chADCNN/**LiteSAM-as-oracle** |
| R4 | MAVSDK + pymavlink coexistence | **Resolved** (M-6) | — |
| R5 | Thermal at 25 W for 8 h | Med | NF-T3 |
| R6 | AC-7.1 in turning flight | Low | v1.1 |
| R7 | Public dataset gap (V&V) | Med | Bench-off + first internal fixed-wing flight before AC-1.3 lock |
| **R8** *(REFRAMED — M-22, M-23)* | **cuVSLAM 1 km AGL fixed-wing performance is empirically unproven** | Med | F-T1b on AerialVL fixed-wing trajectories; FT-3 first internal fixed-wing flight; documented fall-back to DPV-SLAM / VINS-Fusion |
| R9 | Cross-flight cache poisoning | High (safety) | Service-tile immutability + 2-flight voting + σ_xy hard gate + AC-NEW-7 |
| R10 | Companion↔FC link is flight-critical attack surface | High (security) | MAVLink2 signing mandatory + native routing |
| R11 | ArduPilot ExtNav source-switching gotchas | Med | F-T9 SITL matrix; pin ArduPilot to PR #30080 version |
| R12 | Eastern-Ukraine relief amplitude breaks flat-Earth assumption | Med | Per-sector DEM lookup + runtime self-classifier |
| **R13** *(new — M-27)* | **Orthority per-frame latency on Orin Nano Super may exceed budget** | Low–Med | F-T14 measurement; fall-back to `cv2.warpPerspective + bilinear DEM` (~5–20 ms estimated) |
| **R14** *(new — M-26, M-30)* | **Dropping companion-side EKF may surface FC-side covariance-handling issues** | Low–Med | F-T18 calibrator regression + F-T9 SITL Option A; if EKF3 mishandles raw inputs, re-introduce vanilla ESKF in v1.x |
| R15 *(M-29)* | Orchestrator choice (Q6 → A locked: ROS 2 Humble + Isaac ROS 3.2) | **Resolved** | — |
| **R16** *(M-35)* | **MAVLink-rated IMU may be insufficient for cuVSLAM sync sensitivity** | Low–Med | F-T1c IMU-sync-jitter measurement; v1.1 hardware revision adds dedicated companion IMU if F-T1c fails (Q7 → A locked: path (a) for v1) |
---
## Proposed AC additions
**AC-NEW-7 — Cache-poisoning safety budget** *(unchanged from draft02 — M-9)*.

**AC-NEW-8 — VO drift bound on fixed-wing 1 km AGL** *(new — M-22, M-23, R8 reframed)*. Specifically: cuVSLAM (mono+IMU) drift between satellite anchors ≤ 50 m at 95th percentile on AerialVL fixed-wing trajectories; ≤ 100 m mono-only. Validated by **F-T1b**.

**AC-NEW-9 — Companion-side covariance calibration accuracy** *(new — M-26)*. Empirical residuals of GPS_INPUT pose, computed against ground truth on F-T1b trajectories, must lie within the reported `h_acc`/`v_acc` covariance with probability ≥ 95 %. (Calibration must not under- or over-claim.) Validated by **F-T18**.

---
## Open Research (deferred to dedicated research passes before Plan)
| Topic | Why now | Output | Owner |
|-------|---------|--------|-------|
| **Cross-view matcher bench-off** *(REVISED scope — M-24, M-25)* | Inline + re-loc + offline-ceiling tracks are now distinct | Selected inline matcher; selected re-loc matcher; ceiling reference numbers; distillation candidate teacher (LiteSAM) | Research skill, follow-up Mode A pass |
| **Input-resolution sweep** | Same as draft02 | Resolution per matcher candidate; sensitivity curves | Same pass |
| **VPR backbone bench-off** | Same as draft02 | Selected VPR backbone | Same pass |
| **VO bench-off** *(new — M-22, M-23)* | cuVSLAM is the lead but unproven on 1 km AGL fixed-wing | cuVSLAM mono / cuVSLAM mono+IMU / DPV-SLAM / VINS-Fusion / OpenVINS comparison on AerialVL + first internal fixed-wing flight | Research / impl. team |
| **Tile-generator quality scoring** | Same as draft02 | Calibrated thresholds for σ_xy / sharpness / glare | Implementation phase |
| **Orthority per-frame latency on Orin Nano Super** *(new — M-27)* | Confirms or rejects M-27 library choice | F-T14 measurement; if fail → `cv2.warpPerspective + bilinear DEM` fall-back path locked | Implementation phase |
| **Internal Mavic-flight V&V dataset** | Same as draft02 | Curated, ground-truth-labelled clips | Operations / data team |
| **First internal fixed-wing flight** | Same as draft02 | Recorded sortie with synced IMU + GPS truth + nav-cam stream | Field-test plan |
| ~~Q6 — Orchestrator decision~~ | **Locked 2026-04-26**: Q6 → A (ROS 2 Humble + Isaac ROS 3.2). | — | — |
| ~~Q7 — Companion IMU strategy~~ | **Locked 2026-04-26**: Q7 → A (MAVLink `RAW_IMU` from FC for v1; dedicated IMU only if F-T1c fails). | — | — |
| **Encryption-at-rest key management** | Same as draft02 | Threat-modelled design | Phase 4 security analysis |
---
## References
All citations are by ID from `_docs/00_research/01_source_registry.md`. Mode B round 2 sources: **S58–S77** (round 1 sources S40–S57 carried over).
- **VO**: S60 (cuVSLAM), S61 (DPVO-QAT++), S62 (MASt3R-SLAM), S64 (Isaac ROS UAV reference), S71 (VINS-Fusion / OpenVINS Jetson reports), S72 (high-altitude VIO), S73 (DPV-SLAM).
- **Matcher**: S58 (LiteSAM), S63 (RoMa v2), S74 (OrthoLoC + AdHoP), S75 (AerialExtreMatch open-review).
- **Fusion**: S65 (ArduPilot ExtNav double-fusion bug), S66 (Z-axis snap bug), S67 (EKF sources spec), S68 (PX4 EKF2 ESKF PR), S69 (Sola ESKF tutorial), S70 (T-ESKF + Hybrid ESKF/UKF 2025).
- **Ortho**: S59 (Orthority).
- **Sweep**: S76 (Orin Nano Super FP16/INT8 reference points), S77 (ROS 2 / Isaac ROS practical guide).
---
## Related Artifacts
- Mode A draft: `_docs/01_solution/solution_draft01.md` (superseded by draft02 → draft03).
- Mode B round 1 draft: `_docs/01_solution/solution_draft02.md` (superseded by draft03).
- Mode B round 2 decomposition: `_docs/00_research/03_mode_b_decomposition_round2.md`.
- Mode B round 2 reasoning chain: `_docs/00_research/04_reasoning_chain_mode_b_round2.md`.
- Mode B round 2 validation log: `_docs/00_research/05_validation_log_mode_b_round2.md`.
- AC & Restrictions assessment (Phase 1): `_docs/00_research/00_ac_assessment.md` *(unchanged)*.
- Source registry: `_docs/00_research/01_source_registry.md` (S01–S77).
- Fact cards: `_docs/00_research/02_fact_cards.md` (Phase 1 + Mode B round 1 M-1..M-21 + Mode B round 2 M-22..M-35).
- Tech stack consolidation: `_docs/01_solution/tech_stack.md` (deferred — Phase 3 optional).
- Security analysis: `_docs/01_solution/security_analysis.md` (deferred — Phase 4 optional, **promoted to recommended-before-Plan-lock** because of M-6/M-7).

@@ -0,0 +1,513 @@

# Solution Draft 02
> **Mode**: B (Solution Assessment of `solution_draft01.md`).
> **Inputs**: `solution_draft01.md` (Mode A) + `_docs/00_research/{03_mode_b_decomposition,04_reasoning_chain_mode_b,05_validation_log_mode_b}.md` + Mode B sources S40–S57 in `01_source_registry.md` + Mode B fact cards M-1..M-21 in `02_fact_cards.md`.
> **Date**: 2026-04-26 (revised after user lock-in of open items Q1–Q5).
> **Self-contained**: yes — this draft is the new source of truth and supersedes `solution_draft01.md`.
>
> **Locked-in user decisions (2026-04-26)**:
>
> - **Q1** → A: GPS_INPUT + ODOMETRY hybrid output (M-1). Codified in AC-4.3.
> - **Q2** → A: distinct system-IDs via ArduPilot native MAVLink routing; **no `mavlink-router` daemon** (M-6).
> - **Q3** → A: AC-NEW-7 thresholds confirmed at P(>30 m)<1 %, P(>100 m)<0.1 % per flight (M-9). Codified in AC-NEW-7.
> - **Q4** → A: TartanAir V2 included as early-stage synthetic baseline in the bench-off (M-13).
> - **Q5** → B: proceed to Plan in a fresh conversation (no further Mode B round).
> - Camera spec → ADTi 20MP 20L V1 APS-C; storage zoom → z=20 (M-20). Codified in `restrictions.md`.
---
## Assessment Findings
The "old solution → weak point → new solution" table for the v1 commitments. Every row references the corresponding fact card (M-X) for traceability. **15 findings**: 4 high-severity functional, 2 high-severity security, 1 high-severity safety, 1 high-severity-positive (latency easier than thought), 6 medium, 1 open question.

| Old Component Solution | Weak Point | New Solution |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **C-6**: emit `GPS_INPUT` only via pymavlink (`GPS1_TYPE=14`) — covariance collapsed to scalar `h_acc`/`v_acc`. | **Functional** (M-1). ArduPilot dev docs (S41) call **ODOMETRY the preferred external-nav channel**; ODOMETRY carries quaternion + 6-DoF covariance + native quality field. GPS_INPUT-only under-utilises the FC's EKF3 and erases our yaw covariance — directly hurts AC-NEW-4 (false-position safety). | **Hybrid output**. GPS_INPUT remains the primary "GPS-substitute" channel (matches AC-4.3 framing). When the companion EKF emits a fix with full 6-DoF covariance and observability, **also emit ODOMETRY** so EKF3 can fuse the richer signal. FC source priorities are configured so that GPS_INPUT is the failover if ODOMETRY trips VISO_QUAL_MIN. |
| **C-3**: bench-off shortlist = {SP+LG, XFeat sparse, XFeat semi-dense, MASt3R (stretch), RoMa/DKM (bench-off candidate), classical (last-resort)}. | **Functional** (M-2). MASt3R `mast3r-runtime` lists Jetson Orin support as **"Planned"**, not implemented (S57). Speedy MASt3R = 91 ms/pair on **A40 GPU**; Orin Nano Super throughput ≈ 1/30 of A40 → MASt3R ≈ **2.5–3 s/pair**, ~7× over the 400 ms p95 budget. | Drop MASt3R from the v1 bench-off; mark it **research-track-only** (long-horizon distillation experiment). |
| **C-3**: bench-off shortlist (same row, expansion). | **Functional** (M-3). GIM (S48, ICLR 2024 spotlight) gives drop-in 8.4–18.1 % zero-shot improvement over LightGlue/RoMa/DKM/LoFTR by self-training on internet videos. Same TRT path as vanilla SP+LG, better cross-domain transfer — exactly our regime (zero training data on eastern-Ukraine 1 km AGL). | Add **GIM-LightGlue** to the bench-off as a peer of vanilla SP+LG. |
| **C-2**: VPR shortlist = AnyLoc (primary) + MixVPR (degraded-power fast lane). | **Functional** (M-4). Two CVPR 2024 papers landed after the Mode A draft was written: **DINOv2 SALAD** (S47) — DINOv2 + Sinkhorn-VLAD, R@1 = 75 % MSLS Challenge / 92.2 % MSLS Val / 76 % NordLand; **BoQ** (S46) — bag of learnable queries, beats NetVLAD/MixVPR/EigenPlaces/Patch-NetVLAD/TransVPR/R2Former on 14 benchmarks. | VPR shortlist grows to **{AnyLoc, SALAD, BoQ, MixVPR}**. AnyLoc retained as training-free fallback; SALAD and BoQ are likely primaries. |
| **C-2 / C-9**: latency budget for AnyLoc (DINOv2 ViT-B) = 50–80 ms/inf at 224×224 (estimated). | **Performance, positive direction** (M-5). Jetson AI Lab L1 measurements (S40): **DINOv2-base-patch14 = 126 inf/s = ~8 ms/inf at 224×224** on Orin Nano Super (FP16 trtexec). Real number is ~6–10× better than draft's estimate. | AC-4.1 (400 ms p95) is comfortably feasible. **R2 (latency) downgraded High → Medium**. Empirical confirmation still required, but no longer make-or-break. |
| **C-6**: "MAVSDK + pymavlink share the same serial / TCP MAVLink endpoint via a single `mavlink-router` instance." | **Security** (M-6). mavlink-router has a public, fuzzing-discovered, easily-triggered **stack-based buffer overflow** in config-file parsing (S45 issue #436). Repo has **no SECURITY.md**, no formal advisory process. Drops a known-vulnerable C++ daemon onto a flight-critical companion. | **Replace mavlink-router**. v1 default: distinct system-IDs for MAVSDK and pymavlink, sharing the serial port via ArduPilot's native MAVLink routing — no router daemon at all. v1.1 fallback: in-process MAVLink endpoint multiplexer (~150 LOC). |
| **C-6 / Security**: "MAVLink2 signing is recommended (deferred to a Phase-4 security pass)." | **Security** (M-7). GPS_INPUT (and now ODOMETRY) is a high-trust local channel feeding the flight-critical EKF. Without signing, anyone with serial-line access on the airframe can crash the vehicle by injecting a malicious fix. Cost of enabling signing is one operator key-provisioning step per airframe (S44). | Promote MAVLink2 signing to **v1 hard configuration item**. Document the key-provisioning procedure in the deploy runbook. Verify signing-on at boot; refuse to inject GPS_INPUT/ODOMETRY if the FC reports signing-off on our link. |
| **C-1**: "Tile format = MBTiles SQLite + per-tile metadata. Single file, mmap-friendly, ubiquitous." | **Performance** (M-8). Default SQLite rollback journal mode + concurrent reader (matcher cache lookup at ≤3 fps × ~30 candidate tiles) + writer (Component 1b ortho-tile write at ≤1–2 Hz × ~30 tiles) → guaranteed `database is locked` failures (S54). | Specify **MBTiles SQLite + WAL + connection pool + per-cycle transaction batching**. Multiple read connections + one write connection. Tile-cache lookup p95 ≤ 5 ms is now a measurable AC-4.1 sub-budget. |
| **C-1b**: tile dedup rule "If cache has stale service tile AND our quality > existing → write (overwrites with `source = onboard`)". Quality = inlier count + sharpness. | **Safety** (M-9). EKF over-confidence (a known failure mode) escapes the σ_xy ≤ 10 m generation gate; a confidently-bad pose writes a misaligned tile that becomes the next flight's anchor → cross-flight error compounding. AC-NEW-4 doesn't model this. | (a) **Service tiles are immutable within freshness budget** — onboard tiles overwrite only stale or other-onboard tiles. (b) **Voting layer at the Service ingest**: onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment. (c) Quality score includes **parent-pose covariance as a hard gate** (σ_xy ≤ 5 m, tighter than the 10 m generation gate); tiles above that gate are marked "soft" in their sidecar. (d) New AC: **AC-NEW-7 — cache-poisoning safety budget**: P(onboard tile mis-aligned > 30 m) per flight < 1 %; P(>100 m) per flight < 0.1 %. |
| **C-9**: "single Python process (asyncio) with TRT inference workers via CUDA IPC for tensor handoff." | **Functional** (M-10). Free-threaded Python 3.13 is **experimental**, has substantial single-threaded perf hit, and **GIL re-enables on import of any non-FT-aware C extension** (S55) — which would silently include numba, possibly TRT bindings, possibly older pymavlink. Free-threading is not a v1 escape hatch. | Stay on **CPython 3.11 or 3.12** for v1. Sharpen the rationale: the choice is "asyncio + TRT subprocess workers + numba JIT on hot path is the production-ready combination today; revisit free-threading in v1.1 once NumPy/SciPy/numba/TRT bindings stabilise on PEP 703". |
| **C-5 / C-6**: source-promotion logic "we **immediately** promote our `GPS_INPUT` to fix_type=3D and assert" on FC fix degradation. | **Functional / safety** (M-11). ArduPilot's external-nav source-switching path has known production gotchas (S41, S42 PR #19563, S43 PR #30080 active 2025): companion-derived velocity errors, position-estimate resets when external-nav reference is lost, conflicts when running alongside GPS. AC-NEW-2 (3 s spoofing-promotion latency) **is** that path. | Promote **F-T9 SITL coverage of source-switching** from "verify the loop closes" to a **hard test gate**. Test matrix: jam-onset → our channel; spoofed-real-GPS recovery → operator-confirmed source-restore; `EK3_SRC1_*` parameter combinations across both GPS_INPUT-primary and ODOMETRY-primary. Pin ArduPilot to the version containing PR #30080. |
| **C-1b / R-Terrain**: "flat-Earth model" everywhere; "operational area is flat steppe (R-Terrain)". | **Functional / safety** (M-12). Eastern-Ukraine relief amplitude reaches **~24 m peak-to-trough** in Kharkiv survey areas (S56), with creek + gully (yary/balky) systems. At 1 km AGL with 35° HFOV, that produces ~17 m horizontal misalignment at frame edge under flat-Earth ortho. Inside AC-1.1 (50 m@80 %) but eats into AC-1.2 (20 m@50 %). | **Per-sector DEM lookup** in pre-flight. Classify sectors: **flat** (≤5 m amplitude, full anchor weight), **moderate** (5–15 m, weight × 0.7), **rugged** (>15 m, skip ortho-tile generation, weight × 0.3 with `rugged_sector` telemetry flag). Use SRTM 30 m DEM (free; ~30 MB for 400 km²). Add a runtime self-classifier: if matcher RANSAC inlier ratio drops < threshold for K consecutive frames, auto-promote the sector to "rugged" for the rest of the flight. |
| **C-3 / V&V**: bench-off targets = AerialVL + UAV-VisLoc + internal Mavic. | **Functional** (M-14). None of those grade **extreme-pitch / extreme-scale / extreme-overlap separately**. AerialExtreMatch (S49, 1.5 M synthetic pairs, 32 difficulty levels) covers exactly the failure-mode axes that matter; **2chADCNN** (S50, MDPI Drones 2023) is a published season-robustness ceiling reference. | Add **AerialExtreMatch** as a primary structured-difficulty regression bench. Use **2chADCNN as a season-robustness ceiling reference number only** — *not* as a bench-off candidate. (2chADCNN's outputs are template-overlap regions, not sub-pixel keypoints; its tested altitude band is 252–500 m, not 1 km; and it has no Jetson benchmark. Keypoint-grade modern matchers — SP+LG, GIM-LightGlue, GIM-RoMa — are the bench-off candidates.) |
| **C-4 / AC-1.3**: "<100 m drift VO-only / <50 m with IMU" budget — implicit confidence based on ORB-SLAM3 / VINS-Fusion baselines. | **Functional** (M-15). S52 (AFIT thesis) — SVO/DSO/ORB-SLAM2 all **had significant difficulty** maintaining localisation on real fixed-wing flights. Our framing (VO between satellite anchors, not standalone metric SLAM) is correct, but the AC-1.3 budget needs validation against a real fixed-wing baseline — *not* Mavic-class footage. | New risk **R8 — fixed-wing VO drift under AC-1.3 budget is unconfirmed**. Mitigations: (a) borrow AerialVL's 70 km of fixed-wing trajectories for **F-T1b** AC-1.3 regression; (b) plan the first internal fixed-wing flight before AC lock, not as a stretch. |
| **R7 / Datasets**: "MidAir / synthetic IMU is dropped; AerialVL primary; internal Mavic for deployment-domain proxy." | M-13. TartanAir V2 (S51) is photo-realistic synthetic with **native IMU + 12-cam + 65 environments + season variation + custom camera models**, configurable motion patterns — dynamics-mismatch argument weaker than for MidAir. | **CONFIRMED (Q4 = A, 2026-04-26)** — TartanAir V2 added as early-stage synthetic baseline alongside AerialVL + UAV-VisLoc + AerialExtreMatch + 2chADCNN-season-set + Mavic. Used for sweeping seasons / lighting / pitches before real fixed-wing flight (FT-3) lands. |
| **C-2 / Granularity**: "FAISS IVF over **per-tile** DINOv2-VLAD vectors" using z=20 storage tiles (~154 × 154 m). | **Functional, high** (M-16). A 1 km AGL frame covers 30–100 z=20 tiles. Cosine similarity between a frame descriptor (covers ~600 × 450 m of ground) and a single-tile descriptor (covers 154 × 154 m) is fundamentally mismatched. None of AerialVL / AnyLoc / NaviLoc do per-storage-tile retrieval; they use frame-footprint-sized reference chunks with overlap. | **Decouple VPR chunk from storage tile.** Storage tile = z=20 / 512×512 (kept for orthorect + dedup). **VPR chunk** = ground-footprint-sized window (e.g. ~600 × 450 m at the deployment altitude band) with **40–50 % overlap**, optionally multi-scale across altitude bands. FAISS index is over chunks, not tiles. Frame descriptor is computed once per *invoked* frame after IMU-heading de-rotation. |
| **C-2 / Invocation**: VPR runs on every retrieval cycle. | **Performance, medium** (M-17). VPR's value is concentrated in re-loc paths (cold start, sharp turn, disconnected segment, large σ_xy). In steady state — recent anchor < 2 s, σ_xy < 20 m, VO healthy — a **geometric prior** from IMU+VO predicted position picks top-K candidate chunks by distance alone, no DINOv2 forward needed. | **Conditional VPR invocation.** `if (steady_state) { rank top-K by geometric distance } else { invoke VPR }`. Saves ~10–35 ms/frame in cruise. DINOv2 TRT engine stays resident for low-latency wake-up. |
| **C-2 / Fallback**: no defined behaviour when top-1 retrieval is "unconvinced". | **Resilience, medium** (M-18). If top-1 similarity is below threshold OR top-1/top-2 similarity gap is below threshold, the system today goes straight to "no anchor → VO/IMU dead-reckoning" — wasteful, since an adjacent chunk is often correct. | **Expanding-window retry.** On unconvincing top-1, expand the candidate set to adjacent VPR chunks (±1 in each direction; ~8 neighbours for square-grid layout) and let the matcher (Component 3) decide via inlier ratio + reprojection error. Same FAISS index, larger K, no extra DINOv2 forward. |
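
The last two rows of the table (conditional VPR invocation, M-17, and expanding-window retry, M-18) amount to a small decision procedure. The following is a minimal sketch of that logic only. The 2 s anchor age and 20 m σ_xy limits come from the table; everything else (the `NavState` fields, `k`, the similarity thresholds, the `(col, row)` chunk grid, and the `vpr_query` callable) is a hypothetical stand-in for whatever the real components expose.

```python
import math
from dataclasses import dataclass


@dataclass
class NavState:
    seconds_since_anchor: float
    sigma_xy_m: float
    vo_healthy: bool
    predicted_xy: tuple  # IMU+VO predicted position, metres in a local frame


def is_steady_state(state, max_anchor_age_s=2.0, max_sigma_xy_m=20.0):
    """Steady state per M-17: recent anchor, small sigma_xy, healthy VO."""
    return (state.vo_healthy
            and state.seconds_since_anchor < max_anchor_age_s
            and state.sigma_xy_m < max_sigma_xy_m)


def select_candidates(state, chunk_centres, vpr_query, k=5):
    """Return (candidate chunk ids, path taken).

    chunk_centres maps chunk id -> (x, y); vpr_query(k) is a callable that
    runs the descriptor forward pass + index search (hypothetical interface).
    """
    if is_steady_state(state):
        ranked = sorted(
            chunk_centres,
            key=lambda c: math.dist(chunk_centres[c], state.predicted_xy),
        )
        return ranked[:k], "geometric"
    return vpr_query(k), "vpr"


def needs_expansion(similarities, min_top1=0.6, min_gap=0.05):
    """True when top-1 is weak or not clearly separated from top-2 (M-18)."""
    top = sorted(similarities, reverse=True)
    if top[0] < min_top1:
        return True
    return len(top) > 1 and (top[0] - top[1]) < min_gap


def neighbours(cell):
    """The 8 adjacent chunks of a (col, row) cell on a square grid."""
    col, row = cell
    return [(col + dc, row + dr)
            for dc in (-1, 0, 1) for dr in (-1, 0, 1)
            if (dc, dr) != (0, 0)]
```

When `needs_expansion` fires, the candidate set passed to the matcher would be the top-1 chunk plus `neighbours(top1_cell)`, and the matcher's inlier ratio and reprojection error decide between them, with no additional descriptor forward pass.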
---
## Product Solution Description
A companion-computer software stack that runs on the **Jetson Orin Nano Super** alongside an **ArduPilot 4.5+** flight controller and provides **GPS-equivalent position fixes** to the autopilot when real GPS is jammed, spoofed, or denied. It does so by continuously matching the downward navigation-camera feed against a **pre-cached satellite basemap supplied by the Azaion Suite Satellite Service** and fusing match-derived absolute positions with onboard **Visual Odometry** and the autopilot's high-rate **IMU** in a loosely-coupled EKF. The fused estimate is exported on **two MAVLink channels in parallel**: `GPS_INPUT` (the primary "GPS-substitute" channel matching AC-4.3) and `ODOMETRY` (when our pose has full 6-DoF covariance, so the FC's EKF3 can fuse the richer signal).

During flight the system also **generates fresh tiles** from the navigation camera, classifies each sector against a pre-loaded SRTM-30 m DEM (skip rugged sectors), deduplicates new tiles against the existing cache (service tiles immutable within freshness budget), and uploads the new tiles to the **Suite Satellite Service candidate pool** on landing — where a **2-flight voting layer** promotes onboard tiles to "trusted basemap" only after independent confirmation. **No raw frames are persisted** — the tile is the unit of storage.

A separate path computes ground-projected GPS coordinates for objects detected by the AI camera using gimbal angle, airframe attitude, and altitude.

The MAVLink endpoint is shared between MAVSDK (telemetry) and pymavlink (`GPS_INPUT` + `ODOMETRY`) by **distinct system-IDs through ArduPilot's native MAVLink routing** — no `mavlink-router` daemon. **MAVLink2 signing is mandatory in v1** between companion and FC, with a documented per-airframe key-provisioning procedure.
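
The per-sector terrain classification mentioned in the description above uses the amplitude bands and anchor weights from finding M-12 (flat ≤ 5 m, moderate 5–15 m at weight × 0.7, rugged > 15 m at weight × 0.3 with ortho-tile generation skipped). A minimal sketch of that rule and of the runtime self-classifier follows. The class and function names are illustrative, and the inlier-ratio threshold and frame count `K` are placeholder values that the draft leaves open.

```python
from dataclasses import dataclass

FLAT_MAX_M = 5.0       # amplitude bound for "flat" (M-12)
MODERATE_MAX_M = 15.0  # amplitude bound for "moderate" (M-12)


@dataclass(frozen=True)
class SectorClass:
    name: str
    anchor_weight: float
    generate_tiles: bool


FLAT = SectorClass("flat", 1.0, True)
MODERATE = SectorClass("moderate", 0.7, True)
RUGGED = SectorClass("rugged", 0.3, False)


def classify_sector(elevations_m):
    """Classify a sector from its DEM samples by peak-to-trough amplitude."""
    amplitude = max(elevations_m) - min(elevations_m)
    if amplitude <= FLAT_MAX_M:
        return FLAT
    if amplitude <= MODERATE_MAX_M:
        return MODERATE
    return RUGGED


class RuntimeSelfClassifier:
    """Promotes a sector to rugged after K consecutive low-inlier frames.

    min_inlier_ratio and k_frames are ASSUMED placeholder values.
    """

    def __init__(self, min_inlier_ratio=0.3, k_frames=5):
        self.min_inlier_ratio = min_inlier_ratio
        self.k_frames = k_frames
        self.streak = 0
        self.promoted = False

    def update(self, inlier_ratio):
        """Feed one frame's RANSAC inlier ratio; returns the promoted flag."""
        if inlier_ratio < self.min_inlier_ratio:
            self.streak += 1
        else:
            self.streak = 0
        if self.streak >= self.k_frames:
            self.promoted = True  # sticky for the rest of the flight
        return self.promoted
```

The promotion is deliberately one-way in this sketch, matching the "for the rest of the flight" wording of the finding.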
```
Pre-flight (ground)
┌────────────────────────────────────────────────┐
│ Azaion Suite Satellite Service                 │
│ (sources commercial / agency imagery;          │
│  ingests onboard tiles via candidate pool +   │
│  2-flight voting layer)                        │
└──────────────┬───────────────────┬─────────────┘
               │ sync down         │ upload back (post-flight, candidate pool)
               ▼                   ▲
┌─────────────────┐
│ DEM (SRTM 30 m) │ ─────► sector classification (flat/moderate/rugged)
└─────────────────┘
Onboard (in-flight)
Nav Cam: ADTi 20MP, 3 fps           AI Cam (gimbal+zoom, on-demand)
             │                                 │
             ▼                                 ▼
┌────────────────────────────────┐  ┌──────────────────────┐
│ GPS-Denied Pipeline            │  │ Object Geo-Locator   │
│ ┌──────────────────────┐       │  │ (pinhole + ATTITUDE  │
│ │ Visual Odometry      │       │  │  MAVLink fusion)     │
│ │ (SP+LG 2-frame homog)│       │  └──────────┬───────────┘
│ └──────────┬───────────┘       │             │
│            ▼                   │             │
│ ┌──────────────────────┐       │             │
│ │ Place Recognition    │←──┐   │             │
│ │ (SALAD/BoQ lead,     │   │   │             │
│ │  AnyLoc fallback)    │   │   │             │
│ └──────────┬───────────┘   │   │             │
│            ▼               │   │             │
│ ┌──────────────────────┐   │   │             │
│ │ Cross-view Matcher   │   │   │             │
│ │ (SP+LG TRT/FP16 lead │   │   │             │
│ │  + GIM-LG bench peer)│   │   │             │
│ └──────────┬───────────┘   │   │             │
│            ▼               │   │             │
│ ┌──────────────────────┐   │   │             │
│ │ EKF Fusion           │←──┼───┼── IMU (FC)  │
│ │ (loose-coupled,      │   │   │             │
│ │  Python + numba)     │   │   │             │
│ └──────────┬───────────┘   │   │             │
│            │               │   │             │
│            ├──► Ortho-Tile Generator ──► Tile Cache (NVMe, MBTiles+WAL+pool)
│            │    (skip if rugged sector;        │   ▲
│            │     σ_xy hard gate ≤5m for hard   │   │ dedup w/
│            │     write; soft tiles flagged)    │   │ service-tile
│            │                                   └───┘ immutability
│            ▼                   │
└────────────┼───────────────────┘
             ▼
GPS_INPUT (pymavlink, signed MAVLink2) ─────► ArduPilot (GPS1_TYPE=14)
ODOMETRY  (pymavlink, signed MAVLink2) ─────► ArduPilot (EK3_SRC1_* = ExternalNav)
             │
             ▼
Telemetry summary 1–2 Hz ───────────► QGroundControl (signed)
             │
             ▼
Flight Data Recorder (NVMe, 64 GB cap)
(tiles + telemetry + IMU + tlog + per-sector flags; NO raw frames)
             │
             ▼
Post-flight (landing)
┌────────────────────────────────────────────────┐
│ Tile uploader → Suite Satellite Service        │
│ → CANDIDATE POOL                               │
│ → 2-flight voting → trusted-basemap promotion │
└────────────────────────────────────────────────┘
```
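
The tile cache in the diagram is labelled "MBTiles+WAL+pool". The sketch below illustrates, with Python's standard `sqlite3` module only, why WAL matters for the reader/writer mix described in finding M-8: a reader connection keeps serving the last committed snapshot while the single writer holds an open batch transaction. The schema is MBTiles-style, and the function names and the demo are hypothetical. A real pool would hand out several reader connections rather than one.

```python
import os
import sqlite3
import tempfile


def open_writer(path):
    """Open the single write connection and switch the file to WAL mode."""
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tiles ("
        "zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, "
        "tile_data BLOB, "
        "PRIMARY KEY (zoom_level, tile_column, tile_row))"
    )
    conn.commit()
    return conn, mode


def open_reader(path):
    """Open one read connection; query_only guards against stray writes."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA query_only=ON")
    return conn


def write_batch(conn, tiles):
    """Write one cycle's tiles in a single transaction (per-cycle batching)."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)", tiles
        )


def count_tiles(conn):
    return conn.execute("SELECT COUNT(*) FROM tiles").fetchone()[0]


def demo():
    """Show that a reader is not blocked by an open write transaction."""
    path = os.path.join(tempfile.mkdtemp(), "cache.mbtiles")
    writer, mode = open_writer(path)
    write_batch(writer, [(20, 1, 1, b"tile-a")])
    reader = open_reader(path)
    writer.execute("BEGIN IMMEDIATE")  # writer now holds the write lock
    writer.execute("INSERT OR REPLACE INTO tiles VALUES (20, 2, 2, x'00')")
    during = count_tiles(reader)       # reader sees the last committed state
    writer.commit()
    after = count_tiles(reader)
    return mode, during, after
```

In the default rollback-journal mode the same interleaving is what leads to `database is locked` errors under load. WAL removes reader/writer blocking but still allows only one writer at a time, which is why the design keeps a single write connection.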
---
## Architecture
### Overall principles
1. **Pipeline = stages with explicit confidence**. Each stage emits a pose hypothesis + covariance + categorical label. Downstream EKF fuses by covariance.
2. **All heavy NN inference runs on GPU via TensorRT** (FP16, INT8 where validated). Pre-extract satellite-tile descriptors offline (AC-8.3).
3. **Single-process Python orchestrator** (asyncio, **CPython 3.11/3.12**) owns I/O, MAVLink, telemetry, FDR. **Inference workers** are TRT-engine processes on GPU. Free-threaded Python deferred to v1.1 (M-10).
4. **Persistent satellite cache** across flights (~10 GB for 400 km²); per-flight FDR (AC-NEW-3) is separate.
5. **Every output to the FC carries a covariance** — both GPS_INPUT (`h_acc`, `v_acc`, `vel_acc`) and ODOMETRY (full 21-element matrix). Never a bare lat/lon.
6. **Service tiles are basemap truth**; onboard tiles are candidate input that goes through a Service-side voting layer before becoming basemap (M-9).
|
||||
7. **MAVLink2 signing on every companion↔FC link** (M-7). USB bypasses signing — bench-only access.
|
||||
|
||||
---
|
||||

### Component 1: Satellite Tile Cache & Descriptor Index

| Aspect | Choice | Rationale / change vs. Mode A |
| --- | --- | --- |
| Tile format | **MBTiles SQLite + WAL mode + connection pool + per-cycle transaction batching** | M-8: WAL is mandatory under our concurrent reader+writer load. Pool gives multiple read connections + one write connection. Without these, `database is locked` errors are guaranteed. |
| Tile coordinate system | Slippy-map XYZ at zoom 20 (~30 cm/px) | Unchanged. |
| Tile size | 512 × 512 | Unchanged. |
| Descriptor index | FAISS IVF (cosine) over per-tile DINOv2-VLAD vectors | Unchanged. **New constraint**: index loadable on ≤4 GB GPU RAM (Jetson budget, M-5 / W13 cross-check). |
| Per-tile keypoints | SuperPoint + LightGlue descriptors precomputed pre-flight | Unchanged. **Parallel index** for GIM-LightGlue keypoints if the bench-off picks GIM. |
| Freshness metadata | `capture_date`, `sector_class ∈ {active,stable}`, `source ∈ {service,onboard}`, `terrain_class ∈ {flat,moderate,rugged}`, `trust_level ∈ {basemap,candidate,soft}` | Adds `terrain_class` (M-12) and `trust_level` (M-9). |
| Encryption at rest | AES-GCM, key from secure element on the FC or the companion's TPM | Unchanged. |
| **Service-tile immutability** | **Service-source tiles are immutable within freshness budget; onboard tiles overwrite only stale or other-onboard tiles** | **New (M-9).** Critical to prevent cross-flight cache poisoning. |

**Cache storage budget.** ~10 GB persistent (shared across flights) for the 400 km² operational-area cache, plus ~30 MB for SRTM-30 m DEM coverage (M-12).

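
The tile-format row above can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch rather than the project's cache layer: the helper names (`open_writer`, `open_reader`, `write_batch`), the 5 s busy timeout, `synchronous=NORMAL`, and the use of `PRAGMA query_only` for read connections are assumptions made for the example. Only WAL mode, the one-writer / many-readers split, and per-cycle transaction batching come from the table. The `tiles` layout follows the standard MBTiles schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_writer(path):
    """Open the single write connection and switch the database to WAL mode."""
    conn = sqlite3.connect(path, timeout=5.0)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode != "wal":
        raise RuntimeError(f"WAL not enabled (journal_mode={mode})")
    conn.execute("PRAGMA synchronous=NORMAL")  # assumption: common pairing with WAL
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tiles ("
        "zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, "
        "tile_data BLOB, "
        "PRIMARY KEY (zoom_level, tile_column, tile_row))"
    )
    conn.commit()
    return conn


def open_reader(path):
    """Open one read connection; a pool would hold several of these."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA query_only=ON")  # reject accidental writes on readers
    return conn


def write_batch(conn, tiles):
    """Write every tile produced in one cycle inside a single transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany("INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)", tiles)


# Usage: one writer, one reader, one batched cycle.
db_path = str(Path(tempfile.mkdtemp()) / "cache.mbtiles")
writer = open_writer(db_path)
reader = open_reader(db_path)
write_batch(writer, [(20, 1, 2, b"tile-a"), (20, 1, 3, b"tile-b")])
tile_count = reader.execute("SELECT COUNT(*) FROM tiles").fetchone()[0]
journal_mode = reader.execute("PRAGMA journal_mode").fetchone()[0]
```
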

---

### Component 1b: Ortho-Tile Generator (in-flight tile creation & write-back)

**Responsibility (AC-8.4).** Same as Mode A draft, with the following changes:

**Pipeline per frame:**

1. **Eligibility check** (changed). Skip tile generation when:
   - EKF source label is `dead_reckoned`.
   - **σ_xy > 5 m** (was 10 m — M-9 hard gate).
   - Airframe roll/pitch (from FC `ATTITUDE`) > 10°.
   - VPR + cross-view match returned no inliers.
   - **Sector is classified as `rugged` in the pre-flight DEM lookup** (M-12) — skip ortho-tile generation entirely for that sector.

   For sectors classified as `moderate`, generate but flag the tile sidecar `terrain_uncertainty=true`.
2. **Orthorectification.** Pinhole projection on the **per-sector DEM** (flat-Earth in flat sectors; SRTM-30 m DEM lookup in moderate sectors).
3. **Resampling to basemap projection.** Unchanged.
4. **Quality scoring** (changed). Adds **σ_xy from EKF as hard gate**:
   - `sharpness` (variance of Laplacian),
   - `coverage` (fraction inside source frame),
   - `match_inliers` (RANSAC inlier count),
   - `parent_pose_sigma_xy` (EKF position covariance — a tile written with σ_xy ∈ (3, 5] m is `trust_level = soft`; σ_xy ≤ 3 m is `trust_level = candidate`),
   - `glare/cloud` flag.
5. **Deduplication / write decision** (changed per M-9):
   - If cache has no tile at that key → **write** (`source = onboard`, `trust_level = candidate` or `soft`).
   - If cache has a tile and it's `source = service` and **within** AC-8.2 freshness budget → **never overwrite** (was: overwrite if our quality > existing).
   - If cache has a tile and it's `source = service` and **outside** AC-8.2 freshness budget → write only if our parent-pose σ_xy ≤ 3 m AND quality score > existing.
   - If cache has a tile from `source = onboard` from this same flight, but our quality is materially better → write.
   - Otherwise → skip.
6. **Sidecar metadata** (extended). Includes `parent_pose_sigma_xy`, `terrain_class`, `trust_level`.

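
The σ_xy gate from step 4 and the write rules from step 5 can be sketched as two small functions. This is an illustrative reading of the rules, not the implementation: `CachedTile`, `trust_level`, `should_write`, and the `QUALITY_MARGIN` stand-in for "materially better" are hypothetical, and the other eligibility checks of step 1 (source label, attitude, inliers, terrain class) are assumed to have passed already.

```python
from dataclasses import dataclass
from typing import Optional

HARD_GATE_M = 5.0        # sigma_xy above this: never write
CANDIDATE_GATE_M = 3.0   # sigma_xy at or below this: trust_level = candidate
QUALITY_MARGIN = 0.10    # hypothetical threshold for "materially better"


@dataclass
class CachedTile:
    source: str              # "service" or "onboard"
    quality: float           # scalar quality score of the cached tile
    within_freshness: bool   # inside the AC-8.2 freshness budget
    same_flight: bool = False


def trust_level(sigma_xy_m: float) -> Optional[str]:
    """Map parent-pose sigma_xy to a trust level; None means the hard gate failed."""
    if sigma_xy_m > HARD_GATE_M:
        return None
    return "candidate" if sigma_xy_m <= CANDIDATE_GATE_M else "soft"


def should_write(existing: Optional[CachedTile], sigma_xy_m: float, quality: float) -> bool:
    """Apply the dedup / write rules of step 5 in the order they are listed."""
    if trust_level(sigma_xy_m) is None:
        return False
    if existing is None:
        return True
    if existing.source == "service":
        if existing.within_freshness:
            return False  # service tiles are immutable inside the budget
        return sigma_xy_m <= CANDIDATE_GATE_M and quality > existing.quality
    if existing.source == "onboard" and existing.same_flight:
        return quality > existing.quality + QUALITY_MARGIN
    return False


fresh_service = CachedTile(source="service", quality=0.1, within_freshness=True)
stale_service = CachedTile(source="service", quality=0.1, within_freshness=False)
decisions = [
    should_write(None, 4.0, 0.5),           # empty slot, soft tile
    should_write(fresh_service, 1.0, 0.9),  # never overwrite fresh service tile
    should_write(stale_service, 1.0, 0.9),  # stale service tile, tight sigma
    should_write(stale_service, 4.0, 0.9),  # stale service tile, sigma too loose
]
```
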

**Post-flight uploader** (changed). Onboard tiles are pushed to the Suite Service **candidate pool**, not directly to the basemap. Service ingest applies the **2-flight voting rule** (M-9) before promoting candidate tiles to trusted basemap. Tiles already at `trust_level = soft` upload but with the soft-trust flag preserved.

---

### Component 2: Visual Place Recognition (Global Retrieval)

**Role.** VPR is a **resilience module**, not an every-frame primary-loop module. Its job is to narrow ~10⁴–10⁵ candidate ground-footprint chunks down to a top-K (5–10) when a geometric prior from IMU+VO is unavailable or untrusted. In steady-state cruise we use the geometric prior alone; we invoke VPR on **re-loc triggers** (M-17). VPR is essential for the resilience ACs (AC-NEW-1 cold start, AC-3.2 sharp turn re-loc, AC-3.3 disconnected segment); it is *not* essential to every steady-state frame.

#### Retrieval unit (revised — M-16)

The VPR retrieval unit is **decoupled** from the storage tile:

| Concept | Size | Purpose |
| --- | --- | --- |
| **Storage tile** (Component 1) | z=20 slippy XYZ, 512×512 (~154 × 154 m ground) | Orthorectification, dedup, basemap update. Storage layer only. |
| **VPR chunk** *(new)* | Ground-footprint-sized to expected frame coverage (~600 × 450 m at 1 km AGL with the v1 lens — re-pinned per camera spec); **40–50 % overlap** between adjacent chunks; optionally multi-scale across altitude bands | The unit FAISS retrieval works on. Decoupled from storage so any frame footprint always falls inside ≥1 chunk regardless of position. |

The FAISS IVF index is over **VPR chunk descriptors**, not storage-tile descriptors. Chunks are derived from the storage tile cache pre-flight (one batch DINOv2 forward per chunk); refreshed when tiles inside a chunk change beyond a threshold. Index size for a 400 km² operational area at ~600×450 m chunk size with 50 % overlap ≈ **6 000–8 000 entries**, well within FAISS-on-Jetson memory.

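
As a rough cross-check of the index-size estimate, the chunk count for a regular grid can be worked out directly. The sketch below assumes the 400 km² area is a 20 km × 20 km square and that chunks sit on a uniform grid at a single scale; both are simplifications, and `chunk_grid` is a hypothetical helper. Under those assumptions 50 % overlap gives just under 6 000 chunks, close to the lower end of the quoted range; an irregular area boundary or additional altitude-band scales would add entries.

```python
import math


def chunk_grid(area_w_m, area_h_m, chunk_w_m, chunk_h_m, overlap):
    """Count chunks needed to cover a rectangle with the given fractional overlap."""
    stride_w = chunk_w_m * (1.0 - overlap)  # centre-to-centre spacing, east-west
    stride_h = chunk_h_m * (1.0 - overlap)  # centre-to-centre spacing, north-south
    cols = math.ceil(area_w_m / stride_w)
    rows = math.ceil(area_h_m / stride_h)
    return cols, rows, cols * rows


# Assumption: 400 km^2 modelled as a 20 km x 20 km square, 600 x 450 m chunks.
cols_50, rows_50, n_50 = chunk_grid(20_000, 20_000, 600, 450, overlap=0.50)
cols_40, rows_40, n_40 = chunk_grid(20_000, 20_000, 600, 450, overlap=0.40)
print(f"50% overlap: {cols_50} x {rows_50} = {n_50} chunks")
print(f"40% overlap: {cols_40} x {rows_40} = {n_40} chunks")
```
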

Frame descriptor pipeline (only on VPR invocation): **IMU-heading de-rotate frame → downsample to backbone input size → DINOv2 forward → VLAD/SALAD/BoQ aggregator → cosine retrieval against FAISS chunk index**.

#### Invocation policy (revised — M-17)

```
on each EKF cycle:
    if steady_state(last_anchor_age < 2s, sigma_xy < 20m, vo_healthy):
        candidates = top_K_chunks_by_predicted_position()   # geometric prior, no DINOv2
    else:
        # Re-loc path — cold start, sharp turn, disconnected segment, sigma_xy > 50m, VO failed
        candidates = vpr_top_K_chunks(frame_descriptor)     # DINOv2 + FAISS
    if not convincing_match(candidates):                    # M-18
        candidates = expand_to_adjacent_chunks(candidates)
    pose, covariance = matcher_pnp(frame, candidates)       # Component 3
```

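
A runnable version of the policy, with stub callables standing in for Components 1–3, might look like the sketch below. `NavState`, `select_candidates`, and the stub lambdas are hypothetical names for illustration. Two assumptions are worth flagging: any cycle that is not steady-state takes the re-loc path (so a σ_xy between 20 m and 50 m invokes VPR), and the convincing-match check is applied to the candidates from either path.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class NavState:
    last_anchor_age_s: float
    sigma_xy_m: float
    vo_healthy: bool


def is_steady_state(s: NavState) -> bool:
    """Thresholds taken from the policy above; everything else is re-loc."""
    return s.last_anchor_age_s < 2.0 and s.sigma_xy_m < 20.0 and s.vo_healthy


def select_candidates(
    state: NavState,
    geometric_prior: Callable[[], List[str]],
    vpr_retrieve: Callable[[], List[str]],
    convincing: Callable[[List[str]], bool],
    expand: Callable[[List[str]], List[str]],
) -> Tuple[List[str], bool]:
    """Return (candidate chunk ids, vpr_invoked) for one EKF cycle."""
    vpr_invoked = not is_steady_state(state)
    candidates = vpr_retrieve() if vpr_invoked else geometric_prior()
    if not convincing(candidates):
        candidates = expand(candidates)  # widen to adjacent chunks
    return candidates, vpr_invoked


# Stub callables standing in for the real components (purely illustrative).
prior = lambda: ["chunk_42"]
vpr = lambda: ["chunk_07", "chunk_08"]
always_convincing = lambda c: True
widen = lambda c: c + ["chunk_neighbour"]

cruise = select_candidates(NavState(0.5, 4.0, True), prior, vpr, always_convincing, widen)
reloc = select_candidates(NavState(12.0, 80.0, False), prior, vpr, always_convincing, widen)
```
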

Telemetry exposes `vpr_invoked` per cycle so the FDR captures the steady-state-vs-reloc fraction over a flight.

#### Backbone bench-off candidates

| Solution | Tools | Advantages | Limitations | Performance | Fit |
| --- | --- | --- | --- | --- | --- |
| **AnyLoc** (DINOv2 + unsupervised VLAD) | dinov2 ViT-B/14, VLAD aggregator | Training-free; up to 4× R@1 over specialised methods on aerial cross-domain (F-C2) | Needs bench-off vs SALAD/BoQ on aerial cross-domain | DINOv2-base ~8 ms/inf at 224×224 on Orin Nano Super (S40, M-5) | **Bench-off candidate** — keep as fallback even if not picked primary |
| **DINOv2 SALAD** *(new)* | SALAD repo (CVPR 2024) | DINOv2-trained Sinkhorn-VLAD; R@1 = 75 % MSLS Challenge / 92.2 % MSLS Val / 76 % NordLand; in `aero-vloc` | Requires training data — but published checkpoints are usable zero-shot | Same backbone as AnyLoc → similar latency | **Bench-off primary candidate** (M-4) |
| **BoQ** *(new)* | Bag-of-Queries (CVPR 2024) | Beats NetVLAD/MixVPR/EigenPlaces/Patch-NetVLAD/TransVPR/R2Former on 14 benchmarks | Aerial-domain ranking TBD by bench-off | Lower compute than AnyLoc/SALAD when used with a CNN backbone | **Bench-off primary candidate** (M-4) |
| **MixVPR** | mixvpr trained on GSV-Cities | Lighter than AnyLoc; degraded-power fast-lane | Trained on street-view; weaker out-of-domain on aerial | Lower latency than ViT-class | **Fast-lane on degraded power** |
| **EigenPlaces / SelaVPR** | aero-vloc | Recent SOTA on some aerial | Mixed wins vs AnyLoc | — | Bench-off candidates |

**Bench-off scope expanded** (M-4): AnyLoc + SALAD + BoQ + MixVPR primary; EigenPlaces / SelaVPR secondary. **Each candidate is benched on the new chunk-based retrieval unit, not per-tile.**


#### Active-conflict scene change (destroyed buildings, cratering, dam flooding)

This is a frequent operational reality, not an edge case. Layered mitigations:

| Mitigation | What it does | Cost |
| --- | --- | --- |
| **Multi-scale VPR chunks** *(new)* — z=17 / z=18-effective coarse chunks alongside z=20-derived fine chunks | Coarse-scale chunk descriptors describe road-network + field-boundary + waterway structure that survives building destruction. When the fine-scale top-K is unconvinced, the system falls back to coarse-scale top-K. | ~12 MB extra disk; ~3 min one-time DINOv2 forward over coarse chunks pre-flight |
| **OSM road-network overlay** as stable-feature anchor *(new)* | OSM road geometry persists even when buildings are destroyed. Extract from OpenStreetMap as a binary "road-mask" tile sidecar. The matcher applies a bonus inlier weighting on keypoints that fall on road edges (~1.3× confidence multiplier). GISNav (closest published reference architecture) does this. | One-time pre-flight OSM extract for the operational area (~minutes); ~5 % bigger sidecar |
| **Sector volatility classification drives K** *(new)* — bound to AC-NEW-6 `sector_class` | K=5 in stable sectors; K=20 in active-conflict sectors; K=50 in expanding-window fallback. Bigger candidate pool absorbs stale-tile false negatives. | Pure config; no compute or storage cost |
| **Onboard-tile rapid promotion in active sectors** *(new)* — refines M-9's 2-flight voting | In active sectors specifically, allow promotion to "trusted basemap" after 1 flight if σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 % (dual gate). Faster basemap refresh keeps up with active-sector change rate. Stable sectors keep the conservative N≥2 voting. | Branch in Service ingest voting layer; no onboard cost |
| **Negative cache** *(new)* | When the matcher rejects a tile pair (RANSAC inlier ratio < 0.3) repeatedly across multiple flights, mark that tile's `trust_level = stale_destroyed` and exclude from retrieval until refreshed by Service. | One extra metadata field; trivial at retrieval time |

Of these, **multi-scale VPR chunks + OSM road overlay** are the two with the biggest payoff for active-conflict scene change. Sector-driven K is essentially free. Negative cache is cheap insurance.

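
The three configuration-level mitigations (sector-driven K, rapid promotion, negative cache) are simple enough to sketch together. The function and class names are hypothetical, and `REJECT_FLIGHTS = 2` is an assumed reading of "repeatedly across multiple flights"; the K values, the 0.3 inlier-ratio threshold, and the σ_xy / road-overlap dual gate are the ones given in the table.

```python
from collections import defaultdict

K_BY_SECTOR = {"stable": 5, "active": 20}
K_EXPANDING_WINDOW = 50
REJECT_INLIER_RATIO = 0.3
REJECT_FLIGHTS = 2  # hypothetical: the text only says "multiple flights"


def top_k(sector_class, expanding_window=False):
    """Candidate-pool size for one retrieval."""
    return K_EXPANDING_WINDOW if expanding_window else K_BY_SECTOR[sector_class]


def flights_needed_for_promotion(sector_class, sigma_xy_m, road_overlap):
    """Confirming flights Service ingest would require before basemap promotion."""
    if sector_class == "active" and sigma_xy_m <= 3.0 and road_overlap >= 0.70:
        return 1  # rapid promotion through the dual gate
    return 2      # conservative 2-flight voting


class NegativeCache:
    """Tracks tiles that the matcher keeps rejecting across flights."""

    def __init__(self):
        self._rejecting_flights = defaultdict(set)

    def record(self, tile_key, flight_id, inlier_ratio):
        if inlier_ratio < REJECT_INLIER_RATIO:
            self._rejecting_flights[tile_key].add(flight_id)

    def is_stale_destroyed(self, tile_key):
        return len(self._rejecting_flights.get(tile_key, ())) >= REJECT_FLIGHTS


cache = NegativeCache()
cache.record((20, 10, 12), "flight-A", 0.12)
cache.record((20, 10, 12), "flight-B", 0.25)
cache.record((20, 10, 13), "flight-A", 0.55)
```
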

#### Stale-tile / cloud robustness

- **Stale tiles** in stable sectors (seasonal mismatch only): bench-off includes AerialExtreMatch (S49, structured-difficulty). 2chADCNN (S50) is the season-robustness ceiling reference. Production-side mitigation: top-K is **dynamically sized** by sector + σ_xy (see table above). Stale-tile false-negatives are absorbed by larger K + matcher-driven verification.
- **Cloudy stored tile**: deprioritised at retrieval time via the `glare/cloud` sidecar flag (Component 1).
- **Cloudy live frame**: not VPR-specific — the matcher and orthorectifier also fail. System falls back to VO/IMU dead-reckoning. **F-T16** (new test) synthesises cloud-occlusion injection on AerialVL frames to characterise the recovery profile (see Testing Strategy).

---


### Component 3: Cross-View Matching & PnP

> **⚠ Deep-research item.** Highest-leverage decision in the system. Mode B updates the candidate list.

**Bench-off candidates (revised vs Mode A):**

| Candidate | Status vs Mode A | Rationale |
| --- | --- | --- |
| **SuperPoint + LightGlue (TRT, FP16)** | **Lead candidate** (unchanged) | Well-trodden TRT path, ~286 FPS on RTX 3080 @ 320×240 baseline (F-B1) |
| **GIM-LightGlue** *(new — M-3)* | **Bench-off candidate, peer of SP+LG** | 8.4–18.1 % zero-shot improvement over LightGlue baseline (S48). Same TRT path. |
| **XFeat (sparse + semi-dense)** | **Bench-off candidates** (unchanged) | Embedded-class throughput; CPU-viable as failover (S08) |
| **MASt3R** | **DROPPED from primary list** (was stretch — M-2) | mast3r-runtime Jetson support "Planned"; ~3 s/pair on Orin Nano Super extrapolated. Research-track-only. |
| **GIM-RoMa / RoMa** | Bench-off candidate | Best Map-free / aerial cross-view in 2024 papers; needs distillation work |
| **GIM-DKM** | Bench-off candidate | Same as RoMa — bench-off only if SP+LG variants fall short |
| **2chADCNN** *(new — M-14)* | **Season-aware reference** | UAV↔satellite season-aware template-matching (S50). Either bench-off candidate or season-robustness ceiling reference. |
| **Classical (SIFT/ORB/AKAZE)** | Last-resort degraded mode | Cross-view domain gap kills these (F-A5) |

**Bench-off targets (revised):**

1. **AerialVL** — primary public benchmark (S03), 70 km of fixed-wing trajectories.
2. **UAV-VisLoc** — accuracy regression at 405–840 m (S01).
3. **AerialExtreMatch** *(new — M-14)* — 1.5 M synthetic pairs, 32 difficulty levels (overlap × scale × pitch). Direct grading of failure-mode axes.
4. **2chADCNN season set** — season-aware **ceiling reference number only** (M-21); not a candidate matcher.
5. **TartanAir V2** *(confirmed — M-13, Q4=A)* — early-stage synthetic baseline; sweeps seasons / lighting / pitches before the first internal fixed-wing flight lands.
6. **Internal Mavic flight footage** — closest to deployment domain.
7. **First internal fixed-wing flight** (FT-3) — lands before AC-1.3 lock per M-15.

**Score on:** AC-1.1 / AC-1.2 / AC-2.2 / p95 latency on Orin Nano Super 25 W / sustained 30-min thermal stability / peak GPU memory / **plus seasonal-robustness score from the 2chADCNN-axis tests**.

**PnP & projection:** Unchanged from Mode A, except output schema adds `parent_pose_sigma_xy` for downstream Component-1b dedup gate.

**Input downsampling:** Empirical pin during research pass. Latency budget is more comfortable than Mode A assumed (M-5 / S40), so 1024×768 is a low-risk starting point for SP+LG / GIM-LG; 1024×768 semi-dense or 640×480 sparse for XFeat.

---


### Component 4: Visual Odometry

Unchanged from Mode A (custom 2-frame VO via SuperPoint+LightGlue / GIM-LightGlue homography). New risk **R8** (M-15) added: AC-1.3 drift budget needs validation against AerialVL's fixed-wing trajectories before lock — *not* against Mavic-class footage.

---

### Component 5: IMU + Visual EKF Fusion

**Working choice (revised from Mode A):** Onboard loosely-coupled EKF in our process emits **two parallel MAVLink streams**:

1. **GPS_INPUT** (primary, GPS-substitute framing per AC-4.3) with `h_acc`/`v_acc` derived from EKF covariance.
2. **ODOMETRY** (auxiliary, when full 6-DoF covariance is available and quality > VISO_QUAL_MIN) with the full 21-element pos+att covariance (M-1).

ArduPilot is configured with `EK3_SRC1_*` set to GPS-with-fallback-to-ExternalNav so GPS_INPUT remains the failover path. Mode/source labels (`satellite_anchored / vo_extrapolated / dead_reckoned`) are emitted on both channels.

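
To make the two covariance encodings concrete: MAVLink's ODOMETRY message carries the 6×6 pose covariance (states x, y, z, roll, pitch, yaw) as the 21 entries of its upper-right triangle in row-major order, while GPS_INPUT only takes scalar accuracies. The sketch below shows one way to produce both from the same matrix. The helper names are hypothetical, and taking `h_acc` as the 1-σ semi-major axis of the horizontal error ellipse is an assumed convention, not something the draft pins down.

```python
import math


def pack_upper_triangle(cov6):
    """Flatten a symmetric 6x6 matrix to its 21 upper-triangle entries, row-major."""
    if len(cov6) != 6 or any(len(row) != 6 for row in cov6):
        raise ValueError("expected a 6x6 matrix")
    return [cov6[r][c] for r in range(6) for c in range(r, 6)]


def horizontal_accuracy(cov_xx, cov_xy, cov_yy):
    """1-sigma semi-major axis of the horizontal error ellipse.

    Uses the closed-form largest eigenvalue of the 2x2 block [[xx, xy], [xy, yy]].
    """
    mean = 0.5 * (cov_xx + cov_yy)
    half_diff = 0.5 * (cov_xx - cov_yy)
    return math.sqrt(mean + math.hypot(half_diff, cov_xy))


# Example: 2 m / 3 m / 1 m position sigmas, small attitude variances, no cross terms.
variances = [4.0, 9.0, 1.0, 0.01, 0.01, 0.04]
cov = [[variances[i] if i == j else 0.0 for j in range(6)] for i in range(6)]

pose_covariance = pack_upper_triangle(cov)                 # for ODOMETRY
h_acc = horizontal_accuracy(cov[0][0], cov[0][1], cov[1][1])  # for GPS_INPUT
v_acc = math.sqrt(cov[2][2])
```
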

**Key tuning:** the EKF's Mahalanobis gate and process-noise covariances are calibrated against AC-NEW-4 budget through Monte Carlo (which now also includes M-9 cache-poisoning injection — see "Testing Strategy").

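
For readers unfamiliar with the gate being tuned, a two-dimensional Mahalanobis test on the horizontal innovation can be sketched as follows. The function names and the choice of a 99 % chi-square threshold are assumptions for illustration; the calibrated threshold is exactly what the Monte Carlo runs are meant to determine.

```python
CHI2_2DOF_99 = 9.21  # 99th percentile of chi-square with 2 degrees of freedom


def mahalanobis_sq_2d(residual, cov):
    """Squared Mahalanobis distance r^T S^-1 r for a 2-D residual and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    rx, ry = residual
    # S^-1 = [[d, -b], [-c, a]] / det, expanded by hand to stay dependency-free.
    return (d * rx * rx - (b + c) * rx * ry + a * ry * ry) / det


def accept_fix(residual, innovation_cov, threshold=CHI2_2DOF_99):
    """Accept the visual fix only if its innovation passes the gate."""
    return mahalanobis_sq_2d(residual, innovation_cov) <= threshold


nominal_cov = ((9.0, 0.0), (0.0, 9.0))   # 3 m sigma on each axis
deflated_cov = ((3.0, 0.0), (0.0, 3.0))  # same filter, covariance deflated 3x

good_fix = accept_fix((6.0, 0.0), nominal_cov)            # 2-sigma residual
bad_fix = accept_fix((30.0, 0.0), nominal_cov)            # 10-sigma residual
deflated_case = accept_fix((6.0, 0.0), deflated_cov)      # gate tightens
```

With a deflated covariance the same residual scores a larger distance, so the gate becomes stricter rather than looser, which is the behaviour the over-confidence injection test relies on.
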

---

### Component 6: MAVLink Integration & Source Promotion

**Working choice (revised from Mode A):**

- **MAVSDK for telemetry** + **pymavlink for `GPS_INPUT` and `ODOMETRY` lines**.
- **No mavlink-router daemon** (M-6). Instead: distinct system-IDs for MAVSDK (sysid=10) and pymavlink (sysid=11), sharing the serial port via ArduPilot's native MAVLink routing (S35-class). Single endpoint configuration, no third-party C++ daemon, no #436-class CVE risk.
- **MAVLink2 signing mandatory** (M-7) on every companion↔FC link. Per-airframe key in FC FRAM; provisioning runbook is part of the deploy procedure.

**Source-promotion logic (revised per M-11):** unchanged behaviourally, but **F-T9 SITL test scope expanded** to include source-switching combinations across both GPS_INPUT-primary and ODOMETRY-primary modes. Pin ArduPilot to the version containing PR #30080.

**Spoofing-promotion latency budget:** <3 s (AC-NEW-2) — unchanged.

---

### Component 7: Failsafe, Health & Re-Localization

Unchanged from Mode A.

---

### Component 8: Object Localization (AI Camera)

Unchanged from Mode A (trig + airframe-attitude fusion via `ATTITUDE` MAVLink stream).

---


### Component 9: Software Platform & Process Topology

**Working choice (revised rationale per M-10):**

- **Single Python process (asyncio) on CPython 3.11 or 3.12** (well-supported by JetPack / numba / TRT bindings).
- **TRT inference workers as subprocesses**, tensor handoff via CUDA IPC (Jetson unified-memory aware: zero-copy possible since CPU and GPU share the LPDDR5 pool).
- **numba JIT** for EKF math hot paths.
- Configuration via YAML; logging via structured JSON to FDR.
- **Free-threaded Python (3.13+) is v1.1 territory.** Reason: experimental, single-threaded perf hit, GIL re-enables on import of any non-FT-aware C extension (S55). Revisit when NumPy/SciPy/numba/TRT bindings are FT-aware.

---

### Component 10: Flight Data Recorder

Unchanged from Mode A, except the per-sector `terrain_class` and `trust_level` flags are recorded alongside the position-estimate stream so post-mission analysis can filter on them.

---

### Component 11: Confidence Score (cross-cutting)

**Computed** from: RANSAC inlier ratio, reprojection error variance, top-K retrieval similarity gap, EKF covariance, **plus** parent-pose σ_xy gate result (M-9 hard gate).

**Emitted on:**

1. GPS_INPUT (`h_acc`).
2. ODOMETRY (full 21-element covariance + `quality` field 0–100).
3. NAMED_VALUE_FLOAT "CONF_M" on the GCS link.
4. Per-tile sidecar (`parent_pose_sigma_xy`) for Component-1b dedup gate.

---


## Testing Strategy

### Functional / Integration

- **F-T1** Tile cache load/lookup (unchanged).
- **F-T1b** *(new — M-15)* AC-1.3 drift regression against **AerialVL's fixed-wing trajectories** (70 km of real flight). Pass = drift ≤100 m VO-only / ≤50 m IMU-fused between satellite anchors at 95th percentile.
- **F-T2** Tile generation + dedup *(extended — M-9)*: replay a recorded flight; assert (a) for any ground sector covered ≥2× by the nav cam, exactly **one** tile is written; (b) the chosen tile has `parent_pose_sigma_xy` ≤ the hard gate; (c) **service tiles are never overwritten when within freshness budget**, regardless of our quality score.
- **F-T3** Tile uploader → candidate pool *(extended — M-9)*: post-flight, the diff against the Service candidate pool is correct; freshness + trust_level metadata round-trips; 2-flight voting promotes candidates to basemap only after 2nd-flight confirmation.
- **F-T4** End-to-end against **AerialVL** (S03).
- **F-T5** End-to-end against **UAV-VisLoc** (S01).
- **F-T5b** *(new — M-14)* End-to-end against **AerialExtreMatch** (S49) — structured-difficulty regression. For each of 32 difficulty levels, log AC-1.1 / AC-1.2 pass/fail.
- **F-T5c** *(new — M-14)* Season-robustness regression against **2chADCNN season set** (S50) — assert AC-1.1 holds across summer↔winter pairs.
- **F-T6** End-to-end against **internal Mavic flight footage** — deployment-domain proxy.
- **F-T7** Sharp-turn handling (unchanged).
- **F-T8** Disconnected-segment re-localization (unchanged).
- **F-T9** ArduPilot SITL: full MAVLink loop *(extended — M-11)*. Test matrix:
  - GPS_INPUT-only mode (Mode A baseline).
  - GPS_INPUT + ODOMETRY hybrid mode.
  - Source switching: jam-onset → our channel; spoofed-real-GPS recovery → operator-confirmed source-restore.
  - `EK3_SRC1_*` parameter combinations across both modes.
  - **MAVLink2 signing on**: assert injection refused on signing failure; assert acceptance on valid signing.
- **F-T10** Operator re-loc workflow via QGC `STATUSTEXT`.
- **F-T11** Cold-start TTFF <30 s (AC-NEW-1).
- **F-T12** Spoofing-promotion <3 s (AC-NEW-2).
- **F-T13** Object localization with airframe-attitude fusion (unchanged).
- **F-T14** *(new — M-12)* Per-sector DEM classification: load SRTM-30 m for the operational area; assert sector classes (`flat`, `moderate`, `rugged`) line up with ground-truth DEM amplitudes; assert ortho-tile generation is skipped in `rugged` sectors.
- **F-T15** *(new — M-16/17/18 cluster)* **VPR retrieval-unit bench**: build the chunk-based FAISS index over a 400 km² synthetic operational area; assert that for any ground point, ≥1 chunk fully contains the expected frame footprint (overlap correctness). Bench top-K recall at K = {3, 5, 10, 50} for steady-state, re-loc, and expanding-window modes against AerialVL + AerialExtreMatch + 2chADCNN season set.
- **F-T16** *(new — concern #3 cloud robustness)* Synthetic cloud-occlusion injection: inject 0–60 % cloud cover on AerialVL frames (and on cached basemap tiles independently); assert the system gracefully degrades (top-K expansion → matcher failure → VO/IMU fallback) rather than emitting a confident bad fix.
- **F-T17** *(new — M-17 invocation policy)* Mission replay assertion: in a typical mission replay (steady cruise + 1 sharp turn + 1 simulated reboot), measure the % of cycles VPR is invoked. Pass criterion: ≥80 % of steady-state cycles use the geometric-prior path; 100 % of re-loc-trigger cycles invoke VPR.

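
The F-T17 pass criterion reduces to two fractions over the replayed cycles. A minimal sketch, assuming the replay log can be reduced to `(is_reloc_trigger, vpr_invoked)` pairs per cycle (the log format and the `vpr_policy_report` helper are hypothetical; only the `vpr_invoked` telemetry field and the two thresholds come from the text):

```python
def vpr_policy_report(cycles):
    """Summarise VPR invocation over a replay.

    cycles: iterable of (is_reloc_trigger, vpr_invoked) booleans, one per EKF cycle.
    """
    cycles = list(cycles)
    steady = [invoked for is_trigger, invoked in cycles if not is_trigger]
    reloc = [invoked for is_trigger, invoked in cycles if is_trigger]
    # Fraction of steady-state cycles that stayed on the geometric-prior path.
    steady_prior_frac = steady.count(False) / len(steady) if steady else 1.0
    # Fraction of re-loc-trigger cycles that actually invoked VPR.
    reloc_vpr_frac = reloc.count(True) / len(reloc) if reloc else 1.0
    return {
        "steady_prior_frac": steady_prior_frac,
        "reloc_vpr_frac": reloc_vpr_frac,
        "passed": steady_prior_frac >= 0.80 and reloc_vpr_frac == 1.0,
    }


# Example replay: 100 steady cycles (10 of them invoked VPR anyway), 5 re-loc cycles.
replay = [(False, False)] * 90 + [(False, True)] * 10 + [(True, True)] * 5
report = vpr_policy_report(replay)
```
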

### Non-Functional

- **NF-T1** Latency p95 <400 ms on Orin Nano Super 25 W (AC-4.1).
- **NF-T2** Memory <8 GB shared (AC-4.2).
- **NF-T3** Thermal: 8 h sustained 25 W (AC-NEW-5).
- **NF-T4** *(extended — M-9)* False-position safety budget (AC-NEW-4) — Monte Carlo over AerialVL + Mavic + AerialExtreMatch with **synthetic over-confidence injection**: artificially deflate EKF covariance by 1.5×–3× and verify the EKF's Mahalanobis gate still rejects the bad fix and the cache-poisoning hazard does not trigger.
- **NF-T4b** *(new — M-9)* **AC-NEW-7 cache-poisoning safety budget** validation — Monte Carlo over multi-flight replays: assert P(onboard tile mis-aligned > 30 m) per flight < 1 %; P(>100 m) per flight < 0.1 %.
- **NF-T5** Storage: 64 GB FDR cap with rollover.
- **NF-T6** Imagery freshness gate (AC-NEW-6).

### Security

- **S-T1** GPS_INPUT + ODOMETRY not accepted from any non-whitelisted MAVLink source-system-id.
- **S-T2** Tile cache encrypted at rest.
- **S-T3** *(promoted to v1-mandatory — M-7)* MAVLink2 signing between companion and FC. Verified at boot. Refuse to inject GPS_INPUT/ODOMETRY if FC reports signing-off on our link.
- **S-T4** *(new — M-6)* No `mavlink-router` binary on the deployed companion image. The CI image-build step verifies absence.
- **S-T5** *(new — M-6)* MAVLink endpoint multiplexing via distinct system-IDs is exercised in CI integration tests.

### Field

- **FT-1** Flight-data-recorder review of ≥5 real-world test flights at progressive altitudes (200 m → 1 km AGL).
- **FT-2** Single 8-hour sortie endurance / thermal soak.
- **FT-3** *(new — M-15)* **First internal fixed-wing flight at 1 km AGL** before AC-1.3 lock. Synced IMU + GPS truth + nav-cam stream collected; replayed through the pipeline.

---


## Key Risks & Open Items (carried into Plan step)

| ID | Risk | Severity | Mitigation |
| --- | --- | --- | --- |
| R1 | Imagery licensing lead time (Service-side concern) | Med (was High; now upstream) | Suite Service procurement; not on this build's critical path |
| R2 | Latency budget on Orin Nano Super at 1024×768 | **Med (was High — M-5)** | DINOv2-base measured at ~8 ms/inf at 224×224 (S40); empirical bench-off in week 1 of impl |
| R3 | Cross-view accuracy at 1 km AGL with Ukrainian seasonal change | Med | 50 %@20m hard floor; bench-off now includes SALAD/BoQ/GIM-LightGlue/2chADCNN before lock (M-3, M-4, M-14) |
| R4 | MAVSDK + pymavlink coexistence (resolved: distinct system-IDs, no router, M-6) | **Resolved** | — |
| R5 | Thermal at 25 W for 8 h | Med | Cooling validation in NF-T3 |
| R6 | AC-7.1 in turning flight (gimbal-only pose) | Low (scoped to level flight in v1) | Add airframe-attitude fusion in v1.1 |
| R7 | Public dataset gap (V&V) | Med | AerialVL primary; AerialExtreMatch + 2chADCNN added (M-14); internal Mavic for deployment proxy; first fixed-wing flight scheduled before AC-1.3 lock (M-15) |
| **R8** *(new — M-15)* | **Fixed-wing VO drift under AC-1.3 budget unconfirmed** | Med | F-T1b regression on AerialVL's fixed-wing trajectories; FT-3 first internal fixed-wing flight |
| **R9** *(new — M-9)* | **Cross-flight cache poisoning via onboard tile overwrite of stale service tile** | High (safety) | Service-tile immutability inside freshness budget; 2-flight voting at Service ingest; parent-pose σ_xy hard gate; AC-NEW-7 numeric budget |
| **R10** *(new — M-7 / M-6)* | **Companion↔FC link is a flight-critical attack surface** | High (security) | MAVLink2 signing v1-mandatory; mavlink-router replaced by native MAVLink routing with distinct system-IDs |
| **R11** *(new — M-11)* | **ArduPilot external-nav source-switching has known production gotchas** | Med | F-T9 SITL test matrix; pin ArduPilot version containing PR #30080 |
| **R12** *(new — M-12)* | **Eastern-Ukraine relief amplitude breaks flat-Earth assumption near frame edge** | Med | Pre-flight SRTM-30 m DEM lookup; per-sector terrain class; runtime self-classifier |


## Proposed AC additions

**AC-NEW-7 — Cache-poisoning safety budget** *(new — M-9)*

- P(onboard tile geo-misaligned > **30 m**) per flight **<1 %**.
- P(onboard tile geo-misaligned > **100 m**) per flight **<0.1 %**.

**Why it matters.** Cross-flight error compounding. Validated by **NF-T4b**.

**Implementation drivers.** Service-tile immutability within freshness budget; 2-flight voting at Service ingest; parent-pose σ_xy hard gate (≤5 m for hard write, ≤3 m for `trust_level = candidate`).

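
One practical consequence of the two budgets is the number of replayed flights a Monte Carlo run needs before it can support them. The sketch below computes a one-sided Clopper-Pearson upper bound on the per-flight failure probability. It assumes replays are independent with a common failure probability, which is a statistical simplification not stated in the draft, and the helper names are hypothetical. Under that assumption, roughly 300 failure-free replays are needed to support the <1 % budget at 95 % confidence, and roughly 3 000 for the <0.1 % budget.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); intended for small k."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def upper_bound(failures, trials, confidence=0.95):
    """One-sided Clopper-Pearson upper bound on the per-flight failure probability."""
    if failures >= trials:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection: the CDF is monotonically decreasing in p
        mid = 0.5 * (lo + hi)
        if binom_cdf(failures, trials, mid) > alpha:
            lo = mid  # p still too small to explain so few failures
        else:
            hi = mid
    return hi


bound_300 = upper_bound(0, 300)     # supports the "<1 % at >30 m" budget
bound_3000 = upper_bound(0, 3000)   # supports the "<0.1 % at >100 m" budget
bound_100 = upper_bound(0, 100)     # too few replays for either budget
bound_one_failure = upper_bound(1, 300)
```
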

---

## Open Research (deferred to dedicated research passes before Plan)

| Topic | Why now | Output | Owner |
| --- | --- | --- | --- |
| **Cross-view matcher bench-off** (Component 3) — *expanded list per M-2/M-3/M-14* | Highest-leverage decision; expanded shortlist requires explicit empirical comparison | Selected matcher + chosen input resolution + measured latency / accuracy / memory + season-robustness score | Research skill, follow-up Mode A pass scoped to "matcher selection" |
| **Input-resolution sweep** | Coupled with the matcher bench (latency budget more comfortable than Mode A assumed — M-5 → bigger sweep range possible) | Resolution per matcher candidate; sensitivity curves for AC-1.1 / AC-1.2 vs resolution | Same pass |
| **VPR backbone bench-off** (Component 2) — *expanded list per M-4* (AnyLoc + SALAD + BoQ + MixVPR) | Cheaper than the matcher decision but feeds it | Selected VPR backbone + measured Recall@K on AerialVL + AerialExtreMatch + Mavic | Same pass |
| **Tile-generator quality scoring** (Component 1b) | Need empirically-grounded thresholds for σ_xy (M-9), sharpness, glare | Calibrated thresholds | Implementation phase |
| **Internal Mavic-flight V&V dataset** | Closest proxy to deployment domain | Curated, ground-truth-labelled clips | Operations / data team |
| **First internal fixed-wing flight** *(new priority — M-15)* | AC-1.3 drift budget unconfirmed | Recorded sortie with synced IMU + GPS truth + nav-cam stream | Field-test plan; **before AC lock**, not stretch |
| **Encryption-at-rest key management** | Tile cache + FDR are operationally sensitive | Threat-modelled design | Phase 4 security analysis |
| ~~*(Open question — M-13)*~~ TartanAir V2 as early-stage synthetic baseline | **Confirmed yes (Q4 = A, 2026-04-26)** | Folded into bench-off plan | — |

---


## References

All citations are by ID from `_docs/00_research/01_source_registry.md`. Mode B sources: **S40–S57**.

- **Latency / hardware**: S40 (Jetson AI Lab L1).
- **MAVLink integration**: S41–S44 (ArduPilot dev docs L1, ODOMETRY PR #19563, External-nav fix PR #30080, MAVLink2 signing).
- **Security**: S44 (signing), S45 (mavlink-router CVE class).
- **VPR SOTA 2024**: S46 (BoQ), S47 (SALAD).
- **Matcher SOTA 2024**: S48 (GIM), S57 (MASt3R Jetson status).
- **Datasets**: S49 (AerialExtreMatch), S50 (2chADCNN), S51 (TartanAir V2), S52 (AFIT fixed-wing VO), S53 (high-altitude VIO).
- **Tile cache**: S54 (MBTiles WAL recipe).
- **Python topology**: S55 (free-threaded Python 3.13).
- **Terrain**: S56 (eastern-Ukraine relief).

---

## Related Artifacts

- Mode A draft (superseded by this draft): `_docs/01_solution/solution_draft01.md`.
- Mode B decomposition: `_docs/00_research/03_mode_b_decomposition.md`.
- Mode B reasoning chain: `_docs/00_research/04_reasoning_chain_mode_b.md`.
- Mode B validation log: `_docs/00_research/05_validation_log_mode_b.md`.
- AC & Restrictions assessment (Phase 1): `_docs/00_research/00_ac_assessment.md`.
- Source registry: `_docs/00_research/01_source_registry.md` (S01–S57).
- Fact cards: `_docs/00_research/02_fact_cards.md` (Phase 1 + Mode B M-1..M-15).
- Tech stack consolidation: `_docs/01_solution/tech_stack.md` (deferred — Phase 3 optional).
- Security analysis: `_docs/01_solution/security_analysis.md` (deferred — Phase 4 optional, but **promoted to recommended-before-Plan-lock** because of M-6/M-7 promotion).
