- Introduced a new document detailing the current state of the autodev process, including steps, status, and findings.

- Revised acceptance criteria in the acceptance_criteria.md file to clarify metrics and expectations, including updates to GPS accuracy and image processing quality.
- Enhanced restrictions documentation to reflect operational parameters and constraints for UAV flights, including camera specifications and satellite imagery usage.
- Added new research documents for acceptance criteria assessment and question decomposition to support ongoing project evaluation and decision-making.
Commit 9eba1689b3 (parent 2178737b36) — Oleksandr Bezdieniezhnykh, 2026-04-26 14:28:10 +03:00.
17 changed files with 2965 additions and 69 deletions.
# Acceptance Criteria & Restrictions Assessment (Phase 1 — BLOCKING)
**Project**: GPS-denied onboard navigation for fixed-wing UAV (companion-computer-side, Jetson Orin Nano Super, MAVLink/MAVSDK to flight controller, satellite tile reference + monocular VO).
**Mode**: Research Mode A, Phase 1 (AC Assessment). Phase 2 (Solution Draft) blocked on user confirmation of this document.
**Reviewers**: Product owner / decision-maker; technical architect.
---
## 0. TL;DR — what changes after this assessment
1. **One blocker, three contradictions, six gaps** — the AC and restriction set is solid in spirit but cannot be implemented as written.
2. **Hard blocker**: Google Maps satellite tiles are explicitly prohibited for offline / autonomous-vehicle / image-analysis use (Google Maps Platform ToS + Map Tiles API Policies). The restriction `"limited right now to Google Maps"` is legally not deployable.
3. **Hard contradiction #1**: "up to 3000 photos per flight" vs. 8 h × 3 fps = 86,400 photos. Storage and processing budgets cannot be sized until this is reconciled.
4. **Hard contradiction #2**: Camera resolution range "FullHD to 6252×4168" gives a 13× pixel-count delta — per-frame compute cannot be designed against a 13× moving target.
5. **Hard contradiction #3**: AC "Object localization accuracy is consistent with frame-center accuracy" is not physically achievable with the *AI-camera-gimbal-angle-only* pose information confirmed in scope (no airframe IMU fusion onto AI cam). At 1 km AGL a 5° unknown roll/pitch is ~87 m of ground error.
6. **Six recommended new AC** added at the bottom of section 1 (TTFF, spoofing-promotion latency, flight-data-recorder, false-position safety budget, environmental envelope, imagery freshness).
7. **Suggested numerical recalibration** of three existing AC values (frame-center 80%/60%, MRE, "TBD" failsafe N) to better-evidenced numbers.
A clear **A/B/C choice** at the bottom of this document is the BLOCKING gate.
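Points 3–5 above are pure arithmetic; a quick sanity-check of the quoted numbers (values taken from this section):

```python
import math

# Contradiction #1: stated per-flight photo cap vs. capture rate over a sortie.
FPS, ENDURANCE_H, PHOTO_CAP = 3, 8, 3_000
photos_per_sortie = FPS * ENDURANCE_H * 3600          # 86,400 frames

# Contradiction #2: pixel-count delta across the allowed camera range.
pixel_delta = (6252 * 4168) / (1920 * 1080)           # ~12.6x, quoted as "13x"

# Contradiction #3: ground error from unknown airframe attitude at 1 km AGL.
AGL_M = 1_000
err_5deg = AGL_M * math.tan(math.radians(5))          # ~87 m
err_25deg = AGL_M * math.tan(math.radians(25))        # ~466 m
```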
---
## 1. Acceptance Criteria
Status legend: **K** = keep as-is; **M** = modify (numeric or wording change recommended); **A** = added (new AC); **R** = remove / supersede; **F** = flagged (factually questionable, needs user judgement).
| ID | Criterion (paraphrased) | Our Value | Researched / Recommended Value | Cost / Timeline Impact | Status | Notes |
|----|-------------------------|-----------|--------------------------------|------------------------|--------|-------|
| AC-1.1 | Frame-center GPS error: ≥80% within 50 m | 80%@50m | **Achievable.** SOTA cross-view UAV-vs-satellite at low-mid altitude reaches RDS ~84% / MA@20 ~83% in nadir-favoring setups (S39); AnyVisLoc 74.1%@5m at 30–300 m (S02). At 1 km AGL with non-stabilized monocular nadir + Google-Earth-grade reference, **80%@50 m is realistic IF satellite-anchored frames are ≥30% of the trajectory**; relies on VPR + fine matching + Kalman/factor-graph fusion with VO between anchors. | None to scope | K | Achievability assumes the anchor density and VO drift bounds in AC-1.3. |
| AC-1.2 | Frame-center GPS error: ≥60% within 20 m | 60%@20m | **Aggressive but achievable.** MA@20 of 83% appears in cross-view literature (S39), but on benchmark-favorable data. With Ukraine seasonal change, dust, and 1 km AGL viewpoint, expect **45–65%@20 m** in production. Recommend **soft-keep at 60%, hard-floor at 50%** to avoid blocking GA. | None to scope; affects test pass/fail | M | Add a hard-floor of 50%@20m as the must-pass acceptance gate; treat 60%@20m as the "stretch" target. |
| AC-1.3 | Cumulative VO drift between satellite anchors <100 m | 100 m | **Achievable.** Monocular VO without IMU drifts ~1–3% of distance travelled in benign conditions (S32 baselines). At 60 km/h cruise, anchor cadence of ~5 s gives ~83 m between anchors → naive 1–3% drift = 1–3 m of accumulated drift, well below 100 m. **Tighten to <50 m if VO is IMU-fused**, keep at 100 m for VO-only fallback. | None | K | Add measurement protocol: drift = ‖VO-extrapolated centre − next anchor centre‖ at the moment of anchor-fix. |
| AC-1.4 | Per-estimate confidence score (high/low) | qualitative | **Recommend quantitative**: 95% covariance ellipse (semi-major axis, m) + categorical {anchored, vo-extrapolated, dead-reckoned}. Standard schemes use RANSAC inlier ratio + reprojection variance + EKF covariance (S03, S06, S32). | Negligible cost | M | Wording change only; emit both number and category. |
| AC-2.1 | Image registration rate >95% in normal segments | 95% | **Tight but achievable** with strong matcher stack (LightGlue / XFeat / MASt3R) and dense satellite tile coverage. Cross-view aerial benchmarks report 70–90% on hard splits, 90–98% on training-similar splits. Define "normal segment" as: nadir flight ±10°, ≥40% overlap, daytime, season-matched tile. | Drives matcher choice (S07, S08, S09) | K | Add the "normal segment" definition explicitly. |
| AC-2.2 | Mean Reprojection Error <1.0 px | <1.0 px | **Realistic for homography on overlapping aerial pairs**, optimistic for cross-domain UAV–satellite registration (typical 1–3 px). Recommend **split**: MRE <1.0 px for VO frame-to-frame; MRE <2.5 px for satellite-anchored homography. | None | M | Two MRE budgets, one per pipeline stage. |
| AC-3.1 | Survive 350 m outlier between consecutive photos (tilt) | 350 m | **Reasonable.** At 1 km AGL with up to 20° plane tilt, frame-centre can shift ~360 m. Outlier-rejection (RANSAC + Mahalanobis gate on EKF innovation) handles this. | None | K | — |
| AC-3.2 | Sharp-turn handling: <5% overlap, <200 m drift, <70° heading change | <5% / 200 m / 70° | **Plausible** with a place-recognition based re-localization (S05 AnyLoc, S06 MixVPR, S04 aero-vloc). Without overlap, VO returns NULL; the segment-stitching strategy must rely on global descriptor retrieval over the satellite tile cache. | Drives VPR component (S04, S05) | K | — |
| AC-3.3 | >2 disconnected segments per flight; stitching is core | yes | **Confirmed core** by AC-3.2. Re-localization → multi-segment SLAM-style merging via pose-graph; aero-vloc benchmark validates this pattern. | Architectural | K | — |
| AC-3.4 | After 3 frames with no fix → operator re-localization request | 3 frames | **Reasonable trigger**, but **N seconds is a better unit** because frame rate may drop under load. Recommend: re-localization request after **≥3 consecutive frames AND ≥2 s** without a fix. | Negligible | M | Dual trigger (frames AND time). |
| AC-4.1 | <400 ms end-to-end per frame | <400 ms p95 | **Feasible** on Jetson Orin Nano Super at 25 W with image downsampling (e.g., 1024 × 683 working resolution from 6200 × 4100), TensorRT-accelerated SuperPoint+LightGlue or XFeat, and pre-cached satellite tile descriptors. Empirical Jetson Orin NX precedent exists (S12); Orin Nano Super envelope confirmed (S14). Skip-allowed under load (user-confirmed). | Drives matcher choice + downsampling (D-D4) | K | Reword as "p95 <400 ms with up to ~10% frames dropped under sustained load." |
| AC-4.2 | Memory <8 GB shared CPU+GPU | 8 GB | **Hard hardware envelope** of Jetson Orin Nano Super 8 GB variant (S14). Realistic with downsampling + tile descriptor cache; risky with full-res 25 MP photos held simultaneously. | Drives buffering strategy | K | — |
| AC-4.3 | Output via MAVLink GPS_INPUT (MAVSDK) | GPS_INPUT via MAVSDK | **Mismatch with stack**: MAVSDK-Python has no native GPS_INPUT (open issue #320, S18). PX4 native GPS_INPUT is limited; ArduPilot fully supports `GPS1_TYPE=14`. Recommend either (a) target ArduPilot autopilot, or (b) use **pymavlink** for raw GPS_INPUT alongside MAVSDK for general telemetry, or (c) on PX4 use VISION_POSITION_ESTIMATE through EKF2. | Drives autopilot choice / library mix | M | Lock the autopilot target (PX4 vs ArduPilot vs both) — see Q-1 below. |
| AC-4.4 | Frame-by-frame streaming, not batched | streaming | OK | None | K | — |
| AC-4.5 | May refine and resend corrections | yes | OK; common pattern. | None | K | — |
| AC-5.1 | Initialise from last known FC GPS pre-denial | yes | OK. Add explicit boundary: "system requires the FC's last EKF position + IMU extrapolation at hand-off." | Negligible | K | — |
| AC-5.2 | If no estimate for `N` seconds → FC falls back to IMU dead reckoning | N=TBD | **Recommend `N = 3–5 s`** (PX4 COM_POS_FS_DELAY default = 1 s, our pipeline is heavier and includes VO retries). | Decision | M | Pin a value. |
| AC-5.3 | On companion reboot mid-flight → re-init from FC's IMU-extrapolated position | yes | OK, but add: cold-start time-to-first-fix budget (see new AC-NEW-1). | Modest | K | — |
| AC-6.1 | Position + confidence streamed to GCS | yes | OK. QGroundControl primary channel is STATUSTEXT; richer telemetry via custom MAVLink dialect or NAMED_VALUE_FLOAT (S34, S35). Bandwidth budget required. | Modest | K | — |
| AC-6.2 | GCS can send re-localization hint | yes | OK; implementable as STATUSTEXT command + custom NAV-msg, or via QGC plugin. | Modest | K | — |
| AC-6.3 | WGS84 output | WGS84 | OK (matches GPS_INPUT spec). | None | K | — |
| AC-7.1 | AI camera object localization accuracy "consistent with frame-center" | qualitative | **Not physically achievable in turning flight** with **gimbal-angle-only** pose (user-confirmed scope). At 1 km AGL, 5° unknown airframe attitude → ~87 m ground error; at 25° bank → ~470 m. **Restate**: "consistent with frame-center accuracy in level flight (<5° bank); in maneuvering flight, expect ground-projection error proportional to altitude × sin(bank)." OR add airframe IMU fusion to the AI-cam pose (re-opens scope C5). | Decision | F | See Q-2 below. |
| AC-7.2 | Trigonometric calc using gimbal angle + zoom + altitude (flat terrain) | yes | Physically correct given the limitation in AC-7.1. Flat-terrain assumption costs <30 m typical for eastern/southern Ukraine relief at 1 km AGL with small gimbal off-nadir. | None | K | — |
| AC-8.1 | Satellite imagery ≥0.5 m/px, ideally 0.3 m/px | 0.5/0.3 m/px | **0.3 m/px is realistic only via paid commercial providers** (Maxar Vivid, Airbus Pléiades Neo, S25, S26). Free Sentinel-2 is 10 m/px (S28) — too coarse for 1 km AGL drone-vs-satellite registration without scale-bridging tricks. | **Significant**: $25–32 / km² archive Maxar; ~€5–8.50 / km² Airbus → for 400 km² ≈ $10–12 k / mission area, one-time. Cheaper if a defense agency feed is available. | K (criterion), M (sourcing) | See Q-3 below. |
| AC-8.2 | Imagery <2 years old where possible | <2 years | **Recommend tighter**: <12 months for stable rear sectors, <6 months for active-conflict sectors (post-2022 Ukraine landscape change is rapid — Kakhovka dam destruction is a documented example, S28). | Operational | M | Add freshness-by-sector. |
| AC-8.3 | Satellite imagery pre-loaded before flight; preprocessing time uncritical | offline preprocessing | OK; standard. Tile descriptor pre-extraction (SuperPoint / DINOv2 features) is the natural offline step. | None | K | — |
| AC-NEW-1 | **Time-to-first-fix on cold start / mid-flight reboot** | (new) | Recommend **<30 s after companion-computer boot**, given IMU-extrapolated initial position from FC. | Modest | A | Operational requirement, missing today. |
| AC-NEW-2 | **Spoofing-promotion latency** (system asserts its estimate over FC's spoofed GPS) | (new) | Recommend **<3 s** from spoof-detected to first valid GPS_INPUT taking precedence at the EKF. PX4 has 1 s spoof-detect hysteresis (S19); we add 1–2 s of GPS_DENIED system "warm" margin. | Drives FC config (GPS1_TYPE) | A | Security-critical AC. |
| AC-NEW-3 | **Flight-data-recorder** | (new) | All photos (or downsampled), estimates, confidence, IMU traces, MAVLink GPS_INPUT outputs retained at full rate to non-volatile storage. Cap at e.g. 64 GB/flight. | Storage budget | A | Required for post-mission forensics, cert, and ML retraining. |
| AC-NEW-4 | **False-position safety budget** | (new) | P(estimate error > 500 m) < 0.1% per flight; P(error > 1 km) < 0.01% per flight. Validated by Monte-Carlo over IMU-injected datasets + recorded flights. | Drives validation effort | A | Safety AC; missing today, but waypoint/RTL behaviour depends on it. |
| AC-NEW-5 | **Operational environmental envelope** | (new) | Operating temp −20 °C … +50 °C (Ukrainian seasonal range), shock per RTCA DO-160G low-altitude UAV-class, vibration spec to be matched to airframe. | Drives BoM (cooling, mounting) | A | Required for any production deployment. |
| AC-NEW-6 | **Imagery freshness enforcement** | (new) | System rejects (or downgrades confidence on) tiles older than the per-sector freshness threshold (AC-8.2). | Negligible | A | Operational safety. |
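The AC-1.3 note asks for an explicit measurement protocol; a minimal sketch in Python (the function name and the local east/north frame are illustrative, not from the spec):

```python
import math

def drift_at_anchor(vo_extrapolated_en, anchor_fix_en):
    """AC-1.3 drift metric: planar distance (m) between the VO-extrapolated
    frame centre and the satellite-anchored fix, evaluated at the moment
    the anchor fix lands. Inputs are (east, north) metres in a local frame."""
    de = vo_extrapolated_en[0] - anchor_fix_en[0]
    dn = vo_extrapolated_en[1] - anchor_fix_en[1]
    return math.hypot(de, dn)

# Hypothetical leg: VO says (512, 1034), the anchor fix says (498, 1001).
drift = drift_at_anchor((512.0, 1034.0), (498.0, 1001.0))   # ~35.8 m
assert drift < 100.0   # AC-1.3 pass for this leg
```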
---
## 2. Restrictions Assessment
| ID | Restriction (paraphrased) | Our Value | Researched / Recommended Value | Cost / Timeline Impact | Status |
|----|----------------------------|-----------|--------------------------------|------------------------|--------|
| R-1 | Fixed-wing UAV only | yes | OK; matches public benchmark domain (UAV-VisLoc, AerialVL — S01, S03). | None | K |
| R-2 | Downward camera, fixed, **not autostabilized** | yes | OK; this is the hard mode (no gimbal compensation for plane bank/pitch). The whole pipeline must tolerate up to ±25° roll in turns. | Drives matcher robustness | K |
| R-3 | Eastern/Southern Ukraine ops area | yes | OK; **flag the imagery freshness implication** (active-conflict change rate, F-E7). | Drives sourcing | K |
| R-4 | Altitude ≤1 km AGL | ≤1 km | OK; flag that **GSD at 1 km with 24 mm full-frame is ~24 cm/px** (F-F1) — already finer than 0.3 m/px satellite reference. | None | K |
| R-5 | Mostly sunny weather | yes | OK; partly-cloudy and shadow movement remain a real degrader (F-A5). | Drives evaluation | K |
| R-6 | Sharp turns are exceptional but possible | yes | OK; scope includes the multi-segment stitching path (AC-3.2/3.3). | None | K |
| R-7 | **Photo count up to 3,000 per flight** | 3,000 | **Hard contradiction with 8-hour endurance × 3 fps = 86,400.** Either: (a) interpret as on-disk *retention* budget (sub-sample); (b) per-segment, not per-sortie; (c) stale value. **MUST RESOLVE** — see Q-4. | Sizing-critical | F |
| R-8 | Two cameras: nav (fixed) + AI (gimbal angle + zoom) | yes | OK; user has confirmed AI cam pose is gimbal-only (no airframe IMU fusion). Implication captured in AC-7.1. | Drives object-localization quality | K |
| R-9 | **Nav cam resolution: FullHD to 6252×4168** | range | **13× pixel-count delta** between extremes. Locking the camera spec is required for AC-4.1 / 4.2 sizing. **MUST RESOLVE** — see Q-5. | Sizing-critical | F |
| R-10 | Camera intrinsics known | yes | OK; pre-flight checkerboard or factory cal mandatory (F-F2). | Modest | K |
| R-11 | Camera-to-CC interface TBD (USB / CSI / GigE) | TBD | **Recommend GigE Vision** for 25 MP @ 3 fps (8.4 MB/frame raw → 25 MB/s — comfortable for GigE; tight for USB 3.0 in noisy electrical environments; CSI feasible for embedded camera modules but unusual at this resolution). | Drives BoM | M |
| R-12 | **Satellite imagery limited to Google Maps** | Google Maps | **Hard blocker** — Google Maps Platform ToS explicitly prohibits offline use, image analysis, autonomous-vehicle control, geodata extraction (S22, S23). Bing has the same prohibition (S24). **Must change to license-cleared provider** (Maxar Vivid / Airbus Pléiades Neo / commissioned tasking / government feed). See Q-3. | $$ + time | **R / M** |
| R-13 | Pre-loaded satellite imagery on companion | yes | OK; persistent cross-flight cache as user requested. | None | K |
| R-14 | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared, 25 W) | yes | OK; envelope confirmed (S14, S15). Active cooling required for 8-hour duty (F-D2). | Drives BoM | K |
| R-15 | JetPack (Ubuntu) + CUDA + TensorRT | yes | OK; **lock JetPack 6.2** to get Super Mode (S14). | None | K |
| R-16 | Onboard storage TBD | TBD | **Recommend NVMe ≥256 GB** (10 GB tile cache + 64 GB flight-data-recorder buffer + headroom). | Modest | M |
| R-17 | Sustained GPU load may throttle | yes | OK; design constraint, not a target. Active cooling + 25 W power mode + duty-cycled compute (skip-allowed) all help. | Drives BoM + thermal design | K |
| R-18 | Lots of IMU data via FC | yes (production) | OK; **for dev/test the user has confirmed public-dataset path**. Recommended: AerialVL (primary, S03), UAV-VisLoc (visual-only validation, S01), MidAir (synthetic IMU augmentation, S30), plus the user's 65 sample photos for sanity. Plan one real test flight with IMU log capture before V&V. | Modest | K |
| R-19 | MAVLink + MAVSDK to FC | yes | **MAVSDK has no native GPS_INPUT** (S18). Use **pymavlink** for the GPS_INPUT line, MAVSDK for general telemetry. ArduPilot is the lower-friction FC target. | Drives library mix | M |
| R-20 | Output is GPS_INPUT (MAVLink) | yes | OK with the library mix above. | None | K |
| R-21 | GCS telemetry bandwidth-limited | yes | OK; high-frequency content (per-frame estimates) over MAVLink at 57600/115200 baud is tight — recommend down-sampling to 1–2 Hz on the telemetry link, full rate over local TCP for bench testing. | Drives protocol design | K |
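The R-4 GSD figure follows from the pinhole relation GSD = altitude × pixel pitch / focal length; a sketch under the stated assumptions (24 mm lens on a 36 mm-wide full-frame sensor, 6252 px across):

```python
def gsd_m_per_px(altitude_m, focal_mm, sensor_width_mm, width_px):
    """Ground sample distance for a nadir shot over flat terrain:
    GSD = altitude * pixel_pitch / focal_length."""
    pixel_pitch_mm = sensor_width_mm / width_px
    return altitude_m * pixel_pitch_mm / focal_mm

gsd = gsd_m_per_px(1000, 24, 36, 6252)   # ~0.24 m/px at 1 km AGL
```

This is already finer than the 0.3 m/px satellite reference, which is why the airborne side, not the reference side, sets the resolution floor.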
---
## 3. Key findings (cross-cutting)
1. **The single biggest risk in the project is satellite-imagery sourcing.** Switching off Google Maps is mandatory; the replacement decision (paid commercial vs. government feed vs. public agency partnership) drives the budget, the freshness, and the legal posture. Recommend Maxar Vivid 30 cm or Airbus Pléiades Neo 30 cm as the working assumption; engage procurement early.
2. **Per-frame compute will fit Jetson Orin Nano Super at 25 W only with a downsampled working resolution and pre-cached satellite descriptors.** Full 6200 × 4100 matching at 3 fps within 400 ms is not realistic. We will run matchers at ~1 Mpx and reserve the full-res image only for offline forensics + AI-cam ROI.
3. **The matcher stack landscape in 2024–2026 is healthy**: LightGlue / SuperPoint with TensorRT (mature on Orin), XFeat (fastest, best for embedded), MASt3R (best cross-view, but heavy). A two-tier pipeline — XFeat for VO frame-to-frame, LightGlue/MASt3R for satellite anchoring — is the most defensible architecture.
4. **VPR is core**, not optional: sharp turns and disconnected segments demand a global retrieval step over the satellite tile cache. AnyLoc (DINOv2 + VLAD, training-free) is the pragmatic baseline; MixVPR is the lightweight option.
5. **Confidence scoring must be quantitative**, not just "high/low" — the flight controller and GCS need a numeric to decide when to trust GPS_INPUT versus IMU dead reckoning.
6. **AI-camera object localization at the AC's stated accuracy is not achievable with gimbal-only pose** in turning flight. Either restate the AC, or expand scope to fuse airframe IMU into the AI-cam pose.
7. **MAVSDK + GPS_INPUT does not exist.** Plan for a hybrid pymavlink/MAVSDK approach, and prefer ArduPilot as the autopilot target unless a strong PX4 reason exists.
8. **No public dataset perfectly matches our mission profile.** AerialVL is the closest fixed-wing real-world dataset; we should plan one or two of our own test flights with IMU log capture for V&V before claiming AC compliance.
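Finding 7's hybrid pymavlink path depends on GPS_INPUT's units; the sketch below shows the scaling the message expects per the MAVLink common dialect (lat/lon as int32 degE7, metres elsewhere). The helper and its values are illustrative, not a pymavlink API:

```python
def gps_input_fields(lat_deg, lon_deg, alt_m, hacc_m, sats):
    """Scale a WGS84 estimate into GPS_INPUT units: lat/lon in degE7
    (int32), altitude in metres AMSL (float), accuracies in metres.
    With pymavlink these values feed mav.gps_input_send(...);
    fix_type 3 = 3D fix."""
    return {
        "lat": int(round(lat_deg * 1e7)),
        "lon": int(round(lon_deg * 1e7)),
        "alt": float(alt_m),
        "fix_type": 3,
        "horiz_accuracy": float(hacc_m),
        "satellites_visible": sats,
    }

# Hypothetical fix near Dnipro at 1012 m AMSL, 25 m horizontal accuracy.
msg = gps_input_fields(48.4500123, 35.0460456, 1012.0, 25.0, 12)
# msg["lat"] == 484500123
```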
---
## 4. Sources
See `_docs/00_research/01_source_registry.md` (39 sources, mostly L1/L2). Key L1: Google Maps Platform ToS (S22/S23), Bing ToS (S24), NVIDIA Jetson Linux developer guide (S14/S15), ArduPilot GPSInput docs (S16/S17), PX4 spoofing PRs (S19/S20). Key L2: UAV-VisLoc (S01), AerialVL (S03), AnyVisLoc (S02), AnyLoc (S05), XFeat (S08), MASt3R (S09), VPR aerial survey (S04).
Each fact in this document is traceable back to one or more of those sources via `_docs/00_research/02_fact_cards.md`.
---
## 5. BLOCKING — decisions required before Phase 2
These are the questions the assessment cannot resolve from research alone. **Phase 2 (the solution draft) cannot start until these are answered.**
### Q-1 — Autopilot target (drives AC-4.3, R-19, R-20)
PX4 vs ArduPilot for the flight controller has direct consequences for the GPS_INPUT pipeline. ArduPilot (`GPS1_TYPE=14`) is the lower-friction path; PX4 forces a VISION_POSITION_ESTIMATE workaround.
**Choose A / B / C:**
- A) **ArduPilot only** (lowest friction; matches MAVProxy GPSInput reference impl).
- B) **PX4 only** (must use VISION_POSITION_ESTIMATE, more EKF tuning).
- C) **Both** (more work, but maximises addressable airframe market).
### Q-2 — AI-camera pose source (drives AC-7.1)
The AC says object localization should be "consistent with frame-center accuracy". With gimbal-only pose, this is not physically achievable in turning flight.
**Choose A / B / C:**
- A) **Relax the AC** to "consistent in level flight (<5° bank); degraded by airframe attitude in maneuvering flight" — keeps scope as agreed.
- B) **Expand scope** to fuse airframe IMU (roll/pitch/yaw) into the AI-cam pose at the moment of capture, restoring the original AC.
- C) **Defer object localization** entirely (AC-7.x removed from this cycle; future work).
### Q-3 — Satellite imagery sourcing (drives R-12, AC-8.1, AC-8.2)
Google Maps is not a legally usable source.
**Choose A / B / C / D:**
- A) **Maxar Vivid 30 cm** (standard offering, $25–32 / km² archive; ~$10–12 k for 400 km² mission area; explicit defense licensing path).
- B) **Airbus Pléiades Neo 30 cm** (€5–8.50 / km² volume tier; OneAtlas tasking).
- C) **Government / agency feed** (free or subsidised — requires user to identify the agency and partnership channel).
- D) **Park the question**, deliver the system imagery-source-agnostic with a documented offline-tile-cache interface; user procures tiles separately.
### Q-4 — Photo count per flight (drives R-7, AC-NEW-3, sizing)
"Up to 3000 photos per flight" contradicts 8 h × 3 fps.
**Choose A / B / C:**
- A) "3000" is the **on-disk retention** budget — system processes 86k frames live, retains every Nth in the flight-data-recorder.
- B) "3000" is the **per mission segment** count, not per sortie — typical mission segments are ~17 minutes.
- C) "3000" is **stale** and should be replaced with "all frames captured during the sortie" (no per-flight cap, sized by storage AC-NEW-3).
### Q-5 — Nav-camera spec lock (drives R-9, AC-4.1, AC-4.2, R-11)
"FullHD to 6252×4168" is too wide for compute / storage sizing.
**Choose A / B / C / D:**
- A) Lock at **6252×4168** (worst case for sizing; safest).
- B) Lock at a mid-range **~12 MP** (e.g. 4000×3000) — balanced for compute and detail.
- C) Lock at **FullHD (1920×1080)** — easiest compute, but fewest features per frame.
- D) **Pick a specific camera model** (and we research focal length / lens distortion / interface).
### Q-6 — New AC additions (drives AC-NEW-1 through AC-NEW-6)
Six new AC are recommended (cold-start TTFF, spoofing-promotion latency, flight-data-recorder, false-position safety budget, environmental envelope, imagery freshness). Each addresses a real gap in the current AC.
**Choose A / B:**
- A) **Adopt all six** as written (recommended).
- B) **Adopt selectively** — user picks which to keep (we'll iterate inline).
### Q-7 — AC-1.2 hard floor (drives AC-1.2 and pass/fail gate)
Recommend a hard floor of **50% within 20 m** alongside the existing **60% within 20 m** stretch target.
**Choose A / B:**
- A) **Adopt** the 50% hard floor + 60% stretch target.
- B) **Keep** 60%@20m as the only gate (research suggests this is occasionally infeasible in production conditions — risk of a non-shippable system).
### Q-8 — Failsafe `N` value (drives AC-5.2)
Recommend **`N = 3 s`** (with rationale in fact card F-J1).
**Choose A / B / C:**
- A) **Adopt N = 3 s.**
- B) **N = 5 s** (more tolerant; longer pre-fallback dead-reckoning by FC).
- C) **Tune empirically** during integration test (placeholder N = 3 s in spec).
---
## 6. Sign-off — defaults applied
The user opted to skip the structured Q-1…Q-8 prompt and asked to "continue with the information you already have." The recommended values from this assessment have therefore been applied as the working defaults. The user may revise any cell below at any time; revisions propagate into `_docs/00_problem/acceptance_criteria.md` and `restrictions.md`.
| Decision | Applied default | Rationale | Date |
|----------|------------------|-----------|------|
| Q-1 (autopilot) | **ArduPilot only** | Native GPS_INPUT support via `GPS1_TYPE=14`; lowest integration friction (S16, S17). | 2026-04-25 |
| Q-2 (AI-cam pose) | **Relax AC-7.1 to level-flight only** | Gimbal-only pose cannot meet "consistent with frame-center" in turns at 1 km AGL (F-H1). Object-localization scope unchanged otherwise. | 2026-04-25 |
| Q-3 (imagery source) | **Azaion Suite Satellite Service** is the source. Onboard system consumes via an offline tile-cache interface; commercial procurement (Maxar / Airbus / agency) is the Service's concern, not this build's. | User clarified post-blocker: imagery is supplied by a separate Suite component. AC-8.1 / restrictions rewritten accordingly. | 2026-04-25 (revised) |
| Q-4 (photo count) | **Drop the 3000-cap entirely.** The system does **not store raw photos**. Tile cache (~10 GB) and FDR (64 GB) are the storage caps. Tiles are also generated mid-flight (AC-8.4) and uploaded to the Suite Satellite Service on landing. | User clarified post-blocker: "3000" was a legacy Mavic-class operator number; the deduplicated tile is the unit of storage. | 2026-04-25 (revised) |
| Q-5 (nav-camera spec) | **Lock at ~12 MP (4000 × 3000); always downsample for the cross-view matcher.** Specific matcher + downsample target deferred to a dedicated research pass (see solution-draft "Open Research"). | User confirmed downsampling; matcher choice is the highest-leverage decision and deserves its own research pass. | 2026-04-25 (revised) |
| Q-6 (new ACs) | **Adopt all six** (AC-NEW-1…AC-NEW-6). Each AC expanded with rationale, implementation drivers, and validation method in `acceptance_criteria.md`. AC-NEW-3 amended to exclude raw frames (tiles only, per Q-4 revision). | User asked for the new ACs to be enlisted in detail. | 2026-04-25 (revised) |
| Q-7 (AC-1.2 floor) | **50% hard floor only** (60% stretch dropped). | User decision — single hard floor avoids ambiguity around what passes. | 2026-04-25 (revised) |
| Q-8 (failsafe N) | **N = 3 s** | PX4 default GPS-loss delay is 1 s; our pipeline is heavier and includes VO retries; 3 s rides through one sharp turn (F-J1). | 2026-04-25 |
### Outstanding consequences not auto-resolvable
- **Cross-view matcher selection** is now an explicit deferred research item ("Open Research" in `solution_draft01.md`). Plan step starts with this on the table.
- **Specific nav-camera model** (Q-5) is left to the matcher / resolution research pass to recommend with concrete focal-length / interface justification.
- **Real fixed-wing flight at 1 km AGL with synced IMU** does not exist as a public dataset. Internal Mavic footage is the deployment-domain proxy; AerialVL is the primary public benchmark. Synthesizing IMU from Mavic video is **not pursued** (user judgement: dynamics don't transfer from quad-class to fixed-wing-class).
---
# Question Decomposition — AC & Restrictions Assessment
**Mode**: A (Initial Research) — Phase 1 (AC Assessment, BLOCKING)
**Domain**: Onboard GPS-denied UAV navigation via downward-facing camera + satellite reference imagery + VO/IMU on Jetson Orin Nano Super.
**Question type**: Multi-criterion feasibility + technology positioning + benchmark validation. High-novelty intersection (defense-grade UAV CV/SLAM + low-power edge inference + active-conflict region operational constraints), so timeliness is high — prefer 2023–2026 sources.
## Project context (locked-in user answers)
| # | Item | Value |
|---|------|-------|
| C1 | Fresh research run; ignore deleted prior artifacts | yes |
| C2 | Operational area per mission | 150 km² mission box + 50 km × 1 km corridor; ~10 GB satellite tile cache; persistent across flights |
| C3 | Flight envelope | Fixed-wing, 1 km AGL ceiling, ~60 km/h cruise, up to 8 h endurance, sunny weather, eastern/southern Ukraine |
| C4 | GCS | QGroundControl over MAVLink/MAVSDK |
| C5 | AI camera pose | Only gimbal angle + zoom (no airframe IMU fusion onto AI cam frame) |
| C6 | Latency budget | <400 ms p95 end-to-end; frame skipping allowed under load |
| C7 | IMU dev/test data | Use public UAV datasets — research and recommend |
| C8 | Onboard compute | Jetson Orin Nano Super (67 TOPS sparse INT8 / 33 TOPS dense, 8 GB shared LPDDR5, 25 W TDP) |
| C9 | Output channel | MAVLink GPS_INPUT to flight controller; telemetry to GCS for situational awareness |
## Sub-questions (drives Phase 1 web research)
### A. Position accuracy realism
- A1. Hybrid VO + satellite-anchored geolocalization accuracy on fixed-wing UAVs at ~1 km AGL — what's state-of-the-art (CIRCLE, AnyLoc, UAV-VisLoc benchmark, OpenIBL, AerialVL, GPS-denied papers 2023–2026)?
- A2. Are AC values "80% within 50 m, 60% within 20 m" achievable with non-stabilized monocular nadir camera + Google Maps tile reference?
- A3. Monocular VO drift rates (m per 100 m travelled) for aerial imagery — feasibility of <100 m cumulative drift between satellite anchors.
- A4. Confidence-score schemes for visual geolocalization (covariance, top-K retrieval similarity, photometric consistency).
### B. Image registration & feature matching
- B1. Registration rate >95% for non-overlapping flight + viewpoint changes — SOTA matchers (LoFTR, LightGlue+SuperPoint, RoMa, OmniGlue, MASt3R, XFeat) on aerial-vs-satellite domain gap.
- B2. Mean Reprojection Error <1.0 px — typical for aerial homography vs full PnP at 1 km AGL?
- B3. Cross-modality matching (off-nadir aerial photo vs ortho satellite tile) — what works in 2024–2026 literature, what fails?
### C. Resilience — sharp turns, off-nadir, re-localization
- C1. Place recognition / tile retrieval for re-localization after sharp turn (no overlap) — NetVLAD, AnyLoc, CosPlace, EigenPlaces, MixVPR.
- C2. Aerial pose recovery under up to 70° heading change and 350 m position outlier — practical pipelines.
- C3. Multi-segment trajectory stitching (disconnected SLAM sessions) — pose-graph relocalization via global descriptor + RANSAC.
### D. Onboard real-time performance on Jetson Orin Nano Super
- D1. Memory & compute envelope of LightGlue / SuperPoint / LoFTR / RoMa / XFeat at 6200×4100 → typical downsampled resolution; can the matcher + VO run within ~400 ms on Jetson Orin Nano Super (67 TOPS sparse INT8)?
- D2. TensorRT-accelerated implementations available for the 2025-class matchers?
- D3. Hot-cache satellite tile lookup (precomputed descriptors) for ~10 GB tile budget — index size and lookup latency.
- D4. Concurrent VO + tile registration scheduling under 8 GB shared CPU/GPU memory.
- D5. Sustained-load thermal throttle threshold of Jetson Orin Nano Super (25 W mode) and effective duty cycle for 8-hour flight.
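The skip-allowed behaviour referenced in D4/D5 (and AC-4.1) can be captured by a simple admission rule; an illustrative sketch, with thresholds that are assumptions rather than measured Jetson numbers:

```python
from collections import deque

class FrameAdmission:
    """Drop incoming frames while the pipeline is over budget so the
    p95 end-to-end latency stays under the 400 ms AC. Latencies are
    reported by the pipeline; the rule is deliberately simple."""
    def __init__(self, budget_ms=400.0, window=100):
        self.budget_ms = budget_ms
        self.latencies = deque(maxlen=window)
        self.busy = False

    def record(self, latency_ms):
        """Call when a frame finishes processing."""
        self.latencies.append(latency_ms)
        self.busy = False

    def p95(self):
        if not self.latencies:
            return 0.0
        ordered = sorted(self.latencies)
        return ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]

    def admit(self):
        # Skip if the previous frame is still in flight or p95 is over budget.
        if self.busy or self.p95() > self.budget_ms:
            return False
        self.busy = True
        return True
```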
### E. Satellite imagery — sourcing, freshness, legality, preprocessing
- E1. Google Maps satellite tile usage in defense / offline UAV context — terms-of-service status; alternatives.
- E2. Sub-meter-resolution providers (Maxar, Airbus Pleiades, Planet SkySat, Capella, ICEYE, Vexcel, Maxar Vivid) — pricing tiers, license for tactical reuse, freshness over Ukraine.
- E3. Free / open alternatives: Sentinel-2 (10 m), USGS, Mapbox, Bing — usable as fallback at 1 km AGL?
- E4. Pre-flight tile preprocessing (descriptor extraction, MBTiles packaging, persistent on-disk cache between flights) — best practice.
- E5. Imagery age — how stale before registration fails for active-conflict regions (Ukraine 2022+ rapid landscape change)?
### F. Camera, optics, sensor model
- F1. 6252×4168 sensor at 1 km AGL — typical GSD per pixel for the implied focal lengths of fixed, downward-looking sUAS payloads.
- F2. Camera intrinsics calibration — pre-flight checkerboard vs factory cal vs self-calibration.
- F3. Rolling-shutter compensation for ~3 fps mid-altitude photogrammetry.
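F1 can be sanity-checked with the GSD formula from S29; the payload numbers below (full-frame 36 mm sensor, 24 mm lens) are assumptions for illustration only:

```python
def gsd_m_per_px(altitude_m: float, sensor_width_m: float,
                 focal_length_m: float, image_width_px: int) -> float:
    """Ground sample distance for a nadir camera over flat terrain (S29)."""
    return (altitude_m * sensor_width_m) / (focal_length_m * image_width_px)

# Assumed payload: full-frame 36 mm sensor, 24 mm lens, 6252 px wide, 1 km AGL
print(round(gsd_m_per_px(1000, 0.036, 0.024, 6252), 2))  # 0.24 (m/px)
```

This matches the ~24 cm/px figure S29 validates, and lands close to the 0.3 m/px satellite reference used by UAV-VisLoc (S01), which is encouraging for scale-ratio-sensitive matchers.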
### G. MAVLink / MAVSDK / flight controller integration
- G1. MAVLink GPS_INPUT message — fields, supported autopilots (PX4 vs ArduPilot vs Cube), expected rate, rejection criteria.
- G2. MAVSDK on Jetson Orin Nano Super (JetPack 6.x / Ubuntu 22.04) — versions, async IO patterns.
- G3. QGroundControl integration — re-localization request UI / NAMED_VALUE / STATUSTEXT / custom message conventions.
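Since S18 notes MAVSDK has no native GPS_INPUT support, G1 in practice means pymavlink. A minimal sketch of the field scaling, assuming ArduPilot with `GPS1_TYPE=14` (S16); all numeric values are placeholders, and the connection string is hypothetical:

```python
# Sketch: package a visual-localization fix as MAVLink GPS_INPUT fields.
# Scaling follows the MAVLink spec: lat/lon in degE7, accuracies in metres.
import time

def gps_input_fields(lat_deg, lon_deg, alt_m, horiz_acc_m, sats=12):
    """Scale a position estimate into GPS_INPUT units (3D fix assumed)."""
    return dict(
        time_usec=int(time.time() * 1e6),
        gps_id=0,
        ignore_flags=0,          # or OR-ed GPS_INPUT_IGNORE_FLAG_* constants
        time_week_ms=0, time_week=0,
        fix_type=3,              # 3 = 3D fix
        lat=int(lat_deg * 1e7), lon=int(lon_deg * 1e7), alt=alt_m,
        hdop=1.0, vdop=1.5,
        vn=0.0, ve=0.0, vd=0.0,  # NED velocity, m/s
        speed_accuracy=1.0, horiz_accuracy=horiz_acc_m, vert_accuracy=10.0,
        satellites_visible=sats,
    )

fields = gps_input_fields(48.5, 35.0, 120.0, horiz_acc_m=25.0)
# Actual injection requires a live link, e.g.:
#   from pymavlink import mavutil
#   master = mavutil.mavlink_connection("udpout:127.0.0.1:14550")
#   master.mav.gps_input_send(**fields)
print(fields["lat"])  # 485000000
```

The autopilot-side rejection criteria (G1) then hinge on how plausible these hdop/accuracy values look to the EKF, which is exactly what the research question needs to pin down.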
### H. Object localization (AI camera)
- H1. Trigonometric ground point intersection accuracy under unknown airframe attitude (gimbal-angle-only) — error budget analysis at 1 km AGL.
- H2. Flat-terrain assumption error contribution over eastern/southern Ukraine (relief amplitude, riverbanks, urban areas).
- H3. Best-practice for graceful degradation when attitude is missing.
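The error budget in H1 falls out of the flat-terrain boresight intersection geometry. A sketch, with the 1°-attitude-error sensitivity computed rather than assumed (angles and altitude are illustrative):

```python
import math

def ground_point_offset(alt_agl_m, depression_deg, heading_deg):
    """Flat-terrain intersection of the camera boresight with the ground.
    depression_deg: angle below the horizon (90 = straight down)."""
    ground_range = alt_agl_m / math.tan(math.radians(depression_deg))
    north = ground_range * math.cos(math.radians(heading_deg))
    east = ground_range * math.sin(math.radians(heading_deg))
    return north, east

# Sensitivity at 1 km AGL: 1 deg of unmodelled airframe pitch near a
# 30 deg depression angle shifts the computed ground point by ~68 m.
d30 = ground_point_offset(1000, 30.0, 0.0)[0]
d31 = ground_point_offset(1000, 31.0, 0.0)[0]
print(round(d30 - d31))  # 68
```

The same function makes the H2 terrain term explicit: any relief unaccounted for in `alt_agl_m` scales the ground range by the cotangent of the depression angle.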
### I. Hardware envelope, power, thermals
- I1. Jetson Orin Nano Super 25 W mode sustained load — 8-hour fixed-wing power budget (battery + solar?), cooling solutions for 25 W onboard.
- I2. Storage: persistent ~10 GB tile cache + flight logs on Jetson — recommended SSD/NVMe.
### J. Failsafe & resilience
- J1. Reasonable failsafe timeout `N` for "no estimate produced" before flight controller falls back to IMU-only — typical practitioner values.
- J2. Companion computer reboot mid-flight — recovery patterns from PX4/ArduPilot field reports.
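J1's timeout `N` is ultimately enforced on the companion side by simply going silent, so the flight controller's own failsafe takes over. A minimal watchdog sketch (the 5 s default is a placeholder, pending the J1 research answer):

```python
import time

class EstimateWatchdog:
    """Stop injecting GPS_INPUT once the localizer has been silent for
    timeout_s; the flight controller then falls back to inertial-only."""

    def __init__(self, timeout_s: float = 5.0):  # N is TBD (J1)
        self.timeout_s = timeout_s
        self._last = None

    def estimate_received(self):
        self._last = time.monotonic()  # monotonic: immune to clock steps

    def should_inject(self) -> bool:
        return (self._last is not None
                and (time.monotonic() - self._last) < self.timeout_s)

wd = EstimateWatchdog(timeout_s=0.05)   # shrunk for the demo
assert not wd.should_inject()           # nothing received yet
wd.estimate_received()
assert wd.should_inject()
time.sleep(0.06)
print(wd.should_inject())  # False
```

Going silent (rather than sending a degraded fix) matters because ArduPilot blends GPS sources by reported quality (S33): a stale-but-confident GPS_INPUT stream can outvote the IMU.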
### K. Public datasets for VO/IMU dev & test
- K1. Aerial UAV datasets with synchronized IMU + downward camera + GPS ground truth — list and assess (UAV-VisLoc, AerialVL, MidAir, EuRoC MAV, NPU Drone, USC, Senseable City Lab, AERIAL-D, GeoText, AmsterTime, VPAir, DenseUAV).
- K2. Are there datasets covering eastern European agriculture / mixed terrain at altitudes 300–1000 m? If not, what's the closest analogue?
### L. Acceptance criteria gaps (potential missing AC)
- L1. Operational temperature, vibration, shock — military/UAV environmental standards (MIL-STD-810, RTCA DO-160 lite).
- L2. Time-to-first-fix on cold-start (boot to first valid GPS_INPUT message).
- L3. Maximum tolerable spoofing detection latency (system promotes its own estimate over flight controller GPS) — security AC.
- L4. Logging / black-box requirement for post-mission forensics.
- L5. Safety AC: false-position rate budget (geolocation off by >X km) — dangerous for waypoint/RTL behavior.
### M. Restriction soundness
- M1. Photo count "up to 3000 per flight" vs "8 hour flight × 3 fps" → 86,400 photos. **Hard contradiction** — needs user resolution.
- M2. Camera "FullHD to 6252×4168" — wide range; processing must accommodate worst case.
- M3. "Eastern/southern Ukraine, mostly sunny" — operational implications: shadow direction, season, vegetation cycle (seasonal mismatch with stale satellite imagery).
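The M1 contradiction is pure arithmetic, made explicit here so the user-resolution question is unambiguous:

```python
# Stated cap vs the frame count implied by the restrictions themselves
flight_hours, fps, stated_cap = 8, 3, 3000
implied_photos = flight_hours * 3600 * fps
print(implied_photos, implied_photos / stated_cap)  # 86400 28.8
```

The cap is exceeded by a factor of ~29, so either the 3000-photo limit applies to *stored* (not processed) frames, or the effective frame rate / flight duration must come down.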
## Output
Each sub-question feeds into:
1. `01_source_registry.md` — sources consulted and tier
2. `02_fact_cards.md` — facts with citations
3. The Phase 1 deliverable: `00_ac_assessment.md` (BLOCKING gate)
# Source Registry — Phase 1 (AC & Restrictions Assessment)
Tier legend: L1 = official spec / standard / reference manual; L2 = peer-reviewed paper or tool from a vendor / SOTA author; L3 = vendor docs, popular OSS repo, expert blog; L4 = forum post, secondary blog.
| ID | Tier | Title | URL | Used for |
|----|------|-------|-----|----------|
| S01 | L2 | Xu et al., *UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization* (arXiv 2405.11936, May 2024) | https://arxiv.org/html/2405.11936v1 | Fixed-wing UAV visual localization benchmark; 405–840 m altitudes; 0.3 m/px Google Earth satellite reference |
| S02 | L2 | Xu et al., *Exploring the best way for UAV visual localization under Low-altitude Multi-view Observation Condition: a Benchmark* (AnyVisLoc, arXiv 2503.10692, 2025) | https://arxiv.org/html/2503.10692v1 | SOTA recall@Xm numbers; 74.1% @ 5 m at 30–300 m altitude |
| S03 | L2 | He et al., *AerialVL: A Dataset, Baseline and Algorithm Framework for Aerial-Based Visual Localization With Reference Map* (RA-L 2024) | https://ieeexplore.ieee.org/document/10632587 ; https://github.com/hmf21/AerialVL | Fixed-wing aerial VPR + visual alignment + VO benchmark; 70 km of trajectories; FLIR + gimbal + NovAtel GNSS 1.5 m RMS |
| S04 | L2 | Schmidt-Salzmann et al., *Visual Place Recognition for Aerial Imagery: A Survey* (arXiv 2406.00885, 2024) + aero-vloc benchmark | https://arxiv.org/abs/2406.00885 ; https://github.com/prime-slam/aero-vloc | VPR methods (AnyLoc, CosPlace, EigenPlaces, MixVPR, NetVLAD, SALAD, SelaVPR) for aerial domain |
| S05 | L2 | Keetha et al., *AnyLoc: Towards Universal Visual Place Recognition* | https://anyloc.github.io/ ; https://github.com/AnyLoc/AnyLoc | DINOv2 + VLAD VPR, training-free, strong on aerial cross-domain |
| S06 | L2 | Ali-bey et al., *MixVPR: Feature Mixing for Visual Place Recognition* (arXiv 2303.02190) | https://arxiv.org/abs/2303.02190 | Lightweight VPR aggregation, 94.6% R@1 Pitts250k |
| S07 | L2 | Lindenberger et al., *LightGlue: Local Feature Matching at Light Speed* | https://github.com/cvg/LightGlue | Real-time matcher (with SuperPoint) |
| S08 | L2 | Potje et al., *XFeat: Accelerated Features for Lightweight Image Matching* (CVPR 2024) | https://openaccess.thecvf.com/content/CVPR2024/papers/Potje_XFeat_Accelerated_Features_for_Lightweight_Image_Matching_CVPR_2024_paper.pdf ; https://github.com/verlab/accelerated_features | 5× faster than LightGlue, designed for embedded; semi-dense option |
| S09 | L2 | Leroy et al., *Grounding Image Matching in 3D with MASt3R* (ECCV 2024) | https://arxiv.org/abs/2406.09756 | Cross-view 3D-grounded matching; +30% AUC on Map-free |
| S10 | L3 | `fettahyildizz/superpoint_lightglue_tensorrt` (TRT 8.5.2.2, dynamic shapes) | https://github.com/fettahyildizz/superpoint_lightglue_tensorrt | TensorRT-ready SuperPoint+LightGlue C++ |
| S11 | L3 | `yuefanhao/SuperPoint-LightGlue-TensorRT` | https://github.com/yuefanhao/SuperPoint-LightGlue-TensorRT | RTX3080 baseline: SP 0.95 ms + LG 2.54 ms @ 320×240 = 286 FPS |
| S12 | L3 | `qdLMF/LightGlue-with-FlashAttentionV2-TensorRT` (Jetson Orin NX, CUTLASS plugin) | https://github.com/qdLMF/LightGlue-with-FlashAttentionV2-TensorRT | Jetson Orin-class deployment proof |
| S13 | L3 | `fabio-sim/LightGlue-ONNX` (FP8) | https://github.com/fabio-sim/LightGlue-ONNX | ONNX/TRT path for matchers |
| S14 | L1 | NVIDIA — *JetPack 6.2 brings Super Mode to Jetson Orin Nano and Orin NX* | https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/ | Confirms 67 TOPS sparse INT8, 15/25 W/MAXN SUPER modes, 8 GB shared LPDDR5 |
| S15 | L1 | NVIDIA — *Jetson Orin Nano / Orin NX / AGX Orin Power & Performance* | https://docs.nvidia.com/jetson/archives/r35.6.1/DeveloperGuide/SD/PlatformPowerAndPerformance/ | Power-mode specifics, throttling behaviour |
| S16 | L1 | ArduPilot — *MAVProxy GPSInput module* | https://ardupilot.org/mavproxy/docs/modules/GPSInput.html | GPS1_TYPE=14 (MAVLink); GPS_INPUT message fields |
| S17 | L1 | ArduPilot — *MAVProxy GPSInput source* | https://github.com/ArduPilot/MAVProxy/blob/master/MAVProxy/modules/mavproxy_GPSInput.py | Reference impl for GPS_INPUT injection |
| S18 | L3 | mavlink/MAVSDK-Python issue #320 — *Input external gps through mavsdk* | https://github.com/mavlink/MAVSDK-Python/issues/320 | MAVSDK has no native GPS_INPUT support — must use pymavlink |
| S19 | L1 | PX4 PR #21244, #23366 — *Add GPS spoofing state* / *EKF2 spoofing GPS check* | https://github.com/PX4/PX4-Autopilot/pull/21244 ; https://github.com/PX4/PX4-Autopilot/pull/23366 | PX4 spoofing flag, ~1 s hysteresis; EKF2 disables GNSS fusion when spoofed |
| S20 | L1 | PX4 PR #23346 — *EKF2 fix timeout after gps failure* | https://github.com/PX4/PX4-Autopilot/pull/23346 | Dead-reckoning timeout logic |
| S21 | L1 | PX4 issue #23970 — *COM_POS_FS_DELAY does not take effect* | https://github.com/PX4/PX4-Autopilot/issues/23970 | Failsafe delay parameter behaviour (default 1 s) |
| S22 | L1 | Google — *Map Tiles API Policies* | https://developers.google.com/maps/documentation/tile/policies | Explicit prohibition: "Offline uses … Image analysis, Machine interpretation, Object detection or identification, Geodata extraction or resale" |
| S23 | L1 | Google — *Maps Platform Terms of Service* | https://developers.google.com/maps/terms | Prohibits use "with any products, systems, or applications for … any systems or functions for automatic or autonomous control of vehicle behavior" |
| S24 | L1 | Microsoft — *Bing Maps Terms of Use (April 2024)* | https://www.bingmapsportal.com/terms/TermsApril2024 | Bing tiles cannot be cached/stored offline; tile URLs are not stable |
| S25 | L1 | Maxar/Vantor — *Vivid Mosaic 30 cm Basemaps* | https://maxar.com/precision ; https://developers.maxar.com/docs/ordering/guides/vivid-standard-30 | 30 cm global mosaic (135 M km²), 15 cm urban mosaic (7 M km²), AI change detection refresh; ~$25–32/km² archive |
| S26 | L1 | Airbus — *Order Pléiades Neo (30 cm)* | https://space-solutions.airbus.com/imagery/how-to-order-imagery-and-data/how-to-order-pleiades-neo/ | Pléiades Neo 30 cm, OneAtlas tasking; ~€58.50/km² volume tier |
| S27 | L1 | Planet Community — *Commercial imagery pricing* | https://community.planet.com/advanced-analysis-apis-81/commercial-imagery-pricing-4926 | SkySat / PlanetScope pricing tiers |
| S28 | L3 | EOX — *Sentinel-2 cloudless (s2maps.eu)* | https://s2maps.com/ | Free 10 m/px global mosaic; updated annually; CC-BY-NC for non-commercial |
| S29 | L3 | UAV Coach — *GSD calculator* | https://uavcoach.com/gsd-calculator/ | GSD = (alt × sensor_w) / (focal × image_w); validates ~24 cm/px at 1 km AGL with full-frame 24 mm |
| S30 | L2 | Mid-Air dataset (synthetic, quadcopter, IMU + GPS + 420k frames) | https://midair.ulg.ac.be/ | Training-time augmentation candidate (synthetic) |
| S31 | L2 | AgriLiRa4D (LiDAR + 4D radar + IMU, 518 m AGL agriculture) | https://arxiv.org/html/2512.01753v1 | Out of altitude band — only useful for SLAM regression baselines |
| S32 | L2 | Survey & comparison of ORB-SLAM3 / VINS-Fusion / DROID-SLAM / RTAB-Map | https://article.isarpublisher.com/viewArticle/Numerical-Evaluation-and-Comparative-Analysis-of-Visual-Inertial-SLAM-Algorithms-ORB-SLAM3-VINS-Fusion-DROID-SLAM-and-RTAB-Map | VIO drift baselines |
| S33 | L3 | nicholasaleks/Damn-Vulnerable-Drone wiki — *GPS Data Injection* | https://github.com/nicholasaleks/Damn-Vulnerable-Drone/wiki/GPS-Data-Injection | Confirms ArduPilot blends GPS sources by quality; security implications of GPS_INPUT |
| S34 | L1 | QGroundControl — *StatusTextHandler / RequestMessageState API* | https://api.qgroundcontrol.com/master/classStatusTextHandler.html | STATUSTEXT pipeline used for companion-computer comms |
| S35 | L4 | mavlink/qgroundcontrol issue #7599 — *Display Companion Status on QGC* | https://github.com/mavlink/qgroundcontrol/issues/7599 | Companion-computer status display gap; ONBOARD_COMPUTER_STATUS workflow |
| S36 | L2 | Bian et al., *ViewBridge: Revisiting Cross-View Localization from Image Matching* (arXiv 2508.10716, 2025) | https://arxiv.org/abs/2508.10716 | CVFM benchmark, 32,509 cross-view pairs, BEV projection + similarity refinement |
| S37 | L2 | OrthoLoC (2025) — UAV-to-orthographic 6-DoF localization with AdHoP refinement | (referenced in cross-view SOTA results) | Compatible with any matcher; ↑95% match quality, ↓63% translation error |
| S38 | L3 | LAND INFO — *Satellite imagery pricing* | https://www.landinfo.com/satellite-imagery-pricing.html | Cross-vendor reference pricing (WV-3/4 30 cm pansharpened: $25.50–32.50/km² archive vs new) |
| S39 | L2 | Cross-view UAV-satellite matching survey (MDPI Sensors 2024) | https://www.mdpi.com/1424-8220/24/12/3719 | RDS 84.40%, MA@20 83.35% — practical accuracy ceiling for cross-view in mostly-nadir setup |
**Coverage notes**
- Multiple L1/L2 sources for every quantitative AC line (accuracy, MRE, latency, hardware envelope, tile size).
- The Google Maps + Bing Maps offline-prohibition findings have **two L1 sources each** (terms of service + dev-platform AUP).
- The "fixed-wing 1 km AGL with public IMU" gap is a **finding**, not a fixable source — no public dataset matches all four constraints simultaneously.
---
## Mode B (Solution Assessment) sources — appended 2026-04-26
| ID | Tier | Title | URL | Used for |
|----|------|-------|-----|----------|
| S40 | L1 | NVIDIA Jetson AI Lab — *Benchmarks (DINOv2-base-patch14, ViT-base, CLIP-ViT-base)* | https://www.jetson-ai-lab.com/archive/benchmarks.html | Measured Orin Nano Super throughput: DINOv2-base-patch14 = **126 inf/s** (Super), 75 inf/s (original); CLIP-ViT-base/16 = 161 inf/s; ViT-base/16 = 158 inf/s. Real numbers for AnyLoc backbone (W2.a / W9.a). |
| S41 | L1 | ArduPilot — *Non-GPS Position Estimation* (dev docs) | https://ardupilot.org/dev/docs/mavlink-nongps-position-estimation.html | **ODOMETRY is the preferred external-nav method** in ArduPilot (over VISION_POSITION_ESTIMATE and over GPS_INPUT for non-GPS-substitute use). Carries quaternion, velocity, **21-element pos+attitude covariance**, and a `quality` field (-1=failed → 100=best). VISO_QUAL_MIN gates ignored messages. |
| S42 | L1 | ArduPilot PR #19563*VisualOdom: Support ODOMETRY mavlink message* | https://github.com/ArduPilot/ardupilot/pull/19563 | ODOMETRY support landed Dec 2021 for the Plane stack as well as Copter; tested with ModalAI VOXL VIO. |
| S43 | L1 | ArduPilot PR #30080*External nav+gps fix* | https://github.com/ArduPilot/ardupilot/pull/30080 | Active 2025 work on source-switching when running external nav alongside GPS — confirms there are real edge cases when migrating between GPS_INPUT and ODOMETRY mid-flight. Relevant to AC-NEW-2 (spoofing-promotion latency). |
| S44 | L1 | ArduPilot Plane — *MAVLink2 Signing* | https://ardupilot.org/plane/docs/common-MAVLink2-signing.html | Signing is per-link, USB bypasses signing, keys live in FRAM (32-byte secret + timestamp). Configured via Mission Planner. Production-mature in ArduPilot 4.5+ but key-distribution is an operator step. |
| S45 | L3 | mavlink-router issue #436*Stack-based buffer overflow in ConfFile::get_sections* | https://github.com/mavlink-router/mavlink-router/issues/436 | Public, easily-triggered overflow in config-file parsing of mavlink-router. Repo has **no formal security policy / no SECURITY.md**. Direct attack surface for any project that uses mavlink-router on the companion. |
| S46 | L2 | Ali-bey et al., *BoQ: A Place is Worth a Bag of Learnable Queries* (CVPR 2024) | https://arxiv.org/abs/2405.07364 ; https://github.com/amaralibey/bag-of-queries | New VPR SOTA (CVPR 2024); cross-attention over learnable queries; works on CNN + ViT backbones; **outperforms NetVLAD, MixVPR, EigenPlaces** + outperforms two-stage (Patch-NetVLAD, TransVPR, R2Former) at lower cost. DinoV2 results added Nov 2024. |
| S47 | L2 | Izquierdo & Civera, *DINOv2 SALAD: Optimal Transport Aggregation for VPR* (CVPR 2024) | https://serizba.github.io/salad.html ; https://github.com/serizba/salad | DINOv2 + Sinkhorn-based optimal-transport VLAD aggregation; **R@1 75% on MSLS Challenge, 92.2% on MSLS Val, 76% on NordLand**. Already in `aero-vloc` benchmark, so we get an apples-to-apples bench against AnyLoc/MixVPR/EigenPlaces. |
| S48 | L2 | Shen et al., *GIM: Learning Generalizable Image Matcher From Internet Videos* (ICLR 2024 spotlight) | https://arxiv.org/abs/2402.11095 ; https://github.com/xuelunshen/gim ; https://xuelunshen.com/gim | Self-training on 50 h of YouTube videos → **8.4–18.1% relative zero-shot improvement** over LightGlue / RoMa / DKM / LoFTR baselines. ZEB benchmark (zero-shot evaluation). Same architecture, more general training. |
| S49 | L2 | *AerialExtreMatch: A Benchmark for Extreme-View Image Matching and UAV Localization* | https://openreview.net/forum?id=5a5T3IW2B6 | 1.5 M synthetic image-pair benchmark with **32 difficulty levels** (overlap × scale × pitch). Real-world UAV localization subset. Direct measurement of the failure-mode that worries us most. |
| S50 | L2 | *2chADCNN: Template Matching for Season-Changing UAV Aerial Images and Satellite Imagery* (MDPI Drones 2023) | https://www.mdpi.com/2504-446X/7/9/558 | Two-channel CNN trained for cross-season UAV↔satellite matching. Useful both as season-robustness baseline and as a target for the bench-off (does the SOTA matcher really need season-aware training, or do generic GIM/RoMa already win?). |
| S51 | L2 | TartanAir V2 — photorealistic synthetic SLAM dataset | https://tartanair.org/ ; https://tartanair.org/modalities.html | 65 environments, 12-camera rig, IMU + LiDAR + depth + semantic + flow + event modalities, custom camera models (pinhole / fisheye / equirectangular). Photorealistic (AirSim-based). Higher fidelity than MidAir. |
| S52 | L2 | Kim — *Monocular Visual Odometry for Fixed-Wing Small Unmanned Aircraft Systems* (AFIT thesis #2266) | https://scholar.afit.edu/etd/2266 | SOTA monocular VO (SVO, DSO, ORB-SLAM2) tested on real fixed-wing flights — **all three had significant difficulty maintaining localisation**. Confirms VO-only is not viable; the draft's "VO between satellite anchors" framing is the right answer. |
| S53 | L2 | Quan & Cao, *Visual-Inertial Odometry Using High Flying Altitude Drone Datasets* (MDPI Drones 2023) | https://www.mdpi.com/2504-446X/7/1/36 | High-altitude VIO performance numbers for the 300–1000 m AGL band — directly applicable to our 1 km AGL operating band; benchmark baseline for AC-1.3. |
| S54 | L1 | mapproxy issue #196 + maplibre/martin `mbtiles` pool | https://github.com/mapproxy/mapproxy/issues/196 ; https://github.com/maplibre/martin/blob/738c55e9/mbtiles/src/pool.rs | Operational recipe for MBTiles SQLite under concurrent read+write: **WAL mode + connection pool + transaction batching**. Non-WAL MBTiles is the typical reason "MBTiles is slow" complaints exist. |
| S55 | L1 | Python.org — *Free-threaded mode (Python 3.13)* | https://docs.python.org/3.13/howto/free-threading-python.html ; https://py-free-threading.github.io/ | Free-threading is **experimental** in 3.13; has "substantial single-threaded performance hit"; many C extensions don't support it; GIL auto-re-enables on import of non-FT-aware extensions. Not v1-ready. |
| S56 | L2 | Lazarski et al. — *Terrain Analysis in Eastern Ukraine* (Kharkiv-region UAV survey, IEEE 2018) | https://ieeexplore.ieee.org/document/8441556 ; http://www.50northspatial.org/medium-cost-uav-mapping/ | **Eastern-Ukraine relief amplitude ≈ 24 m peak-to-trough** in Kharkiv test areas, with creek + gully (yary) systems. Quantifies the residual error of the flat-Earth ortho assumption (R-Terrain). |
| S57 | L1 | aedelon/mast3r-runtime | https://github.com/aedelon/mast3r-runtime | MASt3R inference runtime: **Jetson Orin support listed as "Planned"**, not implemented. Plus *Speedy MASt3R* paper achieves 91 ms/pair on **A40 GPU** — Jetson Orin Nano Super is roughly 1/30 of A40 throughput, putting MASt3R at ~3 s/pair on our target hardware. |
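S54's recipe (WAL mode + batched transactions) is short enough to sketch directly with the stdlib `sqlite3` module. The schema line is abbreviated — a real MBTiles file also needs a `metadata` table — and the in-memory database is used only to keep the sketch runnable (an in-memory DB silently ignores the WAL pragma):

```python
import sqlite3

def open_mbtiles(path: str) -> sqlite3.Connection:
    """Open an MBTiles SQLite file for concurrent read + write (S54 recipe)."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")    # readers don't block the writer
    conn.execute("PRAGMA synchronous=NORMAL")  # safe under WAL, much faster
    return conn

def write_tiles_batched(conn: sqlite3.Connection, tiles) -> None:
    """One transaction per batch instead of autocommit-per-tile."""
    with conn:  # commits (or rolls back) the whole batch atomically
        conn.executemany(
            "INSERT OR REPLACE INTO tiles "
            "(zoom_level, tile_column, tile_row, tile_data) VALUES (?,?,?,?)",
            tiles)

conn = open_mbtiles(":memory:")
conn.execute("CREATE TABLE tiles (zoom_level INT, tile_column INT, "
             "tile_row INT, tile_data BLOB)")
write_tiles_batched(conn, [(15, 1, 2, b"\x89PNG")])
print(conn.execute("SELECT COUNT(*) FROM tiles").fetchone()[0])  # 1
```

Per S54, skipping WAL mode is the usual root cause of "MBTiles is slow" reports under concurrent access; the pool/batching half of the recipe is what martin's `mbtiles` crate implements.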
---
## Mode B Round 2 (component-replacement deep-dive) — appended 2026-04-26
| ID | Tier | Title | URL | Used for |
|----|------|-------|-----|----------|
| S58 | L2 | Yang et al., *LiteSAM: Lightweight and Robust Feature Matching for Satellite and Aerial Imagery* (Remote Sensing 17(19):3349, MDPI, Oct 2025) | https://www.mdpi.com/2072-4292/17/19/3349 ; https://github.com/boyagesmile/LiteSAM | Purpose-built satellite↔aerial matcher. **6.31 M params (2.4× smaller than EfficientLoFTR's 15.05 M); RMSE@30 = 17.86 m on UAV-VisLoc (beats EfficientLoFTR); 61.98 ms / pair on standard GPU; 497.49 ms / pair on Jetson AGX Orin (= 22.9% / 19.8% faster than EfficientLoFTR-optimized).** Components: TAIFormer (token-aggregation transformer with conv token mixer) + MinGRU dynamic sub-pixel refinement. |
| S59 | L1 | leftfield-geospatial/orthority — Python orthorectification toolkit | https://orthority.readthedocs.io/ ; https://github.com/leftfield-geospatial/orthority ; https://pypi.org/project/orthority/ | **Per-image orthorectification** as a Python library (frame + RPC camera models, GeoTIFF DEM, RPC refinement, pan-sharpening). Successor of `dugalh/simple-ortho`. Pip/conda installable; CLI + API. Direct fit for Component 1b's per-frame ortho step (replaces hand-rolled pinhole-on-DEM code). |
| S60 | L2 | Korovko et al. (NVIDIA), *cuVSLAM: CUDA-Accelerated Visual Odometry and Mapping* (arXiv 2506.04359, Jul 2025) | https://arxiv.org/abs/2506.04359 ; https://github.com/nvidia-isaac/cuVSLAM ; https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_visual_slam | NVIDIA's CUDA-accelerated VSLAM, **explicitly optimized for Jetson edge devices**. Modular front-end (Shi-Tomasi GFTT keypoints + LK pyramidal tracking + NCC consistency check) and back-end (sparse bundle adjustment + pose-graph optimization + loop closure). Supports 1 → 32 cameras, monocular + monocular-depth + stereo + multi-stereo, optional IMU. **<1 % ATE on KITTI; <5 cm on EuRoC**, real-time on Orin platforms. Apache-2.0. Drop-in via `isaac_ros_visual_slam` ROS 2 package. |
| S61 | L2 | Liao, *DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry* (arXiv 2511.12653, Nov 2025) | https://arxiv.org/abs/2511.12653 ; https://arxiv.org/html/2511.12653v1 | Quantization-aware training + CUDA kernel fusion for the DPVO front-end (back-end stays FP32). On RTX 4060: **+52% FPS (TartanAir), +30% FPS (EuRoC), 37–65% lower peak GPU memory**, ATE preserved. Confirms the "deployment gap" framing: **even DPVO-QAT++ is benchmarked on an RTX 4060, NOT on Jetson** — Orin Nano Super extrapolation puts plain DPVO at ≈4–10 FPS (at or below our 10 Hz inference target). |
| S62 | L2 | Murai et al. (Imperial / NVIDIA), *MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors* (CVPR 2025) | https://arxiv.org/abs/2412.12392 ; https://github.com/rmurai0610/MASt3R-SLAM ; https://opencv.org/mast3r-slam/ | Dense monocular SLAM built on MASt3R prior. **15 FPS on a single GPU**; outperforms DROID-SLAM on EuRoC + 7-Scenes; calibration-free. **No Jetson port**; given Speedy MASt3R = 91 ms/pair on A40, MASt3R-SLAM on Orin Nano Super is sub-1-Hz → **infeasible for inline v1 use**. Useful as offline ground-truth oracle or future-track candidate. |
| S63 | L2 | Edstedt et al., *RoMa v2: Harder Better Faster Denser Feature Matching* (arXiv 2511.15706, Nov 2025) | https://arxiv.org/abs/2511.15706 ; https://github.com/Parskatt/romav2 | New SOTA dense matcher: frozen DINOv3 backbone + custom CUDA + predictive covariance + decoupled match-then-refine. Best published pose-estimation accuracy. **Compute footprint is GPU-class**; not a candidate for inline Jetson Orin Nano Super inference, but a plausible offline ceiling reference for the Component-3 bench-off. |
| S64 | L1 | NVIDIA Isaac ROS — *Visual SLAM* (Jetson tutorial + reference implementation) | https://nvidia-ai-iot.github.io/jetson_isaac_ros_visual_slam_tutorial/ ; https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_visual_slam ; https://github.com/bandofpv/VSLAM-UAV | **Reference implementation of GPS-denied UAV with cuVSLAM on Jetson Orin Nano + RealSense D435i + MAVROS + PX4** (Hackster.io / bandofpv). Demonstrates the production-deployable path: cuVSLAM publishes ROS 2 pose; MAVROS converts to MAVLink; FC consumes via VISION_POSITION_ESTIMATE / ODOMETRY. ArduPilot variant exists (sidharthmohannair/ros2-ardupilot-sitl-hardware). |
| S65 | L1 | ArduPilot issue #30076 — *Fixing ExternalNav + GPS* | https://github.com/ArduPilot/ardupilot/issues/30076 | **EKF3 incorrectly fuses GPS data simultaneously when ExtNav is the configured POSXY source** — root cause was a stray `else` branch in `FuseVelPosNED()`. Causes "unstable positions with high variances and reset behavior when position estimates diverge". Documents the **double-fusion-is-not-a-feature** invariant for our hybrid `GPS_INPUT + ODOMETRY` plan. Status: PR landed; pin ArduPilot to a fixed version. |
| S66 | L1 | ArduPilot issue #32506 — *EKF3 Position Down snaps to ODOMETRY Z value when ExternalNav is not configured as POSZ source* | https://github.com/ArduPilot/ardupilot/issues/32506 | Sister bug to #30076: Z-axis snap-to-ODOMETRY when only POSXY uses ExtNav. Reinforces the "**only one horizontal position source active at a time**" architectural invariant — feeding both GPS_INPUT and ODOMETRY for the same axis is a configuration error, not a feature. Has a direct impact on draft02's M-1 conclusion. |
| S67 | L1 | ArduPilot wiki — *EKF Sources* (`common-ekf-sources.rst`) | https://github.com/ArduPilot/ardupilot_wiki/blob/master/common/source/docs/common-ekf-sources.rst | Authoritative spec for `EK3_SRC1_*` / `EK3_SRC2_*` / `EK3_SRC3_*` and runtime source switching via RC aux or MAVLink. Confirms architectural rule: **only one position source per axis at a time**; ExtNav is option 6. |
| S68 | L1 | PX4 PR #22262*EKF2: Error-State Kalman Filter* | https://github.com/PX4/PX4-Autopilot/pull/22262 | Confirms PX4 EKF2 is an **ESKF** (in contrast to ArduPilot's EKF3 which is a classical extended Kalman filter). Real-hardware PX4 testing: ESKF reduces CPU load by ~0.3 % vs total-state EKF on autopilot. Key takeaway: **ArduPilot users (us) cannot swap the FC filter to ESKF** — the FC-side debate is moot. ESKF only matters for any companion-side filter we choose to add. |
| S69 | L2 | Sola, *Quaternion kinematics for the error-state Kalman filter* (arXiv 1711.02508) + Madgwick / Solà / Forster references | https://arxiv.org/abs/1711.02508 | Canonical ESKF treatment: nominal + error-state decomposition, tangent-space covariance, retraction through `Exp/Log` on SO(3) / SE(3). The standard reference for any companion-side ESKF implementation. |
| S70 | L2 | Yu et al., *T-ESKF: Transformed Error-State Kalman Filter for Consistent Visual-Inertial Navigation* (arXiv 2510.23359, Oct 2025) + *Adaptive Covariance and Quaternion-Focused Hybrid ESKF/UKF for VIO* (arXiv 2512.17505, Dec 2025) | https://arxiv.org/abs/2510.23359 ; https://arxiv.org/abs/2512.17505 | 2025 advances on top of ESKF: T-ESKF restores observability consistency under partial-yaw observability; Hybrid ESKF/UKF gains **+49% position / +57% rotation accuracy vs pure ESKF, ~48% cheaper than full SUKF**. Both are research-track; for v1, if we run a companion-side filter at all, vanilla ESKF is enough. |
| S71 | L1 | OpenStreetMap Sensors / VINS-Fusion + OpenVINS Jetson Orin Nano integration reports | https://github.com/HKUST-Aerial-Robotics/VINS-Fusion/issues/220 ; https://github.com/rpng/open_vins/issues/421 ; https://github.com/fdcl-gwu/openvins_jetson_realsense | Field reports: VINS-Fusion runs ~15 FPS on Xavier NX after OpenCV pinning; on Orin Nano builds with JetPack 6 + ROS 2 Humble after fixing OpenCV ArUco/CUDA mismatches. Useful as **comparison baselines for any cuVSLAM bench-off, not as primary candidates** (integration cost dwarfs cuVSLAM's drop-in). |
| S72 | L2 | Quan et al., *Visual-Inertial Odometry Using High Flying Altitude Drone Datasets* (Drones 7(1):36, MDPI 2023) | https://www.mdpi.com/2504-446X/7/1/36 | High-altitude (40–100 m) VIO field tests: **stereo-VIO = 2.186 m error over 800 m trajectory; monocular VIO "acceptable but worse than stereo"**. Lower bound on the altitude band; our regime is 1 km AGL where motion-parallax VO degrades further (most VO benchmarks assume non-trivial parallax per frame). Reinforces R8. |
| S73 | L2 | Princeton VL — *Deep Patch Visual SLAM* (DPV-SLAM, ECCV 2024) | https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00272.pdf ; https://github.com/iis-esslingen/DPV-SLAM | DPV-SLAM = DPVO + two loop-closure mechanisms. **2.5× faster than DROID-SLAM on EuRoC, 5–7 GB GPU memory vs DROID's 24 GB**, 1×–4× real-time on real-world datasets. Same Jetson-deployment caveat as DPVO. |
| S74 | L2 | OrthoLoC + AdHoP — UAV-to-orthographic 6-DoF localization with feature-matcher refinement | (referenced in cross-view SOTA results) | Compatible with **any matcher** (drop-in refinement layer): up to **+95% matching accuracy / −63% translation error**. Architecturally orthogonal to the matcher choice itself; we can layer this on top of SP+LG / GIM-LG / LiteSAM regardless of which wins the bench-off. |
| S75 | L2 | AerialExtreMatch open-review (1.5 M synthetic pairs, 32 difficulty levels) — methods evaluated table | https://openreview.net/forum?id=5a5T3IW2B6 ; https://github.com/Xecades/AerialExtreMatch | Confirms AerialExtreMatch evaluates **16 representative matchers** (detector-based + detector-free), with publicly-available results. Becomes our primary structured-difficulty regression bench (already in draft02 as F-T5b). |
| S76 | L4 | Stack Overflow / Jetson dev forum — *Orin Nano FP16/INT8 throughput discussion* | https://forums.developer.nvidia.com/t/jetson-orin-nano-fp16-int8-performance/326723 ; https://github.com/ultralytics/ultralytics (YOLO26 Jetson Orin Nano Super benchmark commit 8d4e6e8) | Empirical reference points on Orin Nano Super: **FP16 ≈ 4.5 ms / INT8 ≈ 3.8 ms per YOLO26-n inference**. Useful sanity-check rate: small TRT engines run in single-digit ms; SP+LG / GIM-LG family fits comfortably in our budget. |
| S77 | L2 | thomasthelliez.com — *ROS 2 / Isaac ROS on Jetson Orin Nano Super practical guide* + Hackster.io GPS-Denied Drone reference design | https://thomasthelliez.com/blog/isaac-ros-on-nvidia-jetson-orin-nano-super/ ; https://www.hackster.io/bandofpv/gps-denied-drone-with-nvidia-jetson-orin-nano-9f3417 | **ROS 2 Humble + JetPack 6 + Isaac ROS 3.2 + cuVSLAM + MAVROS** is a working reference architecture on the exact target hardware (Orin Nano Super). Establishes ROS 2 vs DIY Python orchestrator as a real alternative for Component 9. |
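The S41/S65/S67 findings condense into an EKF3 source-set parameter sketch. The option values follow the EKF Sources wiki (GPS = 3, Baro = 1, Compass = 1, ExternalNav = 6), but the exact split and the `VISO_QUAL_MIN` threshold are illustrative and must be verified against the wiki and bench-tested before flashing:

```
# ArduPilot EKF3 source sets (S67) -- illustrative values, not flight-tested
EK3_SRC1_POSXY  3   # primary: GPS (here, our GPS_INPUT-injected fix)
EK3_SRC1_VELXY  3
EK3_SRC1_POSZ   1   # baro
EK3_SRC1_YAW    1   # compass
EK3_SRC2_POSXY  6   # fallback set: ExternalNav (ODOMETRY)
EK3_SRC2_VELXY  6
VISO_QUAL_MIN   10  # drop ODOMETRY messages below quality 10 (S41)
```

Keeping GPS_INPUT and ODOMETRY in *separate* source sets, switched at runtime, is exactly the "only one horizontal position source per axis" invariant that S65/S66 show EKF3 enforces badly when violated.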
# Fact Cards — Phase 1 (AC & Restrictions Assessment)
Each fact card: statement, source(s), confidence (High / Med / Low), audience.
---
## A — Position accuracy state of the art
**F-A1**. State-of-the-art UAV cross-view visual localization (drone image vs. ortho satellite map) at low altitude (30–300 m, multi-view, oblique allowed) achieves **74.1% recall@5 m** on the AnyVisLoc benchmark (best combined retrieval + matching + PnP).
- Source: S02 (AnyVisLoc paper, 2025).
- Confidence: High. Audience: implementer / decision-maker.
**F-A2**. **Cross-view image matching benchmarks** report Relative Distance Score (RDS) up to 84.40% and **MA@20 (matched within 20 m) up to 83.35%** in nadir-favoring setups — i.e., 80%+ within 20 m is achievable with current methods on similar reference data.
- Source: S39.
- Confidence: Med. Audience: implementer / decision-maker.
**F-A3**. The most relevant **fixed-wing aerial public benchmark** is UAV-VisLoc (6,742 drone images, fixed-wing & multi-rotor, **altitudes 405–840 m**, ortho satellite reference at **0.3 m/px** from Google Earth, 11 sites in China incl. cities/towns/farms/rivers/hills/forests).
- Source: S01.
- Confidence: High. Audience: implementer.
**F-A4**. **AerialVL** (RA-L 2024) is a fixed-wing UAV dataset with 11 sequences / ~70 km of trajectory, RGB camera with **gimbal**, NovAtel GNSS at **1.5 m RMS** ground truth, and reference satellite map. Provides VPR + visual alignment + VO baselines.
- Source: S03.
- Confidence: High. Audience: implementer.
**F-A5**. The **viewpoint discrepancy** (oblique aerial vs. nadir satellite) and **temporal staleness** (seasonal / construction change) are the two dominant accuracy degraders cited across cross-view localization literature. ViewBridge (2025), OrthoLoC (2025), and AnyVisLoc all emphasise BEV projection or 3D-grounded matching as mitigation.
- Source: S02, S36, S37.
- Confidence: High. Audience: technical expert.
**F-A6**. Confidence-score schemes used by mature visual localization stacks: (a) RANSAC inlier ratio after PnP/homography; (b) reprojection error variance; (c) top-K retrieval similarity gap; (d) 6-DoF pose covariance from EKF/factor-graph optimization; (e) photometric consistency vs. tile.
- Source: S03, S04, S32 (and ORB-SLAM3 lit).
- Confidence: High. Audience: implementer.
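A minimal sketch of how the F-A6 signals could be fused into one scalar confidence score. The weights, e-folding constants, and function name are illustrative assumptions, not values from any of the cited stacks:

```python
import math

def fuse_confidence(inlier_ratio, reproj_rmse_px, retrieval_gap, sigma_xy_m):
    """Combine the F-A6 confidence signals into a score in [0, 1].
    All weights and squashing constants below are illustrative, untuned."""
    c_inliers = inlier_ratio                     # (a) RANSAC inlier ratio, already in [0, 1]
    c_reproj  = math.exp(-reproj_rmse_px / 2.0)  # (b) reprojection error, 2 px e-folding
    c_gap     = min(retrieval_gap / 0.1, 1.0)    # (c) top-1 vs top-2 similarity gap, saturates at 0.1
    c_cov     = math.exp(-sigma_xy_m / 20.0)     # (d) EKF position std-dev, 20 m e-folding
    w = (0.35, 0.25, 0.15, 0.25)                 # illustrative weights, sum to 1
    return w[0]*c_inliers + w[1]*c_reproj + w[2]*c_gap + w[3]*c_cov
```

A production variant would calibrate these against ground truth so the score approximates P(error < threshold), rather than hand-picked weights.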
---
## B — Image registration & feature matching
**F-B1**. **SuperPoint + LightGlue** with TensorRT runs at ~286 FPS on RTX 3080 at 320×240. SuperPoint ≈ 0.95 ms, LightGlue ≈ 2.54 ms per pair on RTX 3080.
- Source: S11.
- Confidence: High. Audience: implementer.
**F-B2**. **Jetson Orin NX (sibling SoC)** has a working LightGlue+TensorRT deployment (CUTLASS FlashAttention V2 plugin, qdLMF repo) — confirms feasibility on Jetson Orin-class hardware. No publicly released benchmark exists for the Jetson Orin Nano Super specifically.
- Source: S12.
- Confidence: Med. Audience: implementer.
**F-B3**. **XFeat** (CVPR 2024) is **5× faster** than LightGlue / SuperPoint while maintaining comparable accuracy; runs in real-time on a budget CPU (i5-1135G7); offers semi-dense matching mode; C++ + CUDA 12.2 implementations available.
- Source: S08.
- Confidence: High. Audience: implementer.
**F-B4**. **MASt3R** (ECCV 2024) achieves +30% absolute VCRE AUC on Map-free localization vs. prior SOTA — valuable for cross-view UAV/satellite due to its 3D-grounded matching, but is **heavier** (transformer with depth backbone) than LightGlue/XFeat — may exceed Jetson Orin Nano Super 8 GB envelope under the user's latency budget without aggressive distillation/quantization.
- Source: S09.
- Confidence: Med. Audience: technical expert.
**F-B5**. **Mean Reprojection Error <1 px** is a tight but achievable target for *homography-fit* on overlapping aerial pairs; for full-PnP across UAV–satellite pairs the typical achieved MRE is 1–3 px on cross-view benchmarks (heavily dependent on the pixel scale ratio between drone and satellite).
- Sources: S01 (UAV-VisLoc), S03 (AerialVL), S36 (ViewBridge).
- Confidence: Med. Audience: technical expert.
---
## C — Resilience & re-localization
**F-C1**. **Aerial VPR survey + aero-vloc benchmark** (2024) provides a unified evaluation framework over AnyLoc, CosPlace, EigenPlaces, MixVPR, NetVLAD, SALAD, SelaVPR with re-ranking via LightGlue/SuperGlue. Datasets used: VPAir, ALTO, MARS-LVIG.
- Source: S04.
- Confidence: High. Audience: implementer.
**F-C2**. **AnyLoc** (DINOv2 + unsupervised VLAD) achieves up to 4× higher Recall@1 than environment-specialised approaches across urban / aerial / underwater / subterranean **without training**. Strong default for cross-view re-localization when training data is limited.
- Source: S05.
- Confidence: High. Audience: implementer / decision-maker.
**F-C3**. **MixVPR**: 94.6% R@1 on Pitts250k with <50% of the parameter count of NetVLAD — best lightweight VPR aggregation in 2023–2024.
- Source: S06.
- Confidence: High. Audience: implementer.
**F-C4**. **Tile-zoom / overlap selection** when constructing the satellite reference map is a **critical** parameter for VPR efficiency and accuracy in aerial domain (per the 2024 survey).
- Source: S04.
- Confidence: High. Audience: implementer.
---
## D — Onboard real-time performance on Jetson Orin Nano Super
**F-D1**. **Jetson Orin Nano Super** (with JetPack 6.2 "Super Mode"): **67 TOPS sparse INT8** AI performance, 8 GB shared LPDDR5, supports **15 W / 25 W / MAXN SUPER** power modes. The 25 W mode is the new "reference" performance mode.
- Source: S14, S15.
- Confidence: High. Audience: implementer / decision-maker.
**F-D2**. **Sustained-load thermal throttling** is real on Jetson family — earlier-gen Xavier NX (21 TOPS) throttled within 5 minutes at 640×480 YOLOv8n. Orin Nano Super is reportedly more thermally efficient but **8-hour sustained 25 W requires forced-air cooling and possibly active heatsink** — not solvable purely in software.
- Source: S14, S15 + practitioner test S14.
- Confidence: Med. Audience: implementer / decision-maker.
**F-D3**. **MAXN SUPER** is uncapped; if power exceeds TDP the module auto-throttles. For sustained 8 h flight on a fixed-wing UAV with ~25 W power budget, **the system MUST be sized to fit the 25 W envelope at 100% duty**, not MAXN.
- Source: S14.
- Confidence: High. Audience: implementer.
**F-D4**. Naive scaling from RTX 3080 → Orin Nano Super for SuperPoint+LightGlue gives ~30–40× slower (RTX 3080 ≈ 30 TFLOPS FP16, Orin Nano Super ≈ 1 TFLOPS FP16 class). At 320×240: ≈3.5 ms × 35 ≈ **~120 ms/pair on Jetson Orin Nano Super**. Pre-running matching on a downsampled image (e.g., 1024×683 from 6200×4100) is feasible within the **400 ms p95 budget** when combined with feature caching for the satellite tile.
- Source: derived from S11, S14 (back-of-envelope; needs empirical confirmation in Phase 2).
- Confidence: Low. Audience: technical expert.
- **Action**: Empirical benchmark on actual Jetson Orin Nano Super in implementation phase.
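The back-of-envelope scaling behind F-D4, written out so the assumptions are explicit. All inputs are the fact card's own estimates (F-B1 timings, F-D4 throughput ratio); the result is an extrapolation, not a measurement:

```python
# Scale RTX 3080 SuperPoint+LightGlue timings (F-B1) to Orin Nano Super
# by an assumed FP16 throughput ratio. Needs empirical confirmation.
pair_ms_rtx3080 = 0.95 + 2.54   # SuperPoint + LightGlue per pair at 320x240 (F-B1)
slowdown = 35                   # assumed mid-point of the ~30-40x throughput gap
pair_ms_orin = pair_ms_rtx3080 * slowdown   # ~120 ms/pair, inside the 400 ms p95 budget
```

Because the satellite-tile features can be precomputed and cached, only the drone-frame extraction plus matching lands on this per-frame path.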
---
## E — Satellite imagery sourcing & legality
**F-E1**. **Google Maps / Map Tiles API** explicitly prohibits offline use, image analysis, machine interpretation, object detection, geodata extraction, and "any systems or functions for automatic or autonomous control of vehicle behavior". **Use of Google Maps satellite tiles for an offline UAV navigation system violates the Terms of Service.**
- Sources: S22 (Map Tiles API Policies), S23 (Maps Platform ToS).
- Confidence: **High** (two L1 sources, explicit language). Audience: decision-maker / legal.
- **Severity**: Hard blocker — must be resolved before solution design.
**F-E2**. **Bing Maps** also prohibits creating local copies / offline storage of tiles. Tile URLs are not stable; the supported access pattern is dynamic REST queries per session. **Bing tiles are not a viable offline reference source either.**
- Source: S24.
- Confidence: High. Audience: decision-maker.
**F-E3**. **Maxar Vivid Mosaic** offers a **30 cm global basemap** (135 M km², ex-Antarctica) and a **15 cm urban basemap** (7 M km²), **continuously refreshed** with AI-driven change detection. Pricing for archive imagery is approximately **$25–32 / km²** for similar 30 cm products. **Licensing for offline tactical use must be negotiated explicitly with Maxar (Vantor)** — this is the standard path for defense customers.
- Sources: S25, S38.
- Confidence: High. Audience: decision-maker.
**F-E4**. **Airbus Pléiades Neo** provides 30 cm via OneAtlas; volume pricing approximately **€5–8.50 / km²** on a 6-month sliding window. Direct competitor to Maxar at sub-meter resolution.
- Sources: S26, S27.
- Confidence: High. Audience: decision-maker.
**F-E5**. **Sentinel-2 cloudless** (EOX) provides a **free** global mosaic but at **10 m/px** — well below the AC requirement of 0.5 m/px (ideally 0.3 m/px). At **1 km AGL** Sentinel-2 is too coarse to achieve registration with a 24 cm/px drone image without massive scale-bridging losses.
- Source: S28 + S01 (drone GSD).
- Confidence: High. Audience: implementer.
**F-E6**. For **Eastern/Southern Ukraine** specifically, Sentinel-2 / Sentinel-1 are heavily used in 2022+ academic literature for damage / change detection. **Maxar and Planet are the de-facto sources for sub-meter imagery** of Ukraine. Recent satellite imagery for this region is operationally sensitive but commercially available.
- Source: S28 (Ukraine 2024–2025 references in the EOX/Sentinel papers).
- Confidence: High. Audience: decision-maker.
**F-E7**. **Active-conflict-region staleness** is a real risk: dam destruction (Kakhovka), urban damage, cratering, road realignment, smoke/dust — all can defeat cross-view matching against pre-conflict imagery. **Imagery freshness budget should be tightened from "<2 years" to "<6 months for active sectors, <12 months for stable rear areas"** — to be confirmed with operations.
- Source: S28 + extrapolation from change-detection literature for Ukraine.
- Confidence: Med. Audience: decision-maker.
---
## F — Camera & GSD
**F-F1**. **GSD formula**: GSD (cm/px) = (Altitude_m × 100 × Sensor_w_mm) / (Focal_mm × Image_w_px).
For a typical full-frame sensor (36 mm wide) with a 24 mm wide-angle lens at **1 km AGL** and a **6200 px** wide image: GSD ≈ **24 cm/px**, frame footprint ≈ **1.49 km × 0.99 km**. Drone images at 0.1–0.2 m/px (per UAV-VisLoc) are consistent.
- Source: S29.
- Confidence: High. Audience: implementer.
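The F-F1 formula and worked example as a small sketch (function names are mine; the arithmetic is the fact card's):

```python
def gsd_cm_per_px(altitude_m, sensor_w_mm, focal_mm, image_w_px):
    """F-F1: GSD (cm/px) = (Altitude_m * 100 * Sensor_w_mm) / (Focal_mm * Image_w_px)."""
    return (altitude_m * 100.0 * sensor_w_mm) / (focal_mm * image_w_px)

def footprint_km(altitude_m, sensor_w_mm, focal_mm, image_w_px, image_h_px):
    """Frame ground footprint (width, height) in km; nadir view, flat terrain."""
    gsd_m = gsd_cm_per_px(altitude_m, sensor_w_mm, focal_mm, image_w_px) / 100.0
    return image_w_px * gsd_m / 1000.0, image_h_px * gsd_m / 1000.0

# Worked example from the card: 36 mm sensor, 24 mm lens, 1 km AGL, 6200 px wide frame.
gsd = gsd_cm_per_px(1000, 36, 24, 6200)              # ~24 cm/px
w_km, h_km = footprint_km(1000, 36, 24, 6200, 4100)  # ~1.5 km x ~1.0 km
```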
**F-F2**. **Camera intrinsics calibration is mandatory** — without known focal length, principal point, and distortion, sub-pixel MRE is impossible. Pre-flight checkerboard calibration is the standard; some payloads use factory-cal + temperature compensation.
- Source: photogrammetry consensus (S01, S03, S29).
- Confidence: High. Audience: implementer.
---
## G — MAVLink / MAVSDK / flight controller integration
**F-G1**. **GPS_INPUT** is a standard MAVLink message. **ArduPilot**: set `GPS1_TYPE=14` (MAVLink) and the autopilot will accept GPS_INPUT as the primary GPS. **PX4**: native GPS_INPUT support is limited; the standard workaround is to publish via VISION_POSITION_ESTIMATE through the EKF2 vision-pose pipeline.
- Sources: S16, S17, S18.
- Confidence: High. Audience: implementer / decision-maker.
**F-G2**. **MAVSDK-Python does NOT natively support GPS_INPUT** (open issue #320). For Python implementations, **pymavlink** must be used to emit raw GPS_INPUT messages.
- Source: S18.
- Confidence: High. Audience: implementer.
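A sketch of the unit conversions GPS_INPUT expects before emitting it with pymavlink (e.g. via `master.mav.gps_input_send(...)`). Per the MAVLink common dialect, lat/lon are int32 in degE7 and alt is a float in metres; the helper name and dict shape here are illustrative, not pymavlink API:

```python
def to_gps_input_fields(lat_deg, lon_deg, alt_m, hdop, fix_type=3):
    """Scale a position estimate into GPS_INPUT field units.
    ignore_flags = 0 tells the autopilot all supplied fields are valid."""
    return dict(
        lat=int(round(lat_deg * 1e7)),   # degrees -> degE7 (int32)
        lon=int(round(lon_deg * 1e7)),
        alt=float(alt_m),                # metres above MSL (float)
        hdop=float(hdop),                # dimensionless dilution of precision
        fix_type=fix_type,               # 3 = 3D fix
        ignore_flags=0,
    )

fields = to_gps_input_fields(48.5000001, 37.25, 180.0, 1.2)
```

MAVSDK stays usable for telemetry and mission control alongside this; only the GPS_INPUT emission path needs the raw pymavlink connection.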
**F-G3**. ArduPilot can **blend or switch between GPS sources** by quality (sat count, HDOP). If the legitimate (jammed) GPS keeps reporting plausible values while the spoofed/denied state is intermittent, the autopilot may oscillate between sources. **The companion computer must explicitly disable the real GPS or degrade its reported quality** (or the autopilot must be configured to *only* trust GPS_INPUT) to avoid this.
- Source: S33.
- Confidence: High. Audience: implementer / security architect.
**F-G4**. **PX4 has GPS spoofing detection** baked into the EKF2 driver chain (u-blox spoof flag, ~1 s hysteresis, GNSS-fusion auto-disable on consistent spoof signal). This is a useful upstream signal for the GPS-Denied system to know "you are now the primary source".
- Sources: S19, S20.
- Confidence: High. Audience: implementer / security architect.
**F-G5**. **PX4 failsafe delay** `COM_POS_FS_DELAY` defaults to **1 s**; `EKF2_NOAID_TOUT` controls dead-reckoning validity. Documented bugs exist (#23970) — version pinning matters.
- Source: S21.
- Confidence: Med. Audience: implementer.
**F-G6**. **QGroundControl** has only **STATUSTEXT** (string) as a first-class companion-computer message channel; ONBOARD_COMPUTER_STATUS (planned) and custom MAVLink messages (NAMED_VALUE_FLOAT/INT, custom dialect) are practical channels for re-localization request UI / confidence scores.
- Sources: S34, S35.
- Confidence: High. Audience: implementer.
---
## H — Object localization (AI camera, gimbal-only pose)
**F-H1**. **Trigonometric ground projection error** with **gimbal-angle-only** (no airframe IMU attitude fusion onto AI cam) is dominated by the **unknown UAV roll/pitch** at the moment of capture. For a fixed-wing UAV, typical roll/pitch in straight cruise is ±2°; in turns up to ±25°. At **1 km AGL**, a 5° unknown attitude → ~87 m ground-position error. **The AC "object localization accuracy is consistent with frame-center accuracy" is therefore unrealistic without attitude fusion in turning flight.**
- Source: derived from F-F1 + standard photogrammetry trig.
- Confidence: High. Audience: technical expert / decision-maker.
- **Action**: revise AC to "consistent with frame-center accuracy in level flight; expect ±h·tan(unknown_attitude) in turns" OR add attitude fusion onto AI cam.
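The F-H1 trigonometry, made explicit (the function name is mine; the numbers reproduce the card's ±2° cruise / 5° turn cases):

```python
import math

def ground_error_m(altitude_m, unknown_attitude_deg):
    """F-H1: horizontal ground-projection error from unmodelled airframe
    roll/pitch at capture time (near-nadir gimbal, flat terrain)."""
    return altitude_m * math.tan(math.radians(unknown_attitude_deg))

cruise = ground_error_m(1000, 2)   # ~35 m: straight cruise, +/-2 deg unknown attitude
turn   = ground_error_m(1000, 5)   # ~87 m: turning flight -- breaks frame-center-level accuracy
```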
**F-H2**. **Flat-terrain assumption** is reasonable for eastern/southern Ukraine (typical relief amplitude ~50–150 m over 10 km). At 1 km AGL with up to 5° gimbal off-nadir, terrain-induced ground-projection error from flat-terrain assumption is typically <30 m for level flight — within the AC envelope. Riverbanks, tall buildings, and reservoir scarps are local exceptions.
- Source: derived from S26 + S28 + Ukraine relief data.
- Confidence: Med. Audience: technical expert.
---
## I — Hardware envelope & power
**F-I1**. Jetson Orin Nano Super in 25 W mode: ~25 W average; with cooling adequately sized for 8-hour duty, sustained throttling can be avoided. Without active cooling, expect throttling within minutes (Xavier NX precedent).
- Source: S14, S15.
- Confidence: Med. Audience: implementer.
**F-I2**. **Storage budget**: User's "~10 GB" estimate for a 400 km² @ 0.3 m/px tile cache is **correct** (400 km² = 4×10⁸ m², × ~11 px/m² at 0.3 m/px, × 3 bytes/px ≈ 13 GB raw, i.e. ≈ 10–13 GB as stored JPEG tiles). Persistent cache across flights is feasible with a small NVMe (≥64 GB).
- Source: arithmetic; cross-checked S25 (Vivid pricing per km²).
- Confidence: High. Audience: implementer / decision-maker.
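The F-I2 arithmetic spelled out (raw 24-bit size; JPEG compression brings the on-disk figure down toward the user's ~10 GB estimate):

```python
# F-I2: raw size of a 400 km^2 tile cache at 0.3 m/px, 24-bit colour.
area_m2 = 400 * 1_000_000          # 400 km^2 in m^2
px_per_m2 = 1 / (0.3 * 0.3)        # ~11.1 px per m^2 at 0.3 m/px
bytes_per_px = 3                   # RGB before JPEG compression
raw_gb = area_m2 * px_per_m2 * bytes_per_px / 1e9   # ~13 GB raw ceiling
```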
---
## J — Failsafe & resilience
**F-J1**. PX4's own GPS-loss failsafe defaults to ~1 s delay. A reasonable upstream **"system fails to produce an estimate" failsafe `N`** for the GPS-Denied system is **3–5 seconds** — long enough to ride out one sharp turn / re-localization attempt without flapping, short enough to let the flight controller switch to IMU dead reckoning before drift exceeds tens of metres.
- Source: S21 + practitioner heuristic.
- Confidence: Med. Audience: implementer / decision-maker.
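A minimal watchdog sketch for the F-J1 failsafe. The class name, injectable clock, and default N = 4 s (mid-point of the proposed 3–5 s band) are all assumptions:

```python
import time

class EstimateWatchdog:
    """Trip the 'no estimate produced' failsafe after timeout_s of silence."""
    def __init__(self, timeout_s=4.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable for testing; monotonic, not wall time
        self.last = clock()

    def estimate_produced(self):
        """Call on every accepted position estimate."""
        self.last = self.clock()

    def tripped(self):
        """True once the failsafe window has elapsed without an estimate."""
        return self.clock() - self.last >= self.timeout_s
```

On trip, the system would stop emitting GPS_INPUT and notify the flight controller so its own ~1 s GPS-loss failsafe and dead reckoning take over.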
---
## K — Public datasets for IMU / aerial dev & test
**F-K1**. **No public dataset perfectly matches all four constraints**: fixed-wing + ~1 km AGL + downward-facing + synchronized IMU + GPS truth. **Closest match is AerialVL** (fixed-wing + gimbal RGB + GNSS, ~70 km of tracks, 11 sequences, RA-L 2024). Altitude band for AerialVL is "different altitudes" (not always 1 km).
- Source: S03.
- Confidence: High. Audience: implementer.
**F-K2**. **UAV-VisLoc** is the largest fixed-wing **drone-vs-satellite localization** dataset (6,742 images, 400–840 m altitudes, 0.3 m/px Google Earth reference) — but it does not provide synchronized IMU.
- Source: S01.
- Confidence: High. Audience: implementer.
**F-K3**. **MidAir** (synthetic, quadcopter) provides full IMU + GPS + depth + semantic at low altitude. Good for **training-time augmentation** but not real-world testing for fixed-wing at 1 km AGL.
- Source: S30.
- Confidence: High. Audience: implementer.
**F-K4**. **Recommended dev/test stack**: AerialVL (primary real-world fixed-wing) + UAV-VisLoc (visual-localization-only validation at 1 km-neighborhood altitude) + MidAir (synthetic IMU augmentation) + the user's own 65 input-data photos for sanity / regression. Real IMU from a dedicated test flight should still be planned for system V&V.
- Source: synthesis of S01, S03, S30.
- Confidence: High. Audience: decision-maker.
---
## L & M — Restriction & AC gaps / contradictions
**F-LM1**. **Restriction "up to 3000 photos per flight"** is **inconsistent** with the stated 8-hour endurance × 3 fps = **86,400 photos** and with the 500 ms minimum interval × 8 h = 57,600 photos. Likely interpretations:
(a) On-disk **retention** budget (sub-sample for storage).
(b) Imagery for an *individual mission segment* (~17 min × 3 fps = 3,000), not the full sortie.
(c) A stale value carried over from a Mavic 3 attempt that should be updated.
- **Hard contradiction**: needs user resolution before solution sizing.
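The three photo-count figures from F-LM1, reconciled as explicit arithmetic:

```python
# F-LM1: the numbers that must be reconciled before sizing storage/compute.
flight_s = 8 * 3600                 # 8 h endurance
at_3fps        = flight_s * 3       # 86,400 photos per sortie at 3 fps
at_500ms_floor = flight_s * 2       # 57,600 photos at the 500 ms minimum interval
segment_3fps   = 17 * 60 * 3        # ~3,060 photos in one ~17 min mission segment
```

Interpretation (b) is the only one that lands near the stated 3,000-photo restriction without changing the capture rate.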
**F-LM2**. **Camera resolution range "FullHD to 6252×4168"** is wide (~13× pixel-count delta). Per-frame pipeline cost scales with resolution; AC compliance is camera-dependent. Need to lock the **target camera spec** for AC validation.
**F-LM3**. **Latency 400 ms vs. cycle 333 ms (3 fps)**: the user has confirmed `<400 ms p95` with skip-allowed. This is **internally consistent**; the AC should be re-stated as "p95 latency <400 ms; up to ~10% of frames may be dropped under sustained load" to remove the apparent contradiction with frame rate.
**F-LM4**. **Suggested missing AC** (gap analysis):
- **L2** — Time-to-first-fix on cold start / mid-flight reboot (e.g., <30 s after IMU-extrapolated init).
- **L3** — Spoofing-promotion latency (system asserts its estimate over flight controller GPS within X seconds of denial).
- **L4** — Flight-data-recorder requirement (all photos + estimates + confidence + IMU traces at full rate, retained in non-volatile storage with a budgeted size cap).
- **L5** — False-position safety budget (e.g., probability of an estimate >500 m from truth must be <0.1% per flight).
- **L6** — Operational temperature / vibration envelope (MIL-STD-810 lite or RTCA DO-160G low-altitude variant).
- **L7** — Imagery freshness operationally enforced (e.g., reject tiles older than 12 months for active sectors).
**F-LM5**. **Restriction "Google Maps allowed"** is **legally not allowed** per F-E1/E2. The project must change source to a license-cleared provider (Maxar Vivid / Airbus Pléiades / commissioned tasking / government feed) before deployment. **This is a blocker, not a tweak.**
---
## Mode B Findings — adversarial assessment of `solution_draft01.md` (2026-04-26)
**M-1 (Component 6 / AC-4.3) — ODOMETRY is ArduPilot's preferred external-nav channel, not GPS_INPUT.** ArduPilot's own dev docs (S41) call **ODOMETRY "the preferred method"** for sending external position estimates to EKF3, ahead of both VISION_POSITION_ESTIMATE and GPS_INPUT. ODOMETRY carries quaternion + 3-D linear velocity + a **21-element pos+attitude covariance** (incl. native yaw error) + a `quality` field (-1=failed, 0=unset, 1..100). VISO_QUAL_MIN gates ignored messages on the FC side. GPS_INPUT collapses our 6-DoF covariance into a scalar `h_acc` / `v_acc`, which directly under-reports our yaw covariance and under-utilises the FC's EKF3. The draft's GPS_INPUT-only choice is sub-optimal for AC-NEW-4 (false-position safety) covariance fidelity.
- Source: S41, S42, S43.
- Confidence: ✅ High.
**M-2 (Component 3) — MASt3R is not viable as primary on Orin Nano Super at 25 W.** `mast3r-runtime` (S57) lists Jetson Orin support as **"Planned"**, not implemented. *Speedy MASt3R* (S57 paper-side) achieves 91 ms / pair on an **A40 GPU**, which is roughly **30× the throughput** of a Jetson Orin Nano Super in 25 W mode → MASt3R extrapolates to **~2.5–3 s / pair** on our target hardware without aggressive distillation/INT8 work that nobody has published yet. Drop MASt3R from the matcher *primary* shortlist; keep it only as a long-horizon research target.
- Source: S57.
- Confidence: ✅ High.
**M-3 (Component 3) — Add GIM (ICLR 2024 spotlight) to the bench-off shortlist.** GIM (S48) is a self-training framework that takes existing matchers (LightGlue, RoMa, DKM, LoFTR) and re-trains them on 50 h of internet videos for **8.4–18.1 % zero-shot improvement**. The "generalist trained on diverse video" framing is the closest published proxy for our domain transfer (eastern-Ukraine 1 km AGL nadir vs. service satellite tiles). GIM-LightGlue should be included alongside vanilla LightGlue.
- Source: S48.
- Confidence: ✅ High.
**M-4 (Component 2) — Add SALAD (DINOv2 + Sinkhorn-VLAD) and BoQ to the VPR shortlist.** Two CVPR 2024 papers landed after the draft's "AnyLoc primary + MixVPR fast-lane" decision was made:
- **DINOv2 SALAD** (S47) — DINOv2 backbone + optimal-transport Sinkhorn aggregator with a "dustbin" cluster for non-informative features. R@1 = **75.0 %** on MSLS Challenge, **92.2 %** on MSLS Val, **76.0 %** on NordLand. Already a supported method in `aero-vloc` (S04), so direct apples-to-apples bench against AnyLoc/MixVPR.
- **BoQ** (S46) — bag of learnable queries with cross-attention; **outperforms NetVLAD, MixVPR, EigenPlaces** on 14 large-scale benchmarks; surpasses two-stage methods (Patch-NetVLAD, TransVPR, R2Former) at lower cost; DinoV2 results published Nov 2024.
AnyLoc is no longer the only DINOv2-based VPR option in the cross-domain regime; the bench-off must include all four.
- Source: S46, S47.
- Confidence: ✅ High.
**M-5 (Component 2 / 9 / latency) — DINOv2-base latency on Orin Nano Super is ~10× better than the draft assumed.** Jetson AI Lab measurements (S40): **DINOv2-base-patch14 = 126 inferences/sec on Orin Nano Super** (~8 ms/inf at 224×224), 75 inf/s on the original Orin Nano (~13 ms/inf). The draft estimated 50–80 ms / 224×224. The latency budget therefore has substantially more headroom than the draft assumed — **but only at 224×224**; at higher input resolution, expect ~quadratic scaling (so 448×448 ≈ 32 ms/inf is still very comfortable inside the 400 ms p95 budget). This is a **good news** finding that simplifies AC-4.1.
- Source: S40.
- Confidence: ✅ High (NVIDIA L1 source; precision implied FP16 from JetPack 6.2 default trtexec).
**M-6 (Component 6 / Security) — `mavlink-router` is itself attack surface.** Issue #436 (S45): public, easily-triggered, fuzzing-discovered **stack-based buffer overflow** in `ConfFile::get_sections` (memcpy of user-controlled section names into a 100-byte fixed buffer with no bounds check, plus an OOB write on null-terminator append). The repo has **no formal security policy / no SECURITY.md**. The draft's "share the MAVLink endpoint via a single mavlink-router instance" recipe drops a known-vulnerable C++ daemon onto a flight-critical companion. Mitigation options:
1. Pin to a fixed-and-audited tag, harden the systemd unit (NoNewPrivileges, ReadOnlyPaths, sandbox), and config-file-validate before launch.
2. Replace mavlink-router with a tiny in-process MAVLink endpoint multiplexer (Python or Go; this is ~150 lines of code given the only consumers are MAVSDK + pymavlink + mavlink-router-replacement → FC).
3. Use distinct system-IDs for MAVSDK and pymavlink and let ArduPilot's native MAVLink routing (S35-class) do the muxing on the FC side.
- Source: S45.
- Confidence: ✅ High.
**M-7 (Component 6 / Security) — MAVLink2 signing is a v1-mandatory configuration item, not "recommended".** S44: signing is per-link, **USB bypasses signing**, keys live in FRAM (32-byte secret + timestamp), configured via Mission Planner (or the MAVProxy `signing` module). It works in ArduPilot 4.5+, but key provisioning is a **per-airframe operator step** that needs a documented procedure. Given that GPS_INPUT (or ODOMETRY) is a high-trust local channel feeding the flight-critical EKF, a signed MAVLink link companion↔FC is the only defence against an attacker who gains serial access. The draft mentions signing under "Security note (deferred to a Phase-4 security pass)" — Mode B promotes it to v1-required.
- Source: S44.
- Confidence: ✅ High.
**M-8 (Component 1 / Tile Cache) — MBTiles SQLite under our concurrent read+write workload needs WAL + connection pool + transaction batching.** S54: the canonical `mbtiles` SQLite failure modes are (a) `database is locked` errors when concurrent writers compete with readers (default rollback journal is single-writer), (b) per-tile commit overhead crippling throughput on non-SSD. Recipe:
- `PRAGMA journal_mode = WAL` (mandatory for mixed read+write).
- Connection pool (cf. `MbtilesPool` from maplibre/martin S54) — multiple read connections + one write connection.
- Transaction batching: bulk insert per N tiles per Component-1b cycle, not per tile.
- Disable per-INSERT commit; rely on transaction boundary.
The draft's tile-cache section says "MBTiles SQLite + per-tile metadata" but doesn't specify these. Add as a hard implementation note.
- Source: S54.
- Confidence: ✅ High.
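The M-8 recipe as a minimal stdlib-`sqlite3` sketch. The `tiles` schema follows the standard MBTiles layout; the function names and the PRIMARY KEY choice are illustrative:

```python
import sqlite3

def open_mbtiles_writer(path):
    """Open an MBTiles store with WAL enabled for mixed read+write access."""
    con = sqlite3.connect(path)
    con.execute("PRAGMA journal_mode = WAL")    # mandatory: readers proceed during writes
    con.execute("PRAGMA synchronous = NORMAL")  # WAL-safe durability/throughput tradeoff
    con.execute("CREATE TABLE IF NOT EXISTS tiles ("
                "zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, "
                "tile_data BLOB, PRIMARY KEY (zoom_level, tile_column, tile_row))")
    return con

def write_batch(con, tiles):
    """Transaction batching: one commit per Component-1b cycle, not per tile."""
    with con:  # single transaction wraps the whole batch
        con.executemany("INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)", tiles)

con = open_mbtiles_writer(":memory:")
write_batch(con, [(20, 1, 1, b"jpeg-bytes"), (20, 1, 2, b"jpeg-bytes")])
```

The read side would use its own pool of read-only connections; only one connection ever writes.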
**M-9 (Component 1b / Tile Dedup — *new safety risk*) — onboard tile overwrites can poison the cache.** The draft's dedup rule:
> If cache has a tile and the cache tile's `source ∈ {service}` AND the cache tile's `capture_date` is older than AC-8.2 freshness threshold AND our quality score > existing → **write** (overwrites with `source = onboard`).
The risk: a confidently-bad onboard pose (over-confident EKF covariance escapes the σ_xy ≤ 10 m gate) writes a tile that's misaligned by, say, 30–50 m, but with high inlier count. Next flight, that misaligned tile becomes the satellite anchor for *another* fix → error compounds across flights. **This is a feedback-loop safety hazard that AC-NEW-4 (false-position budget) does not currently capture**, because Monte-Carlo over a single flight doesn't model the cross-flight cache-poisoning amplification.
Mitigations (any of, ideally all):
1. **Service-source tiles are immutable within freshness budget.** Onboard tiles overwrite only stale or other-onboard tiles, never a fresh service tile.
2. **Voting layer at the Service ingest.** An onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment within X m of each other.
3. **Quality score includes parent-pose covariance as a hard gate**, not just inlier count: a tile written from σ_xy > 5 m (tighter than the 10 m generation gate) is marked as "soft" and flagged in the sidecar.
4. **An additional AC**: "AC-NEW-7 — cache-poisoning safety" — see proposed addition in `solution_draft02.md`.
- Source: derived analytical finding (no single L1/L2 — this is a design-level hazard exposed by Mode B reasoning).
- Confidence: ⚠️ Medium (hazard is real and well-known in cartography/SfM; specific mitigation choice is empirical).
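A write-gate sketch combining mitigations 1 and 3 from M-9. The dict field names (`source`, `age_days`, `sigma_xy_m`, `quality`) and the 5 m gate are illustrative assumptions:

```python
def may_overwrite(existing, candidate, freshness_days, sigma_gate_m=5.0):
    """Gate for Component-1b tile writes into the shared cache.
    existing: the cached tile's metadata; candidate: the new onboard tile."""
    if existing["source"] == "service" and existing["age_days"] <= freshness_days:
        return False   # mitigation 1: fresh service tiles are immutable
    if candidate["sigma_xy_m"] > sigma_gate_m:
        return False   # mitigation 3: parent-pose covariance is a hard gate, not a score input
    return candidate["quality"] > existing["quality"]
```

Mitigation 2 (N-flight voting before promotion to trusted basemap) would sit one layer up, at the Service ingest, and is not shown here.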
**M-10 (Component 9 / Process topology) — Free-threaded Python 3.13 is not v1-ready.** S55: free-threading is **experimental**, has a "substantial single-threaded performance hit", many C extensions don't yet support it, and the GIL **auto-re-enables on import of any non-FT-aware extension** (which would silently include numba, possibly TensorRT bindings, possibly older pymavlink). The draft's choice (single asyncio Python process + TRT subprocess workers + numba on hot path) is correct for v1 — but the rationale should be sharpened from "GIL is a risk we mitigate" to **"free-threaded Python is not yet a substitute; revisit in v1.1 once NumPy/SciPy/numba/TRT bindings stabilise on PEP 703."**
- Source: S55.
- Confidence: ✅ High.
**M-11 (Component 5 / W4.a) — ODOMETRY for fixed-wing in ArduPilot has known production gotchas.** S42 confirms ODOMETRY landed Dec 2021; S43 (PR #30080, "External nav+gps fix", merged 2025) shows ongoing work on the source-switching path when running external-nav alongside GPS. Practitioner-reported issues from S41/S42 discussion:
- velocity errors when companion-computer-derived velocity is fed into EKF3,
- position-estimate resets when external-nav loses reference,
- conflicts when running external-nav alongside GPS.
This is directly relevant to AC-NEW-2 (3 s spoofing-promotion latency) — the source switch is exactly the path that has known bugs. Mode B's recommended hybrid (GPS_INPUT primary + ODOMETRY when full covariance is available) needs SITL coverage of source-switching scenarios as a hard prerequisite, not a v1.1 follow-up.
- Source: S41, S42, S43.
- Confidence: ✅ High.
**M-12 (Component 1b / R-Terrain) — Eastern-Ukraine relief amplitude breaks the "flat enough" assumption near edges.** S56: Kharkiv-region UAV survey reports **~24 m peak-to-trough relief** between low and high points in test areas, with creek + gully (yary/balky) systems. At 1 km AGL with a 35° HFOV camera, a 24 m elevation deviation at the frame edge produces ~17 m horizontal misalignment when projected via the flat-Earth assumption. That's **inside AC-1.1** (50 m@80%) but **eats into AC-1.2** (20 m@50%, hard-floor variant). Recommended addition: a per-sector DEM lookup (one-time pre-flight) that classifies sectors as "flat" (≤5 m amplitude), "moderate" (5–15 m), "rugged" (>15 m). The system uses tile-anchor weight-decay or skips ortho-tile generation in rugged sectors.
- Source: S56.
- Confidence: ⚠️ Medium (S56 is one regional survey; relief varies across the operational area).
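The proposed M-12 sector classification as a one-line-per-band sketch (band edges are the finding's proposed thresholds; the function name is mine):

```python
def classify_sector(relief_amplitude_m):
    """Classify a pre-flight sector by DEM relief amplitude (M-12)."""
    if relief_amplitude_m <= 5:
        return "flat"        # full-weight tile anchors
    if relief_amplitude_m <= 15:
        return "moderate"    # apply tile-anchor weight decay
    return "rugged"          # skip ortho-tile generation entirely
```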
**M-13 (Datasets) — TartanAir V2 is a stronger synthetic baseline than MidAir; flag for user reconsideration.** S51: TartanAir V2 is photo-realistic (AirSim) with **native IMU + 12-cam rigs + 65 environments + season/weather variation + custom camera models**. The draft drops synthetic IMU per user instruction (AC-NEW-4 validation rewritten in solution_draft01). User's stated reason: Mavic-class dynamics ≠ fixed-wing dynamics. TartanAir V2 lets us **configure motion patterns**, so the dynamics-mismatch argument is weaker for TartanAir than for MidAir. **This is a real choice for the user**: either keep "real-data only" purism, or add TartanAir V2 as an early-bench-off-only baseline. Surface to user as an open question, not a unilateral change.
- Source: S51.
- Confidence: ⚠️ Medium (technical viability is high; product/operator preference is the user's call).
**M-14 (Component 3 / W1.c) — Add AerialExtreMatch and 2chADCNN to the matcher V&V plan for season/viewpoint robustness.** Two underweighted benchmarks:
- **AerialExtreMatch** (S49): 1.5 M synthetic image pairs with **32 difficulty levels** crossing overlap × scale × pitch — exact failure-mode profile for our 1 km AGL operational regime. Real-world UAV localization subset for end-to-end validation.
- **2chADCNN** (S50): season-aware UAV↔satellite template-matching reference. Either include as bench-off candidate (vs. generic GIM/RoMa), or as a season-robustness *benchmark* the bench-off candidates run against.
- Source: S49, S50.
- Confidence: ✅ High.
**M-15 (Component 4) — Real fixed-wing monocular VO is harder than the draft implies.** S52: SVO, DSO, ORB-SLAM2 all "had significant difficulty maintaining localisation" on real fixed-wing flights at altitude. S53: high-altitude (300–1000 m AGL) VIO publishes drift numbers in the same band as our AC-1.3. Conclusion: the draft's choice ("custom 2-frame homography VO using the Component-3 matcher") is **right** for our framing (VO between satellite anchors, not standalone metric SLAM), but the AC-1.3 drift budget (<100 m without IMU, <50 m with IMU) needs validation against real fixed-wing footage — *not* Mavic-class footage — before lock.
- Source: S52, S53.
- Confidence: ✅ High.
---
## Mode B Findings — second adversarial pass (user-driven, 2026-04-26)
**M-16 (Component 2 / Granularity) — VPR retrieval unit must be decoupled from the storage-tile boundary.** The Mode A and Mode B draft both said "FAISS IVF over per-tile DINOv2-VLAD vectors" using **storage tiles at z=20** (~154 m × 154 m ground). A 1 km AGL nadir frame covers **30–100 such tiles** depending on lens. Cosine similarity between a frame descriptor (covers ~600 × 450 m) and a tile descriptor (covers 154 × 154 m) is fundamentally mismatched and noisy. None of the published aerial-VPR systems do it this way:
- **AerialVL** (S03) preprocesses the reference satellite map into **frame-footprint-sized reference chunks** matched to expected drone-frame ground coverage.
- **AnyLoc** (S05) uses overlapping macro-windows scaled to query footprint on aerial.
- **NaviLoc** uses a sliding-window descriptor over the basemap.
**Conclusion**: the storage tile (z=20, 512×512) stays as the dedup / orthorect unit. The **VPR chunk** is a separate concept: ground-footprint chunks sized to the expected frame coverage with **40–50% overlap** so any frame footprint lands cleanly inside ≥1 chunk. Optionally multi-scale (one set per altitude band). Index is over chunks, not tiles.
- Source: re-reading S03 + S05 with the granularity question in mind; verified against the user-surfaced gap.
- Confidence: ✅ High. The error mode is well-known in the aerial-VPR literature; the original draft just under-specified the retrieval unit.
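A sketch of the overlapping-chunk layout from M-16. Chunk size (600 m, matching the example frame footprint) and 50% overlap are the finding's own figures; the function is illustrative:

```python
def chunk_origins(map_w_m, map_h_m, chunk_m, overlap=0.5):
    """Origins of frame-footprint-sized VPR chunks laid over the basemap
    with the given overlap, so any frame footprint lands inside >= 1 chunk."""
    step = chunk_m * (1 - overlap)                         # 50% overlap -> half-chunk stride
    xs = [x * step for x in range(int((map_w_m - chunk_m) // step) + 1)]
    ys = [y * step for y in range(int((map_h_m - chunk_m) // step) + 1)]
    return [(x, y) for y in ys for x in xs]

origins = chunk_origins(3000, 3000, 600)   # 3 km x 3 km sector, 600 m chunks -> 9 x 9 grid
```

One DINOv2-VLAD descriptor per origin goes into the FAISS index; a multi-scale variant repeats this with a coarser `chunk_m` per altitude band.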
**M-17 (Component 2 / Invocation policy) — VPR is a re-loc-trigger module, not an every-frame module.** Per Component 5 EKF analysis, in steady state (recent anchor < 2 s, σ_xy < 20 m, VO healthy), a geometric prior from the IMU + VO predicted position is enough to pick top-K candidate VPR chunks by **distance alone** — no DINOv2 forward needed. VPR's value is concentrated in the resilience paths:
- **AC-NEW-1 cold start** — no IMU prior at all → VPR is the only viable way to narrow the search.
- **AC-3.2 sharp turn** — VO fails, IMU prior degrades fast → VPR re-anchors.
- **AC-3.3 disconnected segment** — explicitly requires "global descriptor retrieval" — VPR.
- **σ_xy growth** — when EKF position covariance escapes σ_xy ≥ 50 m, geometric prior is too wide; VPR re-narrows.
**Conclusion**: control flow is `if (steady_state) { use geometric prior } else { invoke VPR }`. Saves ~10–35 ms/frame and lets the VPR backbone idle (one less concurrent process during cruise). The DINOv2-base TRT engine still has to be resident in GPU memory for fast invocation.
- Source: derived from M-1, M-5, AC-NEW-1, AC-3.2, AC-3.3, EKF analysis. Independently corroborated by user feedback on the architecture.
- Confidence: ✅ High.
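The invocation policy above fits in a few lines; a sketch with the M-17 thresholds hard-coded (`geometric_topk` / `vpr_topk` are hypothetical caller-supplied callables, not names from the draft):

```python
from dataclasses import dataclass

@dataclass
class FilterHealth:
    anchor_age_s: float   # time since last accepted satellite anchor
    sigma_xy_m: float     # EKF horizontal position 1-sigma
    vo_healthy: bool

def steady_state(h: FilterHealth) -> bool:
    # M-17 thresholds: recent anchor (< 2 s), tight covariance (< 20 m), VO alive.
    return h.anchor_age_s < 2.0 and h.sigma_xy_m < 20.0 and h.vo_healthy

def candidate_chunks(h: FilterHealth, geometric_topk, vpr_topk):
    """The DINOv2 forward (vpr_topk) only runs on the resilience paths;
    cruise picks candidates by distance from the predicted position."""
    return geometric_topk() if steady_state(h) else vpr_topk()
```

The σ_xy ≥ 50 m re-narrow trigger lives one level up: it flips `vo_healthy`-style health flags rather than being a fourth condition here.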
**M-18 (Component 2 / Fallback) — expanding-window retry on unconvincing top-1.** Standard pattern in re-loc literature: if top-1 VPR similarity is below threshold OR top-1/top-2 gap is below threshold (both signs that VPR is unsure), **expand the candidate set to adjacent chunks** (±1 chunk in each direction = 8 neighbours in a regular grid; or radius-N expansion for sparse-overlap layouts) before failing over to operator-assisted re-loc. Cheap to add: same FAISS index, larger K, no extra DINOv2 forward.
- Source: standard relocalization pattern (cf. ORB-SLAM3, GISNav, NGPS implementations).
- Confidence: ✅ High.
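The expanding-window retry can be sketched as a wrapper around the same index query; similarity and gap thresholds below are illustrative, not calibrated:

```python
def retrieve_with_retry(query_vec, search_fn, neighbours_fn,
                        sim_min=0.35, gap_min=0.05, k=5, k_expand=20):
    """Top-1 acceptance test with expanding-window fallback (M-18 sketch).

    search_fn(vec, k) -> [(chunk_id, similarity)] best-first (e.g. FAISS wrap).
    neighbours_fn(chunk_id) -> adjacent chunk ids (8-neighbourhood).
    Returns (candidate_ids, status): a confident single id, or a widened
    candidate set to hand to the matcher before operator-assisted re-loc.
    """
    hits = search_fn(query_vec, k)
    if hits and hits[0][1] >= sim_min and \
       (len(hits) < 2 or hits[0][1] - hits[1][1] >= gap_min):
        return [hits[0][0]], "confident"
    wide = search_fn(query_vec, k_expand)          # same index, larger K
    cand = [cid for cid, _ in wide]
    for cid, _ in hits:                            # plus grid neighbours of the
        cand += [n for n in neighbours_fn(cid) if n not in cand]  # unsure hits
    return (cand, "fallback") if cand else ([], "fail")
```

Note no extra DINOv2 forward happens on the fallback path: the query descriptor is reused, only the candidate set grows.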
**M-19 (Component 2 / Active-conflict robustness) — multi-scale chunks + OSM road overlay + sector-driven K + negative cache.** Active-conflict scene change (destroyed buildings, cratering, dam flooding, road realignment) is a frequent operational reality in the eastern/southern Ukraine deployment, not an edge case. Layered mitigations beyond M-16/17/18:
- **Multi-scale VPR chunks**: maintain BOTH fine-scale (z=20-derived) and coarse-scale (z=17/18-effective) chunk descriptor sets. Coarse-scale descriptors capture road-network + field-boundary + waterway structure that survives building destruction. ~12 MB extra disk, ~3 min one-time pre-flight DINOv2 forward.
- **OSM road-network overlay**: extract OSM road geometry for the operational area pre-flight as a binary "road-mask" tile sidecar; matcher applies bonus inlier weighting on keypoints that fall on road edges. GISNav uses this pattern. Roads are the single most change-stable feature in active-conflict zones.
- **Sector volatility classification drives K** (binds to AC-NEW-6 `sector_class`): K=5 stable / K=20 active / K=50 expanding-window-fallback.
- **Onboard-tile rapid promotion in active sectors**: refines M-9's 2-flight voting — single-flight promotion allowed in active sectors when σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 % (dual gate keeps safety).
- **Negative cache**: tiles repeatedly rejected by matcher across flights get `trust_level = stale_destroyed`, excluded from retrieval until Service refresh.
The two highest-leverage of these are multi-scale chunks and OSM overlay; the rest are essentially free.
- Source: derived from M-9, M-16, M-17, M-18 + standard cartographic-stability reasoning + GISNav reference architecture; user-driven concern about active-conflict scene change frequency.
- Confidence: ✅ High on multi-scale + OSM (literature-backed); ⚠️ Medium on the OSM-road-overlap-≥-70 % numeric threshold (needs empirical calibration).
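The sector-driven K policy is small enough to pin down as code; class names bind to AC-NEW-6 `sector_class`, the numbers are M-19's own, and the unknown-sector default is our (conservative) assumption:

```python
SECTOR_K = {"stable": 5, "active": 20}

def retrieval_k(sector_class: str, expanding_fallback: bool = False) -> int:
    # K=50 is reserved for the expanding-window fallback path (M-18/M-19);
    # an unrecognized sector class conservatively gets the "active" K.
    return 50 if expanding_fallback else SECTOR_K.get(sector_class, 20)
```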
**M-20 (Component 1) — Storage tile zoom level pinned at z=20.** Trade-off analysis in response to user question (z=18 vs z=20):
- ADTi 20MP APS-C @ 1 km AGL with a 24–50 mm lens → frame GSD in the 8–18 cm/px range. Mid-range (~35 mm lens) → ~12 cm/px.
- Frame-vs-reference scale ratio at z=20 (30 cm/px): **2.5×** — well within the SP+LG / GIM-LightGlue "well-handled" band (≤4× per published IMW-style benchmarks).
- Frame-vs-reference scale ratio at z=18 (~120 cm/px): **10×** — outside the SP+LG well-handled band; sub-pixel keypoint-correspondence accuracy degrades sharply, pushing AC-1.2 (50 % @ 20 m) and AC-2.2 (MRE < 2.5 px) into risk territory.
- Storage @ z=20 over 400 km² ≈ 2.8 GB cache + 30 MB DEM + 16 MB VPR chunk index ≈ 3 GB total — **28 % of the 10 GB budget**, leaving 7 GB headroom for FDR overflow and multi-scale chunks (M-19).
- Storage @ z=18 over 400 km² ≈ 220 MB total — saves ~2.5 GB but provides no operational benefit at our budget level.
- Pre-flight compute: z=20 takes ~5 min; z=18 takes ~3 min. Both trivial on the bench. Not a deciding factor.
- **Decision: z=20** for the storage tile. The accuracy benefit is meaningful; the storage cost fits comfortably. Folded into restrictions.md.
- Source: derived analysis using ADTi camera spec + Mode B finding S40 (DINOv2 latency) + IMW-style matcher-resolution-mismatch data.
- Confidence: ✅ High.
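The z=20 storage figure reproduces as a back-of-envelope: a 30 cm/px × 512 px tile spans ≈ 153.6 m on the ground; the ~165 KB/tile JPEG size below is an assumed average, not a measurement:

```python
import math

def tile_cache_size(area_km2: float, gsd_m_px: float = 0.30,
                    tile_px: int = 512, bytes_per_tile: int = 165_000):
    """Tile count and raw cache size for an operational area.

    gsd=0.30 gives the z=20 storage tile; gsd≈1.20 approximates the
    z=18-effective alternative discussed in M-20.
    """
    tile_m = gsd_m_px * tile_px                     # 153.6 m at z=20
    n_tiles = math.ceil(area_km2 * 1e6 / tile_m ** 2)
    return n_tiles, n_tiles * bytes_per_tile / 1e9  # (count, GB)

n20, gb20 = tile_cache_size(400)                    # ~17k tiles, ~2.8 GB
n18, gb18 = tile_cache_size(400, gsd_m_px=1.20)     # ~1.1k tiles, ~0.17 GB
```

Adding the 30 MB DEM and 16 MB VPR chunk index to `gb20` lands on the ≈ 3 GB / 28 %-of-budget figure quoted above.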
**M-21 (2chADCNN re-classification) — ceiling reference, NOT bench-off candidate.** Closer reading of S50 (MDPI Drones 2023) reveals 2chADCNN is structurally incompatible with our bench-off:
- **Output format**: template-overlap region (IoU-style), not sub-pixel keypoints. Component 3's PnP needs keypoint correspondences; 2chADCNN can't supply them.
- **Tested altitude band**: 252–500 m AGL, not 1 km. Their experimental envelope doesn't cover our regime.
- **No Jetson / TRT benchmark**: trained on Intel i5 + 8 GB RAM CPU only.
- **Method paradigm**: traversal-search template matching (slide template over satellite image at every position, compute similarity). Doesn't scale to a 400 km² operational area in our latency budget.
- **Reported numbers**: real-summer overlap-IoU 0.92–0.99; synthetic-snow overlap-IoU 0.82–0.95. Useful as a published season-robustness *number* against which we benchmark our chosen modern matcher (SP+LG / GIM-LightGlue) — but not as a candidate for the matcher slot itself.
Walks back the "optionally a bench-off candidate" tag in M-14. 2chADCNN is **purely a season-robustness ceiling reference**.
Newer / more relevant season-aware references for the open-research reading list:
- **AFF-CNN-HTransformer cross-perspective UAV-satellite matching** (Sci Reports 2025) — hybrid CNN+Transformer cross-view + season.
- **Polar-coordinate-transformation rotation-and-season-invariant UAV-satellite matching** (2026) — explicitly addresses both rotation and season; intersects nicely with our IMU-driven de-rotation step.
- Source: closer reading of S50 + new search results 2025-2026.
- Confidence: ✅ High on 2chADCNN re-classification; ⚠️ Medium on the newer papers (need to read full PDFs before bench-off inclusion).
---
## Mode B Round 2 (component replacements & sweep) — appended 2026-04-26
**M-22 (Component 4 / VO architecture) — custom 2-frame homography VO is the wrong design.** Source: S52 (AFIT thesis), S60 (cuVSLAM), S64 (Isaac ROS UAV reference), S72 (high-altitude VIO), S73 (DPV-SLAM).
- Draft02 C-4 says "custom 2-frame VO via SuperPoint+LightGlue homography". This skips loop closure, sparse bundle adjustment, keyframe-based local mapping — every mechanism that bounds drift in production VO/SLAM systems.
- AFIT thesis (S52) shows even ORB-SLAM2 / SVO / DSO struggle on real fixed-wing flights; a hand-rolled 2-frame homography VO will be strictly worse.
- High-altitude VIO field test (S72): stereo-VIO = 2.186 m / 800 m at 40–100 m AGL; monocular-VIO is "acceptable but worse". At 1 km AGL motion parallax shrinks ~10–25× per frame, further degrading monocular VO.
- **Recommendation: replace custom 2-frame VO with cuVSLAM (S60, S64) in monocular + IMU mode.**
- Confidence: ✅ High on "custom 2-frame VO is wrong"; ⚠️ Medium on "cuVSLAM is the right replacement" — high-altitude fixed-wing performance is unproven on cuVSLAM's published benchmarks (KITTI urban driving + EuRoC indoor MAV). Bench-off in F-T1b mandatory.
**M-23 (Component 4 / VO candidate evaluation on Jetson Orin Nano Super).** Source: S60, S61, S62, S71, S73, S76.
- **cuVSLAM (S60)**: NVIDIA-supported, CUDA-optimized, drop-in via `isaac_ros_visual_slam`, Apache-2.0. Reference designs on Orin Nano (S64, S77) confirm runtime feasibility. <1% ATE on KITTI / <5cm on EuRoC. **Verdict: v1 lead candidate.**
- **DPVO / DPV-SLAM (S61, S73)**: SOTA deep VO, but DPVO-QAT++ is benchmarked on RTX-4060, not Jetson. Original DPVO @ 25× real-time on RTX-3090 (4 GB) → Orin Nano Super extrapolation ≈ 4–10 FPS without QAT, ≈ 6–15 FPS with QAT. **Borderline for 10 Hz target; not v1.**
- **MASt3R-SLAM (S62)**: 15 FPS on a single GPU; sub-1 Hz extrapolated on Orin Nano Super. **Infeasible for inline v1.**
- **VINS-Fusion / OpenVINS / BASALT / SVO Pro (S71)**: Classical, well-tested, but require manual integration (OpenCV pinning, ArUco fixes, DDS / ROS plumbing) and no Jetson-class CUDA acceleration of the front-end. Higher integration cost than cuVSLAM with no accuracy advantage.
- **Custom 2-frame homography VO (current draft02 plan)**: M-22 already disqualified.
- Confidence: ✅ High.
**M-24 (Component 3 / cross-view matcher — LiteSAM evaluation).** Source: S58.
- LiteSAM is **purpose-built for satellite↔aerial AVL in GPS-denied environments**. Architectural choices (TAIFormer + MinGRU sub-pixel refinement) are tailored to large appearance variations and texture-scarce regions — exactly our regime.
- Results: 6.31 M params (2.4× smaller than EfficientLoFTR); RMSE@30 = 17.86 m on UAV-VisLoc; 61.98 ms on standard GPU; **497.49 ms on Jetson AGX Orin** (FP16-optimized).
- **Crucial extrapolation**: AGX Orin INT8 throughput ≈ 275 TOPS, Orin Nano Super ≈ 67 TOPS → 4× scaling factor → **LiteSAM on Orin Nano Super ≈ 1500–2000 ms / pair**. Well outside our 400 ms p95 budget for inline use.
- **Three useful roles (not the inline matcher)**:
- (a) **Re-localization fallback** — invoked rarely (cold start, σ_xy > 50 m), 1.5–2 s latency tolerable.
- (b) **Validation oracle** — ground-truth-quality matches for offline regression bench.
- (c) **Distillation teacher** — train a smaller student model with LiteSAM-supervised correspondences for the satellite-aerial domain.
- **Verdict: add LiteSAM in roles (a)/(b)/(c); SP+LG (TRT FP16/INT8) remains the inline matcher.**
- Confidence: ✅ High on architectural fit; ⚠️ Medium on the 4× AGX-Orin → Orin Nano Super scaling — needs empirical confirmation in bench-off.
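The AGX-Orin → Orin Nano Super extrapolation is a bare TOPS ratio; writing it out makes the assumption explicit (it deliberately ignores memory bandwidth and precision-path differences, which is exactly why the finding carries a Medium-confidence flag and mandates a bench-off):

```python
def scale_latency_ms(latency_ms: float, tops_src: float, tops_dst: float) -> float:
    """Naive compute-bound latency scaling between two accelerators.

    Assumes the workload is purely TOPS-limited; DRAM bandwidth, INT8 vs
    FP16 engine paths, and clock behaviour at 25 W are all ignored.
    """
    return latency_ms * tops_src / tops_dst

# LiteSAM: 497.49 ms on AGX Orin (275 TOPS) -> Orin Nano Super (67 TOPS).
litesam_nano_ms = scale_latency_ms(497.49, 275, 67)   # ~2.0 s per pair
```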
**M-25 (Component 3 / cross-view matcher — RoMa v2 / MapGlue / MATCHA).** Source: S63 + earlier MapGlue / MATCHA notes.
- **RoMa v2 (S63)**: SOTA dense matcher, frozen DINOv3 + custom CUDA + predictive covariance. GPU-class compute. Infeasible inline on Orin Nano Super; viable as **offline ceiling reference** for Component 3 bench-off.
- **MapGlue / MATCHA**: Cross-modal/multimodal matchers — useful research-track candidates but no Jetson deployment data; same offline-only verdict.
- **Verdict**: not a v1 candidate; offline ceiling reference. The matcher bench-off (deferred research item) MUST include both as ceilings so we know how much accuracy we're trading away by using SP+LG inline.
- Confidence: ✅ High.
**M-26 (Component 5 / EKF→ESKF question — architectural reframing).** Source: S65, S66, S67, S68, S69.
- **The FC (ArduPilot 4.5+) runs EKF3, a classical extended Kalman filter — not an ESKF.** PX4 EKF2 is the ESKF (S68); we are not on PX4. We cannot swap the FC's filter.
- The "EKF vs ESKF" debate therefore applies **only to the companion-side filter** (Component 5 in draft02).
- **Best practice for ArduPilot ExtNav setups (S65, S66, S67)**: companion does NOT run a heavy filter on top. Companion produces (visual fix → GPS_INPUT) and/or (relative pose → ODOMETRY) with well-calibrated covariances; ArduPilot EKF3 fuses those with the FC's IMU.
- ArduPilot issues #30076 (S65) and #32506 (S66) document concrete failure modes when feeding the FC two simultaneous position sources — **only one position source per axis at a time**. The hybrid `GPS_INPUT + ODOMETRY` plan from M-1 must therefore split responsibilities by **channel**, not duplicate position on both.
- **Architectural revision**: the companion-side EKF in draft02's C-5 is **not necessary** for v1. It can be replaced by a lightweight **"covariance calibrator + outlier gate + source-label producer"**: each upstream (matcher, VO, IMU passthrough if any) emits a hypothesis with a covariance; a Mahalanobis gate rejects outliers; covariances are re-scaled if empirical residuals indicate over- or under-confidence; results are emitted on the appropriate MAVLink channel. No state propagation, no IMU integration on the companion.
- **If a companion-side filter is justified later** (e.g., to smooth visual fixes before they reach the FC, or to integrate VO with the FC's downsampled-IMU stream the companion can subscribe to), use **vanilla ESKF (S69)** for orientation correctness — but only after F-T9 SITL shows the FC's EKF3 cannot handle our raw input quality.
- Confidence: ✅ High on dropping the companion-side EKF for v1; ⚠️ Medium on whether we'll need to re-introduce one for v1.x.
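The "covariance calibrator + outlier gate" that replaces the companion-side EKF reduces to a few lines; a sketch, noting that the chi-square threshold is a per-environment tuning parameter (cf. W4.c below), so the 9.21 default (99 % point for 2 DoF) is only a starting point:

```python
import numpy as np

def gate_fix(innovation: np.ndarray, S: np.ndarray, chi2_thresh: float = 9.21) -> bool:
    """Mahalanobis outlier gate for one position hypothesis.

    innovation: residual vs. the predicted position (m).
    S: innovation covariance (hypothesis covariance + predictor covariance).
    """
    d2 = float(innovation @ np.linalg.solve(S, innovation))
    return d2 <= chi2_thresh

def recalibrate(P: np.ndarray, nis_history: list[float]) -> np.ndarray:
    # Inflate/deflate a source's reported covariance when the mean
    # normalized innovation squared drifts from its expected value (~DoF).
    # Sketch only; a production calibrator would window and smooth this.
    ratio = float(np.mean(nis_history)) / P.shape[0]
    return P * max(ratio, 1e-3)
```

Accepted hypotheses are then emitted on the appropriate MAVLink channel with the (re-scaled) covariance; no state propagation happens on the companion.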
**M-27 (Component 1b / Ortho-Tile Generator — use Orthority).** Source: S59.
- Orthority (Python, MIT-class) supports frame + RPC camera models, GeoTIFF DEM lookup, RPC refinement, pan-sharpening — i.e., everything draft02's hand-rolled pinhole-on-DEM ortho was going to reinvent.
- Pip-installable (`pip install orthority`). API-driven (per-image ortho via `Ortho` class) → callable inline from our Component 1b worker.
- ODM is post-processing batch SfM — wrong tier; not for per-frame ortho on a 1 km AGL nadir camera with known FC pose.
- **Verdict: replace draft02's "Pinhole projection on per-sector DEM" with Orthority frame-camera ortho.** Falls back to a 6-line `cv2.warpPerspective` + bilinear DEM lookup if Orthority's per-frame latency on Orin Nano Super blows our budget — measure in F-T14.
- Confidence: ✅ High on Orthority being the right tier; ⚠️ Medium on the latency assumption — needs measurement.
**M-28 (Component 1 / tile storage — MBTiles WAL stays; PMTiles / COG considered).** Source: COG/PMTiles search results + draft02 M-8.
- **COG**: Highly-tiled COG metadata can trigger 500 MB initial download on a 7 GB file (geotiff.js issue #479) — defeats selective access on a bandwidth-constrained UAV system. Not a fit.
- **PMTiles**: Single-file alternative to MBTiles, cloud-optimized. Good for HTTP serving (RPi tests show competitive performance). For our use case (local microSD, embedded reader+writer), PMTiles loses the SQLite-WAL concurrency story we already designed for in M-8.
- **Verdict: MBTiles + WAL (M-8) remains the right choice.** No revision.
- Confidence: ✅ High.
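The WAL choice that M-28 keeps is a two-pragma affair; a minimal sketch of the concurrent reader+writer open path (schema per the MBTiles spec; `timeout` guards the rare write-lock collision):

```python
import sqlite3

def open_mbtiles(path: str, writer: bool = False) -> sqlite3.Connection:
    """Open an MBTiles file with WAL so the tile-write path and the
    cache-miss read path don't block each other (M-8 / M-28)."""
    con = sqlite3.connect(path, timeout=5.0)
    con.execute("PRAGMA journal_mode=WAL")
    con.execute("PRAGMA synchronous=NORMAL")   # durable enough for a cache
    if writer:
        con.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            " zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER,"
            " tile_data BLOB,"
            " UNIQUE (zoom_level, tile_column, tile_row))")
        con.commit()
    return con
```

One writer connection (Component 1b) and any number of readers coexist; WAL readers never block on the writer's uncommitted pages.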
**M-29 (Component 9 / orchestrator — ROS 2 vs DIY Python).** Source: S64, S77.
- ROS 2 Humble + JetPack 6 + Isaac ROS 3.2 + cuVSLAM + MAVROS is a **proven reference architecture on Orin Nano Super** (S64, S77).
- If we adopt cuVSLAM (M-22/M-23), the lowest-friction path is to consume cuVSLAM via `isaac_ros_visual_slam` (ROS 2 wrapper) and bridge to the FC via MAVROS — not to re-export cuVSLAM's C++ API into a custom Python orchestrator.
- **ROS 2 cost**: extra ~2–5 % CPU for DDS + topic serialization; learning curve for the team; deployment image grows ~200 MB.
- **ROS 2 benefit**: free integration of cuVSLAM, MAVROS, Isaac ROS perception nodes; battle-tested; observability via `ros2 bag` and `rqt_*` tooling.
- **DIY Python alternative** (draft02 plan): keeps everything in one asyncio process; lowest overhead; but we re-export every ROS 2 component we want to consume (cuVSLAM via Python bindings, MAVROS-equivalent via pymavlink, etc.).
- **Verdict: lean toward ROS 2 Humble + Isaac ROS for v1**, with our matcher / VPR / ortho / FDR / fusion-glue nodes implemented as ROS 2 Python nodes (`rclpy`). Decision is **not locked** — it's the largest open architectural question for round 2 and the user should be asked.
- Confidence: ⚠️ Medium — depends on whether the team has ROS 2 experience and whether the ~5 % CPU overhead is acceptable inside the latency budget. **This is a Q for the user.**
**M-30 (Component 5 / hybrid GPS_INPUT + ODOMETRY — channel split per S65/S66/S67).** Source: S65, S66, S67.
- M-1 (round 1) said "emit BOTH GPS_INPUT AND ODOMETRY in parallel". S65/S66/S67 say **only one position source per axis at a time** and document concrete bugs when the FC sees two.
- **Revised channel split**:
- Option A (simplest, recommended for v1): **GPS_INPUT carries position + velocity** (lat/lon/alt + N/E/D velocities + h_acc/v_acc/vel_acc covariance scalars). ODOMETRY is **disabled** for v1. ArduPilot configured `EK3_SRC1_POSXY = GPS`, `EK3_SRC1_VELXY = GPS`, `EK3_SRC1_YAW = GPS+Compass`. Our companion provides a "GPS-equivalent" via GPS_INPUT (`GPS1_TYPE=14`); ArduPilot treats it identically to a real receiver. Failover to backup GPS via `EK3_SRC2_*`.
- Option B (richer, v1.1+): **ODOMETRY carries position + velocity + yaw + full 21-element covariance**, GPS_INPUT carries **fix only as fallback** (not actively fused while ODOMETRY is healthy). ArduPilot configured `EK3_SRC1_POSXY = ExternalNav`, `EK3_SRC1_YAW = ExternalNav`, with `EK3_SRC2_POSXY = GPS` as backup. Requires PR #30080-class fixes for clean source switching.
- **Original M-1 (both channels for the same axis) is a misconfiguration**, not a feature. Walk back.
- **Verdict**: v1 ships Option A. Option B is v1.1 territory once F-T9 confirms source-switching behaves cleanly under PR #30080.
- Confidence: ✅ High.
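Option A's "well-calibrated covariance scalars" means collapsing the fused 2×2 horizontal covariance into the single `horiz_accuracy` field that GPS_INPUT carries; taking the largest eigenvalue (sketch below, function name ours) keeps EKF3 from under-weighting an elongated error ellipse:

```python
import math

def gps_input_accuracies(P_xy, var_z: float, var_v: float):
    """1-sigma scalars for GPS_INPUT's horiz/vert/speed accuracy fields.

    P_xy: 2x2 horizontal position covariance (m^2), row-major nested list.
    Closed-form max eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]].
    """
    a, b, c = P_xy[0][0], P_xy[0][1], P_xy[1][1]
    lam_max = 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return math.sqrt(lam_max), math.sqrt(var_z), math.sqrt(var_v)
```

The returned triple maps onto `horiz_accuracy` / `vert_accuracy` / `speed_accuracy` in the outgoing GPS_INPUT message.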
**M-31 (Component 6 / sysid sharing on the wire).** Source: S65, S67.
- Round 1 M-6 picked "distinct system-IDs for MAVSDK (sysid=10) and pymavlink (sysid=11), sharing the serial port via ArduPilot's native MAVLink routing — no router daemon".
- This decision survives round 2 unchanged. The distinct-sysid trick + ArduPilot native routing is documented and works for any MAVLink2 stack. No router CVE exposure (M-6 / S45).
- Open task: confirm the chosen sysids don't collide with any MAVLink2 forwarding rule on QGroundControl GCS-side; document in deploy runbook.
- Confidence: ✅ High.
**M-32 (Component 9 / Python topology — confirmed).** Source: S55.
- Round 1 M-10: stay on CPython 3.11/3.12; defer free-threaded 3.13 to v1.1. Survives round 2 unchanged.
- If Component 9 moves to ROS 2 (M-29), the Python version question still applies — `rclpy` supports 3.11/3.12; 3.13 free-threaded is also experimental there.
- Confidence: ✅ High.
**M-33 (Component 2 / VPR — no new entrants worth adding).** Source: round-2 searches.
- Searched for newer VPR SOTA than DINOv2-SALAD / BoQ (CVPR 2024). The 2025 landscape is matcher-centric (RoMa v2, LiteSAM, MASt3R-SLAM); no new VPR backbone has displaced SALAD/BoQ on aerial cross-domain.
- Round 1 shortlist {AnyLoc, SALAD, BoQ, MixVPR} stands.
- Confidence: ✅ High.
**M-34 (Component 4 / camera intrinsics learning — calibration-free SLAM).** Source: S62.
- MASt3R-SLAM is calibration-free; cuVSLAM expects intrinsics. Our nav cam (ADTi 20MP APS-C) will be calibrated pre-flight via standard checkerboard procedure → cuVSLAM's intrinsics requirement is **not** a friction point.
- Confidence: ✅ High.
**M-35 (Component 5 / IMU access on the companion — open question).** Source: S64 reference designs.
- The reference cuVSLAM-on-Jetson designs (S64) use the camera's built-in IMU (RealSense D435i) for VIO. Our nav cam (ADTi 20MP APS-C) has no IMU; the FC has the IMU.
- Two paths to feed IMU into companion-side cuVSLAM:
- (a) MAVLink `RAW_IMU` / `SCALED_IMU` stream from FC → companion subscribes via pymavlink, feeds cuVSLAM. **~1 kHz IMU on FC down-rated to ~200–400 Hz over MAVLink** is sufficient for monocular VIO; latency budget acceptable.
- (b) Add a dedicated companion-side IMU (BNO055 / ICM-42688P / Bosch BMI270 over SPI/I²C) with its own time sync. More hardware, but no MAVLink-bus contention.
- **Verdict v1**: try path (a); if cuVSLAM's IMU sync sensitivity (timestamping) is too tight for MAVLink-rated IMU, fall back to (b) in v1.1.
- Confidence: ⚠️ Medium — depends on cuVSLAM's tolerance for IMU rate / timing jitter; needs empirical check during integration.
# Mode B Decomposition — Adversarial Assessment of `solution_draft01.md`
**Mode**: B (Solution Assessment).
**Question type**: Problem Diagnosis + Decision Support.
**Novelty sensitivity**: **High**. Embedded CV/SLAM, ArduPilot MAVLink2 signing maturity, JetPack version, and matcher SOTA all churn fast — prefer 2024-Q4 → 2026-Q2 sources.
**Goal**: per Mode B template, find weak points (functional / security / performance) per draft component and propose either a stronger alternative or an explicit mitigation. Output is `solution_draft02.md` with an "Assessment Findings" table at the top.
## Boundary
- **Population**: a single fixed-wing UAV running the GPS-denied onboard pipeline, 1 km AGL, 60 km/h cruise, 8 h endurance, eastern/southern Ukraine.
- **Geography**: deployed in active-conflict / contested EW environment.
- **Timeframe**: deployment v1 within the next ~4–6 months from now (mid-2026).
- **Level**: companion-computer code + integration. The Suite Satellite Service, the AI-camera detector, the FC firmware, and the airframe are out of scope as components but appear as interfaces under attack.
## Perspectives chosen (≥3 mandatory)
1. **Implementer / engineer** — what published Jetson Orin Nano Super numbers say about the actual latency budget, what the GIL-on-hot-path failure modes are, what is hard about TRT-deploying DINOv2-VLAD.
2. **Contrarian / devil's advocate** — every committed choice in the draft has a "why not X" answer; surface them.
3. **Domain practitioner** — what people running ArduPilot + companion CV in production have written about MAVLink2 signing, mavlink-router, GPS_INPUT injection, cross-view matchers in active service.
4. **Security / red-team**`GPS_INPUT` is a high-trust local channel; tile cache is operationally sensitive. Realistic attack surface and mitigations.
## Weak-point sub-questions (drives Mode B web search)
### W1. Cross-view matcher commitment (Component 3)
The draft pins SuperPoint+LightGlue / XFeat / MASt3R as the bench-off candidates, with 1024×768 as the working downsample.
- W1.a. **Is the bench-off shortlist still current as of 2026-Q2?** Did GIM (2024), BoQ (2024), MASt3R-SfM (2025), RoMa-DC (2025), or Map-Free-Reloc 2025 leaderboard winners change the picture?
- W1.b. **Is "1024×768 starting point" empirically defensible on Orin Nano Super 25 W?** Published TRT FPS / latency for SP+LG and XFeat at this resolution on the Orin Nano class.
- W1.c. **Cross-view-specific failure modes at 1 km AGL** that the bench-off won't catch — illumination, season, recent-conflict landscape change. Are any matchers explicitly evaluated on temporal change?
- W1.d. **Why not training-free 3D-grounded matching (MASt3R/Mast3r-SfM) as primary** instead of as stretch? What's the realistic Orin Nano latency budget for these.
Query variants: "LightGlue Jetson Orin Nano benchmark 2025 2026", "SuperPoint TensorRT FP16 Orin Nano latency", "MASt3R embedded GPU benchmark", "GIM image matching cross-view 2024", "BoQ visual place recognition", "RoMa DKM aerial cross-view 2025", "image matcher seasonal change benchmark".
### W2. VPR backbone commitment (Component 2)
Draft picks AnyLoc (DINOv2-VLAD) primary + MixVPR fast-lane.
- W2.a. **DINOv2 ViT-B/14 latency on Orin Nano Super 25 W** — is the draft's "~50–80 ms / 224×224" empirically backed?
- W2.b. **2025 SOTA**: SALAD, BoQ (Bag-of-Queries), CricaVPR — do any beat AnyLoc on aerial cross-domain at meaningful latency?
- W2.c. **AnyLoc unsupervised VLAD** is training-free, but is the VLAD codebook quality stable across operational areas (Ukraine specifically)? Any published failure cases?
Query variants: "AnyLoc Jetson benchmark", "DINOv2 ViT-B TensorRT FP16 latency Orin", "SALAD visual place recognition aerial 2024", "BoQ visual place recognition", "CricaVPR aerial benchmark", "VPR aerial Ukraine seasonal".
### W3. Process topology — "single Python process + asyncio + TRT subprocess workers via CUDA IPC"
Draft commits to this for v1 (Component 9).
- W3.a. **GIL on the hot path** — is asyncio + subprocess workers actually GIL-safe at 3 fps × 1 km AGL with all the I/O (MAVLink, FDR, tile cache lookups, EKF math)? Real-world failure stories from ArduPilot/PX4 companion-computer projects.
- W3.b. **CUDA IPC for tensor handoff** — known issues on Jetson (unified memory model: is CUDA IPC even meaningful when CPU and GPU share the LPDDR5 pool)?
- W3.c. **Subinterpreters / free-threaded Python (3.13+)** — is the project using a Python old enough that subinterpreters aren't an option?
- W3.d. **Alternatives**: ROS 2 Humble (rejected in draft), C++ core (rejected), single-process with multiprocessing (not discussed).
Query variants: "Jetson CUDA IPC unified memory", "Python asyncio CUDA real-time deadline", "Python GIL drone companion computer", "PX4 ArduPilot companion computer python production", "ROS2 vs Python single-process VIO embedded", "free-threaded Python 3.13 GPU".
### W4. Loosely-coupled EKF in Python + numba (Component 5)
Draft writes its own loosely-coupled EKF, fuses IMU @ 100 Hz from FC, satellite anchors irregular, VO @ 3 Hz; emits GPS_INPUT.
- W4.a. **Why not just feed `VISION_POSITION_ESTIMATE` to ArduPilot EKF3 and let the FC fuse?** Draft mentions this as "alternative" — what does the practitioner literature say about the actual cost of the dual-fusion choice?
- W4.b. **EKF covariance calibration is famously fragile** (AC-NEW-4 false-position budget rides on it). Are there published gotchas for loose-coupled aerial EKF? What's the right Mahalanobis gate value?
- W4.c. **numba JIT on Jetson** — JIT warmup time hurts AC-NEW-1 (cold-start TTFF <30 s). Real numbers on Jetson Orin Nano JIT compile time.
- W4.d. **Heading observability** — at 1 km AGL nadir, satellite anchoring gives `(lat, lon, h)` but heading is weakly observable from a single anchor unless the matcher emits oriented features. Does the draft's matcher choice cleanly produce yaw with covariance?
Query variants: "ArduPilot VISION_POSITION_ESTIMATE vs GPS_INPUT", "loose coupled EKF aerial gotcha", "EKF Mahalanobis gate visual anchor", "numba Jetson cold start", "monocular yaw observability satellite reference".
### W5. ArduPilot MAVLink2 signing + GPS_INPUT injection security (Component 6)
Draft says "MAVLink2 signing recommended", treats GPS_INPUT as high-trust local channel.
- W5.a. **Production maturity of MAVLink2 signing in ArduPilot 4.5+** as of 2026-Q2 — is it default-on, default-off, key-distribution story?
- W5.b. **Real attack surface**: what does an attacker with serial access to the FC actually need to spoof a GPS_INPUT? Is `mavlink-router` itself an attack-surface widening?
- W5.c. **Companion-side defenses** — health-gate before injecting, fix_type sanity, jam-detection from the other direction.
- W5.d. **Failsafe fallback**: if our GPS_INPUT is rejected by the FC (signing fail), what does ArduPilot do — does AC-NEW-2 (3 s spoof-promotion latency) survive that?
Query variants: "ArduPilot MAVLink2 signing 4.5 production", "MAVLink2 signing key distribution UAV", "ArduPilot GPS_INPUT signing", "mavlink-router security audit", "GPS_INPUT spoof companion computer attack".
### W6. In-flight ortho-tile generation residual error (Component 1b)
Draft: pinhole projection → flat-Earth ground plane → resample to z=20 XYZ tiles. Eligibility gates: σ_xy ≤ 10 m, |bank| / |pitch| ≤ 10°.
- W6.a. **Flat-Earth residual error in eastern/southern Ukraine** — actual relief amplitude. Steppes are not flat at 30 cm/px tile precision; agricultural fields, river valleys, ravines (yary) are common.
- W6.b. **What's the per-tile geo-alignment error budget** that still keeps cross-view anchors valid against the same tile two flights later?
- W6.c. **MBTiles SQLite at 10 GB scale on NVMe**: known issues with concurrent reader+writer (tile-cache miss path is concurrent with tile-write path)? Sharding strategy?
- W6.d. **Dedup by (z, x, y) only** — but the onboard tile carries a parent_pose covariance. If we already overwrite a service-source tile with an "onboard" tile that was written from a 3-σ-bad pose, we've poisoned the next flight's cache. Should the dedup rule include a "trust-only" lock from the Service?
Query variants: "MBTiles concurrent writer reader SQLite", "orthorectification flat earth residual error UAV", "Ukraine eastern terrain relief amplitude", "geotagged tile alignment budget cross-view localization".
### W7. Tile dedup poisoning — onboard tile overwrites service tile
This is a sharper version of W6.d.
- W7.a. The "highest quality wins" rule treats `match_inliers` as a proxy for geo-alignment confidence. But a confidently-bad anchor (over-confident covariance from EKF — see W4.b) writes a "high-quality" tile that's actually misaligned by 50 m. Next flight, that misaligned tile becomes the satellite anchor for *another* anchor, and the error compounds.
- W7.b. **Best-practice from cartography / SfM** for trusting onboard imagery as basemap input.
- W7.c. **Mitigation**: lock tiles whose source is `service` against onboard overwrite for some grace period; require onboard tiles to be "voted" by N independent flights before promotion.
Query variants: "satellite tile pose error compounding", "uav generated tile basemap update sfm trust", "drone-ortho photo dedup quality score".
### W8. Mavic-class footage as deployment-domain proxy
Draft uses internal Mavic flight footage as the deployment-domain V&V proxy. Mavic is a small quadcopter; the deployment platform is a fixed-wing at 1 km AGL.
- W8.a. **What does the literature say** about transferring CV/VO/VPR results from quadcopter footage to fixed-wing? Camera dynamics differ (rolling shutter, vibration spectrum, frame rate, motion-blur profile, AGL band).
- W8.b. **Synthetic IMU from Mavic video** — user already rejected this. But is there a non-synthetic alternative that the draft missed? E.g., MidAir (synthetic but matched dynamics), TartanAir, public ArduPilot SITL log.
- W8.c. **Risk of false confidence** — ground truth is in the absolute satellite anchor, not the Mavic IMU. So how does the Mavic V&V actually validate AC-NEW-4 (false-position safety) when no fixed-wing IMU is in the loop?
Query variants: "fixed wing vs quadcopter visual SLAM transfer", "drone vibration spectrum fixed-wing quad", "TartanAir aerial dataset fixed-wing".
### W9. Latency budget — is 400 ms p95 actually realistic?
AC-4.1 budget. Draft acknowledges R2 ("latency budget on Orin Nano Super at 1024×768 input is tight").
- W9.a. **Real published Jetson Orin Nano Super 25 W numbers** for: DINOv2 ViT-B forward (224×224), SuperPoint+LightGlue at 1024×768, FAISS top-K over ~10⁴ vectors, EKF update at 100 Hz IMU.
- W9.b. **Steady-state vs transient latency** — does the budget include EKF-output-to-MAVLink-emit overhead, MAVLink serialisation, and the FC's own gating?
- W9.c. **Failure mode if budget blows** — frame-drop is allowed (AC-4.1 says ~10%) but if matcher latency tail is 600 ms, the EKF rides on VO+IMU for >2 frames, and AC-3.4 reloc trigger hits.
Query variants: "DINOv2 Jetson Orin Nano TensorRT FP16 ms", "LightGlue Jetson benchmark FPS 1024", "FAISS Jetson IVF latency".
### W10. AC-NEW-4 false-position safety — Monte Carlo validation realism
P(error >500 m) <0.1%, P(error >1 km) <0.01%.
- W10.a. **What's the standard practice** for validating these probabilities at this magnitude? You need >10⁴ frames of independent failure modes — does the AerialVL + Mavic dataset cover that?
- W10.b. **What does the literature say** about cross-view matcher tail behavior — do failures cluster on specific scene types (forest, repetitive cropland, water, glare)? If yes, dataset bias is the killer.
- W10.c. **EKF-side gating** — Mahalanobis gate is the right tool, but the gate threshold itself is a per-environment tuning parameter. Is there a published recipe?
Query variants: "visual localization tail probability >1km", "cross-view matcher failure clustering forest cropland water", "aerial visual SLAM Monte Carlo safety budget".
### W11. Cold-start TTFF <30 s feasibility
AC-NEW-1.
- W11.a. **TRT engine warm-up cost** on Jetson Orin Nano Super for SP+LG + DINOv2 + EKF JIT. Real numbers.
- W11.b. **FAISS index load + mmap warm**: 10 GB tile cache, IVF over ~10⁵ tile vectors — load time on NVMe.
- W11.c. **First valid GPS_INPUT** path includes: IMU-extrap-from-FC, first frame, VPR retrieve, matcher run, PnP, EKF init, GPS_INPUT emit. Anyone published an end-to-end cold-boot number for this kind of stack on Orin?
Query variants: "TensorRT engine load time Jetson", "FAISS mmap warm 10GB", "Jetson companion computer cold boot time GPS substitute".
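W11.b is measurable directly on target. A minimal page-cache warm sketch for the mmap'd tile cache / FAISS index, so the fault cost is paid at boot rather than on the first lookup; the madvise call is a Linux/Python 3.8+ feature and the byte-stride walk is the portable fallback:

```python
import mmap
import time

def warm_mmap(path: str) -> float:
    """Pre-fault a file into the OS page cache; returns elapsed seconds.
    madvise(WILLNEED) asks the kernel to read ahead; the per-page walk
    then forces the remaining faults synchronously."""
    t0 = time.perf_counter()
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            mm.madvise(mmap.MADV_WILLNEED)
        except (AttributeError, OSError):
            pass  # not available on this platform; the walk below still works
        touched = 0
        for off in range(0, len(mm), mmap.PAGESIZE):
            touched ^= mm[off]  # touching one byte per page faults it in
        mm.close()
    return time.perf_counter() - t0
```

Timed against the real 10 GB cache on the Orin's NVMe, this gives the W11.b number; it also bounds how much of TTFF is I/O vs TRT warm-up.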
### W12. Imagery freshness reality check — Suite Satellite Service refresh cadence
AC-8.2 + AC-NEW-6: <6 months for active sectors, <12 months for stable.
- W12.a. **Is a 6-month refresh actually achievable** for Maxar Vivid / Pléiades Neo / Pléiades over Ukraine in 2026-Q2? Tasking lead time + cloud-cover acceptance + delivery channel.
- W12.b. **Practitioner reports** on what 30 cm Ukraine 2024–2025 imagery actually looks like (smoke, glare, seasonal mismatch, cratering).
- W12.c. **In-flight tile generation** is meant to backfill — but the Service still needs ground-truth tasking to seed the cache for any new operational area before the *first* flight. Is there a chicken-and-egg problem for first deployment to a new sector?
Query variants: "Maxar Vivid Ukraine 2025 refresh tasking", "Pleiades Neo Ukraine cloud cover lead time", "30cm satellite imagery refresh cadence active conflict".
### W13. Resource contention — 8 GB shared LPDDR5 budget
AC-4.2 = <8 GB shared. Draft loads:
- DINOv2 ViT-B TRT engine (~600 MB GPU)
- SP+LG TRT engine (~hundreds of MB)
- FAISS index over 10⁵ tile descriptors
- Tile cache mmap (10 GB on disk, mmap to RAM via OS page cache)
- EKF state + IMU ring buffer
- Python interpreter + asyncio loop + JIT'd numba kernels
- MAVSDK + pymavlink
- W13.a. **Realistic peak RSS** for this stack — is the 8 GB budget headroom or is it a tight squeeze?
- W13.b. **JetPack 6.2 / Ubuntu 22 baseline RAM** consumed before our process even starts.
- W13.c. **Mitigation**: page out the FAISS index, swap, or pin everything?
Query variants: "Jetson Orin Nano 8GB shared budget DINOv2 LightGlue", "JetPack 6.2 base RAM usage", "FAISS pinned memory Jetson".
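W13.a and W13.b are both answerable with a small /proc probe run on the JetPack image before and after our process starts (Linux-only, which the Ubuntu-based JetPack baseline satisfies). A sketch:

```python
def rss_breakdown() -> dict:
    """Current and peak resident-set size of this process, in kB,
    read from /proc/self/status (VmRSS = current, VmHWM = high-water mark)."""
    fields = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmRSS:", "VmHWM:")):
                key, value, _unit = line.split()
                fields[key.rstrip(":")] = int(value)  # kB
    return fields

print(rss_breakdown())
```

Logging this at each load step (TRT engines, FAISS index, tile-cache mmap) gives the realistic peak-RSS curve W13.a asks for; note that mmap'd pages show up as page cache, not RSS, so the 8 GB question needs `free -m` alongside this probe.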
## Completeness audit
Probes (per `references/comparison-frameworks.md` decomposition probes):
| Probe | Covered by | Notes |
|---|---|---|
| **Cost of failure / blast radius** | W5 (signing), W7 (tile poisoning), W10 (false-position) | three-way coverage of safety budget |
| **Time-to-first-result** | W11 | dedicated to TTFF |
| **Operating envelope** | W6 (terrain), W12 (freshness), W13 (memory), W9 (latency) | thermal already in AC-NEW-5 |
| **Maintenance cost** | W3 (Python topology), W4 (EKF code we own) | both addressed |
| **Substitutability of components** | W1 (matcher), W2 (VPR), W3 (process topology), W4 (EKF) | each component has ≥1 alternative-path question |
| **Adversarial / red-team** | W5, W7, W10 | covered |
| **Data-distribution bias** | W8, W10.b, W12 | covered |
| **Hardware-supply-chain risk** | not covered | Orin Nano Super availability is a project-management risk, not a design risk; deferred to Plan |
## Output plan
1. Source registry → append Mode B sources to `01_source_registry.md` as IDs `S40+`.
2. Fact cards → append Mode B facts to `02_fact_cards.md` under "Mode B Findings".
3. Mode B reasoning chain → write `04_reasoning_chain_mode_b.md`.
4. Validation log → write `05_validation_log_mode_b.md`.
5. Final deliverable → write `_docs/01_solution/solution_draft02.md` using `templates/solution_draft_mode_b.md`.
@@ -0,0 +1,94 @@
# Mode B — Round 2 Question Decomposition
**Trigger**: user explicit ask after rolling back from Step 3 (Plan).
**Mode**: B (Solution Assessment of `solution_draft02.md`).
**Date**: 2026-04-26.
**Scope (user-provided)**:
> "1. For VO — is it the most efficient method SP+LG for jetson? are there better ways? 2. for cross-view matcher — there is LiteSAM (https://github.com/boyagesmile/LiteSAM) and other methods specialized for that. Check and investigate in internet possible options. 3. EKF fusion — isn't it ESKF better? Ortho-tile generator — are there are already existing libs for that? or it is not so difficult and easier just to make it manually by ourselves? All in all, make a thorough investigation regarding each component — what's could be either give better confidence with relatively same resource and time footprint, either can provide roughly same confidence faster or lighter on resources."
## Question Type Classification
| # | Sub-Question | Type | Why |
|---|--------------|------|-----|
| Q-R2-1 | Is the SP+LG-based VO design (custom 2-frame homography) the most efficient & accurate VO on Orin Nano Super, or is there a better one? | Decision Support + Problem Diagnosis | Trade-off (compute vs accuracy vs maturity) + diagnoses whether the draft02 design choice is sound. |
| Q-R2-2 | Should LiteSAM (or any specialized satellite-aerial matcher) replace SP+LG / GIM-LG as the inline cross-view matcher? | Decision Support | Trade-off (accuracy vs latency vs role-fit). |
| Q-R2-3 | Is ESKF strictly better than EKF for our fusion stage? | Decision Support + Concept Comparison | Comparison + applicability boundary (ArduPilot vs companion). |
| Q-R2-4 | Should we use an existing ortho-tile generator library, or DIY? | Decision Support | Build-vs-buy. |
| Q-R2-5 | Is there a newer/better option for **every other component** (VPR, tile storage, MAVLink, software platform, DEM, etc.) that could give better confidence at same/lower resource footprint? | Knowledge Organization + Decision Support | Sweep audit of remaining components. |
Mode-B classification rule: **Problem Diagnosis + Decision Support** — applies to every sub-question above.
## Research Subject Boundary Definition
| Dimension | Boundary | Notes |
|-----------|----------|-------|
| **Population** | Embedded autonomous-flight stack on **Jetson Orin Nano Super (8 GB shared)** companion + **ArduPilot 4.5+** flight controller. Fixed-wing UAV airframe, 1 km AGL nadir nav cam, ADTi 20MP APS-C @ 3 fps. | Same as round 1. |
| **Geography** | Eastern-Ukraine theatre (active conflict, season variation). | Same as round 1. |
| **Timeframe** | v1 release 2026; v1.1 within 6 months. | Same as round 1. |
| **Level** | Software architecture and component selection (no hardware / no airframe / no GCS). | Same as round 1. |
## Perspectives Used (≥3 required)
| Perspective | Why this round | Example searches |
|-------------|---------------|------------------|
| **Implementer / Engineer** | Round 1 missed a few real engineering gotchas (companion-side filter double-fusion bugs, cuVSLAM as drop-in alternative). | "ArduPilot ExtNav GPS_INPUT double fusion", "cuVSLAM Jetson Orin Nano monocular fixed-wing" |
| **Practitioner / Field** | Look at production GPS-denied UAV reference designs on the same hardware target. | "ROS 2 Humble Jetson Orin Nano Super JetPack 6 MAVROS ArduPilot integration GPS-denied", "VINS-Fusion OpenVINS BASALT SVO Pro Jetson Orin Nano benchmark monocular fixed-wing 2025" |
| **Domain expert / Academic** | Verify SOTA matcher and SLAM landscape post-Mode-A. | "MASt3R-SLAM monocular real-time 2025 Jetson DROID-SLAM MAC-VO", "RoMa DKM dense feature matching aerial satellite UAV-VisLoc 2025" |
| **Contrarian** | Actively search for "why not the chosen approach": custom 2-frame VO, SP+LG-only matcher, hybrid GPS_INPUT+ODOMETRY both active. | "ArduPilot ODOMETRY GPS_INPUT companion external visual odometry double-fusion best practice", "fixed-wing UAV high altitude visual odometry 1km AGL accuracy" |
## Search Query Variants Per Sub-Question
(Selected; full search log preserved in agent transcript and `01_source_registry.md` round-2 entries.)
**Q-R2-1 (VO)**:
1. `visual odometry Jetson Orin Nano benchmark 2026 fixed-wing UAV monocular DPVO BASALT OpenVINS SVO Pro`
2. `DPVO Deep Patch Visual Odometry Jetson real-time inference benchmark FPS 2025`
3. `cuVSLAM Jetson Orin Nano monocular fixed-wing aerial visual odometry CUDA Lucas-Kanade`
4. `MASt3R-SLAM monocular real-time 2025 Jetson DROID-SLAM MAC-VO benchmark embedded`
5. `VINS-Fusion OpenVINS BASALT SVO Pro Jetson Orin Nano benchmark monocular fixed-wing 2025`
6. `DPVO Jetson Orin Nano FPS benchmark monocular visual odometry deployment 2025 ARM`
7. `fixed-wing UAV high altitude visual odometry 1km AGL monocular accuracy 2025`
8. `Isaac ROS visual SLAM cuVSLAM Jetson Orin Nano monocular fixed-wing UAV high altitude integration`
9. `DPV-SLAM DPVO real-time Jetson NX Orin port deployment monocular SLAM 2024`
**Q-R2-2 (cross-view matcher)**:
1. `LiteSAM lightweight feature matching satellite aerial imagery 2025 EfficientLoFTR`
2. `cross-view UAV satellite image matching benchmark 2025 XoFTR MatchAnything OmniGlue LoFTR LightGlue`
3. `MapGlue MapAnything XoFTR cross-modal aerial satellite matching Jetson inference 2025`
4. `XFeat lightweight feature matching Jetson Orin TensorRT FPS benchmark 2025`
5. `LightGlue ONNX TensorRT Jetson Orin Nano Super fps 2025 SuperPoint inference benchmark`
6. `RoMa DKM dense feature matching aerial satellite UAV-VisLoc benchmark accuracy 2025`
7. `aerial drone matcher MatchAnything OmniGlue DeDoDe homography benchmark 2025`
8. `SuperPoint LightGlue Jetson Orin Nano TensorRT FP16 INT8 ms per frame benchmark`
9. `UAV-VisLoc satellite aerial localization SP+LG XFeat LiteSAM RoMa benchmark accuracy meters`
**Q-R2-3 (EKF / ESKF)**:
1. `ESKF error state Kalman filter visual inertial navigation drone vs EKF 2025 advantages`
2. `ArduPilot EKF3 error state Kalman external visual odometry GPS_INPUT ODOMETRY fusion architecture`
3. `ArduPilot EKF3 vs PX4 EKF2 ESKF visual external odometry companion computer architecture`
4. `ArduPilot ODOMETRY GPS_INPUT companion external visual odometry double-fusion IMU EKF3 best practice`
**Q-R2-4 (ortho-tile generator)**:
1. `orthomosaic generation library python aerial drone OpenDroneMap MicMac OpenSfM real-time`
2. `single image orthorectification python library DEM gimbal pinhole homography UAV nadir camera`
3. `Orthority orthorectification python single image GeoTIFF DEM RPC frame camera benchmark`
4. `Orthority simple-ortho per-frame nadir UAV gimbal pitch roll yaw projection latency milliseconds`
**Q-R2-5 (sweep)**:
1. `Cloud Optimized GeoTIFF COG vs MBTiles tile cache embedded UAV onboard storage performance`
2. `ROS 2 Humble Jetson Orin Nano Super JetPack 6 MAVROS ArduPilot integration GPS-denied`
3. (Plus targeted re-checks of round-1 components: VPR backbones, MAVLink2 signing, free-threaded Python, SRTM 30 m DEM.)
## Completeness Audit
| Probe | Coverage |
|-------|----------|
| **Did we re-check every component the user named?** | ✅ VO (Q-R2-1), matcher (Q-R2-2), EKF (Q-R2-3), ortho (Q-R2-4). |
| **Did we sweep every other component for resource/confidence trade-offs?** | ✅ VPR (no new entrants — M-33), tile storage (MBTiles WAL stays — M-28), MAVLink (sysid + signing unchanged — M-31), software platform (CPython + ROS-2-vs-DIY surfaced as open Q — M-29, M-32), DEM (no change), camera (already locked). |
| **Did we surface contrarian failure modes per component?** | ✅ Custom-2-frame-VO is wrong (M-22); LiteSAM-on-Orin-Nano-Super is too slow inline (M-24); RoMa v2 / MASt3R-SLAM are GPU-class (M-25, S62); ArduPilot double-fusion is a bug, not a feature (M-26, M-30). |
| **Did we identify decisions that need user input vs decisions that are deterministic?** | ✅ ROS 2 vs DIY orchestrator (M-29) — needs user. Channel-split for hybrid (M-30) — recommendation Option A for v1, Option B v1.1+. |
| **Did we re-validate locked AC restrictions (camera, zoom, AC-NEW-7)?** | ✅ All lock-ins from round 1 carry forward unchanged. |
@@ -0,0 +1,223 @@
# Reasoning Chain — Mode B (Solution Assessment of `solution_draft01.md`)
For each Mode B finding (M-1..M-15 in `02_fact_cards.md`), trace the fact → comparison → conclusion path and pin the conclusion's confidence. Conclusions feed `solution_draft02.md`.
---
## M-1 — ODOMETRY vs GPS_INPUT (Component 6)
**Fact.** ArduPilot dev docs (S41) say "ODOMETRY (the preferred method)" for sending external-nav to EKF3. ODOMETRY: quaternion + velocity NED + 21-element pos+att covariance + quality 0..100. GPS_INPUT: lat/lon/alt + 3-D velocity + scalar `h_acc`/`v_acc` + `fix_type`. Both supported; both targetable from pymavlink.
**Reference comparison.** AC-4.3 originally states "Replacement for GPS module … via MAVLink GPS_INPUT, GPS1_TYPE=14". That's GPS-substitute framing, which suggests GPS_INPUT is the right channel. But AC-NEW-4 (false-position safety budget P[err>500m]<0.1%) requires the FC to act on **calibrated covariance** — and GPS_INPUT collapses our 6-DoF covariance into one scalar, which is information loss.
**Conclusion.** Hybrid output. Keep GPS_INPUT as the **primary "GPS-substitute" channel** (matches AC-4.3 framing, plays cleanly with FC operator workflows that expect a `GPS_RAW_INT`-shaped status). **Also emit ODOMETRY** when the EKF emits a fix with a full 6-DoF covariance and a non-trivial yaw observability — let the FC's EKF3 fuse the richer signal. Configure FC source priorities so GPS_INPUT is the failover in case ODOMETRY trips a parameter gate (VISO_QUAL_MIN). This is a *strict superset* of the draft's choice; the only cost is the extra MAVLink emit and the source-switching SITL test scope (M-11).
**Confidence.** ✅ High. Two L1 sources (S41 dev docs + S42 PR #19563), one L1 confirming the failure path is real (S43 PR #30080).
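The scalar collapse described above is visible directly in the field mapping. A sketch of packing our EKF fix into GPS_INPUT units; field names follow the MAVLink GPS_INPUT message definition, the `satellites_visible` constant is an assumption (the FC gates on `fix_type`), and the emit call is shown commented because it needs a live link:

```python
def to_gps_input_fields(lat_deg: float, lon_deg: float, alt_m: float,
                        sigma_xy_m: float, sigma_z_m: float,
                        fix_ok: bool) -> dict:
    """Collapse an EKF fix into GPS_INPUT units: lat/lon in degE7,
    6-DoF covariance reduced to scalar horiz/vert accuracy — the
    information loss that motivates also emitting ODOMETRY."""
    return {
        "lat": int(round(lat_deg * 1e7)),   # degE7 per the message definition
        "lon": int(round(lon_deg * 1e7)),
        "alt": alt_m,                       # metres
        "horiz_accuracy": sigma_xy_m,       # scalar h_acc
        "vert_accuracy": sigma_z_m,         # scalar v_acc
        "fix_type": 3 if fix_ok else 1,     # 3 = 3D fix, 1 = no fix
        "satellites_visible": 10,           # assumed constant
    }

fields = to_gps_input_fields(50.4501, 30.5234, 120.0, 4.2, 6.0, True)

# Hedged emit path (requires pymavlink and a live FC link; not run here):
# from pymavlink import mavutil
# conn = mavutil.mavlink_connection("/dev/ttyTHS1", baud=921600)
# conn.mav.gps_input_send(...)  # positional fields per the GPS_INPUT definition
```

Whatever our σ_xy, σ_z ellipse looks like, the FC only ever sees the two scalars — which is exactly why the hybrid adds ODOMETRY for the full covariance.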
---
## M-2 — MASt3R off the primary matcher list
**Fact.** mast3r-runtime Jetson support = "Planned" (S57). Speedy MASt3R = 91 ms / pair on A40 GPU.
**Reference comparison.** A40 ≈ 38 TFLOPS FP16 (datacenter-class GPU); Jetson Orin Nano Super 25 W ≈ 1.7 TFLOPS FP16 (~67 TOPS sparse INT8). Throughput ratio ~22× to 30× depending on operator-mix. 91 ms × 22 ≈ 2 s/pair; × 30 ≈ 2.7 s/pair. Even with INT8 quantisation closing the gap by ~2× (typical for ViT-class), MASt3R lands at >1 s/pair — outside the 400 ms p95 budget by a factor of ≥2.5×.
**Conclusion.** MASt3R drops from the "stretch candidate" row in the draft's bench-off table to a **research-track-only** label. Bench-off resources should focus on SP+LG / XFeat / GIM-LightGlue / RoMa-distilled.
**Confidence.** ✅ High. Numbers are conservative — MASt3R has additional overhead from the depth backbone that doesn't exist in pure 2D matchers.
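The M-2 extrapolation is a one-line throughput-ratio argument; making it a function keeps the bench-off spreadsheet honest when candidate GPUs change. A sketch, valid only under the assumption that the workload is compute-bound (so latency scales roughly linearly with FP16 throughput):

```python
def extrapolate_ms(ms_on_ref: float, ref_tflops: float, target_tflops: float,
                   quant_speedup: float = 1.0) -> float:
    """First-order latency extrapolation by FP16 throughput ratio,
    optionally credited with a quantisation speedup factor."""
    return ms_on_ref * (ref_tflops / target_tflops) / quant_speedup

# Figures from M-2: Speedy MASt3R 91 ms/pair on A40 (38 TFLOPS FP16),
# Orin Nano Super 25 W ~1.7 TFLOPS FP16, ~2x credit for INT8.
base = extrapolate_ms(91.0, 38.0, 1.7)
with_int8 = extrapolate_ms(91.0, 38.0, 1.7, quant_speedup=2.0)
print(round(base), round(with_int8))  # both far outside the 400 ms p95 budget
```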
---
## M-3 — Add GIM-LightGlue to the bench-off
**Fact.** GIM (S48): self-trained generalist matcher, 8.4–18.1 % zero-shot improvement over LightGlue/RoMa/DKM/LoFTR baselines. Pre-trained checkpoints public.
**Reference comparison.** Our domain (eastern-Ukraine 1 km AGL nadir vs. service satellite tiles) has *zero* training data publicly available; the bench-off therefore tests zero-shot transfer. GIM's training paradigm (50 h of internet videos covering every kind of scene including aerial) is precisely the regime that maximises zero-shot transfer.
**Conclusion.** Add **GIM-LightGlue** to the matcher bench-off shortlist as a peer of vanilla SP+LG. If the published 8–18 % zero-shot gain holds on AerialVL + Mavic, GIM-LightGlue dominates the cost/quality frontier (same TRT path as SP+LG, better accuracy out of the box).
**Confidence.** ✅ High. ICLR 2024 spotlight; benchmark numbers reproduced by independent users in the GitHub issue tracker.
---
## M-4 — VPR shortlist expansion: + SALAD + BoQ
**Fact.** SALAD (S47, CVPR 2024): DINOv2 + Sinkhorn optimal-transport VLAD; R@1 = 75 % on MSLS Challenge / 92.2 % MSLS Val / 76 % NordLand; in `aero-vloc`. BoQ (S46, CVPR 2024): bag of learnable queries, beats NetVLAD/MixVPR/EigenPlaces/Patch-NetVLAD/TransVPR/R2Former on 14 benchmarks; DINOv2 results Nov 2024.
**Reference comparison.** AnyLoc (draft primary) is unsupervised VLAD over DINOv2 features; SALAD is *trained* DINOv2-VLAD via Sinkhorn; BoQ is *learnable queries* over a backbone (DINOv2 or ViT). SALAD strictly beats AnyLoc on the same backbone in published benchmarks. BoQ beats both on standard VPR benchmarks; aerial-specific numbers TBD but well-positioned.
**Conclusion.** The bench-off table grows from {AnyLoc, MixVPR} to **{AnyLoc, SALAD, BoQ, MixVPR}**. AnyLoc remains the training-free fallback; SALAD and BoQ are likely primaries.
**Confidence.** ✅ High on M-4 (sources are CVPR 2024 papers + GitHub repos with published weights). Aerial-domain ranking is empirical — the bench-off resolves it.
---
## M-5 — Latency budget has more headroom than the draft assumed
**Fact.** Jetson AI Lab (S40): DINOv2-base-patch14 = 126 inf/s on Orin Nano Super → ~8 ms/inf at 224×224, FP16 trtexec.
**Reference comparison.** Draft estimated 50–80 ms / 224×224 for DINOv2 ViT-B (Component 2 row 1). Real number is **~6–10× better**. At 448×448 (more typical for AnyLoc descriptor extraction), expect ~32 ms/inf via near-quadratic scaling.
**Conclusion.** AC-4.1 (400 ms p95) is **comfortably feasible** with budget left over for SP+LG / GIM-LightGlue (target ~100 ms/pair) + EKF + MAVLink emit. R2 in the draft's risk table downgraded from High to Medium — empirical confirmation needed but no longer a make-or-break risk.
**Confidence.** ✅ High. NVIDIA L1 source.
---
## M-6 — mavlink-router CVE-class issue
**Fact.** S45: stack-based buffer overflow in mavlink-router config parsing, fuzzing-discovered, public, no SECURITY.md.
**Reference comparison.** mavlink-router is C++ daemon running with the same privileges as our companion process; if the config file is attacker-controlled (e.g., a tampered SD card on the airframe), this becomes RCE on the companion. Even if the config file is operator-controlled, a buggy config-file parser is one bug away from another related issue.
**Conclusion.** Three options, choose one:
1. **Pin a specific patched version + sandboxed systemd unit** (NoNewPrivileges, ReadOnlyPaths=/etc/mavlink-router/, MemoryDenyWriteExecute, RestrictAddressFamilies=AF_UNIX AF_INET).
2. **Replace with an in-process MAVLink endpoint multiplexer** (Python or Go, ~150 LOC) — eliminates the dependency entirely.
3. **Distinct system-IDs for MAVSDK + pymavlink** sharing the same serial port via ArduPilot's native MAVLink routing, no router daemon at all.
Option 3 is the simplest. Option 2 gives us the most control. Option 1 is the lowest-effort quick fix. Recommend **Option 3 for v1**, with Option 2 as v1.1 if MAVLink message volume saturates a single endpoint.
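To gauge the "~150 LOC" claim behind Option 2, here is the skeleton of the fan-out half: datagrams from the FC-facing socket are copied to every downstream endpoint. A real multiplexer also routes the return path by MAVLink sysid and handles serial framing; this sketch only shows that the core loop is small, and all names in it are hypothetical:

```python
import selectors
import socket

def make_mux(listen_port: int, downstream: list[tuple[str, int]]):
    """One-way UDP fan-out skeleton. listen_port=0 lets the OS pick a port
    (read it back via up.getsockname()). Returns the socket and a pump
    function that forwards pending datagrams and returns the send count."""
    sel = selectors.DefaultSelector()
    up = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    up.bind(("127.0.0.1", listen_port))
    up.setblocking(False)
    sel.register(up, selectors.EVENT_READ)

    def pump_once(timeout: float = 0.1) -> int:
        forwarded = 0
        for _key, _mask in sel.select(timeout):
            data, _src = up.recvfrom(65535)
            for addr in downstream:
                up.sendto(data, addr)  # copy the raw MAVLink frame verbatim
                forwarded += 1
        return forwarded

    return up, pump_once
```

Keeping the frame bytes opaque (no parse/re-serialise) preserves MAVLink2 signing end-to-end, which matters for M-7.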
**Confidence.** ✅ High that the issue is real; choice of mitigation is implementation preference.
---
## M-7 — MAVLink2 signing is v1-mandatory
**Fact.** S44: signing supported in ArduPilot 4.5+ on telemetry links; USB bypasses; keys in FRAM.
**Reference comparison.** Without signing, anyone with serial-line access (companion side OR an exposed telemetry radio) can inject a `GPS_INPUT` (or ODOMETRY) frame and crash the vehicle. Signing makes that injection require possession of the FRAM key. The cost is one operator key-provisioning step per airframe.
**Conclusion.** Promote signing from "Security note (deferred to a Phase-4 security pass)" to a **v1 hard configuration item**. Document the key-provisioning procedure in the deploy runbook. Verify signing-on at boot and refuse to inject GPS_INPUT/ODOMETRY if the signed-frame ack from the FC indicates signing-off.
**Confidence.** ✅ High.
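For the deploy-runbook procedure, the key-derivation half is tiny. To my understanding, MAVProxy derives the 32-byte MAVLink2 signing key as SHA-256 of an operator passphrase, and both ends must use the same derivation; treat that as an assumption to verify against the GCS in use. The wiring into pymavlink is shown commented because it needs a live FC link:

```python
import hashlib

def signing_key(passphrase: str) -> bytes:
    """Derive the 32-byte MAVLink2 signing key from a passphrase
    (SHA-256, matching the MAVProxy convention as far as we know)."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()

key = signing_key("example-passphrase")  # hypothetical passphrase

# Hedged sketch of enabling signing on the companion side (not run here):
# from pymavlink import mavutil
# conn = mavutil.mavlink_connection("/dev/ttyTHS1", baud=921600)
# conn.setup_signing(key, sign_outgoing=True)
```

The boot-time check from the conclusion then reduces to: refuse to emit GPS_INPUT/ODOMETRY unless incoming FC frames arrive signed.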
---
## M-8 — MBTiles operational recipe
**Fact.** S54: WAL + connection pool + transaction batching is the established recipe for MBTiles SQLite under concurrent reader+writer load. Default rollback journal mode causes `database is locked` failures.
**Reference comparison.** Our workload: many concurrent readers (matcher cache lookup at ≤3 fps × ~30 candidate tiles) + occasional writer (Component 1b ortho-tile write at ≤12 Hz × ~30 tiles). Without WAL, every writer commit blocks all readers. With WAL, readers and one writer proceed concurrently.
**Conclusion.** Update Component 1's "Tile format" row in the architecture table to specify: **MBTiles SQLite + WAL + connection pool + per-Component-1b-cycle transaction batching**. Add to AC-4.1 latency-budget validation: the tile-cache lookup must hit p95 ≤5 ms.
**Confidence.** ✅ High.
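The M-8 recipe fits in two functions with stdlib `sqlite3`; the pragma values below are the standard WAL recipe, with the 5 s busy timeout as an assumed starting point:

```python
import sqlite3

def open_mbtiles(path: str, readonly: bool = False) -> sqlite3.Connection:
    """MBTiles connection per the recipe: WAL so matcher readers never
    block behind the Component-1b writer, NORMAL sync (safe with WAL,
    fewer fsyncs), busy_timeout so a contended commit retries instead
    of raising 'database is locked'. Pool one connection per task."""
    uri = f"file:{path}?mode={'ro' if readonly else 'rwc'}"
    conn = sqlite3.connect(uri, uri=True, timeout=5.0)
    if not readonly:
        conn.execute("PRAGMA journal_mode=WAL")   # persists in the db file
        conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

def write_tile_batch(conn: sqlite3.Connection, tiles) -> None:
    """One Component-1b cycle (~30 tiles) as a single transaction:
    one WAL commit for the whole batch instead of ~30."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO tiles "
            "(zoom_level, tile_column, tile_row, tile_data) VALUES (?, ?, ?, ?)",
            tiles,
        )
```

The readonly reader connections skip the write pragmas entirely, which is what keeps the ≤5 ms p95 lookup target plausible.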
---
## M-9 — Cache-poisoning safety hazard
**Fact (analytical, not a single source).** Draft's dedup rule allows onboard tiles to overwrite stale service tiles when "our quality > existing". Quality = inlier count + sharpness; **does not include parent-pose covariance as a hard gate**. Combined with EKF over-confidence (a known failure mode — see W4.b), this lets a confidently-bad pose write a misaligned tile that becomes the next flight's anchor.
**Reference comparison.** Cartography literature consistently treats authoritative basemap as immutable and crowdsourced/UAV updates as voting input that requires consensus before promotion. SfM bundle-adjustment treats over-confident poses as the dominant error source.
**Conclusion.** Three layered mitigations:
1. Service-source tiles are **immutable within freshness budget**. Onboard tiles overwrite only stale or other-onboard tiles.
2. The Suite Service ingest applies a **voting layer**: an onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment within X m.
3. Parent-pose covariance is a **hard gate** in the local quality score: σ_xy must be tighter than the generation-eligibility gate (e.g., σ_xy ≤ 5 m vs. 10 m generation gate), and a tile written above the hard gate is marked "soft" in its sidecar.
Add **AC-NEW-7 — Cache-poisoning safety budget**: P(onboard tile mis-aligned > 30 m) per flight < 1 %; P(misaligned > 100 m) per flight < 0.1 %. Validation: replay AerialVL with synthetic over-confidence injection.
**Confidence.** ⚠️ Medium. Hazard is real and qualitatively well-known; specific numeric thresholds need empirical calibration during implementation.
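Mitigations 1 and 3 compose into a single write-time predicate; mitigation 2 (voting) lives Service-side. A sketch with the provisional 5 m threshold from the text, which still needs the empirical calibration noted above:

```python
def tile_write_allowed(source_is_service: bool, tile_age_days: float,
                       freshness_budget_days: float,
                       parent_sigma_xy_m: float,
                       hard_gate_sigma_m: float = 5.0):
    """Returns (allowed, soft). A written tile always starts 'soft' until
    the Service voting layer (N>=2 independent flights) promotes it."""
    if source_is_service and tile_age_days <= freshness_budget_days:
        return (False, False)  # mitigation 1: fresh service tiles are immutable
    if parent_sigma_xy_m > hard_gate_sigma_m:
        return (False, False)  # mitigation 3: covariance is a precondition, not a score term
    return (True, True)
```

Making σ_xy a precondition rather than a score term is the point: an over-confident pose can inflate a quality score, but it cannot argue its way past a hard gate it genuinely fails.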
---
## M-10 — Free-threaded Python 3.13 not v1-ready
**Fact.** S55: experimental, single-threaded perf hit, GIL re-enables on non-FT-aware C extension import.
**Reference comparison.** Our hot-path includes: numba JIT kernels, TensorRT Python bindings, pymavlink (C extension), numpy/scipy, possibly cv2. Any one of these silently re-enabling the GIL nullifies the benefit. And the non-trivial single-threaded penalty (~10–15 % per various benchmarks) directly hits AC-NEW-1 (cold-start TTFF <30 s).
**Conclusion.** v1 stays on **standard CPython 3.11 or 3.12** (newest stable, well-supported by JetPack / numba / TRT). Sharpen the rationale in the architecture: the choice is not "GIL is fine" but "asyncio + TRT subprocess workers + numba JIT is the production-ready combination today; revisit free-threading in v1.1."
**Confidence.** ✅ High.
---
## M-11 — ODOMETRY known production gotchas → SITL coverage required
**Fact.** S41/S42/S43: companion-derived velocity errors, position-estimate resets when external-nav reference loss, source-switching conflicts when running alongside GPS.
**Reference comparison.** AC-NEW-2 (3 s spoofing-promotion latency) **is** the source-switching path. Whatever output channel we pick (GPS_INPUT, ODOMETRY, or hybrid), the source switch is the high-risk transition.
**Conclusion.** Add an explicit testing requirement: **F-T9 (SITL: full MAVLink loop)** must include source-switching scenarios (jam onset → our channel → spoofed real-GPS recovery → operator-confirmed source restore). Include the `EK3_SRC1_*` parameter combinations being benchmarked in the test plan.
**Confidence.** ✅ High.
---
## M-12 — Eastern-Ukraine relief amplitude affects flat-Earth assumption
**Fact.** S56: ~24 m peak-to-trough relief in Kharkiv-region UAV survey areas, with creek/gully systems.
**Reference comparison.** At 1 km AGL with 35° HFOV camera, a 24 m elevation offset at frame edge → ~17 m horizontal misalignment when ortho-projected on flat-Earth. AC-1.1 budget = 50 m@80 % (comfortable); AC-1.2 = 20 m@50 % (tight).
**Conclusion.** Add a **per-sector DEM lookup** to the pre-flight tile-sync pass. Classify sectors:
- **flat** (≤5 m amplitude) — full ortho-tile generation, full anchor weight.
- **moderate** (5–15 m) — ortho-tile generation, anchor weight × 0.7.
- **rugged** (>15 m) — skip ortho-tile generation, anchor weight × 0.3 with explicit "rugged-sector" flag in confidence telemetry.
This is a small one-time pre-flight step (SRTM 30 m DEM is free, ~15 GB global, ~30 MB for 400 km²).
**Confidence.** ⚠️ Medium. Single regional sample; refine numbers when more terrain data lands.
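The sector policy above is deterministic and tiny; encoding it once keeps the pre-flight pass and the in-flight confidence telemetry in agreement. The 5 m / 15 m band edges are the provisional values from the text:

```python
def classify_sector(relief_amplitude_m: float):
    """M-12 sector policy -> (class, anchor_weight, generate_ortho_tiles).
    relief_amplitude_m is the peak-to-trough relief over the sector,
    from the pre-flight SRTM 30 m DEM lookup."""
    if relief_amplitude_m <= 5.0:
        return ("flat", 1.0, True)
    if relief_amplitude_m <= 15.0:
        return ("moderate", 0.7, True)
    return ("rugged", 0.3, False)

# Kharkiv-region sample from S56 (~24 m amplitude) lands in "rugged":
print(classify_sector(24.0))
```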
---
## M-13 — TartanAir V2 reconsideration (open question)
**Fact.** S51: photo-realistic synthetic, native IMU + 12-cam + season variation + custom camera models.
**Reference comparison.** User's last-message reasoning was "Mavic-class dynamics ≠ fixed-wing dynamics → synthetic IMU is unlikely to produce a useful signal". TartanAir V2 lets us configure motion patterns, so the dynamics-mismatch argument is weaker than for MidAir-class quadcopter-only sims.
**Conclusion.** **Open question for the user**: include TartanAir V2 in the bench-off as an early-stage synthetic baseline (good for sweeping seasons / lighting / pitches), or hold to "real-data-only purism" with AerialVL + Mavic + planned-fixed-wing-flights as the only V&V?
**Confidence.** ⚠️ Medium. Technical viability is high; the call is product-side.
---
## M-14 — Add AerialExtreMatch + 2chADCNN to V&V plan
**Fact.** AerialExtreMatch (S49) — 1.5 M synthetic image pairs, 32 difficulty levels (overlap × scale × pitch), real-world UAV localization subset. 2chADCNN (S50) — season-aware UAV↔satellite template-matching.
**Reference comparison.** Draft's bench-off targets are AerialVL + UAV-VisLoc + internal Mavic. None of those grade against extreme-pitch / extreme-scale / extreme-overlap separately. Without a benchmark that crosses these axes, the bench-off can pick a winner that fails silently in cornered conditions.
**Conclusion.** Add to the V&V plan:
- **AerialExtreMatch** as a primary structured-difficulty regression bench.
- **2chADCNN** as a season-aware baseline either (a) included in the bench-off, or (b) used as an explicit season-robustness ceiling reference.
**Confidence.** ✅ High.
---
## M-15 — Real fixed-wing VO is harder than draft implies
**Fact.** S52 (AFIT thesis): SVO/DSO/ORB-SLAM2 all "had significant difficulty maintaining localisation" on real fixed-wing flights. S53: high-altitude (300–1000 m AGL) VIO drift in the same band as our AC-1.3.
**Reference comparison.** Draft's choice ("custom 2-frame homography VO via Component-3 matcher") is correct framing — VO between satellite anchors is a **much easier** problem than standalone metric SLAM. But AC-1.3's drift budget (<100 m without IMU, <50 m with IMU between two satellite-anchored fixes) requires empirical confirmation against a real fixed-wing baseline.
**Conclusion.** Add to risks: **R8 — fixed-wing VO drift under our AC-1.3 budget is unconfirmed**. Mitigations:
1. Borrow AerialVL's fixed-wing trajectories (70 km of real fixed-wing flight) for AC-1.3 regression in `F-T1b` (new).
2. Plan the first internal fixed-wing flight before AC lock — not as a stretch goal.
**Confidence.** ✅ High.
---
## Summary table
| Finding | Severity | Affects | Resolution |
|---|---|---|---|
| M-1 | High | C-6, AC-4.3, AC-NEW-4 | Hybrid GPS_INPUT + ODOMETRY |
| M-2 | High | C-3 bench-off | Drop MASt3R from primary list |
| M-3 | Med | C-3 bench-off | Add GIM-LightGlue |
| M-4 | High | C-2 bench-off | Add SALAD + BoQ |
| M-5 | High (positive) | AC-4.1 | Downgrade R2 risk |
| M-6 | High (security) | C-6 | Replace mavlink-router OR sandbox & pin |
| M-7 | High (security) | C-6 | MAVLink2 signing v1-mandatory |
| M-8 | Med | C-1 | MBTiles WAL + pool + batching |
| M-9 | High (safety) | C-1b, AC-NEW | New AC-NEW-7 + dedup-rule changes |
| M-10 | Med | C-9 | Stay on CPython 3.11/3.12; sharpen rationale |
| M-11 | Med | C-5/C-6, AC-NEW-2 | Add SITL source-switching tests |
| M-12 | Med | C-1b, AC-1.2 | Per-sector DEM lookup + anchor weight |
| M-13 | Open question | datasets | Surface to user |
| M-14 | Med | V&V plan | Add AerialExtreMatch + 2chADCNN |
| M-15 | Med | C-4, AC-1.3 | Risk R8 + AerialVL F-T1b |
@@ -0,0 +1,151 @@
# Mode B Round 2 — Reasoning Chain
For each user-named component (Q-R2-1 … Q-R2-4) plus the sweep (Q-R2-5), the reasoning chain follows the **fact-confirm → reference-compare → conclude → confidence** pattern.
---
## Dimension 1: Visual Odometry (Component 4)
### Fact confirmation
- Draft02 C-4: "custom 2-frame VO via SuperPoint+LightGlue homography". (M-22)
- AFIT thesis (S52): SVO / DSO / ORB-SLAM2 all "had significant difficulty maintaining localisation" on real fixed-wing flights.
- High-altitude VIO field test (S72, MDPI Drones 2023): stereo-VIO = 2.186 m / 800 m at 40–100 m AGL; monocular-VIO "acceptable but worse". At 1 km AGL motion parallax shrinks ~10–25× per frame vs 100 m AGL, further degrading monocular-VO accuracy.
- cuVSLAM (S60, NVIDIA, Jul 2025): CUDA-accelerated, designed for Jetson, monocular + monocular-IMU + stereo modes, **<1 % ATE on KITTI / <5 cm on EuRoC**. Apache-2.0. Drop-in via `isaac_ros_visual_slam` (S64).
- DPVO / DPVO-QAT++ (S61, S73): SOTA deep VO. Original DPVO 2–5× real-time on RTX-3090 (4 GB GPU memory); DPVO-QAT++ benchmarked on RTX-4060 only (+52 % FPS TartanAir, +30 % FPS EuRoC). Orin Nano Super extrapolation: ~4–10 FPS plain DPVO, ~6–15 FPS DPVO-QAT++. Borderline for 10 Hz.
- MASt3R-SLAM (S62, CVPR 2025): 15 FPS on a single GPU; sub-1 Hz extrapolated on Orin Nano Super → infeasible inline.
- VINS-Fusion / OpenVINS / BASALT / SVO Pro on Orin Nano (S71): all build with non-trivial integration cost (OpenCV pinning, ROS plumbing, IMU-time-sync); none CUDA-accelerated; no accuracy advantage over cuVSLAM.
### Reference comparison
| VO option | Maturity on Orin Nano Super | Accuracy benchmark | Memory | Integration cost | Notes |
|-----------|----------------------------|--------------------|--------|-----------------|-------|
| **cuVSLAM (mono)** | ✅ NVIDIA-supported, reference designs exist | <1 % ATE KITTI / <5 cm EuRoC | <2 GB | ROS 2 wrapper (1 day) | M-22, S60, S64 |
| **DPVO / DPV-SLAM** | ⚠️ no Jetson port; ~6–15 FPS extrapolated | SOTA on TartanAir / EuRoC | 5–7 GB GPU | manual port + QAT | M-23, S61, S73 |
| **MASt3R-SLAM** | ❌ infeasible | best on EuRoC + 7-Scenes | 24 GB GPU class | research-track only | M-23, S62 |
| **VINS-Fusion** | ⚠️ ~15 FPS on Xavier NX after pinning | ~3–11 cm path err on EuRoC | ~1 GB | manual integration + memory tuning | S71 |
| **OpenVINS** | ⚠️ builds on Orin Nano w/ JetPack 6 | comparable to VINS-Fusion | ~1 GB | manual integration + ROS 2 plumbing | S71 |
| **BASALT / SVO Pro** | ⚠️ stereo-first; mono available | mid-tier | low | high integration cost | S71 |
| **Custom 2-frame homography VO (draft02)** | n/a | drift unbounded; AFIT thesis prediction: poor | low | "easy" but wrong design | M-22 |
### Conclusion
- **Replace draft02's custom 2-frame VO with cuVSLAM in monocular + IMU mode** (revised C-4).
- Defer DPVO / MASt3R-SLAM / VINS-Fusion / OpenVINS to a research-track bench-off only after cuVSLAM has empirical numbers on a fixed-wing 1 km AGL trajectory; if cuVSLAM underperforms, those are the fall-back candidates.
- IMU source for cuVSLAM: subscribe to MAVLink `RAW_IMU` / `SCALED_IMU` from FC at ~200–400 Hz (path (a) of M-35). If sync jitter is too high, add a dedicated companion IMU in v1.1.
### Confidence
- ✅ High on "custom 2-frame VO is wrong"; ⚠️ Medium on "cuVSLAM is the right replacement" — high-altitude fixed-wing performance unproven on cuVSLAM's published benches. **Bench-off in F-T1b (revised) mandatory before AC-1.3 lock.**
---
## Dimension 2: Cross-view Matcher (Component 3)
### Fact confirmation
- LiteSAM (S58, MDPI Oct 2025): purpose-built satellite↔aerial. 6.31 M params (2.4× smaller than EfficientLoFTR's 15.05 M). RMSE@30 = 17.86 m on UAV-VisLoc. **497.49 ms / pair on Jetson AGX Orin** FP16-optimized.
- AGX Orin INT8 throughput ≈ 275 TOPS, Orin Nano Super ≈ 67 TOPS → 4× scaling factor. **LiteSAM on Orin Nano Super ≈ 1500–2000 ms / pair.** (M-24)
- RoMa v2 (S63, Nov 2025): SOTA dense, frozen DINOv3 + custom CUDA + predictive covariance. GPU-class footprint. (M-25)
- MapGlue / MATCHA (search results): cross-modal SOTA; no Jetson deployment data. (M-25)
- SuperPoint + LightGlue (TRT FP16): RTX 3080 = 0.95 ms (SP) + 2.54 ms (LG) at 320×240. Scaling to Orin Nano Super FP16: ~50 ms / pair at 320×240; ~200 ms / pair at 640×480. (search summary, S76 sanity check)
- XFeat (S08 round 1, search summary): 5× faster than other deep matchers; CPU-viable; TRT path on Jetson exists.
- Our budget (AC-4.1): 400 ms p95 end-to-end pipeline on Orin Nano Super 25 W → matcher must consume ≤150–200 ms / pair to leave headroom for VPR + ortho + EKF + I/O.
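Two pieces of arithmetic behind these bullets, written out so the bench-off can re-check them when real numbers land. The per-stage split below is an assumed allocation for illustration, not a measured profile:

```python
# LiteSAM scaling: AGX Orin measurement x (AGX TOPS / Orin Nano Super TOPS).
litesam_nano_ms = 497.49 * (275 / 67)
print(round(litesam_nano_ms))  # -> 2042

# AC-4.1 p95 budget check with an assumed stage split (illustrative figures).
budget_ms = 400
p95_est = {
    "vpr_top_k": 40,        # VPR lookup share
    "matcher_sp_lg": 200,   # upper end of the SP+LG 640x480 estimate
    "pnp_plus_gate": 30,
    "ortho_tile": 50,       # upper end of the C-1b allotment
    "ekf_io_misc": 40,
}
headroom_ms = budget_ms - sum(p95_est.values())
print(headroom_ms)  # -> 40
assert headroom_ms >= 0, "matcher must shrink (320x240 or INT8) to fit AC-4.1"
```

The 2042 ms figure lands at the top of the quoted band; either way LiteSAM misses the inline matcher slot by roughly 10×, which is what forces it into the non-inline roles below.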
### Reference comparison
| Matcher | Inline-feasible on Orin Nano Super @ 25 W? | Accuracy on satellite↔aerial | Specialization for cross-view? | Role |
|---------|--------------------------------------------|------------------------------|--------------------------------|------|
| **SP + LG (TRT FP16)** | ✅ ~50–200 ms / pair | strong on UAV-VisLoc / AerialVL | generic | **inline lead** |
| **GIM-LightGlue** | ✅ same path as SP+LG | +8.4–18.1 % zero-shot vs LG (S48) | generic, internet-trained | **inline peer / bench-off** |
| **XFeat (sparse + semi-dense)** | ✅ very fast | weaker than SP+LG on cross-view | embedded-class | **degraded-power fallback** |
| **LiteSAM** | ❌ ~1500–2000 ms / pair | RMSE@30 = 17.86 m UAV-VisLoc, **best published satellite↔aerial** | yes (purpose-built) | **re-loc fallback / oracle / distillation teacher** |
| **GIM-RoMa / RoMa v2** | ❌ GPU-class | best dense matcher published | generic | **offline ceiling reference** |
| **MASt3R / MASt3R-SLAM** | ❌ infeasible | very high | dense reconstruction | research-track only |
| **MapGlue / MATCHA** | ❌ no Jetson data | strong on cross-modal | yes | research-track only |
| **2chADCNN** | ❌ wrong output type (template-overlap) | season-aware | yes | season-robustness ceiling reference |
| **Classical SIFT/ORB/AKAZE** | ✅ very fast | poor cross-view (F-A5) | no | last-resort degraded mode |
### Conclusion
- **SP+LG (TRT FP16/INT8) remains the inline matcher.** GIM-LightGlue is its peer in the bench-off.
- **LiteSAM joins the design in three non-inline roles**: re-loc fallback (cold start, σ_xy > 50 m, 1.5–2 s budget acceptable); validation oracle (offline regression bench); distillation teacher (train a smaller satellite-aerial-specialized student that fits the inline budget).
- **RoMa v2 + MASt3R + MapGlue + MATCHA** added to the matcher bench-off as **offline ceiling references only** so we know how much accuracy we trade by using SP+LG inline.
- **Bench-off scope (revised)** for the deferred research item: SP+LG, GIM-LightGlue, XFeat (sparse + semi-dense), LiteSAM (re-loc role), RoMa v2 (ceiling), MASt3R-SLAM (ceiling). Score on UAV-VisLoc + AerialVL + AerialExtreMatch + 2chADCNN-season-set + internal Mavic + first fixed-wing flight.
### Confidence
- ✅ High on roles + decisions; ⚠️ Medium on the AGX-Orin → Orin Nano Super 4× scaling for LiteSAM — bench-off should confirm.
---
## Dimension 3: EKF vs ESKF (Component 5)
### Fact confirmation
- ArduPilot EKF3 = classical extended Kalman filter, 24-state, runs at 400 Hz. **Not** an ESKF. (S65, S66, S67 + draft02 M-1)
- PX4 EKF2 is an ESKF (S68). We are not on PX4.
- Companion-side filter advantages of ESKF over EKF (S68, S69, S70): better quaternion/orientation handling via tangent-space covariance (Lie group), ~0.3 % CPU saved, better numerical conditioning, simpler equations.
- ArduPilot ExtNav best practice (S65, S66, S67): **only one position source per axis at a time**. ArduPilot has open / recently-closed bugs (#30076, #32506) when ExtNav and GPS are both fed with overlapping responsibilities.
- Round 1 M-1 said "emit BOTH GPS_INPUT AND ODOMETRY in parallel" — without specifying axis-level responsibility split. This is the bug condition documented by S65/S66.
### Reference comparison
| Filter location | Filter family | Worth it for v1? | Why |
|-----------------|---------------|------------------|-----|
| **FC (ArduPilot)** | EKF3 (regular EKF) | n/a — locked | Cannot swap; not user-controlled. |
| **Companion (draft02 plan)** | "loosely-coupled EKF" | ❌ remove for v1 | Causes double-fusion against the FC's own EKF3, observability mismatches, the M-1-class bug pattern. |
| **Companion (M-26 revised plan)** | None — replaced with **covariance calibrator + Mahalanobis outlier gate + source-label producer** | ✅ for v1 | Simpler; lets ArduPilot's EKF3 do the actual fusion. |
| **Companion (v1.1, if needed)** | Vanilla ESKF (S69) | ⚠️ optional | Only if F-T9 SITL shows EKF3 cannot handle our raw inputs. ESKF is the correct family — but smoothing visual fixes ahead of EKF3 is rarely the right answer; usually better to fix covariance estimation upstream. |
### Conclusion
- **Drop the companion-side EKF for v1.** Component 5 becomes a "covariance calibrator + outlier gate + source-label producer" — *no state propagation, no IMU integration on the companion*.
- **Hybrid GPS_INPUT + ODOMETRY revised (M-30)**:
- **Option A (v1 default)**: GPS_INPUT carries position + velocity + h_acc/v_acc. ODOMETRY is **disabled** for v1. ArduPilot configured `EK3_SRC1_*=GPS+Compass`. Failover to backup via `EK3_SRC2_*`.
- **Option B (v1.1+)**: ODOMETRY carries pose+velocity+yaw + 21-element covariance; GPS_INPUT held in reserve, not fused while ODOMETRY healthy. `EK3_SRC1_POSXY=ExternalNav`, `EK3_SRC2_POSXY=GPS`. Requires PR #30080-class fixes.
- **ESKF is the right family if and only if we re-introduce a companion-side filter later.** For v1 the question is moot.
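Option A's output can be made concrete. A sketch of the field mapping, assuming pymavlink-style field names; the helper and its signature are hypothetical, and a real emitter must also fill `time_usec`, `gps_id`, and the `ignore_flags` bitmask before calling `mav.gps_input_send(...)`:

```python
def gps_input_fields(lat_deg, lon_deg, alt_m, v_ne, sigma_xy_m, sigma_z_m, healthy):
    """Map a calibrated visual fix onto GPS_INPUT for Option A.
    Hypothetical helper: position + velocity + scalar accuracies only --
    GPS_INPUT cannot carry a full covariance, which is exactly the
    information Option B's ODOMETRY path would recover in v1.1+."""
    return {
        "fix_type": 3 if healthy else 1,   # 3D fix, or no-fix when gated out
        "lat": int(lat_deg * 1e7),         # degE7 per the MAVLink convention
        "lon": int(lon_deg * 1e7),
        "alt": alt_m,                      # metres
        "vn": v_ne[0],
        "ve": v_ne[1],
        "horiz_accuracy": sigma_xy_m,      # from the covariance calibrator
        "vert_accuracy": sigma_z_m,
        "satellites_visible": 10 if healthy else 0,  # plausible constellation
    }

fix = gps_input_fields(48.5, 37.9, 180.0, (21.0, -3.5), 4.2, 6.0, healthy=True)
print(fix["lat"], fix["fix_type"])  # -> 485000000 3
```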
### Confidence
- ✅ High on dropping companion-side filter for v1; ⚠️ Medium on whether v1.x will need it back — depends on F-T9 SITL evidence.
---
## Dimension 4: Ortho-Tile Generator (Component 1b)
### Fact confirmation
- Orthority (S59) is a maintained Python library: frame + RPC camera models, GeoTIFF DEM lookup, RPC refinement, pan-sharpening. CLI + API. Pip / conda installable.
- ODM is a post-processing batch SfM pipeline; recommended 128 GB RAM for 2500 images. Wrong tier.
- simple-ortho (S59 predecessor) is older and superseded by Orthority.
- Draft02 plan (C-1b): "Pinhole projection on per-sector DEM" — implicit hand-rolled implementation.
### Reference comparison
| Approach | Build cost | Per-frame latency | Maintenance | Risk |
|----------|-----------|-------------------|-------------|------|
| **Orthority** (S59) | ~1 day integration | unknown — must measure on Orin Nano Super | externalized | depends on Orin Nano latency check |
| **Hand-rolled `cv2.warpPerspective` + bilinear DEM lookup** | ~2–3 days | ~5–20 ms estimated | internal | reinvents distortion + DEM + gimbal handling |
| **ODM / OpenSfM** | weeks | seconds-to-minutes (batch) | ext. | wrong tier |
| **MicMac** | weeks | seconds-to-minutes (batch) | ext. | wrong tier |
### Conclusion
- **Use Orthority for per-frame ortho.** Falls back to hand-rolled `cv2.warpPerspective` + bilinear DEM lookup if F-T14 measurements show Orthority's per-frame latency on Orin Nano Super exceeds the budget (estimate ≤30–50 ms allotted to ortho).
- Reuses Orthority's distortion + RPC + DEM machinery instead of reinventing it.
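The fallback's core is small enough to sketch. Assuming a single DEM height per sector (the bilinear per-pixel lookup is what the full version adds), the ground-plane-to-pixel map is a plain pinhole homography; all numbers below are illustrative:

```python
import numpy as np

def ground_to_image_homography(K, R, t, ground_z=0.0):
    """Pixel <- ground-plane homography for a flat plane at height ground_z.
    One DEM value per sector is this fallback's approximation; the full
    version replaces it with a bilinear per-pixel DEM lookup."""
    # For points on a plane, the 3x4 projection collapses to a 3x3 homography:
    cols = np.column_stack([R[:, 0], R[:, 1], R[:, 2] * ground_z + t])
    return K @ cols

# Nadir camera 1000 m above the sector (x_cam = E, y_cam = -N, z_cam = -Up).
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 512.0], [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])
C = np.array([0.0, 0.0, 1000.0])        # camera centre in local ENU, metres
t = -R @ C
H = ground_to_image_homography(K, R, t)

p = H @ np.array([100.0, 0.0, 1.0])     # a ground point 100 m east of nadir
u, v = p[0] / p[2], p[1] / p[2]
print(u, v)  # -> 740.0 512.0
# The warp itself would then be cv2.warpPerspective(frame, np.linalg.inv(H), tile_px).
```

The 100 m-east point landing 100 px right of the principal point (f/Z = 1000/1000 = 1 px/m) is the sanity check; the ~5–20 ms estimate is the warp plus lookup, not this matrix algebra.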
### Confidence
- ✅ High on "use a library, not DIY"; ⚠️ Medium on "Orthority specifically" pending the latency measurement.
---
## Dimension 5: Component sweep (Q-R2-5)
### Fact confirmation + comparison + conclusion (compact)
| Component | Round 1 choice | Round 2 finding | Conclusion |
|-----------|----------------|-----------------|------------|
| **C-1 Tile cache** | MBTiles + WAL + connection pool + transaction batching | COG metadata-load issue (500 MB on 7 GB file) defeats selective access on UAV bandwidth; PMTiles strong only for HTTP serving — local microSD use loses SQLite-WAL concurrency. | **Unchanged** (M-28). |
| **C-2 VPR** | AnyLoc + SALAD + BoQ + MixVPR shortlist | No new VPR backbone displaces SALAD/BoQ on aerial cross-domain in 2025. 2025 SOTA is matcher-side (RoMa v2, LiteSAM, MASt3R-SLAM). | **Unchanged** (M-33). |
| **C-3 Cross-view matcher** | SP+LG lead, GIM-LG peer, MASt3R dropped | LiteSAM added as re-loc/oracle/teacher (M-24); RoMa v2 + MapGlue + MATCHA added as offline ceilings (M-25). | **Revised** — see Dimension 2. |
| **C-4 VO** | Custom 2-frame homography VO via SP+LG/GIM-LG | Custom VO is wrong design (M-22); cuVSLAM is the v1 candidate (M-23). | **Revised** — see Dimension 1. |
| **C-5 Fusion** | Companion-side loose-coupled EKF emitting GPS_INPUT + ODOMETRY in parallel | Companion-side EKF should be dropped (M-26); hybrid output revised to single-channel-per-axis (M-30); ESKF only if v1.1 evidence demands it. | **Revised** — see Dimension 3. |
| **C-6 MAVLink** | distinct sysid + native routing + signing | Survives unchanged; sysid collision-check added to deploy runbook (M-31). | **Unchanged** (M-31). |
| **C-7 Failsafe** | Unchanged from Mode A | No new findings. | **Unchanged.** |
| **C-8 Object localization** | trig + airframe-attitude fusion | No new findings. | **Unchanged.** |
| **C-9 Software platform** | CPython 3.11/3.12 + asyncio + TRT subprocess workers | ROS 2 Humble + Isaac ROS 3.2 is a proven reference for the same hardware (M-29); becomes the most-likely v1 path **if** we adopt cuVSLAM (M-23). Decision needs user input. | **OPEN QUESTION** — see Validation Log. |
| **C-10 FDR** | Unchanged + sector + trust_level fields | No new findings. | **Unchanged.** |
| **C-11 Confidence score** | Composite + per-channel emission | No new findings. | **Unchanged.** |
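For the unchanged C-1 row, the WAL + transaction-batching choice amounts to a few lines of setup. A minimal sketch against the MBTiles `tiles` schema (connection pool and `metadata` table omitted; schema condensed from the MBTiles spec):

```python
import sqlite3

def open_mbtiles(path):
    """Open an MBTiles store as the C-1 row assumes: WAL journal for a
    concurrent reader + writer, with the caller batching writes per
    transaction."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # no-op for :memory:, shown for real files
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tiles (
               zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER,
               tile_data BLOB,
               PRIMARY KEY (zoom_level, tile_column, tile_row))"""
    )
    return conn

conn = open_mbtiles(":memory:")  # illustrative; on-target this is a microSD path
with conn:                       # one transaction per batch of tile writes
    conn.execute("INSERT INTO tiles VALUES (?, ?, ?, ?)",
                 (15, 19788, 11223, b"\x89PNG"))
row = conn.execute(
    "SELECT tile_data FROM tiles WHERE zoom_level=? AND tile_column=? AND tile_row=?",
    (15, 19788, 11223),
).fetchone()
print(row[0])  # -> b'\x89PNG'
```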
### Confidence
- ✅ High on every "Unchanged" row; ⚠️ Medium on M-29 (ROS 2 vs DIY) — pending user decision.
---
# Validation Log — Mode B
## Validation scenario
A typical 8-hour fixed-wing mission in eastern Ukraine, mid-summer, sunny. The UAV climbs to 1 km AGL on the way to the sector, transits ~50 km of corridor, performs ~1.5 h of dense coverage (sector-pattern), and returns. Mid-flight, the operator-side EW threat indicator reports a GPS-spoofing event. At minute 45 the companion computer browns out and reboots; at minute 90 the UAV passes over a 25-m-deep gully system; at minute 180 a sharp turn on weather avoidance reduces frame overlap to <5 % for two consecutive frames; at minute 310 the bench network drops out before tile upload finishes; at minute 470 the UAV lands.
For each of these waypoints, walk through what the system produces using the **Mode B-revised draft** vs. the **Mode A draft**.
---
## Expected behaviour by waypoint
### Cruise (steady state)
- **Mode A** — emit GPS_INPUT only; covariance collapsed to scalar `h_acc`. EKF in companion does the fusion.
- **Mode B (revised)** — emit GPS_INPUT (primary, GPS-substitute framing) **and** ODOMETRY (when full 6-DoF covariance is available; quality > VISO_QUAL_MIN). FC's EKF3 has access to richer signal; companion EKF is still the source of truth for source-label assignment.
- **Counterexample check** — what if ODOMETRY's covariance is wrong? VISO_QUAL_MIN gates it on the FC; GPS_INPUT path stays valid as failover. Net: no regression vs. Mode A.
### Spoofing event (AC-NEW-2)
- **Mode A** — listen to `GPS_RAW_INT` / `EKF_STATUS_REPORT`; promote our GPS_INPUT to fix_type=3D in <3 s.
- **Mode B (revised)** — same, plus M-11 SITL coverage of the EK3_SRC1_* parameter switch path. The known-bug landscape (S43) is now a hard test gate, not a risk.
- **Counterexample check** — what if the source switch deadlocks because PR #30080's fix isn't in the ArduPilot version we ship? Mitigation: pin to the ArduPilot version that contains the merged PR; document in deploy runbook.
### Companion brown-out + reboot (AC-NEW-1, cold-start TTFF <30 s)
- **Mode A** — TRT engines build at install time; CUDA / TRT init <5 s; cold-fix via VPR + matcher within remaining budget.
- **Mode B (revised)** — same path, but the latency-budget headroom is much bigger than draft assumed (M-5: DINOv2 ViT-B = 8 ms/inf at 224×224 on Orin Nano Super). Cold TTFF target moves from "tight" to "comfortable".
- **Counterexample check** — what if the CPython 3.13 free-threading question pulls us into experimental territory? Mode B explicitly rejects free-threading for v1 (M-10), so JIT warmup is bounded by numba on CPython 3.11/3.12 (well-characterised).
### Rugged-terrain segment (M-12)
- **Mode A** — flat-Earth assumption applied uniformly; tile generation runs even over the gully; 17 m horizontal misalignment at frame edge becomes a "high-quality" tile that overwrites a stale service tile. **Cache-poisoning hazard** (M-9).
- **Mode B (revised)** — pre-flight DEM classifies this sector as "rugged" (>15 m amplitude); ortho-tile generation **skipped** in this sector; satellite anchor weight × 0.3 with rugged-sector flag in telemetry.
- **Counterexample check** — what if the DEM is wrong / out-of-date? SRTM 30 m DEM has known artefacts in gully systems. Mitigation: also use the runtime self-classification — if the matcher's RANSAC inlier ratio drops below threshold for K consecutive frames, auto-promote the sector to "rugged" for the rest of the flight.
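The runtime self-classification in that mitigation can be pinned down as a small hysteresis rule. A sketch with placeholder thresholds (`inlier_floor` and `k_consec` are not tuned values):

```python
def make_rugged_promoter(inlier_floor=0.25, k_consec=5):
    """Runtime fallback for bad DEM data: promote a sector to 'rugged' after
    k_consec consecutive frames whose RANSAC inlier ratio is under the floor.
    Promotion is sticky for the rest of the flight, matching the mitigation."""
    state = {"streak": 0, "rugged": False}

    def update(inlier_ratio):
        if state["rugged"]:
            return True
        state["streak"] = state["streak"] + 1 if inlier_ratio < inlier_floor else 0
        state["rugged"] = state["streak"] >= k_consec
        return state["rugged"]

    return update

promote = make_rugged_promoter()
flags = [promote(r) for r in [0.6, 0.2, 0.2, 0.2, 0.2, 0.2, 0.7]]
print(flags)  # -> [False, False, False, False, False, True, True]
```

Note the last frame stays rugged even after the inlier ratio recovers; un-promoting mid-flight would re-open the cache-poisoning hazard the rule exists to close.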
### Sharp turn (AC-3.2)
- **Mode A** — sharp turn frame fails VO (5 % overlap), satellite-based re-localization via VPR + matcher. ✓.
- **Mode B (revised)** — same, but the VPR pool now includes SALAD + BoQ + AnyLoc + MixVPR (M-4). Bench-off result determines runtime primary; AnyLoc remains the training-free fallback.
- **Counterexample check** — none introduced by Mode B.
### Tile upload network drop (post-flight)
- **Mode A** — diff-against-Service uploader; if the link drops, retry on next bench session.
- **Mode B (revised)** — same, plus the M-9 voting rule means upload failure delays "trusted basemap" promotion but doesn't break next mission's cache (the Service ingest layer holds onboard tiles in a "candidate" pool until 2nd-flight confirmation).
- **Counterexample check** — what if N=2 voting is too slow to react to fresh imagery? Set N=1 for sectors where the operator manually marks a tile-set as "trusted" (e.g., post-recon imagery).
### Landing + post-flight upload
- **Mode A** — uploader runs as one-shot; tiles + sidecars pushed to Service.
- **Mode B (revised)** — uploader pushes onboard tiles to a **candidate pool**, not directly to the basemap. Service ingest applies the M-9 voting layer.
- **Counterexample check** — does this slow down imagery freshness? Yes, by one mission for a given sector. AC-NEW-6 freshness budget already allows 6 months for active-conflict sectors and 12 months for stable rear sectors; one extra mission of latency is well inside that envelope.
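The candidate-pool rule plus the N=1 override condense to one predicate. Names below are illustrative, not the Service ingest API:

```python
def promote_tiles(candidate_counts, n_required=2, operator_trusted=frozenset()):
    """M-9 voting sketch: a candidate tile is promoted to the trusted basemap
    once it has been re-observed on n_required flights, or immediately when
    its tile-set carries the operator's 'trusted' mark (the N=1 override for
    post-recon imagery)."""
    return {
        tile_id
        for tile_id, flights_seen in candidate_counts.items()
        if tile_id in operator_trusted or flights_seen >= n_required
    }

counts = {
    "s12/15/19788/11223": 1,   # first sighting -- stays in the candidate pool
    "s12/15/19788/11224": 2,   # confirmed on a second flight -- promoted
    "s07/15/100/200": 1,       # operator-trusted tile-set -- promoted at N=1
}
promoted = sorted(promote_tiles(counts, operator_trusted={"s07/15/100/200"}))
print(promoted)  # -> ['s07/15/100/200', 's12/15/19788/11224']
```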
---
## Review checklist
- [x] Mode B conclusions consistent with fact cards M-1..M-15.
- [x] No important dimensions missed (W1–W13 cross-checked vs. weak-point findings; W3.b, W4.b, W4.d, W11, W12 are not blocking — flagged as residual research items in `solution_draft02.md` "Open Research").
- [x] No over-extrapolation (every conclusion traceable to ≥1 source S40+ or to an explicit analytical chain).
- [x] All conclusions actionable / verifiable (source-switching SITL test, AC-NEW-7 numeric budget, sector DEM table, etc.).
---
## Conclusions requiring user input
These items cannot be unilaterally resolved by Mode B and must be surfaced when handing the revised draft back to the user:
1. **M-13** — TartanAir V2 in the early-stage bench-off, yes/no?
2. **AC-NEW-7 numeric thresholds** — Mode B proposes P(misalignment > 30 m) < 1 % per flight; P(>100 m) < 0.1 %. Confirm or revise.
3. **M-6 mavlink-router decision** — three options (sandbox+pin / replace / no router with distinct system-IDs). Mode B recommends Option 3 for v1.
4. **M-1 hybrid output** — accept the GPS_INPUT + ODOMETRY hybrid, or stay GPS_INPUT-only?
These are the four residual user-facing open items for the Plan step.
---
# Mode B Round 2 — Validation Log
## Validation scenario
A nominal **30-minute fixed-wing sortie at 1 km AGL** over a 20×20 km eastern-Ukraine operational area. Mid-flight GPS jamming starts at t=10 min; persists 8 min; ends at t=18 min. One sharp turn at t=14 min (mid-jam). Companion is Jetson Orin Nano Super @ 25 W; FC is ArduPilot 4.5+ on Cube Orange; nav cam is ADTi 20MP APS-C @ 3 fps.
## Expected behaviour under draft03 (round-2 revisions)
| Phase | Expected behaviour | Why |
|-------|-------------------|-----|
| **t=0–10 min, GPS healthy** | cuVSLAM publishes pose to ROS 2; Component 5 calibrator passes the FC's real GPS through; companion's GPS_INPUT is held in reserve (not emitted). | Option A in M-30: GPS_INPUT only emitted when needed. |
| **t=10 min, jam onset** | Real-GPS quality drops; FC EKF3 starts rejecting noisy fixes. Companion detects jam via FC `GPS_RAW_INT` quality < threshold OR explicit operator command. Within 3 s (AC-NEW-2) starts emitting GPS_INPUT (`GPS1_TYPE=14`) with covariance from matcher + VO + cuVSLAM agreement. | Source-promotion logic in C-7; 3 s budget unchanged. |
| **t=10–14 min, steady cruise under jam** | Per-frame: cuVSLAM provides relative pose (drift-bounded by keyframe + bundle adjustment); SP+LG (TRT FP16) matches frame against top-K VPR chunks; PnP yields absolute fix; covariance calibrator + Mahalanobis gate filter outliers; GPS_INPUT emitted at ≥1 Hz. End-to-end p95 ≤400 ms. | M-23 (cuVSLAM bounded drift) + M-26 (no companion-side EKF) + Component 3 inline matcher unchanged. |
| **t=14 min, sharp turn** | VPR re-loc trigger fires; FAISS top-K=20 over chunk index; matcher attempts pose recovery on neighbour chunks. **If steady-state SP+LG fails on the post-turn frame**, LiteSAM re-loc fallback (M-24) invoked at ~1.5 s budget; one-shot pose recovery; cuVSLAM is reset to the recovered pose. | M-17 (conditional VPR) + M-24 (LiteSAM re-loc role). |
| **t=14–18 min, post-turn cruise** | Steady-state behaviour resumed. Per-sector ortho-tile generator (Component 1b) writes new tiles via Orthority for sectors where `parent_pose_sigma_xy ≤ 5 m` AND `terrain_class ∈ {flat, moderate}`. Service-tile immutability respected. | M-27 (Orthority) + M-9 (cache poisoning safety). |
| **t=18 min, jam ends** | FC sees real-GPS quality recover; companion stops emitting GPS_INPUT after operator-confirmed source-restore (>1 s confirmation latency, M-11/F-T9). | Bidirectional source-switch covered by F-T9 SITL. |
| **Post-flight** | Onboard tiles uploaded to Suite Service candidate pool; 2-flight voting promotes to trusted basemap. | Unchanged from round 1. |
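The "covariance calibrator + Mahalanobis gate" step in the cruise row is the only filtering left on the companion under M-26. A sketch of the gate with a hypothetical interface (7.81 is the chi-square 95 % bound for 3 DoF; the innovation would be the visual fix minus a short velocity-propagated prediction from the last accepted fix):

```python
import numpy as np

def mahalanobis_gate(innovation, S, chi2_95_3dof=7.81):
    """Accept a visual fix only if its innovation (metres, 3-vector) is
    plausible under the calibrated fix covariance S (m^2). No state
    propagation happens here -- ArduPilot's EKF3 does the actual fusion."""
    d2 = float(innovation @ np.linalg.solve(S, innovation))
    return d2 <= chi2_95_3dof, d2

S = np.diag([4.0, 4.0, 9.0])                       # sigma_xy = 2 m, sigma_z = 3 m
ok, d2 = mahalanobis_gate(np.array([1.0, -2.0, 1.5]), S)
bad, _ = mahalanobis_gate(np.array([30.0, 0.0, 0.0]), S)
print(ok, round(d2, 2), bad)  # -> True 1.5 False
```

Rejected fixes are simply not emitted as GPS_INPUT; the source-label producer records the rejection for the confidence score (C-11).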
## Validation against draft02 conclusions (counterexample search)
| Round 1 conclusion | Round 2 verdict | Reason |
|--------------------|-----------------|--------|
| **C-3 SP+LG lead, GIM-LG peer, MASt3R dropped** (M-2/M-3) | Survives. | LiteSAM added in non-inline roles (M-24); RoMa v2 added as ceiling reference (M-25). |
| **C-4 custom 2-frame VO via SP+LG/GIM-LG** | **REVISED** | Custom 2-frame homography VO is the wrong design for fixed-wing 1 km AGL flight (M-22, AFIT thesis S52). Replaced by cuVSLAM (M-23). |
| **C-5 loosely-coupled companion-side EKF emitting GPS_INPUT + ODOMETRY in parallel** (M-1) | **REVISED** | Companion-side EKF causes double-fusion against ArduPilot EKF3 (S65, S66). Replaced by covariance calibrator + outlier gate; no state propagation (M-26). Hybrid channel split changed: v1 emits GPS_INPUT only; ODOMETRY is v1.1+ work (M-30). |
| **C-1b custom pinhole projection on per-sector DEM** | **REVISED** | Use Orthority library instead of hand-rolled (M-27). Falls back to `cv2.warpPerspective + DEM bilinear` if F-T14 latency measurement fails. |
| **C-1 MBTiles + WAL + pool** (M-8) | Survives. | COG / PMTiles do not improve on this for our use case (M-28). |
| **C-2 VPR shortlist {AnyLoc, SALAD, BoQ, MixVPR}** (M-4) | Survives. | No new VPR backbone in 2025 (M-33). |
| **C-6 distinct sysid + native routing + signing** (M-6, M-7) | Survives. | (M-31) |
| **C-9 CPython 3.11/3.12 + asyncio + TRT** (M-10) | **OPEN QUESTION** | ROS 2 Humble + Isaac ROS 3.2 is the natural pair for cuVSLAM (M-29); decision pending user input. |
| **AC-NEW-7 cache-poisoning budget** (M-9) | Survives. | Orthority library swap doesn't change the safety budget. |
| **Camera ADTi 20MP APS-C, z=20 storage zoom** (M-20) | Survives. | (M-19, M-20 unchanged.) |
## Counterexamples
| Counterexample | Status |
|----------------|--------|
| **"What if cuVSLAM cannot operate at 1 km AGL because monocular parallax is too small?"** | Real risk — explicitly flagged in M-23. F-T1b (revised) bench-off MUST run cuVSLAM on AerialVL fixed-wing trajectories before AC-1.3 lock. Fall-back: re-introduce a custom-tracker VO that uses the matcher's inter-frame correspondences with bundle-adjustment + loop closure (i.e., a properly-scoped VO, not the 2-frame homography of draft02). |
| **"What if Orthority's per-frame latency on Orin Nano Super > 50 ms?"** | Documented in M-27. Fall-back: hand-rolled `cv2.warpPerspective + bilinear DEM` (~5–20 ms estimated). Decision deferred to F-T14 measurement. |
| **"What if dropping the companion-side EKF causes the FC's EKF3 to reject our covariances?"** | Documented in M-26. F-T9 SITL must verify; if EKF3 mishandles our raw inputs, re-introduce a vanilla ESKF (S69) as the smoothing layer. v1.1 work. |
| **"What if LiteSAM re-loc fallback (1.5–2 s) blows the AC-NEW-1 cold-start budget (30 s)?"** | 1.5–2 s << 30 s. Acceptable. (M-24) |
| **"What if ROS 2 + Isaac ROS overhead pushes us over the 400 ms p95 latency budget?"** | DDS overhead measured at ~25 % CPU (M-29). For our 8 GB shared-memory budget, the bigger risk is the deployment-image footprint (~200 MB extra). Latency impact at 3 fps inference is negligible. |
| **"What if the MAVLink-relayed IMU rate (~200–400 Hz) is insufficient for cuVSLAM's sync sensitivity?"** | Documented in M-35. Fall-back: dedicated companion IMU. v1.1 hardware revision if needed. |
## Review checklist
- [x] Draft conclusions consistent with round-2 fact cards (M-22 … M-35).
- [x] No important dimensions missed: VO, matcher, fusion filter, ortho, plus full sweep (VPR, tile storage, MAVLink, software platform, FDR, confidence score, camera).
- [x] No over-extrapolation: cuVSLAM 1-km-AGL performance + Orthority latency + LiteSAM 4× scaling are explicitly flagged as needing empirical confirmation in F-T1b / F-T14 / matcher bench-off.
- [x] Conclusions actionable / verifiable: every revised component has a concrete test (F-T1b, F-T14, F-T9 SITL, matcher bench-off scope).
## Conclusions requiring user input (carried into the next gate)
1. **ROS 2 Humble + Isaac ROS vs. DIY Python orchestrator** for Component 9 (M-29). Recommendation: **ROS 2** if the team has any ROS 2 experience; **DIY Python** if not (re-skilling cost > overhead). User decides.
2. **Companion IMU strategy** (M-35). Recommendation: try MAVLink `RAW_IMU` from FC (path a) for v1; add dedicated IMU only if path (a) fails F-T1b. User decides if a dedicated IMU is acceptable as a hardware addition.
## Conclusions NOT requiring user input (locked by evidence)
- VO: cuVSLAM (M-23) — locked (subject to F-T1b empirical confirmation).
- Matcher: SP+LG inline + LiteSAM in re-loc role (M-24) — locked.
- Fusion: drop companion-side EKF; covariance calibrator only (M-26) — locked.
- Ortho: Orthority (M-27) — locked (subject to F-T14 measurement; documented fallback).
- Hybrid channel split: Option A for v1 (M-30) — locked.
- All other components — unchanged from draft02.