# Reasoning Chain — Mode B (Solution Assessment of `solution_draft01.md`)

For each Mode B finding (M-1..M-15 in `02_fact_cards.md`), trace the fact → comparison → conclusion path and pin the conclusion's confidence. Conclusions feed `solution_draft02.md`.

---

## M-1 — ODOMETRY vs GPS_INPUT (Component 6)

**Fact.** ArduPilot dev docs (S41) say "ODOMETRY (the preferred method)" for sending external-nav to EKF3. ODOMETRY: quaternion + velocity NED + 21-element pos+att covariance + quality 0..100. GPS_INPUT: lat/lon/alt + 3-D velocity + scalar `h_acc`/`v_acc` + `fix_type`. Both supported; both targetable from pymavlink.

**Reference comparison.** AC-4.3 originally states "Replacement for GPS module … via MAVLink GPS_INPUT, GPS1_TYPE=14". That's GPS-substitute framing, which suggests GPS_INPUT is the right channel. But AC-NEW-4 (false-position safety budget P[err>500m]<0.1%) requires the FC to act on **calibrated covariance**, and GPS_INPUT collapses our 6-DoF covariance into two scalars (`h_acc`, `v_acc`), which discards the cross-axis and attitude terms.

**Conclusion.** Hybrid output. Keep GPS_INPUT as the **primary "GPS-substitute" channel** (matches AC-4.3 framing, plays cleanly with FC operator workflows that expect a `GPS_RAW_INT`-shaped status). **Also emit ODOMETRY** when the EKF emits a fix with a full 6-DoF covariance and a non-trivial yaw observability — let the FC's EKF3 fuse the richer signal. Configure FC source priorities so GPS_INPUT is the failover in case ODOMETRY trips a parameter gate (VISO_QUAL_MIN). This is a *strict superset* of the draft's choice; the only cost is the extra MAVLink emit and the source-switching SITL test scope (M-11).
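
To make the information loss concrete, the sketch below packs a 6×6 pose covariance into a 21-element row-major upper triangle (the layout ODOMETRY carries) and separately collapses the same matrix into the two accuracy scalars GPS_INPUT carries. The collapse rule (root-sum of horizontal variances for `h_acc`, vertical sigma for `v_acc`), the example numbers, and all function names are assumptions for illustration, not the draft's actual mapping.

```python
import math

def pack_upper_triangle(cov):
    """Row-major upper triangle of a 6x6 matrix -> 21 floats."""
    if len(cov) != 6 or any(len(row) != 6 for row in cov):
        raise ValueError("expected a 6x6 matrix")
    return [cov[i][j] for i in range(6) for j in range(i, 6)]

def collapse_to_scalars(cov):
    """Hypothetical GPS_INPUT-style collapse: (h_acc, v_acc) in metres."""
    h_acc = math.sqrt(cov[0][0] + cov[1][1])
    v_acc = math.sqrt(cov[2][2])
    return h_acc, v_acc

def example_cov(xy_corr):
    """Pose covariance with sigma_x = 3 m, sigma_y = 4 m, sigma_z = 2 m."""
    cov = [[0.0] * 6 for _ in range(6)]
    for i, var in enumerate([9.0, 16.0, 4.0, 0.01, 0.01, 0.04]):
        cov[i][i] = var
    # Off-diagonal x/y term: correlation * sigma_x * sigma_y.
    cov[0][1] = cov[1][0] = xy_corr * 3.0 * 4.0
    return cov

uncorrelated = example_cov(0.0)
correlated = example_cov(0.9)

# Same scalars for both fixes, although the error ellipses differ.
same_scalars = collapse_to_scalars(uncorrelated) == collapse_to_scalars(correlated)
# The packed triangle keeps the x/y cross term, so the two fixes stay distinguishable.
same_triangle = pack_upper_triangle(uncorrelated) == pack_upper_triangle(correlated)
print(same_scalars, same_triangle)
```

Under these assumptions a strongly elongated error ellipse and an uncorrelated one look identical to the FC through the scalar channel, which is the case the hybrid output is meant to cover.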

**Confidence.** ✅ High. Two L1 sources (S41 dev docs + S42 PR #19563), one L1 confirming the failure path is real (S43 PR #30080).

---

## M-2 — MASt3R off the primary matcher list

**Fact.** mast3r-runtime Jetson support = "Planned" (S57). Speedy MASt3R = 91 ms / pair on A40 GPU.

**Reference comparison.** A40 ≈ 38 TFLOPS FP16 (datacenter-class GPU); Jetson Orin Nano Super 25 W ≈ 1.7 TFLOPS FP16 (~67 TOPS sparse INT8). Throughput ratio ~22× to 30× depending on operator mix. 91 ms × 22 ≈ 2 s/pair; × 30 ≈ 2.7 s/pair. Even with INT8 quantisation closing the gap by ~2× (typical for ViT-class), MASt3R lands at >1 s/pair, outside the 400 ms p95 budget by a factor of ≥2.5×.
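
The scaling argument can be reproduced in a few lines. This is a back-of-envelope sketch using only the figures quoted above; treating latency as inversely proportional to peak FP16 throughput and the flat 2× INT8 gain are the same simplifying assumptions the text makes, and the constant names are illustrative.

```python
A40_FP16_TFLOPS = 38.0
ORIN_FP16_TFLOPS = 1.7
A40_LATENCY_MS = 91.0      # Speedy MASt3R, per pair
BUDGET_P95_MS = 400.0
INT8_SPEEDUP = 2.0         # assumed typical gain for ViT-class models

def scaled_latency_ms(base_ms, throughput_ratio, speedup=1.0):
    """Latency on the slower device, assuming it scales with peak throughput."""
    return base_ms * throughput_ratio / speedup

flops_ratio = A40_FP16_TFLOPS / ORIN_FP16_TFLOPS   # about 22x

results = {}
for label, ratio in (("flops-ratio", flops_ratio), ("pessimistic-30x", 30.0)):
    fp16_ms = scaled_latency_ms(A40_LATENCY_MS, ratio)
    int8_ms = scaled_latency_ms(A40_LATENCY_MS, ratio, INT8_SPEEDUP)
    results[label] = (fp16_ms, int8_ms, int8_ms / BUDGET_P95_MS)
    print(f"{label}: FP16 {fp16_ms:.0f} ms, INT8 {int8_ms:.0f} ms, "
          f"{int8_ms / BUDGET_P95_MS:.1f}x over budget")
```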

**Conclusion.** MASt3R drops from the "stretch candidate" row in the draft's bench-off table to a **research-track-only** label. Bench-off resources should focus on SP+LG / XFeat / GIM-LightGlue / RoMa-distilled.

**Confidence.** ✅ High. Numbers are conservative — MASt3R has additional overhead from the depth backbone that doesn't exist in pure 2D matchers.

---

## M-3 — Add GIM-LightGlue to the bench-off

**Fact.** GIM (S48): self-trained generalist matcher, 8.4–18.1 % zero-shot improvement over LightGlue/RoMa/DKM/LoFTR baselines. Pre-trained checkpoints public.

**Reference comparison.** Our domain (eastern-Ukraine 1 km AGL nadir vs. service satellite tiles) has *zero* training data publicly available; the bench-off therefore tests zero-shot transfer. GIM's training paradigm (50 h of internet videos covering every kind of scene including aerial) is precisely the regime that maximises zero-shot transfer.

**Conclusion.** Add **GIM-LightGlue** to the matcher bench-off shortlist as a peer of vanilla SP+LG. If the published 8–18 % zero-shot gain holds on AerialVL + Mavic, GIM-LightGlue dominates the cost/quality frontier (same TRT path as SP+LG, better accuracy out of the box).

**Confidence.** ✅ High. ICLR 2024 spotlight; benchmark numbers reproduced by independent users in the GitHub issue tracker.

---

## M-4 — VPR shortlist expansion: + SALAD + BoQ

**Fact.** SALAD (S47, CVPR 2024): DINOv2 + Sinkhorn optimal-transport VLAD; R@1 = 75 % on MSLS Challenge / 92.2 % MSLS Val / 76 % NordLand; in `aero-vloc`. BoQ (S46, CVPR 2024): bag of learnable queries, beats NetVLAD/MixVPR/EigenPlaces/Patch-NetVLAD/TransVPR/R2Former on 14 benchmarks; DinoV2 results Nov 2024.

**Reference comparison.** AnyLoc (draft primary) is unsupervised VLAD over DINOv2 features; SALAD is *trained* DINOv2-VLAD via Sinkhorn; BoQ is *learnable queries* over a backbone (DINOv2 or ViT). SALAD strictly beats AnyLoc on the same backbone in published benchmarks. BoQ beats both on standard VPR benchmarks; aerial-specific numbers TBD but well-positioned.

**Conclusion.** The bench-off table grows from {AnyLoc, MixVPR} to **{AnyLoc, SALAD, BoQ, MixVPR}**. AnyLoc remains the training-free fallback; SALAD and BoQ are likely primaries.

**Confidence.** ✅ High on the shortlist expansion (sources are CVPR 2024 papers + GitHub repos with published weights). Aerial-domain ranking is empirical; the bench-off resolves it.

---

## M-5 — Latency budget has more headroom than the draft assumed

**Fact.** Jetson AI Lab (S40): DINOv2-base-patch14 = 126 inf/s on Orin Nano Super → ~8 ms/inf at 224×224, FP16 trtexec.

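**Reference comparison.** Draft estimated 50–80 ms / 224×224 for DINOv2 ViT-B (Component 2 row 1). The measured number is **~6–10× better**. At 448×448 (more typical for AnyLoc descriptor extraction) the patch-token count is 4× higher, so expect ~32 ms/inf if cost scales about linearly with tokens (quadratically with image side); attention's quadratic-in-tokens term can push the real figure higher, so this needs measuring.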

**Conclusion.** AC-4.1 (400 ms p95) is **comfortably feasible** with budget left over for SP+LG / GIM-LightGlue (target ~100 ms/pair) + EKF + MAVLink emit. R2 in the draft's risk table downgraded from High to Medium — empirical confirmation needed but no longer a make-or-break risk.
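
A minimal sketch of the arithmetic behind this headroom claim, using only the figures quoted in this section. The linear-in-tokens estimate and the quadratic-in-tokens bound are assumptions about how ViT cost scales; EKF and MAVLink emit costs are not modelled.

```python
MEASURED_INF_PER_S = 126.0        # DINOv2-base-patch14, 224x224, FP16
DRAFT_ESTIMATE_MS = (50.0, 80.0)  # draft's range for the same workload
PATCH_PX = 14
MATCHER_TARGET_MS = 100.0         # SP+LG / GIM-LightGlue target per pair
BUDGET_P95_MS = 400.0

def token_count(side_px):
    """Number of patch tokens for a square input."""
    return (side_px // PATCH_PX) ** 2

ms_224 = 1000.0 / MEASURED_INF_PER_S
speedup_range = tuple(d / ms_224 for d in DRAFT_ESTIMATE_MS)

token_ratio = token_count(448) / token_count(224)   # 1024 / 256 tokens
ms_448_linear = ms_224 * token_ratio                # cost linear in tokens
ms_448_quadratic = ms_224 * token_ratio ** 2        # bound if attention dominated

headroom_linear_ms = BUDGET_P95_MS - (ms_448_linear + MATCHER_TARGET_MS)
headroom_quadratic_ms = BUDGET_P95_MS - (ms_448_quadratic + MATCHER_TARGET_MS)

print(f"224px: {ms_224:.1f} ms ({speedup_range[0]:.1f}x to {speedup_range[1]:.1f}x better)")
print(f"448px: {ms_448_linear:.0f} ms linear, {ms_448_quadratic:.0f} ms quadratic bound")
print(f"headroom: {headroom_linear_ms:.0f} ms to {headroom_quadratic_ms:.0f} ms")
```

Even the pessimistic bound leaves budget for the matcher, which is why the risk is downgraded rather than closed: the 448×448 number still has to be measured on the device.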

**Confidence.** ✅ High. NVIDIA L1 source.

---

## M-6 — mavlink-router CVE-class issue

**Fact.** S45: stack-based buffer overflow in mavlink-router config parsing, fuzzing-discovered, public, no SECURITY.md.

**Reference comparison.** mavlink-router is a C++ daemon running with the same privileges as our companion process; if the config file is attacker-controlled (e.g., a tampered SD card on the airframe), this becomes RCE on the companion. Even if the config file is operator-controlled, a buggy config-file parser is one bug away from another related issue.

**Conclusion.** Three options, choose one:
1. **Pin a specific patched version + sandboxed systemd unit** (NoNewPrivileges, ReadOnlyPaths=/etc/mavlink-router/, MemoryDenyWriteExecute, RestrictAddressFamilies=AF_UNIX AF_INET).
2. **Replace with an in-process MAVLink endpoint multiplexer** (Python or Go, ~150 LOC) — eliminates the dependency entirely.
3. **Distinct system-IDs for MAVSDK + pymavlink** sharing the same serial port via ArduPilot's native MAVLink routing, no router daemon at all.

Option 3 is the simplest. Option 2 gives us the most control. Option 1 is the lowest-effort quick fix. Recommend **Option 3 for v1**, with Option 2 as v1.1 if MAVLink message volume saturates a single endpoint.

**Confidence.** ✅ High that the issue is real; choice of mitigation is implementation preference.

---

## M-7 — MAVLink2 signing is v1-mandatory

**Fact.** S44: signing supported in ArduPilot 4.5+ on telemetry links; USB bypasses; keys in FRAM.

**Reference comparison.** Without signing, anyone with serial-line access (companion side OR an exposed telemetry radio) can inject a `GPS_INPUT` (or ODOMETRY) frame and crash the vehicle. Signing makes that injection require possession of the FRAM key. The cost is one operator key-provisioning step per airframe.

**Conclusion.** Promote signing from "Security note (deferred to a Phase-4 security pass)" to a **v1 hard configuration item**. Document the key-provisioning procedure in the deploy runbook. At boot, verify that signing is active on the FC link, and refuse to inject GPS_INPUT/ODOMETRY if the check shows signing is off.
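
To illustrate why injection then requires the key, here is a stdlib-only sketch of the MAVLink 2 signature as the protocol's signing scheme describes it: the first 48 bits of SHA-256 over the 32-byte secret key, the frame bytes through the CRC, the link id, and a 48-bit timestamp. The function names are illustrative and the frame bytes are a placeholder, not a valid MAVLink frame; a real deployment would rely on the MAVLink library's own signing support rather than this code.

```python
import hashlib
import hmac

def mavlink2_signature(secret_key, frame_through_crc, link_id, timestamp_10us):
    """48-bit signature over key + frame (header..CRC) + link id + timestamp."""
    if len(secret_key) != 32:
        raise ValueError("secret key must be 32 bytes")
    if not 0 <= link_id <= 255:
        raise ValueError("link id must fit in one byte")
    if not 0 <= timestamp_10us < 2 ** 48:
        raise ValueError("timestamp must fit in 48 bits")
    trailer = bytes([link_id]) + timestamp_10us.to_bytes(6, "little")
    digest = hashlib.sha256(secret_key + frame_through_crc + trailer).digest()
    return digest[:6]

def signature_is_valid(secret_key, frame_through_crc, link_id, timestamp_10us, signature):
    """Constant-time comparison of the received and the recomputed signature."""
    expected = mavlink2_signature(secret_key, frame_through_crc, link_id, timestamp_10us)
    return hmac.compare_digest(expected, signature)

provisioned_key = bytes(range(32))               # stand-in for the per-airframe key
placeholder_frame = b"\xfd" + b"\x00" * 20       # NOT a real frame, illustration only
sig = mavlink2_signature(provisioned_key, placeholder_frame, link_id=1, timestamp_10us=123456)

accepted = signature_is_valid(provisioned_key, placeholder_frame, 1, 123456, sig)
forged = signature_is_valid(bytes(32), placeholder_frame, 1, 123456, sig)
print(accepted, forged)
```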

**Confidence.** ✅ High.

---

## M-8 — MBTiles operational recipe

**Fact.** S54: WAL + connection pool + transaction batching is the established recipe for MBTiles SQLite under concurrent reader+writer load. Default rollback journal mode causes `database is locked` failures.

**Reference comparison.** Our workload: many concurrent readers (matcher cache lookup at ≤3 fps × ~30 candidate tiles) + occasional writer (Component 1b ortho-tile write at ≤1–2 Hz × ~30 tiles). Without WAL, every writer commit blocks all readers. With WAL, readers and one writer proceed concurrently.

**Conclusion.** Update Component 1's "Tile format" row in the architecture table to specify: **MBTiles SQLite + WAL + connection pool + per-Component-1b-cycle transaction batching**. Add to AC-4.1 latency-budget validation: the tile-cache lookup must hit p95 ≤5 ms.
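
A minimal sketch of the WAL plus transaction-batching part of this recipe, using Python's built-in `sqlite3`. The `tiles` table follows the usual MBTiles layout; the helper names, the busy timeout, and `synchronous=NORMAL` are assumptions to tune, and the connection pool is reduced here to one connection per role.

```python
import os
import sqlite3
import tempfile

SCHEMA = """
CREATE TABLE IF NOT EXISTS tiles (
    zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_data BLOB);
CREATE UNIQUE INDEX IF NOT EXISTS tile_index
    ON tiles (zoom_level, tile_column, tile_row);
"""

def open_mbtiles(path):
    """Open a connection with WAL enabled and a busy timeout."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")  # assumed durability trade-off
    conn.executescript(SCHEMA)
    return conn

def write_tiles_batched(conn, tiles):
    """Write one Component-1b cycle's tiles in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)", tiles)

def read_tile(conn, z, x, y):
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, y),
    ).fetchone()
    return None if row is None else row[0]

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.mbtiles")
        writer = open_mbtiles(path)
        reader = open_mbtiles(path)
        try:
            batch = [(18, x, 0, b"tile-%d" % x) for x in range(30)]
            write_tiles_batched(writer, batch)
            mode = reader.execute("PRAGMA journal_mode").fetchone()[0]
            count = reader.execute("SELECT COUNT(*) FROM tiles").fetchone()[0]
            sample = read_tile(reader, 18, 7, 0)
        finally:
            reader.close()
            writer.close()
    return mode, count, sample

mode, count, sample = demo()
print(mode, count, sample)
```

Batching the ~30 tiles of a cycle into one commit keeps the number of WAL commits at the writer's cycle rate rather than the tile rate, which is what the p95 lookup target depends on.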

**Confidence.** ✅ High.

---

## M-9 — Cache-poisoning safety hazard

**Fact (analytical, not a single source).** Draft's dedup rule allows onboard tiles to overwrite stale service tiles when "our quality > existing". Quality = inlier count + sharpness; **does not include parent-pose covariance as a hard gate**. Combined with EKF over-confidence (a known failure mode — see W4.b), this lets a confidently-bad pose write a misaligned tile that becomes the next flight's anchor.

**Reference comparison.** Cartography literature consistently treats authoritative basemap as immutable and crowdsourced/UAV updates as voting input that requires consensus before promotion. SfM bundle-adjustment treats over-confident poses as the dominant error source.

**Conclusion.** Three layered mitigations:
1. Service-source tiles are **immutable within freshness budget**. Onboard tiles overwrite only stale or other-onboard tiles.
2. The Suite Service ingest applies a **voting layer**: an onboard tile gets promoted to "trusted basemap" only after **N≥2 independent flights** confirm consistent geo-alignment within X m.
3. Parent-pose covariance is a **hard gate** in the local quality score, set tighter than the generation-eligibility gate (e.g., σ_xy ≤ 5 m for the hard gate vs. σ_xy ≤ 10 m for generation). A tile generated with σ_xy between the two gates is still written but marked "soft" in its sidecar.

Add **AC-NEW-7 — Cache-poisoning safety budget**: P(onboard tile mis-aligned > 30 m) per flight < 1 %; P(misaligned > 100 m) per flight < 0.1 %. Validation: replay AerialVL with synthetic over-confidence injection.
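
The local (onboard) part of these mitigations can be sketched as a single write-decision function. The thresholds are the example values from mitigation 3 and still need calibration; the type and function names are hypothetical, and the Suite Service voting layer is out of scope for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

HARD_GATE_SIGMA_XY_M = 5.0         # example value, needs calibration
GENERATION_GATE_SIGMA_XY_M = 10.0  # example value, needs calibration

@dataclass(frozen=True)
class ExistingTile:
    source: str      # "service" or "onboard"
    stale: bool      # True once outside the freshness budget
    quality: float   # inlier count + sharpness score

def write_decision(sigma_xy_m: float, new_quality: float,
                   existing: Optional[ExistingTile]) -> str:
    """Return 'skip', 'write' or 'write_soft' for a candidate onboard tile."""
    # Parent-pose covariance gates first: quality cannot override them.
    if sigma_xy_m > GENERATION_GATE_SIGMA_XY_M:
        return "skip"
    if existing is not None:
        # Fresh service tiles are immutable.
        if existing.source == "service" and not existing.stale:
            return "skip"
        # Stale or onboard tiles are replaced only by better ones.
        if new_quality <= existing.quality:
            return "skip"
    # Between the two gates the tile is written but flagged in its sidecar.
    return "write" if sigma_xy_m <= HARD_GATE_SIGMA_XY_M else "write_soft"

fresh_service = ExistingTile("service", stale=False, quality=0.2)
stale_service = ExistingTile("service", stale=True, quality=0.2)
print(write_decision(3.0, 0.9, fresh_service))   # skip
print(write_decision(3.0, 0.9, stale_service))   # write
print(write_decision(8.0, 0.9, stale_service))   # write_soft
```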

**Confidence.** ⚠️ Medium. Hazard is real and qualitatively well-known; specific numeric thresholds need empirical calibration during implementation.

---

## M-10 — Free-threaded Python 3.13 not v1-ready

**Fact.** S55: experimental, single-threaded perf hit, GIL re-enables on non-FT-aware C extension import.

**Reference comparison.** Our hot-path includes: numba JIT kernels, TensorRT Python bindings, pymavlink (C extension), numpy/scipy, possibly cv2. Any one of these silently re-enabling the GIL nullifies the benefit. And the non-trivial single-threaded penalty (~10–15 % per various benchmarks) directly hits AC-NEW-1 (cold-start TTFF <30 s).
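
If free-threading is revisited later, the silent GIL re-enable can at least be made visible. The sketch below is a hedged probe that uses `sys._is_gil_enabled()` (available from CPython 3.13) and the `Py_GIL_DISABLED` build variable, and degrades gracefully on older interpreters; the helper names are illustrative. The probe would need to run again after importing each C extension on the hot path.

```python
import sys
import sysconfig

def is_free_threaded_build() -> bool:
    """True if this interpreter was built with the GIL made optional."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

def gil_state() -> str:
    """Report whether the GIL is active right now."""
    probe = getattr(sys, "_is_gil_enabled", None)
    if probe is None:
        # Interpreters before 3.13 have no probe and always run with the GIL.
        return "gil-enabled (no runtime probe)"
    return "gil-enabled" if probe() else "free-threaded"

state_before_imports = gil_state()
print(is_free_threaded_build(), state_before_imports)
```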

**Conclusion.** v1 stays on **standard CPython 3.11 or 3.12** (newest stable, well-supported by JetPack / numba / TRT). Sharpen the rationale in the architecture: the choice is not "GIL is fine" but "asyncio + TRT subprocess workers + numba JIT is the production-ready combination today; revisit free-threading in v1.1."

**Confidence.** ✅ High.

---

## M-11 — ODOMETRY known production gotchas → SITL coverage required

**Fact.** S41/S42/S43: companion-derived velocity errors, position-estimate resets on loss of the external-nav reference, source-switching conflicts when running alongside GPS.

**Reference comparison.** AC-NEW-2 (3 s spoofing-promotion latency) **is** the source-switching path. Whatever output channel we pick (GPS_INPUT, ODOMETRY, or hybrid), the source switch is the high-risk transition.

**Conclusion.** Add an explicit testing requirement: **F-T9 (SITL: full MAVLink loop)** must include source-switching scenarios (jam onset → our channel → spoofed real-GPS recovery → operator-confirmed source restore). Include the `EK3_SRC1_*` parameter combinations being benchmarked in the test plan.

**Confidence.** ✅ High.

---

## M-12 — Eastern-Ukraine relief amplitude affects flat-Earth assumption

**Fact.** S56: ~24 m peak-to-trough relief in Kharkiv-region UAV survey areas, with creek/gully systems.

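**Reference comparison.** At 1 km AGL with a 35° HFOV nadir camera, the ray at the frame edge is 17.5° off nadir, so a 24 m elevation offset there shifts the flat-Earth ortho-projection by 24 m × tan 17.5° ≈ 7.6 m. The shift reaches ~17 m only when the ray is 35° off nadir (for example under bank, or if 35° is the half-angle). AC-1.1 budget = 50 m@80 % (comfortable in both cases); AC-1.2 = 20 m@50 % (tight at the ~17 m end).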
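
The flat-Earth error for a point at elevation offset Δh is Δh·tan θ, where θ is the off-nadir angle of the viewing ray. The sketch evaluates it for the frame edge of a 35° HFOV camera pointing straight down (θ = 17.5°) and for θ = 35°. Which angle the design should budget for is an assumption to confirm against the actual camera and attitude envelope.

```python
import math

def relief_displacement_m(delta_h_m: float, off_nadir_deg: float) -> float:
    """Horizontal ortho-projection error caused by an unmodelled elevation offset."""
    return delta_h_m * math.tan(math.radians(off_nadir_deg))

RELIEF_M = 24.0
HFOV_DEG = 35.0

edge_of_frame_m = relief_displacement_m(RELIEF_M, HFOV_DEG / 2.0)  # nadir camera, frame edge
off_nadir_35_m = relief_displacement_m(RELIEF_M, HFOV_DEG)         # ray 35 degrees off nadir

print(f"frame edge (17.5 deg): {edge_of_frame_m:.1f} m")
print(f"35 deg off nadir:      {off_nadir_35_m:.1f} m")
```

Flight altitude drops out of the formula, so the same numbers hold anywhere in the operating band as long as the off-nadir angle is unchanged.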

**Conclusion.** Add a **per-sector DEM lookup** to the pre-flight tile-sync pass. Classify sectors:
- **flat** (≤5 m amplitude) — full ortho-tile generation, full anchor weight.
- **moderate** (5–15 m) — ortho-tile generation, anchor weight × 0.7.
- **rugged** (>15 m) — skip ortho-tile generation, anchor weight × 0.3 with explicit "rugged-sector" flag in confidence telemetry.

This is a small one-time pre-flight step (SRTM 30 m DEM is free, ~15 GB global, ~30 MB for 400 km²).
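
The sector classification above reduces to a small lookup. In this sketch the relief amplitude is taken as max minus min of the DEM samples inside a sector; that definition, the type name, and the function names are assumptions, while the thresholds and weights are the ones listed.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class SectorPolicy:
    label: str
    generate_ortho_tiles: bool
    anchor_weight: float

def relief_amplitude_m(dem_samples_m: Iterable[float]) -> float:
    """Peak-to-trough elevation range of the DEM samples in one sector."""
    samples = list(dem_samples_m)
    if not samples:
        raise ValueError("sector has no DEM samples")
    return max(samples) - min(samples)

def classify_sector(amplitude_m: float) -> SectorPolicy:
    if amplitude_m <= 5.0:
        return SectorPolicy("flat", True, 1.0)
    if amplitude_m <= 15.0:
        return SectorPolicy("moderate", True, 0.7)
    return SectorPolicy("rugged", False, 0.3)

# Hypothetical elevations (metres) for a sector crossed by a gully.
gully_sector = [142.0, 139.5, 151.0, 163.5, 158.0]
policy = classify_sector(relief_amplitude_m(gully_sector))
print(policy)
```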

**Confidence.** ⚠️ Medium. Single regional sample; refine numbers when more terrain data lands.

---

## M-13 — TartanAir V2 reconsideration (open question)

**Fact.** S51: photo-realistic synthetic, native IMU + 12-cam + season variation + custom camera models.

**Reference comparison.** User's last-message reasoning was "Mavic-class dynamics ≠ fixed-wing dynamics → synthetic IMU is unlikely to produce a useful signal". TartanAir V2 lets us configure motion patterns, so the dynamics-mismatch argument is weaker than for MidAir-class quadcopter-only sims.

**Conclusion.** **Open question for the user**: include TartanAir V2 in the bench-off as an early-stage synthetic baseline (good for sweeping seasons / lighting / pitches), or hold to "real-data-only purism" with AerialVL + Mavic + planned-fixed-wing-flights as the only V&V?

**Confidence.** ⚠️ Medium. Technical viability is high; the call is product-side.

---

## M-14 — Add AerialExtreMatch + 2chADCNN to V&V plan

**Fact.** AerialExtreMatch (S49) — 1.5 M synthetic image pairs, 32 difficulty levels (overlap × scale × pitch), real-world UAV localization subset. 2chADCNN (S50) — season-aware UAV↔satellite template-matching.

**Reference comparison.** Draft's bench-off targets are AerialVL + UAV-VisLoc + internal Mavic. None of those grade against extreme-pitch / extreme-scale / extreme-overlap separately. Without a benchmark that crosses these axes, the bench-off can pick a winner that fails silently in corner-case conditions.

**Conclusion.** Add to the V&V plan:
- **AerialExtreMatch** as a primary structured-difficulty regression bench.
- **2chADCNN** as a season-aware baseline either (a) included in the bench-off, or (b) used as an explicit season-robustness ceiling reference.

**Confidence.** ✅ High.

---

## M-15 — Real fixed-wing VO is harder than draft implies

**Fact.** S52 (AFIT thesis): SVO/DSO/ORB-SLAM2 all "had significant difficulty maintaining localisation" on real fixed-wing flights. S53: high-altitude (300–1000 m AGL) VIO drift in the same band as our AC-1.3.

**Reference comparison.** Draft's choice ("custom 2-frame homography VO via Component-3 matcher") is the correct framing: VO between satellite anchors is a **much easier** problem than standalone metric SLAM. But AC-1.3's drift budget (<100 m without IMU, <50 m with IMU between two satellite-anchored fixes) requires empirical confirmation against a real fixed-wing baseline.

**Conclusion.** Add to risks: **R8 — fixed-wing VO drift under our AC-1.3 budget is unconfirmed**. Mitigations:
1. Borrow AerialVL's fixed-wing trajectories (70 km of real fixed-wing flight) for AC-1.3 regression in `F-T1b` (new).
2. Plan the first internal fixed-wing flight before AC lock — not as a stretch goal.

**Confidence.** ✅ High.

---

## Summary table

| Finding | Severity | Affects | Resolution |
|---|---|---|---|
| M-1 | High | C-6, AC-4.3, AC-NEW-4 | Hybrid GPS_INPUT + ODOMETRY |
| M-2 | High | C-3 bench-off | Drop MASt3R from primary list |
| M-3 | Med | C-3 bench-off | Add GIM-LightGlue |
| M-4 | High | C-2 bench-off | Add SALAD + BoQ |
| M-5 | High (positive) | AC-4.1 | Downgrade R2 risk |
| M-6 | High (security) | C-6 | Drop router daemon (distinct system IDs); else in-process mux or sandbox & pin |
| M-7 | High (security) | C-6 | MAVLink2 signing v1-mandatory |
| M-8 | Med | C-1 | MBTiles WAL + pool + batching |
| M-9 | High (safety) | C-1b, AC-NEW-7 | New AC-NEW-7 + dedup-rule changes |
| M-10 | Med | C-9 | Stay on CPython 3.11/3.12; sharpen rationale |
| M-11 | Med | C-5/C-6, AC-NEW-2 | Add SITL source-switching tests |
| M-12 | Med | C-1b, AC-1.2 | Per-sector DEM lookup + anchor weight |
| M-13 | Open question | datasets | Surface to user |
| M-14 | Med | V&V plan | Add AerialExtreMatch + 2chADCNN |
| M-15 | Med | C-4, AC-1.3 | Risk R8 + AerialVL F-T1b |