# Mode B Decomposition — Adversarial Assessment of `solution_draft01.md`

**Mode**: B (Solution Assessment). **Question type**: Problem Diagnosis + Decision Support. **Novelty sensitivity**: **High** — embedded CV/SLAM, ArduPilot MAVLink2 signing maturity, JetPack versions, and matcher SOTA all churn fast; prefer 2024-Q4 → 2026-Q2 sources.

**Goal**: per the Mode B template, find the weak points (functional / security / performance) of each draft component and propose either a stronger alternative or an explicit mitigation. Output is `solution_draft02.md` with an "Assessment Findings" table at the top.

## Boundary

- **Population**: a single fixed-wing UAV running the GPS-denied onboard pipeline, 1 km AGL, 60 km/h cruise, 8 h endurance, eastern/southern Ukraine.
- **Geography**: deployed in an active-conflict / contested-EW environment.
- **Timeframe**: deployment v1 within ~4–6 months from now (mid-2026).
- **Level**: companion-computer code + integration. The Suite Satellite Service, the AI-camera detector, the FC firmware, and the airframe are out of scope as components but appear as interfaces under attack.

## Perspectives chosen (≥3 mandatory)

1. **Implementer / engineer** — what published Jetson Orin Nano Super numbers say about the actual latency budget, what the GIL-on-hot-path failure modes are, and what is hard about deploying DINOv2-VLAD through TensorRT.
2. **Contrarian / devil's advocate** — every committed choice in the draft has a "why not X" answer; surface them.
3. **Domain practitioner** — what people running ArduPilot + companion CV in production have written about MAVLink2 signing, mavlink-router, GPS_INPUT injection, and cross-view matchers in active service.
4. **Security / red-team** — `GPS_INPUT` is a high-trust local channel and the tile cache is operationally sensitive; map the realistic attack surface and mitigations.

## Weak-point sub-questions (these drive the Mode B web search)

### W1. Cross-view matcher commitment (Component 3)

The draft pins SuperPoint+LightGlue / XFeat / MASt3R as the bench-off candidates, with 1024×768 as the working downsample.

- W1.a. **Is the bench-off shortlist still current as of 2026-Q2?** Did GIM (2024), BoQ (2024), MASt3R-SfM (2025), RoMa-DC (2025), or the Map-Free-Reloc 2025 leaderboard winners change the picture?
- W1.b. **Is "1024×768 starting point" empirically defensible on the Orin Nano Super at 25 W?** Published TRT FPS / latency for SP+LG and XFeat at this resolution on the Orin Nano class.
- W1.c. **Cross-view-specific failure modes at 1 km AGL** that the bench-off won't catch — illumination, season, recent-conflict landscape change. Are any matchers explicitly evaluated on temporal change?
- W1.d. **Why not training-free 3D-grounded matching (MASt3R / MASt3R-SfM) as primary** instead of as stretch? What is the realistic Orin Nano latency budget for these?

Query variants: "LightGlue Jetson Orin Nano benchmark 2025 2026", "SuperPoint TensorRT FP16 Orin Nano latency", "MASt3R embedded GPU benchmark", "GIM image matching cross-view 2024", "BoQ visual place recognition", "RoMa DKM aerial cross-view 2025", "image matcher seasonal change benchmark".

### W2. VPR backbone commitment (Component 2)

Draft picks AnyLoc (DINOv2-VLAD) primary + MixVPR fast-lane.

- W2.a. **DINOv2 ViT-B/14 latency on the Orin Nano Super at 25 W** — is the draft's "~50–80 ms / 224×224" empirically backed?
- W2.b. **2025 SOTA**: SALAD, BoQ (Bag-of-Queries), CricaVPR — do any beat AnyLoc on aerial cross-domain at meaningful latency?
- W2.c. **AnyLoc unsupervised VLAD** is training-free, but is the VLAD codebook quality stable across operational areas (Ukraine specifically)? Any published failure cases?

Query variants: "AnyLoc Jetson benchmark", "DINOv2 ViT-B TensorRT FP16 latency Orin", "SALAD visual place recognition aerial 2024", "BoQ visual place recognition", "CricaVPR aerial benchmark", "VPR aerial Ukraine seasonal".
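Whichever backbone survives W2, the retrieval half it feeds is fixed enough to sketch now, and it is the same piece whose top-K latency W9.a and whose load time W11.b interrogate. A minimal FAISS IVF sketch (the descriptor dimension, `NLIST`, and `nprobe` values are placeholder assumptions for the bench-off, not committed choices):

```python
# Hypothetical retrieval half of the VPR stage: whichever backbone wins
# (AnyLoc VLAD, MixVPR, SALAD, BoQ) reduces a frame to one global
# descriptor; the tile cache answers with top-K candidate tiles.
import numpy as np
import faiss

D = 512            # descriptor dim after PCA/whitening (placeholder)
N_TILES = 100_000  # ~10^5 tile descriptors per W11.b
NLIST = 256        # IVF cells; tune against recall@K on the bench-off set

tile_vecs = np.random.rand(N_TILES, D).astype("float32")  # stand-in for the cache
faiss.normalize_L2(tile_vecs)

quantizer = faiss.IndexFlatIP(D)
index = faiss.IndexIVFFlat(quantizer, D, NLIST, faiss.METRIC_INNER_PRODUCT)
index.train(tile_vecs)
index.add(tile_vecs)
index.nprobe = 8   # latency/recall knob; part of the W9.a measurement

query = np.random.rand(1, D).astype("float32")
faiss.normalize_L2(query)
scores, tile_ids = index.search(query, 5)  # top-5 candidate tiles for the matcher
```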
### W3. Process topology — "single Python process + asyncio + TRT subprocess workers via CUDA IPC"

Draft commits to this for v1 (Component 9).

- W3.a. **GIL on the hot path** — does asyncio + subprocess workers actually hold the 3 fps deadline once MAVLink I/O, FDR writes, and tile-cache lookups plus EKF math all share one interpreter? Real-world failure stories from ArduPilot/PX4 companion-computer projects.
- W3.b. **CUDA IPC for tensor handoff** — known issues on Jetson? Under the unified memory model, is CUDA IPC even meaningful when CPU and GPU share the LPDDR5 pool? (A CPU-shared-memory alternative is sketched below.)
- W3.c. **Subinterpreters / free-threaded Python (3.13+)** — is the project pinned to a Python old enough that subinterpreters aren't an option?
- W3.d. **Alternatives**: ROS 2 Humble (rejected in draft), C++ core (rejected), single process with multiprocessing (not discussed).

Query variants: "Jetson CUDA IPC unified memory", "Python asyncio CUDA real-time deadline", "Python GIL drone companion computer", "PX4 ArduPilot companion computer python production", "ROS2 vs Python single-process VIO embedded", "free-threaded Python 3.13 GPU".
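Because W3.b questions whether CUDA IPC buys anything on a unified-memory Jetson, the alternative worth benching against it is plain CPU-side shared memory between the asyncio parent and a TRT worker process. A minimal sketch on stdlib `multiprocessing.shared_memory`; the ring depth is an assumption (the 1024×768 downsample is the draft's own figure):

```python
# One candidate answer to W3.b: on Jetson's unified LPDDR5 pool, a frame can
# cross the process boundary as plain shared CPU memory and be re-uploaded
# by the worker; bench this against CUDA IPC handles before committing.
import numpy as np
from multiprocessing import shared_memory

FRAME_SHAPE = (768, 1024, 3)   # the draft's working downsample, BGR8
RING_DEPTH = 4                 # frames in flight before the producer blocks

def create_ring(name="frame_ring"):
    """Producer side: allocate one shared block holding the whole ring."""
    nbytes = RING_DEPTH * int(np.prod(FRAME_SHAPE))
    shm = shared_memory.SharedMemory(create=True, size=nbytes, name=name)
    ring = np.ndarray((RING_DEPTH, *FRAME_SHAPE), dtype=np.uint8, buffer=shm.buf)
    return shm, ring

def attach_ring(name="frame_ring"):
    """Worker side: attach to the existing block by name (unlink on shutdown
    omitted here)."""
    shm = shared_memory.SharedMemory(name=name)
    ring = np.ndarray((RING_DEPTH, *FRAME_SHAPE), dtype=np.uint8, buffer=shm.buf)
    return shm, ring

# Slot hand-off (which ring index is ready/free) still needs a queue or
# semaphore; that control channel, not the pixel copy, is where GIL stalls
# would show up under the W3.a load.
```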
### W4. Loosely-coupled EKF in Python + numba (Component 5)

Draft writes its own loosely-coupled EKF — fusing FC IMU @ 100 Hz, irregular satellite anchors, and VO @ 3 Hz — and emits GPS_INPUT.

- W4.a. **Why not just feed `VISION_POSITION_ESTIMATE` to ArduPilot EKF3 and let the FC fuse?** The draft mentions this as an "alternative" — what does the practitioner literature say about the actual cost of the dual-fusion choice?
- W4.b. **EKF covariance calibration is famously fragile** (the AC-NEW-4 false-position budget rides on it). Are there published gotchas for loosely-coupled aerial EKFs? What is the right Mahalanobis gate value? (A minimal gate sketch follows the W5 material below.)
- W4.c. **numba JIT on Jetson** — JIT warm-up time hurts AC-NEW-1 (cold-start TTFF < 30 s). Real numbers on Jetson Orin Nano JIT compile time.
- W4.d. **Heading observability** — at 1 km AGL nadir, satellite anchoring gives `(lat, lon, h)`, but heading is weakly observable from a single anchor unless the matcher emits oriented features. Does the draft's matcher choice cleanly produce yaw with covariance?

Query variants: "ArduPilot VISION_POSITION_ESTIMATE vs GPS_INPUT", "loose coupled EKF aerial gotcha", "EKF Mahalanobis gate visual anchor", "numba Jetson cold start", "monocular yaw observability satellite reference".

### W5. ArduPilot MAVLink2 signing + GPS_INPUT injection security (Component 6)

Draft says "MAVLink2 signing recommended" and treats GPS_INPUT as a high-trust local channel.

- W5.a. **Production maturity of MAVLink2 signing in ArduPilot 4.5+** as of 2026-Q2 — default-on or default-off, and what is the key-distribution story?
- W5.b. **Real attack surface**: what does an attacker with serial access to the FC actually need to spoof a GPS_INPUT? Does `mavlink-router` itself widen the attack surface?
- W5.c. **Companion-side defenses** — health-gate before injecting, fix_type sanity, jam detection from the other direction. (Sketched below.)
- W5.d. **Failsafe fallback**: if our GPS_INPUT is rejected by the FC (signing failure), what does ArduPilot do — does AC-NEW-2 (3 s spoof-promotion latency) survive that?

Query variants: "ArduPilot MAVLink2 signing 4.5 production", "MAVLink2 signing key distribution UAV", "ArduPilot GPS_INPUT signing", "mavlink-router security audit", "GPS_INPUT spoof companion computer attack".
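W5.c's list is concrete enough to sketch now. A minimal emit-side health gate, assuming a pymavlink connection (pymavlink is already in the draft's stack); the thresholds are hypothetical placeholders, and the `gps_input_send` field order follows the MAVLink common-dialect GPS_INPUT definition, so verify it against the dialect actually flashed on the FC:

```python
# Companion-side health gate (W5.c): never let a GPS_INPUT reach the FC
# unless the solution passes covariance and continuity sanity checks.
import time
from pymavlink import mavutil

MAX_SIGMA_XY_M = 10.0   # mirrors the tile-eligibility gate (placeholder)
MAX_JUMP_M = 150.0      # reject > this jump between consecutive solutions

IGNORE_VEL = (mavutil.mavlink.GPS_INPUT_IGNORE_FLAG_VEL_HORIZ
              | mavutil.mavlink.GPS_INPUT_IGNORE_FLAG_VEL_VERT
              | mavutil.mavlink.GPS_INPUT_IGNORE_FLAG_SPEED_ACCURACY)

def emit_if_healthy(master, lat_deg, lon_deg, alt_m,
                    sigma_xy_m, sigma_z_m, jump_m):
    """Gate, then emit. Silence (return False) is the safe failure mode:
    the FC keeps dead-reckoning on EKF3 instead of eating a bad fix."""
    if sigma_xy_m > MAX_SIGMA_XY_M or jump_m > MAX_JUMP_M:
        return False
    master.mav.gps_input_send(
        int(time.time() * 1e6),          # time_usec
        0, IGNORE_VEL, 0, 0,             # gps_id, ignore_flags, week_ms, week
        3,                               # fix_type: never claim better than 3D
        int(lat_deg * 1e7), int(lon_deg * 1e7), alt_m,
        1.0, 1.0,                        # hdop / vdop (dimensionless in spec)
        0.0, 0.0, 0.0, 0.0,              # vn, ve, vd, speed_accuracy (ignored)
        sigma_xy_m, sigma_z_m,           # horiz / vert accuracy, metres
        12)                              # satellites_visible: nominal constant
    return True
```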
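And the gate W4.b points at is small enough to pin down as code too. A minimal sketch in plain numpy/scipy; the 0.99 χ² quantile on 3 DOF is an assumption standing in for exactly the per-environment tuning parameter W10.c flags, not a recommendation:

```python
# Innovation (Mahalanobis) gate for the loosely-coupled EKF update (W4.b).
# Assumes a 3-DOF position anchor mapped to local ENU coordinates.
import numpy as np
from scipy.stats import chi2

GATE = chi2.ppf(0.99, df=3)   # ~11.34 for 3 degrees of freedom

def gate_anchor(z, z_pred, H, P, R):
    """Return (accept, d2) for measurement z against prediction z_pred;
    d2 is the squared Mahalanobis distance of the innovation."""
    y = z - z_pred                        # innovation, shape (3,)
    S = H @ P @ H.T + R                   # innovation covariance, (3, 3)
    d2 = float(y @ np.linalg.solve(S, y))
    return d2 <= GATE, d2
```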
### W6. In-flight ortho-tile generation residual error (Component 1b)

Draft: pinhole projection → flat-Earth ground plane → resample to z=20 XYZ tiles. Eligibility gates: σ_xy ≤ 10 m, |bank|, |pitch| ≤ 10°.

- W6.a. **Flat-Earth residual error in eastern/southern Ukraine** — actual relief amplitude. Steppes are not flat at 30 cm/px tile precision; agricultural fields, river valleys, and ravines (yary) are common. For a ground point Δh above the assumed plane viewed at off-nadir angle θ, the horizontal georeferencing error is roughly Δh·tan θ; 40 m of relief at 20° off-nadir is already ~15 m, outside the σ_xy ≤ 10 m gate.
- W6.b. **What per-tile geo-alignment error budget** still keeps cross-view anchors valid against the same tile two flights later?
- W6.c. **MBTiles SQLite at 10 GB scale on NVMe**: known issues with a concurrent reader + writer (the tile-cache-miss path is concurrent with the tile-write path)? Sharding strategy?
- W6.d. **Dedup by (z, x, y) only** — but the onboard tile carries a parent_pose covariance. If an "onboard" tile written from a 3-σ-bad pose overwrites a service-source tile, the next flight's cache is poisoned. Should the dedup rule include a "trust-only" lock from the Service?

Query variants: "MBTiles concurrent writer reader SQLite", "orthorectification flat earth residual error UAV", "Ukraine eastern terrain relief amplitude", "geotagged tile alignment budget cross-view localization".

### W7. Tile dedup poisoning — onboard tile overwrites service tile

This is a sharper version of W6.d.

- W7.a. The "highest quality wins" rule treats `match_inliers` as a proxy for geo-alignment confidence. But a confidently-bad anchor (over-confident covariance from the EKF — see W4.b) writes a "high-quality" tile that is actually misaligned by 50 m. Next flight, that misaligned tile becomes the satellite anchor for *another* anchor, and the error compounds.
- W7.b. **Best practice from cartography / SfM** for trusting onboard imagery as basemap input.
- W7.c. **Mitigation**: lock tiles whose source is `service` against onboard overwrite for some grace period; require onboard tiles to be "voted" in by N independent flights before promotion.

Query variants: "satellite tile pose error compounding", "uav generated tile basemap update sfm trust", "drone-ortho photo dedup quality score".

### W8. Mavic-class footage as deployment-domain proxy

Draft uses internal Mavic flight footage as the deployment-domain V&V proxy. The Mavic is a small quadcopter; the deployment platform is a fixed-wing at 1 km AGL.

- W8.a. **What does the literature say** about transferring CV/VO/VPR results from quadcopter footage to fixed-wing? Camera dynamics differ: rolling shutter, vibration spectrum, frame rate, motion-blur profile, AGL band.
- W8.b. **Synthetic IMU from Mavic video** — the user already rejected this. But is there a non-synthetic alternative the draft missed? E.g., MidAir (synthetic but matched dynamics), TartanAir, public ArduPilot SITL logs.
- W8.c. **Risk of false confidence** — ground truth is in the absolute satellite anchor, not the Mavic IMU. So how does the Mavic V&V actually validate AC-NEW-4 (false-position safety) when no fixed-wing IMU is in the loop?

Query variants: "fixed wing vs quadcopter visual SLAM transfer", "drone vibration spectrum fixed-wing quad", "TartanAir aerial dataset fixed-wing".

### W9. Latency budget — is 400 ms p95 actually realistic?

The AC-4.1 budget. Draft acknowledges R2 ("latency budget on Orin Nano Super at 1024×768 input is tight").

- W9.a. **Real published Jetson Orin Nano Super 25 W numbers** for: DINOv2 ViT-B forward (224×224), SuperPoint+LightGlue at 1024×768, FAISS top-K over ~10⁴ vectors, EKF update at 100 Hz IMU.
- W9.b. **Steady-state vs transient latency** — does the budget include EKF-output-to-MAVLink-emit overhead, MAVLink serialisation, and the FC's own gating?
- W9.c. **Failure mode if the budget blows** — frame-drop is allowed (AC-4.1 says ~10%), but if the matcher latency tail is 600 ms, the EKF rides on VO+IMU for >2 frames and the AC-3.4 reloc trigger fires.

Query variants: "DINOv2 Jetson Orin Nano TensorRT FP16 ms", "LightGlue Jetson benchmark FPS 1024", "FAISS Jetson IVF latency".

### W10. AC-NEW-4 false-position safety — Monte Carlo validation realism

P(error > 500 m) < 0.1%, P(error > 1 km) < 0.01%.

- W10.a. **What is the standard practice** for validating probabilities at this magnitude? You need >10⁴ frames with independent failure modes (by the rule of three, demonstrating P < 0.1% at ~95% confidence with zero observed failures already takes ≥ 3/0.001 ≈ 3,000 independent frames, and P < 0.01% takes ≥ 30,000; temporal correlation inflates both) — does the AerialVL + Mavic dataset cover that?
- W10.b. **What does the literature say** about cross-view matcher tail behavior — do failures cluster on specific scene types (forest, repetitive cropland, water, glare)? If yes, dataset bias is the killer.
- W10.c. **EKF-side gating** — the Mahalanobis gate is the right tool, but the gate threshold itself is a per-environment tuning parameter. Is there a published recipe?

Query variants: "visual localization tail probability >1km", "cross-view matcher failure clustering forest cropland water", "aerial visual SLAM Monte Carlo safety budget".

### W11. Cold-start TTFF < 30 s feasibility

AC-NEW-1.

- W11.a. **TRT engine warm-up cost** on the Jetson Orin Nano Super for SP+LG + DINOv2 + the EKF's JIT. Real numbers.
- W11.b. **FAISS index load + mmap warm**: 10 GB tile cache, IVF over ~10⁵ tile vectors — load time on NVMe.
- W11.c. **First valid GPS_INPUT** path includes: IMU extrapolation from the FC, first frame, VPR retrieve, matcher run, PnP, EKF init, GPS_INPUT emit. Has anyone published an end-to-end cold-boot number for this kind of stack on an Orin?

Query variants: "TensorRT engine load time Jetson", "FAISS mmap warm 10GB", "Jetson companion computer cold boot time GPS substitute".

### W12. Imagery freshness reality check — Suite Satellite Service refresh cadence

AC-8.2 + AC-NEW-6: < 6 months for active sectors, < 12 months for stable.

- W12.a. **Is a 6-month refresh actually achievable** for Maxar Vivid / Pléiades Neo / Pléiades over Ukraine in 2026-Q2? Tasking lead time + cloud-cover acceptance + delivery channel.
- W12.b. **Practitioner reports** on what 30 cm Ukraine 2024–2025 imagery actually looks like (smoke, glare, seasonal mismatch, cratering).
- W12.c. **In-flight tile generation** is meant to backfill — but the Service still needs ground-truth tasking to seed the cache for any new operational area before the *first* flight. Is there a chicken-and-egg problem for first deployment to a new sector?

Query variants: "Maxar Vivid Ukraine 2025 refresh tasking", "Pleiades Neo Ukraine cloud cover lead time", "30cm satellite imagery refresh cadence active conflict".

### W13. Resource contention — 8 GB shared LPDDR5 budget

AC-4.2 = < 8 GB shared. Draft loads:

- DINOv2 ViT-B TRT engine (~600 MB GPU)
- SP+LG TRT engine (~hundreds of MB)
- FAISS index over 10⁵ tile descriptors
- Tile cache mmap (10 GB on disk, mapped to RAM via the OS page cache)
- EKF state + IMU ring buffer
- Python interpreter + asyncio loop + JIT'd numba kernels
- MAVSDK + pymavlink

Sub-questions:

- W13.a. **Realistic peak RSS** for this stack — is the 8 GB budget headroom or a tight squeeze? (A strawman ledger follows the query variants below.)
- W13.b. **JetPack 6.2 / Ubuntu 22 baseline RAM** consumed before our process even starts.
- W13.c. **Mitigation**: page out the FAISS index, swap, or pin everything?

Query variants: "Jetson Orin Nano 8GB shared budget DINOv2 LightGlue", "JetPack 6.2 base RAM usage", "FAISS pinned memory Jetson".
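W13.a gets easier to argue once the budget exists as a checked artifact instead of prose. A strawman ledger; every figure below is a placeholder assumption, to be overwritten with measured RSS (tegrastats / smem) once the stack boots on the flashed image:

```python
# Strawman RAM ledger for W13.a -- all numbers are placeholder assumptions.
# On the Orin Nano the 8 GB LPDDR5 pool is shared: GPU allocations, OS page
# cache for the tile mmap, and the Python heap all bill the same budget.
BUDGET_MB = 8192

ledger_mb = {
    "jetpack_ubuntu_base":  1800,   # W13.b: measure on the flashed image
    "dinov2_vitb_trt":       600,   # draft's own figure
    "sp_lg_trt":             400,   # "hundreds of MB": assumed midpoint
    "faiss_ivf_1e5":         300,   # 1e5 x 512-d fp32 + IVF overhead, assumed
    "tile_mmap_resident":   1024,   # page-cache working set, not the 10 GB file
    "python_numba_mavlink":  700,   # interpreter + JIT kernels + MAVSDK/pymavlink
    "ekf_imu_ring_fdr":      100,   # state, 100 Hz ring buffer, FDR buffers
}

used = sum(ledger_mb.values())
print(f"{used} MB of {BUDGET_MB} MB -> headroom {BUDGET_MB - used} MB")
# If headroom lands under ~1.5 GB, W13.c's mitigation question is not optional.
```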
## Completeness audit

Probes (per `references/comparison-frameworks.md` decomposition probes):

| Probe | Covered by | Notes |
|---|---|---|
| **Cost of failure / blast radius** | W5 (signing), W7 (tile poisoning), W10 (false position) | three-way coverage of the safety budget |
| **Time-to-first-result** | W11 | dedicated to TTFF |
| **Operating envelope** | W6 (terrain), W12 (freshness), W13 (memory), W9 (latency) | thermal already in AC-NEW-5 |
| **Maintenance cost** | W3 (Python topology), W4 (EKF code we own) | both addressed |
| **Substitutability of components** | W1 (matcher), W2 (VPR), W3 (process topology), W4 (EKF) | each component has ≥1 alternative-path question |
| **Adversarial / red-team** | W5, W7, W10 | covered |
| **Data-distribution bias** | W8, W10.b, W12 | covered |
| **Hardware-supply-chain risk** | not covered | Orin Nano Super availability is a project-management risk, not a design risk; deferred to Plan |

## Output plan

1. Source registry → append Mode B sources to `01_source_registry.md` as IDs `S40+`.
2. Fact cards → append Mode B facts to `02_fact_cards.md` under "Mode B Findings".
3. Mode B reasoning chain → write `04_reasoning_chain_mode_b.md`.
4. Validation log → write `05_validation_log_mode_b.md`.
5. Final deliverable → write `_docs/01_solution/solution_draft02.md` using `templates/solution_draft_mode_b.md`.