gps-denied-onboard/_docs/00_research/02_fact_cards.md
Oleksandr Bezdieniezhnykh 48dd81ee0f Enhance skill discipline and clarify acceptance criteria and restrictions
Updated the meta-rule document to emphasize strict adherence to skill instructions, prohibiting unnecessary investigations or external checks. Revised acceptance criteria and restrictions to correct communication protocol details for ArduPilot and iNav, ensuring clarity on external-positioning interfaces. Adjusted autodev state to reflect ongoing research phase and updated sub-step details for improved tracking.
2026-05-07 06:09:37 +03:00


Fact Cards

Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in 01_source_registry.md. Confidence labels: High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), Low (L3/L4 inferential).

Bound to sub-questions in 00_question_decomposition.md. Many SQ6 facts also bind directly to the Project Constraint Matrix (acceptance_criteria.md / restrictions.md); per the engine's "Per-Mode API Capability Verification" rule, MAVLink/MSP messages are treated as candidate modes and are bound Pass/Fail/Verify/N/A against numbered ACs and restrictions.


SQ6 — ArduPilot Plane vs iNav external positioning

  • Statement: ArduPilot's AP_GPS_MAV driver (master) decodes MAVLINK_MSG_ID_GPS_INPUT and stores the resulting state into the GPS slot identified by gps_id. Decoded fields: lat/lon (degE7), alt (m → cm internally), hdop/vdop, velocity (vn/ve/vd, m/s), speed accuracy (m/s), horizontal/vertical accuracy (m), yaw (cdeg, 0 sentinel = "not provided"). Honors ignore_flags for ALT/HDOP/VDOP/VEL_HORIZ/VEL_VERT/SPEED_ACCURACY/HORIZONTAL_ACCURACY/VERTICAL_ACCURACY. Applies jitter-corrected timestamping only when fix_type ≥ 3 and time_week > 0.
  • Source: Source #4 (AP_GPS_MAV.cpp master), Source #1 (Plane Non-GPS Navigation docs)
  • Phase: Phase 2
  • Target Audience: ArduPilot Plane operators / developers
  • Confidence:
  • Related Dimension: C8 (FC adapter), C5 (estimator covariance contract)
  • Fit Impact: supports selection — ArduPilot side of AC-4.3 is satisfied by GPS_INPUT as the primary external-positioning message; covariance fields (horiz_accuracy, vert_accuracy, speed_accuracy) are wired through.
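The field handling in this card can be sketched on the companion side as follows. This is a minimal illustration, not ArduPilot or pymavlink code: `build_gps_input` and the returned dict layout are hypothetical names introduced here. The flag bit values and units are taken from the GPS_INPUT definition in MAVLink's common.xml (alt and accuracies in m, yaw in cdeg with 0 meaning "not provided").

```python
from enum import IntFlag
from typing import Optional


class IgnoreFlag(IntFlag):
    """GPS_INPUT_IGNORE_FLAGS bit values as defined in MAVLink common.xml."""
    ALT = 1
    HDOP = 2
    VDOP = 4
    VEL_HORIZ = 8
    VEL_VERT = 16
    SPEED_ACCURACY = 32
    HORIZONTAL_ACCURACY = 64
    VERTICAL_ACCURACY = 128


def build_gps_input(lat_deg: float, lon_deg: float, fix_type: int,
                    alt_m: Optional[float] = None,
                    horiz_accuracy_m: Optional[float] = None,
                    vert_accuracy_m: Optional[float] = None,
                    yaw_deg: Optional[float] = None) -> dict:
    """Hypothetical helper: assemble GPS_INPUT fields and flag the absent ones."""
    # This sketch never supplies DOP or velocity, so those are always ignored.
    flags = (IgnoreFlag.HDOP | IgnoreFlag.VDOP | IgnoreFlag.VEL_HORIZ
             | IgnoreFlag.VEL_VERT | IgnoreFlag.SPEED_ACCURACY)
    if alt_m is None:
        flags |= IgnoreFlag.ALT
    if horiz_accuracy_m is None:
        flags |= IgnoreFlag.HORIZONTAL_ACCURACY
    if vert_accuracy_m is None:
        flags |= IgnoreFlag.VERTICAL_ACCURACY
    if yaw_deg is None:
        yaw_cdeg = 0  # 0 is the "not provided" sentinel
    else:
        # True north must be sent as 36000 because 0 means "absent".
        yaw_cdeg = int(round((yaw_deg % 360.0) * 100)) or 36000
    return {
        "lat": int(round(lat_deg * 1e7)),  # degE7
        "lon": int(round(lon_deg * 1e7)),  # degE7
        "alt": 0.0 if alt_m is None else alt_m,  # m
        "fix_type": fix_type,
        "horiz_accuracy": 0.0 if horiz_accuracy_m is None else horiz_accuracy_m,
        "vert_accuracy": 0.0 if vert_accuracy_m is None else vert_accuracy_m,
        "yaw": yaw_cdeg,
        "ignore_flags": int(flags),
    }
```

Deriving `ignore_flags` from which arguments are `None` keeps the flag word and the payload from drifting apart, which matters because a zero in an un-ignored accuracy field would read as a perfect fix.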

Fact #2 — ArduPilot's covariance honesty (AC-NEW-4) is enforced via the horiz_accuracy field of GPS_INPUT

  • Statement: When GPS_INPUT_IGNORE_FLAG_HORIZONTAL_ACCURACY is unset, AP_GPS stores packet.horiz_accuracy into state.horizontal_accuracy and sets state.have_horizontal_accuracy = true. EKF3's quality chain consumes this via (a) ground-stationary 3 m drift check (_gpsCheckScaler-modulated), (b) innovation gating (POS_I_GATE/VEL_I_GATE), (c) soft de-weighting via EK3_GLITCH_RADIUS (PR #24135). Under-reporting horiz_accuracy defeats these gates — exactly the AC-NEW-4 risk the project flagged.
  • Source: Source #4, Source #23 (PR #24135), Source #24 (AP_NavEKF3 master)
  • Phase: Phase 2
  • Target Audience: System designers writing the C5 estimator → C8 adapter
  • Confidence: High (source code + L1 docs); ⚠️ for the precise innovation-gate mechanics (deferred to design-phase SITL tuning)
  • Related Dimension: C5 covariance, AC-NEW-4
  • Fit Impact: architectural constraint — the C5 estimator MUST publish honest horiz_accuracy (not optimistic) for AP's EKF3 quality chain to function. Aligns directly with AC-1.4 / AC-NEW-4.
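One conservative way to turn a 2×2 north/east position covariance into a single horiz_accuracy value is to report the radius of the circle that encloses the 95% error ellipse. The sketch below assumes that reading. The function name is hypothetical, and whether EKF3 is best served by a 95% radius or a 1-sigma value is not established by the sources above, so the scale factor should be treated as a tuning assumption for SITL.

```python
import math

# 95% quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_2DOF_95 = 5.991


def horiz_accuracy_95(p_nn: float, p_ne: float, p_ee: float) -> float:
    """Radius (m) of the circle enclosing the 95% ellipse of a N/E covariance (m^2)."""
    mean = 0.5 * (p_nn + p_ee)
    half_diff = 0.5 * (p_nn - p_ee)
    # Largest eigenvalue of the symmetric matrix [[p_nn, p_ne], [p_ne, p_ee]].
    lam_max = mean + math.hypot(half_diff, p_ne)
    return math.sqrt(CHI2_2DOF_95 * lam_max)
```

Using the largest eigenvalue never under-reports along the worst axis, which is the failure mode AC-NEW-4 is concerned with. The cost is over-reporting along the better axis.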

Fact #3 — ArduPilot supports runtime EKF source-set switching from companion via MAV_CMD_SET_EKF_SOURCE_SET

  • Statement: EKF3 supports up to three source sets (EK3_SRC1..3_*). A companion can request a switch by sending MAV_CMD_SET_EKF_SOURCE_SET. Alternative paths: RC aux-switch option 90 ("EKF Pos Source"), Lua scripts (e.g., ahrs-source.lua). Caveat from L1 docs: "no GCSs are currently known to implement this" — companion-driven switching works at the firmware level but is not exposed in stock GCS UIs.
  • Source: Source #2, Source #3
  • Phase: Phase 2
  • Target Audience: System designers handling AC-NEW-2 spoof-promotion path on ArduPilot
  • Confidence:
  • Related Dimension: C8 + AC-NEW-2
  • Fit Impact: supports selection — AP allows the project to model two source sets (set 1 = real GPS, set 2 = onboard GPS_INPUT) and switch automatically. Keeps companion lightweight; switching does not require the companion to suppress real-GPS itself.
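A companion-side request for a source-set switch could be assembled as below. This is a sketch only: `ekf_source_set_command` and the dict layout are hypothetical, and the command id 42007 with param1 = source set (1 to 3) follows the ArduPilot MAVLink dialect definition. With pymavlink the same values would typically be passed to `command_long_send`, but that wiring is not shown and should be checked against the installed dialect.

```python
MAV_CMD_SET_EKF_SOURCE_SET = 42007  # ArduPilot dialect command id


def ekf_source_set_command(source_set: int,
                           target_system: int = 1,
                           target_component: int = 1) -> dict:
    """Hypothetical helper: COMMAND_LONG fields requesting an EKF3 source set."""
    if source_set not in (1, 2, 3):
        raise ValueError("EKF3 supports source sets 1..3")
    fields = {
        "target_system": target_system,
        "target_component": target_component,
        "command": MAV_CMD_SET_EKF_SOURCE_SET,
        "confirmation": 0,
        "param1": float(source_set),
    }
    # The remaining COMMAND_LONG parameters are unused by this command.
    for i in range(2, 8):
        fields[f"param{i}"] = 0.0
    return fields
```

In the two-set model from this card, `ekf_source_set_command(2)` would request the onboard GPS_INPUT set and `ekf_source_set_command(1)` would return to real GPS.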

Fact #4 — ArduPilot ODOMETRY-velocity-only fusion is currently NOT supported (open enhancement)

  • Statement: Issue #23485 confirms current limitation: feeding ODOMETRY without position causes EKF position-estimate timeout / failsafe. Implication: the project's visual_propagated mode (VO drift between satellite anchors, no global position) cannot be expressed as ODOMETRY-velocity-only on current AP — must be sent as a full GPS_INPUT with covariance widened to reflect drift uncertainty.
  • Source: Source #8
  • Phase: Phase 2
  • Target Audience: System designers
  • Confidence: (open enhancement, open as of accessed date)
  • Related Dimension: C5 + C8 + AC-1.3 (visual_propagated label) + AC-1.4 (covariance ellipse)
  • Fit Impact: architectural constraint. The visual_propagated and dead_reckoned labels both ride GPS_INPUT with growing horiz_accuracy, NOT a separate ODOMETRY channel. Single-message contract = simpler. AC-NEW-8 thresholds (horiz_accuracy = 999.0 for "no fix") map directly.
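The "growing horiz_accuracy" contract can be illustrated with a small mapping from drift to GPS_INPUT fields. This is a sketch under stated assumptions: the linear drift model and its 2% default are hypothetical placeholders, the 100 m degrade threshold is taken from the AC-NEW-8 wording quoted later in this document, and the function names are invented for illustration.

```python
NO_FIX_ACCURACY_M = 999.0      # "no fix" sentinel used by AC-NEW-8
DEGRADED_THRESHOLD_M = 100.0   # above this, report a 2D fix or worse


def propagated_accuracy(anchor_accuracy_m: float,
                        distance_since_anchor_m: float,
                        drift_fraction: float = 0.02) -> float:
    """Widen the last anchored accuracy by an assumed linear VO drift."""
    grown = anchor_accuracy_m + drift_fraction * distance_since_anchor_m
    return min(grown, NO_FIX_ACCURACY_M)


def fix_type_for(horiz_accuracy_m: float) -> int:
    """Map reported accuracy to a MAVLink GPS_FIX_TYPE value."""
    if horiz_accuracy_m >= NO_FIX_ACCURACY_M:
        return 1  # GPS_FIX_TYPE_NO_FIX
    if horiz_accuracy_m > DEGRADED_THRESHOLD_M:
        return 2  # GPS_FIX_TYPE_2D_FIX
    return 3      # GPS_FIX_TYPE_3D_FIX
```

For example, a 10 m anchor followed by 1 km of visual propagation gives roughly 30 m under the assumed 2% drift, which still maps to a 3D fix.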

Fact #5 — iNav firmware (master, post-9.0) has NO inbound MAVLink handler for any external-positioning message

  • Statement: Authoritative inbound switch in src/main/telemetry/mavlink.c::processMAVLinkIncomingTelemetry (master) handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub reply), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only MAV_CMD_DO_REPOSITION), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. No GPS_INPUT, VISION_POSITION_ESTIMATE, ODOMETRY, GLOBAL_POSITION_INT, or GPS_RAW_INT are accepted as inputs. Wiki page (Source #10) confirms: "Limited command support: Commands that are not implemented are ignored."
  • Source: Source #9 (master code), Source #10 (wiki, edited 2025-12-11)
  • Phase: Phase 2
  • Target Audience: System designers + AC-4.3 author
  • Confidence:
  • Related Dimension: C8, AC-4.3
  • Fit Impact: DISQUALIFIES the literal AC-4.3 wording ("the standard external-positioning message type(s) accepted by ArduPilot AND iNav"). No single MAVLink external-positioning message is accepted by both FCs. Project must adopt a per-FC adapter design and AC-4.3 must be revised to acknowledge two transports.

Fact #6 — iNav accepts external GPS injection via two MSP paths; MSP2_SENSOR_GPS is the covariance-rich path

  • Statement: MSP_SET_RAW_GPS (201) (legacy MSP1, 14 bytes): fixType, numSat, lat, lon, alt (m, internal cm), speed (cm/s). No covariance, no per-axis velocity, no yaw. MSP2_SENSOR_GPS (7939, MSPv2 sensor plugin): instance, gpsWeek, msTOW, fixType, satellitesInView, hPosAccuracy (mm), vPosAccuracy (mm), hVelAccuracy (cm/s), hdop, lat, lon, mslAltitude (cm), nedVelNorth/East/Down (cm/s), groundCourse (deg×100, i.e. cdeg), trueYaw (deg×100, i.e. cdeg), date+time. Routes through mspGPSReceiveNewData() via GPS_PROVIDER_MSP. Requires build flag USE_GPS_PROTO_MSP, which is enabled by default in iNav's target/common.h, so stock firmware reaches this path.
  • Source: Source #12 (MSP message reference, master), Source #13 (target/common.h master + gps.c provider table)
  • Phase: Phase 2
  • Target Audience: System designers (C8 adapter, MSP transport)
  • Confidence:
  • Related Dimension: C8, C5 covariance contract
  • Fit Impact: supports selection of MSP2_SENSOR_GPS for the iNav adapter. Covariance fields (hPosAccuracy, vPosAccuracy, hVelAccuracy) align semantically with GPS_INPUT.horiz_accuracy / vert_accuracy / speed_accuracy, but unit conversions differ (mm vs m). The C8 adapter must therefore be FC-aware, not protocol-monomorphic.
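The FC-aware conversion this card calls for can be sketched as one internal accuracy record rendered into two wire representations. The class and function names are hypothetical. The target units (m and m/s for GPS_INPUT; mm and cm/s for MSP2_SENSOR_GPS) are the ones stated in this card, and field widths and saturation behaviour are deliberately left out because they still need to be checked against the MSP message reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accuracy:
    """Estimator output in SI units, independent of the flight controller."""
    horiz_m: float
    vert_m: float
    speed_mps: float


def to_gps_input(acc: Accuracy) -> dict:
    """ArduPilot path: GPS_INPUT accuracies stay in metres and m/s."""
    return {
        "horiz_accuracy": acc.horiz_m,
        "vert_accuracy": acc.vert_m,
        "speed_accuracy": acc.speed_mps,
    }


def to_msp2_sensor_gps(acc: Accuracy) -> dict:
    """iNav path: integer mm for position accuracy, cm/s for velocity accuracy."""
    return {
        "hPosAccuracy": int(round(acc.horiz_m * 1000)),
        "vPosAccuracy": int(round(acc.vert_m * 1000)),
        "hVelAccuracy": int(round(acc.speed_mps * 100)),
    }
```

Keeping the estimator output in SI units and converting only at the adapter boundary means a unit mistake can occur in exactly one place per FC.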

Fact #7 — iNav does NOT support dual-GPS arbitration; companion must be the SOLE GPS source

  • Statement: Issue #10141 is an open feature request for dual-GPS support. Current iNav (master incl. 9.0.x) has single-GPS architecture with one UART selected as the GPS port. There is no primary/secondary failover and no per-instance arbitration in the nav stack.
  • Source: Source #14
  • Phase: Phase 2
  • Target Audience: System designers (architecture)
  • Confidence:
  • Related Dimension: C8, C5, AC-NEW-2 (spoof promotion)
  • Fit Impact: architectural constraint — on iNav, real GPS receivers must NOT be wired directly to the FC. Real GPS goes to the companion; the companion fuses (or rejects) it and emits the single iNav-facing feed via MSP2_SENSOR_GPS (or via a UBX-emulation UART). AC-NEW-2 latency on iNav = companion's internal reaction time only; iNav does not participate in source switching at all.

Fact #8 — iNav explicitly does NOT validate GPS for spoofing; anti-spoofing is fully the companion's responsibility

  • Statement: iNav's docs/GPS_fix_estimation.md states verbatim: "Not a solution for GPS spoofing (GPS output is not validated in INAV)." Combined with Fact #7, the architectural conclusion on iNav: companion = anti-spoofing oracle + nav-camera estimator + IMU-propagation source, all collapsed into the single MSP2_SENSOR_GPS feed.
  • Source: Source #15
  • Phase: Phase 2
  • Target Audience: System designers; AC-NEW-2 / AC-3.5 / AC-NEW-8 owners
  • Confidence:
  • Related Dimension: AC-NEW-2, AC-3.5, AC-NEW-8
  • Fit Impact: supports selection of "companion as iNav's only GPS"; disqualifies any architecture that relies on iNav-side spoof detection for AC-NEW-2 reaction.

Fact #9 — iNav dead-reckoning has documented stability bugs under intermittent feeds; AC-NEW-8 must avoid letting iNav enter dead-reckoning

  • Statement: Issue #10588 documents porpoising and motor-burst behaviour during intermittent GPS outages on iNav fixed-wing dead-reckoning. The community recommendation captured in the issue: "GPS should be rejected if providing erroneous coordinates rather than no fix." inav_allow_dead_reckoning (default OFF) and inav_allow_gps_fix_estimation (default OFF) are both fixed-state booleans — entering dead-reckoning mid-flight is a discrete transition, not a smooth degrade.
  • Source: Source #15, Source #16 (Settings.md), Source #17 (#10588)
  • Phase: Phase 2
  • Target Audience: System designers; AC-NEW-8 owner
  • Confidence: High for setting names; ⚠️ for severity of stability bug (single open issue)
  • Related Dimension: AC-NEW-8, AC-3.5, C8
  • Fit Impact: architectural constraint — on iNav, the AC-NEW-8 path must keep emitting MSP2_SENSOR_GPS with growing hPosAccuracy rather than letting the feed drop and iNav switch to dead-reckoning. The "no fix" semantics on iNav must be expressed via fixType field of MSP2_SENSOR_GPS (not by silence). The horiz/vert accuracy fields are the only signal available; iNav has no equivalent of the AP horiz_accuracy = 999.0 "no fix" sentinel — must verify which fixType enum values iNav treats as no-fix.

Fact #10 — iNav supports UBX-only over UART (NMEA dropped in 7.0); UBX emulation is a viable third transport

  • Statement: iNav 7.0 removed NMEA support. The only remaining serial GPS protocol is u-blox UBX, and iNav 9.0+ requires UBX protocol version ≥ 15.00. Recommended physical receivers: u-blox M8/M9/M10. Companion can implement a UBX-emulation writer on the iNav GPS UART (NAV-PVT mandatory; NAV-DOP optional). UBX carries hAcc/vAcc/headAcc/velocity components, so covariance honesty is preserved.
  • Source: Source #11 (iNav GPS-and-Compass-setup wiki)
  • Phase: Phase 2
  • Target Audience: System designers (transport-choice)
  • Confidence: High for UBX-only; ⚠️ for "minimum NAV-* set" — the canonical u-blox protocol spec (Source filed in agent-tools as fd8513f8-...txt) plus iNav's gps_ublox.c drive the precise message set; this is a follow-up search before final selection.
  • Related Dimension: C8 transport choice
  • Fit Impact: alternate candidate, NOT YET SELECTED — UBX path bypasses MSP queueing/arbitration concerns and treats the companion as a normal GPS to iNav. Trade-off: implementation cost (UBX writer + correct ACK behaviour) vs. MSP path (already-designed wire format, but iNav-specific).
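To give a sense of the UBX writer's implementation cost, the framing layer is small. The sketch below shows only the generic UBX frame (sync bytes, class, id, little-endian length, payload, 8-bit Fletcher checksum) as described in the u-blox protocol specification. The NAV-PVT payload layout itself is not reproduced, and the function names are hypothetical.

```python
import struct

UBX_SYNC = b"\xb5\x62"
NAV_CLASS = 0x01
NAV_PVT_ID = 0x07
NAV_PVT_LEN = 92  # NAV-PVT payload length in bytes


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class: int, msg_id: int, payload: bytes) -> bytes:
    """Wrap a payload in a complete UBX frame."""
    body = struct.pack("<BBH", msg_class, msg_id, len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)
```

A zero-length NAV-PVT frame built this way is `b5 62 01 07 00 00 08 19`. The remaining work for the emulation path is filling the 92-byte payload correctly and answering whatever configuration and ACK traffic iNav's gps_ublox.c expects.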

SQ6 — Conclusions (working summary, will be re-checked at Step 7.5)

Per-FC adapter design is unavoidable (single-message AC-4.3 wording is unsatisfiable)

| FC | Inbound external-positioning transport | Message | Covariance fields | Per-axis velocity | Yaw | Source-switching from companion |
|---|---|---|---|---|---|---|
| ArduPilot Plane | MAVLink (TELEM/USB/UDP serial) | GPS_INPUT (id 232) — primary | horiz_accuracy (m), vert_accuracy (m), speed_accuracy (m/s) | vn, ve, vd (m/s) | yaw cdeg, 0 = not provided | MAV_CMD_SET_EKF_SOURCE_SET (FW supports; stock GCS UIs do not — companion-driven OK) |
| iNav | MSP2 (UART/USB) | MSP2_SENSOR_GPS (id 7939) — primary candidate | hPosAccuracy mm, vPosAccuracy mm, hVelAccuracy cm/s | nedVelNorth/East/Down cm/s | trueYaw deg×100 (cdeg) | N/A — iNav has single-GPS arch; companion = sole GPS source |
| iNav alt 1 | MSP1 | MSP_SET_RAW_GPS (id 201) — rejected for production | none | none | none | N/A |
| iNav alt 2 | UART | UBX emulation (NAV-PVT etc.) — alternate candidate, requires NAV-* subset verification | UBX hAcc/vAcc/headAcc mm/cm/scale | NED in NAV-PVT | yes | N/A |

Selection (preliminary, pending Step 7.5 component-fit gate):

  • AP path: GPS_INPUT — Selected (lead).
  • iNav path: MSP2_SENSOR_GPS — Selected (lead). UBX-emulation kept as fallback if MSP2_SENSOR_GPS proves rate-limited or quality-flag-lossy.

AC / Restriction binding (per-mode, Per-Mode API Capability Verification rule)

| Numbered AC / Restriction | AP GPS_INPUT | iNav MSP2_SENSOR_GPS | iNav MSP_SET_RAW_GPS |
|---|---|---|---|
| AC-1.4 (95% cov + source label {satellite_anchored, visual_propagated, dead_reckoned}) | Pass (horiz_accuracy carries 95% covariance proxy; source label is companion-side metadata, not in MAVLink — emit via STATUSTEXT/NAMED_VALUE_FLOAT) | Pass (hPosAccuracy = covariance proxy; same off-band source-label channel) | Fail (no covariance field → cannot publish 95% ellipse) |
| AC-NEW-4 (false-position safety budget; covariance honesty) | Pass (de-weighted via EK3_GLITCH_RADIUS if covariance is honest) | Verify (need to confirm iNav nav-stack actually uses hPosAccuracy for outlier handling — pre-Step-7.5 follow-up) | Fail |
| AC-NEW-2 (<3 s p95 spoof promotion) | Verify via SITL (MAV_CMD_SET_EKF_SOURCE_SET round-trip latency under load) | Pass by architecture (companion is sole GPS, no FC-side switch needed) | Pass-by-arch but Fails AC-1.4 |
| AC-NEW-8 (visual-blackout + spoofed GPS failsafe; covariance growth + degraded fix levels) | Pass (fix_type 0/1/2 + horiz_accuracy=999.0 documented sentinel maps to AC-NEW-8 thresholds) | Verify (iNav's fixType enum mapping for "no fix" — pre-Step-7.5 follow-up) | Fail (no graceful degrade signal) |
| AC-3.5 (label switch within ≤1 frame OR ≤400 ms; reject spoofed GPS as input) | Pass by architecture (EKF source switch + STATUSTEXT) | Pass by architecture (companion suppresses spoofed-GPS contribution upstream) | Pass-by-arch but Fails AC-1.4 |
| AC-4.3 (FC accepts the chosen messages) | Pass | Pass (default build, USE_GPS_PROTO_MSP on) | Pass but Fails AC-1.4 — discard |
| Restriction "Supported FCs: ArduPilot, iNav (both via standard MAVLink)" | Pass | Fail of "via standard MAVLink" — restriction's literal wording is incorrect because iNav has no inbound MAVLink external-positioning. The restriction must be revised to "ArduPilot via MAVLink GPS_INPUT; iNav via MSP2_SENSOR_GPS". | n/a |

Required AC / Restrictions edits flagged for user review

  1. AC-4.3 — current text says "the standard external-positioning message type(s) accepted by ArduPilot and iNav". Reality: no single message type is accepted by both. Proposed revision (outcome-shaped, IEEE-830-style): "WGS84 coordinates are delivered to each supported FC via that FC's documented external-positioning interface — MAVLink GPS_INPUT for ArduPilot Plane, MSP2 MSP2_SENSOR_GPS for iNav. Honest covariance is carried in the field each FC uses for outlier rejection (under-reported covariance is a defect — see AC-NEW-4). Source-label semantics per AC-1.4 are emitted out-of-band (FC-appropriate STATUSTEXT / NAMED_VALUE_FLOAT / equivalent)."
  2. Restriction "Communication protocol (pinned): MAVLink for both FC and GCS" — incorrect for iNav. Proposed revision: "Communication protocol: MAVLink for ArduPilot Plane and for QGroundControl GCS; MSP2 for iNav (UART or USB transport). MAVLink remains the GCS-facing protocol for both FCs." (iNav still emits MAVLink telemetry outbound to QGC; this is preserved.)
  3. AC-NEW-2 — keep numerical budget (<3 s p95) but split per-FC validation: ArduPilot validation = SITL round-trip of MAV_CMD_SET_EKF_SOURCE_SET from companion under spoof injection; iNav validation = companion-internal reaction time (companion-only metric — iNav doesn't participate).
  4. AC-NEW-8 — language "fix-quality 2D fix or worse when covariance > 100 m" maps to GPS_INPUT.fix_type for AP. iNav's fixType enum mapping (per gpsFixType_e in iNav's enums-reference) must be confirmed at design time before this AC is testable on iNav.
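The per-FC AC-NEW-2 validation in item 3 reduces to the same check on different latency samples: SITL round-trip times for ArduPilot, companion-internal reaction times for iNav. A minimal sketch of that check follows. The nearest-rank percentile definition and the function names are assumptions, since the AC does not say which percentile estimator to use.

```python
from typing import Sequence


def p95(samples: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n) to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_ac_new_2(latencies_s: Sequence[float], budget_s: float = 3.0) -> bool:
    """True when the p95 spoof-promotion latency is strictly under the budget."""
    return p95(latencies_s) < budget_s
```

With 20 samples the 19th smallest value decides the outcome, so one slow outlier is tolerated and two are not.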

Open follow-up probes (deferred to SQ8 + design phase, NOT blocking SQ6 closure)

  • (SQ8) Confirm the precise MAVLink message + field set ArduPilot exposes for spoofing/jamming integrity reports (PR #2110 merged, but GPS_RAW_INT in current published common.xml shows no spoofing bits — likely lives in a sibling message such as GPS_INTEGRITY). This is the FC→companion direction needed for AC-NEW-2's input side and AC-3.5's spoofing detection.
  • (SQ8) UBX-emulation minimum NAV-* subset for iNav 9.0 (UBX ≥ 15.00). Authoritative inputs: u-blox protocol spec (cached) + iNav gps_ublox.c (cached). Output a "minimum companion-side UBX writer" definition.
  • (design) SITL parameter sets for both FCs for AC-NEW-2 / AC-NEW-8 validation. Out of research scope.
  • (design) Verify iNav nav-stack consumption of MSP2_SENSOR_GPS.hPosAccuracy for outlier handling (read src/main/io/gps_msp.c / mspGPSReceiveNewData in design phase, not research phase).

Boundary check: this SQ6 is saturated for the architectural decision

Saturation signals observed: ArduPilot side covered by L1 docs + L1 source code; iNav side covered by L1 source code (master) + L1 wiki (edited 2025-12-11) + L1 release notes (8.0/9.0). Three independent rounds of search yielded the same architectural conclusion (no inbound external-positioning MAVLink on iNav). Last queries returned no novel facts. Per references/source-tiering.md "Search saturation rule" → SQ6 is closed pending the SQ8 follow-up probes above; user decision required on the AC/restriction edits before further architectural work.


SQ1 — Existing / competitor GPS-denied UAV navigation systems

Fact #11 — Twist Robotics OSCAR is a deployed Ukrainian peer system in the same architectural class as this project

  • Statement: Twist Robotics (Ukraine) has a fielded camera + map-matching navigation module called OSCAR (Optical System of Coordinates with Automatic Relocalisation). The vendor states the system "captures the terrain, identifies landmarks, compares them with a map, determines coordinates, and transmits them to the autopilot as a reliable GPS signal" — the same five-stage architecture this project is building. Vendor-stated specs: ≤20 m accuracy without cumulative error, day/night/fog operation, and operational deployment of "more than 500,000 km across 25,000 combat missions over 24 months". Hardware includes active cooling, indicating a non-trivial onboard compute (likely Jetson-class). No public independent benchmark of the 20 m number.
  • Source: Source #25, Source #26
  • Phase: Phase 2
  • Target Audience: System architects + AC owners (existence-of-peer evidence, not implementation guide)
  • Confidence: High for "deployed at scale on Ukrainian combat platforms"; ⚠️ for "20 m accuracy" (vendor self-report); Low for "fully resistant to spoofing and jamming" (claim not independently verified)
  • Related Dimension: SQ1, SQ8 (anti-spoofing claim audit), SQ9 (synthesis — ours must beat or at least match this in the operational regime)
  • Fit Impact: establishes feasibility floor — a Ukrainian peer is operating a similar architecture against the same threat environment our system targets. Project framing must explicitly differentiate (e.g., 1 km AGL vs unspecified OSCAR altitude; 8 h endurance vs unspecified OSCAR endurance; AC-NEW-4 honest covariance contract vs OSCAR's unspecified covariance reporting).

Fact #12 — Auterion Artemis is a production-shipping fixed-wing one-way attack drone with Ukraine-validated GPS-denied navigation, defining the production benchmark for this class

  • Statement: Auterion completed the US Defense Innovation Unit Artemis program in October 2025, delivering a Shahed-class deep-strike drone with up to 1,000-mile range and up to 40 kg warhead, running on Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. Manufacturing is being established in US, UA, and DE; Auterion is offering the system to the US Department of War and allied nations.
  • Source: Source #31; Source #32 confirms Skynode S sibling architecture (NPU-equipped companion).
  • Phase: Phase 2
  • Target Audience: System architects (production-pattern reference)
  • Confidence:
  • Related Dimension: SQ1 (closest commercial production peer), SQ9 (architecture template)
  • Fit Impact: establishes production reference architecture — companion-class autopilot + visual navigation + terminal guidance is shipping at production scale to a US defense customer. Implication: building a per-FC adapter (project decision in SQ6) is consistent with what production stacks already do; integrating against the Artemis architecture is realistic; competing on price + Ukraine-specific operational tuning + AC-NEW-4 honest-covariance contract is a viable differentiation.

Fact #13 — Vantor Raptor is a production COTS visual-GPS-replacement software suite, demonstrating that "branded sat-tile basemap + on-drone vision software" is a viable commercial pattern

  • Statement: Vantor Raptor product family (Guide / Sync / Ace) provides vision-based GPS replacement using the drone's existing camera plus Vantor's "100 million-plus sq km of highly accurate 3D terrain data" (Vivid Terrain, vendor-stated 3 m accuracy). Vendor-demonstrated absolute accuracy: <7 m in all dimensions for aerial position (Guide), <3 m for ground coordinate extraction (Sync, Ace). Works at night and at low altitudes. Platform-agnostic, deployable on commodity hardware, integrates with existing onboard cameras. Inertial Labs has published a VINS-integrated Raptor Guide white paper. Recent partnerships: Niantic Spatial (Dec 2025) for unified air-to-ground positioning in GPS-denied areas; Maxar partnership with AIDC (Sep 2025) for Taiwan UAV resilience against GPS interference.
  • Source: Source #30
  • Phase: Phase 2
  • Target Audience: Architecture / business decision-makers (build-vs-buy framing)
  • Confidence: High for product existence + claimed accuracy bounds (vendor primary); ⚠️ for whether Vantor's commercial accuracy figures hold under the project's specific Ukrainian-steppe + active-conflict-tile-staleness conditions
  • Related Dimension: SQ1 (commercial), C2/C3 (commercial alternatives to building ourselves), SQ8 (basemap as a service vs offline cache)
  • Fit Impact: build-vs-buy lens — Raptor Guide's <7 m claim is better than the project's AC-1.1 budget (≤80 m / 95% under AC-1.1.1), so it's not a disqualifier on accuracy. Reasons we still build vs buy: (a) Vantor is a US vendor; export / dual-use licensing into the Ukrainian battlefield is uncertain; (b) restrictions specify offline cache from the project's own Azaion Suite Satellite Service (AC-2.x), not Vantor's Vivid Terrain — replacing the basemap is non-negotiable; (c) covariance honesty contract (AC-NEW-4) and source-label contract (AC-1.4) are project-specific and may not be exposed by Vantor's API. Outcome: keep Raptor as a competitive comparator in solution_draft01, NOT as a candidate component to integrate.

Fact #14 — snktshrma/ngps_flight (NGPS — ArduPilot GSoC 2024) is the closest open-source pipeline match to this project's exact C1+C2+C3+C5+C8 stack

  • Statement: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: ap_ngps_ros2 (visual geo-localization at 12 Hz by matching live camera frames to georeferenced satellite imagery using LightGlue + SuperPoint, deep-learning-based feature matching), ap_ukf (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates), ap_vips (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF (per related ArduPilot issue #23471, this is via VISION_POSITION_ESTIMATE requiring EKF source-set 2/3 with EK3_SRC*_POSXY=Vision). Project is published under ArduPilot's GSoC 2024 program. Sibling ap_nongps is an earlier OpenCV-based prototype.
  • Source: Source #33
  • Phase: Phase 2
  • Target Audience: Implementer / Engineer
  • Confidence: High for project existence, component breakdown, and matcher choice (LightGlue+SuperPoint); ⚠️ for runtime behaviour under our exact constraints (Jetson Orin Nano, 1 km AGL, 17 m/s, 3 fps); Low for production hardening / covariance honesty / spoof-defence (none documented)
  • Related Dimension: SQ1 (closest open-source peer), SQ2 (canonical pipeline confirmation), SQ3+SQ4 (architectural template for component candidate matrix), SQ6 (alternate AP transport debate)
  • Fit Impact: architectural template — confirms the project's split (C1 VIO ↔ C2/C3 visual absolute ↔ C5 fusion ↔ C8 FC adapter) is canonical, not novel. Two concrete deltas:
    1. Transport choice on AP: NGPS uses VISION_POSITION_ESTIMATE. SQ6 picked GPS_INPUT because it carries horiz_accuracy directly, supports source-set switching via MAV_CMD_SET_EKF_SOURCE_SET, and avoids EKF-source-set reconfiguration. The trade-off (NGPS's path vs SQ6's pick) must be re-examined at design time before final AP-transport selection.
    2. Estimator choice: NGPS uses UKF; SQ3/SQ4 will compare UKF vs ESKF vs MSCKF vs factor-graph (GTSAM) on the same matrix.

Fact #15 — RGB satellite-image matching as a low-altitude (<25 m AGL) localization technique is unreliable per the SPRIN-D Challenge; our 1 km AGL operates in the regime where the same authors note it "works reasonably well"

  • Statement: The CTU Prague team's SPRIN-D winning paper directly states: "Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes." (referring to <25 m AGL). The paper's related-work review separately notes that "high-altitude matching... works reasonably well, but at low altitudes (25 m) the viewpoint differs drastically, making roofs, facades, and vegetation inconsistent with satellite imagery." The project operates at ≤1 km AGL — which is the high-altitude regime in the paper's terminology — making RGB sat-matching the appropriate technique class. The paper's CPU-only winning method (LiDAR heightmap-gradients + clustered particle filter) is not transferable to our hardware: our project has no LiDAR.
  • Source: Source #28
  • Phase: Phase 2
  • Target Audience: Implementer / Engineer + Domain expert
  • Confidence:
  • Related Dimension: SQ1, SQ5 (failure modes), SQ2 (canonical pipeline)
  • Fit Impact: disambiguates a potentially-disqualifying lesson — the CTU paper's "RGB sat-matching is unreliable" finding does NOT disqualify our approach because the failure was caused by low-altitude viewpoint mismatch, which our 1 km AGL regime does not have. This must be cited explicitly in solution_draft01 to pre-empt the natural objection from anyone who reads the paper. Separately, the CTU paper's specific lessons are still binding: VIO degrades catastrophically without IMU vibration isolation; magnetometer is unreliable near steel/concrete; "ability to recover from periods of high uncertainty and re-localize" matters more than instantaneous RMSE — this last lesson is a direct architectural input for AC-NEW-2 / AC-NEW-8.

Fact #16 — RTAB-Map and ORB-SLAM3 both fail beyond 1 km / above 2 m/s flight in the SPRIN-D environment; our cruise profile (≤17 m/s, kilometers between satellite anchors) explicitly excludes both as primary candidates

  • Statement: The SPRIN-D paper states: "We tested state-of-the-art visual SLAM systems such as RTAB-Map and ORB-SLAM3 in a high-fidelity simulator, and found that both performance degraded significantly in a long-range scenario (beyond 1 km), as their memory and compute demands grow with the size of the environment. Moreover, RTAB-Map was unable to maintain quality odometry in faster flight speeds (beyond 2 m/s), while ORB-SLAM3 suffered from tracking loss in textureless areas."
  • Source: Source #28
  • Phase: Phase 2
  • Target Audience: Implementer / Engineer (component selection for C1)
  • Confidence:
  • Related Dimension: SQ1, SQ3+SQ4 component C1 (VO/VIO), SQ5 (failure modes)
  • Fit Impact: prunes the C1 candidate landscape — RTAB-Map and ORB-SLAM3 should not be pursued as C1 leads. Plausible C1 leads remain: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline (KLT + RANSAC homography). NGPS (Fact #14) uses ap_vips = OpenVINS-class VIO — confirming an aligned community choice. Final C1 selection happens in SQ3+SQ4.

Fact #17 — DSMAC + TERCOM lineage: pre-cached scene matching for downward-looking navigation is a 40+ year deployed technique class with documented sub-10 m terminal accuracy

  • Statement: DSMAC (Digital Scene Matching Area Correlator) is an autonomous missile-guidance system based on area correlation of sensed downward-camera ground scenes against pre-stored reference imagery (often satellite reconnaissance). It achieves 3–10 m terminal accuracy by correlating buildings, road intersections, and distinctive terrain landmarks. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight + DSMAC for terminal guidance reduces CEP from ~30 m to "only meters". Documented combat record: 1991 Gulf War, >80% of 280 launched Tomahawks hit target. Recent miniaturisation: Destinus Ruta (300 km strike-class) is integrating UAV Navigation's (Spanish, Grupo Oesía) DSMAC-class system, validated in Ukrainian combat conditions including GNSS-denied / jamming / spoofing.
  • Source: Source #36, Source #27
  • Phase: Phase 2
  • Target Audience: Domain expert + Decision-maker
  • Confidence: High for the lineage and Tomahawk performance numbers (DTIC + open-source); ⚠️ for the Ruta-specific "DSMAC operating principle" inference (Defense Express analyst inference, not vendor disclosure)
  • Related Dimension: SQ1 (lineage), SQ8 (baseline accuracy expectations for AC-1.1.1 80 m / AC-NEW-4 false-position budget)
  • Fit Impact: establishes baseline accuracy expectations — the technique class has documented sub-10 m accuracy in the cruise-missile-terminal regime. Our budget (AC-1.1.1: <80 m at 1 km AGL with ≥0.5 m/px tiles) is loose by comparison, indicating that the AC budget is not aggressive against the technique-class baseline — it is aggressive against the Jetson Orin Nano + 8-h-continuous + 25 W envelope. Implication for AC-NEW-4: claiming P(error >500 m) <0.1% per flight is consistent with the DSMAC-lineage class; an honestly-reported failure rate at this level is realistic, not unprecedented.

Fact #18 — Hierarchical Image Matching (arXiv 2506.09748, June 2025) is a current academic SOTA pipeline for our exact problem, but uses DINOv2 — a heavyweight foundation model that must be benchmarked under our 25 W / 8 GB Jetson envelope before any selection

  • Statement: 2025 academic SOTA pipeline structure: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation); (2) Semantic-Aware and Structure-Constrained Matching Module (SASCM) using DINOv2 features + 4D correlation tensor + SoftMNN + 4D conv; (3) lightweight fine-grained matching module for pixel-level. Constructs UAV absolute visual localization without VIO/relative-localization dependence (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV dataset claims superior accuracy under cross-source and cross-temporal variation.
  • Source: Source #29
  • Phase: Phase 2
  • Target Audience: Implementer / Engineer + Domain expert
  • Confidence: High for pipeline structure and method; ⚠️ for the "superior" claim (single-paper benchmark; AerialExtreMatch evaluates 16 methods with broader rigor, so Source #34 is the better cross-method ranker); Low for Jetson-Orin-Nano runtime (no published number)
  • Related Dimension: SQ1 (academic SOTA), C2 (VPR), C3 (cross-domain registration), SQ5 (foundation-model-on-Jetson failure mode)
  • Fit Impact: academic-SOTA snapshot, candidate template — the retrieval → semantic-aware coarse → fine-grained pipeline is a candidate template for our C2+C3, but DINOv2 introduces a Jetson-deployment risk that must be quantified before commitment. Candidate-level decision: include DINOv2-based pipelines (AnyLoc, BoQ, this paper's SASCM) in the C2/C3 candidate matrix with mandatory MVE on Jetson Orin Nano under our exact frame size and 3 fps cadence. Reject DINOv2 if total inference latency cannot be brought under (400 ms - other-stages budget) at INT8 / fp16. Per Source #28 lesson, classical matchers (LightGlue+SuperPoint as in NGPS) should also be in the matrix as the "simple baseline / known-Jetson-runnable" option.

Fact #19 — AerialExtreMatch (2025) is the academic benchmark our C2+C3 candidate matrix must publish numbers against, with 32 difficulty-stratified cells exposing exactly the cross-source / cross-pitch / cross-scale failure modes our project will face

  • Statement: AerialExtreMatch publishes (a) 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); (b) ~30,000 evaluation pairs in 32 difficulty levels stratified by overlap (4 bins: <20%, 20–40%, 40–60%, >60%), pitch difference (4 bins: 50–55°, 55–60°, 60–65°, 65–70°), and scale variation (2 bins: 1–2×, >2×); (c) a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. The benchmark evaluates 16 representative detector-based and detector-free image matching methods.
  • Source: Source #34
  • Phase: Phase 2
  • Target Audience: Domain expert + Implementer
  • Confidence:
  • Related Dimension: SQ1 (academic landscape), SQ7 (datasets), C2 (VPR), C3 (cross-domain registration)
  • Fit Impact: defines the C2/C3 evaluation matrix: every C2/C3 candidate going into solution_draft01 must report numbers on AerialExtreMatch's 32 difficulty cells, with at least the high-pitch (65–70°) and high-scale (>2×) cells representing our worst-case (UAV vs satellite tile geometry mismatch + ortho-rectification residual). The dataset's real-world UAV-localization split with both UAV-orthomosaic AND satellite-map references mirrors our project's offline-cache-tile semantics directly.
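
The 32-cell count follows from the three stratification axes in the Statement (4 overlap bins × 4 pitch-difference bins × 2 scale bins). A minimal sketch of how a per-candidate results table could be keyed on those cells; the function names are hypothetical, and the benchmark's own cell indexing may differ:

```python
from itertools import product

# Bin labels taken from the Statement; the string encoding is an assumption.
OVERLAP_BINS = ["<20%", "20-40%", "40-60%", ">60%"]
PITCH_BINS_DEG = ["50-55", "55-60", "60-65", "65-70"]
SCALE_BINS = ["1-2x", ">2x"]


def difficulty_cells():
    """Enumerate every (overlap, pitch, scale) difficulty cell."""
    return list(product(OVERLAP_BINS, PITCH_BINS_DEG, SCALE_BINS))


def worst_case_cells():
    """Cells combining the highest pitch bin with the highest scale bin."""
    return [
        cell for cell in difficulty_cells()
        if cell[1] == "65-70" and cell[2] == ">2x"
    ]


if __name__ == "__main__":
    print(len(difficulty_cells()), "cells in total")
    for cell in worst_case_cells():
        print("worst-case cell:", cell)
```

Filtering on the top pitch and scale bins yields the four cells (one per overlap bin) that the Fit Impact singles out as the project's worst case.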

Fact #20 — DARPA FLA + USAF SBIR establish the US-defense-program tailwind, but do not directly validate the project's specific regime (fixed-wing, ~1 km AGL, sat-tile basemap, 8-h endurance)

  • Statement: DARPA Fast Lightweight Autonomy (FLA) program ran 2015–2018 (Phase 1 Florida 2017; Phase 2 Georgia 2018; complete). Focused on small quadcopter autonomy at ≤20 m/s through cluttered indoor/outdoor environments using onboard cameras + LIDAR + sonar + IMU, no GPS / datalink / pilot. A 2025 retrospective (arXiv 2504.08122) reviews FLA testing methodology and Phase 1 results. A 2025 USAF SBIR Phase II solicitation (Sweetspot ID 7946c818-409f-5b31-8f06-554466071d83) requests visual position and navigation capability for sUAS in GPS-denied environments, confirming that the regulatory + funding environment for this category is active as of 2025.
  • Source: Source #35
  • Phase: Phase 2
  • Target Audience: Decision-maker + Domain expert
  • Confidence:
  • Related Dimension: SQ1 (defense-program lineage)
  • Fit Impact: context only, no direct candidate gain — FLA pre-dates the project's specific regime by 8 years, focused on a different platform (multirotor) and altitude (low-altitude obstacle avoidance, not 1 km AGL nadir-camera satellite-anchor). Useful only to establish lineage and context. The USAF SBIR datapoint is more directly relevant: confirms that an active US-defense-funded need exists for sUAS visual position + navigation in GPS-denied environments — i.e., the project's market exists outside Ukraine.

SQ1 — Conclusions (working summary, will be re-checked at Step 7.5)

Existing-systems landscape (6 named-and-evidenced peer / adjacent systems)

| System | Class | Operational regime | Closest match dimension | Closest mismatch dimension | Status as evidence |
|---|---|---|---|---|---|
| Twist Robotics OSCAR (UA) | Deployed Ukrainian peer | Combat-deployed, fixed-wing-class, GPS-denied vision-nav | Same architecture, same threat environment | Altitude / endurance / FC / accuracy contract not publicly specified | Closest peer for "feasibility floor" |
| Auterion Artemis | Production COTS one-way attack drone | Shahed-class, 1000-mile range, 40 kg warhead, Ukraine-validated GPS-denied nav | Same architectural pattern (Skynode + Visual Navigation + terminal guidance) | One-way attack vs reusable; no covariance/source-label contract published | Closest production reference architecture |
| Vantor Raptor (Guide / Sync / Ace) | Production COTS software suite | Vision-based GPS replacement on existing drone camera + Vivid Terrain 3D basemap | Visual-position software pattern | Vendor-managed sat-tile basemap is not the project's Azaion Suite Satellite Service; no AC-NEW-4 / AC-1.4 contract | Closest commercial peer for "build-vs-buy" framing |
| snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024) | Open-source research prototype | LightGlue+SuperPoint+UKF+VISION_POSITION_ESTIMATE to AP | Same component split, same FC family | GSoC prototype, not production; no spoof defence; no covariance honesty | Closest open-source pipeline match: explicit architectural template |
| CTU Prague SPRIN-D winner | Academic / competition | Multirotor, ≤25 m AGL, LiDAR + heightmap gradient + particle filter on CPU | "Recover-from-uncertainty > low-instantaneous-RMSE" lesson; VIO discipline | LiDAR-required, low-altitude regime, no sat-tile basemap | Architectural-pattern reference + cautionary tale |
| Destinus Ruta + UAV Navigation | Production miniaturised cruise missile | 300 km strike, DSMAC-class, Ukraine-combat-validated | Pre-cached basemap + visual matching + autopilot ingestion | One-way attack, terminal guidance, no covariance contract | Shows DSMAC-class miniaturised into UAV tier |

Per-perspective coverage

| Perspective | Facts supporting | Saturation status |
|---|---|---|
| Implementer / Engineer | Fact #14 (NGPS), Fact #16 (SLAM failure modes), Fact #18 (DINOv2 risk) | Saturated for SQ1; deeper component-level deep-dives go to SQ3/SQ4 |
| Practitioner / Field (Ukraine) | Fact #11 (OSCAR), Source #37 (~70% UAV losses to EW), Source #27 (Ruta + UAV Navigation Ukraine combat validation) | Saturated for SQ1 |
| Domain expert / Academic | Fact #18 (Hierarchical Matching SOTA), Fact #19 (AerialExtreMatch benchmark), Fact #15 (SPRIN-D regime distinction) | Saturated for SQ1; academic SOTA benchmarking handed off to SQ3/SQ4 + SQ7 |
| Contrarian / Devil's advocate | Fact #15 (low-altitude RGB matching unreliable lesson), Fact #16 (RTAB-Map / ORB-SLAM3 disqualified), Fact #18 (DINOv2-on-Jetson risk) | Saturated for SQ1 |
| Decision-maker / Business | Fact #12 (production-ready Auterion), Fact #13 (commercial Vantor build-vs-buy framing), Fact #20 (USAF SBIR market context) | Saturated for SQ1 |

Architectural conclusions for solution_draft01

  1. Build-vs-buy stance: build. Vantor Raptor and Auterion Visual Navigation are commercially superior on hardening + integration but neither exposes the covariance honesty contract (AC-NEW-4) nor uses the project-specified Azaion Suite Satellite Service tile cache (AC-2.x); both are dual-use export risks for the Ukrainian battlefield. NGPS (Fact #14) is the open-source architectural template to learn from but is a GSoC research prototype lacking production hardening, spoof defence, and the covariance-honesty contract. Architectural conclusion: build with NGPS as the template, with project-specific contracts (AC-NEW-4, AC-1.4, AC-NEW-7) and per-FC adapter (SQ6 conclusion) layered on top.
  2. Differentiation from OSCAR (Twist Robotics) must be made explicit in solution_draft01: (a) honest covariance contract per AC-NEW-4; (b) explicit {satellite_anchored, visual_propagated, dead_reckoned} source-label contract per AC-1.4; (c) AC-NEW-7 cache-poisoning safety budget on tile write-back; (d) ArduPilot Plane + iNav both supported per project's revised AC-4.3.
  3. Pipeline canonicalness: the C1+C2+C3+C4+C5+C8 split is canonical (NGPS + the 2025 hierarchical-matching paper + SPRIN-D winner all use the same shape; only the specific algorithm choices differ). SQ2 will sanity-check this against one more pipeline-survey paper, but this is essentially a low-risk question now.
  4. Component-pruning carried into SQ3/SQ4:
    • C1: prune RTAB-Map and ORB-SLAM3 as primary candidates per Fact #16. Carry: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline.
    • C2/C3: mandatorily benchmark any DINOv2-based candidate (AnyLoc, BoQ, SASCM-style) against AerialExtreMatch at our pitch / scale / overlap regime AND against Jetson Orin Nano latency budget (per Fact #18). Maintain LightGlue+SuperPoint as the "simple-baseline / known-Jetson-runnable" option per NGPS precedent.
    • C8 transport: NGPS uses VISION_POSITION_ESTIMATE. SQ6 picked GPS_INPUT. Re-examine the trade-off in design phase, but SQ6's selection stands for the research draft.
  5. Lessons from SPRIN-D winner that must propagate to solution_draft01:
    • "Ability to recover from periods of high uncertainty and re-localize" > "low instantaneous RMSE" — directly informs AC-NEW-2 / AC-NEW-8.
    • VIO requires mechanically-decoupled IMU; this is a hardware-integration constraint, not a software issue.
    • Magnetometer is unreliable near steel/concrete; sensor fusion of heading sources is essential.
    • "No single sensor can be fully relied upon" — directly supports our IMU+camera+sat-tile multi-source posture.

Open follow-ups (deferred to later sub-questions)

  • (SQ8) Independent verification of OSCAR's "fully resistant to spoofing/jamming" claim — if available. Otherwise, Twist Robotics's claim remains a vendor-only signal.
  • (SQ8) Vantor Raptor and Auterion Visual Navigation's covariance reporting behaviour — for benchmarking AC-NEW-4 compliance.
  • (SQ3+SQ4 / C2) AnyLoc / BoQ / DINOv2-VLAD / MixVPR / EigenPlaces / NetVLAD on AerialExtreMatch for cross-source aerial — already in C2 search plan; SQ1 just confirmed they're the right candidate set.
  • (SQ3+SQ4 / C3) LightGlue / LoFTR / RoMa / DKM / MASt3R + classical SIFT+RANSAC + XFeat on AerialExtreMatch — already in C3 search plan; SQ1 confirms shape.
  • (SQ7) AerialExtreMatch + AerialVL + CS-UAV + RealUAV/SAVL + UAV-VisLoc as the dataset shortlist for our cross-validation — confirmed by SQ1 hits.

Boundary check: SQ1 is saturated

Saturation signals observed: all 5 perspectives saturated, ≥3 high-confidence facts per perspective, and the last 3 search rounds (Anduril Iris detail probe, ArduPilot prior-art probe, DSMAC lineage probe) yielded only one new substantive datapoint (NGPS) while confirming already-known patterns. No unresolved contradictions. Per references/source-tiering.md "Search saturation rule" → SQ1 is closed.


SQ2 — Canonical pipeline decomposition (sanity-check)

Fact #21 — The canonical pipeline for offline-cache visual geo-localization is two-stage: global VPR retrieval, then local alignment (image matching → pose)

  • Statement: Source #38 (Skoltech aerial-VPR survey) defines the field's canonical pipeline verbatim: "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: global localization (or Visual Place Recognition, VPR) and local alignment. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Source #42 (NUDT 2026 absolute-VL survey) names the same shape "retrieval → matching → pose-estimation hierarchical framework" and explicitly contrasts it against three rejected alternatives: (a) relative-only VIO/SLAM (cumulative error), (b) end-to-end direct localization (poor generalization), (c) map-free localization (scene-dependent). Source #39 (U.Maine cross-view survey) traces the same lineage from 2003 pixel-wise template-matching → 2013 hand-engineered features → 2017 CNN/triplet-loss → 2018+ Siamese/GAN → 2022+ Transformer → 2023 DINOv2-class. Source #41 (AnyVisLoc benchmark) implements this hierarchy as: image retrieval (rough) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC, with Top-N re-rank by inlier count as a critical fourth stage between matching and pose.
  • Source: Source #38, Source #39, Source #41, Source #42
  • Phase: Phase 2
  • Target Audience: Architects of solution_draft01
  • Confidence: High (four independent surveys/benchmarks converge)
  • Related Dimension: SQ2, C2 (VPR), C3 (cross-domain matching), C4 (pose estimation)
  • Fit Impact: confirms the project's C1–C10 decomposition is canonical for the C2 → C3 → C4 chain. The component split is not novel; the project's contribution is the integration discipline (covariance honesty AC-NEW-4, source-label contract AC-1.4, offline-cache safety AC-NEW-7) layered on top. Augment the existing decomposition with an explicit "Top-N re-rank by inlier count" stage between C3 and C4 (currently implicit).

Fact #22 — AdHoP (Adaptive Homography Preconditioning) is a method-agnostic post-matching refinement loop that improves translation accuracy by ~30% average and up to 63% for previously-underperforming methods, at the cost of a second matching pass

  • Statement: Source #40 (OrthoLoC benchmark, Sep 2025): from initial 2D-2D query↔orthophoto correspondences, estimate a homography H via DLT+RANSAC, warp the orthophoto with H to better match the query's perspective (reducing residual perspective gap), re-match in this warped frame, then map the new correspondences back to the original orthophoto via H⁻¹, lift to 3D using DSM, and run PnP+RANSAC + Levenberg-Marquardt refinement. Accept the AdHoP-refined pose only if reprojection error decreases vs. the non-refined pose. Quantitative effects (16,425 images, 47 locations, 1m-1° threshold): GIM+DKM 75.4% recall (best); AdHoP-refined methods see ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods AdHoP yields up to 95% matching improvement (XFeat*) or 63% translation reduction (DKM); for RoMa, AdHoP lifts 1m-1° recall by +23 points (54.6% → 77.6%-class). Cross-domain regime (war-zone-equivalent: scene change between query and reference): translation error increases ~3× when only the visual modality differs, ~7× when both visual and structural (DSM) gaps exist (0.16 m → 1.12 m for GIM+DKM+AdHoP). Method-agnostic — works on top of any 2D-2D matcher.
  • Source: Source #40
  • Phase: Phase 2
  • Target Audience: System architects + C3/C4 implementers
  • Confidence: for headline numbers (single-paper, but published dataset + open code + reproducible per repo)
  • Related Dimension: SQ2 (new sub-stage), C3 (matcher), C4 (pose), SQ5 (cross-domain failure mode)
  • Fit Impact: adds a new sub-stage between C3 and C4. Decision for solution_draft01: include AdHoP-class refinement as an optional stage gated on Jetson Orin Nano latency budget — if (single-pass match latency × 2) + homography estimation + reprojection check fits under (400 ms - other-stages), include it; otherwise reserve as offline-replay-time refinement. Cross-domain 3× translation-error penalty is a direct AC-NEW-4 calibration input — companion-side covariance must inflate proportionally when scene-change detection (deferred to SQ8) flags a stale tile.
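
The latency gate and the accept-only-if-better rule described above can be sketched as follows. This is a minimal illustration, not the OrthoLoC implementation: all function names are hypothetical, and the millisecond figures in the usage lines are placeholders rather than Jetson measurements.

```python
def adhop_fits_budget(match_ms, homography_ms, reproj_check_ms,
                      other_stages_ms, frame_budget_ms=400.0):
    """Gate from the Fit Impact: two matching passes plus homography
    estimation plus the reprojection check must fit in what remains of
    the frame budget after all other stages."""
    adhop_cost_ms = 2.0 * match_ms + homography_ms + reproj_check_ms
    return adhop_cost_ms <= frame_budget_ms - other_stages_ms


def choose_pose(initial_pose, initial_reproj_err,
                refined_pose, refined_reproj_err):
    """Keep the AdHoP-refined pose only if its reprojection error is
    strictly lower than that of the non-refined pose."""
    if refined_reproj_err < initial_reproj_err:
        return refined_pose
    return initial_pose


if __name__ == "__main__":
    # Placeholder timings (ms); real values must come from on-device runs.
    print(adhop_fits_budget(105.0, 10.0, 5.0, other_stages_ms=150.0))
    print(choose_pose("initial", 2.4, "refined", 1.1))
```

If the gate fails, the Fit Impact's fallback applies: the refinement is reserved for offline replay rather than run per frame.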

Fact #23 — 6-DoF aerial-to-satellite localization requires DSM (Digital Surface Model) elevation data; without DSM, the system collapses to 3-DoF (position + 1 rotation) or must compute attitude purely from IMU/VIO

  • Statement: Source #40 OrthoLoC explicitly: "Our pipeline matches the query image with the DOP, lifts the matched 2D points in DOP to 3D using the DSM, and then estimates the camera pose using PnP and RANSAC." Without the DSM lift, the matcher produces 2D↔2D correspondences that constrain a homography (which encodes 3-DoF for a planar scene + planar camera) but not the full 6-DoF camera pose. Source #41 AnyVisLoc independently confirms by measuring: aerial-photogrammetry map (with paired DSM at 0.94 m/px) achieves 74.1% A@5m; satellite map (with ALOS 30 m DSM) achieves only 18.5% A@5m — a 4× accuracy collapse driven by DSM coarseness. The project's offline cache from the Azaion Suite Satellite Service is currently specified as 2D ortho tiles only (no DSM commitment in restrictions.md or AC). Three architectural responses are available: (a) 3-DoF acceptance — fix attitude from IMU/VIO, treat the matcher output as a homography-only constraint, ignore DSM; sacrifices the up-to-2× higher accuracy reported when DSM is present, but stays within current cache contract; (b) Request DSM tiles from the Suite Sat Service — adds C2 cache schema work + a Suite Sat Service contract change; preserves 6-DoF accuracy; (c) IMU/VIO-only attitude + 2D-2D matching translation — same as (a) but explicitly contracts the IMU/VIO module to provide attitude with σ ≤ 5° (per Fact #24); operationally identical to (a), differs only in how the contract is written.
  • Source: Source #40, Source #41
  • Phase: Phase 2
  • Target Audience: System architects + Suite Sat Service stakeholder + AC owner
  • Confidence: for the architectural claim; for the 4× accuracy collapse number
  • Related Dimension: SQ2 (decomposition), C2 (cache schema), C3 (matcher output contract), C4 (pose), C5 (estimator), C6 (IMU/VIO contract), AC-1.1 / AC-1.1.1 (accuracy budget)
  • Fit Impact: architectural decision required, surfaced for user. The current restrictions.md (no DSM commitment) implicitly forces option (a) or (c). The accuracy budget AC-1.1.1 (≤80 m at 1 km AGL) is loose enough that 3-DoF + IMU-attitude almost certainly satisfies it on a per-frame basis (per Fact #21 and DSMAC-class lineage in Fact #17), but requires explicit acknowledgement in the architecture before commitment. Proposed default for solution_draft01: option (c) — fix attitude from IMU/VIO with documented σ ≤ 5° contract on yaw, σ ≤ 5° on pitch (per Fact #24), translation from 2D-2D matching + camera pose. Flag option (b) as a "Suite Sat Service follow-up" if 6-DoF accuracy ever becomes a hard requirement.

Fact #24 — IMU-derived yaw and pitch priors with σ ≤ 5° are required for the matching+PnP stack to hit benchmark accuracy; σ ≥ 10° causes 2–4% A@5m drops, σ ≥ 30° causes ≥4% drops, σ ≥ 60° causes 25.7% drops

  • Statement: Source #41 AnyVisLoc systematically perturbs yaw and pitch priors and measures the resulting loss of localization accuracy. Yaw: σ = 5° → no impact; σ = 10° → 1.9% A@5m drop; σ = 30° → 4.1%; σ = 50° → 13.7%; σ = 60° → 25.7%. Pitch: σ < 5° → no impact; σ ≥ 7° → 15% drops. The benchmark is conducted at low altitude (30–300 m AGL) with a 20–90° pitch range; the lessons transfer to our 1 km AGL nadir-camera regime in direction, but the magnitudes may be lower at 1 km AGL because nadir geometry is less yaw-sensitive than oblique. Conservatively adopting the benchmark numbers gives a hard contract: IMU/VIO must deliver yaw with σ ≤ 5° and pitch with σ ≤ 5° to the matcher (1σ, not 95%, since the benchmark is single-σ). Pitch is naturally tighter on a nadir-fixed camera (mechanically constrained); yaw is the binding constraint and is the typical IMU/magnetometer failure mode (per SPRIN-D lesson Fact #15).
  • Source: Source #41
  • Phase: Phase 2
  • Target Audience: System architects + C1 (VIO) implementer + C5 (estimator) implementer
  • Confidence: High for the AnyVisLoc numbers; ⚠️ for direct transfer to the 1 km AGL nadir regime (magnitudes likely smaller at our altitude/pitch; direction is conservative)
  • Related Dimension: SQ2 (sensor-prior contract), C1 (VIO output contract), C5 (estimator), C6 (IMU)
  • Fit Impact: architectural contract for solution_draft01: the C1 module's published contract to the C2/C3 stack is yaw σ ≤ 5° AND pitch σ ≤ 5°. Magnetometer-only yaw is insufficient by the SPRIN-D lesson (Fact #15) — VIO must contribute. Adds a constraint that flows back to the C6 IMU integration: IMU mechanical isolation per SPRIN-D Fact #15 is required; magnetometer + GPS-yaw startup alignment at the airbase (before take-off, while real GPS is healthy) is part of the boot sequence.
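
A minimal sketch of how the σ ≤ 5° contract could be checked at runtime, together with a rough estimate of the accuracy cost when the yaw prior is looser. The data points are the yaw figures quoted in the Statement; the linear interpolation between them and both function names are assumptions made for illustration, not part of the benchmark.

```python
# (yaw prior sigma in degrees, reported A@5m drop in %), from the Statement.
YAW_DROP_POINTS = [(5.0, 0.0), (10.0, 1.9), (30.0, 4.1), (50.0, 13.7), (60.0, 25.7)]
MAX_PRIOR_SIGMA_DEG = 5.0  # contract value for both yaw and pitch


def meets_prior_contract(yaw_sigma_deg, pitch_sigma_deg):
    """True when both attitude priors satisfy the 1-sigma contract."""
    return (yaw_sigma_deg <= MAX_PRIOR_SIGMA_DEG
            and pitch_sigma_deg <= MAX_PRIOR_SIGMA_DEG)


def expected_yaw_drop(sigma_deg):
    """Piecewise-linear estimate of the A@5m drop for a yaw sigma.

    Values outside the measured range are clamped to the nearest
    measured point instead of being extrapolated.
    """
    if sigma_deg <= YAW_DROP_POINTS[0][0]:
        return YAW_DROP_POINTS[0][1]
    if sigma_deg >= YAW_DROP_POINTS[-1][0]:
        return YAW_DROP_POINTS[-1][1]
    for (x0, y0), (x1, y1) in zip(YAW_DROP_POINTS, YAW_DROP_POINTS[1:]):
        if x0 <= sigma_deg <= x1:
            return y0 + (y1 - y0) * (sigma_deg - x0) / (x1 - x0)
    raise ValueError("sigma_deg must be a finite number")


if __name__ == "__main__":
    print(meets_prior_contract(4.0, 2.0))
    print(round(expected_yaw_drop(20.0), 2))
```

Because the benchmark was flown at 30–300 m AGL, the interpolated values should be read as a conservative upper bound for the 1 km AGL nadir regime, not as a prediction.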

Fact #25 — Top-N re-ranking by inlier count is the dominant accuracy/cost trade-off; pure-matching-without-retrieval is catastrophic (A@5m collapses from 62.2% to 34.3% with the same matcher)

  • Statement: Source #41 AnyVisLoc and Source #38 Skoltech survey both quantify the value of retrieval as a search-space reducer for matching. Source #41 explicitly: "Top-N re-rank by inlier count is the best accuracy/cost trade-off" → 62.2% A@5m at 0.8 s/frame on RTX 3090. Without retrieval (pure exhaustive matching against the cache): 34.3% A@5m, i.e., almost half the accuracy at infeasible compute. Source #38 measures sparse-VPR re-ranking specifically: AnyLoc descriptor + SuperGlue re-rank on top-100 candidates = 15–25 s/frame on RTX 3090 (catastrophic for our 400 ms budget); LightGlue re-rank ≈ 1 s/frame (still over budget); SelaVPR re-rank < 0.1 s/frame (in-budget on RTX 3090, must be re-tested on Jetson Orin Nano). Re-ranking budget = (frame budget) − (descriptor extraction) − (initial top-N retrieval) − (matcher pose estimation) − (AdHoP if included).
  • Source: Source #38, Source #41
  • Phase: Phase 2
  • Target Audience: System architects + C2 implementer
  • Confidence: High (two-source convergence on the qualitative claim; quantitative numbers are RTX-3090-specific and must be Jetson-MVE'd)
  • Related Dimension: SQ2 (pipeline structure), C2 (VPR), C3 (matcher), SQ3+SQ4 (Jetson MVE)
  • Fit Impact: mandates Top-N re-rank by inlier count as a stage in solution_draft01. Choosing the Top-N value (typical N = 5–20 in the literature) is deferred to the SQ3+SQ4 candidate matrix, not SQ2.
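
The re-rank stage and the budget formula from the Statement can be sketched as below. This is an illustrative outline under stated assumptions: `count_inliers` stands in for whatever matcher plus RANSAC step is eventually selected, the tile identifiers are made up, and the timings in the usage lines are placeholders.

```python
def rerank_by_inliers(candidates, count_inliers, top_n=10):
    """Re-order the first top_n retrieval candidates by inlier count.

    candidates:    tile identifiers, best retrieval score first.
    count_inliers: callable(tile_id) -> int; hypothetical hook for the
                   matcher + geometric verification step.
    Ties keep their retrieval order because Python's sort is stable.
    """
    shortlist = list(candidates[:top_n])
    return sorted(shortlist, key=count_inliers, reverse=True)


def rerank_budget_ms(frame_ms, descriptor_ms, retrieval_ms, pose_ms,
                     adhop_ms=0.0):
    """Time left for re-ranking once the other stages are subtracted."""
    return frame_ms - descriptor_ms - retrieval_ms - pose_ms - adhop_ms


if __name__ == "__main__":
    fake_inliers = {"tile_a": 12, "tile_b": 80, "tile_c": 40, "tile_d": 999}
    ranked = rerank_by_inliers(
        ["tile_a", "tile_b", "tile_c", "tile_d"], fake_inliers.get, top_n=3
    )
    print(ranked)  # tile_d is outside the top-3 shortlist and is never matched
    print(rerank_budget_ms(400.0, 60.0, 20.0, 50.0, adhop_ms=100.0))
```

The per-candidate matching cost multiplied by N has to fit inside the returned budget, which is why N is a sweep target rather than a fixed constant.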

Fact #26 — High-accuracy SOTA models (AnyLoc + SuperGlue + RoMa-class) are NOT viable on Jetson Orin Nano under the 400 ms p95 budget; lightweight VPR (MixVPR / SALAD / SelaVPR-class) + lightweight matchers (LightGlue / XFeat-class) are the only candidates that survive a basic latency pre-screen

  • Statement: Two independent runtime measurements on RTX 3090 (≥10× faster than Jetson Orin Nano in dense matrix ops): Source #38: AnyLoc descriptor calculation 0.37–0.84 s/frame (huge ViT-G DINOv2); SuperGlue re-rank 15–25 s/frame on top-100; LightGlue re-rank ~1 s/frame; SelaVPR re-rank < 0.1 s/frame. Source #41: RoMa dense matcher 659 ms/frame; SP+LightGlue+GIM sparse 105 ms/frame; ratio = 6.3×. Memory: AnyLoc descriptors = 2.3–13.9 GB for 47k tiles (against an 8 GB Jetson Orin Nano envelope, before model weights); SelaVPR descriptors < 0.2 GB. Pre-screen conclusion: AnyLoc / SuperGlue / RoMa-class are disqualified on the Jetson Orin Nano at 3 fps unless heavy quantization (INT8) reduces them ≥10×, which is not yet established for our latency target on this hardware. Surviving candidates from the literature: VPR: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD-class; matchers: LightGlue, XFeat, XFeat*, SP+LightGlue. Disqualification is preliminary; the final go/no-go happens at SQ3+SQ4 with on-Jetson MVE per references/mode-A-mve-rules.md.
  • Source: Source #38, Source #41
  • Phase: Phase 2
  • Target Audience: C2 + C3 implementer; SQ3+SQ4 candidate-matrix author
  • Confidence: High for RTX-3090 numbers; ⚠️ for direct Jetson translation (Jetson Orin Nano AI score is well-published; ratio is conservative)
  • Related Dimension: SQ2 (Jetson budget feasibility), SQ3+SQ4 (candidate pre-screen), SQ5 (foundation-model-on-edge failure mode), C2, C3, C7 (Jetson runtime)
  • Fit Impact: prunes the SQ3+SQ4 candidate matrix BEFORE expensive Jetson MVE. Candidates entering SQ3+SQ4 with mandatory Jetson MVE: (C2 VPR) MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD; (C3 matcher) LightGlue, XFeat, XFeat*, SP+LightGlue. Candidates that need Jetson INT8 quant before they earn an MVE slot: AnyLoc, BoQ, DINOv2-VLAD (must demonstrate INT8 build path with vendor-validated accuracy preservation). Candidates pruned outright: RoMa dense, SuperGlue, MASt3R (latency).
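
The memory half of the pre-screen is simple arithmetic: descriptor cache size = tiles × descriptor dimension × bytes per value. A minimal sketch follows; the descriptor dimensions and the `reserved_gb` headroom are illustrative assumptions that do not come from the sources, and only the 47k tile count and the 8 GB envelope are taken from the Statement.

```python
def descriptor_cache_gb(n_tiles, dim, bytes_per_value=4):
    """Size of a dense descriptor cache in decimal gigabytes."""
    return n_tiles * dim * bytes_per_value / 1e9


def passes_memory_prescreen(n_tiles, dim, ram_gb=8.0, reserved_gb=4.0,
                            bytes_per_value=4):
    """True if the cache fits in RAM after reserving headroom.

    reserved_gb is a hypothetical allowance for model weights, the OS
    and the other pipeline stages; it must be measured on the device.
    """
    cache_gb = descriptor_cache_gb(n_tiles, dim, bytes_per_value)
    return cache_gb <= ram_gb - reserved_gb


if __name__ == "__main__":
    # Hypothetical descriptor dimensions, float32 values.
    for dim in (512, 12_288, 49_152):
        size = descriptor_cache_gb(47_000, dim)
        verdict = passes_memory_prescreen(47_000, dim)
        print(f"dim={dim}: {size:.2f} GB, fits={verdict}")
```

The same function shows what a quantized cache would buy: passing `bytes_per_value=1` models an INT8 descriptor store at a quarter of the float32 footprint, although accuracy preservation would still need to be demonstrated separately.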

Fact #27 — A 20% covisibility floor between query frame and reference tile is required for localization to succeed; below it, ALL methods fail regardless of matcher quality

  • Statement: Source #40 OrthoLoC: "When the covisibility between the UAV image and the orthographic geodata is too small (less than ~20%), the localization fails for all methods regardless of matcher quality." This is a geometric floor, not a method-specific limit. The implication for the project: any tile-cache design that allows a query to fall outside 20% covisibility with the best available cached tile must also include a runtime covisibility-check + graceful degrade to visual_propagated mode (per AC-1.4 source label). This is a runtime condition, not a one-time setup parameter.
  • Source: Source #40
  • Phase: Phase 2
  • Target Audience: C2 (cache scheduler) + C5 (estimator) + AC-1.4 owner
  • Confidence:
  • Related Dimension: SQ2 (boundary condition), C2 (tile cache), C5 (estimator state machine), AC-1.4
  • Fit Impact: adds a runtime invariant to solution_draft01: tile selection must guarantee ≥20% covisibility OR explicitly emit the visual_propagated source label per AC-1.4 with covariance widened per AC-NEW-4. This becomes a hard constraint on the C2 cache schema (must support tile-extent metadata) and a runtime check before invoking C3 matcher.
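
A minimal sketch of the runtime covisibility check before invoking the matcher. It assumes axis-aligned rectangular footprints in a shared metric frame and defines covisibility as the fraction of the query footprint covered by the tile; both are simplifications, and OrthoLoC's own definition may differ. The label strings are the AC-1.4 source labels; the function names are hypothetical.

```python
COVISIBILITY_FLOOR = 0.20  # geometric floor reported by Source #40


def covisibility(query, tile):
    """Fraction of the query footprint that lies inside the tile.

    Both arguments are (min_x, min_y, max_x, max_y) rectangles in metres.
    """
    overlap_w = max(0.0, min(query[2], tile[2]) - max(query[0], tile[0]))
    overlap_h = max(0.0, min(query[3], tile[3]) - max(query[1], tile[1]))
    query_area = (query[2] - query[0]) * (query[3] - query[1])
    if query_area <= 0.0:
        return 0.0
    return (overlap_w * overlap_h) / query_area


def select_source_label(query, tile, floor=COVISIBILITY_FLOOR):
    """Run the matcher only above the floor; otherwise degrade."""
    if covisibility(query, tile) >= floor:
        return "satellite_anchored"
    return "visual_propagated"


if __name__ == "__main__":
    footprint = (0.0, 0.0, 100.0, 100.0)
    print(select_source_label(footprint, (50.0, 0.0, 200.0, 100.0)))
    print(select_source_label(footprint, (90.0, 0.0, 200.0, 100.0)))
```

The check needs the tile extent at lookup time, which is why the Fit Impact makes tile-extent metadata a hard requirement on the C2 cache schema.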

SQ2 — Conclusions (working summary, will be re-checked at Step 7.5)

Pipeline-component coverage table (existing C1–C10 vs. survey-listed components)

| Survey/benchmark canonical stage | Project component (current) | Coverage status | Required action |
|---|---|---|---|
| Image retrieval (global VPR) | C2: Visual Place Recognition | covered | No change |
| Re-ranking (top-N inlier-based) | (currently implicit, inside C2 or C3) | ⚠️ implicit | Promote to explicit sub-stage (C2.5 or C3.0) in solution_draft01 |
| Local image matching (2D-2D, sparse or dense) | C3: Cross-domain registration | covered | Add Top-N re-rank-by-inlier-count requirement |
| AdHoP-style perspective preconditioning | (not represented) | missing | Add as optional sub-stage between C3 and C4, gated on Jetson latency budget |
| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | architectural decision required | Decision required from user (see "Architectural decisions surfaced") |
| Pose estimation (PnP + RANSAC + LM) | C4: Pose estimation | covered | No change |
| State estimator / fusion (UKF / ESKF / MSCKF / factor graph) | C5: Estimator / fusion | covered | Augmented with covariance-honesty contract from AC-NEW-4 |
| IMU + VIO contract | C1: VO/VIO + C6: IMU integration | covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract from Fact #24 |
| Tile cache + scheduler | C2: VPR tile cache + C9: Cache hygiene | covered | Add 20% covisibility runtime invariant (Fact #27) |
| Anti-spoof / source-switch | C7: Spoof detection + C8: FC adapter | covered | Already addressed in SQ6 |
| Health monitoring / safety | C10: Safety / health monitoring | covered | Already addressed |

Architectural decisions surfaced (require user resolution before SQ3+SQ4 starts)

  1. DSM dependency on the Suite Sat Service tile cache (per Fact #23). Three options:

    • (a) 3-DoF acceptance — accept that without DSM, only position is recovered from matching; attitude is fixed by IMU/VIO with no satellite-tile cross-check. Lowest project scope. Requires AC budget verification (likely passes AC-1.1.1).
    • (b) Request DSM tiles — Suite Sat Service contract change. Highest accuracy. Adds ~1 cycle to delivery. Recommended if 6-DoF accuracy ever becomes a hard AC.
    • (c) IMU/VIO-attitude + 2D-2D matching translation — operationally identical to (a) but contracts the IMU/VIO module explicitly with σ ≤ 5° yaw / pitch (Fact #24).
    • Recommended default: (c) — explicit IMU/VIO contract; fall back to (b) if AC tightens.
  2. AdHoP refinement loop (per Fact #22). Three options:

    • (a) Always-on — included in every frame; Jetson budget must accommodate 2× matching latency.
    • (b) Conditional — only when initial reprojection error exceeds a threshold; gated on per-frame budget.
    • (c) Off (initial release) — relegate to offline-replay refinement.
    • Recommended default: (b) Conditional — fits within latency variance budget while capturing the cross-domain accuracy gain.
  3. Top-N re-rank promotion to explicit pipeline sub-stage (per Fact #25). Recommendation: promote to a named sub-stage in solution_draft01 with N as an SQ3+SQ4 hyperparameter sweep target.

Component-pruning carried into SQ3+SQ4

  • C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
  • C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path: AnyLoc, BoQ, DINOv2-VLAD.
  • C2 candidates pruned: SuperGlue-as-reranker (latency).
  • C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template).
  • C3 candidates pruned: RoMa, MASt3R, DKM (dense matcher latency on Jetson).
  • C3 candidates as "AerialExtreMatch reference points" only, NOT for production: GIM+DKM, GIM+LightGlue (per Source #40, used as accuracy benchmark only).

Boundary check: SQ2 is saturated

Saturation signals observed: (a) five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey) converge on the same "retrieval → matching → pose-estimation hierarchical framework" as canonical; (b) two independent runtime sources (Skoltech survey on RTX 3090; AnyVisLoc on RTX 3090 with explicit dense-vs-sparse breakdown) agree on the relative cost ordering of model classes; (c) AdHoP value is supported by Source #40 only, but with reproducible code and dataset (single-source-but-strong evidence); (d) cross-source agreement on covisibility / sensor-prior thresholds. Two outstanding decisions are flagged for the user; neither blocks SQ2's saturation status, but both block the start of SQ3+SQ4. Per references/source-tiering.md "Search saturation rule" → SQ2 is closed pending user decisions on DSM dependency + AdHoP gating.