mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 18:51:15 +00:00
Refactor documentation for splittable artifacts and update references
Updated various documentation files to clarify the handling of splittable artifacts, allowing for folder equivalents of key markdown files when they exceed size limits. Adjusted references in multiple sections to reflect this new structure, ensuring consistency across the research methodology. Enhanced clarity on the saving actions and artifact organization, particularly for `01_source_registry.md`, `02_fact_cards.md`, and `06_component_fit_matrix.md`. This change aims to improve usability and maintainability of the research documentation.
@@ -1,7 +1,7 @@
# Acceptance Criteria
> Last revised 2026-05-07 (cleanup pass: stripped algorithm/library/parameter implementation details; renamed source label `vo_extrapolated` → `visual_propagated`; broadened FC scope to ArduPilot + iNav).
> Subsequent revision 2026-05-07 (post-SQ6 research): AC-4.3 reworded to acknowledge that no single message type is accepted by both ArduPilot Plane and iNav — per-FC interface is named explicitly (MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards.md` SQ6 / `_docs/00_research/01_source_registry.md` Sources #4, #9, #10, #12, #13.
> Subsequent revision 2026-05-07 (post-SQ6 research): AC-4.3 reworded to acknowledge that no single message type is accepted by both ArduPilot Plane and iNav — per-FC interface is named explicitly (MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards/SQ6_fc_external_positioning.md` / `_docs/00_research/01_source_registry/SQ6_external_positioning.md` Sources #4, #9, #10, #12, #13.
> See git history for prior versions.
## Position Accuracy
@@ -1,7 +1,7 @@
# Restrictions
> Last revised 2026-05-07 (cleanup pass — design-independent, IEEE-830 style; only external dependencies, environmental constraints, integration boundaries).
> Subsequent revision 2026-05-07 (post-SQ6 research): the FC-facing communication protocol entries below were corrected — iNav firmware (master, post-9.0) has no inbound MAVLink external-positioning handler; the project must use a per-FC adapter (MAVLink `GPS_INPUT` for ArduPilot Plane; MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards.md` SQ6 / `_docs/00_research/01_source_registry.md` Sources #4, #9, #10, #12, #13.
> Subsequent revision 2026-05-07 (post-SQ6 research): the FC-facing communication protocol entries below were corrected — iNav firmware (master, post-9.0) has no inbound MAVLink external-positioning handler; the project must use a per-FC adapter (MAVLink `GPS_INPUT` for ArduPilot Plane; MSP2 `MSP2_SENSOR_GPS` for iNav). Rationale and L1 sources in `_docs/00_research/02_fact_cards/SQ6_fc_external_positioning.md` / `_docs/00_research/01_source_registry/SQ6_external_positioning.md` Sources #4, #9, #10, #12, #13.
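The per-FC adapter named in the note above can be pictured as a lookup from flight-controller family to wire interface. This is a minimal sketch, not the project's adapter: `FlightController`, `PositionInterface`, and `select_interface` are hypothetical names introduced for illustration, and only the two protocol/message pairs come from the revision note.

```python
from dataclasses import dataclass
from enum import Enum


class FlightController(Enum):
    """Flight-controller families in scope (hypothetical enum for this sketch)."""
    ARDUPILOT_PLANE = "ardupilot_plane"
    INAV = "inav"


@dataclass(frozen=True)
class PositionInterface:
    transport: str  # wire protocol family
    message: str    # message name as cited in the revision note


# Pairs taken from the revision note; framing and field layout are not modelled here.
_INTERFACES = {
    FlightController.ARDUPILOT_PLANE: PositionInterface("MAVLink", "GPS_INPUT"),
    FlightController.INAV: PositionInterface("MSP2", "MSP2_SENSOR_GPS"),
}


def select_interface(fc: FlightController) -> PositionInterface:
    """Return the external-positioning interface for the given FC family."""
    return _INTERFACES[fc]
```

Keeping the mapping in one table makes the "no single message type for both FCs" finding explicit at the adapter boundary.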
## UAV & Flight
- Fixed-wing UAVs only; navigation camera fixed downward (no gimbal).
@@ -74,8 +74,8 @@ For each component below, the search plan covers all option families per `Compon
| C6 | **Tile cache + spatial index** (storage + retrieval of basemap tiles + descriptors, with manifests, freshness, dedup, and write-back) | mmap-friendly storage; ANN over global descriptors; spatial query for geographic prior; manifest schema per AC | Storage: GeoTIFF + COG, MBTiles, custom flat layout. ANN: FAISS (IVF/PQ/HNSW), hnswlib, ScaNN, brute-force (small index). Spatial: R-tree / KD-tree / GeoPandas / SQLite+SpatiaLite. Manifest: SQLite, JSON-per-tile, Parquet sidecar |
| C7 | **On-Jetson inference runtime** | INT8/FP16 inference of the chosen VPR + matcher models within latency + memory budget | TensorRT (native), Torch-TensorRT, ONNX Runtime + TRT EP, NVIDIA Triton (probably overkill), pure PyTorch fp16, NVIDIA DeepStream (for video), CUDA-Python custom kernels |
| C8 | **MAVLink FC adapter** (per-FC external-positioning emission + spoofing-signal subscription, for ArduPilot AND iNav) | MAVLink frames consumed by ArduPilot Plane and iNav as external position; spoofing signals consumed from each FC | Libraries: `pymavlink` (per-message), MAVSDK (high-level), ArduPilot/iNav SITL for verification. Per-FC choice of message: `GPS_INPUT` vs `ODOMETRY` vs `VISION_POSITION_ESTIMATE` vs `GLOBAL_POSITION_INT` (documented capability per FC must be verified, not assumed) |
| C9 | **Datasets + SITL / replay** | Reproducible validation against AC-1/2/3/4/NEW-4/NEW-7/NEW-8 budgets; fixtures for AerialVL S03, AerialExtreMatch, own Mavic flights, Derkachi flight footage | AerialVL (VISTA / NTU), AerialExtreMatch, VPR-Bench, MahalNotchVPR / Mid-Air UAV; SITL: ArduPilot Plane SITL, iNav SITL/HITL, Gazebo, Webots; replay: PX4-Avionics-Replay-style or custom |
| C10 | **Pre-flight cache provisioning + sector classification + freshness pipeline** | Tooling (operator-side) to pull tiles from Suite Sat Service for an operational area, classify active-conflict vs stable rear, age-stamp, populate descriptor index | Likely a custom CLI/desktop tool — research existing UAV mission-prep tools (QGC plan files, MAVProxy, ArduPilot Mission Planner equivalents on the operator side) |
| ~~C9~~ | ~~**Datasets + SITL / replay**~~ — **DROPPED from research scope per 2026-05-08 restructure (user choice A)**; deferred to **Test Spec (greenfield Step 5)**. See "C9 / SQ7 Restructure" section below. | — | — |
| C10 | **Pre-flight cache provisioning + sector classification + freshness pipeline** (RESEARCH SCOPE NARROWED 2026-05-08 to cross-coupling minimal — see "C10 Scope Restructure" section below) | (in research scope) confirmed orchestration mechanism for descriptor-cache rebuild (D-C6-3) + TensorRT engine build (D-C7-7) at pre-flight; on-disk artifact format(s); time/memory budget; failure-mode + retry behavior. (deferred to Plan-phase) operator CLI/desktop tool design, sector classification heuristics, freshness pipeline workflow. | (in research scope) FAISS Python API for write_index/read_index orchestration; TensorRT build orchestration `trtexec` CLI vs Python `IBuilderConfig` vs Polygraphy. (deferred) custom CLI/desktop, QGC plan files, MAVProxy, Mission Planner integration patterns. |
## Perspectives Chosen (≥3 mandatory)
@@ -86,13 +86,13 @@ For each component below, the search plan covers all option families per `Compon
## Search Query Variants per Sub-Question
(Detailed query lists are appended below per sub-question; these will be executed in Step 2 and saved to `01_source_registry.md`. The shape is shown here so the search plan is auditable; the full execution log will populate downstream files.)
(Detailed query lists are appended below per sub-question; these will be executed in Step 2 and saved to the `01_source_registry/` folder, indexed by `01_source_registry/00_summary.md`. The shape is shown here so the search plan is auditable; the full execution log will populate downstream files.)
**SQ1** (existing systems / competitors): "GPS-denied UAV navigation 2025", "visual GPS denied fixed wing UAV", "satellite map matching UAV localization 2024 2025", "Ukraine UAV GPS spoofing countermeasures", "ARL ANT Project visual navigation", "vision-based GPS replacement UAV production", "UAV GPS spoofing real-world deployment 2025".
**SQ2** (canonical pipeline): "visual aerial localization pipeline survey", "UAV satellite map matching architecture", "monocular UAV global localization pipeline 2024 2025".
**SQ3 / SQ4** (per-component candidates + binding): per-component query templates (5+ variants each) — see Step 2 plan in `01_source_registry.md` once initialised. Each lead library/SDK candidate triggers the mandatory `context7` per-mode capability verification per `research/steps/03_engine-investigation.md`.
**SQ3 / SQ4** (per-component candidates + binding): per-component query templates (5+ variants each) — see Step 2 plan in `01_source_registry/00_summary.md` once initialised. Each lead library/SDK candidate triggers the mandatory `context7` per-mode capability verification per `research/steps/03_engine-investigation.md`.
**SQ5** (failure modes): "VPR cropland failure", "DINOv2 Jetson Orin Nano latency", "SuperGlue LightGlue Jetson Orin", "ESKF cross-domain over-confidence", "RANSAC homography low-texture failure UAV", "ortho photo geometric error airframe tilt".
@@ -114,7 +114,7 @@ Probes (per `references/comparison-frameworks.md` → Decomposition Completeness
| Cost / TCO dimension? | Hardware is pinned (Jetson Orin Nano Super); Service-side cost is out of scope; SW cost = mostly open-source candidates. Will revisit during Phase 3 (tech stack consolidation) if commercial options emerge. ✓ |
| Maintenance / community-health dimension? | SQ4 binds it per candidate. ✓ |
| Adjacent-domain dimension? | Robot SLAM, AGV warehouse navigation, aerial photogrammetry will be searched as analogues. ✓ |
| Validation / dataset coverage? | SQ7 + C9. ✓ |
| Validation / dataset coverage? | **Deferred to Test Spec (greenfield Step 5) per 2026-05-08 C9 / SQ7 restructure** — fixture-class, not research-class. Dataset shortlist preserved for handoff. |
| Integration / boundary coverage? | SQ6 (FC adapters) + C8 + C10 (pre-flight provisioning). ✓ |
| Operational/human-factors? | Pre-flight cache provisioning (C10) and operator re-loc hint (AC-3.4) covered. Mission-planning UX is out of scope. ✓ |
| Security / threat model? | SQ8. Will deepen in Phase 4 (Security Deep Dive) if invoked. ✓ |
@@ -124,7 +124,7 @@ No major gap detected at decomposition time. If domain-discovery searches in Ste
## Notes on Output-Class Mode-Verification
Because this is **Technical-component selection**, every lead library/SDK candidate triggers:
- Pinned mode/configuration sentence in `02_fact_cards.md`.
- Pinned mode/configuration sentence in `02_fact_cards/Cx_*.md` (per-component sub-files).
- `context7` lookup with the three mandatory queries (mode enumeration; project's exact mode runnable example; disqualifier probe).
- MVE block per candidate.
- Per-numbered-Restriction and per-numbered-AC binding (`Pass` / `Fail` / `Verify` / `N/A`).
@@ -148,7 +148,7 @@ Source-time-window rules for this run:
## SQ2 Closure — Pipeline-component coverage table (Mode A Phase 2, Step 3 result)
The C1–C10 decomposition was sanity-checked against five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey — all logged in `01_source_registry.md` as Sources #38–#42). The canonical hierarchical framework `retrieval → matching → pose-estimation` is unanimously confirmed; project's split is **canonical, not novel**. Two augmentations are required.
The C1–C10 decomposition was sanity-checked against five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey — all logged in `01_source_registry/SQ2_canonical_pipeline.md` as Sources #38–#42). The canonical hierarchical framework `retrieval → matching → pose-estimation` is unanimously confirmed; project's split is **canonical, not novel**. Two augmentations are required.
| Survey/benchmark canonical stage | Project component | Coverage status | Required action |
|---|---|---|---|
@@ -163,7 +163,7 @@ The C1–C10 decomposition was sanity-checked against five independent surveys/b
| Tile cache + scheduler | **C6 (Tile cache + spatial index)** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27) |
| On-Jetson runtime | **C7 — On-Jetson inference runtime** | ✅ covered | Pre-screen prunes non-viable candidates (Fact #26) |
| Anti-spoof / FC adapter | **C8 — MAVLink FC adapter** | ✅ covered | Already addressed by SQ6 |
| Datasets / SITL / replay | **C9 — Datasets + SITL / replay** | ✅ covered | None |
| Datasets / SITL / replay | **Deferred to Test Spec (greenfield Step 5)** per 2026-05-08 C9 / SQ7 restructure | ⚠️ moved out of research scope | Test Spec owns dataset-corpus selection, SITL framework choice (ArduPilot Plane SITL + iNav SITL/HITL), and replay framework choice |
| Pre-flight cache provisioning | **C10 — Pre-flight cache + sector classification** | ✅ covered | None |
⁂ The "IMU integration" concern lives in C1 (VIO) and partially flows from FC IMU; there is no separately numbered IMU component in the original C1–C10 split. SQ2 confirms this was correct — IMU is best owned by C1 (VIO) which already produces the yaw/pitch attitude. The σ ≤ 5° contract belongs on C1's output interface.
@@ -187,10 +187,67 @@ Per Fact #26 (RTX-3090-measured runtime → conservative Jetson-Orin-Nano transl
- **C3 candidates pruned outright**: RoMa, MASt3R, DKM (dense-matcher latency on Jetson).
- **C3 candidates as "AerialExtreMatch reference points" only**: GIM+DKM, GIM+LightGlue (per Source #40 — accuracy benchmark, not for production deployment).
## C9 / SQ7 Restructure (2026-05-08, user choice A)
**Decision**: drop C9 (Datasets + SITL / replay) entirely from the research scope. Defer dataset-corpus selection, SITL framework choice (ArduPilot Plane SITL + iNav SITL/HITL), and replay framework choice (custom vs PX4-Avionics-Replay-style) to **Test Spec (greenfield Step 5)**. Pull D-C7-1 (calibration-dataset-strategy) back inside C7 batch 1 and close it there.
**Rationale**: datasets are test fixtures, not architectural commitments. They feed into Test Spec → Decompose Tests → Implement Tests, not into the deployed pipeline on the Jetson. They don't bind against the AC-4.1 / AC-4.2 / R-NEW-2 / R-NEW-4 envelope. Choosing among AerialVL S03 vs AerialExtreMatch vs VPR-Bench vs MahalNotchVPR / Mid-Air UAV vs the project's own Mavic + Derkachi flight footage is a "what evidence proves the system meets AC-X" question, not a "what gets implemented on the Orin Nano" question. SITL and replay framework choice are test-infra commitments rather than runtime commitments; SITL framework is largely deterministic at this point (ArduPilot Plane SITL + iNav SITL/HITL are the canonical paths the locked C8 closure already implies).
**Effective changes**:
- **Component Areas table**: C9 removed; remaining components are C1–C8 + C10.
- **Sub-Questions table**: SQ7 is deferred to Test Spec (Step 5) — its query variants and dataset shortlist remain documented here for handoff but are not researched in this Mode A run.
- **SQ2 closure table**: "Datasets / SITL / replay" row → "Deferred to Test Spec".
- **D-C7-1 (calibration-dataset-strategy)**: closed inside C7 batch 1. Strategy = prefer real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles as the calibration corpus distribution; specific fixture-file selection (AerialVL S03 vs project's Mavic + Derkachi clips vs other corpora) is fixture-class and delegated to Test Spec. Synthetic-tile augmentation via random homography is a documented low-data fallback, only invoked if real flight footage is insufficient for Recall@K-target calibration.
- **Cross-component gates**: D-C7-1 is no longer cross-coupled to C9; owner narrows to Plan-phase architect (closed at research time).
- **Cross-row dependencies in C7 / C8 fact cards and fit-matrix files**: every "C9 datasets / SITL / replay row when opened" reference becomes "Test Spec (Step 5) when opened".
**Carryforward to Test Spec (Step 5)** — preserved here so Test Spec's first invocation has the handoff payload ready:
- **Dataset shortlist**: AerialVL (VISTA / NTU), AerialExtreMatch, VPR-Bench, MahalNotchVPR / Mid-Air UAV, project's own Mavic + Derkachi flights.
- **SITL frameworks**: ArduPilot Plane SITL (canonical), iNav SITL/HITL (canonical); Gazebo / Webots noted-and-rejected as overkill for the spoof-promotion + visual-blackout failsafe scenarios that AC-NEW-2 and AC-NEW-8 actually exercise.
- **Replay frameworks**: PX4-Avionics-Replay-style canonical reference; custom Python harness as the lightweight default if PX4 replay's MAVLink-injection point doesn't cleanly match the C8 closure's per-FC injection cadence (5 Hz GPS_INPUT for AP / 5 Hz MSP2_SENSOR_GPS for iNav).
- **SQ7 query variants** (carried forward verbatim from above): "AerialVL dataset", "AerialExtreMatch", "VPR-Bench cross-season aerial", "Mid-Air UAV dataset", "Mavic Mavik UAV public flight dataset", "satellite-aerial cross-view localization benchmark".
- **Test-coverage obligations Test Spec must answer**:
- Which corpora exercise which AC (AC-1.1 / AC-1.2 / AC-2.1 / AC-2.2 / AC-3.1 / AC-3.2 / AC-3.3 / AC-3.4 / AC-NEW-1 / AC-NEW-2 / AC-NEW-4 / AC-NEW-7 / AC-NEW-8).
- SITL test-harness shape exercising AC-NEW-2 spoof-promotion <3 s end-to-end on **both** ArduPilot Plane SITL **and** iNav SITL/HITL (per locked C8 batch 1 closure cross-component decision D-C8-2).
- Replay-fixture format compatible with both C8 injection paths (pymavlink GPS_INPUT for AP, YAMSPy MSP2_SENSOR_GPS for iNav).
- INT8 calibration corpus pin (specific files satisfying the C7 batch 1 D-C7-1 strategy = real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles).
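The 5 Hz injection cadence cited in the replay-framework bullet can be sketched as a drift-free deadline schedule. This is a minimal illustration under assumptions: `injection_deadlines` is a hypothetical helper, and deriving each deadline from the start time (rather than accumulating sleeps) is a design choice of this sketch, not something the carryforward list prescribes.

```python
from typing import List


def injection_deadlines(start_s: float, rate_hz: float, count: int) -> List[float]:
    """Return `count` send deadlines (seconds) at a fixed rate.

    Each deadline is derived from the start time, so a late send does not
    push every later send back by the same amount.
    """
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return [start_s + i / rate_hz for i in range(count)]


# Both C8 injection paths are cited at 5 Hz, so one schedule can drive both.
deadlines = injection_deadlines(start_s=0.0, rate_hz=5.0, count=5)
```

A replay harness would sleep until each deadline and then hand the fixture sample to whichever per-FC sender is active.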
## C10 Scope Restructure (2026-05-08, user choice C — cross-coupling minimal)
**Decision**: narrow C10 (Pre-flight cache provisioning + sector classification + freshness pipeline) research scope to the two cross-coupling confirmation sub-areas. Defer the operator-side CLI/desktop tool, sector classification heuristics, and tile age-stamping/freshness schema to Plan-phase as out-of-research-scope `operator tooling design` items.
**In-scope (C10 batch 1)**:
1. **D-C6-3 confirmation** — descriptor-cache rebuild trigger pipeline. Recommendation inherited from C6 batch 1 (Fact #92 + D-C6-3) = `periodic rebuild during C10 pre-flight provisioning + faiss.write_index serialize + load-at-takeoff in <5 s`. Confirmation work: pin the orchestration tool (FAISS Python API vs subprocess invocation), the trigger semantics (manifest hash change vs operator-manual vs new-tile-delivered), the on-disk file format, the rebuild time budget at pre-flight, and the failure-mode + retry behavior.
2. **D-C7-7 confirmation** — TensorRT engine-build pipeline. Recommendation inherited from C7 batch 1 (Fact #94 + D-C7-7) = `primary build-on-deployed-Jetson during pre-flight + reference-Jetson-built engines as fallback`. Confirmation work: pin the build-orchestration tool (`trtexec` CLI vs Python `IBuilderConfig` vs Polygraphy), the calibration-corpus shipping mechanism into the pre-flight build (per D-C7-1 closure: real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles), the per-model build-duration budget, the retry/fallback logic on build failure, and the on-disk engine cache layout.
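Of the trigger semantics listed for D-C6-3, the "manifest hash change" option can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the manifest layout, `manifest_digest`, and `needs_rebuild` are hypothetical, and the FAISS serialization step named above (`faiss.write_index`) is deliberately not called here.

```python
import hashlib
import json
from typing import Optional


def manifest_digest(manifest: dict) -> str:
    """Hash a tile manifest so that key order does not affect the result."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(manifest: dict, last_built_digest: Optional[str]) -> bool:
    """True when no index has been built yet or the manifest changed since the last build."""
    return last_built_digest is None or manifest_digest(manifest) != last_built_digest


# Hypothetical manifest layout: tile id -> capture date.
manifest = {"tiles": {"t_001": "2026-01-15", "t_002": "2026-02-03"}}
built_digest = manifest_digest(manifest)
```

Storing the digest next to the serialized index would let pre-flight provisioning skip the rebuild when nothing changed; whether that is preferable to an operator-manual or new-tile-delivered trigger is exactly what the confirmation work is meant to settle.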
**Out-of-research-scope (deferred to Plan-phase)**:
- Operator-side CLI/desktop tool design (mission-prep tooling shape; CLI vs GUI; integration with QGC plan files / MAVProxy / Mission Planner equivalents).
- Sector classification (active-conflict vs stable rear) heuristics + interface — used to decide AC-8.2 freshness threshold (6 mo vs 12 mo).
- Tile age-stamping + freshness schema beyond what AC-8.2 + AC-NEW-6 already mandate.
**Rationale for narrowing**:
- The C6 and C7 closures already locked architectural recommendations (`periodic rebuild during pre-flight` and `build-on-deployed-Jetson at pre-flight`). What remains is mechanism confirmation, not candidate enumeration.
- The deferred items are fixture/operator-tooling-class concerns. Their cross-coupling with the runtime architecture is mediated entirely by the descriptor-cache file and the TensorRT engine cache file — both fixed by the in-scope confirmations. Operator tool design can iterate freely at Plan-phase without touching runtime contracts.
- Aligns with the C9-restructure precedent: keep research focused on architecture-binding decisions; push fixture/tooling decisions to the phases that own them.
**Effective changes**:
- **Component Areas table**: C10 row preserved with reduced scope. Details below.
- **`Required outputs` for C10 in the table**: narrows from `Tooling (operator-side) to pull tiles from Suite Sat Service for an operational area, classify active-conflict vs stable rear, age-stamp, populate descriptor index` to `Confirmed orchestration mechanism for descriptor-cache rebuild + TensorRT engine build at pre-flight; on-disk artifact format(s); time/memory budget; failure-mode + retry behavior`.
- **Cross-component gates**: D-C6-3 and D-C7-7 remain owned jointly with C10; new C10-internal decisions D-C10-x will be added at C10 batch 1 closure.
- **SQ5 interleaving**: limited C10 SQ5 facts (failure modes during pre-flight build/rebuild) collected during this batch.
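The "failure-mode + retry behavior" output required for C10 can be illustrated with the D-C7-7 recommendation (build on the deployed Jetson first, fall back to a reference-built engine). This is a sketch under assumptions: `obtain_engine` and its parameters are hypothetical, the build step is an injected callable standing in for whichever orchestration tool is eventually pinned, and the attempt count of 2 is a placeholder, not a researched value.

```python
from typing import Callable, Optional, Tuple


def obtain_engine(
    build_on_device: Callable[[], str],
    reference_engine: Optional[str],
    max_attempts: int = 2,
) -> Tuple[str, str]:
    """Return (engine_path, origin), where origin is 'built' or 'fallback'."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return build_on_device(), "built"
        except Exception as exc:  # a real harness would catch narrower errors
            last_error = exc
    if reference_engine is not None:
        return reference_engine, "fallback"
    raise RuntimeError("engine build failed and no fallback engine is available") from last_error
```

Returning the origin alongside the path lets the caller log whether the flight is running on a locally built or a reference-built engine.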
**Carryforward to Plan-phase** — operator-tooling design issues preserved here so Plan-phase has a starting list:
- Tool shape: integrate as a sub-command of Mission Planner / QGC plan-file workflow vs standalone CLI vs lightweight desktop GUI.
- Sector-classification source: operator-marked geofence polygons vs Suite Sat Service metadata vs hybrid.
- Tile age-stamping: per-tile capture date in manifest (already mandated by restrictions.md) vs additional sector-class tag vs full audit trail per AC-NEW-7.
- Freshness pipeline: when to re-pull from Suite Sat Service (every flight, weekly, on operator demand, on sector-class change).
## Next Step
SQ1 ✓ → SQ2 ✓ (with three architectural decisions resolved) → **SQ3+SQ4 per component (C1→C10)** → SQ5 interleaved → SQ7 → SQ8 → SQ9 synthesis at engine Step 8.
SQ1 ✓ → SQ2 ✓ (with three architectural decisions resolved) → **SQ3+SQ4 per component (C1→C8)** ✓ → **C10 batch 1 in progress (cross-coupling minimal scope, 2 sub-areas: D-C6-3 + D-C7-7 confirmation)** → SQ5 interleaved → SQ8 → SQ9 synthesis at engine Step 8.
Pipeline shape entering SQ3+SQ4: `C1 (VIO) → C2 (VPR) → Top-N re-rank by inlier count → C3 (matcher) → AdHoP-conditional refinement → C4 (PnP+RANSAC+LM) → C5 (estimator) → C8 (FC adapter)` with C6 (cache, 2D ortho) + C7 (Jetson runtime) + C9 (datasets) + C10 (provisioning) cross-cutting.
(SQ7 deferred to Test Spec per C9 restructure; C9 dropped; C10 operator-tooling-design deferred to Plan-phase per the C10 scope restructure above.)
Pipeline shape (final, post-C10-restructure): `C1 (VIO) → C2 (VPR) → Top-N re-rank by inlier count → C3 (matcher) → AdHoP-conditional refinement → C4 (PnP+RANSAC+LM) → C5 (estimator) → C8 (FC adapter)` with C6 (cache, 2D ortho) + C7 (Jetson runtime) + C10 (pre-flight orchestration: descriptor-cache rebuild + TensorRT engine build) cross-cutting.
First C1 (VIO) candidate batch: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure-VO baseline (RTAB-Map and ORB-SLAM3 already pruned by Fact #16). Per-mode `context7` capability verification mandatory for every lead library/SDK candidate.
@@ -1,659 +0,0 @@
# Source Registry
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
> Critical-novelty sensitivity per Step 0.5 in `00_question_decomposition.md`. Time windows applied:
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
>
> Investigation order saved in `00_question_decomposition.md` → "Next Step": SQ6 → SQ1 → SQ2 → SQ3+SQ4 per component (C1→C10) → SQ5 interleaved → SQ7 → SQ8 → SQ9 synthesis at engine Step 8.
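The time-window rules in the note above can be expressed as a small predicate. This is a sketch under assumptions: `within_time_window` and `whole_months_between` are hypothetical helpers, the window is measured in whole calendar months relative to the access date, and the "currently shipped version" rule for library APIs is not modelled because it is not date-based.

```python
from datetime import date


def whole_months_between(earlier: date, later: date) -> int:
    """Count completed calendar months from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def within_time_window(
    published: date,
    accessed: date,
    *,
    official_authority: bool = False,
    established_baseline: bool = False,
) -> bool:
    """Apply the 6-month / 18-month / no-window rules to one source."""
    if established_baseline:
        return True  # e.g. KLT, RANSAC, EKF: no time window
    limit_months = 18 if official_authority else 6
    return whole_months_between(published, accessed) <= limit_months
```

For example, a lead-candidate source published 2025-06-01 and accessed 2026-05-07 falls outside the preferred 6-month window but inside the 18-month allowance if it is the official authority.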
## Investigation Status
| Sub-question | Status | Notes |
|---|---|---|
| SQ6 — ArduPilot vs iNav external positioning | **Saturated for protocol-level architectural decision** (further detail deferred to SQ8 for spoofing-side fields and to design phase for SITL parameter tuning) | Major finding: iNav has no inbound external-positioning MAVLink handler; AC-4.3 wording must be revised. See `02_fact_cards.md` "SQ6 Conclusions". |
| SQ1 — Existing GPS-denied UAV systems | **Saturated.** 13 sources logged across academic / open-source / commercial / defense-program / Ukraine-practitioner. Closest peer system: Twist Robotics OSCAR (deployed in Ukraine). Closest open-source pipeline-match: snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024 — LightGlue+SuperPoint+UKF+VISION_POSITION_ESTIMATE). Closest deployed commercial: Auterion Artemis (Skynode N + Visual Navigation, Ukraine-tested, 1000-mile range). | See `02_fact_cards.md` SQ1 cluster + working summary. |
| SQ2 — Canonical pipeline decomposition | **Saturated.** 5 surveys/benchmarks logged (Skoltech aerial VPR, U.Maine cross-view, OrthoLoC 2.5D geodata, AnyVisLoc low-altitude multi-view, NUDT 2026 sciopen survey). All converge on **`retrieval → matching → pose-estimation`** hierarchical framework with VIO/IMU as auxiliary. Two new architectural facts added to C1–C10: (a) **AdHoP-style perspective-refinement loop** between matching and PnP (+63% translation accuracy, method-agnostic), (b) **DSM 2.5D dependency** for full 6-DoF on aerial-to-satellite (must be resolved with the Suite Sat Service or accepted as a 3-DoF degraded mode). Practitioner runtime evidence: AnyLoc on RTX 3090 = 0.63s/descriptor, SuperGlue re-rank = 17–25s; on Jetson Orin Nano these are non-viable for our 400 ms p95 budget — must restrict to lightweight VPR (e.g., MixVPR / SALAD class) + LightGlue/XFeat-class matchers. See `02_fact_cards.md` "SQ2 Conclusions". |
| SQ3+SQ4 — Per-component candidates (C1–C10) | **In progress** — C1 (VIO) candidate enumeration done (Sources #43–#52); per-mode `context7` verification + Restrictions×AC sub-matrix per surviving candidate deferred to next session. C2–C10 not started. | See `02_fact_cards.md` C1 cluster + preliminary applicability table. |
| SQ5 — Failure modes / deployment lessons | Not started (interleaved with SQ3/SQ4) | |
| SQ7 — Datasets, SITL, replay environments | Not started | |
| SQ8 — Safety considerations (AC-NEW-4 / AC-NEW-7) | Not started | Carries the AP_GPS spoofing-signal probe deferred from SQ6. |
| SQ9 — End-to-end synthesis | Step 8 of engine (deferred) | |
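The runtime argument in the SQ2 row is simple arithmetic: the quoted desktop-GPU timings already exceed the 400 ms p95 budget before any translation to the Jetson. The sketch below only restates the figures from that row; the variable names are illustrative and no Jetson scaling factor is assumed.

```python
P95_BUDGET_MS = 400.0  # latency budget quoted in the SQ2 row

# RTX 3090 figures quoted in the SQ2 row, converted to milliseconds.
measured_ms = {
    "AnyLoc descriptor": 630.0,
    "SuperGlue re-rank (low end)": 17000.0,
    "SuperGlue re-rank (high end)": 25000.0,
}

# Ratio of measured time to budget for every stage that is over budget.
over_budget = {
    name: ms / P95_BUDGET_MS
    for name, ms in measured_ms.items()
    if ms > P95_BUDGET_MS
}
```

All three entries land in `over_budget`, which is why the row restricts the candidate set to lightweight VPR and matcher classes.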
---
## Sources
### Source #1
- **Title**: Non-GPS Navigation — Plane documentation
- **Link**: https://ardupilot.org/plane/docs/common-non-gps-navigation-landing-page.html
- **Tier**: L1
- **Publication Date**: live docs (current ArduPilot stable, accessed 2026-05-07)
- **Timeliness Status**: Currently valid
- **Version Info**: ArduPilot 4.7+ (persistent origin storage); applies to current Plane stable
- **Target Audience**: ArduPilot Plane operators / developers
- **Research Boundary Match**: Full match (fixed-wing, ArduPilot Plane is in scope)
- **Summary**: Lists supported non-GPS navigation systems for Plane. Notes that boards with less than 1 MB of flash still support `GPS_INPUT` even when they cannot run other non-GPS messages. Also notes that low-altitude non-GPS navigation is generally not applicable to non-VTOL Plane; that note does not constrain `GPS_INPUT` used as an external GPS replacement.
- **Related Sub-question**: SQ6
### Source #2
- **Title**: GPS / Non-GPS Transitions — Plane documentation
- **Link**: https://ardupilot.org/plane/docs/common-non-gps-to-gps.html
- **Tier**: L1
- **Publication Date**: live docs (accessed 2026-05-07)
- **Timeliness Status**: Currently valid
- **Version Info**: EKF3 (default since AP 4.0+)
- **Target Audience**: ArduPilot operators using mixed GPS / non-GPS sources
- **Research Boundary Match**: Full match
- **Summary**: Documents the EKF3 source-set mechanism (`EK3_SRC1..3_POSXY/VELXY/POSZ/VELZ/YAW`), three source sets, RC aux switch (option 90 "EKF Pos Source"), `MAV_CMD_SET_EKF_SOURCE_SET`, Lua-script driven switching. Explicitly named messages for non-GPS path: ExternalNav (option 6). GPS_INPUT is treated as a GPS source (set 1).
- **Related Sub-question**: SQ6
### Source #3
- **Title**: EKF Source Selection and Switching — Plane documentation
- **Link**: https://ardupilot.org/plane/docs/common-ekf-sources.html
- **Tier**: L1
- **Publication Date**: live docs (accessed 2026-05-07)
- **Timeliness Status**: Currently valid
- **Version Info**: EKF3 stable
- **Target Audience**: ArduPilot operators / developers
- **Research Boundary Match**: Full match
- **Summary**: Authoritative parameter reference for `EK3_SRCx_*` (POSXY/VELXY/POSZ/VELZ/YAW). Important caveat: "Ground stations or companion computers may set the source by sending a `MAV_CMD_SET_EKF_SOURCE_SET` mavlink command **but no GCSs are currently known to implement this**." Source-set switching from companion is supported by AP, not by stock GCS UI. Mentions ExternalNAV/OpticalFlow transition options via `EK3_SRC_OPTIONS` bit 1.
- **Related Sub-question**: SQ6
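Since the source notes that a companion computer may set the EKF source set via `MAV_CMD_SET_EKF_SOURCE_SET`, the fields of such a command can be sketched as plain data. This is an illustration under assumptions: the numeric command ID (42007) is recalled from the ArduPilot MAVLink dialect and should be verified against the dialect actually in use, `ekf_source_set_command` is a hypothetical helper, and no MAVLink library is called.

```python
MAV_CMD_SET_EKF_SOURCE_SET = 42007  # assumed ID; verify against the dialect in use


def ekf_source_set_command(
    source_set: int,
    target_system: int = 1,
    target_component: int = 1,
) -> dict:
    """Build COMMAND_LONG fields that request EKF3 source set 1, 2, or 3."""
    if source_set not in (1, 2, 3):
        raise ValueError("source_set must be 1, 2, or 3")
    fields = {
        "target_system": target_system,
        "target_component": target_component,
        "command": MAV_CMD_SET_EKF_SOURCE_SET,
        "confirmation": 0,
        "param1": float(source_set),  # requested source set
    }
    # Remaining COMMAND_LONG parameters are unused by this command in the sketch.
    for index in range(2, 8):
        fields[f"param{index}"] = 0.0
    return fields
```

The resulting dictionary mirrors the `COMMAND_LONG` field names, so it could be passed to whichever MAVLink sender the C8 adapter ends up using.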
### Source #4
- **Title**: ArduPilot AP_GPS_MAV.cpp (master)
- **Link**: https://raw.githubusercontent.com/ArduPilot/ardupilot/master/libraries/AP_GPS/AP_GPS_MAV.cpp
- **Tier**: L1 (source code)
- **Publication Date**: master HEAD (accessed 2026-05-07)
- **Timeliness Status**: Currently valid
- **Version Info**: master branch
- **Target Audience**: ArduPilot developers, integrators of external GPS via MAVLink
- **Research Boundary Match**: Full match
- **Summary**: Authoritative implementation of `MAVLINK_MSG_ID_GPS_INPUT` ingestion into AP_GPS state. Decodes lat/lon/alt, hdop/vdop, velocity (vn/ve/vd), speed/horizontal/vertical accuracy, yaw. Honors `gps_id` (multi-GPS instance), `ignore_flags` bitmask (ALT, HDOP, VDOP, VEL_HORIZ, VEL_VERT, SPEED_ACCURACY, HORIZONTAL_ACCURACY, VERTICAL_ACCURACY). Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping. Yaw uses `0` as "not provided" sentinel. Only `GPS_INPUT` is handled by this driver — `VISION_POSITION_ESTIMATE` / `ODOMETRY` go via the external-nav driver, not AP_GPS_MAV.
- **Related Sub-question**: SQ6
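Two details from the summary above lend themselves to a short sketch: composing the `ignore_flags` bitmask and deriving a positive `time_week`. The flag values follow the MAVLink `GPS_INPUT_IGNORE_FLAGS` enum as recalled here and should be checked against the dialect in use; the 18-second GPS-UTC offset is an assumption valid since 2017; `gps_week_and_ms` and `qualifies_for_jitter_correction` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone
from enum import IntFlag
from typing import Tuple


class GpsInputIgnore(IntFlag):
    """Bit values assumed from the MAVLink GPS_INPUT_IGNORE_FLAGS enum."""
    ALT = 1
    HDOP = 2
    VDOP = 4
    VEL_HORIZ = 8
    VEL_VERT = 16
    SPEED_ACCURACY = 32
    HORIZONTAL_ACCURACY = 64
    VERTICAL_ACCURACY = 128


GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_OFFSET_S = 18  # assumption: leap-second offset, re-check when deploying
MS_PER_WEEK = 7 * 24 * 3600 * 1000


def gps_week_and_ms(utc: datetime) -> Tuple[int, int]:
    """Convert a timezone-aware UTC time to (GPS week, milliseconds into the week)."""
    elapsed = (utc - GPS_EPOCH) + timedelta(seconds=GPS_UTC_OFFSET_S)
    total_ms = elapsed // timedelta(milliseconds=1)
    return divmod(total_ms, MS_PER_WEEK)


def qualifies_for_jitter_correction(fix_type: int, time_week: int) -> bool:
    """Condition stated in the summary: fix_type >= 3 and time_week > 0."""
    return fix_type >= 3 and time_week > 0


# Example: a position-only estimate that asks the FC to ignore both velocity fields.
ignore_mask = int(GpsInputIgnore.VEL_HORIZ | GpsInputIgnore.VEL_VERT)
```

Using an `IntFlag` keeps the mask readable at the call site while still producing the plain integer that the message field expects.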
### Source #5
- **Title**: ArduPilot PR #28750 — AP_NavEKF3: added two more EK3_OPTION bits (GPS-denied testing)
- **Link**: https://github.com/ArduPilot/ardupilot/pull/28750
- **Tier**: L2 (development PR, ArduPilot core team)
- **Publication Date**: 2024 (accessed via search 2026-05-07)
- **Timeliness Status**: Currently valid
- **Version Info**: master / pending stable branch propagation
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Adds new `EK3_OPTION` bits to allow easier GPS-denied testing of EKF3, including an aux-switch / MAVLink command path to disable GPS use. Confirms ongoing 2024-2025 work on GPS-denied robustness.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #6
|
||||
- **Title**: ArduPilot Issue #15859 — EKF3: improve source switching (GPS<->NonGPS)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/15859
|
||||
- **Tier**: L4 (issue tracker — open enhancement list)
|
||||
- **Publication Date**: ongoing (long-running issue, accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (still open per dev docs reference)
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative list of planned improvements for source-switching. Linked from the L1 GPS-Non-GPS Transitions page. Indicates current source switching has known rough edges acknowledged by the core team.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #7
|
||||
- **Title**: ArduPilot Issue #27193 — EK3 Source Switching wrong frame for GUIDED commands SOLVED
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/27193
|
||||
- **Tier**: L4 (issue tracker, resolved)
|
||||
- **Publication Date**: 2024 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Reference only (resolved as user-config)
|
||||
- **Target Audience**: ArduPilot operators using GPS↔Vision source switching
|
||||
- **Research Boundary Match**: Partial overlap (Copter context but the bug was in shared SET_POSITION_TARGET_GLOBAL_INT path)
|
||||
- **Summary**: Documented frame-interpretation issue when companion switches source set 1 (GPS) → set 3 (VISION_POSITION_ESTIMATES) and back. Resolved as a configuration issue, not a code defect, but illustrates the kind of edge case to validate in SITL for AC-NEW-2 promotion.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #8
|
||||
- **Title**: ArduPilot Issue #23485 — AP_NavEKF3: support fusing only External Nav Velocities (without position)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/23485
|
||||
- **Tier**: L4 (open enhancement)
|
||||
- **Publication Date**: ongoing (open as of access on 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms current limitation: ODOMETRY without position causes position-estimate timeout / failsafe. Implies the project's `visual_propagated` path (VO without satellite anchor) cannot be expressed as ODOMETRY-velocity-only on current AP — must be sent as full GPS_INPUT with widened covariance.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #9
|
||||
- **Title**: iNavFlight/inav — telemetry/mavlink.c (master, processMAVLinkIncomingTelemetry)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/telemetry/mavlink.c
|
||||
- **Tier**: L1 (source code, authoritative)
|
||||
- **Publication Date**: master HEAD (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master (post-9.0)
|
||||
- **Target Audience**: iNav developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative inbound MAVLink switch (lines ~1334–1390). Handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **No `GPS_INPUT`, no `VISION_POSITION_ESTIMATE`, no `ODOMETRY`, no `GLOBAL_POSITION_INT`, no `GPS_RAW_INT`** are accepted as inputs. Wiki page (Source #10) confirms.
|
||||
- **Related Sub-question**: SQ6
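
The inbound list in Source #9 can be captured as a simple lookup that a companion could consult before picking a transport. This is a minimal sketch that only restates the message names listed above; the names `INAV_INBOUND_MAVLINK` and `inav_accepts` are hypothetical and do not exist in iNav.

```python
# Inbound MAVLink messages handled by iNav per Source #9 (master, accessed 2026-05-07).
# COMMAND_INT is accepted only for MAV_CMD_DO_REPOSITION.
INAV_INBOUND_MAVLINK = frozenset({
    "HEARTBEAT",
    "PARAM_REQUEST_LIST",
    "MISSION_CLEAR_ALL",
    "MISSION_COUNT",
    "MISSION_ITEM",
    "MISSION_REQUEST_LIST",
    "MISSION_REQUEST",
    "COMMAND_INT",
    "RC_CHANNELS_OVERRIDE",
    "ADSB_VEHICLE",
    "RADIO_STATUS",
})


def inav_accepts(message_name: str) -> bool:
    """Hypothetical check: is this MAVLink message in iNav's inbound switch?"""
    return message_name in INAV_INBOUND_MAVLINK


# None of the external-positioning candidates are accepted over MAVLink.
positioning_candidates = ["GPS_INPUT", "VISION_POSITION_ESTIMATE", "ODOMETRY"]
rejected = [name for name in positioning_candidates if not inav_accepts(name)]
```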
|
||||
|
||||
### Source #10
|
||||
- **Title**: iNav Wiki — MAVLink (frogmane edited 2025-12-11)
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/Mavlink
|
||||
- **Tier**: L1 (project wiki)
|
||||
- **Publication Date**: 2025-12-11
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0 / 9.0 era
|
||||
- **Target Audience**: iNav users / integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative inbound/outbound MAVLink message lists. "Limited command support: Commands that are not implemented are ignored." Explicitly enumerates the supported incoming list (matches Source #9). Confirms iNav MAVLink is "intended primarily for simple telemetry and operation" and "not 100% compatible".
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #11
|
||||
- **Title**: iNav Wiki — GPS and Compass setup
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/GPS-and-Compass-setup
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live wiki (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 7.0+ (UBX-only); 9.0 requires UBX protocol ≥15.00
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: From iNav 7.0 NMEA was removed; only UBX is supported. Recommends u-blox M8/M9/M10 with protocol ≥15.00. Sets up the constraint for any UBX-emulation path the companion would take.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #12
|
||||
- **Title**: iNavFlight/inav docs/development/msp/README.md (MSP message reference)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/development/msp/README.md
|
||||
- **Tier**: L1 (project docs)
|
||||
- **Publication Date**: live (master, accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master
|
||||
- **Target Audience**: iNav developers / integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative spec for `MSP_SET_RAW_GPS (201)` and `MSP2_SENSOR_GPS (7939)`. `MSP_SET_RAW_GPS` is 14-byte, lossy (no covariance, no per-axis velocity, altitude in meters with cm internal mismatch — bug fixed in 5.0.0 per issue #8336). `MSP2_SENSOR_GPS` is the newer plugin-style message with `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` (mm and cm/s), `hdop`, NED velocity components, `trueYaw`, GPS week + time-of-week, fix type, satellite count. Requires `USE_GPS_PROTO_MSP` build flag and routes through `mspGPSReceiveNewData()` (the GPS_PROVIDER_MSP driver path).
|
||||
- **Related Sub-question**: SQ6
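
As a rough illustration of how a companion might wrap an `MSP2_SENSOR_GPS` payload for the wire, the sketch below builds a generic MSPv2 request frame (`$X<`, flag, function, size, payload, CRC8/DVB-S2). The payload is deliberately left opaque: its field layout must be taken from the iNav MSP reference in Source #12 and is not reproduced here. The function names are illustrative assumptions.

```python
import struct

MSP2_SENSOR_GPS = 7939  # message ID as listed in Source #12


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC8 with polynomial 0xD5, the checksum MSPv2 uses."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def msp_v2_frame(function: int, payload: bytes, flag: int = 0) -> bytes:
    """Frame a payload as an MSPv2 request; the CRC covers flag, function, size, payload."""
    body = struct.pack("<BHH", flag, function, len(payload)) + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


# Placeholder payload only; a real sender would pack the documented GPS fields.
frame = msp_v2_frame(MSP2_SENSOR_GPS, b"")
```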
|
||||
|
||||
### Source #13
|
||||
- **Title**: iNavFlight/inav src/main/io/gps.c + src/main/target/common.h (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/target/common.h
|
||||
- **Tier**: L1 (source code)
|
||||
- **Publication Date**: master (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master
|
||||
- **Target Audience**: iNav developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: `USE_GPS_PROTO_MSP` is enabled by default in the common target configuration; on default builds the MSP GPS provider (`GPS_PROVIDER_MSP`) is registered with `gpsRestartMSP` / `gpsHandleMSP`. Confirms the MSP2_SENSOR_GPS path is reachable on stock iNav firmware without custom builds.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #14
|
||||
- **Title**: iNav Issue #10141 — dual GPS support
|
||||
- **Link**: https://github.com/iNavFlight/inav/issues/10141
|
||||
- **Tier**: L4 (open feature request)
|
||||
- **Publication Date**: ongoing (open as of access on 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms iNav does **not** support dual-GPS / primary-secondary failover. Open enhancement; no implementation in 8.0 / 9.0. Architectural implication: companion must be the sole GPS source for iNav (not a backup to a real GPS connected directly to FC).
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #15
|
||||
- **Title**: iNav docs/GPS_fix_estimation.md (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/GPS_fix_estimation.md
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0+
|
||||
- **Target Audience**: iNav fixed-wing operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: iNav's internal dead-reckoning ("GPS fix estimation") for fixed-wing. Uses gyro/accel/baro/(mag/pitot). RTH-only intent. **Explicitly states: "Not a solution for GPS spoofing (GPS output is not validated in INAV)"** — iNav has no internal anti-spoofing, so anti-spoofing is fully the companion's responsibility. Two settings: `inav_allow_gps_fix_estimation` (RTH-with-no-GPS) and `inav_allow_dead_reckoning` (short-outage tolerance) — both default OFF. `failsafe_gps_fix_estimation_delay` controls mission-vs-RTH tradeoff (default 7 s).
|
||||
- **Related Sub-question**: SQ6 (dead-reckoning fallback) + SQ8 (anti-spoofing implication)
|
||||
|
||||
### Source #16
|
||||
- **Title**: iNav docs/Settings.md (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/Settings.md
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: master (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative parameter list. Confirms `inav_allow_dead_reckoning` (line 2081, default OFF) ≠ `inav_allow_gps_fix_estimation` (line 2091, default OFF). The two settings address different scenarios. `failsafe_gps_fix_estimation_delay` (line 1041, default 7 s) governs mission-abort timing.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #17
|
||||
- **Title**: iNav Issue #10588 — Weird behaviour in DeadReckoning mode while GPS outage is not constant
|
||||
- **Link**: https://github.com/iNavFlight/inav/issues/10588
|
||||
- **Tier**: L4 (open issue, 2025)
|
||||
- **Publication Date**: 2025
|
||||
- **Timeliness Status**: Currently valid (open)
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Documented stability bug: intermittent GPS outages cause porpoising and motor bursts in dead-reckoning. Cited recommendation: "GPS should be rejected if providing erroneous coordinates rather than no fix." Risk for AC-NEW-8 (visual blackout + spoofed GPS) on iNav: do NOT rely on iNav's dead-reckoning for the spoof-active failsafe path; companion must actively suppress its own MSP feed and accept that iNav may misbehave during the gap. Better: continue feeding companion-IMU-propagated position with growing covariance via MSP2_SENSOR_GPS so iNav never enters its dead-reckoning state.
|
||||
- **Related Sub-question**: SQ6 + AC-NEW-8 design implication
|
||||
|
||||
### Source #18
|
||||
- **Title**: iNav Release 8.0.0 (highlights, Dec 2024)
|
||||
- **Link**: https://github.com/iNavFlight/inav/releases/tag/8.0.0
|
||||
- **Tier**: L1 (project release notes)
|
||||
- **Publication Date**: late 2024 / early 2025
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Introduces fixed-wing GPS fix estimation (dead reckoning RTH-only) — the milestone for #8347. No new external-positioning inbound MAVLink in 8.0. Confirms iNav's 2024–2025 trajectory has not added a `GPS_INPUT`-equivalent inbound interface.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #19
|
||||
- **Title**: iNav Release 9.0.0 / 9.0.1 + 9.0.0 Release Notes wiki
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/9.0.0-Release-Notes
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2025-2026
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 9.0.x
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: New in 9.0: pitot APA/TPA, position estimator improvements, MSP_REBOOT DFU, GCS NAV via `COMMAND_INT` `MAV_CMD_DO_REPOSITION`. **No** new external-positioning inbound MAVLink. UBX <15.00 dropped. Confirms iNav 9.x continues the same external-positioning architecture as 8.x.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
### Source #20
|
||||
- **Title**: MAVLink common message set — GPS_RAW_INT (24)
|
||||
- **Link**: https://mavlink.io/en/messages/common.html
|
||||
- **Tier**: L1 (MAVLink spec, live)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: MAVLink common, current
|
||||
- **Target Audience**: MAVLink integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Current published `GPS_RAW_INT` extension fields: `alt_ellipsoid`, `h_acc` (mm), `v_acc` (mm), `vel_acc` (mm/s), `hdg_acc` (degE5), `yaw` (cdeg). **No spoofing/jamming/integrity bitfield is present in `GPS_RAW_INT` at the time of access**, despite PR #2110 having been merged for spoofing/integrity reporting. Spoofing/integrity may live in a separate message (`GPS_INTEGRITY` or similar — to be verified in SQ8). For now, spoof-detection signals available to companion from FC are limited at the message-shape level; FC-side textual signals (`STATUSTEXT`) and `NAMED_VALUE_INT` are the documented practical path.
|
||||
- **Related Sub-question**: SQ6 + SQ8
|
||||
|
||||
### Source #21
|
||||
- **Title**: MAVLink PR #2110 — gps: add status and integrity information
|
||||
- **Link**: https://github.com/mavlink/mavlink/pull/2110
|
||||
- **Tier**: L2 (protocol PR with cross-project sign-off)
|
||||
- **Publication Date**: merged (accessed via search 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: MAVLink common
|
||||
- **Target Audience**: MAVLink integrators across PX4 / ArduPilot / QGC / Mission Planner
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Adds GNSS status / integrity reporting (jamming/spoofing/error) at the protocol level. Cross-project sign-off across PX4, ArduPilot, QGC, Mission Planner. Field-level breakdown to be cross-checked in SQ8 against the dialect XML — current `common.html` does not show those fields inside `GPS_RAW_INT` itself, suggesting they live in a sibling message (likely `GPS_INTEGRITY` or `GPS_STATUS_EXT`).
|
||||
- **Related Sub-question**: SQ6 → defer to SQ8 for the precise message name and field set ArduPilot uses to expose spoofing.
|
||||
|
||||
### Source #22
|
||||
- **Title**: AirDroper — GNSS Spoofing Filter (companion device, MAVLink2 NAMED_VALUE_INT pattern)
|
||||
- **Link**: https://gps.airdroper.org/
|
||||
- **Tier**: L3 (vendor product page; design pattern reference, not protocol authority)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: ArduPilot integrators considering anti-spoofing
|
||||
- **Research Boundary Match**: Reference only (vendor's specific algorithm not relevant; the integration pattern is)
|
||||
- **Summary**: Establishes a precedent that "companion-runs-spoofing-detection → publishes confidence to GCS as MAVLink2 `NAMED_VALUE_INT`, logged to dataflash" is a real-world integration pattern with ArduPilot, not novel to this project. Useful for SQ8.
|
||||
- **Related Sub-question**: SQ8 (referenced from SQ6)
|
||||
|
||||
### Source #23
|
||||
- **Title**: ArduPilot PR #24135 — Add option to make EKF3 more robust to bad IMU and lagged GPS data
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/pull/24135
|
||||
- **Tier**: L2 (development PR)
|
||||
- **Publication Date**: 2023-2024 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master / propagated to stable
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Introduces `EK3_GLITCH_RADIUS` parameter — soft outlier rejection: instead of dropping a GPS measurement that fails innovation gating, the EKF inflates innovation variance to the minimum that just passes, effectively de-weighting the measurement. Implication for AC-NEW-4 (false-position safety): the project's covariance honesty contract on `GPS_INPUT.horiz_accuracy` is the ONLY way for AP's EKF to detect and de-weight a bad estimate; under-reporting collapses this safety net.
|
||||
- **Related Sub-question**: SQ6 + AC-NEW-4 design implication
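
The soft outlier rejection described in Source #23 can be illustrated with a scalar toy model. This is a sketch of the idea only (inflate the innovation variance to the smallest value that just passes the gate), not ArduPilot's implementation; the function names and the scalar simplification are assumptions.

```python
def soft_gate(innovation: float, innovation_var: float, gate_sigma: float) -> float:
    """Return the innovation variance to use for the update.

    A measurement passes the gate when innovation**2 <= gate_sigma**2 * variance.
    If it fails, the variance is inflated to the minimum value that just passes.
    """
    min_passing_var = (innovation / gate_sigma) ** 2
    return max(innovation_var, min_passing_var)


def scalar_gain(state_var: float, innovation_var: float) -> float:
    """Scalar Kalman gain K = P / S, where S is the innovation variance."""
    return state_var / innovation_var


# Toy numbers: state variance P = 1, measurement variance R = 1, so S = 2; gate at 3 sigma.
s_ok = soft_gate(innovation=1.0, innovation_var=2.0, gate_sigma=3.0)
s_bad = soft_gate(innovation=10.0, innovation_var=2.0, gate_sigma=3.0)
k_ok = scalar_gain(1.0, s_ok)
k_bad = scalar_gain(1.0, s_bad)
```

In this toy model the outlier is not dropped; its gain shrinks because the inflated variance grows with the size of the innovation.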
|
||||
|
||||
### Source #24
|
||||
- **Title**: ArduPilot AP_NavEKF3 — VehicleStatus.cpp + AP_NavEKF3.cpp (master)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3_VehicleStatus.cpp ; https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3.cpp
|
||||
- **Tier**: L1 (source code)
|
||||
- **Publication Date**: master HEAD (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master
|
||||
- **Target Audience**: ArduPilot EKF3 developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: EKF3 quality control: (a) ground-stationary GPS drift check ≤ 3 m (gated by `_gpsCheckScaler`); (b) innovation gating per `POS_I_GATE` / `VEL_I_GATE`; (c) soft de-weighting via `EK3_GLITCH_RADIUS` (Source #23). Confirms AP's covariance-driven quality path actually exists; companion-supplied `horiz_accuracy` flows into this chain.
|
||||
- **Related Sub-question**: SQ6 (full file analysis deferred to design phase)
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Existing / competitor GPS-denied UAV navigation systems
|
||||
|
||||
### Source #25
|
||||
- **Title**: Twist Robotics develops OSCAR — a GPS-independent visual navigation system for drones resistant to electronic warfare equipment
|
||||
- **Link**: https://www.pravda.com.ua/eng/news/2026/01/28/8018266/
|
||||
- **Tier**: L2 (national newspaper of record reporting on a Technology Forces of Ukraine release; primary press is the Technology Forces of Ukraine FB post)
|
||||
- **Publication Date**: 2026-01-28 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (within 6-month critical-novelty window)
|
||||
- **Target Audience**: Ukraine-deployment practitioners; UAV companion-system designers
|
||||
- **Research Boundary Match**: **Full match** — Ukrainian fixed-wing-class UAV, GPS-denied, vision-based, deployed in active conflict
|
||||
- **Summary**: Twist Robotics (UA) deployed OSCAR ("Optical System of Coordinates with Automatic Relocalisation"): camera + landmark-matching + map → autopilot ingests as a "reliable GPS signal". Vendor claims: 20 m accuracy without cumulative error, day/night/fog operation, 500,000 km logged across 25,000 combat missions over 24 months of development, AI-augmented + Obrii proprietary simulator for training. Note: hardware photo shows active cooling on the module, which implies non-trivial compute (probably Jetson-class). **No public independent benchmark.** Closest deployed peer system to this project.
|
||||
- **Related Sub-question**: SQ1 (closest peer); also informs SQ8 (anti-spoofing claims), SQ9 (synthesis)
|
||||
|
||||
### Source #26
|
||||
- **Title**: Ukraine Gives Drones Vision-Based Navigation to Push Past Heavy Jamming — The Defense Post
|
||||
- **Link**: https://thedefensepost.com/2026/01/29/ukraine-drones-vision-navigation/
|
||||
- **Tier**: L2 (defense-trade publication; corroborates Source #25 with a second-party byline)
|
||||
- **Publication Date**: 2026-01-29 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense-policy / procurement readership
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms OSCAR is operational, terrain-imagery-against-mapped-landmarks pattern, autopilot-ingestion. Adds "live imagery" framing. No new technical detail beyond Source #25.
|
||||
- **Related Sub-question**: SQ1
|
||||
|
||||
### Source #27
|
||||
- **Title**: Ukraine's Ruta Missile Drone Will Get an EW-Immune Navigation System — Defense Express
|
||||
- **Link**: https://en.defence-ua.com/weapon_and_tech/ukraines_ruta_missile_drone_will_get_an_ew_immune_navigation_system-14541.html
|
||||
- **Tier**: L2 (defense-trade publication, Ukraine-domestic)
|
||||
- **Publication Date**: 2025-05-17 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (within 18-month authority window)
|
||||
- **Target Audience**: Defense-procurement / industry analysts
|
||||
- **Research Boundary Match**: Partial — operational profile (cruise-missile-class, terminal guidance) differs from our 8-h fixed-wing surveillance/strike profile; technique class is closely related (DSMAC pattern)
|
||||
- **Summary**: Destinus Ruta (Ukrainian-Swiss origin; ~300 km strike range, miniature cruise missile) will integrate a navigation system from UAV Navigation (Spanish, Grupo Oesía). Defense Express infers DSMAC-style operating principle: "takes images of surface mid-flight, identifies location through comparison with reference". Vendor announcement notes validation in Ukrainian combat conditions including GNSS-denied / jamming / spoofing. Establishes that the cruise-missile-tier vision-nav pattern is now being miniaturised for ~300 km strike drones.
|
||||
- **Related Sub-question**: SQ1 (commercial/military landscape)
|
||||
|
||||
### Source #28
|
||||
- **Title**: Kilometer-Scale GNSS-Denied UAV Navigation via Heightmap Gradients: A Winning System from the SPRIN-D Challenge
|
||||
- **Link**: https://arxiv.org/abs/2510.01348
|
||||
- **Tier**: L1 (peer-style preprint, full system description, real flight data, competition results)
|
||||
- **Publication Date**: October 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: arXiv v1 (2510.01348v1)
|
||||
- **Target Audience**: GNSS-denied UAV system designers (academic + practitioner)
|
||||
- **Research Boundary Match**: **Partial — different regime.** Multirotor (≤25 kg), <25 m AGL, LiDAR-equipped, no satellite-tile basemap; 9 km waypoint mission. Our project is fixed-wing, ~1 km AGL, no LiDAR, monocular + sat-tile basemap. **Architectural pattern transfers; specific algorithm does NOT** (heightmap gradients require LiDAR).
|
||||
- **Summary**: CTU Prague team won SPRIN-D Funke Fully Autonomous Flight Challenge with: VIO (OpenVINS) + LiDAR-derived local heightmap + gradient template matching against open-data DEM + clustered K-means particle filter, all on Intel NUC i7 16 GB CPU-only (no GPU). Achieved RMSE <11 m over kilometer-scale flights vs ≤53 m for raw odometry. Critical observations explicitly stated:
|
||||
- **RTAB-Map and ORB-SLAM3 both fail** beyond 1 km / above 2 m/s flight (compute/memory) and ORB-SLAM3 loses tracking in textureless areas — directly applicable to our 17 m/s cruise over agricultural steppe.
|
||||
- **"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."** This is a low-altitude (<25 m AGL) finding; our 1 km AGL flight profile falls in the high-altitude regime, where the same paper notes RGB sat-matching "works reasonably well" (refs [5][6]).
|
||||
- Lesson: "ability to recover from periods of high uncertainty and re-localize is more critical than maintaining consistently low instantaneous RMSE." Direct architectural input for AC-NEW-2 / AC-NEW-8.
|
||||
- Lesson: IMU-from-airframe vibration isolation is mission-critical for VIO usability.
|
||||
- Lesson: magnetometer is unreliable near steel-reinforced structures; sensor-fusion is essential for heading robustness.
|
||||
- **Related Sub-question**: SQ1 + SQ5 (failure modes for VIO/SLAM at speed) + SQ2 (canonical pipeline)
|
||||
|
||||
### Source #29
|
||||
- **Title**: Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints
|
||||
- **Link**: https://arxiv.org/abs/2506.09748 (PDF: https://arxiv.org/pdf/2506.09748)
|
||||
- **Tier**: L1 (peer-submitted preprint, IEEE-bound, with public CS-UAV dataset)
|
||||
- **Publication Date**: June 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (within 6-month critical-novelty window for SOTA claims)
|
||||
- **Version Info**: arXiv v1 (2506.09748v1)
|
||||
- **Target Audience**: Academic SOTA researchers + UAV-localization implementers
|
||||
- **Research Boundary Match**: **Full match** — exact same problem (UAV absolute visual localization in GNSS-denied conditions, downward-facing camera, satellite reference)
|
||||
- **Summary**: 2025 SOTA pipeline: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation), (2) Semantic-Aware and Structure-Constrained Matching Module using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv, (3) lightweight fine-grained module for pixel-level. Constructs UAV absolute visual-loc pipeline **without VIO/relative-loc dependence** (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV. **Direct relevance**: this is a candidate template for our C2 (VPR) + C3 (cross-domain registration) components, but DINOv2 is a heavyweight foundation model — must be benchmarked under our 25 W / 8 GB Jetson Orin Nano envelope before selection (handed off to SQ3/SQ4 + SQ5 for that component).
|
||||
- **Related Sub-question**: SQ1 (academic SOTA), SQ3+SQ4 (C2/C3 candidates), SQ5 (Jetson-on-Foundation-Model failure mode)
|
||||
|
||||
### Source #30
|
||||
- **Title**: Raptor — GPS-Denied UAV Navigation & Coordinate Extraction (Vantor product page; Guide / Sync / Ace suite)
|
||||
- **Link**: https://www.vantor.com/product/mission-solutions/raptor/
|
||||
- **Tier**: L2 (vendor product spec; primary for the product itself, not for independent benchmark numbers)
|
||||
- **Publication Date**: live (accessed 2026-05-07; references Mar 2026 + Dec 2025 + Sep 2025 partner blog posts indicating active product line)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense / commercial / industrial UAV integrators
|
||||
- **Research Boundary Match**: **Full match** — vision-based aerial position software using existing camera + 3D terrain data, deployable on commodity hardware
|
||||
- **Summary**: Vantor Raptor product family: **Guide** (on-drone vision-based positioning, demonstrated <7 m absolute accuracy in all dimensions, day/night/low-altitude, runs on commodity HW); **Sync** (georegisters live drone video against 3D terrain in real time, <3 m coordinate extraction); **Ace** (laptop-side coordinate extraction at <3 m). Backbone: Vantor's "100 million-plus sq km of highly accurate 3D terrain data, regularly updated" (Vivid Terrain, 3 m accuracy). Inertial Labs partnership (VINS-integrated Raptor Guide). Use cases include joint multi-domain ops, large-scale autonomous delivery, search-and-rescue. **This is the closest production-grade commercial peer to the project's architecture (sat-basemap-as-service + on-drone vision).**
|
||||
- **Related Sub-question**: SQ1 (commercial), SQ3+SQ4 (commercial alternatives to building C2/C3 ourselves), SQ8 (basemap as a service vs offline cache)
|
||||
|
||||
### Source #31
|
||||
- **Title**: Auterion successfully completes Artemis program to deliver long-range deep strike drone (press release)
|
||||
- **Link**: https://auterion.com/auterion-successfully-completes-artemis-program-to-deliver-long-range-deep-strike-drone/
|
||||
- **Tier**: L1 (official vendor press release)
|
||||
- **Publication Date**: 2025-10-15 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense-procurement; UAV-integration architects
|
||||
- **Research Boundary Match**: **Full match** — fixed-wing-class one-way attack drone with Ukraine-validated GPS-denied navigation; the system architecture is directly comparable
|
||||
- **Summary**: Auterion Artemis (DIU project, completed Oct 2025) = Shahed-style design developed in Ukraine; up to 1,000-mile range; up to 40 kg warhead; runs on Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. **Establishes that the integration pattern (companion-class autopilot + visual navigation + terminal guidance) is shipping at production scale to a US defense customer.** Open architecture, manufacturing in US/UA/DE.
|
||||
- **Related Sub-question**: SQ1
|
||||
|
||||
### Source #32
|
||||
- **Title**: Bring AI and computer vision to small autonomous systems — Auterion Skynode S product page
|
||||
- **Link**: https://auterion.com/product/skynode-s
|
||||
- **Tier**: L2 (vendor product spec)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Small-UAS integrators
|
||||
- **Research Boundary Match**: Full match (companion-class autopilot with NPU)
|
||||
- **Summary**: Auterion Skynode S = compact mission computer with **dedicated Neural Processing Unit** for AI / computer-vision applications on small UAS. Architecturally it occupies the same niche as our Jetson Orin Nano Super (companion compute + autopilot integration), but with Auterion's PX4 fork pre-integrated. Hardware/runtime envelope is comparable; the product establishes that this is a product category, not a one-off integration.
|
||||
- **Related Sub-question**: SQ1, SQ7 (alternate companion HW for adjacent context)
|
||||
|
||||
### Source #33
|
||||
- **Title**: snktshrma/ngps_flight — Next-Generation Positioning System for ArduPilot (GSoC 2024)
|
||||
- **Link**: https://github.com/snktshrma/ngps_flight (sibling: https://github.com/snktshrma/ap_nongps)
|
||||
- **Tier**: L1 (open-source code repository, published GSoC project under ArduPilot organisation)
|
||||
- **Publication Date**: GSoC 2024 timeframe (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: GSoC 2024 prototype (research-grade, not production firmware)
|
||||
- **Target Audience**: ArduPilot integrators building visual-positioning companion stacks
|
||||
- **Research Boundary Match**: **Full match — closest open-source peer to our exact pipeline.** ArduPilot, downward-facing camera, satellite-image reference, deep-learning matching, fused with VIO, fed back to autopilot.
|
||||
- **Summary**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**); **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates); **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF via `VISION_POSITION_ESTIMATE` (per the related issue #23471 framing). **This is the architectural template** the project should explicitly compare against — same component split as our C1+C2+C3+C5+C8 stack.
|
||||
- Caveats: (a) GSoC prototype, not production-hardened; (b) uses `VISION_POSITION_ESTIMATE` which on AP requires EKF source set 2/3 with EK3_SRC*_POSXY=Vision; our SQ6 conclusion picked `GPS_INPUT` as primary AP path because it carries `horiz_accuracy` directly and supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET` — must compare the trade-off in design phase; (c) no documented spoofing-defence integration; (d) no documented covariance-honesty contract.
|
||||
- **Related Sub-question**: SQ1 (closest open-source peer), SQ2 (canonical-pipeline confirmation), SQ3+SQ4 (architectural template for component selection), SQ6 (alternate AP transport: `VISION_POSITION_ESTIMATE` vs `GPS_INPUT`)
|
||||
|
||||
### Source #34
|
||||
- **Title**: AerialExtreMatch — A Benchmark for Extreme-View Image Matching and Localization (project page + GitHub + Hugging Face dataset)
|
||||
- **Link**: https://xecades.github.io/AerialExtreMatch/ ; https://github.com/Xecades/AerialExtreMatch ; https://huggingface.co/datasets/Xecades/AerialExtreMatch-Localization
|
||||
- **Tier**: L1 (peer-reviewed benchmark with public dataset, code, model checkpoints; OpenReview submission)
|
||||
- **Publication Date**: 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Academic + practitioner image-matching evaluators
|
||||
- **Research Boundary Match**: **Full match** for cross-source UAV-satellite image matching evaluation
|
||||
- **Summary**: 2025 benchmark with: 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); ~30,000 evaluation pairs in 32 difficulty levels stratified by overlap (4 bins: <20/20-40/40-60/>60%), pitch difference (4 bins: 50–55, 55–60, 60–65, 65–70°), and scale (2 bins: 1-2×, >2×); a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. Evaluates 16 representative detector-based + detector-free image matching methods. **This is the academic benchmark our C2+C3 candidate selection must publish numbers against.**
|
||||
- **Related Sub-question**: SQ1 (academic landscape), SQ7 (datasets)
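The 32 difficulty levels in the summary are the Cartesian product of the three stratification axes (4 overlap bins × 4 pitch-difference bins × 2 scale bins). A short sketch makes the enumeration explicit; the bin labels are copied from the summary, while the constant and function names are illustrative only.

```python
from itertools import product

# Bin labels as listed in the summary above.
OVERLAP_BINS = ["<20%", "20-40%", "40-60%", ">60%"]
PITCH_DIFF_BINS = ["50-55°", "55-60°", "60-65°", "65-70°"]
SCALE_BINS = ["1-2x", ">2x"]


def difficulty_levels():
    """Enumerate every (overlap, pitch difference, scale) combination."""
    return list(product(OVERLAP_BINS, PITCH_DIFF_BINS, SCALE_BINS))


levels = difficulty_levels()
print(len(levels))  # 4 * 4 * 2 combinations
```

Reporting C2+C3 candidate numbers per level, rather than as one aggregate, keeps the hardest cells (low overlap, large pitch difference, large scale change) visible.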
|
||||
|
||||
### Source #35
|
||||
- **Title**: DARPA Fast Lightweight Autonomy (FLA) program page + Test-and-Evaluation review (arXiv 2504.08122)
|
||||
- **Link**: https://www.darpa.mil/research/programs/fast-lightweight-autonomy ; https://arxiv.org/abs/2504.08122
|
||||
- **Tier**: L1 (DARPA program page + 2025 academic review of program results)
|
||||
- **Publication Date**: program 2015–2018 (concluded); review 2025-04 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Foundational reference; review is current (within 18-month authority window)
|
||||
- **Target Audience**: Defense-program historians + indoor-low-altitude GPS-denied autonomy researchers
|
||||
- **Research Boundary Match**: **Partial — different regime.** FLA = small quadcopters at ≤20 m/s in cluttered indoor/outdoor with onboard sensing only, no satellite-tile basemap. Our project is fixed-wing, ~17 m/s, 1 km AGL, with sat-tile basemap.
|
||||
- **Summary**: Foundational US-defense lineage for GPS-denied autonomy (2015–2018, complete). Set the template for "small UAV + onboard sensors + onboard compute → autonomous obstacle-avoidance + navigation without datalink/GPS". Phase 1 in Florida 2017; Phase 2 in Georgia 2018. The 2025 retrospective (arXiv 2504.08122) reviews FLA's testing methodology and Phase 1 results. A companion 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) requests visual-position-and-navigation capability for sUAS in GPS-denied environments, so the procurement tailwind is now active.
|
||||
- **Related Sub-question**: SQ1 (defense-program lineage)
|
||||
|
||||
### Source #36
|
||||
- **Title**: DSMAC / TERCOM lineage — DTIC ADA315439 (Scene Matching Missile Guidance Technologies) + Wikipedia / SPIE references
|
||||
- **Link**: https://apps.dtic.mil/sti/tr/pdf/ADA315439.pdf ; https://en.wikipedia.org/wiki/DSMAC ; https://www.spiedigitallibrary.org/conference-proceedings-of-spie/0238/1/Terrain-Contour-Matching-TERCOM-A-Cruise-Missile-Guidance-Aid/10.1117/12.959127.short
|
||||
- **Tier**: L1 (DTIC unclassified technical report) + L2 (encyclopedia/SPIE proceedings)
|
||||
- **Publication Date**: DTIC: 1996; SPIE: 1980; Wikipedia: live
|
||||
- **Timeliness Status**: Foundational baseline (no time window per Step 0.5 — established classical algorithms)
|
||||
- **Target Audience**: Cruise-missile-class designers; analogues for downward-vision navigation
|
||||
- **Research Boundary Match**: **Partial — different regime** (cruise missile, terminal guidance). Architectural pattern (pre-cached scene reference + downward camera + correlation matching) is the direct ancestor of our C3 pipeline.
|
||||
- **Summary**: DSMAC = electro-optical camera correlated against pre-stored reference scenes (often from satellite reconnaissance), achieving 3–10 m terminal accuracy. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight; DSMAC for terminal. CEP without DSMAC: ~30 m; with DSMAC: "only meters". Gulf War 1991: >80% of 280 launched Tomahawks hit target. **Establishes that downward-vision-against-pre-stored-imagery is a 40+ year-old well-characterised technique class with documented accuracy bounds; our project's claim of <500 m / 99.9% reliability is achievable in the same technique class.**
|
||||
- **Related Sub-question**: SQ1 (lineage), SQ8 (baseline accuracy expectations)
|
||||
|
||||
### Source #37
|
||||
- **Title**: Electronic Warfare in Ukraine: The Invisible Battle — Ukraine War Analytics
|
||||
- **Link**: https://ukraine-war-analytics.com/analysis/electronic-warfare-ukraine.html
|
||||
- **Tier**: L3 (analytical aggregator; primary-source numbers cite vendor / OSINT reports)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (operational-context reference)
|
||||
- **Target Audience**: Ukraine-deployment practitioners
|
||||
- **Research Boundary Match**: Full match (operational geography, threat environment)
|
||||
- **Summary**: Operational-context anchor: Russian EW systems including Pole-21 GPS jammers (25+ km range) plus spoofing capabilities have driven ~70% of small-tactical-UAV losses to EW across the conflict. Twist Robotics' OSCAR cites the same approximate number (~75% of small tactical UAV losses to EW at the front per Source #25). **Confirms the demand-side number is consistent across two independent reporting chains.**
|
||||
- **Related Sub-question**: SQ1 (Ukraine practitioner perspective)
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Canonical pipeline decomposition
|
||||
|
||||
### Source #38
|
||||
- **Title**: Visual Place Recognition for Aerial Imagery: A Survey (Moskalenko, Kornilova, Ferrer — Skoltech)
|
||||
- **Link**: https://arxiv.org/abs/2406.00885 (v2)
|
||||
- **Tier**: L1 (peer-reviewed survey, accepted in Robotics and Autonomous Systems; companion benchmark code: https://github.com/prime-slam/aero-vloc)
|
||||
- **Publication Date**: arXiv 2024-06; v2 update through 2024
|
||||
- **Timeliness Status**: Currently valid (within 18-month authority window for established surveys; specific candidate latency numbers will need cross-validation against newer Jetson-class hardware reports)
|
||||
- **Target Audience**: Aerial-VPR practitioners + UAV navigation system architects
|
||||
- **Research Boundary Match**: **Full match** for the offline-cache visual geo-localization decomposition (aerial-nadir UAV vs. satellite tile basemap)
|
||||
- **Summary**: Authoritative two-stage pipeline definition (verbatim): "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: **global localization (or Visual Place Recognition, VPR) and local alignment**. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Re-ranking is treated as an integral sub-stage of VPR for aerial data because of agricultural/urban grid repetition. Local alignment = SuperPoint/keypoint detector → LightGlue/SuperGlue/SelaVPR matcher → cv2.findHomography → cv2.perspectiveTransform → Web-Mercator coordinate conversion. **Practitioner-critical runtime numbers (RTX 3090, NOT Jetson)**: AnyLoc descriptor calculation = 0.37–0.84 s/frame (huge ViT-G DINOv2); MixVPR / SALAD = 0.05–0.20 s; SelaVPR = 0.04 s; SuperGlue re-rank = 15–25 s on top-100 candidates; LightGlue re-rank = ~1 s; SelaVPR re-rank = <0.1 s. Memory: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles; SelaVPR = <0.2 GB. Final commentary: "While our methodology alone may not provide comprehensive robustness, it can be effectively augmented with additional sensors, such as inertial measurement units (IMUs). This integration enhances its utility for Visual Inertial Odometry (VIO) and Simultaneous Localization and Mapping (SLAM) systems, particularly for periodic location refinement and loop closure tasks. Additionally, our methodology could serve as a dependable emergency localization fallback in the event of an unexpected GNSS signal loss." → **Validates the project's IMU/VIO + sat-anchor architecture as the canonical extension of the survey's two-stage core.**
|
||||
- **Related Sub-question**: SQ2 (canonical decomposition), SQ3+SQ4 (C2/C3 candidate latency budgets), SQ5 (foundation-model-on-Jetson failure mode)
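The tail of the local-alignment chain in the summary (homography → point transform → Web-Mercator conversion) can be sketched in plain Python. This is a minimal illustration, not the survey's code: it assumes the homography `H` has already been estimated by the matcher stage, that `H` maps query-frame pixels to pixels of a single reference tile, and that tiles use the common 256-pixel slippy-map scheme. All function names are hypothetical.

```python
import math

TILE_SIZE = 256  # assumption: standard slippy-map tile size in pixels


def apply_homography(H, x, y):
    """Map one point through a 3x3 homography given as nested lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def global_pixel_to_latlon(px, py, zoom):
    """Convert Web-Mercator global pixel coordinates to (lat, lon) in degrees."""
    n = TILE_SIZE * 2 ** zoom
    lon = px / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * py / n))))
    return lat, lon


def locate_frame_centre(H, frame_w, frame_h, tile_x, tile_y, zoom):
    """Project the query frame centre into the matched tile, then to lat/lon."""
    u, v = apply_homography(H, frame_w / 2.0, frame_h / 2.0)
    return global_pixel_to_latlon(tile_x * TILE_SIZE + u, tile_y * TILE_SIZE + v, zoom)
```

In a real pipeline the first function would typically be replaced by the `cv2.perspectiveTransform` call named in the summary; it is written out here only to keep the sketch dependency-free.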
|
||||
|
||||
### Source #39
|
||||
- **Title**: Cross-View Geo-Localization: A Survey (Durgam, Paheding, Dhiman, Devabhaktuni — U. Maine / Fairfield / ISU)
|
||||
- **Link**: https://arxiv.org/abs/2406.09722 (v1)
|
||||
- **Tier**: L1 (peer-style preprint, journal-bound — Expert Systems with Applications)
|
||||
- **Publication Date**: arXiv 2024-06
|
||||
- **Timeliness Status**: Currently valid (≤18 months for survey-of-deep-learning architectures)
|
||||
- **Target Audience**: Cross-view (ground↔aerial) geo-localization researchers; partial overlap with our aerial↔satellite pipeline
|
||||
- **Research Boundary Match**: **Partial — different cross-view setup** (the survey focuses on ground panorama → aerial overhead; ours is aerial nadir → satellite ortho). The pipeline-shape lessons transfer; the polar-transform / Siamese-network / GAN-based view-synthesis lessons do NOT directly apply because our two views are both top-down.
|
||||
- **Summary**: Confirms the canonical pipeline decomposition (feature extraction → cross-view matching → similarity-driven retrieval) is the dominant pattern across 2015–2024 SOTA. Establishes the historical lineage: pixel-wise (Sheikh 2003) → feature-based (Lin 2013) → CNN/triplet-loss (Tian 2017) → Siamese+GAN (Hu 2018) → polar-transform (Shi 2019) → CosPlace/EigenPlaces (2022–2023) → DINOv2-class (AnyLoc 2023) → Transformer-only (TransGeo 2022, MGTL 2022) → multi-method fusion (2023+). Backbone comparison table establishes that ViT/DINOv2 is the current SOTA backbone; ResNet-class is the established production baseline; SIFT/SURF/PHOW remain the handcrafted baseline. **Confirms our component-area split (C2 VPR + C3 cross-domain matching) is canonical and matches the survey's two-axis organization (backbone × matching strategy).**
|
||||
- **Related Sub-question**: SQ2 (decomposition lineage), SQ3+SQ4 (C2 candidate landscape)
|
||||
|
||||
### Source #40
|
||||
- **Title**: OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata (Dhaouadi, Marin, Meier, Kaiser, Cremers — DeepScenario / TU Munich / MCML)
|
||||
- **Link**: https://arxiv.org/abs/2509.18350 ; project page https://deepscenario.github.io/OrthoLoC
|
||||
- **Tier**: L1 (peer-style preprint with public dataset, code, model checkpoints; 16,425 UAV images Germany+US, full 6-DoF ground truth)
|
||||
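- **Publication Date**: arXiv 2025-09 (about 8 months before the 2026-05-07 access date recorded for other sources in this registry)
|
||||
- **Timeliness Status**: Currently valid (just outside the 6-month critical-novelty window for SOTA aerial-localization claims at a 2026-05-07 access date; well inside the 18-month authority window)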
|
||||
- **Target Audience**: UAV-localization implementers + system architects building on Digital Orthophotos (DOP) + Digital Surface Models (DSM)
|
||||
- **Research Boundary Match**: **Full match — direct paradigm match** to our project: "lightweight orthographic representations" instead of 3D meshes; "increasingly accessible through free releases by governmental authorities"; "no internet connection or GNSS/GPS support" — exactly the project's constraint envelope.
|
||||
- **Summary**: **Most directly applicable SQ2 source.** Defines the 6-DoF localization pipeline using 2.5D geodata: (1) match query UAV image against DOP (orthophoto raster) using state-of-the-art matchers; (2) lift each 2D match in the DOP to 3D using the corresponding DSM elevation; (3) PnP+RANSAC (RANSAC-EPnP, 5-pixel inlier threshold) → initial pose; (4) Levenberg-Marquardt joint refinement of intrinsics + extrinsics; (5) **AdHoP refinement**: estimate homography from initial 2D-2D correspondences via DLT+RANSAC, warp the DOP to better match the query's perspective, re-match, map back via H⁻¹, lift to 3D, refine pose; accept refinement only if reprojection error decreases. **Quantitative results** on 16.4k images, 47 locations: best matcher = GIM+DKM achieves 75.4% recall at 1m-1° threshold (sparse SP+SG = 64.4%, sparse SP+LG = 64.2%, MASt3R = 63.5%, RoMa+AdHoP = 54.6%, XFeat*+AdHoP = 59.8%; LoFTR / eLoFTR / XoFTR all <23% recall). AdHoP yields ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods (XFeat* → 95% matching improvement; DKM → 63% translation reduction; RoMa → 1m-1° recall +23%). **Performance factors** explicitly characterized: (a) **cross-domain DOPs (visual gap only) cause ~3× translation error increase** even on best method; (b) **cross-domain DOPs+DSMs (visual + structural gap) cause ~7× translation error increase** (0.16 m → 1.12 m for GIM+DKM+AdHoP) — **this is exactly the war-zone scene-change scenario AC-3.x covers**; (c) **20% covisibility floor** between query and reference; below it localization fails; (d) **Calibration is fundamentally ambiguous** between focal length and translation → camera intrinsics MUST be calibrated upstream, not jointly optimized in flight. (e) Resolution: scaling images to 30% of original (~300 px) still works; geodata at 13 m/pixel is the floor, with degradation below.
|
||||
- **Related Sub-question**: SQ2 (canonical pipeline + AdHoP refinement loop), SQ3+SQ4 (C3 matcher candidate ranks), SQ5 (war-zone scene-change failure mode), SQ8 (covisibility safety gate)
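Steps (2) and (5) of the pipeline in the summary can be sketched as follows: lifting a matched DOP pixel to a 3D point using the DSM, and the accept-only-if-better gate applied to the AdHoP refinement. This is a simplified illustration under stated assumptions, not OrthoLoC's implementation: the raster is assumed north-up with a single ground sampling distance, the DSM is sampled by nearest neighbour, and `GeoRaster`, `lift_to_3d` and `accept_refinement` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeoRaster:
    """Hypothetical north-up DOP/DSM pair sharing one pixel grid."""
    origin_east: float     # easting of the top-left pixel, metres
    origin_north: float    # northing of the top-left pixel, metres
    gsd: float             # ground sampling distance, metres per pixel
    dsm: List[List[float]]  # elevation per pixel, rows from north to south


def lift_to_3d(raster: GeoRaster, col: float, row: float):
    """Turn a 2D match location in the DOP into an (east, north, up) point."""
    east = raster.origin_east + col * raster.gsd
    north = raster.origin_north - row * raster.gsd  # rows grow southwards
    up = raster.dsm[int(round(row))][int(round(col))]  # nearest-neighbour sample
    return east, north, up


def accept_refinement(initial_reproj_err: float, refined_reproj_err: float) -> bool:
    """Keep the refined pose only if the reprojection error decreased."""
    return refined_reproj_err < initial_reproj_err
```

The lifted points, paired with their query-image pixels, are what the PnP+RANSAC stage would consume; the gate keeps a failed warp-and-rematch pass from degrading an already acceptable pose.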
|
||||
|
||||
### Source #41
|
||||
- **Title**: Exploring the best way for UAV visual localization under Low-altitude Multi-view Observation Condition: a Benchmark — AnyVisLoc (Ye, Teng, Chen, Li, Liu, Yu, Tan — NUDT / Macao Polytechnic)
|
||||
- **Link**: https://arxiv.org/abs/2503.10692 ; benchmark code https://github.com/UAV-AVL/Benchmark
|
||||
- **Tier**: L1 (peer-style preprint with public 18,000-image dataset across 15 Chinese cities, multi-pitch / multi-altitude / multi-scene, with both aerial-photogrammetry AND satellite reference maps)
|
||||
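- **Publication Date**: arXiv 2025-03 (about 14 months before the 2026-05-07 access date recorded for other sources in this registry: outside the 6-month critical-novelty window, inside the 18-month authority window)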
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Aerial AVL practitioners; UAV-system designers facing pitch/altitude/yaw uncertainty
|
||||
- **Research Boundary Match**: **Partial — different altitude regime** (the benchmark covers 30–300 m AGL, ours is ~1 km AGL); pitch range is 20–90° (ours is mostly nadir, ~80–90°). Lessons on the **pipeline structure, retrieval-vs-matching trade-offs, sensor-prior noise tolerance, and aerial-vs-satellite reference-map gap** transfer directly.
|
||||
- **Summary**: Independently confirms the SAME pipeline as Source #40: image retrieval (rough position) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC. Best baseline = CAMP (retrieval) + RoMa (dense matcher) + Top-N re-rank → 74.1% A@5m on aerial photogrammetry map, 18.5% A@5m on satellite map (ALOS 30m DSM). **Critical AC-quantitative findings**: (a) **Aerial map vs satellite map**: 4× accuracy gap at A@5m (74.1% vs 18.5%), driven by satellite-DSM coarseness (ALOS 30m vs aerial 0.94m) and modality difference. **Direct relevance**: project's offline cache is satellite tiles ≥0.5 m/px without DSM; this places us between the two data points (better than ALOS 30m, worse than aerial photogrammetry), so exact accuracy must be re-established once tile resolution is pinned. (b) **Yaw prior noise**: σ ≤ 5° → no impact; σ = 10° → 1.9% A@5m drop; σ = 30° → 4.1% drop; σ = 50° → 13.7% drop; σ = 60° → 25.7% drop. **Implication for project's C1+C5+IMU**: companion-side yaw estimate must hold σ < 10°. (c) **Pitch prior noise**: σ < 5° → no impact; σ ≥ 7° causes ~1–5% drops. (d) **Pitch angle**: smaller pitch (more oblique) → lower accuracy; nadir is best. Project's nadir-fixed camera at 1 km AGL is consistent with the benchmark's most-favourable regime. (e) **Sparse vs dense matchers**: SP+LightGlue+GIM+k2s = 75.4% A@10m at 105 ms/frame; RoMa = 81.3% A@10m at 659 ms/frame. **Implication for project's C7 Jetson runtime**: the dense matcher is only ~6 percentage points more accurate at A@10m but ~6× slower → SP+LightGlue-class is the production sweet spot under our 400 ms budget. (f) **Re-ranking strategy**: Top-N re-rank by inlier count = best accuracy/cost trade-off (62.2% A@5m at 0.8 s/frame on RTX 3090). Match-without-retrieval = catastrophic (34.3% A@5m, search-space too large).
|
||||
- **Related Sub-question**: SQ2 (pipeline + sensor-prior tolerance), SQ3+SQ4 (C2 retrieval-vs-matcher trade-offs, C5 IMU prior contract), SQ5 (war-zone reference-map staleness failure mode), SQ7 (aerial-vs-satellite reference benchmarks)
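The sparse-versus-dense comparison in finding (e) is simple arithmetic, worked through below. The accuracy and latency figures are the ones quoted in the summary; the 400 ms budget is the project figure named there. The timings come from the benchmark's own hardware, not a Jetson, so the selection is illustrative rather than a runtime prediction. The dictionary layout and function name are assumptions.

```python
# Figures quoted in the summary (A@10m in percent, latency in ms per frame).
MATCHERS = {
    "SP+LightGlue+GIM+k2s": {"a_at_10m": 75.4, "ms_per_frame": 105},
    "RoMa": {"a_at_10m": 81.3, "ms_per_frame": 659},
}
BUDGET_MS = 400  # project latency budget named in the summary


def best_within_budget(matchers, budget_ms):
    """Return the most accurate matcher whose latency fits the budget, or None."""
    eligible = {n: m for n, m in matchers.items() if m["ms_per_frame"] <= budget_ms}
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name]["a_at_10m"])


sparse = MATCHERS["SP+LightGlue+GIM+k2s"]
dense = MATCHERS["RoMa"]
accuracy_gap_points = round(dense["a_at_10m"] - sparse["a_at_10m"], 1)
slowdown = round(dense["ms_per_frame"] / sparse["ms_per_frame"], 1)
choice = best_within_budget(MATCHERS, BUDGET_MS)
```

On these numbers the dense matcher buys about 5.9 percentage points of A@10m for roughly 6.3× the latency, and only the sparse configuration fits a 400 ms budget.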
|
||||
|
||||
### Source #42
|
||||
- **Title**: Survey on absolute visual localization techniques for low-altitude unmanned aerial vehicles (Ye, Chen, Teng, Li, Yang, Song, Yu — NUDT, College of Aerospace Science)
|
||||
- **Link**: https://www.sciopen.com/article/10.11887/j.issn.1001-2486.25120033 ; DOI 10.11887/j.issn.1001-2486.25120033
|
||||
- **Tier**: L1 (peer-reviewed Chinese journal — Journal of National University of Defense Technology, vol 48 issue 2, 2026; same lab as Source #41 with overlapping authorship — confirmed cross-validation, not duplicative)
|
||||
- **Publication Date**: 2026-04-01 (within 6-month critical-novelty window)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: UAV-system architects + Chinese-defense-research community
|
||||
- **Research Boundary Match**: **Full match** (low-altitude UAV AVL is the survey's exact subject)
|
||||
- **Summary**: Survey-level confirmation of the canonical "**retrieval-matching-pose estimation**" hierarchical framework. Verbatim claim: "the hierarchical framework balances search efficiency, positioning accuracy, and scene generalization, becoming a robust technical path for low-altitude long-endurance absolute localization." Compares the framework against alternatives that are explicitly rejected: (a) relative visual localization (cumulative errors — VIO/SLAM only); (b) end-to-end direct localization (poor generalization); (c) map-free localization (scene-dependent). Sub-component evolution per stage: (a) retrieval = template-matching (SAD/SSD/NCC) → BoW/VLAD → deep-learning (annular/dense feature segmentation, contrastive InfoNCE, self-supervised); (b) matching = SIFT/SURF/ORB → SuperPoint+LightGlue/RoMa (sparse / semi-dense / dense); (c) pose estimation = PnP variants + RANSAC + IMU prior fusion. **Identifies four open challenges** that align with project risks: (i) cross-domain generalization (war-zone scene change); (ii) real-time inference on edge platforms (Jetson); (iii) robustness to complex environments (cropland, snow, low texture); (iv) high-quality datasets (the same gap our project's AC-NEW-7 / cache provisioning works around). **Lightweight-model-design-for-edge-deployment is named as a primary future-research direction** — directly validates project's Jetson Orin Nano constraint as a recognized field-level challenge, not a project-specific oddity.
|
||||
- **Related Sub-question**: SQ2 (framework canonicalness), SQ3+SQ4 (per-component evolution), SQ5 (named open challenges align with project risks)
|
||||
|
||||
---
|
||||
|
||||
## SQ3+SQ4 / C1 (Visual / Visual-Inertial Odometry) — Candidate enumeration
|
||||
|
||||
### Source #43
|
||||
- **Title**: VINS-Mono — A Robust and Versatile Monocular Visual-Inertial State Estimator (HKUST-Aerial-Robotics)
|
||||
- **Link**: https://github.com/HKUST-Aerial-Robotics/VINS-Mono ; LICENCE: https://github.com/HKUST-Aerial-Robotics/VINS-Mono/blob/master/LICENCE
|
||||
- **Tier**: L1 (canonical reference implementation; published in IEEE T-RO 2018 by Qin, Li, Shen)
|
||||
- **Publication Date**: original 2018; repository last meaningful update 2024-02-25 (per GitHub commit log; 2024-05-23 simulation-data commit only)
|
||||
- **Timeliness Status**: ⚠️ **Borderline.** ~26 months since the last meaningful master-branch commit (2024-02-25) at access time (2026-05-07). Established baseline that does NOT trigger Step 0.5's 18-month timeliness rejection because (a) IEEE T-RO publication is the canonical authority for the algorithm, (b) downstream forks (vins-mono-android, embedded variants) keep the algorithm class actively deployed.
|
||||
- **Version Info**: No GitHub releases / tags (master-branch-only project). Stars 5,829.
|
||||
- **Target Audience**: Mono+IMU VIO implementers; UAV state estimation researchers
|
||||
- **Research Boundary Match**: **Full match for the candidate's pinned mode** — monocular camera + IMU producing 6-DoF metric pose. The VINS-Mono README explicitly names this configuration as primary.
|
||||
- **Summary**: Optimization-based sliding-window monocular VIO. Features: efficient IMU pre-integration (Forster et al. 2017), automatic initialization, online camera-IMU extrinsic calibration, online camera-IMU temporal calibration, failure detection + recovery, loop detection (DBoW2-based), global pose graph optimization. Output is metric-scale 6-DoF pose at IMU rate (typically 100–200 Hz) with covariance from the optimization Hessian. **License: GPL-3.0 (copyleft viral)** — every binary distribution requires source disclosure for the entire linked binary; relevant for dual-use deployment if the companion image is sold or transferred to a customer.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
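The timeliness figures in these entries are whole-month differences between a last-update date and the access date, compared against a window. A small helper makes the arithmetic reproducible; the 18-month default mirrors the Step 0.5 window referenced above, and the function names are illustrative, not part of the research tooling.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def timeliness(last_update: date, accessed: date, window_months: int = 18):
    """Return (age in months, whether the age falls inside the window)."""
    age = months_between(last_update, accessed)
    return age, age <= window_months


# VINS-Mono: last meaningful commit 2024-02-25, accessed 2026-05-07.
age, inside = timeliness(date(2024, 2, 25), date(2026, 5, 7))
```

An entry that falls outside the window can still be retained, as here, when the Timeliness Status records an explicit justification.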
|
||||
|
||||
### Source #44
|
||||
- **Title**: VINS-Fusion — Optimization-based multi-sensor state estimator (HKUST-Aerial-Robotics)
|
||||
- **Link**: https://github.com/HKUST-Aerial-Robotics/VINS-Fusion ; LICENCE: https://github.com/HKUST-Aerial-Robotics/VINS-Fusion/blob/master/LICENCE
|
||||
- **Tier**: L1 (canonical reference; superset of VINS-Mono)
|
||||
- **Publication Date**: original 2019 (Qin, Cao, Pan, Shen — ICRA workshop / IROS); repository last update 2024-05-23
|
||||
- **Timeliness Status**: ⚠️ **Borderline.** ~24 months since the last update at access time. Same Step-0.5 reasoning as VINS-Mono — established class.
|
||||
- **Version Info**: master-branch-only. Stars 4,476. Top-ranked open-source stereo-VIO on KITTI Odometry as of January 2019.
|
||||
- **Target Audience**: Multi-sensor VIO implementers (mono+IMU, stereo, stereo+IMU, +GPS fusion)
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. VINS-Fusion README explicitly enumerates four sensor configurations (mono+IMU, stereo, stereo+IMU, +GPS toy example).
|
||||
- **Summary**: Superset of VINS-Mono adding stereo and GPS-fusion modes. Same algorithmic core (sliding-window optimization with IMU pre-integration). Online spatial + temporal camera-IMU calibration; visual loop closure; ROS Kinetic/Melodic build dependency. **License: GPL-3.0** — same dual-use distribution constraint as VINS-Mono. Independent KAIST benchmark (Source #46) found VINS-Fusion CPU mode + VINS-Fusion-imu **fail to run** on Jetson TX2 (insufficient memory and CPU); GPU-accelerated VINS-Fusion-gpu does run on TX2. Implication for project: VINS-Fusion-imu on Jetson Orin Nano Super is feasible but not certain; needs MVE.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
|
||||
|
||||
### Source #45
|
||||
- **Title**: OpenVINS — An open source platform for visual-inertial navigation research (Robot Perception and Navigation Group, U. of Delaware — rpng)
|
||||
- **Link**: https://github.com/rpng/open_vins ; docs: https://docs.openvins.com/ ; LICENSE: https://github.com/rpng/open_vins/blob/master/LICENSE
|
||||
- **Tier**: L1 (canonical research implementation; ICRA 2020 paper Geneva, Eckenhoff, Lee, Yang, Huang)
|
||||
- **Publication Date**: original 2020; latest tagged release v2.7 = 2023-06; ongoing master-branch commits through 2024–2025 (latest issue threads through Feb 2025)
|
||||
- **Timeliness Status**: ✅ Currently valid (master branch active; the latest tagged release is ~35 months old, but the library is in stable/maintenance mode with continued issue triage).
|
||||
- **Version Info**: Stars 2,828; 30 contributors; 12 releases. v2.7 is the current tagged stable.
|
||||
- **Target Audience**: MSCKF/EKF VIO implementers; researchers needing a reference MSCKF
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. OpenVINS supports mono, stereo, multi-camera (1–N cameras) + IMU; mono is a documented first-class mode.
|
||||
- **Summary**: Modular MSCKF (Multi-State Constraint Kalman Filter) implementation built around an Extended Kalman filter that fuses inertial state with sparse visual feature tracks via the sliding-window MSCKF formulation (Mourikis & Roumeliotis 2007). Supports SLAM features (in-state landmarks) plus pure MSCKF features (out-of-state). ROS1 + ROS2 (Humble) builds documented; Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble compilation **confirmed working** by community contributors (rpng/open_vins issue #421, fdcl-gwu/openvins_jetson_realsense Nov 2025 setup guide). **License: GPL-3.0** — same dual-use distribution constraint. Reported latency ~270 ms on Xavier NX (4-core, ARM, 40% CPU usage) per issue #164; needs Jetson-Orin-Nano-Super MVE for production budget verification.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
|
||||
|
||||
### Source #46
|
||||
- **Title**: Run Your Visual-Inertial Odometry on NVIDIA Jetson — Benchmark Tests on a Micro Aerial Vehicle (Jeon, Jung, Lee, Choi, Myung — KAIST)
|
||||
- **Link**: https://arxiv.org/abs/2103.01655 ; KAIST VIO dataset: https://github.com/zinuok/kaistviodataset
|
||||
- **Tier**: L1 (peer-reviewed conference, IROS-track preprint with public dataset)
|
||||
- **Publication Date**: arXiv 2021-03-02
|
||||
- **Timeliness Status**: ⚠️ Older than the 18-month authority window, but **uniquely authoritative** for the specific question "do these VIO algorithms run on a Jetson?"; the included algorithms (VINS-Mono, VINS-Fusion, ROVIO, ALVIO, Stereo-MSCKF, Kimera, ORB-SLAM2-stereo) are all classical baselines whose runtime characteristics on ARM CPUs have not changed materially. Jetson hardware comparison (TX2 / Xavier NX / AGX Xavier) does NOT include Orin Nano, so results must be extrapolated.
|
||||
- **Version Info**: Conference paper.
|
||||
- **Target Audience**: UAV state-estimation engineers picking a VIO for a Jetson companion
|
||||
- **Research Boundary Match**: **Strong match for the question**, partial for the hardware (no Orin Nano). KAIST VIO dataset is indoor mocap, not UAV-aerial-nadir — the *latency / CPU / memory* numbers transfer; the *accuracy* numbers do not transfer to our domain.
|
||||
- **Summary**: Comprehensive benchmark of 9 algorithms on TX2, Xavier NX, AGX Xavier: VINS-Mono, VINS-Fusion (CPU), VINS-Fusion-gpu, VINS-Fusion-imu, ROVIO, Stereo-MSCKF, ALVIO, Kimera, ORB-SLAM2-stereo. **Hard findings**: (a) on TX2, **VINS-Fusion (CPU) and VINS-Fusion-imu fail to run** due to insufficient memory and CPU performance — VINS-Fusion-gpu does run; (b) all algorithms except ROVIO show >100% CPU usage (multi-core utilisation, OK for our 6-core Orin Nano A78AE); (c) Kimera has the highest memory usage among VIO methods (numerous computations per keyframe), failure-prone on Xavier NX-class memory; (d) Stereo-MSCKF has the lowest memory among stereo VIOs; (e) ROVIO has the lowest CPU usage owing to its patch-tracking formulation. **Implication for project**: Jetson Orin Nano Super (8 GB shared, 6-core A78AE, Ampere GPU, 67 TOPS sparse INT8) is between Xavier NX and AGX Xavier in CPU performance and memory; algorithms passing on Xavier NX should pass on Orin Nano Super, but VINS-Fusion-imu's TX2 failure is a yellow-flag for memory pressure under co-resident C2/C3/C5 modules.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 (VINS-Mono / VINS-Fusion / OpenVINS / Kimera / Stereo-MSCKF / ROVIO Jetson runtime evidence), SQ5 (resource-budget failure modes)
|
||||
|
||||
### Source #47
|
||||
- **Title**: OKVIS2 — Realtime Scalable Visual-Inertial SLAM with Loop Closure (Leutenegger, ETH/Imperial/TUM Smart Robotics Lab)
|
||||
- **Link**: https://github.com/ethz-mrl/okvis2 ; arXiv: https://arxiv.org/abs/2202.09199 ; LICENSE: https://github.com/ethz-mrl/okvis2/blob/main/LICENSE
|
||||
- **Tier**: L1 (canonical implementation; arXiv 2022 by paper author)
|
||||
- **Publication Date**: original arXiv 2022; OKVIS2-X T-RO 2025 successor (Boche, Jung, Laina, Leutenegger — IEEE T-RO 2025, vol 41 pp 6064–6083, DOI 10.1109/TRO.2025.3619051; arXiv 2510.04612, Oct 2025). Repository last push 2026-03-17 (ethz-mrl/OKVIS2-X).
|
||||
- **Timeliness Status**: ✅ **Current.** Active development through 2026; OKVIS2-X is the most recent published VI-SLAM system in this class.
|
||||
- **Version Info**: ethz-mrl/okvis2 (core) and ethz-mrl/OKVIS2-X (multi-sensor extension with optional GNSS / LiDAR / dense depth).
|
||||
- **Target Audience**: Factor-graph VI-SLAM implementers; mid-large-scale loop-closure use cases
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. OKVIS2 README + paper explicitly support mono and multi-camera VI configurations. OKVIS2-X adds GNSS fusion (relevant: VINS-Fusion-style GPS-when-available drop-in IS the project's eventual posture in non-spoofed regions).
|
||||
- **Summary**: Factor-graph VI-SLAM with bounded-size optimization. Innovation: pose-graph edges from marginalised observations can be "seamlessly turned back into observations" upon loop closure, reviving old landmarks and reprojection errors. Includes lightweight CNN segmentation for dynamic-region removal. OKVIS2-X (2025) generalises the core to fuse multi-camera + IMU + optional GNSS + LiDAR/depth — directly aligned with project's "VIO that may opportunistically fuse a non-spoofed GPS update" pattern and AC-NEW-2's spoof-promotion path. **License: 3-clause BSD (permissive)** — no copyleft / dual-use distribution friction. Note: GitHub UI shows "Other (NOASSERTION)" because of the standard BSD clause language pattern; the LICENSE file is canonical 3-clause BSD.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate (factor-graph + permissive license + active maintenance)
|
||||
|
||||
### Source #48
|
||||
- **Title**: OKVIS2-X: Open Keyframe-based Visual-Inertial SLAM Configurable with Dense Depth or LiDAR, and GNSS (Boche, Jung, Laina, Leutenegger — TUM / ETH Zurich Smart Robotics Lab)
|
||||
- **Link**: https://github.com/ethz-mrl/OKVIS2-X ; arXiv: https://arxiv.org/abs/2510.04612 ; IEEE T-RO 2025 vol 41 pp 6064–6083 DOI 10.1109/TRO.2025.3619051
|
||||
- **Tier**: L1 (peer-reviewed IEEE Transactions on Robotics, Special Issue Visual SLAM 2025)
|
||||
- **Publication Date**: arXiv 2025-10-04; T-RO 2025 vol 41
|
||||
- **Timeliness Status**: ✅ Current (within 6-month Critical-novelty window)
|
||||
- **Version Info**: 295 stars; 38 forks; 2 contributors; created 2025-09-23, last push 2026-03-17. License: NOASSERTION on GitHub UI; per-paper license follows ethz-mrl convention (BSD-3 derived).
|
||||
- **Target Audience**: Multi-sensor SLAM researchers; large-scale VI-SLAM with optional GNSS/LiDAR
|
||||
- **Research Boundary Match**: **Strong match** — extends OKVIS2 monocular+IMU mode with optional GNSS fusion (Visual-Inertial SLAM with Tightly-Coupled Dropout-Tolerant GPS Fusion lineage from IROS 2022). Project's `MAV_CMD_SET_EKF_SOURCE_SET` switch + companion-side spoof-detection conceptually mirrors OKVIS2-X's "GPS as drop-out-tolerant signal".
|
||||
- **Summary**: Non-trivial extension of OKVIS2 that adds submap-based volumetric occupancy mapping. Demonstrates that the OKVIS2 factor-graph backbone can absorb optional, dropout-tolerant GNSS measurements without re-architecting; spoof detection itself stays on the project's companion side. Useful as architectural template for project's C5 estimator + C8 adapter integration. License: same as OKVIS2 (BSD-3-derived). Two named contributors (bochsim, SebsBarbas) actively pushing through Mar 2026.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 (OKVIS2 lineage; VI-SLAM with optional GPS/LiDAR), SQ8 (GPS-fusion dropout-tolerant lineage)
|
||||
|
||||
### Source #49
|
||||
- **Title**: Kimera-VIO — Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation (MIT-SPARK)
|
||||
- **Link**: https://github.com/MIT-SPARK/Kimera-VIO ; LICENSE.BSD: https://github.com/MIT-SPARK/Kimera-VIO/blob/master/LICENSE.BSD
|
||||
- **Tier**: L1 (canonical implementation by MIT SPARK Lab)
|
||||
- **Publication Date**: original 2020 (Rosinol, Abate, Chang, Carlone — ICRA 2020); ongoing development through 2024–2025 issue threads (Dec 2024 / Feb 2025 ROS2 / mono-inertial discussion).
|
||||
- **Timeliness Status**: ✅ Active maintenance (recent issues / PRs through 2025).
|
||||
- **Version Info**: master-branch-only; LICENSE.BSD = BSD 2-Clause "Simplified".
|
||||
- **Target Audience**: VI-SLAM + mesh-mapping researchers
|
||||
- **Research Boundary Match**: **Partial.** Stereo+IMU is the primary supported configuration; mono+IMU is **optional but documented**. Kimera also produces a 3D mesh and high-level semantic labels; neither is relevant to C1 or to the project's bandwidth budget, so both are pure overhead.
|
||||
- **Summary**: Frontend (image processing + IMU pre-integration) + Backend (factor-graph optimization with iSAM2 in GTSAM) + Mesher + Pose-Graph-Optimizer. **License: BSD 2-Clause (permissive)**, so no dual-use distribution friction. **Penalty for project**: the Source #46 KAIST benchmark found that Kimera has the highest memory usage among the VIOs tested (numerous computations per keyframe) and that it failed to fit in Xavier-NX-class memory under multi-process load. The project uses neither the mesh nor the semantic features, so Kimera's overhead is unjustified vs OKVIS2 / OpenVINS for the project's narrow C1 mandate. **Status**: viable secondary fallback if OKVIS2 / VINS-Mono runtime issues arise; not a lead candidate due to overhead misfit.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 secondary candidate (BSD-permissive but resource-heavy)
|
||||
|
||||
### Source #50
|
||||
- **Title**: DROID-SLAM — Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras (princeton-vl, Teed & Deng)
|
||||
- **Link**: https://github.com/princeton-vl/droid-slam ; arXiv: https://arxiv.org/abs/2108.10869 ; NeurIPS 2021
|
||||
- **Tier**: L1 (canonical reference)
|
||||
- **Publication Date**: NeurIPS 2021; repository latest tagged baseline.
|
||||
- **Timeliness Status**: ✅ Foundational reference; DPV-SLAM (Source #51) is the lighter successor.
|
||||
- **Version Info**: master-branch-only.
|
||||
- **Target Audience**: Deep-learning-based VO/VSLAM researchers
|
||||
- **Research Boundary Match**: **Disqualified by hardware budget.** Inference requires ≥11 GB GPU VRAM per official README; project budget is 8 GB **shared CPU+GPU** on Jetson Orin Nano Super, leaving <8 GB for VO + VPR + matcher + estimator + cache co-resident. DROID-SLAM is also **monocular VO/SLAM, not VIO** — no native IMU fusion; metric scale recovery requires external scale alignment.
|
||||
- **Summary**: Recurrent dense bundle adjustment over a complete history of camera poses. State-of-the-art accuracy on TartanAir / EuRoC / TUM-RGBD at the cost of GPU memory. **Disqualified outright for C1 lead** by AC-4.2 (≤8 GB shared RAM) and the lack of IMU fusion that would require an additional ESKF/UKF wrapping. Kept as **reference baseline** to be cited as "what we cannot afford" in `solution_draft01`.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 disqualified candidate
|
||||
|
||||
### Source #51
|
||||
- **Title**: DPVO — Deep Patch Visual Odometry (princeton-vl, Teed, Lipson, Deng) + DPV-SLAM (Lipson, Teed, Deng — ECCV 2024)
|
||||
- **Link**: https://github.com/princeton-vl/DPVO ; LICENSE: https://github.com/princeton-vl/DPVO/blob/main/LICENSE ; ECCV 2024 paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00272.pdf
|
||||
- **Tier**: L1 (canonical implementation; NeurIPS 2023 + ECCV 2024)
|
||||
- **Publication Date**: NeurIPS 2023 (DPVO); ECCV 2024 (DPV-SLAM); repository last update 2024-10-12.
|
||||
- **Timeliness Status**: ⚠️ Borderline. ~19 months since last code update; ECCV-2024 publication of DPV-SLAM keeps the algorithm class within the 6-month claim window for the SLAM successor.
|
||||
- **Version Info**: 989 stars; primary languages C++ / Python / CUDA. **License: MIT (permissive)** — no dual-use distribution friction.
|
||||
- **Target Audience**: Deep-learning VO/SLAM with reduced memory footprint
|
||||
- **Research Boundary Match**: **Partial.** DPVO is **monocular VO only — no IMU fusion**. Output pose is in arbitrary scale (no metric scale recovery). To be a viable C1 candidate the project must wrap DPVO with an external IMU+scale-fusion stage (loosely-coupled ESKF / VIO-fusion module). This makes DPVO **not a drop-in C1** like VINS-Mono / OpenVINS / OKVIS2; it is a **VO module that needs a separate VIO wrapper**.
|
||||
- **Summary**: Sparse patch tracking + differentiable bundle adjustment back end. Outperforms DROID-SLAM on TartanAir / EuRoC ATE while using ~1/3 of DROID-SLAM's GPU memory (DROID-SLAM: 8.7 GB VO mode vs DPVO: ~3 GB). DPV-SLAM (Lipson, Teed, Deng — ECCV 2024) adds full SLAM capability with 4–5 GB GPU usage. **Jetson runtime evidence**: indirect via DPVO-QAT++ (Source #52) — peak reserved memory 1.02 GB on RTX 4060 (8 GB) after INT8 fake-quant + custom CUDA kernel fusion; not directly tested on Jetson Orin Nano. **Status for C1**: pure-VO candidate (must be paired with separate IMU integration to deliver metric scale + attitude); would not satisfy "monocular VIO" gate alone, but viable as the *VO half* of a hybrid C1+C5 design.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 conditional candidate (VO not VIO; needs external IMU wrapper)
|
||||
|
||||
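The memory figures quoted for Sources #50 and #51 can be sanity-checked against the 8 GB shared budget with a few lines of arithmetic. The sketch below is illustrative only: `reserved_gb`, the headroom for everything co-resident with VO (VPR, matcher, estimator, cache, OS), is a hypothetical parameter rather than a measured value, and the footprints are the desktop-GPU numbers cited above, not Jetson measurements.

```python
# GPU-memory figures as quoted in Sources #50/#51 (desktop-GPU numbers, in GB).
FOOTPRINT_GB = {
    "DROID-SLAM (VO mode)": 8.7,
    "DPVO": 3.0,
    "DPV-SLAM (upper bound)": 5.0,
}

SHARED_BUDGET_GB = 8.0  # Jetson Orin Nano Super, CPU+GPU shared (AC-4.2)


def fits(footprint_gb: float, reserved_gb: float,
         budget_gb: float = SHARED_BUDGET_GB) -> bool:
    """True if the candidate plus the reserved headroom stays within the budget."""
    return footprint_gb + reserved_gb <= budget_gb


def report(reserved_gb: float) -> dict:
    """Map each candidate to whether it fits for a given (hypothetical) reserve."""
    return {name: fits(gb, reserved_gb) for name, gb in FOOTPRINT_GB.items()}


if __name__ == "__main__":
    # reserved_gb=4.0 is an assumed placeholder for everything that is not VO.
    for name, ok in report(reserved_gb=4.0).items():
        print(f"{name}: {'fits' if ok else 'does not fit'}")
```

Even with zero reserve, the DROID-SLAM figure exceeds the budget, which is the disqualification recorded in Source #50; how much reserve DPVO or DPV-SLAM can tolerate is exactly what a Jetson measurement would have to establish.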
### Source #52
|
||||
- **Title**: DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry (Cheng Liao)
|
||||
- **Link**: https://arxiv.org/abs/2511.12653 ; project HTML: https://arxiv.org/html/2511.12653
|
||||
- **Tier**: L2 (single-author preprint, code partially released; no peer-review yet)
|
||||
- **Publication Date**: arXiv 2025-11-16 (within 6-month Critical-novelty window)
|
||||
- **Timeliness Status**: ✅ Current
|
||||
- **Version Info**: arXiv preprint; code & weights released for QAT-only and fused-CUDA variants.
|
||||
- **Target Audience**: Embedded-platform DPVO deployers
|
||||
- **Research Boundary Match**: **Partial.** Hardware tested = RTX 4060 (8 GB) + Intel Core Ultra 5-125H + 32 GB RAM — desktop GPU, NOT Jetson Orin Nano. Direct extrapolation requires Jetson MVE; Orin Nano Super's Ampere GPU is architecturally similar but smaller than RTX 4060.
|
||||
- **Summary**: Quantization-Aware Training framework for DPVO with fused CUDA kernels. Reduces peak GPU memory from 1.94 GB to 1.02 GB (-47%) on a representative TartanAir sequence; +34.6% median FPS on TartanAir, +26.7% on EuRoC; -22.8 ms / -19.7 ms median P99 tail latency on TartanAir / EuRoC respectively. Heterogeneous precision: front-end pseudo-quantization (FP16/FP32 with INT8 simulation) + FP32 back-end geometric solver. ATE accuracy is comparable to baseline DPVO across 32 TartanAir + 11 EuRoC validation sequences. **Implication for project**: shows DPVO has a documented **path** to a Jetson-suitable footprint, but not a Jetson-Orin-Nano measurement. Notable: requires a teacher-student distillation training pipeline before deployment, which adds operational complexity vs classical VINS-* / OpenVINS / OKVIS2.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 supporting evidence for DPVO embedded feasibility
|
||||
|
||||
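The headline numbers in the Source #52 summary are easy to re-derive, which helps when comparing them with figures from other sources. A minimal sketch: the inputs are the values quoted above, and the helper names are made up for illustration. Note that an FPS gain and a per-frame time reduction are not the same percentage.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after` (0.47 means -47%)."""
    return (before - after) / before


def frame_time_reduction(fps_gain: float) -> float:
    """Fractional drop in per-frame time implied by a fractional FPS gain.

    Frame time is 1/FPS, so a gain g scales frame time by 1 / (1 + g).
    """
    return 1.0 - 1.0 / (1.0 + fps_gain)


if __name__ == "__main__":
    print(f"peak memory: -{relative_reduction(1.94, 1.02):.1%}")
    print(f"TartanAir frame time: -{frame_time_reduction(0.346):.1%}")
    print(f"EuRoC frame time: -{frame_time_reduction(0.267):.1%}")
```

The memory reduction rounds to the quoted 47%; the frame-time figures are derived here from the quoted FPS gains and are not reported by the source itself.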
### Source #53
|
||||
- **Title**: Pure VO baseline — KLT optical flow + 5-point essential matrix or homography RANSAC (OpenCV reference)
|
||||
- **Link**: https://docs.opencv.org/4.x/d4/dee/tutorial_optical_flow.html ; representative public implementation: https://github.com/alishobeiri/Monocular-Video-Odometery (MIT, 2018) ; tutorial reference: https://zxh.me/posts/2022-12-19-monocular-visual-odometry/
|
||||
- **Tier**: L1 (OpenCV official documentation) + L2 (representative public implementations)
|
||||
- **Publication Date**: OpenCV docs continuously updated; tutorial 2022-12; reference implementation 2018 (algorithmic class is foundational, no time window per Step 0.5)
|
||||
- **Timeliness Status**: ✅ Foundational baseline (no time window).
|
||||
- **Version Info**: OpenCV `cv::calcOpticalFlowPyrLK` (KLT) + `cv::findEssentialMat` (5-point Nister) or `cv::findHomography` with RANSAC.
|
||||
- **Target Audience**: Implementers needing a transparent low-complexity fallback
|
||||
- **Research Boundary Match**: **Full match for the simple-baseline candidate.** Suits planar nadir-down UAV at altitude (Ukrainian steppe is ~planar at 1 km AGL — homography is geometrically appropriate; for non-planar relief the essential matrix path is more appropriate but adds scale-recovery work).
|
||||
- **Summary**: Established classical pipeline: Shi-Tomasi or FAST corner detection → KLT pyramidal optical flow tracking → 5-point essential matrix or homography RANSAC → relative pose with arbitrary scale (metric scale must be recovered externally via IMU integration). Reference implementations are widely available in OpenCV samples and pedagogical repos. **Status**: candidate for the project's `Simple baseline / known-runnable / known-failure-mode` C1 option per the Component Option Breadth rule. Not a lead, but its presence as a fallback is mandatory under the research engine's "include at least one simple baseline" rule.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 simple-baseline candidate
|
||||
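For the homography branch over flat terrain, metric scale follows from the pinhole model: with a nadir-pointing camera at height h above planar ground and a focal length of f pixels, one pixel spans h / f metres. A minimal sketch, assuming a perfectly nadir camera, flat ground and no lens distortion. The function names and the numbers in the usage lines are hypothetical, and the height would have to come from an external source, which this snippet does not model.

```python
import math


def ground_sample_distance(height_agl_m: float, focal_px: float) -> float:
    """Metres on the ground covered by one pixel (nadir camera, flat terrain)."""
    if height_agl_m <= 0 or focal_px <= 0:
        raise ValueError("height and focal length must be positive")
    return height_agl_m / focal_px


def pixel_shift_to_metres(dx_px: float, dy_px: float,
                          height_agl_m: float, focal_px: float) -> tuple:
    """Scale an image-plane translation (pixels) to ground distance (metres)."""
    gsd = ground_sample_distance(height_agl_m, focal_px)
    return dx_px * gsd, dy_px * gsd


if __name__ == "__main__":
    # Hypothetical values: 1 km AGL, 2000 px focal length, 30 px / 40 px shift.
    dx_m, dy_m = pixel_shift_to_metres(30.0, 40.0,
                                       height_agl_m=1000.0, focal_px=2000.0)
    print(dx_m, dy_m, math.hypot(dx_m, dy_m))
```

Only the magnitude along the image axes is computed; the sign convention and the rotation into a navigation frame depend on camera mounting and heading and are deliberately left out.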
@@ -0,0 +1,171 @@
|
||||
# Source Registry — Summary & Index
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> Investigation order saved in `../00_question_decomposition.md` → "Next Step": SQ6 → SQ1 → SQ2 → SQ3+SQ4 per component (C1→C8) ✓ → C10 next → SQ5 interleaved → SQ8 → SQ9 synthesis at engine Step 8. **SQ7 (datasets / SITL / replay) deferred to Test Spec (greenfield Step 5) per 2026-05-08 C9 / SQ7 restructure** — see `../00_question_decomposition.md` → "C9 / SQ7 Restructure" section.
|
||||
>
|
||||
> This folder replaces the previous monolithic `01_source_registry.md`. The full per-source description for any source `#N` in the table below lives in the category file linked in its row.
|
||||
|
||||
## Category Index
|
||||
|
||||
| Category | File | Sources | Status |
|
||||
|---|---|---|---|
|
||||
| SQ6 — ArduPilot Plane vs iNav external positioning | [`SQ6_external_positioning.md`](SQ6_external_positioning.md) | #1–#24 | Saturated for protocol-level architectural decision |
|
||||
| SQ1 — Existing GPS-denied UAV systems | [`SQ1_existing_systems.md`](SQ1_existing_systems.md) | #25–#37 | Saturated |
|
||||
| SQ2 — Canonical pipeline decomposition | [`SQ2_canonical_pipeline.md`](SQ2_canonical_pipeline.md) | #38–#42 | Saturated |
|
||||
| C1 — VIO candidates | [`C1_vio.md`](C1_vio.md) | #43–#56 | Closed at documentary level |
|
||||
| C2 — VPR candidates | [`C2_vpr.md`](C2_vpr.md) | #57–#68 | Mandatory pre-screen complete (5/5) |
|
||||
| C3 — Matcher candidates | [`C3_matchers.md`](C3_matchers.md) | #69–#81 | Closed at documentary level |
|
||||
| C4 — Pose estimation candidates | [`C4_pose_estimation.md`](C4_pose_estimation.md) | #82–#87 | Closed at 3/N |
|
||||
| C5 — State estimator / sensor fusion candidates | [`C5_state_estimator.md`](C5_state_estimator.md) | #88–#91 | Closed at 2/N (batch 1 closed) |
|
||||
| C6 — Tile cache + spatial index candidates | [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) | #92–#98 | Closed at 2/N (batch 1 closed) — Cand 1 (mirror-suite-pattern) RECOMMENDED PRIMARY; Cand 2 (PostGIS+pgvector) DEFERRED secondary |
|
||||
| C7 — On-Jetson inference runtime candidates | [`C7_inference_runtime.md`](C7_inference_runtime.md) | #99–#105 | Closed at 3/N (batch 1 closed 2026-05-08) — Cand 1 (TensorRT native) RECOMMENDED PRIMARY; Cand 2 (ONNX Runtime + TRT EP) modern-competitive-lead-cross-architecture-portability; Cand 3 (pure PyTorch FP16) mandatory simple-baseline |
|
||||
| C8 — MAVLink / MSP2 FC adapter candidates | [`C8_fc_adapter.md`](C8_fc_adapter.md) | #106–#113 | Closed at 3/N (batch 1 closed 2026-05-08) — Cand 1 (pymavlink → MAVLink GPS_INPUT) RECOMMENDED PRIMARY for ArduPilot Plane; Cand 2 (MSP2_SENSOR_GPS via Python MSP V2) RECOMMENDED PRIMARY for iNav (locked SQ6 + AC-4.3 transport); Cand 3 (UBX impersonation via pyubx2 NAV-PVT) DEFERRED secondary for iNav after comparative-improvement verdict |
|
||||
| C10 — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; D-C6-3 + D-C7-7 confirmation pipelines only, operator tooling deferred to Plan-phase) | [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | #114–#121 | Closed at 2/N (batch 1 closed 2026-05-08) — D-C6-3 confirmation: direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate at takeoff (FAISS MIT, atomicwrites MIT); D-C7-7 confirmation: hybrid Polygraphy CLI primary + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API escape hatch (Polygraphy + TensorRT 10.x Apache-2.0 throughout) |
|
||||
|
||||
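The C8 row names `MSP2_SENSOR_GPS` as the iNav transport. To illustrate what MSP V2 framing with CRC-8 DVB-S2 involves, here is a minimal sketch. It assumes the standard MSP V2 layout: `$X<`, a flag byte, little-endian function id and payload size, the payload, then a CRC over everything after the direction byte. `msp_v2_encode` below is a local helper written for this example, not the INAV-Toolkit function of the same name, and the all-zero 36-byte payload is a placeholder because the field layout belongs to the per-source description, not this index.

```python
import struct

MSP2_SENSOR_GPS = 0x1F03  # 7939, as named in the C8 row


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2: polynomial 0xD5, initial value 0, no reflection."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def msp_v2_encode(function_id: int, payload: bytes, flag: int = 0) -> bytes:
    """Build an MSP V2 frame addressed to the flight controller ('<')."""
    # flag (uint8), function id (uint16 LE), payload size (uint16 LE)
    header = struct.pack("<BHH", flag, function_id, len(payload))
    body = header + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


if __name__ == "__main__":
    placeholder_payload = bytes(36)  # field layout intentionally not filled in
    frame = msp_v2_encode(MSP2_SENSOR_GPS, placeholder_payload)
    print(len(frame), frame[:8].hex())
```

A receiver can validate a frame by running the same CRC over the bytes after `$X<`, including the trailing checksum, and checking that the result is zero.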
## Investigation Status
|
||||
|
||||
| Sub-question | Status | Notes |
|
||||
|---|---|---|
|
||||
| SQ6 — ArduPilot vs iNav external positioning | **Saturated for protocol-level architectural decision** (further detail deferred to SQ8 for spoofing-side fields and to design phase for SITL parameter tuning) | Major finding: iNav has no inbound external-positioning MAVLink handler; AC-4.3 wording must be revised. See `../02_fact_cards/SQ6_fc_external_positioning.md` "SQ6 Conclusions". |
|
||||
| SQ1 — Existing GPS-denied UAV systems | **Saturated.** 13 sources logged across academic / open-source / commercial / defense-program / Ukraine-practitioner. Closest peer system: Twist Robotics OSCAR (deployed in Ukraine). Closest open-source pipeline-match: snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024 — LightGlue+SuperPoint+UKF+VISION_POSITION_ESTIMATE). Closest deployed commercial: Auterion Artemis (Skynode N + Visual Navigation, Ukraine-tested, 1000-mile range). | See `../02_fact_cards/SQ1_existing_systems.md` cluster + working summary. |
|
||||
| SQ2 — Canonical pipeline decomposition | **Saturated.** 5 surveys/benchmarks logged (Skoltech aerial VPR, U.Maine cross-view, OrthoLoC 2.5D geodata, AnyVisLoc low-altitude multi-view, NUDT 2026 sciopen survey). All converge on **`retrieval → matching → pose-estimation`** hierarchical framework with VIO/IMU as auxiliary. Two new architectural facts added to C1–C10: (a) **AdHoP-style perspective-refinement loop** between matching and PnP (+63% translation accuracy, method-agnostic), (b) **DSM 2.5D dependency** for full 6-DoF on aerial-to-satellite (must be resolved with the Suite Sat Service or accepted as a 3-DoF degraded mode). Practitioner runtime evidence: AnyLoc on RTX 3090 = 0.63s/descriptor, SuperGlue re-rank = 17–25s; on Jetson Orin Nano these are non-viable for our 400 ms p95 budget — must restrict to lightweight VPR (e.g., MixVPR / SALAD class) + LightGlue/XFeat-class matchers. | See `../02_fact_cards/SQ2_canonical_pipeline.md` "SQ2 Conclusions". |
|
||||
| SQ3+SQ4 — Per-component candidates (C1–C10) | **In progress** — C1 (VIO) **CLOSED** at documentary level (Sources #43–#56). C2 (VPR) — **mandatory pre-screen COMPLETE at documentary level (5 of 5 candidates)**: MixVPR (Sources #57+#58), SALAD (Sources #59+#60+#61), SelaVPR (Sources #62+#63), NetVLAD (Sources #64+#65+#66), **EigenPlaces (Sources #67+#68 — closure 2026-05-08)**. All five mandatory candidates have per-mode API capability verification ✅, per-numbered-Restriction × per-numbered-AC sub-matrix written, and `../06_component_fit_matrix/C2_vpr.md` rows populated. **Conditional pre-screen candidates (AnyLoc / BoQ / DINOv2-VLAD)** are GATED on a prerequisite **INT8 quantization survey** before they can be added to per-mode rows (per Fact #26 pre-screen rule). C3 closed at documentary level (Sources #69–#81). C4 closed at 3/N (Sources #82–#87). **C5 CLOSED at 2/N — batch 1 closed 2026-05-08** (mandatory simple-baseline = Manual ESKF Solà 2017 [Sources #88–#89]; modern-competitive-lead-factor-graph = GTSAM iSAM2 + ImuFactor + smart factors + Marginals [Sources #90–#91]). **C6 CLOSED at 2/N — batch 1 closed 2026-05-08** (Cand 1 RECOMMENDED PRIMARY = mirror-of-suite-satellite-provider pattern: PostgreSQL btree + bytea + FAISS HNSW + filesystem [Sources #92+#96+#97+#98]; Cand 2 DEFERRED secondary = PostGIS GiST + pgvector HNSW + filesystem [Sources #94+#95]; Source #93 = PostgreSQL btree multicolumn-indexes docs cross-cite). **C7 CLOSED at 3/N — batch 1 closed 2026-05-08** (Cand 1 RECOMMENDED PRIMARY = TensorRT native [Sources #99+#104+#105]; Cand 2 modern-competitive-lead-cross-architecture-portability = ONNX Runtime + TRT EP [Source #100 + #103]; Cand 3 mandatory simple-baseline = pure PyTorch FP16 [Source #101]; Source #102 = YOLO26 Jetson Orin Nano Super benchmark; Source #103 = LightGlue+TRT+FP8 quantization-sensitivity finding driving D-C7-6 cross-component precision policy). **C8 CLOSED at 3/N — batch 1 closed 2026-05-08** (Cand 1 RECOMMENDED PRIMARY for ArduPilot = pymavlink → MAVLink GPS_INPUT msg 232 cooperative-path [Sources #106+#107 + cross-cite SQ6 Source #4 AP_GPS_MAV.cpp ingestion-path]; Cand 2 RECOMMENDED PRIMARY for iNav = MSP2_SENSOR_GPS id 7939 / 0x1F03 via Python MSP V2 implementation [Sources #111+#112+#113 + cross-cite SQ6 Source #12+#13]; Cand 3 DEFERRED secondary for iNav = UBX impersonation via pyubx2 NAV-PVT [Sources #108+#109+#110 + cross-cite SQ6 Fact #10] with comparative-improvement verdict that does NOT clear user's "significant-improvement-only" bar over Cand 2; mid-batch correction via c8_inav_recovery=B preserved locked SQ6 + AC-4.3 + restrictions.md verdicts). **C9 DROPPED** from research scope per 2026-05-08 SQ7/C9 restructure (datasets/SITL/replay deferred to Test Spec greenfield Step 5). **C10 CLOSED at 2/N — batch 1 closed 2026-05-08** under CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C (operator CLI/desktop tooling, sector classification, freshness pipeline deferred to Plan-phase): D-C6-3 confirmation = direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` + content-hash (SHA-256) verification gate at takeoff load + `IO_FLAG_MMAP_IFC` mmap [Sources #114+#115+#116]; D-C7-7 confirmation = hybrid Polygraphy CLI primary for INT8-calibrating builds + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API escape hatch [Sources #117+#118+#119+#120+#121]; **no further C10 batches required at the research layer** — operator tooling design enters at Plan-phase. | See `../02_fact_cards/C1_vio.md` + `../02_fact_cards/C2_vpr.md` + `../02_fact_cards/C3_matchers.md` + `../02_fact_cards/C4_pose_estimation.md` + `../02_fact_cards/C5_state_estimator.md` + `../02_fact_cards/C6_tile_cache_spatial_index.md` + `../02_fact_cards/C7_inference_runtime.md` clusters; `../06_component_fit_matrix/C{1..7}_*.md` rows. |
|
||||
| SQ5 — Failure modes / deployment lessons | Not started (interleaved with SQ3/SQ4) | |
|
||||
| SQ7 — Datasets, SITL, replay environments | **Deferred to Test Spec (greenfield Step 5)** per 2026-05-08 C9 / SQ7 restructure | Fixture-class / test-infra-class — not researched in this Mode A run. Carryforward payload preserved in `../00_question_decomposition.md` → "C9 / SQ7 Restructure" section. |
|
||||
| SQ8 — Safety considerations (AC-NEW-4 / AC-NEW-7) | Not started | Carries the AP_GPS spoofing-signal probe deferred from SQ6. |
|
||||
| SQ9 — End-to-end synthesis | Step 8 of engine (deferred) | |
|
||||
|
||||
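The D-C6-3 confirmation in the SQ3+SQ4 row pairs an atomic index write with a SHA-256 verification gate at takeoff load. The sketch below shows that pattern using the standard library only. `os.replace` stands in for `python-atomicwrites`, raw bytes stand in for a file produced by `faiss.write_index`, and the `.sha256` sidecar naming is a hypothetical convention that the registry does not specify.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Hex SHA-256 of a file, read in chunks so large indexes need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write(path: str, data: bytes) -> None:
    """Write to a temp file in the same directory, fsync, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def provision(path: str, data: bytes) -> None:
    """Pre-flight step: write the artifact, then its hash sidecar."""
    atomic_write(path, data)
    atomic_write(path + ".sha256", sha256_of(path).encode("ascii"))


def verify_before_load(path: str) -> bool:
    """Takeoff gate: refuse to load if the content hash does not match the sidecar."""
    try:
        with open(path + ".sha256", "rb") as fh:
            expected = fh.read().decode("ascii").strip()
        return sha256_of(path) == expected
    except OSError:
        return False
```

In a real pipeline the gate would run immediately before `faiss.read_index`, and a `False` result would block the load rather than fall back silently.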
---
|
||||
|
||||
## Source Summary Table
|
||||
|
||||
Compact one-line index across all 121 sources. For full per-source description, follow the **File** link.
|
||||
|
||||
| # | Title | Tier | File |
|
||||
|---|---|---|---|
|
||||
| 1 | Non-GPS Navigation — Plane documentation | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 2 | GPS / Non-GPS Transitions — Plane documentation | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 3 | EKF Source Selection and Switching — Plane documentation | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 4 | ArduPilot AP_GPS_MAV.cpp (master) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 5 | ArduPilot PR #28750 — AP_NavEKF3 EK3_OPTION bits (GPS-denied testing) | L2 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 6 | ArduPilot Issue #15859 — EKF3 source switching (GPS↔NonGPS) | L4 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 7 | ArduPilot Issue #27193 — EK3 Source Switching wrong frame for GUIDED | L4 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 8 | ArduPilot Issue #23485 — fuse only External Nav Velocities | L4 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 9 | iNavFlight/inav telemetry/mavlink.c (master inbound switch) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 10 | iNav Wiki — MAVLink (frogmane edited 2025-12-11) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 11 | iNav Wiki — GPS and Compass setup | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 12 | iNavFlight/inav docs/development/msp/README.md (MSP message reference) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 13 | iNavFlight/inav src/main/io/gps.c + target/common.h (master) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 14 | iNav Issue #10141 — dual GPS support | L4 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 15 | iNav docs/GPS_fix_estimation.md (master) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 16 | iNav docs/Settings.md (master) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 17 | iNav Issue #10588 — DeadReckoning weird behaviour during GPS outage | L4 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 18 | iNav Release 8.0.0 (highlights, Dec 2024) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 19 | iNav Release 9.0.0 / 9.0.1 + Release Notes wiki | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 20 | MAVLink common message set — GPS_RAW_INT (24) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 21 | MAVLink PR #2110 — gps: add status and integrity information | L2 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 22 | AirDroper — GNSS Spoofing Filter companion device | L3 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 23 | ArduPilot PR #24135 — EKF3 robust to bad IMU and lane-switching | L2 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 24 | ArduPilot AP_NavEKF3 — VehicleStatus.cpp + AP_NavEKF3.cpp (master) | L1 | [SQ6](SQ6_external_positioning.md) |
|
||||
| 25 | Twist Robotics OSCAR — visual navigation system (Ukraine deployment) | L2 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 26 | Ukraine Drones with Vision-Based Navigation Past Heavy Jamming (TWZ) | L2 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 27 | Ukraine's Ruta Missile Drone EW-Immune Navigation (Defense Express) | L2 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 28 | Kilometer-Scale GNSS-Denied UAV Navigation via Heightmap Gradients | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 29 | Hierarchical Image Matching for UAV Absolute Visual Localization | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 30 | Raptor — GPS-Denied UAV Navigation & Coordinate Extraction (Vantor) | L2 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 31 | Auterion Artemis program — long-range deep-strike completion | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 32 | Auterion Skynode N — AI/CV for small autonomous systems | L2 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 33 | snktshrma/ngps_flight — NGPS for ArduPilot (GSoC 2024) | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 34 | AerialExtreMatch — benchmark for extreme-view image matching/localization | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 35 | DARPA Fast Lightweight Autonomy (FLA) program page + T&E review | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 36 | DSMAC / TERCOM lineage — DTIC ADA315439 | L1 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 37 | Electronic Warfare in Ukraine — Ukraine War Analytics | L3 | [SQ1](SQ1_existing_systems.md) |
|
||||
| 38 | VPR for Aerial Imagery: A Survey (Skoltech, Moskalenko et al.) | L1 | [SQ2](SQ2_canonical_pipeline.md) |
|
||||
| 39 | Cross-View Geo-Localization: A Survey (U. Maine) | L1 | [SQ2](SQ2_canonical_pipeline.md) |
|
||||
| 40 | OrthoLoC: UAV 6-DoF Localization with Orthographic Geodata | L1 | [SQ2](SQ2_canonical_pipeline.md) |
|
||||
| 41 | AnyVisLoc — UAV visual localization, low-altitude multi-view | L1 | [SQ2](SQ2_canonical_pipeline.md) |
|
||||
| 42 | NUDT 2026 — survey on absolute visual localization for low-altitude UAV | L1 | [SQ2](SQ2_canonical_pipeline.md) |
|
||||
| 43 | VINS-Mono — robust monocular visual-inertial state estimator | L1 | [C1](C1_vio.md) |
|
||||
| 44 | VINS-Fusion — optimization-based multi-sensor state estimator | L1 | [C1](C1_vio.md) |
|
||||
| 45 | OpenVINS — open-source VI navigation research platform | L1 | [C1](C1_vio.md) |
|
||||
| 46 | Run VIO on NVIDIA Jetson — KAIST benchmark | L1 | [C1](C1_vio.md) |
|
||||
| 47 | OKVIS2 — realtime scalable VI-SLAM with loop closure | L1 | [C1](C1_vio.md) |
|
||||
| 48 | OKVIS2-X — open keyframe VI-SLAM with dense depth or LiDAR, and GNSS | L1 | [C1](C1_vio.md) |
|
||||
| 49 | Kimera-VIO — VIO with SLAM + 3D mesh (MIT-SPARK, BSD) | L1 | [C1](C1_vio.md) |
|
||||
| 50 | DROID-SLAM — deep visual SLAM (princeton-vl) | L1 | [C1](C1_vio.md) |
|
||||
| 51 | DPVO / DPV-SLAM — deep patch visual odometry | L1 | [C1](C1_vio.md) |
|
||||
| 52 | DPVO-QAT++ — heterogeneous QAT + CUDA kernel fusion for DPVO | L2 | [C1](C1_vio.md) |
|
||||
| 53 | Pure-VO baseline — KLT optical flow + 5-point/homography RANSAC (OpenCV) | L1 | [C1](C1_vio.md) |
|
||||
| 54 | OpenVINS — context7 per-mode capability lookup (`/rpng/open_vins`) | L1 | [C1](C1_vio.md) |
|
||||
| 55 | VINS-Mono README + VINS-Fusion context7 per-mode lookup | L1 | [C1](C1_vio.md) |
|
||||
| 56 | OKVIS2 — official README (`smartroboticslab/okvis2`, main) | L1 | [C1](C1_vio.md) |
|
||||
| 57 | OpenVPRLab — open-source VPR framework (MixVPR / BoQ / NetVLAD / GeM) | L1 | [C2](C2_vpr.md) |
|
||||
| 58 | MixVPR canonical paper (WACV 2023, arXiv:2303.02190) | L1 | [C2](C2_vpr.md) |
|
||||
| 59 | SALAD canonical implementation (`serizba/salad`, GPL-3.0) | L1 | [C2](C2_vpr.md) |
|
||||
| 60 | SALAD canonical paper — Optimal Transport Aggregation (CVPR 2024) | L1 | [C2](C2_vpr.md) |
|
||||
| 61 | OpenVPRLab DinoV2 backbone — context7 cross-source for ViT-B/14 | L1 | [C2](C2_vpr.md) |
|
||||
| 62 | SelaVPR canonical implementation (`Lu-Feng/SelaVPR`, MIT) | L1 | [C2](C2_vpr.md) |
|
||||
| 63 | SelaVPR canonical paper (ICLR 2024, arXiv:2402.14505) | L1 | [C2](C2_vpr.md) |
|
||||
| 64 | NetVLAD canonical implementation `Relja/netvlad` v1.03 (MIT) | L1 | [C2](C2_vpr.md) |
|
||||
| 65 | NetVLAD modern PyTorch reproduction `Nanne/pytorch-NetVlad` | L2 | [C2](C2_vpr.md) |
|
||||
| 66 | NetVLAD canonical paper (CVPR 2016 / TPAMI 2018, arXiv:1511.07247) | L1 | [C2](C2_vpr.md) |
|
||||
| 67 | EigenPlaces canonical implementation (`gmberton/EigenPlaces`, MIT) | L1 | [C2](C2_vpr.md) |
|
||||
| 68 | EigenPlaces canonical paper (ICCV 2023, arXiv:2308.10832) | L1 | [C2](C2_vpr.md) |
|
||||
| 69 | LightGlue — context7 per-mode capability lookup (`/cvg/lightglue`) | L1 | [C3](C3_matchers.md) |
|
||||
| 70 | LightGlue canonical implementation (`cvg/LightGlue`) | L1 | [C3](C3_matchers.md) |
|
||||
| 71 | LightGlue canonical paper (ICCV 2023, arXiv:2306.13643) | L1 | [C3](C3_matchers.md) |
|
||||
| 72 | LightGlue HuggingFace Transformers integration | L1 | [C3](C3_matchers.md) |
|
||||
| 73 | LightGlue-ONNX — `fabio-sim/LightGlue-ONNX` (Jetson TensorRT path) | L2 | [C3](C3_matchers.md) |
|
||||
| 74 | ALIKED canonical implementation (`Shiaoming/ALIKED`) | L1 | [C3](C3_matchers.md) |
|
||||
| 75 | ALIKED canonical paper (TIM 2023, arXiv:2304.03608) | L1 | [C3](C3_matchers.md) |
|
||||
| 76 | DISK canonical implementation (`cvlab-epfl/disk`, Apache-2.0) | L1 | [C3](C3_matchers.md) |
|
||||
| 77 | DISK canonical paper — RL-trained local features (NeurIPS 2020) | L1 | [C3](C3_matchers.md) |
|
||||
| 78 | SuperGlue canonical implementation (`magicleap/SuperGluePretrainedNetwork`) | L1 | [C3](C3_matchers.md) |
|
||||
| 79 | SuperGlue canonical paper — graph-NN feature matching (CVPR 2020) | L1 | [C3](C3_matchers.md) |
|
||||
| 80 | XFeat canonical implementation (`verlab/accelerated_features`, Apache-2.0) | L1 | [C3](C3_matchers.md) |
|
||||
| 81 | XFeat canonical paper — accelerated features (CVPR 2024) | L1 | [C3](C3_matchers.md) |
|
||||
| 82 | OpenCV canonical implementation — `opencv/opencv` (calib3d module) | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 83 | OpenCV 4.x calib3d module canonical documentation | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 84 | OpenGV canonical implementation (`laurentkneip/opengv`) | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 85 | OpenGV canonical Doxygen documentation portal | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 86 | GTSAM canonical implementation (`borglab/gtsam`, BSD-3) | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 87 | GTSAM canonical Python documentation via context7 | L1 | [C4](C4_pose_estimation.md) |
|
||||
| 88 | Solà 2017 — "Quaternion kinematics for the error-state Kalman filter" (arXiv:1711.02508) | L1 | [C5](C5_state_estimator.md) |
|
||||
| 89 | Reference open-source ESKF implementations (canonical-paper-derived) | L2 | [C5](C5_state_estimator.md) |
|
||||
| 90 | GTSAM `ImuFactor` / `CombinedImuFactor` / `PreintegratedImuMeasurements` / `PreintegratedCombinedMeasurements` (context7 indexed) | L1 | [C5](C5_state_estimator.md) |
|
||||
| 91 | GTSAM `ISAM2` / `IncrementalFixedLagSmoother` / `Marginals` with iSAM2 results (context7 indexed) | L1 | [C5](C5_state_estimator.md) |
|
||||
| 92 | Parent-suite `satellite-provider` existing pattern (PostgreSQL + Dapper + filesystem tile storage; verified directly) | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 93 | PostgreSQL 16 official documentation — Multicolumn Indexes + btree access method | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 94 | PostGIS official documentation — GiST + KNN distance ordering + ST_DWithin | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 95 | pgvector official documentation — HNSW index API (context7 + canonical README) | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 96 | FAISS official documentation — IndexFlatL2 / IndexHNSWFlat / IndexIVFFlat (context7 indexed) | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 97 | Postgres on NVIDIA Jetson Orin Nano — March 2026 Medium article + Coding Steve minimal-config guide | L2 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 98 | Slippy Map Tilenames — OpenStreetMap canonical specification (Web Mercator XYZ) | L1 | [C6](C6_tile_cache_spatial_index.md) |
|
||||
| 99 | NVIDIA TensorRT 10.x official documentation portal (context7-indexed `/nvidia/tensorrt`) | L1 | [C7](C7_inference_runtime.md) |
|
||||
| 100 | Microsoft ONNX Runtime official documentation (context7-indexed `/microsoft/onnxruntime`) + Jetson AI Lab community wheel index | L1 | [C7](C7_inference_runtime.md) |
|
||||
| 101 | PyTorch official documentation (context7-indexed `/pytorch/pytorch`) + Jetson AI Lab PyTorch wheel availability for JetPack 6 | L1 | [C7](C7_inference_runtime.md) |
|
||||
| 102 | Ultralytics YOLO26 benchmark suite on Jetson Orin Nano Super (April 2026) | L2 | [C7](C7_inference_runtime.md) |
|
||||
| 103 | LightGlue ONNX Runtime + TensorRT acceleration + FP8 ModelOpt quantization findings (Fabio Sim's Journal) | L2 | [C7](C7_inference_runtime.md) |
|
||||
| 104 | JetPack SDK release notes (NVIDIA official) — JetPack 6.0 / 6.1 / 6.2 version matrix | L1 | [C7](C7_inference_runtime.md) |
|
||||
| 105 | TensorRT-on-Jetson canonical install constraints (Ultralytics issue reports + NVIDIA forum) | L2 | [C7](C7_inference_runtime.md) |
|
||||
| 106 | ArduPilot Pymavlink (context7-indexed `/ardupilot/pymavlink`) — canonical Python MAVLink stack | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 107 | ArduPilot Plane Non-GPS Position Estimation + MAVProxy GPS Input module dev docs (`GPS1_TYPE=14`, `EK3_SRC1_POSXY=3`) | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 108 | pyubx2 (context7-indexed `/semuconsulting/pyubx2`) — canonical Python UBX/NMEA/RTCM3 parser | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 109 | u-blox NEO-M9N Integration Manual (UBX-19014286) + u-blox 8/M8 Receiver Description (UBX-13003221) — UBX-NAV-PVT canonical specification | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 110 | iNav `gps_ublox.c` source (master) — UBX validation gates `gpsMapFixType()` requires `flags & 0x01 = 1` AND `fixType ∈ {2,3}` | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 111 | iNav `docs/development/msp/README.md` (master) — `MSP2_SENSOR_GPS (7939 / 0x1F03)` canonical 36-byte payload spec | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 112 | Python MSP2 implementations: YAMSPy + INAV-Toolkit `inav_msp.py` (MSP V2 `msp_v2_encode` with CRC-8 DVB-S2) | L2 | [C8](C8_fc_adapter.md) |
|
||||
| 113 | iNav `src/main/msp/msp_protocol_v2_sensor.h` (master) — MSP V2 sensor command-ID range (0x1F00-0x1FFF) | L1 | [C8](C8_fc_adapter.md) |
|
||||
| 114 | FAISS `write_index` / `read_index` Python API + on-disk format + security warning (canonical wiki + context7) | L1 | [C10](C10_preflight_provisioning.md) |
|
||||
| 115 | FAISS IndexHNSWFlat per-vector memory + on-disk file size formula (Discussions #3953 + C++ API docs) | L2 | [C10](C10_preflight_provisioning.md) |
|
||||
| 116 | Python atomic file write pattern (gocept blog + python-atomicwrites docs + Python Issue 8604) | L2 | [C10](C10_preflight_provisioning.md) |
|
||||
| 117 | Polygraphy `polygraphy convert` CLI for TensorRT INT8 engine build with calibration cache reuse (NVIDIA TensorRT repo + context7) | L1 | [C10](C10_preflight_provisioning.md) |
|
||||
| 118 | Polygraphy `Calibrator` class API — algo defaults + dynamic-shapes calibration profile + warning behavior (NVIDIA TRT/Polygraphy SDK docs) | L1 | [C10](C10_preflight_provisioning.md) |
|
||||
| 119 | `trtexec` CLI for one-off engine builds — INT8/FP16 flags + calibration cache support (NVIDIA TRT SDK docs) | L1 | [C10](C10_preflight_provisioning.md) |
|
||||
| 120 | TensorRT INT8 calibration corpus size guidance (~500-1000 images) — Jetson AGX Orin (vendor engineering guide) | L2 | [C10](C10_preflight_provisioning.md) |
|
||||
| 121 | Direct TensorRT `IBuilderConfig` + `IInt8EntropyCalibrator2` Python API (NVIDIA TRT Python API docs, cross-cite from C7 #105) | L1 | [C10](C10_preflight_provisioning.md) |
|
||||
@@ -0,0 +1,119 @@
|
||||
# Source Registry — C10: Pre-flight cache provisioning (cross-coupling minimal scope)
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation). Sources for C10 batch 1 (cross-coupling minimal: D-C6-3 descriptor-cache rebuild trigger pipeline + D-C7-7 TensorRT engine-build pipeline). Sibling registries: [SQ1](SQ1_existing_systems.md), [SQ2](SQ2_canonical_pipeline.md), [SQ6](SQ6_external_positioning.md), [C1](C1_vio.md), [C2](C2_vpr.md), [C3](C3_matchers.md), [C4](C4_pose_estimation.md), [C5](C5_state_estimator.md), [C6](C6_tile_cache_spatial_index.md), [C7](C7_inference_runtime.md), [C8](C8_fc_adapter.md). Index: [`00_summary.md`](00_summary.md).
|
||||
>
|
||||
> Source-tier definitions per `references/source-tiering.md`: L1 = official primary docs / source code / canonical specs; L2 = official blog posts, vendor SDK docs, peer-reviewed papers; L3 = community Q&A, tutorial sites, secondary commentary; L4 = forum posts, mailing-list threads, single-author blog posts.
|
||||
|
||||
---
|
||||
|
||||
## Source #114 — FAISS `write_index` / `read_index` Python API + on-disk format + security warning (L1 official)
|
||||
|
||||
**URL**: <https://github.com/facebookresearch/faiss/wiki/Index-IO,-cloning-and-hyper-parameter-tuning> + context7 indexed at `/facebookresearch/faiss` (Benchmark Score consistent with C6 batch 1 Source #96 lookup)
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L1** — canonical FAISS GitHub Wiki + canonical context7-indexed documentation
|
||||
|
||||
**Relevance**: Confirms `faiss.write_index(index, path)` + `faiss.read_index(path)` Python API for serializing IndexHNSWFlat to disk and loading it back; confirms `IO_FLAG_MMAP_IFC` enables memory-mapped loading for HNSW + IndexFlatCodes-derived classes (zero-copy load — important for the project's <5 s takeoff load budget); documents the explicit security warning "No attempt is made to check the correctness of loaded data. A faulty or malicious file could lead to out-of-memory errors or code execution. Users are responsible for verifying that files loaded with `read_index` have not been altered since being written by `write_index`." This warning binds directly to AC-NEW-7 (cache-poisoning safety) and motivates the project-side content-hash verification gate before takeoff load. Confirms FAISS C++ signature: `void write_index(Index* index, const char* filename)` / `Index* read_index(const char* filename)`.
|
||||
|
||||
**Evidence quality**: ✅ High — L1 canonical FAISS docs. Direct API verification.
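
The content-hash verification gate that the `read_index` security warning motivates could be sketched as follows. This is a minimal illustration, not the project's implementation: the helper names (`sha256_of_file`, `load_if_untampered`), the idea of a digest recorded at write time, and the injected `loader` callable are all assumptions. In the project, the loader would presumably be `faiss.read_index`.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large index files are never fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_if_untampered(path: Path, expected_sha256: str, loader: Callable[[str], T]) -> T:
    """Call `loader(path)` only if the file's digest matches the recorded one.

    `expected_sha256` is assumed to have been recorded right after the index
    was written (hypothetical manifest; not part of FAISS).
    """
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"content hash mismatch for {path}: refusing to load")
    return loader(str(path))
```

Injecting the loader keeps the gate testable without FAISS installed; a caller might pass `faiss.read_index` once the digest manifest exists.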
|
||||
|
||||
---
|
||||
|
||||
## Source #115 — FAISS IndexHNSWFlat per-vector memory + on-disk file size formula (L2 community + L1 cross-cite)
|
||||
|
||||
**URL**: <https://github.com/facebookresearch/faiss/discussions/3953> + cross-cite <https://faiss.ai/cpp_api/struct/structfaiss_1_1IndexHNSWFlat.html>
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L2** — FAISS GitHub Discussions thread (maintainer-confirmed answer) + L1 canonical FAISS C++ API docs cross-cite
|
||||
|
||||
**Relevance**: Confirms IndexHNSWFlat per-vector on-disk + RAM cost formula: `(vector_dim × 4 bytes) + (M × 4 bytes × 2) + overhead from graph layers and geometric reallocation`. For project's pinned VPR descriptor candidates (per D-C2-9 / D-C2-10 / D-C2-6 / D-C6-1 = halfvec): at 2048-D float32 + M=32 → 8192 + 256 = **8448 bytes/vector** (~845 MB on disk for 100K tiles); at 2048-D halfvec (2-byte storage per descriptor element) → 4096 + 256 = **4352 bytes/vector** (~430 MB on disk for 100K tiles); at 512-D halfvec + M=32 → 1024 + 256 = **1280 bytes/vector** (~130 MB on disk for 100K tiles); at 256-D halfvec + M=32 → 512 + 256 = **768 bytes/vector** (~80 MB on disk for 100K tiles). All variants well within AC-8.3 10 GB cache budget (assuming D-C2-10 EigenPlaces 512-D path or D-C6-1 halfvec mitigation). Supplementary cross-cite to C6 Fact #92 evidence base. **Load latency**: Issue #622 confirms post-load search performance is "slightly slower initially due to memory layout and cache effects" but identical results — implies a warmup-search-pass at takeoff after `read_index` would smooth p99 latency; aligns with the <5 s takeoff load budget (pure file read at ~430 MB / SATA SSD ~500 MB/s = <1 s; mmap path eliminates the read entirely).
|
||||
|
||||
**Evidence quality**: ✅ High — formula matches FAISS source code in `IndexHNSW.cpp`; multiple maintainer-confirmed reproductions; conservative for project's pinned descriptor dimensions per D-C2-9/10/6 closures.
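
The per-vector arithmetic quoted above can be reproduced with a short sketch. It implements only the two explicit terms of the formula (vector storage plus `M × 4 bytes × 2` of links) and deliberately ignores the "overhead from graph layers and geometric reallocation" term, so the results are lower bounds. The function names and the 100,000-tile count are illustrative assumptions.

```python
def hnsw_flat_bytes_per_vector(dim: int, bytes_per_element: int, m: int = 32) -> int:
    """Vector storage plus base-layer links; graph-layer overhead is excluded."""
    vector_bytes = dim * bytes_per_element  # 4 for float32, 2 for half precision
    link_bytes = m * 4 * 2                  # M x 4-byte ids x 2
    return vector_bytes + link_bytes


def index_size_mb(n_vectors: int, per_vector_bytes: int) -> float:
    """Decimal megabytes (1 MB = 1e6 bytes)."""
    return n_vectors * per_vector_bytes / 1e6


if __name__ == "__main__":
    n_tiles = 100_000  # assumed tile count, matching the estimate in the text
    for label, dim, width in [
        ("2048-D float32", 2048, 4),
        ("2048-D half", 2048, 2),
        ("512-D half", 512, 2),
        ("256-D half", 256, 2),
    ]:
        per_vec = hnsw_flat_bytes_per_vector(dim, width)
        print(f"{label}: {per_vec} B/vector, {index_size_mb(n_tiles, per_vec):.1f} MB")
```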
|
||||
|
||||
---
|
||||
|
||||
## Source #116 — Python atomic file write pattern: write-to-temp + fsync + atomic rename (L2 reference + L1 POSIX standard cross-cite)
|
||||
|
||||
**URL**: <https://blog.gocept.com/2013/07/15/reliable-file-updates-with-python/> + <https://python-atomicwrites.readthedocs.io/en/stable> + Python tracker Issue 8604 <https://bugs.python.org/issue8604>
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L2** — well-known engineering blog reference + canonical Python package docs + Python core developer issue tracker
|
||||
|
||||
**Relevance**: Documents the canonical Python crash-safe atomic file write pattern required for the project's pre-flight FAISS index file write (and TensorRT engine file write). The pattern is: (1) write to a temporary file in the same directory as target (ensures same filesystem so `os.rename` is atomic), (2) call `fsync(temp_fd)` to flush content + metadata to disk, (3) atomically rename via `os.rename(temp_path, target_path)`, (4) call `fsync` on the parent directory to flush the filename change to disk. Without this pattern, a power loss or process kill mid-write leaves a truncated/partial file that `faiss.read_index` will load successfully (no internal integrity check per Source #114 warning) and produce silently-wrong descriptor matches at takeoff — direct violation of AC-NEW-7 (cache-poisoning safety) + AC-3.3 (re-localization stability). The `python-atomicwrites` package provides this pattern with a simple API: `with atomic_write(path, overwrite=True) as f: ...`; pure-Python; trivially auditable; cross-platform (Windows + POSIX + macOS). On macOS specifically, must use `fcntl.fcntl(fd, fcntl.F_FULLFSYNC)` instead of `os.fsync()` to handle Apple's user-space write buffers — not relevant for the Jetson deployment target (Linux/JetPack). Project-side wrapper around `faiss.write_index` should use this pattern to safely write the FAISS cache file alongside content-hash verification.
|
||||
|
||||
**Evidence quality**: ✅ High — pattern matches POSIX `rename(2)` atomicity guarantee; extensively documented; multiple production Python packages (atomicwrites, ruamel-yaml, etc.) implement it.
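
The four-step pattern described above can be sketched with the standard library alone. This is a POSIX/Linux-oriented illustration (the directory `fsync` step is not portable to Windows), and the function name `atomic_write_bytes` is an assumption, not an existing project helper.

```python
import os
import tempfile


def atomic_write_bytes(target_path: str, data: bytes) -> None:
    """Write `data` so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # (1) Temp file in the same directory, hence the same filesystem as the target.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # (2) Flush file content and metadata to disk.
            os.fsync(f.fileno())
        # (3) Atomic rename over the target (POSIX rename(2) semantics).
        os.rename(temp_path, target_path)
    except BaseException:
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise
    # (4) Flush the directory entry so the new name survives power loss.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Because `faiss.write_index` takes a path rather than a file object, a project-side wrapper would presumably let FAISS write to the temporary path and then perform steps (2) to (4) itself; that adaptation is not shown here. Note that `mkstemp` creates the file with mode `0600`, which may need adjusting for the deployed service user.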
|
||||
|
||||
---
|
||||
|
||||
## Source #117 — Polygraphy `polygraphy convert` CLI for TensorRT INT8 engine build with calibration cache reuse (L1 official)
|
||||
|
||||
**URL**: <https://github.com/NVIDIA/TensorRT/blob/main/tools/Polygraphy/examples/cli/convert/01_int8_calibration_in_tensorrt/README.md> + context7 indexed at `/websites/nvidia_deeplearning_tensorrt_static_polygraphy` (1041 code snippets, Benchmark Score 67.2, Source Reputation High)
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L1** — official NVIDIA TensorRT source repository documentation + canonical Polygraphy docs
|
||||
|
||||
**Relevance**: Confirms Polygraphy as the canonical NVIDIA-blessed orchestration wrapper around TensorRT's engine build pipeline. Documents the canonical INT8 calibration workflow: first build with `--data-loader-script ./data_loader.py --calibration-cache identity_calib.cache` (computes scales + writes cache); subsequent builds with `--calibration-cache identity_calib.cache` (skips calibration step entirely — cache contains scales). Confirms Polygraphy's `Calibrator` class API: `data_loader` parameter (generator/iterable yielding `{input_name: numpy.ndarray}` dicts), `cache` parameter (calibration cache file path), `BaseClass` parameter (defaults to `trt.IInt8EntropyCalibrator2` — matches project's D-C7-2 + D-C7-6 lock), `algo` parameter (defaults to `trt.CalibrationAlgoType.MINMAX_CALIBRATION`; consulted only when `BaseClass` is the generic `trt.IInt8Calibrator`, so it does not override the entropy default). CLI supports `--int8 --fp16` mixed precision flags directly per project's D-C7-2 = (b) per-family precision policy. The full CLI invocation pattern for project: `polygraphy convert <model>.onnx --int8 --fp16 --data-loader-script ./calib_data_loader.py --calibration-cache <model>_calib.cache -o <model>_sm87_jp62_trt103_int8fp16.engine`. Polygraphy is maintained inside the TensorRT repository (`tools/Polygraphy`); on Jetson it is installed either with `pip install nvidia-pyindex && pip install polygraphy` or through the TensorRT installer. Production-mature and cross-referenced from canonical TensorRT documentation.
|
||||
|
||||
**Evidence quality**: ✅ High — official NVIDIA repository docs, multi-snippet context7 coverage, production-mature tooling.
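
The two-phase workflow above (first build computes the cache, later builds reuse it) can be captured in a small argument builder. This is a sketch only: it uses just the flags quoted in this entry, and the function name and the example file names are hypothetical.

```python
import shlex
from typing import List, Optional


def polygraphy_convert_argv(
    onnx_path: str,
    engine_path: str,
    calibration_cache: str,
    data_loader_script: Optional[str] = None,
) -> List[str]:
    """Build the `polygraphy convert` argv for an INT8+FP16 engine.

    Pass `data_loader_script` on the first build (calibration runs and the
    cache is written); omit it on later builds to reuse the existing cache.
    """
    argv = ["polygraphy", "convert", onnx_path, "--int8", "--fp16"]
    if data_loader_script is not None:
        argv += ["--data-loader-script", data_loader_script]
    argv += ["--calibration-cache", calibration_cache, "-o", engine_path]
    return argv


if __name__ == "__main__":
    # Hypothetical model name; the engine suffix follows the pattern in the text.
    first = polygraphy_convert_argv(
        "matcher.onnx",
        "matcher_sm87_jp62_trt103_int8fp16.engine",
        "matcher_calib.cache",
        data_loader_script="./calib_data_loader.py",
    )
    print(shlex.join(first))
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` without shell quoting issues.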
|
||||
|
||||
---
|
||||
|
||||
## Source #118 — Polygraphy `Calibrator` class API — algo defaults + dynamic-shapes calibration profile + warning behavior (L1 official)
|
||||
|
||||
**URL**: <https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/polygraphy/backend/trt/calibrator.html> + <https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/polygraphy/backend/trt/config.html>
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L1** — canonical NVIDIA TensorRT/Polygraphy SDK documentation
|
||||
|
||||
**Relevance**: Confirms `Calibrator(data_loader, cache=None, BaseClass=IInt8EntropyCalibrator2, algo=CalibrationAlgoType.MINMAX_CALIBRATION, batch_size=None, quantile=None, regression_cutoff=None)` full signature. Documents two algorithm choices: `IInt8EntropyCalibrator2` (entropy-based; project D-C7-2 default; Polygraphy default) vs `IInt8MinMaxCalibrator` (min-max scaling). Documents dynamic-shapes behavior: "if calibration is run and the model has dynamic shapes, the last optimization profile will be used as the calibration profile" — relevant for project's matchers if any of them export with dynamic input shapes (D-C3-2 LightGlue ONNX export pathway). Documents `--data-loader-script` / `--data-loader-func-name` CLI flags for supplying custom calibration data. Documents the "Int8 Calibration is using randomly generated input data" warning that fires when `--int8` is set but neither `--data-loader-script` nor an existing `--calibration-cache` is supplied — operationalizes the D-C7-1 closure (real UAV nadir flight footage corpus) as a pre-flight build prerequisite. CLI also supports `--load-tactics` / `--save-tactics` for replaying tactic-search results across multiple builds (faster than re-running tactic profiling each build) — useful for the reference-Jetson-prebuilt-engine fallback path per D-C7-7.
|
||||
|
||||
**Evidence quality**: ✅ High — canonical NVIDIA documentation, directly cited from polygraphy/tools/args/backend/trt/config source code.
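
The random-data warning condition described above can be turned into a pre-flight guard that refuses to start an INT8 build without real calibration inputs. The following is a sketch under stated assumptions: the function name is hypothetical, and treating an empty cache file as "missing" is a project-side choice rather than documented Polygraphy behavior.

```python
import os
from typing import Optional


def int8_build_would_use_random_data(
    int8: bool,
    data_loader_script: Optional[str],
    calibration_cache: Optional[str],
) -> bool:
    """Return True when an INT8 build has neither a data loader nor a usable cache."""
    if not int8:
        return False
    has_loader = data_loader_script is not None
    has_cache = (
        calibration_cache is not None
        and os.path.isfile(calibration_cache)
        and os.path.getsize(calibration_cache) > 0  # assumption: empty cache is unusable
    )
    return not (has_loader or has_cache)
```

A provisioning script might call this before invoking the build tool and abort with a clear message instead of relying on the warning in the build log.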
|
||||
|
||||
---
|
||||
|
||||
## Source #119 — `trtexec` CLI for one-off engine builds — INT8/FP16 flags + calibration cache support (L1 official)
|
||||
|
||||
**URL**: <https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html> + <https://docs.nvidia.com/deeplearning/tensorrt/latest/reference/command-line-programs.html>
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L1** — canonical NVIDIA TensorRT SDK documentation
|
||||
|
||||
**Relevance**: Confirms `trtexec` as the simpler-but-less-flexible TensorRT engine build CLI bundled with every TensorRT installation. Canonical invocation: `trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 --int8 --calib=calibration.cache --shapes=input:1x3x224x224`. Supports `--int8 --fp16` mixed precision (matches project's D-C7-2). Supports `--calib=<cache_path>` for INT8 calibration cache reuse (cache file format identical to Polygraphy's; the two tools are interoperable on the calibration cache layer). **Critical limitation vs Polygraphy**: `trtexec --int8` without `--calib` causes TRT to use random data for calibration (per TRT docs warning) — this collapses INT8 accuracy by ~5-15%. **Strength**: single-binary; no Python imports; no calibration data loader script required; perfect for emergency rebuilds when an existing calibration cache is available; perfect for ad-hoc benchmarking via `--iterations=N --useCudaGraph --noDataTransfers`. **Recommended role for project**: fallback orchestration tool when Polygraphy is unavailable OR when calibration cache is already shipped from a reference build (e.g., the prebuilt-engine fallback per D-C7-7).
|
||||
|
||||
**Evidence quality**: ✅ High — canonical NVIDIA documentation; trtexec is bundled with TensorRT distributions and has been the canonical TensorRT CLI since TensorRT 5.x.
|
||||
|
||||
---
|
||||
|
||||
## Source #120 — TensorRT INT8 calibration corpus size guidance (~500-1000 images) — Jetson AGX Orin specific (L2 vendor)
|
||||
|
||||
**URL**: <https://nvnexus.com/tensorrt-jetson-agx-orin-optimization-guide/>
|
||||
|
||||
**Date accessed**: 2026-05-08
|
||||
|
||||
**Tier**: **L2** — vendor-aligned engineering guide (TensorRT-on-Jetson specialist content), cross-cited from official NVIDIA Developer Forum patterns
|
||||
|
||||
**Relevance**: Independent confirmation of the project's D-C7-1 closure: "INT8 optimization can double inference throughput on Jetson AGX Orin with minimal accuracy loss; calibration on representative input data (500-1000 images recommended)". Aligns with project's pinned 500-1500 sample range from C7 batch 1 Fact #94. Cross-cite to AGX Orin (server-class Jetson) — the project's deployment target is Orin Nano Super (smaller class), but the calibration-corpus-size guidance is governed by the model + INT8 entropy-statistics requirement, not by the Jetson SKU. **Conservative confirmation**: project's calibration corpus target of 500-1500 samples per D-C7-1 closure is sufficient by community-confirmed benchmarks.
|
||||
|
||||
**Evidence quality**: ⚠️ Medium-High — L2 vendor-aligned source; aligns with multiple independent confirmations including NVIDIA Developer Forum threads and the canonical TensorRT INT8 calibration documentation; project's D-C7-1 closure already pinned this range from L1 sources.
|
||||
|
||||
---
|
||||
|
||||
## Source #121 — Direct TensorRT `IBuilderConfig` + `IInt8EntropyCalibrator2` Python API (L1 official, cross-cite from C7 Source #105)
|
||||
|
||||
**URL**: <https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python/api/infer/Core/BuilderConfig.html> (cross-cite from C7 batch 1 Source #105 + Source #102)
|
||||
|
||||
**Date accessed**: 2026-05-08 (cross-cite)
|
||||
|
||||
**Tier**: **L1** — canonical NVIDIA TensorRT Python API documentation
|
||||
|
||||
**Relevance**: Already cited in C7 batch 1 Source #102 + Source #105 (mode pinning for D-C7-2). Re-cited here for the C10 D-C7-7 confirmation context: confirms direct `IBuilderConfig` + `IInt8EntropyCalibrator2` Python API as the most-flexible-but-most-engineering-cost orchestration option. Pattern: instantiate `trt.Builder(logger)` → `builder.create_network(...)` → parse ONNX via `trt.OnnxParser` → instantiate `builder.create_builder_config()` → `config.set_flag(trt.BuilderFlag.INT8)` + `config.set_flag(trt.BuilderFlag.FP16)` → assign custom `IInt8EntropyCalibrator2` subclass instance to `config.int8_calibrator` → `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)` (1 GB per D-C7-8; the older `config.max_workspace_size` attribute is no longer available in TensorRT 10.x) → `serialized_engine = builder.build_serialized_network(network, config)` → `with open(path, 'wb') as f: f.write(serialized_engine)`. **Used in C10 only as the per-model fallback path for the reference-Jetson-prebuilt-engine generation** (D-C7-7 fallback) when Polygraphy's data-loader-script abstraction is too rigid for an unusual model (e.g., LightGlue with dynamic-shape inputs requiring a custom calibration profile).
|
||||
|
||||
**Evidence quality**: ✅ High — canonical NVIDIA Python API; cross-cite from existing C7 Source #105 reduces redundancy.
|
||||
|
||||
---
|
||||
@@ -0,0 +1,192 @@
|
||||
# Source Registry — C1 — Visual / Visual-Inertial Odometry candidates
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> This file replaces a section of the previous monolithic `01_source_registry.md`. See `00_summary.md` for the full category index. Investigation order is tracked in `../00_question_decomposition.md` and the cross-category Investigation Status table in `00_summary.md`.
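
The time-window checks above reduce to whole-month date arithmetic between a source's last update and the access date. A minimal sketch, assuming whole calendar months are the intended unit (the function names and the three-way classification labels are illustrative, not taken from `../00_question_decomposition.md`):

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Number of complete calendar months elapsed from `start` to `end`."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def timeliness(last_update: date, accessed: date) -> str:
    """Classify against the 6-month preferred and 18-month maximum windows."""
    age = whole_months_between(last_update, accessed)
    if age <= 6:
        return "preferred"
    if age <= 18:
        return "acceptable"
    return "outside-window"


if __name__ == "__main__":
    accessed = date(2026, 5, 7)
    print(whole_months_between(date(2024, 2, 25), accessed))  # 26
    print(timeliness(date(2024, 2, 25), accessed))            # outside-window
```

Sources classified as outside the window are not rejected automatically; the established-baseline rule above still applies to them.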
|
||||
|
||||
---
|
||||
|
||||
### Source #43
|
||||
- **Title**: VINS-Mono — A Robust and Versatile Monocular Visual-Inertial State Estimator (HKUST-Aerial-Robotics)
|
||||
- **Link**: https://github.com/HKUST-Aerial-Robotics/VINS-Mono ; LICENCE: https://github.com/HKUST-Aerial-Robotics/VINS-Mono/blob/master/LICENCE
|
||||
- **Tier**: L1 (canonical reference implementation; published in IEEE T-RO 2018 by Qin, Li, Shen)
|
||||
- **Publication Date**: original 2018; repository last meaningful update 2024-02-25 (per GitHub commit log; 2024-05-23 simulation-data commit only)
|
||||
- **Timeliness Status**: ⚠️ **Borderline.** ~26 months between the last meaningful master-branch commit (2024-02-25) and access time (2026-05-07). Established baseline that does NOT trigger Step 0.5's 18-month timeliness rejection because (a) IEEE T-RO publication is the canonical authority for the algorithm, (b) downstream forks (vins-mono-android, embedded variants) keep the algorithm class actively deployed.
|
||||
- **Version Info**: No GitHub releases / tags (master-branch-only project). Stars 5,829.
|
||||
- **Target Audience**: Mono+IMU VIO implementers; UAV state estimation researchers
|
||||
- **Research Boundary Match**: **Full match for the candidate's pinned mode** — monocular camera + IMU producing 6-DoF metric pose. The VINS-Mono README explicitly names this configuration as primary.
|
||||
- **Summary**: Optimization-based sliding-window monocular VIO. Features: efficient IMU pre-integration (Forster et al. 2017), automatic initialization, online camera-IMU extrinsic calibration, online camera-IMU temporal calibration, failure detection + recovery, loop detection (DBoW2-based), global pose graph optimization. Output is metric-scale 6-DoF pose at IMU rate (typically 100–200 Hz) with covariance from the optimization Hessian. **License: GPL-3.0 (copyleft viral)** — every binary distribution requires source disclosure for the entire linked binary; relevant for dual-use deployment if the companion image is sold or transferred to a customer.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
|
||||
|
||||
|
||||
### Source #44
|
||||
- **Title**: VINS-Fusion — Optimization-based multi-sensor state estimator (HKUST-Aerial-Robotics)
|
||||
- **Link**: https://github.com/HKUST-Aerial-Robotics/VINS-Fusion ; LICENCE: https://github.com/HKUST-Aerial-Robotics/VINS-Fusion/blob/master/LICENCE
|
||||
- **Tier**: L1 (canonical reference; superset of VINS-Mono)
|
||||
- **Publication Date**: original 2019 (Qin, Cao, Pan, Shen — ICRA workshop / IROS); repository last update 2024-05-23
|
||||
- **Timeliness Status**: ⚠️ **Borderline.** ~24 months since the last update at access time. Same Step-0.5 reasoning as VINS-Mono — established class.
|
||||
- **Version Info**: master-branch-only. Stars 4,476. Top-ranked open-source stereo-VIO on KITTI Odometry as of January 2019.
|
||||
- **Target Audience**: Multi-sensor VIO implementers (mono+IMU, stereo, stereo+IMU, +GPS fusion)
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. VINS-Fusion README explicitly enumerates four sensor configurations (mono+IMU, stereo, stereo+IMU, +GPS toy example).
|
||||
- **Summary**: Superset of VINS-Mono adding stereo and GPS-fusion modes. Same algorithmic core (sliding-window optimization with IMU pre-integration). Online spatial + temporal camera-IMU calibration; visual loop closure; ROS Kinetic/Melodic build dependency. **License: GPL-3.0** — same dual-use distribution constraint as VINS-Mono. Independent KAIST benchmark (Source #46) found VINS-Fusion CPU mode + VINS-Fusion-imu **fail to run** on Jetson TX2 (insufficient memory and CPU); GPU-accelerated VINS-Fusion-gpu does run on TX2. Implication for project: VINS-Fusion-imu on Jetson Orin Nano Super is feasible but not certain; needs MVE.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
|
||||
|
||||
|
||||
### Source #45
|
||||
- **Title**: OpenVINS — An open source platform for visual-inertial navigation research (Robot Perception and Navigation Group, U. of Delaware — rpng)
|
||||
- **Link**: https://github.com/rpng/open_vins ; docs: https://docs.openvins.com/ ; LICENSE: https://github.com/rpng/open_vins/blob/master/LICENSE
|
||||
- **Tier**: L1 (canonical research implementation; ICRA 2020 paper Geneva, Eckenhoff, Lee, Yang, Huang)
|
||||
- **Publication Date**: original 2020; latest tagged release v2.7 = 2023-06; ongoing master-branch commits through 2024–2025 (latest issue threads through Feb 2025)
|
||||
- **Timeliness Status**: ✅ Currently valid (master branch active; latest tagged release ~35 months but library is in stable/maintenance mode with continued issue triage).
|
||||
- **Version Info**: Stars 2,828; 30 contributors; 12 releases. v2.7 is the current tagged stable.
|
||||
- **Target Audience**: MSCKF/EKF VIO implementers; researchers needing a reference MSCKF
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. OpenVINS supports mono, stereo, multi-camera (1–N cameras) + IMU; mono is a documented first-class mode.
|
||||
- **Summary**: Modular MSCKF (Multi-State Constraint Kalman Filter) implementation built around an Extended Kalman filter that fuses inertial state with sparse visual feature tracks via the sliding-window MSCKF formulation (Mourikis & Roumeliotis 2007). Supports SLAM features (in-state landmarks) plus pure MSCKF features (out-of-state). ROS1 + ROS2 (Humble) builds documented; Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble compilation **confirmed working** by community contributors (rpng/open_vins issue #421, fdcl-gwu/openvins_jetson_realsense Nov 2025 setup guide). **License: GPL-3.0** — same dual-use distribution constraint. Reported latency ~270 ms on Xavier NX (4-core, ARM, 40% CPU usage) per issue #164; needs Jetson-Orin-Nano-Super MVE for production budget verification.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate
|
||||
|
||||
|
||||
### Source #46
|
||||
- **Title**: Run Your Visual-Inertial Odometry on NVIDIA Jetson — Benchmark Tests on a Micro Aerial Vehicle (Jeon, Jung, Lee, Choi, Myung — KAIST)
|
||||
- **Link**: https://arxiv.org/abs/2103.01655 ; KAIST VIO dataset: https://github.com/zinuok/kaistviodataset
|
||||
- **Tier**: L1 (peer-reviewed conference, IROS-track preprint with public dataset)
|
||||
- **Publication Date**: arXiv 2021-03-02
|
||||
- **Timeliness Status**: ⚠️ Older than the 18-month Critical-novelty window, but **uniquely authoritative** for the specific question "do these VIO algorithms run on a Jetson?"; the included algorithms (VINS-Mono, VINS-Fusion, ROVIO, ALVIO, Stereo-MSCKF, Kimera, ORB-SLAM2-stereo) are all classical baselines whose runtime characteristics on ARM CPUs have not changed materially. Jetson hardware comparison (TX2 / Xavier NX / AGX Xavier) does NOT include Orin Nano — must extrapolate.
|
||||
- **Version Info**: Conference paper.
|
||||
- **Target Audience**: UAV state-estimation engineers picking a VIO for a Jetson companion
|
||||
- **Research Boundary Match**: **Strong match for the question**, partial for the hardware (no Orin Nano). KAIST VIO dataset is indoor mocap, not UAV-aerial-nadir — the *latency / CPU / memory* numbers transfer; the *accuracy* numbers do not transfer to our domain.
|
||||
- **Summary**: Comprehensive benchmark of 9 algorithms on TX2, Xavier NX, AGX Xavier: VINS-Mono, VINS-Fusion (CPU), VINS-Fusion-gpu, VINS-Fusion-imu, ROVIO, Stereo-MSCKF, ALVIO, Kimera, ORB-SLAM2-stereo. **Hard findings**: (a) on TX2, **VINS-Fusion (CPU) and VINS-Fusion-imu fail to run** due to insufficient memory and CPU performance — VINS-Fusion-gpu does run; (b) all algorithms except ROVIO show >100% CPU usage (multi-core utilisation, OK for our 6-core Orin Nano A78AE); (c) Kimera has the highest memory usage among VIO methods (numerous computations per keyframe), failure-prone on Xavier NX-class memory; (d) Stereo-MSCKF has the lowest memory among stereo VIOs; (e) ROVIO has the lowest CPU usage owing to its patch-tracking formulation. **Implication for project**: Jetson Orin Nano Super (8 GB shared, 6-core A78AE, Ampere GPU, 67 TOPS sparse INT8) is between Xavier NX and AGX Xavier in CPU performance and memory; algorithms passing on Xavier NX should pass on Orin Nano Super, but VINS-Fusion-imu's TX2 failure is a yellow-flag for memory pressure under co-resident C2/C3/C5 modules.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 (VINS-Mono / VINS-Fusion / OpenVINS / Kimera / Stereo-MSCKF / ROVIO Jetson runtime evidence), SQ5 (resource-budget failure modes)
|
||||
|
||||
|
||||
### Source #47
|
||||
- **Title**: OKVIS2 — Realtime Scalable Visual-Inertial SLAM with Loop Closure (Leutenegger, ETH/Imperial/TUM Smart Robotics Lab)
|
||||
- **Link**: https://github.com/ethz-mrl/okvis2 ; arXiv: https://arxiv.org/abs/2202.09199 ; LICENSE: https://github.com/ethz-mrl/okvis2/blob/main/LICENSE
|
||||
- **Tier**: L1 (canonical implementation; arXiv 2022 by paper author)
|
||||
- **Publication Date**: original arXiv 2022; OKVIS2-X T-RO 2025 successor (Boche, Jung, Laina, Leutenegger — IEEE T-RO 2025, vol 41 pp 6064–6083, DOI 10.1109/TRO.2025.3619051; arXiv 2510.04612, Oct 2025). Repository last push 2026-03-17 (ethz-mrl/OKVIS2-X).
|
||||
- **Timeliness Status**: ✅ **Current.** Active development through 2026; OKVIS2-X is the most recent published VI-SLAM system in this class.
|
||||
- **Version Info**: ethz-mrl/okvis2 (core) and ethz-mrl/OKVIS2-X (multi-sensor extension with optional GNSS / LiDAR / dense depth).
|
||||
- **Target Audience**: Factor-graph VI-SLAM implementers; mid-large-scale loop-closure use cases
|
||||
- **Research Boundary Match**: **Full match** for monocular+IMU mode. OKVIS2 README + paper explicitly support mono and multi-camera VI configurations. OKVIS2-X adds GNSS fusion (relevant: VINS-Fusion-style GPS-when-available drop-in IS the project's eventual posture in non-spoofed regions).
|
||||
- **Summary**: Factor-graph VI-SLAM with bounded-size optimization. Innovation: pose-graph edges from marginalised observations can be "seamlessly turned back into observations" upon loop closure, reviving old landmarks and reprojection errors. Includes lightweight CNN segmentation for dynamic-region removal. OKVIS2-X (2025) generalises the core to fuse multi-camera + IMU + optional GNSS + LiDAR/depth — directly aligned with project's "VIO that may opportunistically fuse a non-spoofed GPS update" pattern and AC-NEW-2's spoof-promotion path. **License: 3-clause BSD (permissive)** — no copyleft / dual-use distribution friction. Note: GitHub UI shows "Other (NOASSERTION)" because of the standard BSD clause language pattern; the LICENSE file is canonical 3-clause BSD.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 lead candidate (factor-graph + permissive license + active maintenance)
|
||||
|
||||
|
||||
### Source #48
|
||||
- **Title**: OKVIS2-X: Open Keyframe-based Visual-Inertial SLAM Configurable with Dense Depth or LiDAR, and GNSS (Boche, Jung, Laina, Leutenegger — TUM / ETH Zurich Smart Robotics Lab)
|
||||
- **Link**: https://github.com/ethz-mrl/OKVIS2-X ; arXiv: https://arxiv.org/abs/2510.04612 ; IEEE T-RO 2025 vol 41 pp 6064–6083 DOI 10.1109/TRO.2025.3619051
|
||||
- **Tier**: L1 (peer-reviewed IEEE Transactions on Robotics, Special Issue Visual SLAM 2025)
|
||||
- **Publication Date**: arXiv 2025-10-04; T-RO 2025 vol 41
|
||||
- **Timeliness Status**: ✅ Current (within 6-month Critical-novelty window)
|
||||
- **Version Info**: 295 stars; 38 forks; 2 contributors; created 2025-09-23, last push 2026-03-17. License: NOASSERTION on GitHub UI; per-paper license follows ethz-mrl convention (BSD-3 derived).
|
||||
- **Target Audience**: Multi-sensor SLAM researchers; large-scale VI-SLAM with optional GNSS/LiDAR
|
||||
- **Research Boundary Match**: **Strong match** — extends OKVIS2 monocular+IMU mode with optional GNSS fusion (Visual-Inertial SLAM with Tightly-Coupled Dropout-Tolerant GPS Fusion lineage from IROS 2022). Project's `MAV_CMD_SET_EKF_SOURCE_SET` switch + companion-side spoof-detection conceptually mirrors OKVIS2-X's "GPS as drop-out-tolerant signal".
|
||||
- **Summary**: Non-trivial extension of OKVIS2; submap-based volumetric occupancy mapping. Demonstrates that the OKVIS2 factor-graph backbone can absorb spoofing-aware GPS without re-architecting. Useful as architectural template for project's C5 estimator + C8 adapter integration. License: same as OKVIS2 (BSD-3-derived). Two named contributors (bochsim, SebsBarbas) actively pushing through Mar 2026.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 (OKVIS2 lineage; VI-SLAM with optional GPS/LiDAR), SQ8 (GPS-fusion dropout-tolerant lineage)
|
||||
|
||||
|
||||
### Source #49
|
||||
- **Title**: Kimera-VIO — Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation (MIT-SPARK)
|
||||
- **Link**: https://github.com/MIT-SPARK/Kimera-VIO ; LICENSE.BSD: https://github.com/MIT-SPARK/Kimera-VIO/blob/master/LICENSE.BSD
|
||||
- **Tier**: L1 (canonical implementation by MIT SPARK Lab)
|
||||
- **Publication Date**: original 2020 (Rosinol, Abate, Chang, Carlone — ICRA 2020); ongoing development through 2024–2025 issue threads (Dec 2024 / Feb 2025 ROS2 / mono-inertial discussion).
|
||||
- **Timeliness Status**: ✅ Active maintenance (recent issues / PRs through 2025).
|
||||
- **Version Info**: master-branch-only; LICENSE.BSD = BSD 2-Clause "Simplified".
|
||||
- **Target Audience**: VI-SLAM + mesh-mapping researchers
|
||||
- **Research Boundary Match**: **Partial.** Stereo+IMU is the primary supported configuration; mono+IMU is **optional but documented**. Kimera also produces 3D mesh and high-level semantic labels (relevant to neither C1 nor the project's bandwidth budget — overhead).
|
||||
- **Summary**: Frontend (image processing + IMU pre-integration) + Backend (factor-graph optimization with GTSAM, e.g. its iSAM2 solver) + Mesher + Pose-Graph-Optimizer. **License: BSD 2-Clause (permissive)**, so no dual-use distribution friction. **Penalty for project**: the Source #46 KAIST benchmark found that Kimera has the highest memory usage among the VIOs tested (numerous computations per keyframe), and that Kimera failed to fit in Xavier-NX-class memory under multi-process load. Mesh + semantic features are unused by the project, so Kimera's overhead is unjustified vs OKVIS2 / OpenVINS for the project's narrow C1 mandate. **Status**: viable secondary fallback if OKVIS2 / VINS-Mono runtime issues arise; not a lead candidate due to overhead misfit.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 secondary candidate (BSD-permissive but resource-heavy)
|
||||
|
||||
|
||||
### Source #50
|
||||
- **Title**: DROID-SLAM — Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras (princeton-vl, Teed & Deng)
|
||||
- **Link**: https://github.com/princeton-vl/droid-slam ; arXiv: https://arxiv.org/abs/2108.10869 ; NeurIPS 2021
|
||||
- **Tier**: L1 (canonical reference)
|
||||
- **Publication Date**: NeurIPS 2021; repository latest tagged baseline.
|
||||
- **Timeliness Status**: ✅ Foundational reference; DPV-SLAM (Source #51) is the lighter successor.
|
||||
- **Version Info**: master-branch-only.
|
||||
- **Target Audience**: Deep-learning-based VO/VSLAM researchers
|
||||
- **Research Boundary Match**: **Disqualified by hardware budget.** Inference requires ≥11 GB GPU VRAM per official README; project budget is 8 GB **shared CPU+GPU** on Jetson Orin Nano Super, leaving <8 GB for VO + VPR + matcher + estimator + cache co-resident. DROID-SLAM is also **monocular VO/SLAM, not VIO** — no native IMU fusion; metric scale recovery requires external scale alignment.
|
||||
- **Summary**: Recurrent dense bundle adjustment over a complete history of camera poses. State-of-the-art accuracy on TartanAir / EuRoC / TUM-RGBD at the cost of GPU memory. **Disqualified outright for C1 lead** by AC-4.2 (≤8 GB shared RAM) and the lack of IMU fusion that would require an additional ESKF/UKF wrapping. Kept as **reference baseline** to be cited as "what we cannot afford" in `solution_draft01`.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 disqualified candidate
|
||||
|
||||
|
||||
### Source #51
|
||||
- **Title**: DPVO — Deep Patch Visual Odometry (princeton-vl, Teed, Lipson, Deng) + DPV-SLAM (Lipson, Teed, Deng — ECCV 2024)
|
||||
- **Link**: https://github.com/princeton-vl/DPVO ; LICENSE: https://github.com/princeton-vl/DPVO/blob/main/LICENSE ; ECCV 2024 paper: https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00272.pdf
|
||||
- **Tier**: L1 (canonical implementation; NeurIPS 2023 + ECCV 2024)
|
||||
- **Publication Date**: NeurIPS 2023 (DPVO); ECCV 2024 (DPV-SLAM); repository last update 2024-10-12.
|
||||
- **Timeliness Status**: ⚠️ Borderline. ~19 months since last code update; ECCV-2024 publication of DPV-SLAM keeps the algorithm class within the 6-month claim window for the SLAM successor.
|
||||
- **Version Info**: 989 stars; primary languages C++ / Python / CUDA. **License: MIT (permissive)** — no dual-use distribution friction.
|
||||
- **Target Audience**: Deep-learning VO/SLAM with reduced memory footprint
|
||||
- **Research Boundary Match**: **Partial.** DPVO is **monocular VO only — no IMU fusion**. Output pose is in arbitrary scale (no metric scale recovery). To be a viable C1 candidate the project must wrap DPVO with an external IMU+scale-fusion stage (loosely-coupled ESKF / VIO-fusion module). This makes DPVO **not a drop-in C1** like VINS-Mono / OpenVINS / OKVIS2; it is a **VO module that needs a separate VIO wrapper**.
|
||||
- **Summary**: Sparse patch tracking + differentiable bundle adjustment back end. Outperforms DROID-SLAM on TartanAir / EuRoC ATE while using ~1/3 of DROID-SLAM's GPU memory (DROID-SLAM: 8.7 GB VO mode vs DPVO: ~3 GB). DPV-SLAM (Lipson, Teed, Deng — ECCV 2024) adds full SLAM capability with 4–5 GB GPU usage. **Jetson runtime evidence**: indirect via DPVO-QAT++ (Source #52) — peak reserved memory 1.02 GB on RTX 4060 (8 GB) after INT8 fake-quant + custom CUDA kernel fusion; not directly tested on Jetson Orin Nano. **Status for C1**: pure-VO candidate (must be paired with separate IMU integration to deliver metric scale + attitude); would not satisfy "monocular VIO" gate alone, but viable as the *VO half* of a hybrid C1+C5 design.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 conditional candidate (VO not VIO; needs external IMU wrapper)
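
The memory comparison in this entry reduces to simple arithmetic against the 8 GB shared-memory budget (AC-4.2) cited in this registry. The sketch below uses only figures quoted here; `RESERVED_FOR_OTHER_GB` is a hypothetical placeholder for the co-resident components (VPR, matcher, estimator, cache), not a measured value, and `fits_budget` is an illustrative helper name.

```python
# GPU memory figures quoted in this registry, in GB. DPV-SLAM uses the upper
# end of the quoted 4-5 GB range; the DPVO-QAT++ number was measured on an
# RTX 4060, not on a Jetson.
QUOTED_GPU_MEMORY_GB = {
    "DROID-SLAM (VO mode)": 8.7,
    "DPVO": 3.0,
    "DPV-SLAM": 5.0,
    "DPVO-QAT++ (RTX 4060)": 1.02,
}

SHARED_BUDGET_GB = 8.0       # AC-4.2: CPU and GPU share this memory
RESERVED_FOR_OTHER_GB = 4.0  # ASSUMPTION: hypothetical headroom for other components


def fits_budget(required_gb, budget_gb=SHARED_BUDGET_GB, reserved_gb=RESERVED_FOR_OTHER_GB):
    """Return True if a candidate fits in what remains after the reserve."""
    return required_gb <= budget_gb - reserved_gb


available = SHARED_BUDGET_GB - RESERVED_FOR_OTHER_GB
for name, gb in QUOTED_GPU_MEMORY_GB.items():
    verdict = "fits" if fits_budget(gb) else "exceeds"
    print(f"{name}: {gb:.2f} GB {verdict} the {available:.1f} GB slice")
```

Changing the reserve changes the verdict for DPV-SLAM but not for DROID-SLAM, which exceeds the full 8 GB on its own.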
|
||||
|
||||
|
||||
### Source #52
|
||||
- **Title**: DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry (Cheng Liao)
|
||||
- **Link**: https://arxiv.org/abs/2511.12653 ; project HTML: https://arxiv.org/html/2511.12653
|
||||
- **Tier**: L2 (single-author preprint, code partially released; no peer-review yet)
|
||||
- **Publication Date**: arXiv 2025-11-16 (within 6-month Critical-novelty window)
|
||||
- **Timeliness Status**: ✅ Current
|
||||
- **Version Info**: arXiv preprint; code & weights released for QAT-only and fused-CUDA variants.
|
||||
- **Target Audience**: Embedded-platform DPVO deployers
|
||||
- **Research Boundary Match**: **Partial.** Hardware tested = RTX 4060 (8 GB) + Intel Core Ultra 5-125H + 32 GB RAM — desktop GPU, NOT Jetson Orin Nano. Direct extrapolation requires Jetson MVE; Orin Nano Super's Ampere GPU is architecturally similar but smaller than RTX 4060.
|
||||
- **Summary**: Quantization-Aware Training framework for DPVO with fused CUDA kernels. Reduces peak GPU memory from 1.94 GB → 1.02 GB (-47%) on a representative TartanAir sequence; +34.6% median FPS on TartanAir, +26.7% on EuRoC; -22.8 ms / -19.7 ms median P99 tail latency on TartanAir / EuRoC respectively. Heterogeneous precision: front-end pseudo-quantization (FP16/FP32 with INT8 simulation) + FP32 back-end geometric solver. **Implication for project**: shows DPVO has a documented Jetson-suitable footprint **path** but not a Jetson-Orin-Nano measurement. ATE accuracy comparable to baseline DPVO across 32 TartanAir + 11 EuRoC validation sequences. Notable: requires a teacher-student distillation training pipeline before deployment — adds operational complexity vs classical VINS-* / OpenVINS / OKVIS2.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 supporting evidence for DPVO embedded feasibility
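
As a quick sanity check on the quoted memory figures, the relative reduction can be recomputed from the two peak values. The helper name `relative_change_percent` is illustrative only.

```python
def relative_change_percent(before, after):
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Peak GPU memory quoted for baseline DPVO vs DPVO-QAT++ (GB, RTX 4060).
peak_memory_change = relative_change_percent(1.94, 1.02)
print(f"Peak GPU memory change: {peak_memory_change:.1f}%")  # -> -47.4%
```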
|
||||
|
||||
|
||||
### Source #53
|
||||
- **Title**: Pure VO baseline — KLT optical flow + 5-point essential matrix or homography RANSAC (OpenCV reference)
|
||||
- **Link**: https://docs.opencv.org/4.x/d4/dee/tutorial_optical_flow.html ; representative public implementation: https://github.com/alishobeiri/Monocular-Video-Odometery (MIT, 2018) ; tutorial reference: https://zxh.me/posts/2022-12-19-monocular-visual-odometry/
|
||||
- **Tier**: L1 (OpenCV official documentation) + L2 (representative public implementations)
|
||||
- **Publication Date**: OpenCV docs continuously updated; tutorial 2022-12; reference implementation 2018 (algorithmic class is foundational, no time window per Step 0.5)
|
||||
- **Timeliness Status**: ✅ Foundational baseline (no time window).
|
||||
- **Version Info**: OpenCV `cv::calcOpticalFlowPyrLK` (KLT) + `cv::findEssentialMat` (5-point Nister) or `cv::findHomography` with RANSAC.
|
||||
- **Target Audience**: Implementers needing a transparent low-complexity fallback
|
||||
- **Research Boundary Match**: **Full match for the simple-baseline candidate.** Suits planar nadir-down UAV at altitude (Ukrainian steppe is ~planar at 1 km AGL — homography is geometrically appropriate; for non-planar relief the essential matrix path is more appropriate but adds scale-recovery work).
|
||||
- **Summary**: Established classical pipeline: Shi-Tomasi or FAST corner detection → KLT pyramidal optical flow tracking → 5-point essential matrix or homography RANSAC → relative pose with arbitrary scale (must be metric-scale-aligned via IMU integration externally). Reference implementations widely available in OpenCV samples and pedagogical repos. **Status**: candidate as the project's `Simple baseline / known-runnable / known-failure-mode` C1 option per Component Option Breadth rule. Not a lead, but mandatory fallback presence per the research engine's "include at least one simple baseline" rule.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 simple-baseline candidate
|
||||
|
||||
|
||||
### Source #54
|
||||
- **Title**: OpenVINS — `context7` per-mode capability lookup (`/rpng/open_vins`, master)
|
||||
- **Link**: context7 query against `/rpng/open_vins`, accessed 2026-05-08; canonical doc references returned: `https://github.com/rpng/open_vins/blob/master/docs/gs-tutorial.dox`, `https://github.com/rpng/open_vins/blob/master/docs/gs-datasets.dox`, `https://github.com/rpng/open_vins/blob/master/docs/gs-calibration.dox`, `https://github.com/rpng/open_vins/blob/master/docs/propagation-analytical.dox`
|
||||
- **Tier**: L1 (project-official documentation reachable via the project's documentation generator)
|
||||
- **Publication Date**: live docs (master, accessed 2026-05-08)
|
||||
- **Timeliness Status**: ✅ Within Critical-novelty window (active master + community evidence through 2025–2026)
|
||||
- **Version Info**: master HEAD at access time (no tagged release for ROS 2 path; ROS 1 / ROS 2 build paths both documented)
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Research Boundary Match**: **Full match** for monocular + IMU mode. The `subscribe.launch.py` ROS 2 launch script (and its ROS 1 sibling) declare `use_stereo` and `max_cameras` as DeclareLaunchArguments — setting `use_stereo:=false max_cameras:=1` selects monocular operation; `config:=` selects an estimator-config directory (`euroc_mav`, `tum_vi`, `rpng_aruco`, …). KALIBR + RPNG IMU intrinsic calibration models are both documented in `propagation-analytical.dox` with the corresponding state-vector composition.
|
||||
- **Summary**: Confirms documentary evidence for OpenVINS' three sensor configurations exposed at the launch layer (mono / stereo / multi-camera), all with IMU mandatory; confirms the project's pinned mode (`use_stereo:=false max_cameras:=1`) is a first-class launch configuration that requires no source patch. Confirms that estimator config files in `ov_msckf/config/<dataset>/estimator_config.yaml` are the parameter-tuning surface and that supported IMU intrinsic models include both KALIBR and RPNG. **Open**: `context7` Disqualifier-Probe query did not surface explicit per-mode latency/memory limits or sub-20-Hz validation evidence; those constraints carry into the Jetson-Orin-Nano-Super hardware MVE (D-C1-2 deferred phase).
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 — OpenVINS per-mode API capability verification (Mandatory `context7` lookup per Per-Mode API Capability Verification rule)
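
The pinned monocular configuration described in this entry amounts to three launch arguments. The sketch below only assembles the command line and does not run it. The helper name `openvins_mono_launch_cmd` is hypothetical, and `ov_msckf` as the launch package is an assumption inferred from the `ov_msckf/config/...` path quoted in this entry.

```python
import shlex


def openvins_mono_launch_cmd(config="euroc_mav"):
    """Build the ROS 2 launch argv for monocular + IMU operation."""
    return [
        "ros2", "launch", "ov_msckf", "subscribe.launch.py",
        "use_stereo:=false",   # disable stereo pairing
        "max_cameras:=1",      # a single camera stream
        f"config:={config}",   # estimator-config directory name
    ]


print(shlex.join(openvins_mono_launch_cmd()))
```

Keeping the arguments in a list makes them easy to pass to `subprocess.run` on a machine where ROS 2 and OpenVINS are installed.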
|
||||
|
||||
|
||||
### Source #55
|
||||
- **Title**: VINS-Mono — official README + VINS-Fusion `context7` per-mode capability lookup (`/hkust-aerial-robotics/vins-fusion`, master) [cross-source documentary evidence for the mono+IMU mode shared with VINS-Mono]
|
||||
- **Link**: VINS-Mono README — https://raw.githubusercontent.com/HKUST-Aerial-Robotics/VINS-Mono/master/README.md (accessed 2026-05-08); VINS-Fusion docs — context7 query against `/hkust-aerial-robotics/vins-fusion`, accessed 2026-05-08, canonical reference returned: https://github.com/hkust-aerial-robotics/vins-fusion/blob/master/README.md
|
||||
- **Tier**: L1 (project-official READMEs of both repos)
|
||||
- **Publication Date**: VINS-Mono README — 2019-01-11 last major revision (master-branch only, no tagged releases); VINS-Fusion docs — live (master, accessed 2026-05-08)
|
||||
- **Timeliness Status**: ⚠️ borderline (per Step 0.5 timeliness — VINS-Mono master last meaningful commit 2024-02-25 / 2024-05-23; older than the 18-month preferred window for live API behaviour, but the algorithm class remains the canonical mono+IMU sliding-window VIO referenced by 2025 community work — see Fact #36)
|
||||
- **Version Info**: VINS-Mono master HEAD; depends on Ceres v1.14.0 (versions ≥2.0.0 have build issues per README). VINS-Fusion master HEAD has `euroc_mono_imu_config.yaml` as a first-class config.
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Research Boundary Match**: **Full match** for the project's pinned mode (mono + IMU). VINS-Mono is single-mode by construction — "real-time SLAM framework for **Monocular Visual-Inertial Systems**" — the project's pinned mode is the only mode the project will use the binary in. VINS-Fusion `euroc_mono_imu_config.yaml` is the documentary cross-source evidence that the algorithmic mono+IMU path remains a first-class configuration in the same authors' active fork.
|
||||
- **Summary**: Confirms VINS-Mono = monocular + IMU only (single mode); ROS Kinetic / Ubuntu 16.04 reference build; pinhole + MEI camera models supported; rolling-shutter support with calibrated reprojection error <0.5 px; online camera-IMU extrinsic + temporal calibration; loop closure via DBoW2; pose-graph reuse and map merge supported. **Critical recommended-input bound**: README §5.1 — *"The image should exceed 20Hz and IMU should exceed 100Hz."* — the project's nav cam target is 3 fps; this is a documentary signal that VIO performance below the recommended frame rate is not validated by the upstream authors. License: GPLv3 (confirmed in README §8). **Cross-source note**: VINS-Fusion `euroc_mono_imu_config.yaml` is named explicitly in `context7` results and uses the same algorithmic core; treat as evidence for VINS-Mono's mono+IMU mode while honouring the per-mode rule that VINS-Fusion's mono+IMU mode is a separately-cataloged candidate (Fact #29).
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 — VINS-Mono per-mode API capability verification (Mandatory `context7` lookup per Per-Mode API Capability Verification rule, with cross-source documentary evidence from VINS-Fusion since VINS-Mono itself is not indexed in `context7`)
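
The gap between the README's recommended input rates and the project's 3 fps navigation camera can be made explicit with a small check. The thresholds come from the README quote in this entry; the 200 Hz IMU rate and the helper name `below_recommended` are hypothetical placeholders.

```python
# Rates that the VINS-Mono README (section 5.1) says inputs should exceed.
RECOMMENDED_MIN_HZ = {"image": 20.0, "imu": 100.0}


def below_recommended(rates_hz):
    """Return the streams whose rate does not exceed the recommendation."""
    return {name: hz for name, hz in rates_hz.items()
            if hz <= RECOMMENDED_MIN_HZ[name]}


project_rates = {"image": 3.0, "imu": 200.0}  # IMU rate is an ASSUMPTION
flagged = below_recommended(project_rates)
shortfall = RECOMMENDED_MIN_HZ["image"] / project_rates["image"]
print(flagged, f"image stream is about {shortfall:.1f}x below the recommendation")
```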
|
||||
|
||||
|
||||
### Source #56
|
||||
- **Title**: OKVIS2 — official README (`smartroboticslab/okvis2`, main)
|
||||
- **Link**: https://raw.githubusercontent.com/smartroboticslab/okvis2/main/README.md (accessed 2026-05-08); papers cited in README: arXiv:2202.09199 (Leutenegger 2022), IJRR 2015, RSS 2013
|
||||
- **Tier**: L1 (project-official README; arXiv canonical paper)
|
||||
- **Publication Date**: README live; canonical paper 2022-02; OKVIS2 `main` last push within the Critical-novelty window (per the Fact #36 timeliness audit; the OKVIS2-X push of 2026-03-17 confirms active maintenance)
|
||||
- **Timeliness Status**: ✅ Fully within Critical-novelty window
|
||||
- **Version Info**: OKVIS2 main HEAD; cmake build with optional ROS 2 wrapping (`BUILD_ROS2=ON`); optional sky-segmentation CNN via LibTorch (`USE_NN=OFF` to disable)
|
||||
- **Target Audience**: System architects + C1 implementer + Step-7.5 reviewer
|
||||
- **Research Boundary Match**: **Full match** for the project's pinned mode (mono + IMU). README confirms multi-camera support (camera frames `C_i` for the i-th camera) plus IMU mandatory; mono operation is a documented configuration via the example apps (`okvis_app_synchronous`, `okvis_app_realsense`). OKVIS2-X is the GNSS-fusion extension (T-RO 2025) that aligns architecturally with the project's spoof-promotion path.
|
||||
- **Summary**: Confirms OKVIS2 = keyframe-based VI-SLAM (factor-graph backbone with loop closure); BSD-3 license (no copyleft); coordinate-frame contract (`W` world, `C_i` cameras, `S` IMU, `B` body); state representation (`T_WS` pose + velocity + gyro/accel biases); two-callback API (`setOptimisedGraphCallback` for batch updates incl. loop closure + `setImuCallback` for high-rate prediction). Calibration prerequisites: camera intrinsics + camera-IMU extrinsics + IMU noise parameters + tight time sync (Kalibr toolchain explicitly recommended). Optional LibTorch sky-segmentation CNN can be disabled (`USE_NN=OFF`) to remove a major Jetson dependency. ROS 2 build path (`BUILD_ROS2=ON`) with `okvis_node_realsense.launch.xml`, `okvis_node_realsense_publisher.launch.xml`, `okvis_node_subscriber.launch.xml`, `okvis_node_synchronous.launch.xml`. **Health warning** in README: poor calibration → poor results; this is shared with all VI candidates but is more strongly emphasised in OKVIS2 docs. **Open**: README does not state explicit minimum frame rate (cf. VINS-Mono's documented 20 Hz minimum) — keyframe-based selection generally tolerates lower input frame rates than sliding-window optimisation; this needs explicit Jetson MVE validation at 3 fps.
|
||||
- **Related Sub-question**: SQ3+SQ4 / C1 — OKVIS2 per-mode API capability verification (Mandatory `context7` lookup per Per-Mode API Capability Verification rule, with WebFetch fallback to official README since `context7` returned no match)
|
||||
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -0,0 +1,88 @@
|
||||
# Source Registry — C4 — Pose estimation (PnP + RANSAC + LM) candidates
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> This file replaces a section of the previous monolithic `01_source_registry.md`. See `00_summary.md` for the full category index. Investigation order is tracked in `../00_question_decomposition.md` and the cross-category Investigation Status table in `00_summary.md`.
|
||||
|
||||
---
|
||||
|
||||
### Source #82
|
||||
- **Title**: OpenCV canonical implementation: `opencv/opencv` (Open Source Computer Vision Library) GitHub repository metadata via GitHub API + LICENSE. **Apache-2.0** (`license.spdx_id: "Apache-2.0"`); 87385 stars + 56554 forks + 2606 subscribers + 2732 open issues; created 2012-07-19; **last pushed 2026-05-08T07:00:03Z = TODAY at access time** (daily-active maintenance); default branch `4.x`; size ≈555 MB (the GitHub API `size` field is reported in KB); topics include `c-plus-plus, computer-vision, deep-learning, image-processing, opencv`
|
||||
- **Link**: GitHub API metadata https://api.github.com/repos/opencv/opencv (accessed 2026-05-08; `license.spdx_id: "Apache-2.0"` confirmed); canonical repo https://github.com/opencv/opencv ; canonical website https://opencv.org ; LICENSE file https://raw.githubusercontent.com/opencv/opencv/4.x/LICENSE (Apache License 2.0 standard text)
|
||||
- **Tier**: L1 (project-official codebase by the OpenCV organization; canonical reference computer-vision library used by every modern computer-vision deployment as the de-facto industry-standard classical-CV foundation; cited by every C-row component's deployment guide; canonical solvePnPRansac is the industry-standard reference RANSAC-PnP implementation that every modern alternative [OpenGV, GTSAM-PnP, Theia, Ceres-only] compares against in its own documentation)
|
||||
- **Publication Date**: original 2000 (Intel) → 1.0 release 2006 → Willow Garage support from 2008 → non-profit OpenCV.org from 2012 → canonical 4.x branch under active continuous development; access date 2026-05-08; daily commits to `4.x` branch
|
||||
- **Timeliness Status**: ✅ Within Established-baseline-reference window (2000+ — established competitive ground for classical computer-vision + RANSAC-PnP reference; Established-competitive-mandatory-baseline exemption applies — `cv::solvePnPRansac` is the **canonical RANSAC-PnP reference baseline** that defines the mandatory-simple-baseline role for the C4 row per the engine Component Option Breadth rule, structurally analogous to NetVLAD's role in C2 row + SuperGlue+SuperPoint's role in C3 row)
|
||||
- **Version Info**: 4.14.0-pre at access time (default branch `4.x` = next-major-release rolling-development branch; current stable release 4.10.0 from late 2025 at access date — 4.x is the project's pinned major version per Source #83 documentation footer "Generated on Fri May 8 2026 04:21:44 for OpenCV by 1.12.0"); JetPack 6 ships canonical `libopencv_calib3d.so` for ARM Cortex-A78AE = the project's pinned Jetson Orin Nano Super deployment runtime
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 — clean Apache-2.0) + C7 (Jetson runtime) implementer (canonical OpenCV is shipped with JetPack 6 distribution)
|
||||
- **Research Boundary Match**: **Full match** for the project's pinned C4 mandatory-simple-baseline mode (per-frame pose-from-correspondences via classical RANSAC-PnP with paired Levenberg-Marquardt refinement). The canonical `opencv/opencv` library ships everything needed for C4 deployment: `cv::solvePnPRansac` two function signatures (classical + USAC variant), nine `SolvePnPMethod` enum values, paired `cv::solvePnPRefineLM` LM refinement + alternate `cv::solvePnPRefineVVS` Gauss-Newton SO(3) refinement, paired `cv::solvePnPGeneric` for multi-solution + per-solution reprojection-error reporting, `cv::projectPoints` Jacobian for D-C4-2 post-hoc covariance recovery. **N/A for the project's domain caveat** — OpenCV solvePnPRansac is a classical algorithm with no training data; D-C2-1 retrain decision is irrelevant for OpenCV solvePnPRansac
|
||||
- **Summary**: OpenCV is the canonical industry-standard open-source computer vision library; the calib3d module ships `cv::solvePnPRansac` as the canonical RANSAC-PnP reference implementation. **CRITICAL LICENSE FINDING**: Apache-2.0 (`license.spdx_id: "Apache-2.0"`) — permissive, BSD/permissive license track on the C4 mandatory-simple-baseline; **deployment-ready under every D-C1-1 license-posture choice** with the cleanest license-compliance story tied with cvg/LightGlue + DISK + XFeat. **Daily-active maintenance**: last pushed 2026-05-08 (TODAY at access time) — among the most actively-maintained C-row references across all components evaluated. **Industry-standard reference status**: 87385 stars + 56554 forks + 2606 subscribers — the dominant industry-standard reference implementation that every modern C4 alternative (OpenGV, GTSAM-PnP, Theia, Ceres-only) compares against in its own documentation. **JetPack 6 canonical distribution**: canonical OpenCV is shipped with JetPack 6 distribution, providing zero-effort deployment for the project's pinned Jetson Orin Nano Super runtime
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — OpenCV solvePnPRansac per-mode API capability verification (Mandatory `context7` lookup MCP-validation-error + WebFetch fallback PASS per Per-Mode rule item 2; cross-validated against canonical GitHub API license metadata WebFetch + canonical OpenCV calib3d module documentation [Source #83]); **D-C1-1 license-posture compliance**: clean Apache-2.0 throughout; **Mandatory-simple-baseline role per engine Component Option Breadth rule** confirmed; **JetPack 6 canonical distribution** documented
|
||||
|
||||
|
||||
### Source #83
|
||||
- **Title**: OpenCV 4.x calib3d module canonical documentation — group `cv::calib3d` (Camera Calibration and 3D Reconstruction) at `https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html` + Perspective-n-Point (PnP) pose computation tutorial at `https://docs.opencv.org/4.x/d5/d1f/calib3d_solvePnP.html`; `cv::solvePnPRansac` two function signatures (classical with `iterationsCount=100, reprojectionError=8.0, confidence=0.99, flags=SOLVEPNP_ITERATIVE` defaults + USAC variant with `UsacParams` and `cameraMatrix` as `InputOutputArray` for focal-length refinement); Python bindings; `cv::SolvePnPMethod` enum 9 values; `cv::solvePnPRefineLM` + alternate `cv::solvePnPRefineVVS`; `cv::solvePnPGeneric` for multi-solution + per-solution reprojection-error reporting; USAC RANSAC-method enum 7 modern variants
|
||||
- **Link**: calib3d module documentation https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html (accessed 2026-05-08); PnP tutorial page https://docs.opencv.org/4.x/d5/d1f/calib3d_solvePnP.html (accessed 2026-05-08); both pages footer-stamped "Generated on Fri May 8 2026 04:21:44 for OpenCV by 1.12.0" — fresh canonical documentation at the project's evaluation time
|
||||
- **Tier**: L1 (canonical project-official documentation by the OpenCV organization; the canonical reference for the `cv::solvePnPRansac` function signature, parameter defaults, paired refinement variants, minimal-solver enum values, and structural caveats; auto-generated by Doxygen 1.12.0 from canonical opencv/opencv source code at `4.x` branch)
|
||||
- **Publication Date**: rolling Doxygen documentation auto-regenerated on every push to `4.x` branch; access date 2026-05-08 04:21:44 page-generation timestamp
|
||||
- **Timeliness Status**: ✅ Within Established-baseline-reference window (rolling Doxygen documentation; the canonical reference for `cv::solvePnPRansac` API surface at the project's evaluation time)
|
||||
- **Version Info**: 4.14.0-pre at access time (default branch `4.x` = next-major-release rolling-development branch).
  - **Mode-enumeration query (1/3): context7 MCP-validation-error + WebFetch fallback PASS.** `context7 resolve-library-id` returned MCP validation errors (parameter schema mismatch on both the `query` and `libraryName` argument names; the context7 server expects a different argument shape than the one provided). Per Per-Mode API Capability Verification rule item 2, the fallback was an official-docs WebFetch of the canonical OpenCV calib3d module documentation + PnP tutorial page (this Source #83).
  - **Nine `SolvePnPMethod` enum values documented** at line 243 of the calib3d.html:
    - `SOLVEPNP_ITERATIVE=0`: default; iterative LM-based on top of EPNP minimal-solver result.
    - `SOLVEPNP_EPNP=1`: Efficient Perspective-n-Point [Lepetit et al. IJCV 2009]; canonical default for ≥4 non-planar correspondences.
    - `SOLVEPNP_P3P=2`: Revisiting the P3P Problem [Ding et al. 2023]; minimal-solver for exactly-3 correspondences with up to 4 solutions.
    - `SOLVEPNP_DLS=3`: **BROKEN per explicit docstring "Broken implementation. Using this flag will fallback to EPnP"**; originally the Direct Least-Squares method [Hesch & Roumeliotis 2011].
    - `SOLVEPNP_UPNP=4`: **BROKEN per explicit docstring "Broken implementation. Using this flag will fallback to EPnP"**; originally Exhaustive Linearization for Robust Camera Pose and Focal Length Estimation [Penate-Sanchez et al. 2013].
    - `SOLVEPNP_AP3P=5`: Algebraic P3P [Ke & Roumeliotis CVPR 2017].
    - `SOLVEPNP_IPPE=6`: Infinitesimal Plane-Based Pose Estimation [Collins & Bartoli IJCV 2014]; **planar-only (object points must be coplanar), directly relevant to project's D-C4-1 = 4-DoF flat-earth lift recommendation**.
    - `SOLVEPNP_IPPE_SQUARE=7`: special-case IPPE for marker pose with 4 fixed-pattern points.
    - `SOLVEPNP_SQPNP=8`: SQPnP: A Consistently Fast and Globally Optimal Solution [Terzakis & Lourakis ECCV 2020]; **modern globally-optimal alternate without planarity restriction; second-recommended fallback if D-C4-1 chooses 6-DoF DSM lift**.
  - **`cv::solvePnPRansac` classical signature** at line 3211 of calib3d.html: `bool solvePnPRansac(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess=false, int iterationsCount=100, float reprojectionError=8.0, double confidence=0.99, OutputArray inliers=noArray(), int flags=SOLVEPNP_ITERATIVE)`. Python: `cv.solvePnPRansac(objectPoints, imagePoints, cameraMatrix, distCoeffs[, rvec[, tvec[, useExtrinsicGuess[, iterationsCount[, reprojectionError[, confidence[, inliers[, flags]]]]]]]]) -> retval, rvec, tvec, inliers`.
  - **`cv::solvePnPRansac` USAC variant signature** at line 3261: `bool solvePnPRansac(InputArray objectPoints, InputArray imagePoints, InputOutputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, OutputArray inliers, const UsacParams& params=UsacParams())`. Python: `cv.solvePnPRansac(objectPoints, imagePoints, cameraMatrix, distCoeffs[, rvec[, tvec[, inliers[, params]]]]) -> retval, cameraMatrix, rvec, tvec, inliers`. Note that `cameraMatrix` is `InputOutputArray` in the USAC variant, allowing focal-length refinement during the RANSAC loop.
  - **`cv::solvePnPRefineLM`** at line 3268: canonical default `TermCriteria(EPS+COUNT, 20, FLT_EPSILON)`. **CRITICAL CAVEAT** documented at the PnP-tutorial page: "the current implementation computes the rotation update as a perturbation and not on SO(3)" (minor structural caveat). The alternate `cv::solvePnPRefineVVS` at line 3289 uses Gauss-Newton with rotation update via exponential map on SO(3) (preferred for high-accuracy aerial pose-from-correspondences).
  - **`cv::solvePnPGeneric`** at line 370: returns multiple candidate solutions sorted by reprojection error + an `OutputArray reprojectionError` per-solution.
  - **Default minimal-sample-set method** at line 3256: "The default method used to estimate the camera pose for the Minimal Sample Sets step is `SOLVEPNP_EPNP`. Exceptions are: if you choose `SOLVEPNP_P3P` or `SOLVEPNP_AP3P`, these methods will be used; if the number of input points is equal to 4, `SOLVEPNP_P3P` is used."
  - **USAC RANSAC-method enumeration** at the calib3d.html anonymous-enum block: canonical RANSAC, LMEDS, RHO, **USAC_DEFAULT, USAC_PARALLEL, USAC_FM_8PTS, USAC_FAST, USAC_ACCURATE, USAC_PROSAC, USAC_MAGSAC**. Modern USAC variants (Barath et al.: MAGSAC, CVPR 2019; MAGSAC++, CVPR 2020) provide higher inlier-recovery rate than vanilla RANSAC at the same iteration budget; **USAC_MAGSAC is the canonical sigma-consensus modern alternative to vanilla RANSAC** with no fixed inlier threshold
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + Plan-phase architect (mandatory-simple-baseline role documentation for engine Component Option Breadth rule compliance + D-C4-1 2D-3D-lift architectural decision carry-forward + D-C4-2 NEW covariance-recovery-strategy gate)
|
||||
- **Research Boundary Match**: **Full match** for the C4 row's pinned mode (per-frame pose-from-correspondences contract on Jetson Orin Nano Super; inputs = up to 1024 3D-2D correspondences from C3's 2D-2D + D-C4-1's 2D→3D lift + camera intrinsic + distortion; outputs = 6-DoF camera pose + per-correspondence inlier mask + reprojection error + RANSAC iter count + 6×6 covariance via D-C4-2). The canonical OpenCV calib3d module documentation provides the complete API surface for the project's pinned mode: two function signatures, nine minimal-solver enum values, paired LM + Gauss-Newton SO(3) refinement, paired multi-solution reporting with reprojection error, USAC RANSAC-method enumeration with 7 modern variants. **CRITICAL contract finding**: the documented signature requires `objectPoints` Nx3 1-channel + `imagePoints` Nx2 1-channel — **3D-2D, not 2D-2D**; the project must perform a 2D→3D lift on C3's satellite-tile-side 2D pixels via D-C4-1's 4-DoF flat-earth lift recommendation (project default) before calling solvePnPRansac. **CRITICAL covariance finding**: the documented signature returns `retval, rvec, tvec, inliers` only — **NO direct 6×6 covariance output**; AC-NEW-4 covariance-honesty contract requires D-C4-2 NEW Plan-phase decision for covariance-recovery-strategy
|
||||
- **Summary**: The canonical OpenCV 4.x calib3d module documentation is the definitive reference for `cv::solvePnPRansac` API surface, parameter defaults, paired refinement variants, minimal-solver enum values, and structural caveats. It documents two function signatures (classical + USAC variant), nine `SolvePnPMethod` enum values (4 valid for general project use + 2 special-case + 1 ITERATIVE default + 2 BROKEN-fallback-to-EPNP), paired `cv::solvePnPRefineLM` (LM with rotation update as perturbation, NOT on SO(3)) + alternate `cv::solvePnPRefineVVS` (Gauss-Newton on SO(3) via exponential map) refinement, paired `cv::solvePnPGeneric` for multi-solution + per-solution reprojection-error reporting, and the USAC RANSAC-method enumeration with 7 modern variants (USAC_DEFAULT, USAC_PARALLEL, USAC_FM_8PTS, USAC_FAST, USAC_ACCURATE, USAC_PROSAC, USAC_MAGSAC). **CRITICAL findings for the C4 row**: (i) **3D-2D INPUT CONTRACT, NOT 2D-2D**: solvePnPRansac requires Nx3 objectPoints + Nx2 imagePoints; project must perform 2D→3D lift via D-C4-1's locked-in 4-DoF flat-earth lift recommendation before invocation; (ii) **NO DIRECT 6×6 COVARIANCE OUTPUT**: AC-NEW-4 covariance-honesty contract requires D-C4-2 NEW Plan-phase decision for covariance-recovery-strategy; (iii) **TWO MINIMAL-SOLVER ENUM VALUES BROKEN**: SOLVEPNP_DLS + SOLVEPNP_UPNP fall back to EPNP per explicit docstring; valid set is `EPNP / AP3P / IPPE / SQPNP` plus 2 special-case (`P3P`, which `cv::solvePnP` accepts only with exactly 4 input points, 3 to solve and 1 to disambiguate; `IPPE_SQUARE` for 4-fixed-pattern markers) plus `ITERATIVE` default; (iv) **`cv::solvePnPRefineLM` ROTATION UPDATE NOT ON SO(3)**: minor caveat; alternate `cv::solvePnPRefineVVS` is the SO(3)-correct refiner. Canonical default minimal-sample-set method is `SOLVEPNP_EPNP`; recommended pairing for D-C4-1 = 4-DoF flat-earth lift is `SOLVEPNP_IPPE` (planar-scene minimal-solver designed for coplanar object points) with `SOLVEPNP_SQPNP` as the modern globally-optimal fallback
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — OpenCV solvePnPRansac per-mode API capability verification (cross-source verification of canonical API documentation + structural caveats + minimal-solver enum + paired refinement variants); **D-C4-2 NEW Plan-phase decision raised** for covariance-recovery-strategy; **D-C4-1 carry-forward REINFORCED** by the 3D-2D-input-contract finding (applies to all C4 candidates, not unique to OpenCV); cross-cite to Fact #20 + #21 closures from C2 row (canonical PnP+RANSAC+LM reference pipeline shape feeds AC-NEW-4 covariance-honesty contract)
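
The 3D-2D input contract described for Source #83 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `lift_tile_pixels_flat_earth`, `estimate_pose`, the tile-origin and metres-per-pixel parameters, and the local east/north frame with the ground at z = 0 are all hypothetical names and conventions introduced for illustration. Only `cv2.solvePnPRansac` and the `cv2.SOLVEPNP_IPPE` flag come from OpenCV; the parameter values shown are the documented defaults.

```python
def lift_tile_pixels_flat_earth(tile_pixels, origin_en, metres_per_pixel, ground_z=0.0):
    """Map satellite-tile pixels (col, row) to 3D points on the plane z = ground_z.

    Assumption (hypothetical convention): origin_en is the (east, north)
    coordinate of tile pixel (0, 0), the tile is north-up, and one pixel
    covers metres_per_pixel metres in both directions.
    """
    east0, north0 = origin_en
    points = []
    for col, row in tile_pixels:
        east = east0 + col * metres_per_pixel
        north = north0 - row * metres_per_pixel  # image rows grow southwards
        points.append((east, north, ground_z))
    return points


def estimate_pose(object_points, image_points, camera_matrix, dist_coeffs):
    """Call cv2.solvePnPRansac on lifted (coplanar) 3D points and 2D pixels.

    Imports are local so the lift helper above stays usable without OpenCV.
    camera_matrix is expected to be a 3x3 NumPy array.
    """
    import numpy as np
    import cv2

    obj = np.asarray(object_points, dtype=np.float64).reshape(-1, 3)  # Nx3
    img = np.asarray(image_points, dtype=np.float64).reshape(-1, 2)   # Nx2
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        obj, img, camera_matrix, dist_coeffs,
        iterationsCount=100,      # documented default
        reprojectionError=8.0,    # documented default, in pixels
        confidence=0.99,          # documented default
        flags=cv2.SOLVEPNP_IPPE,  # planar solver, valid because all z are equal
    )
    return ok, rvec, tvec, inliers  # note: no covariance is returned
```

The return tuple makes the covariance finding concrete: whatever D-C4-2 strategy is chosen has to be layered on top of `rvec`, `tvec`, and `inliers`.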
|
||||
|
||||
|
||||
### Source #84
|
||||
- **Title**: OpenGV canonical implementation — `laurentkneip/opengv` (A library for solving calibrated central and non-central geometric vision problems) GitHub repository metadata via GitHub API + License.txt — **BSD-3-Clause-equivalent boilerplate** ("Author: Laurent Kneip, ANU. All rights reserved." with three numbered redistribution conditions including non-endorsement clause; **GitHub API license SPDX detector reports `license.spdx_id: "NOASSERTION"`** because the License.txt file does NOT use the canonical Open Source Initiative BSD-3-Clause boilerplate text — verified by direct WebFetch of `https://raw.githubusercontent.com/laurentkneip/opengv/master/License.txt`); 1109 stars + 358 forks + 66 subscribers + 58 open issues; created 2013-08-10; **last pushed 2023-06-07T18:14:14Z = ~2 years 11 months stale at access time 2026-05-08** (CRITICAL maintenance finding); default branch `master`; size 7790 KB; description "OpenGV is a collection of computer vision methods for solving geometric vision problems. It is hosted and maintained by the Mobile Perception Lab of ShanghaiTech."
|
||||
- **Link**: GitHub API metadata https://api.github.com/repos/laurentkneip/opengv (accessed 2026-05-08); canonical repo https://github.com/laurentkneip/opengv ; License.txt https://raw.githubusercontent.com/laurentkneip/opengv/master/License.txt (BSD-3-Clause-equivalent boilerplate verified via WebFetch); canonical Doxygen documentation portal https://laurentkneip.github.io/opengv/
|
||||
- **Tier**: L1 (project-official codebase by Laurent Kneip + ShanghaiTech Mobile Perception Lab; canonical reference for non-OpenCV PnP solvers including p3p_kneip [Kneip et al. CVPR 2011], p3p_gao [Gao et al. PAMI 2003], UPnP [Kneip et al. ECCV 2014], gpnp [Kneip 2014 generalized PnP], gp3p [generalized 3-point]; cited by every modern multi-camera + central-camera + relative-pose paper since 2014; field-standard for non-trivial PnP variants beyond OpenCV's `cv::solvePnPRansac` coverage)
|
||||
- **Publication Date**: original 2013-08-10 → continuous development 2013-2018 → maintenance gap 2018-2023 → last pushed 2023-06-07; access date 2026-05-08; **Doxygen documentation portal generation timestamp "Generated on Mon Jan 8 2018 21:43:04 for OpenGV by 1.8.11" — documentation page is 8.3 years old at access time**
|
||||
- **Timeliness Status**: ⚠️ Within Established-baseline-reference window (2013+ — established competitive ground for non-OpenCV PnP minimal solvers + generalized-camera support) but **with CRITICAL ~3-year maintenance staleness caveat** — Established-competitive-mandatory-baseline exemption applies (OpenGV is the canonical reference for non-trivial PnP variants beyond OpenCV) but Plan-phase decision-maker MUST account for: (i) no security patches since 2023; (ii) no Eigen 3.4+ compatibility patches; (iii) no JetPack 6 + ARM Cortex-A78AE compilation testing in upstream CI; (iv) ShanghaiTech Mobile Perception Lab's claim of active maintenance is contradicted by the GitHub commit history at access time
|
||||
- **Version Info**: master branch at git commit ea7c66f5e (last commit 2023-06-07T18:14:14Z); no version tags, no releases. **Mode-enumeration query (1/3) — context7 NOT INDEXED + WebFetch fallback PASS** — `context7 resolve-library-id` returned only OpenCV variants for the OpenGV query (top-5 results were `/websites/opencv_4_x` + `/websites/opencv_4_6_0` + `/opencv/opencv` + `/opencv/opencv-python` + `/websites/opencv_5_0_0-alpha` — all unrelated to OpenGV); per Per-Mode API Capability Verification rule item 2, fall-back to official-docs WebFetch on canonical Doxygen portal `laurentkneip.github.io/opengv/page_how_to_use.html` was used (this Source #85 below + License.txt verification on this Source #84). **Absolute pose minimal solvers documented** via Source #85 §"Central absolute pose": `absolute_pose::p2p` (with known rotation), `absolute_pose::p3p_kneip` [Kneip CVPR 2011], `absolute_pose::p3p_gao` [Gao PAMI 2003], `absolute_pose::upnp` [Kneip ECCV 2014]. **Absolute pose non-minimal solvers documented**: `absolute_pose::epnp` [Lepetit IJCV 2009 — same algorithm as OpenCV's SOLVEPNP_EPNP], `absolute_pose::upnp` (also valid for non-minimal). **Generalized/multi-camera absolute pose solvers documented** via Source #85 §"Non-central absolute pose": `absolute_pose::gp3p` (Kneip 3-point generalized), `absolute_pose::gpnp` [Kneip 2014]. **Non-linear LM optimizer documented**: `absolute_pose::optimize_nonlinear(adapter)` — handles both central + non-central cases; canonical refinement after RANSAC. **RANSAC documented**: `sac::Ransac` + `sac_problems::absolute_pose::AbsolutePoseSacProblem(adapter, algorithm)` with **algorithm parameter selectable from {KNEIP, GAO, EPNP, GP3P}** — richer minimal-solver selection than OpenCV's effectively-4-valid SolvePnPMethod enum (EPNP/AP3P/IPPE/SQPNP after 2 BROKEN entries removed). 
**CRITICAL input-contract finding**: OpenGV uses **bearing vectors (3D unit vectors)** as input, NOT 2D pixel coordinates; adapters (`AbsoluteAdapterBase`, `RelativeAdapterBase`, `PointCloudAdapterBase`) convert from the user's data format to OpenGV's bearing-vector representation; the project must either implement its own adapter or use the `CentralAbsoluteAdapter(bearingVectors, points)` constructor, where bearingVectors are unit vectors pre-computed from C3's pixel correspondences via inverse camera-intrinsic projection. **CRITICAL threshold-structure finding**: the RANSAC threshold is defined on the **3D angle θ between bearing vectors** and is expressed as the dimensionless quantity `1 - cos(θ)` (not θ in radians, and NOT a 2D pixel reprojection error); Source #85 documents the conversion `ransac.threshold_ = 1.0 - cos(atan(sqrt(2.0)*0.5/800.0))` for a focal length of 800 px and a reprojection error of 0.5*sqrt(2.0) pixels
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 — BSD-3-Clause-equivalent contingent on Plan-phase license-clearance verification due to NOASSERTION SPDX-detector status) + C7 (Jetson runtime) implementer (canonical OpenGV requires custom build on JetPack 6 ARM Cortex-A78AE — no canonical Jetson distribution; Plan-phase MVE prerequisite)
|
||||
- **Research Boundary Match**: **Partial match** for the project's pinned C4 mode (per-frame pose-from-correspondences via classical RANSAC-PnP with paired LM refinement) — algorithm coverage is RICHER than OpenCV at the minimal-solver axis (UPnP for both minimal+non-minimal, GP3P for generalized cameras, 2 P3P variants [Kneip + Gao] vs OpenCV's 1 P3P variant [Ke & Roumeliotis 2017 AP3P]) BUT the input contract (bearing vectors, not pixels) + threshold contract (3D angle, not pixels) + maintenance status (~3 years stale) require Plan-phase mitigation work. **N/A for the project's domain caveat** — OpenGV is a classical algorithm library with no training data; D-C2-1 retrain decision is irrelevant for OpenGV
|
||||
- **Summary**: OpenGV is the canonical reference for non-OpenCV PnP minimal solvers + generalized-camera support. **CRITICAL LICENSE FINDING**: License.txt content matches BSD-3-Clause boilerplate (three numbered redistribution conditions including non-endorsement clause) — eligible on every D-C1-1 license-posture choice CONTINGENT on Plan-phase license-clearance verification gate (because GitHub API SPDX detector reports `NOASSERTION`, indicating the License.txt file uses non-standard boilerplate that didn't match the OSI BSD-3-Clause template detection — recommend Plan-phase counsel-review of the License.txt text to confirm BSD-3-Clause-equivalent dual-use compatibility). **CRITICAL MAINTENANCE FINDING**: ~3 years stale at access time (last pushed 2023-06-07; Doxygen documentation portal generated 2018-01-08); ShanghaiTech Mobile Perception Lab's claimed maintenance is contradicted by commit history. **POSITIVE structural findings**: (i) richer minimal-solver coverage than OpenCV (UPnP minimal+non-minimal, GP3P generalized, 2 P3P variants); (ii) canonical reference for non-trivial PnP variants every modern paper compares against; (iii) generalized-camera support (multi-camera rig, non-central absolute pose) — not directly applicable to project's pinned 1× ADTi 20MP nav frame but architecturally cleaner if the project later adds a side-looking camera. **NEGATIVE structural findings**: (iv) bearing-vector input contract requires adapter or pre-computed unit-vector conversion from pixel correspondences (additional engineering vs OpenCV's direct pixel input); (v) 3D-angle RANSAC threshold requires conversion from project's pixel-reprojection-error budget; (vi) NO direct 6×6 covariance output from `optimize_nonlinear` (same finding as OpenCV — D-C4-2 covariance-recovery-strategy applies identically to OpenGV)
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — OpenGV per-mode API capability verification (Mandatory `context7` lookup NOT-INDEXED + WebFetch fallback PASS per Per-Mode rule item 2; cross-validated against canonical GitHub API metadata WebFetch + canonical License.txt WebFetch + canonical Doxygen documentation portal [Source #85]); **D-C1-1 license-posture compliance**: BSD-3-Clause-equivalent CONTINGENT on Plan-phase license-clearance verification gate (NOASSERTION SPDX-detector caveat); **D-C4-1 carry-forward REINFORCED** (bearing-vector input contract still requires 2D→3D lift on satellite-tile-side from pixel correspondences); **D-C4-2 NEW gate APPLIES IDENTICALLY** to OpenGV (`optimize_nonlinear` returns no covariance — same Plan-phase mitigation strategies as OpenCV); **D-C4-3 NEW gate raised by OpenGV closure** — license-clearance verification due to NOASSERTION SPDX status; **D-C4-4 NEW gate raised by OpenGV closure** — maintenance-staleness mitigation (Plan-phase decision: accept-as-is + freeze upstream / fork into project-controlled branch + apply Eigen-3.4+ + JetPack-6 patches in-house / migrate to Ceres-only as fallback if patches not feasible)
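
The two OpenGV contract differences recorded for Source #84 (bearing-vector input and the angular RANSAC threshold) can be illustrated with a short Python sketch. OpenGV itself is a C++ library; this sketch only reproduces the arithmetic. The function names are hypothetical, and the back-projection assumes an ideal, already-undistorted pinhole camera, which is a simplification not stated in the source. The threshold formula is the one quoted from Source #85.

```python
import math


def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Back-project an undistorted pixel to a unit bearing vector (camera frame).

    Assumption: ideal pinhole model with focal lengths (fx, fy) and
    principal point (cx, cy); lens distortion is ignored here.
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def ransac_threshold_from_pixels(reproj_error_px, focal_px):
    """Convert a pixel reprojection-error budget to the 1 - cos(theta) form."""
    theta = math.atan(reproj_error_px / focal_px)  # angular error in radians
    return 1.0 - math.cos(theta)


# A pixel at the principal point looks straight down the optical axis.
bearing = pixel_to_bearing(400.0, 300.0, 800.0, 800.0, 400.0, 300.0)

# Same numbers as the documented example: 0.5*sqrt(2) px at f = 800 px.
threshold = ransac_threshold_from_pixels(math.sqrt(2.0) * 0.5, 800.0)
```

With the documented example values the threshold comes out at roughly 3.9e-7, which shows how small the `1 - cos(θ)` quantity is compared with a pixel budget and why the conversion cannot be skipped.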
|
||||
|
||||
|
||||
### Source #85
|
||||
- **Title**: OpenGV canonical Doxygen documentation portal — `laurentkneip.github.io/opengv/page_how_to_use.html` (How to use OpenGV: vocabulary, library organization, adapter pattern interface, conventions, problem types and examples) + `namespaceopengv.html` (top-level namespace) + `namespaceopengv_1_1absolute__pose.html` (absolute-pose methods reference) + `namespaceopengv_1_1relative__pose.html` (relative-pose methods reference) + `namespaceopengv_1_1sac.html` + `namespaceopengv_1_1sac__problems_1_1absolute__pose.html`
|
||||
- **Link**: documentation portal entry https://laurentkneip.github.io/opengv/ (accessed 2026-05-08); how-to-use page https://laurentkneip.github.io/opengv/page_how_to_use.html (accessed 2026-05-08; **Doxygen-generated 2018-01-08 21:43:04 by Doxygen 1.8.11 = 8.3 years old at access time**)
|
||||
- **Tier**: L1 (canonical project-official Doxygen-generated documentation; the canonical reference for OpenGV's adapter pattern, function signatures, RANSAC integration, and threshold-structure conventions)
|
||||
- **Publication Date**: page-generation 2018-01-08; access date 2026-05-08
|
||||
- **Timeliness Status**: ⚠️ Established-baseline-reference window with **8.3-year-old documentation** — Plan-phase architect MUST cross-check actual `master` branch source (`opengv/include/opengv/absolute_pose/methods.hpp` + `opengv/include/opengv/sac/Ransac.hpp` + `opengv/include/opengv/sac_problems/absolute_pose/AbsolutePoseSacProblem.hpp`) for any signature drift between 2018 documentation and 2023-06-07 master branch HEAD. The documentation portal is structurally complete for the canonical 2013-2018 published API surface; subsequent commits (2018-2023) appear to be primarily fix commits + ShanghaiTech-era additions
|
||||
- **Version Info**: master branch at git commit ea7c66f5e (last commit 2023-06-07). **Pinned-mode runnable example query (2/3), WebFetch PASS**: Source #85 §"Central absolute pose" provides the canonical OpenGV runnable example: `absolute_pose::CentralAbsoluteAdapter adapter(bearingVectors, points); std::shared_ptr<sac_problems::absolute_pose::AbsolutePoseSacProblem> absposeproblem_ptr(new sac_problems::absolute_pose::AbsolutePoseSacProblem(adapter, sac_problems::absolute_pose::AbsolutePoseSacProblem::KNEIP)); sac::Ransac<sac_problems::absolute_pose::AbsolutePoseSacProblem> ransac; ransac.sac_model_ = absposeproblem_ptr; ransac.threshold_ = 1.0 - cos(atan(sqrt(2.0)*0.5/800.0)); ransac.max_iterations_ = maxIterations; ransac.computeModel(); ransac.model_coefficients_;` followed by optional `absolute_pose::optimize_nonlinear(adapter)` LM refinement on the inlier set with `adapter.sett(initial_translation); adapter.setR(initial_rotation);`. **Disqualifier-probe query (3/3), FIVE FINDINGS (1 negative-but-mitigable structural + 4 caveats)**: (i) **CRITICAL contract finding: OpenGV uses bearing vectors (3D unit vectors) as input, NOT 2D pixel coordinates** (Source #85 explicit "OpenGV assumes to be in the calibrated case, and landmark measurements are always given in form of bearing vectors in a camera frame"); the project must either implement its own adapter or pre-compute unit bearing vectors from C3's pixel correspondences via inverse camera-intrinsic projection and pass them to `CentralAbsoluteAdapter`, which is additional engineering vs OpenCV's direct pixel input contract; this is an API-level structural difference, not a fundamental algorithmic limitation; (ii) **CRITICAL covariance finding: `optimize_nonlinear` does NOT directly emit a 6×6 pose covariance** (Source #85 documentation does not document a covariance output API; D-C4-2 covariance-recovery-strategy applies identically to OpenGV, with Plan-phase mitigation strategies (a) post-hoc Jacobian-based via custom Jacobian propagation 
through `optimize_nonlinear` residuals OR (b) wrap OpenGV result in GTSAM `Marginals` posterior OR (c) heuristic scaling = AC-NEW-4 REJECT family); (iii) **CRITICAL threshold-structure finding: the RANSAC threshold is defined on the 3D angle θ between bearing vectors and is expressed as `1 - cos(θ)`, NOT as a 2D pixel reprojection error** (Source #85 §"Ransac threshold" canonical conversion `ransac.threshold_ = 1.0 - cos(atan(sqrt(2.0)*0.5/800.0))` for focal length 800 px and reprojection-error-equivalent 0.5*sqrt(2.0) pixels); project must convert from pixel-reprojection-error budget at runtime; (iv) **CRITICAL maintenance staleness: Doxygen portal generated 2018-01-08 + last commit 2023-06-07 = ~8.3 years documentation staleness + ~3 years code staleness** at access time 2026-05-08; D-C4-4 NEW Plan-phase mitigation strategy required; (v) **License-clearance contingency**: License.txt is BSD-3-Clause-equivalent but GitHub SPDX detector reports NOASSERTION; D-C4-3 NEW Plan-phase license-clearance verification gate required for dual-use deployment compliance
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 + D-C4-3 NEW) + Plan-phase architect (richer-minimal-solver-coverage role documentation for engine Component Option Breadth rule compliance + bearing-vector adapter engineering work + 3D-angle threshold conversion engineering work + D-C4-4 NEW maintenance-staleness mitigation gate)
|
||||
- **Research Boundary Match**: Documents the OpenGV library's complete absolute-pose API surface (4 minimal solvers + 2 non-minimal solvers + 1 LM optimizer + 1 RANSAC integration + 4 algorithm-selectable RANSAC enum values) at the structural detail required for Plan-phase decision-making; runnable examples are provided for the central, non-central, relative, and multi-camera cases. **N/A for the project's domain caveat**: same as Source #84
|
||||
- **Summary**: Canonical Doxygen documentation portal for OpenGV's adapter-pattern interface and method signatures. Documents richer minimal-solver coverage than OpenCV (UPnP for both minimal+non-minimal, GP3P for generalized cameras, 2 P3P variants [Kneip + Gao] vs OpenCV's 1 [AP3P Ke & Roumeliotis 2017]). **CRITICAL contract differences vs OpenCV**: (i) bearing-vector input (3D unit vectors) instead of 2D pixels — adapter required; (ii) 3D-angle RANSAC threshold instead of pixel reprojection — conversion required; (iii) `optimize_nonlinear` LM refinement does not emit covariance — D-C4-2 still applies. **Documentation staleness**: page generated 2018-01-08 by Doxygen 1.8.11 (8.3 years old). **Maintenance staleness**: master branch last pushed 2023-06-07 (~3 years stale). **Recommended pinned mode**: `CentralAbsoluteAdapter` + `AbsolutePoseSacProblem::KNEIP` (Kneip's P3P inside RANSAC) + `optimize_nonlinear` LM refinement — Kneip's P3P is the canonical OpenGV-distinctive minimal solver and is the closest structural analog to OpenCV's `flags=SOLVEPNP_AP3P` (both are P3P variants but Kneip's is the original 2011 method while AP3P is Ke & Roumeliotis 2017 algebraic alternate); for project's planar-scene D-C4-1 = 4-DoF flat-earth lift case, OpenGV does NOT have a dedicated planar-scene minimal solver equivalent to OpenCV's `flags=SOLVEPNP_IPPE` — project would need to use Kneip's P3P or EPNP without the planar-scene specialization advantage. For project's 6-DoF DSM-lift case, OpenGV's UPnP is the modern globally-optimal alternate (analogous structural role to OpenCV's `flags=SOLVEPNP_SQPNP`)
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — OpenGV per-mode API capability verification (cross-source verification with Source #84 GitHub API + License.txt; runnable example documented; structural caveats documented including bearing-vector contract + 3D-angle threshold + LM-no-covariance findings); **D-C4-2 NEW gate APPLIES IDENTICALLY**; **D-C4-3 NEW gate raised** (license-clearance contingency); **D-C4-4 NEW gate raised** (maintenance-staleness mitigation)
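
Mitigation option (a) for D-C4-2 (post-hoc Jacobian-based covariance) is usually written as the textbook first-order propagation below. This is a sketch under assumptions that are not stated in the source: the $N$ inlier correspondences each contribute two residual components, the measurement noise is independent and isotropic with variance $\sigma^2$, the pose is parameterized by a minimal 6-vector perturbation $\xi$, and $J$ is evaluated at the converged estimate $\hat{\xi}$. Neither OpenCV nor OpenGV is documented as providing this quantity; the project would have to compute $J$ and $r$ itself.

```latex
\hat{\Sigma}_{\xi} \;\approx\; \hat{\sigma}^{2}\,\bigl(J^{\top} J\bigr)^{-1},
\qquad
\hat{\sigma}^{2} \;=\; \frac{r^{\top} r}{2N - 6},
\qquad
J \;=\; \left.\frac{\partial r}{\partial \xi}\right|_{\xi = \hat{\xi}} \in \mathbb{R}^{2N \times 6}
```

Here $r \in \mathbb{R}^{2N}$ stacks the inlier residuals and $2N - 6$ is the number of degrees of freedom. The approximation ignores uncertainty from inlier selection and from the 2D→3D lift, so whether it satisfies the AC-NEW-4 covariance-honesty contract remains a Plan-phase question.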
|
||||
|
||||
|
||||
### Source #86
|
||||
- **Title**: GTSAM canonical implementation — `borglab/gtsam` (Georgia Tech Smoothing and Mapping library; C++ classes for smoothing and mapping in robotics and vision using factor graphs and Bayes networks) GitHub repository metadata via GitHub API + LICENSE + LICENSE.BSD — **BSD-3-Clause** (LICENSE.BSD file contains 3 numbered redistribution conditions including non-endorsement clause; **GitHub API license SPDX detector reports `license.spdx_id: "NOASSERTION"`** because the wrapper LICENSE file at the repo root references `LICENSE.BSD` indirectly + bundles third-party license declarations rather than directly containing OSI canonical BSD-3-Clause boilerplate text; verified BSD-3-Clause via direct WebFetch of `https://raw.githubusercontent.com/borglab/gtsam/develop/LICENSE.BSD`); 3424 stars + 927 forks + 60 subscribers + 140 open issues; created 2017-03-27; **last pushed 2026-05-08T13:00:22Z = TODAY at access time** (daily-active maintenance — fresher than OpenCV); default branch `develop`; size 109374 KB; topics include `estimation, perception, robotics, sensorfusion`; canonical website https://gtsam.org and Doxygen portal https://borglab.github.io/gtsam/. **Bundled third-party libraries** (per LICENSE wrapper file): CCOLAMD 2.9.6 (BSD-3, gtsam/3rdparty/CCOLAMD), Ceres auto-diff/jet code only (BSD-3, modified, gtsam/3rdparty), Eigen 3.3.7 (MPL2 file-level copyleft, gtsam/3rdparty/Eigen), METIS 5.1.0 (Apache-2.0, gtsam/3rdparty/metis), Spectra v0.9.0 (MPL2, externally referenced) — **all clean for project's dual-use deployment** (MPL2 is file-level copyleft only, doesn't propagate to project product code; Apache-2.0 + BSD-3 are permissive)
|
||||
- **Link**: GitHub API metadata https://api.github.com/repos/borglab/gtsam (accessed 2026-05-08); canonical repo https://github.com/borglab/gtsam ; LICENSE wrapper https://raw.githubusercontent.com/borglab/gtsam/develop/LICENSE (top-level documents bundled-library licensing); LICENSE.BSD https://raw.githubusercontent.com/borglab/gtsam/develop/LICENSE.BSD (BSD-3-Clause canonical boilerplate "Copyright (c) 2010, Georgia Tech Research Corporation, Atlanta, Georgia 30332-0415, All Rights Reserved" with three numbered redistribution conditions); canonical website https://gtsam.org ; Doxygen portal https://borglab.github.io/gtsam/
|
||||
- **Tier**: L1 (project-official codebase by Georgia Tech Research Corporation Borg Lab; canonical reference factor-graph SLAM library used by every modern multi-frame state-estimation deployment as the de-facto industry-standard factor-graph foundation; cited by every C-row component's deployment guide; canonical `LevenbergMarquardtOptimizer` + `Marginals` posterior is the **industry-standard reference for covariance-honest pose estimation**)
|
||||
- **Publication Date**: original GTSAM C++ library 2010 (Frank Dellaert + Borg Lab Georgia Tech) → open-source release 2010-12 → migration to GitHub 2017-03-27 → version 4.3a1 indexed in context7 at access time (next-major-release rolling-development branch `develop`); access date 2026-05-08; daily commits to `develop` branch
|
||||
- **Timeliness Status**: ✅ Within Established-baseline-reference window (2010+ — established competitive ground for factor-graph SLAM + covariance-honest pose estimation; Established-competitive-mandatory-baseline exemption applies — `LevenbergMarquardtOptimizer` + `Marginals` is the **canonical covariance-honest factor-graph reference** for the C4 row's modern-competitive-lead role and **directly addresses AC-NEW-4 covariance-honesty contract** without D-C4-2 mitigation work)
|
||||
- **Version Info**: 4.3a1 at access time (default branch `develop` = next-major-release rolling-development branch; current stable release 4.2 from 2024). **`LevenbergMarquardtOptimizer` + `Marginals` posterior covariance recovery API surface** — see Source #87 below for full documentation and runnable examples
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + license-posture decision-maker (D-C1-1 — BSD-3-Clause; bundled deps clean) + C5 (state estimator) implementer (GTSAM iSAM2 + factor-graph fusion is the canonical incremental-multi-frame-fusion pathway that scales naturally from C4 single-frame PnP to C5 multi-frame state estimation) + Plan-phase architect (D-C4-2 option (b) Plan-phase pathway candidate)
|
||||
- **Research Boundary Match**: **Full match** for the project's pinned C4 mode (per-frame pose-from-correspondences contract on Jetson Orin Nano Super) AT THE COVARIANCE-HONESTY AXIS — GTSAM is the **only C4 candidate evaluated to date that emits 6×6 pose covariance NATIVELY via `Marginals(graph, result).marginalCovariance(pose_key)`** without custom Jacobian engineering. **Architectural extension match**: GTSAM's factor-graph paradigm extends naturally from C4 single-frame PnP to C5 multi-frame state estimation via iSAM2 + `BetweenFactor<Pose3>` + `PriorFactorPose3` — would simplify C5 implementation if both C4 and C5 are GTSAM-based. **N/A for the project's domain caveat** — GTSAM is a classical factor-graph library with no training data; D-C2-1 retrain decision is irrelevant for GTSAM
|
||||
- **Summary**: GTSAM is the canonical industry-standard factor-graph SLAM library by Georgia Tech Borg Lab (Frank Dellaert et al.); the `gtsam::slam` module ships `GenericProjectionFactor<Pose3, Point3, CALIBRATION>` as the canonical per-correspondence projection factor for PnP-class problems. **CRITICAL POSITIVE LICENSE FINDING**: BSD-3-Clause via LICENSE.BSD (`Copyright (c) 2010, Georgia Tech Research Corporation`) — permissive, BSD/permissive license track on the C4 modern-competitive-lead axis; **deployment-ready under every D-C1-1 license-posture choice** with the cleanest license-compliance story tied with cvg/LightGlue + DISK + XFeat + OpenCV; bundled dependencies are clean (BSD-3/Apache-2.0/MPL2 file-level — all dual-use compatible). **Daily-active maintenance**: last pushed 2026-05-08 (TODAY at access time) — among the most actively-maintained C-row references; **fresher than OpenCV's last-pushed 2026-05-08T07:00:03Z by 6 hours at access time**. **CRITICAL POSITIVE COVARIANCE FINDING**: `Marginals(graph, result).marginalCovariance(pose_key)` emits a **direct 6×6 pose covariance** with no custom engineering — **the only C4 candidate evaluated to date that satisfies AC-NEW-4 covariance-honesty contract NATIVELY without D-C4-2 mitigation work**; this is the canonical Plan-phase pathway for D-C4-2 = (b) wrap-OpenCV-result-in-GTSAM-Marginals OR full-GTSAM-as-primary
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — GTSAM per-mode API capability verification (Mandatory `context7` lookup INDEXED at `/borglab/gtsam` with **1121 code snippets at version 4.3a1** — best context7 indexing of any C4 candidate evaluated; full per-mode API documentation accessible via `query-docs` tool); **D-C1-1 license-posture compliance**: BSD-3-Clause with clean bundled deps; **D-C4-2 NATIVELY SATISFIED** via `Marginals` posterior covariance recovery — GTSAM is the canonical Plan-phase pathway for D-C4-2 = (b) wrap-OpenCV-result-in-GTSAM-Marginals OR full-GTSAM-as-primary; **NO new D-C4-N gates raised** by GTSAM closure (D-C4-1 carry-forward applies identically, D-C4-2 natively satisfied)
|
||||
|
||||
|
||||
### Source #87
|
||||
- **Title**: GTSAM canonical Python documentation via context7-indexed library `/borglab/gtsam` at version 4.3a1 (1121 code snippets) — `python/gtsam/examples/CameraResectioning.ipynb` (canonical PnP example with `LevenbergMarquardtOptimizer`) + `gtsam/slam/doc/ProjectionFactor.ipynb` (`GenericProjectionFactorCal3_S2` API documentation) + `python/gtsam/examples/Pose2SLAMExample.ipynb` + `python/gtsam/examples/PlanarSLAMExample.ipynb` (`Marginals.marginalCovariance` posterior covariance recovery) + `gtsam/inference/doc/FactorGraph.ipynb` (`NonlinearFactorGraph` API documentation)
|
||||
- **Link**: context7 library ID `/borglab/gtsam` at version 4.3a1; canonical docs portal https://borglab.github.io/gtsam/ ; canonical Python examples directory https://github.com/borglab/gtsam/tree/develop/python/gtsam/examples (accessed 2026-05-08 via context7 query-docs MCP integration); CameraResectioning canonical example https://github.com/borglab/gtsam/blob/develop/python/gtsam/examples/CameraResectioning.ipynb ; ProjectionFactor canonical documentation https://github.com/borglab/gtsam/blob/develop/gtsam/slam/doc/ProjectionFactor.ipynb
|
||||
- **Tier**: L1 (canonical project-official documentation via context7-indexed library; the canonical reference for GTSAM's `GenericProjectionFactorCal3_S2`, `LevenbergMarquardtOptimizer`, `Marginals.marginalCovariance`, `NonlinearFactorGraph`, `Cal3_S2` calibration, `Pose3` 6-DoF pose, and `noiseModel.Diagonal.Sigmas` API surface)
|
||||
- **Publication Date**: rolling Jupyter notebook documentation auto-updated on every push to `develop` branch; access date 2026-05-08; canonical PnP example `CameraResectioning.ipynb` has been part of the GTSAM Python distribution since version 4.0 (~2019); access via context7 query at version 4.3a1
|
||||
- **Timeliness Status**: ✅ Within Established-baseline-reference window (rolling Jupyter notebook documentation; the canonical reference for GTSAM's PnP + covariance API surface at the project's evaluation time)
|
||||
- **Version Info**: 4.3a1 at access time (default branch `develop`). **Mode-enumeration query (1/3) — context7 INDEXED PASS**: `context7 resolve-library-id` returned `/borglab/gtsam` at version 4.3a1 with 1121 code snippets + High source reputation. **Pinned-mode runnable example query (2/3) — context7 query-docs PASS**: canonical PnP runnable Python example from `CameraResectioning.ipynb`: `calibration = Cal3_S2(1, 1, 0, 50, 50)` → `graph = NonlinearFactorGraph()` → per-correspondence factor add via `graph.add(resectioning_factor(measurement_noise, X(1), calibration, Point2(image_pixel), Point3(world_landmark)))` for each 2D-3D correspondence → `initial = Values(); initial.insert(X(1), Pose3(Rot3(...), Point3(...)))` → `result = LevenbergMarquardtOptimizer(graph, initial).optimize()`. **`GenericProjectionFactorCal3_S2` canonical API**: `GenericProjectionFactorCal3_S2(measured_pt2: Point2, pixel_noise: gtsam.noiseModel, pose_key: Symbol, landmark_key: Symbol, calibration: Cal3_S2, body_P_sensor: Pose3=identity)` — per-correspondence projection factor with optional sensor-body offset for IMU-camera extrinsic. **CRITICAL POSITIVE 6×6 covariance recovery API**: `marginals = gtsam.Marginals(graph, result); pose_covariance = marginals.marginalCovariance(pose_key)` — direct 6×6 posterior covariance with NO custom Jacobian engineering required; this is the **DIRECT AC-NEW-4 covariance-honesty contract satisfaction pathway** that no other C4 candidate evaluated to date provides natively. 
**Disqualifier-probe query (3/3) — TWO FINDINGS (1 negative-but-mitigable structural + 1 caveat)**: (i) **CRITICAL contract finding — GTSAM has NO native RANSAC algorithm** — canonical pattern is to run RANSAC externally (e.g., via OpenCV `cv::solvePnPRansac` for the inlier mask) THEN build the factor graph from inliers only with `GenericProjectionFactorCal3_S2`; alternative is in-graph robust outlier rejection via `gtsam.noiseModel.Robust.Create(gtsam.noiseModel.mEstimator.Huber.Create(1.0), gaussian_noise)` (Huber/Tukey/Cauchy M-estimator robust kernels) OR `GncOptimizer` (Graduated Non-Convexity, Yang et al. RAL 2020) for globally-convergent RANSAC alternative; this couples C4 = GTSAM-as-primary with C5 = OpenCV-RANSAC-as-inlier-detector OR full-GTSAM-with-robust-noise-model OR full-GTSAM-with-GncOptimizer; (ii) **Memory + binary-size CAVEAT — GTSAM library footprint is ~50-200 MB at runtime depending on factor-graph size and bundled-dependency build configuration** (vs OpenCV's ~10-50 MB calib3d module); on Jetson Orin Nano Super 8 GB shared memory budget, GTSAM is the **heaviest C4 candidate evaluated to date** but still well within AC-4.2 budget when co-resident with C1/C2/C3/C5/C6
|
||||
- **Target Audience**: System architects + C4 implementer + Step-7.5 reviewer + Plan-phase architect (modern-competitive-lead role documentation for engine Component Option Breadth rule compliance + D-C4-2 NATIVELY SATISFIED + D-C5-N forward-looking carry-forward for state estimator factor-graph extension)
|
||||
- **Research Boundary Match**: **Full match** for the C4 row's pinned mode AT THE COVARIANCE-HONESTY AXIS (GTSAM `Marginals.marginalCovariance` is the only C4 candidate evaluated to date that emits 6×6 pose covariance natively; canonical PnP runnable example provided via `CameraResectioning.ipynb`; complete API surface for `LevenbergMarquardtOptimizer` + `GenericProjectionFactorCal3_S2` + `Cal3_S2` + `Pose3` + `noiseModel.Diagonal.Sigmas` documented in canonical Python notebooks); **Architectural-extension match**: GTSAM's factor-graph paradigm extends naturally from C4 single-frame PnP to C5 multi-frame state estimation via iSAM2 + `BetweenFactor<Pose3>` — would simplify C5 implementation if both C4 and C5 are GTSAM-based
|
||||
- **Summary**: The canonical GTSAM Python documentation (via context7 at version 4.3a1 with 1121 code snippets) is the definitive reference for `GenericProjectionFactorCal3_S2`, `LevenbergMarquardtOptimizer`, `Marginals.marginalCovariance`, and `NonlinearFactorGraph` API surface. **CRITICAL POSITIVE FINDING for the C4 row**: `Marginals(graph, result).marginalCovariance(pose_key)` emits a **direct 6×6 pose covariance NATIVELY** with no custom Jacobian engineering — **the only C4 candidate evaluated to date that satisfies AC-NEW-4 covariance-honesty contract without D-C4-2 mitigation work**. **NO native RANSAC** — canonical pattern is external RANSAC (via OpenCV solvePnPRansac for inliers) then GTSAM factor-graph from inliers, OR in-graph robust noise model (`gtsam.noiseModel.Robust.Create` + Huber/Tukey/Cauchy), OR `GncOptimizer` (Yang et al. RAL 2020 Graduated Non-Convexity). **Heavier library footprint** than OpenCV (~50-200 MB at runtime) but still well within AC-4.2 8 GB shared memory budget. **Architectural extension to C5**: factor-graph paradigm scales naturally to multi-frame state estimation via iSAM2 + `BetweenFactor<Pose3>` + `PriorFactorPose3` — would simplify C5 implementation
|
||||
- **Related Sub-question**: SQ3+SQ4 / C4 — GTSAM per-mode API capability verification (cross-source verification of canonical Python examples + ProjectionFactor API + Marginals posterior + LevenbergMarquardtOptimizer + NonlinearFactorGraph); **D-C4-2 NATIVELY SATISFIED** via `Marginals.marginalCovariance` — GTSAM is the canonical Plan-phase pathway for D-C4-2 = (b); cross-cite to Fact #20 + #21 closures from C2 row (canonical PnP+RANSAC+LM reference pipeline shape feeds AC-NEW-4 covariance-honesty contract); forward-cite to C5 row (factor-graph paradigm extension to multi-frame state estimation via iSAM2)
|
||||
@@ -0,0 +1,95 @@
|
||||
# Source Registry — C5: State estimator / sensor fusion
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation). Sources for C5 (state estimator / sensor fusion) candidates.
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling categories: [SQ6](SQ6_external_positioning.md), [SQ1](SQ1_existing_systems.md), [SQ2](SQ2_canonical_pipeline.md), [C1](C1_vio.md), [C2](C2_vpr.md), [C3](C3_matchers.md), [C4](C4_pose_estimation.md). Backing fact cards: [`../02_fact_cards/C5_state_estimator.md`](../02_fact_cards/C5_state_estimator.md). Component fit matrix row: [`../06_component_fit_matrix/C5_state_estimator.md`](../06_component_fit_matrix/C5_state_estimator.md).
|
||||
|
||||
---
|
||||
|
||||
## Source #88 — Solà 2017 "Quaternion kinematics for the error-state Kalman filter" (canonical aerial/quaternion ESKF tutorial)
|
||||
|
||||
**Title**: "Quaternion kinematics for the error-state Kalman filter"
|
||||
**Author**: Joan Solà
|
||||
**Venue**: arXiv preprint cs.RO 1711.02508 (HAL hal-01122406; Semantic Scholar 12412090e46d1b21eecc59d1326edb8e47e9640e)
|
||||
**Submitted**: 2017-11-03 (revision v5 hosted on HAL); originally drafted earlier and continually revised since 2014
|
||||
**URL**: <https://arxiv.org/abs/1711.02508> (canonical) + <https://hal.science/hal-01122406v5> (HAL mirror)
|
||||
**Tier**: L1 (canonical authoritative tutorial; 592 citations per Semantic Scholar; the de-facto industry reference for ESKF + quaternion algebra in robotics + aerospace + UAV applications since 2017; open-access public-domain academic preprint)
|
||||
**Length**: 73 sections including 9 main parts (§1 quaternion definition + §2 rotations + §3 conventions + §4 perturbations/derivatives/integrals + §5 error-state kinematics for IMU-driven systems + §6 fusing IMU with complementary sensory data + §7 ESKF using global angular errors + §8 high-order integration variants + §9 references + §10 appendix)
|
||||
**Date Accessed**: 2026-05-08
|
||||
|
||||
**Why it matters for C5**:
|
||||
- §5.1 lists the THREE structural advantages of ESKF over standard EKF that drive its dominance for UAV applications: (i) minimal orientation error-state (no over-parametrization, no covariance singularity), (ii) error-state always near origin (linearization always valid), (iii) error-state always small (Jacobians fast and often constant).
|
||||
- §5.4 provides discrete-time error-state Jacobians directly usable for project's IMU integration.
|
||||
- §6 (sub-divided into §6.1 measurement update + §6.2 injection + §6.3 covariance reset) is the canonical recipe for fusing IMU with complementary sensors (project's case = C1 VIO + C4 satellite anchors + FC IMU).
|
||||
- §6 explicitly states (line 2013 of the paper text): "At the arrival of other kind of information than IMU, such as GPS or vision, we proceed to correct the ESKF. ... These vision + IMU setups are very interesting for use in **GPS-denied environments**, and can be implemented on mobile devices ... but also on **UAVs and other small, agile platforms**." — a direct project-relevant endorsement from the canonical tutorial.
|
||||
- Lines 1675-1677 of the paper text frame the project's exact problem statement: "Integrating IMU readings leads to dead-reckoning positioning systems, which drift with time. Avoiding drift is a matter of fusing this information with absolute position readings such as GPS or vision."
|
||||
- §6.3 explicitly notes that the canonical reset Jacobian G can be approximated as `G = I_18` in most implementations, "but the expression here provided should produce more precise results, which might be of interest for reducing long-term error drift in odometry systems" — relevant for project's 8-hour fixed-wing flights where long-term drift is a binding concern.
|
||||
- §7 provides an alternate formulation using global angular errors (vs §5's local angular errors); both are valid; project must pick one and stick with it.
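
As a rough illustration of the §6.2 injection step under the local-angular-error formulation of §5, the sketch below composes a nominal attitude quaternion with a small rotation-vector error, q ← q ⊗ q{δθ}. It assumes the Hamilton convention with `(w, x, y, z)` ordering; the function names are illustrative and are not taken from the paper or from any repository listed in this registry.

```python
import math

def quat_mul(p, q):
    """Hamilton product p ⊗ q for quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def quat_from_rotvec(dtheta):
    """Unit quaternion for a rotation vector (axis * angle, in radians)."""
    angle = math.sqrt(sum(c * c for c in dtheta))
    if angle < 1e-12:
        # First-order form for tiny errors: q ≈ (1, δθ/2), then normalized.
        q = (1.0, dtheta[0] / 2, dtheta[1] / 2, dtheta[2] / 2)
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)
    s = math.sin(angle / 2) / angle
    return (math.cos(angle / 2), dtheta[0] * s, dtheta[1] * s, dtheta[2] * s)

def inject_local_error(q_nominal, dtheta):
    """Local-error injection q <- q ⊗ q{δθ}, renormalized against round-off."""
    q = quat_mul(q_nominal, quat_from_rotvec(dtheta))
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

# Injecting a 90-degree yaw error into the identity attitude.
q_after = inject_local_error((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2))
```

With the global-error formulation of §7 the composition order flips to q ← q{δθ} ⊗ q, which is one concrete reason to fix the convention once and keep it.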
|
||||
|
||||
---
|
||||
|
||||
## Source #89 — Reference open-source ESKF implementations (canonical-paper-derived)
|
||||
|
||||
**Repositories examined**:
|
||||
|
||||
| # | Repo | Language | License | Sensors fused | Project relevance |
|
||||
|---|---|---|---|---|---|
|
||||
| 89.a | `ludvigls/ESKF` | Python | (LICENSE not declared in front-page README — Plan-phase verification gate **D-C5-1 NEW** required if adoption) | IMU + GNSS for fixed-wing UAVs | **DIRECTLY MATCHES project hardware family (fixed-wing UAV + IMU + GNSS-replacement)** — closest documentary template; tested on simulated + real datasets per author description |
|
||||
| 89.b | `cggos/imu_x_fusion` | C++/ROS | (Plan-phase verification gate **D-C5-1 NEW** required if adoption) | IMU + GNSS + 6DoF-Odom (loosely-coupled) — also IEKF, UKF (UKF/SPKF, JUKF, SVD-UKF), MAP variants | **MATCHES project pattern** — multi-source loosely-coupled fusion (IMU + GNSS-as-satellite_anchor + Odom-as-VIO) |
|
||||
| 89.c | `EliaTarasov/ESKF` | C++/ROS | (Plan-phase verification gate **D-C5-1 NEW** required if adoption) | GPS + Magnetometer + Vision Pose + Optical Flow + Range Finder fused with IMU (ROS Error-State Kalman Filter based on PX4/ecl) | **CLOSE MATCH but PX4-derived** — license-clear if PX4/ecl BSD-3-Clause, but verify that the derived code is BSD-3-Clause (PX4 is dual BSD/Apache, ecl is BSD-3-Clause) |
|
||||
| 89.d | `koledickarlo/ESKF-ESP32` | C++ | (LICENSE not declared in front-page README — Plan-phase verification gate **D-C5-1 NEW** required if adoption) | Accelerometer + Gyroscope + Optical Flow + Time-of-Flight (microcontroller-class, ESP32) | NOT MATCH — microcontroller-class targets (ESP32) not Jetson; useful only as small-state ESKF reference (Solà 2017 paper explicit citation) |
|
||||
| 89.e | `joansola/slamtb` | MATLAB | (LICENSE not declared in front-page README) | EKF-SLAM (full visual-inertial SLAM toolbox) | Author Joan Solà's own SLAM Toolbox in MATLAB — the most authoritative reference for the canonical paper but MATLAB-only, NOT deployable on JetPack 6 |
|
||||
|
||||
**Interpretation**: For Fact #88, the project does NOT directly reuse any of the above repositories at the source-code level (because of the D-C5-1 NEW license verification gates and the cross-domain adaptation costs). Instead, the project implements the ESKF directly from the Solà 2017 §5+§6 equations in Python (NumPy/SciPy) or C++17 (Eigen3), using ludvigls/ESKF (89.a) as the closest documentary reference template for fixed-wing UAV ESKF structure. The reference implementations serve as evidence that the Solà 2017 ESKF is implementable and deployable on UAV-class platforms with multi-sensor fusion patterns identical to the project's pinned configuration.
|
||||
|
||||
**URLs accessed (full canonical README pages)**:
|
||||
- <https://github.com/ludvigls/ESKF>
|
||||
- <https://github.com/cggos/imu_x_fusion>
|
||||
- <https://github.com/EliaTarasov/ESKF>
|
||||
- <https://github.com/koledickarlo/ESKF-ESP32>
|
||||
- <https://github.com/joansola/slamtb>
|
||||
|
||||
**Tier**: L1 (canonical project repositories; multiple independent reproductions of Solà 2017 paper across Python, C++/ROS, MATLAB, and microcontroller-class) + L2 (reference template only; project does NOT directly reuse).
|
||||
**Date Accessed**: 2026-05-08
|
||||
|
||||
---
|
||||
|
||||
## Source #90 — GTSAM `ImuFactor` / `CombinedImuFactor` / `PreintegratedImuMeasurements` / `PreintegratedCombinedMeasurements` (context7 query-docs at `/borglab/gtsam` — IMU pre-integration sub-API)
|
||||
|
||||
**Title**: GTSAM canonical `ImuFactor` and `CombinedImuFactor` API reference + canonical Python runnable examples
|
||||
**Source**: context7 query-docs at `/borglab/gtsam` version 4.3a1 with 1121 code snippets (cross-cite to Source #87 from C4 Fact #54 — same library, different sub-API surface; queried 2026-05-08 for IMU + state-estimation extension to C5)
|
||||
**Returned canonical Python notebooks**:
|
||||
- `gtsam/navigation/doc/ImuFactor.ipynb` — basic `ImuFactor(X(0), V(0), X(1), V(1), B(0), pim)` 5-key factor + canonical `PreintegrationParams.MakeSharedU(9.81)` setup + `PreintegratedImuMeasurements(params, bias_hat)` + `pim.integrateMeasurement(acc_meas, gyro_meas, dt)` + `pim.predict(initial_state, current_best_bias)` + `imu_factor.evaluateError(pose_i, vel_i, pose_j, vel_j, bias_i)`
|
||||
- `gtsam/navigation/doc/CombinedImuFactor.ipynb` — modern `CombinedImuFactor(X(0), V(0), X(1), V(1), B(0), B(1), pim)` 6-key factor with bias evolution per random walk via `PreintegrationCombinedParams.MakeSharedU(9.81)` + `params.setBiasAccCovariance(np.eye(3) * bias_acc_rw_sigma**2)` + `params.setBiasOmegaCovariance(np.eye(3) * bias_gyro_rw_sigma**2)` + `params.setBiasAccOmegaInit(initial_bias_cov)` + `PreintegratedCombinedMeasurements(params, bias_hat)`
|
||||
- `gtsam/navigation/doc/PreintegratedImuMeasurements.ipynb` — full PIM workflow: `pim.integrateMeasurement(acc, gyro, dt)` × N → `pim.deltaTij()` / `pim.deltaRij().matrix()` / `pim.deltaPij()` / `pim.deltaVij()` / `pim.biasHat()` / `pim.preintMeasCov()` 9×9 covariance + `pim.predict(initial_state, current_best_bias)` for IMU-only state extrapolation
|
||||
- `gtsam/navigation/doc/GPSFactor.ipynb` — `GPSFactor(pose_key, gps_measurement_enu, gps_noise_model)` for 3-DoF GPS prior + `GPSFactorArmCalib(pose_key, lever_arm_key, gps_measurement_enu, gps_noise_model)` for GPS with unknown lever-arm calibration
|
||||
|
||||
**Tier**: L1 (canonical context7-indexed library documentation at version 4.3a1; cross-validated against canonical Doxygen portal `borglab.github.io/gtsam/`).
|
||||
**URL**: context7 indexing of <https://github.com/borglab/gtsam/tree/develop/gtsam/navigation/doc/> (canonical Borg Lab navigation documentation; access via context7 server at queried-date 2026-05-08)
|
||||
**Cross-cite**: Source #86 (canonical `borglab/gtsam` GitHub repo + LICENSE.BSD direct WebFetch — BSD-3-Clause throughout per C4 Fact #54), Source #87 (canonical GTSAM Python examples via context7 query-docs at version 4.3a1 — `CameraResectioning.ipynb` + `Pose2SLAMExample.ipynb` + `PlanarSLAMExample.ipynb` per C4 Fact #54)
|
||||
|
||||
**Date Accessed**: 2026-05-08 (~13:00 UTC, immediately after C4 Fact #54 closure — same daily-active GTSAM `develop` branch state)
|
||||
|
||||
---
|
||||
|
||||
## Source #91 — GTSAM `ISAM2` / `IncrementalFixedLagSmoother` / `Marginals` with iSAM2 results (context7 query-docs at `/borglab/gtsam` — incremental smoothing sub-API)
|
||||
|
||||
**Title**: GTSAM canonical `ISAM2` and `IncrementalFixedLagSmoother` incremental smoothing API + `Marginals` posterior recovery for iSAM2 results
|
||||
**Source**: context7 query-docs at `/borglab/gtsam` version 4.3a1 with 1121 code snippets (queried 2026-05-08 for incremental smoothing sub-API)
|
||||
**Returned canonical Python notebooks**:
|
||||
- `gtsam/inference/doc/ISAM.ipynb` — `GaussianISAM(initial_bayes_tree)` constructor + `isam.update(new_factors)` incremental graph modification + `isam.print()` introspection (legacy linear `GaussianISAM`; modern nonlinear `ISAM2` follows the same API pattern with additional `ISAM2Params(relinearizeThreshold, relinearizeSkip, factorization, evaluateNonlinearError, cacheLinearizedFactors, ...)` configuration)
|
||||
- `python/gtsam/examples/PlanarSLAMExample.ipynb` — `Marginals(graph, result).marginalCovariance(key)` posterior covariance recovery (3×3 for the `Pose2` variables in this example; 6×6 for the project's `Pose3` variables; works with both batch `LevenbergMarquardtOptimizer` results and `ISAM2.calculateEstimate()` results)
|
||||
- `python/gtsam/examples/Pose2SLAMExample.ipynb` — same canonical `PriorFactorPose2(1, Pose2(0, 0, 0), PRIOR_NOISE)` initial-pose anchor pattern; reusable for Pose3 (`PriorFactorPose3(X(0), Pose3(...), prior_noise)`) for project's 3D state estimation
|
||||
- `gtsam/slam/doc/lago.ipynb` — `lago.initialize(graph)` linear-and-iterative-pose-graph initialization (good for cold-start pose initialization from FC GPS-extrapolated pose at boot per AC-NEW-1)
|
||||
- `gtsam/slam/doc/InitializePose3.ipynb` — `InitializePose3.initialize(graph)` chordal-relaxation 3D initialization (modern alternative for Pose3 cold-start)
|
||||
- `gtsam/inference/doc/FactorGraph.ipynb` — `NonlinearFactorGraph()` + `BetweenFactorPose2(X(0), X(1), Pose2(1, 0, 0), odometry_noise)` + `PriorFactorPose2(X(0), Pose2(0, 0, 0), prior_noise)` core factor-graph patterns (project applies Pose3 variants: `BetweenFactorPose3` + `PriorFactorPose3` + `GenericProjectionFactorCal3DS2`)
|
||||
|
||||
**Note on `IncrementalFixedLagSmoother`**: context7 query-docs at /borglab/gtsam returned ISAM (legacy GaussianISAM) examples but did NOT return a top-3 `IncrementalFixedLagSmoother` snippet on the queried search. The IncrementalFixedLagSmoother class is documented in the canonical GTSAM source tree at `gtsam_unstable/nonlinear/IncrementalFixedLagSmoother.h` (not in the `develop` branch's stable area; in the `gtsam_unstable` namespace, requiring user to opt-in to unstable APIs). Project must verify at Plan-phase Jetson MVE that IncrementalFixedLagSmoother is the correct sliding-window primitive vs writing custom marginalization on top of `ISAM2.marginalizeLeaves(keys_to_marginalize)`.
|
||||
|
||||
**Tier**: L1 (canonical context7-indexed library documentation at version 4.3a1) + L2 (IncrementalFixedLagSmoother — gtsam_unstable namespace, verification at Plan phase required).
|
||||
**URL**: context7 indexing of <https://github.com/borglab/gtsam/tree/develop/gtsam/inference/doc/> + <https://github.com/borglab/gtsam/tree/develop/python/gtsam/examples/> (canonical Borg Lab inference + examples documentation; access via context7 server at queried-date 2026-05-08)
|
||||
**Cross-cite**: Source #86 + Source #87 + Source #90 (all GTSAM library; same daily-active `develop` branch state)
|
||||
|
||||
**Date Accessed**: 2026-05-08
|
||||
|
||||
---
|
||||
@@ -0,0 +1,142 @@
|
||||
# Source Registry — C6: Tile cache + spatial index
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation). Sources backing the C6 component candidates ([`../06_component_fit_matrix/C6_tile_cache_spatial_index.md`](../06_component_fit_matrix/C6_tile_cache_spatial_index.md)) and C6 fact cards ([`../02_fact_cards/C6_tile_cache_spatial_index.md`](../02_fact_cards/C6_tile_cache_spatial_index.md)).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling component sources: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md). Sub-question sources: [SQ6 external positioning](SQ6_external_positioning.md), [SQ1 existing systems](SQ1_existing_systems.md), [SQ2 canonical pipeline](SQ2_canonical_pipeline.md).
|
||||
|
||||
---
|
||||
|
||||
## Scope summary
|
||||
|
||||
C6 candidates evaluated at documentary level: **Cand 1 (mandatory simple-baseline)** mirrors the parent-suite `satellite-provider` pattern (PostgreSQL + pure btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.jpg`); **Cand 2 (modern-competitive-lead-spatial-extension)** = PostGIS GiST on `geography(POINT,4326)` for the geographic side + pgvector HNSW for the descriptor ANN side + the same filesystem tile storage. Both candidates share the same Postgres-as-runtime-DB substrate per user-pinned scope (Postgres on Jetson at runtime, c6_postgres_locus = A). The user explicitly stated that the satellite-provider pattern is NOT carved in stone: Cand 2 may cascade changes back to the satellite-provider IF research reveals a MATERIAL improvement (small improvements stay with Cand 1).
|
||||
|
||||
---
|
||||
|
||||
## Sources
|
||||
|
||||
### Source #92 — Parent-suite `satellite-provider` existing pattern (verified directly via filesystem read at /Users/obezdienie001/dev/azaion/suite/satellite-provider/)
|
||||
|
||||
**Title**: `azaion/suite/satellite-provider` .NET 8.0 microservice (PostgreSQL + Dapper + filesystem tile storage)
|
||||
**Tier**: L1 — primary code in the same multi-repo project workspace
|
||||
**URL**: file:///Users/obezdienie001/dev/azaion/suite/satellite-provider/
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**:
|
||||
- README at `satellite-provider/README.md` — confirms PostgreSQL backend, .NET 8.0 microservice, Dapper-based DataAccess layer, filesystem tile storage at `./tiles/{zoomLevel}/{x}/{y}.jpg`, NO PostGIS extension declared.
|
||||
- Migration `001_CreateTilesTable.sql` — `tiles` table with `(id UUID PK, zoom_level INT, latitude DOUBLE PRECISION, longitude DOUBLE PRECISION, tile_size_meters DOUBLE PRECISION, tile_size_pixels INT, image_type VARCHAR(10), maps_version VARCHAR(50), file_path VARCHAR(500), created_at, updated_at)`.
|
||||
- Migration `003_CreateIndexes.sql` — `CREATE INDEX idx_tiles_composite ON tiles(latitude, longitude, tile_size_meters)` + `CREATE INDEX idx_tiles_zoom ON tiles(zoom_level)` + `CREATE INDEX idx_regions_status ON regions(status)`. **Pure btree composite indexes; NO GiST, NO PostGIS, NO spatial extension.**
|
||||
- Migration `011_AddTileCoordinates.sql` — RENAME `zoom_level` → `tile_zoom`; ADD `tile_x INT NOT NULL` + `tile_y INT NOT NULL` derived via slippy-map Web Mercator math (`tile_x = FLOOR((longitude + 180.0) / 360.0 * POWER(2, tile_zoom))::INT` + `tile_y = FLOOR((1.0 - LN(TAN(RADIANS(latitude)) + 1.0 / COS(RADIANS(latitude))) / PI()) / 2.0 * POWER(2, tile_zoom))::INT`); CREATE UNIQUE INDEX `idx_tiles_unique_location ON tiles(latitude, longitude, tile_zoom, tile_size_meters, version)` + `CREATE INDEX idx_tiles_coordinates ON tiles(tile_zoom, tile_x, tile_y, version)`. **Confirms: existing pattern uses btree on slippy-map (zoom, x, y) integer-coordinate columns for spatial-grid range queries.**
|
||||
|
||||
**Key facts extracted**:
|
||||
- DB engine: PostgreSQL (vanilla, no extensions).
|
||||
- Spatial index strategy: pure btree composite on slippy-map integer coordinates `(tile_zoom, tile_x, tile_y, version)` for spatial-grid range queries; secondary btree on lat/lon for inverse-geocode lookups.
|
||||
- Tile bytes: filesystem at canonical slippy-map path `./tiles/{zoom}/{x}/{y}.jpg`.
|
||||
- DB ↔ filesystem coupling: `file_path VARCHAR(500)` pointer in DB.
|
||||
- Migration mechanism: numbered SQL files as `EmbeddedResource`, run automatically on startup via `DatabaseMigrator.cs`.
|
||||
- App layer: .NET 8.0 + Dapper + raw SQL repos.
|
||||
|
||||
**Implication**: For the on-Jetson C6 (which is Python/C++, not .NET), the equivalent stack is `psycopg[binary]` or `asyncpg` Python driver + raw SQL queries against the same schema pattern.
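
A minimal sketch of the raw-SQL access pattern described above, run against the composite `(tile_zoom, tile_x, tile_y, version)` index. To stay runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for Postgres; on the Jetson the same statements would go through `psycopg` or `asyncpg` with that driver's placeholder syntax. The column types, the reduced column set, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

def build_tile_index(rows):
    """Create an in-memory tiles table with a migration-011-style composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tiles ("
        " tile_zoom INTEGER NOT NULL, tile_x INTEGER NOT NULL,"
        " tile_y INTEGER NOT NULL, version INTEGER NOT NULL,"
        " file_path TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX idx_tiles_coordinates"
        " ON tiles(tile_zoom, tile_x, tile_y, version)"
    )
    conn.executemany("INSERT INTO tiles VALUES (?, ?, ?, ?, ?)", rows)
    return conn

def tiles_in_window(conn, zoom, x_min, x_max, y_min, y_max):
    """Return file paths for all tiles inside an inclusive x/y window at one zoom."""
    cur = conn.execute(
        "SELECT file_path FROM tiles"
        " WHERE tile_zoom = ? AND tile_x BETWEEN ? AND ?"
        " AND tile_y BETWEEN ? AND ?"
        " ORDER BY tile_x, tile_y",
        (zoom, x_min, x_max, y_min, y_max),
    )
    return [row[0] for row in cur.fetchall()]

# Hypothetical sample data: a 3 x 3 block of zoom-18 tiles.
rows = [
    (18, x, y, 1, f"./tiles/18/{x}/{y}.jpg")
    for x in range(100, 103)
    for y in range(200, 203)
]
conn = build_tile_index(rows)
paths = tiles_in_window(conn, 18, 100, 101, 201, 202)
```

The equality on `tile_zoom` followed by a range on `tile_x` follows the column order of the composite index, which is the access shape the btree serves best.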
|
||||
|
||||
---
|
||||
|
||||
### Source #93 — PostgreSQL official documentation: btree multi-column index ordering and range query optimization
|
||||
|
||||
**Title**: PostgreSQL 16 documentation — "Multicolumn Indexes" + "Indexes and ORDER BY" + "EXPLAIN" + "btree access method"
|
||||
**Tier**: L1 — official authoritative docs
|
||||
**URL**: <https://www.postgresql.org/docs/current/indexes-multicolumn.html> + <https://www.postgresql.org/docs/current/btree.html>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending WebFetch
|
||||
**Key facts to extract**:
|
||||
- Btree multicolumn index supports range queries most efficiently when the leading columns carry equality constraints (e.g., `WHERE tile_zoom = ? AND tile_x BETWEEN ? AND ?` uses the index optimally).
|
||||
- Btree composite index access time: O(log N) where N = total rows.
|
||||
- Storage overhead: typically ~50-100 bytes per index entry depending on column types.
|
||||
|
||||
**Use**: backs Fact #92 sub-matrix entries on AC-4.1 (latency) and AC-4.2 (memory) for Cand 1.
|
||||
|
||||
---
|
||||
|
||||
### Source #94 — PostGIS official documentation: GiST spatial index on geography type + KNN distance ordering
|
||||
|
||||
**Title**: PostGIS 3.4 documentation — "GiST Indexes" + "geography Type" + "PostGIS Special Functions Index" + "ST_DWithin" + "<-> KNN operator"
|
||||
**Tier**: L1 — official authoritative docs (OGC SFS-compliant canonical extension)
|
||||
**URL**: <https://postgis.net/docs/using_postgis_dbmanagement.html#idx-spgist> + <https://postgis.net/docs/geography.html> + <https://postgis.net/workshops/postgis-intro/knn.html>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending WebFetch
|
||||
**Key facts to extract**:
|
||||
- GiST index access time on `geography(POINT,4326)`: O(log N) for bounding-box pre-filter; full geographic distance check is exact (not approximate).
|
||||
- KNN ordering via `ORDER BY position <-> ST_MakePoint(?, ?)::geography LIMIT K` is index-optimized in PostGIS 2.0+.
|
||||
- `ST_DWithin(position::geography, ST_MakePoint(?, ?)::geography, radius_m)` supports radius queries in meters with native geodesic distance (computed on the WGS84 spheroid by default).
|
||||
- PostGIS extension installed footprint: typically ~30-50 MB shared libraries + ~10-20 MB SRID/projection metadata catalog.
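
For comparison with Cand 1, the radius filter can be approximated app-side with a haversine distance. The sketch below is a spherical approximation only: it assumes a mean Earth radius of 6,371,008.8 m, whereas PostGIS `geography` measures on the WGS84 spheroid by default, so results differ by a fraction of a percent. The function names are illustrative.

```python
import math

EARTH_MEAN_RADIUS_M = 6371008.8  # assumed mean Earth radius (spherical model)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))

def within_radius(points, center, radius_m):
    """App-side analogue of a radius filter: keep (lat, lon) points within radius_m."""
    return [
        p for p in points
        if haversine_m(p[0], p[1], center[0], center[1]) <= radius_m
    ]

# One degree of longitude along the equator.
one_degree_m = haversine_m(0.0, 0.0, 0.0, 1.0)
nearby = within_radius([(0.0, 0.0), (0.0, 1.0), (0.0, 0.001)], (0.0, 0.0), 200.0)
```

A linear scan like this is O(N) per query; the GiST index is what turns the same predicate into an index-assisted lookup.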
|
||||
|
||||
**Use**: backs Fact #93 sub-matrix entries on AC-4.1 (latency) and AC-4.2 (memory) for Cand 2 + comparative-improvement-vs-Cand-1 analysis.
|
||||
|
||||
---
|
||||
|
||||
### Source #95 — pgvector official documentation: HNSW index for vector similarity search
|
||||
|
||||
**Title**: pgvector — "Open-source vector similarity search for Postgres" (`pgvector/pgvector`)
|
||||
**Tier**: L1 — canonical implementation by Andrew Kane
|
||||
**URL**: <https://github.com/pgvector/pgvector> + context7 indexed via `/pgvector/pgvector`
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending context7 + WebFetch
|
||||
**Key facts to extract**:
|
||||
- HNSW index API: `CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)` + `CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)` + `CREATE INDEX ON items USING hnsw (embedding vector_ip_ops)`.
|
||||
- Default tunable parameters: `m=16` (max connections per layer) + `ef_construction=64` (build-time candidate list size); query-time `ef_search` (default 40).
|
||||
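- Vector dimension limits: the `vector` type stores up to 16,000 dimensions, but HNSW and IVFFlat indexes on `vector` columns support at most 2,000 dimensions; pgvector 0.7+ adds `halfvec`, which can be indexed up to 4,000 dimensions.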
|
||||
- Memory footprint: extension itself ~5-10 MB shared library; per-vector storage = 4 bytes × dimensions + 8 bytes (so 2048-D ≈ 8 KB/vec, 1024-D ≈ 4 KB/vec, 512-D ≈ 2 KB/vec, 256-D ≈ 1 KB/vec).
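
The per-vector arithmetic can be checked with a few lines. This sketch assumes the `vector` storage rule from the pgvector README (4 bytes per dimension plus 8 bytes per value) and ignores row headers, TOAST, and HNSW graph overhead; the helper names and the vector count in the example are hypothetical.

```python
def vector_row_bytes(dims):
    """Raw storage of one pgvector `vector` value: 4 bytes per dimension + 8 bytes."""
    return 4 * dims + 8

def descriptor_table_mib(n_vectors, dims):
    """Raw vector payload in MiB; excludes row headers, TOAST, and index overhead."""
    return n_vectors * vector_row_bytes(dims) / (1024 ** 2)

# Bytes per vector for the descriptor sizes discussed above.
sizes = {d: vector_row_bytes(d) for d in (256, 512, 1024, 2048)}

# Hypothetical example: 1024 descriptors of 256 dimensions each.
payload_mib = descriptor_table_mib(1024, 256)
```

The index itself adds memory on top of this payload, so these figures are a lower bound for the descriptor side of the cache.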
|
||||
|
||||
**Use**: backs Fact #93 sub-matrix on descriptor ANN side for Cand 2 + comparative cache footprint analysis.
|
||||
|
||||
---
|
||||
|
||||
### Source #96 — FAISS official documentation: in-memory ANN library + Python bindings
|
||||
|
||||
**Title**: FAISS — "A library for efficient similarity search and clustering of dense vectors" (`facebookresearch/faiss`)
|
||||
**Tier**: L1 — canonical implementation by Meta AI Research
|
||||
**URL**: <https://github.com/facebookresearch/faiss> + <https://faiss.ai/>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending WebFetch + context7
|
||||
**Key facts to extract**:
|
||||
- Index types relevant to C6 descriptor ANN: `IndexFlatL2` (brute-force, exact), `IndexHNSWFlat` (HNSW graph, approximate), `IndexIVFFlat` (Inverted File, approximate w/ training).
|
||||
- Memory: in-memory only at query time; loaded from disk via `faiss.read_index(path)` at startup.
|
||||
- License: MIT.
|
||||
- Python API: `faiss.IndexFlatL2(d)` / `faiss.IndexHNSWFlat(d, m)` / `index.add(xb)` / `D, I = index.search(xq, k)`.
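
To make the exact baseline concrete without depending on FAISS, the sketch below is a pure-Python stand-in for what a flat L2 index computes: an exhaustive scan that ranks database vectors by squared L2 distance. It is not FAISS code; it only mirrors the `(D, I)` return convention of `index.search(xq, k)`, and the sample vectors are made up.

```python
def flat_l2_search(xb, xq, k):
    """Exact k-nearest-neighbour search by squared L2 distance.

    Returns (D, I): per-query lists of squared distances and database indices,
    ordered from nearest to farthest.
    """
    D, I = [], []
    for q in xq:
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, v)), i)
            for i, v in enumerate(xb)
        )[:k]
        D.append([d for d, _ in scored])
        I.append([i for _, i in scored])
    return D, I

# Hypothetical 2-D descriptors and a single query.
xb = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
xq = [[0.9, 0.1]]
D, I = flat_l2_search(xb, xq, k=2)
```

The scan costs O(N × d) per query, which is the cost an approximate index such as HNSW trades some recall to avoid.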
|
||||
|
||||
**Use**: backs Fact #92 sub-matrix on descriptor ANN side for Cand 1 (app-side FAISS in-memory loaded at takeoff from Postgres bytea blobs).
|
||||
|
||||
---
|
||||
|
||||
### Source #97 — Postgres on NVIDIA Jetson Orin Nano memory footprint and JetPack 6 deployment
|
||||
|
||||
**Title**: PostgreSQL on ARM64 / Ubuntu 22.04 (JetPack 6 base) — official packaging + Docker images
|
||||
**Tier**: L1 — official Postgres ARM64 packages + Docker `postgres:16-alpine` image documentation
|
||||
**URL**: <https://hub.docker.com/_/postgres> + <https://www.postgresql.org/download/linux/ubuntu/>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending WebFetch
|
||||
**Key facts to extract**:
|
||||
- ARM64 packages available for Postgres 16 on Ubuntu 22.04 (JetPack 6 base).
|
||||
- Default `shared_buffers=128MB` + `work_mem=4MB` resident footprint ~80-150 MB on idle; ~200-400 MB under modest load.
|
||||
- Docker `postgres:16-alpine` image size: ~250 MB compressed.
|
||||
- PostGIS Docker image `postgis/postgis:16-3.4-alpine` adds ~50-80 MB to base postgres image.
|
||||
|
||||
**Use**: backs both Fact #92 + Fact #93 sub-matrix entries on AC-4.2 (8 GB shared memory budget) for the Postgres-on-Jetson deployment.
|
||||
|
||||
---
|
||||
|
||||
### Source #98 — Slippy Map Tilenames specification (OpenStreetMap canonical reference)
|
||||
|
||||
**Title**: Slippy Map Tilenames — XYZ tile coordinate system + Web Mercator projection
|
||||
**Tier**: L1 — canonical convention documented by OpenStreetMap Foundation
|
||||
**URL**: <https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: pending WebFetch
|
||||
**Key facts to extract**:
|
||||
- Tile X/Y math: `xtile = floor((lon + 180) / 360 * 2^zoom)` + `ytile = floor((1 - asinh(tan(lat * π/180)) / π) / 2 * 2^zoom)` — matches satellite-provider migration 011 exactly.
|
||||
- Tile coverage: at zoom Z, world divided into 2^Z × 2^Z tiles; each tile covers `360/2^Z` longitude × variable-latitude.
|
||||
- Project zoom: ZoomLevel 18 (per satellite-provider README default) covers ~38m × 38m at equator (cited as "tileSizeMeters: 38.2" in README sample response).
|
||||
- Cache budget per AC-8.3 (10 GB): at typical JPEG ~30 KB/tile, fits ~330,000 tiles = roughly an area of 50 km × 50 km × 9 zoom levels OR a single mission corridor at zoom 18 of ~1000 km × 12 m.
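
The tile arithmetic above can be sanity-checked with a short sketch. It assumes the standard Web Mercator equatorial circumference of 40,075,016.686 m and decimal units for the 10 GB budget; the function names are illustrative. Under this standard formula a 38.2 m tile edge at the equator corresponds to zoom 20 (zoom 18 gives about 152.9 m), so the zoom numbering quoted from the README may follow a different convention and is worth re-checking during verification.

```python
import math

EQUATOR_CIRCUMFERENCE_M = 40075016.686  # assumed WGS84 Web Mercator value

def deg2num(lat_deg, lon_deg, zoom):
    """Slippy-map tile indices (xtile, ytile) for a lat/lon in degrees."""
    n = 2 ** zoom
    xtile = math.floor((lon_deg + 180.0) / 360.0 * n)
    ytile = math.floor(
        (1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n
    )
    return xtile, ytile

def tile_width_m(zoom, lat_deg=0.0):
    """Approximate east-west ground width of one tile at the given latitude."""
    return EQUATOR_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / 2 ** zoom

def tiles_in_budget(budget_bytes, bytes_per_tile):
    """Whole number of tiles that fit in a byte budget."""
    return budget_bytes // bytes_per_tile

width_z18 = tile_width_m(18)
width_z20 = tile_width_m(20)
tile_count = tiles_in_budget(10 * 10**9, 30_000)
```

Multiplying `tile_count` by the squared tile width gives the ground area a single-zoom cache can cover, which is a quick way to re-derive the corridor dimensions for a given zoom level.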
|
||||
|
||||
**Use**: backs both Fact #92 + Fact #93 sub-matrix entries on AC-8.3 (10 GB cache budget) + AC-3.x (mission corridor coverage).
|
||||
|
||||
---
|
||||
|
||||
(Subsequent sources #99+ added during fact extraction below as candidate-specific evidence is gathered.)
|
||||
@@ -0,0 +1,190 @@
|
||||
# Source Registry — C7: On-Jetson inference runtime
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation). Sources backing the C7 cross-cutting integration row ([`../06_component_fit_matrix/C7_inference_runtime.md`](../06_component_fit_matrix/C7_inference_runtime.md)) and C7 fact cards ([`../02_fact_cards/C7_inference_runtime.md`](../02_fact_cards/C7_inference_runtime.md)).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling component sources: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache](C6_tile_cache_spatial_index.md). Sub-question sources: [SQ6 external positioning](SQ6_external_positioning.md), [SQ1 existing systems](SQ1_existing_systems.md), [SQ2 canonical pipeline](SQ2_canonical_pipeline.md).
|
||||
|
||||
---
|
||||
|
||||
## Scope summary
|
||||
|
||||
C7 is a **cross-cutting integration row** rather than a per-component candidate row: it pins how the C1 VIO learned-frontend (if any), C2 VPR backbone, and C3 matcher actually run on the Jetson Orin Nano Super under JetPack 6 — TensorRT vs ONNX Runtime+TRT EP vs pure PyTorch FP16. Per the user-pinned scope (locked via `/autodev` AskQuestion 2026-05-08 — see `_docs/_autodev_state.md` `c7_breadth=B`, `c7_quantization=A`, `c7_overkill_options=A`), three documentary candidate rows are evaluated: **TensorRT native primary** + **ONNX Runtime + TensorRT EP interop alternate** + **pure PyTorch FP16 mandatory simple-baseline**. INT8 primary + FP16 fallback per candidate; INT8-only candidates Experimental until calibration data exists. Triton / DeepStream / CUDA-Python custom kernels noted-and-rejected in one sentence (server/video-pipeline class or out-of-budget for embedded 8 h mission). Cand-row candidates inherit and propagate Plan-phase gates already opened by C2 (D-C2-5 DINOv2 ViT-export to TensorRT FP16/INT8) and C3 (D-C3-2 LightGlue inference runtime path).
|
||||
|
||||
---
|
||||
|
||||
## Sources
|
||||
|
||||
### Source #99 — NVIDIA TensorRT 10.x official documentation portal (context7-indexed)
|
||||
|
||||
**Title**: NVIDIA TensorRT — SDK for optimizing and accelerating deep learning inference on NVIDIA GPUs (mixed precision, dynamic shapes, transformer optimizations)
|
||||
**Tier**: L1 — official authoritative SDK documentation (NVIDIA primary)
|
||||
**URL**: <https://docs.nvidia.com/deeplearning/tensorrt/latest/> + context7 indexed at `/websites/nvidia_deeplearning_tensorrt`
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ context7 query "INT8 calibration EntropyCalibrator2 ICudaEngine deserialize Jetson Orin Nano FP16 mixed precision deployment workflow Python builder" returned 9371 code snippets at Source Reputation High + Benchmark Score 75.25.
|
||||
|
||||
**Key APIs verified**:
|
||||
- **INT8 calibrator hierarchy**: `nvinfer1::IInt8Calibrator` (abstract base) + `nvinfer1::IInt8EntropyCalibrator` (deprecated) + `nvinfer1::IInt8EntropyCalibrator2` (current canonical) + `nvinfer1::IInt8MinMaxCalibrator`. Each defines `getBatchSize()` + `getBatch(void* bindings[], const char* names[], int32_t nbBindings)` + `readCalibrationCache(size_t& length)` + `writeCalibrationCache(const void* ptr, size_t length)` + `getAlgorithm()` returning `kENTROPY_CALIBRATION_2` for the canonical path.
|
||||
- **Python builder INT8 enable pattern** (canonical TensorRT 10.x):
|
||||
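```python
import tensorrt as trt

# EntropyCalibrator is assumed to be a user-defined trt.IInt8EntropyCalibrator2 subclass fed by batchstream.
Int8_calibrator = EntropyCalibrator(["input_node_name"], batchstream)
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = Int8_calibrator
```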
|
||||
- **Mixed-precision flag pattern**: `config.set_flag(trt.BuilderFlag.FP16)` + `config.set_flag(trt.BuilderFlag.INT8)` for combined FP16+INT8 mixed precision (TensorRT auto-selects per-layer precision based on calibration data).
|
||||
|
||||
**Use**: backs Fact #94 (TensorRT native primary candidate) per-mode API verification block + Plan-phase D-C7-1 calibration-dataset-strategy + D-C7-2 mixed-precision flag matrix.
|
||||
|
||||
---
|
||||
|
||||
### Source #100 — Microsoft ONNX Runtime official documentation (context7-indexed) + Jetson AI Lab community wheel index
|
||||
|
||||
**Title**: Microsoft ONNX Runtime — cross-platform ML inference and training accelerator with TensorRT execution provider; Jetson-specific install path via Jetson AI Lab community PyPI index
|
||||
**Tier**: L1 — official authoritative SDK documentation (Microsoft primary) + L2 community-maintained Jetson wheel index
|
||||
**URL**: <https://onnxruntime.ai/> + context7 indexed at `/microsoft/onnxruntime` (v1.25.0) + <https://pypi.jetson-ai-lab.io/jp6/cu126/> + <https://github.com/dusty-nv/jetson-containers/issues/1283> + <https://github.com/microsoft/onnxruntime/issues/20503> + <https://github.com/microsoft/onnxruntime/issues/27562>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ context7 query "TensorRT execution provider TrtFp16Enable TrtInt8Enable TrtCachePath onnxruntime-gpu Jetson ARM64 inference session options" returned 1462 code snippets at Source Reputation High + Benchmark Score 82.23 (highest of the 3 C7 candidate context7 lookups).
|
||||
|
||||
**Key APIs verified**:
|
||||
- **Provider enumeration + config pattern** (canonical Python API):
|
||||
```python
import onnxruntime as ort

print(ort.get_available_providers())

tensorrt_options = {'device_id': 0, 'trt_max_workspace_size': 2147483648, 'trt_fp16_enable': True}
cuda_options = {'device_id': 0, 'arena_extend_strategy': 'kNextPowerOfTwo', 'gpu_mem_limit': 2 * 1024 * 1024 * 1024}

session_trt = ort.InferenceSession(
    "model.onnx",
    providers=[('TensorrtExecutionProvider', tensorrt_options), ('CUDAExecutionProvider', cuda_options), 'CPUExecutionProvider']
)
```
|
||||
- **Provider-cascade behavior**: ORT TRT EP attempts to optimize each subgraph via TensorRT; falls back to CUDA EP for unsupported ops; falls back to CPU EP if neither GPU EP applies. Subgraph fallback is automatic and per-op transparent.
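
The cascade order described above can be expressed as a small selection helper. This is a minimal sketch rather than ONNX Runtime's own logic: `select_providers` and `has_gpu_provider` are hypothetical names, and the input list is assumed to come from `ort.get_available_providers()`.

```python
# Preferred execution-provider order for the Jetson target, highest priority first.
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def select_providers(available):
    """Return the preferred providers that are actually available, in priority order.

    `available` is a list of provider names, for example the value returned
    by onnxruntime.get_available_providers().
    """
    chosen = [name for name in PREFERRED_PROVIDERS if name in available]
    if not chosen:
        raise RuntimeError("no usable execution provider in: %r" % (available,))
    return chosen


def has_gpu_provider(available):
    """True when at least one GPU execution provider (TensorRT or CUDA) is present."""
    return any(name in available for name in PREFERRED_PROVIDERS[:2])
```

Checking `has_gpu_provider(...)` at startup gives an early failure when a CPU-only wheel was installed by mistake, instead of a silent fallback to CPU inference.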
|
||||
|
||||
**Jetson install constraints (CRITICAL)**:
|
||||
- **Standard `pip install onnxruntime-gpu` does NOT work on Jetson Tegra** — Microsoft does not publish prebuilt aarch64 wheels with CUDA/TensorRT EPs (per Issue #20503: "NVIDIA does not have CI infrastructure to publish them").
|
||||
- **Canonical install path (JetPack 6 + CUDA 12.6 + Ubuntu 22.04)**: `pip3 install onnxruntime-gpu --index-url https://pypi.jetson-ai-lab.io/jp6/cu126`.
|
||||
- **Alternate index (CUDA 12.9 + Ubuntu 24.04)**: `pip3 install onnxruntime-gpu --index-url https://pypi.jetson-ai-lab.io/jp6/cu129`.
|
||||
- **Known incompatibility**: onnxruntime-gpu v1.23.0 wheels for JetPack 6 were built against `numpy<2.0.0`; importing under `numpy>=2.0.0` raises a compatibility error per Issue #27562. Pin numpy<2 in project requirements until upstream rebuild is published.
|
||||
- **Standard pip install `onnxruntime` (CPU-only) succeeds but exposes only `CPUExecutionProvider` and `AzureExecutionProvider`** — does NOT include CUDA EP or TensorRT EP.
|
||||
|
||||
**Use**: backs Fact #95 (ONNX Runtime + TensorRT EP interop alternate candidate) per-mode API verification block + Plan-phase D-C7-3 ORT-Jetson-wheel-pin + D-C7-4 numpy-version-pin.
|
||||
|
||||
---
|
||||
|
||||
### Source #101 — PyTorch official documentation (context7-indexed) + Jetson AI Lab PyTorch wheel availability for JetPack 6
|
||||
|
||||
**Title**: PyTorch — open-source machine learning framework (tensor computation with strong GPU acceleration; tape-based autograd); Jetson-specific wheels available via Jetson AI Lab + NVIDIA forums
|
||||
**Tier**: L1 — official authoritative SDK documentation (PyTorch Foundation primary) + L1 NVIDIA Developer Forums (canonical Jetson PyTorch distribution channel)
|
||||
**URL**: <https://pytorch.org/docs/stable/amp.html> + context7 indexed at `/pytorch/pytorch` (v2.5.1, v2.8.0, v2.9.1, v2.11.0) + <https://forums.developer.nvidia.com/t/installing-pytorch-for-jetpack-6-2/349519> + <https://forums.developer.nvidia.com/t/jetpack-6-2-and-pytorch-2-6-0-on-jetson-nano-orin/331972>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ context7 query "torch.cuda.amp.autocast half precision FP16 inference mode no_grad CUDA Jetson Orin ARM64 model.half() torch.compile inference deployment" returned 4866 code snippets at Source Reputation High + Benchmark Score 76.69.
|
||||
|
||||
**Key APIs verified**:
|
||||
- **`torch.amp.autocast(device_type, dtype, enabled, cache_enabled)`** — canonical AMP context manager (since PyTorch 1.10). Replaces deprecated `torch.cuda.amp.autocast`. Inference pattern:
|
||||
```python
import torch

with torch.no_grad():
    with torch.autocast(device_type='cuda', dtype=torch.float16, enabled=True):
        output = model(input)
```
|
||||
- **`torch.compile(model, backend='inductor')`** — graph-mode optimization for further speedup; tradeoff is cold-start compile cost (~10-60 sec depending on model complexity).
|
||||
- **`model.half()`** — eager-mode FP16 weight conversion (parameters and buffers are cast to FP16, so the forward pass runs in FP16 throughout, vs autocast's per-op precision selection).
|
||||
|
||||
**Jetson install constraints**:
|
||||
- **Standard `pip install torch` does NOT include CUDA support on Jetson** — must use NVIDIA-published or Jetson AI Lab community wheels.
|
||||
- **JetPack 6.2 + CUDA 12.6 + Ubuntu 22.04 + Python 3.10 canonical wheel**: `torch-2.9.0-cp310-cp310-linux_aarch64.whl` from Jetson AI Lab (per NVIDIA forum recommendation). Earlier stable combination: PyTorch 2.5 + torchvision 0.20.
|
||||
- **Known dependency issues**: missing `libcudss.so.0` and `libnvdla_runtime.so` on PyTorch 2.9 cu129 wheel under JetPack 6.2 (CUDA 12.6) — version mismatch between wheel build target and installed JetPack CUDA. Mitigation: prefer the cu126 variant for JetPack 6.2.
|
||||
- **CUDA capability**: Jetson Orin Nano Super GPU = compute capability **SM 87** (Ampere class).
|
||||
|
||||
**Use**: backs Fact #96 (pure PyTorch FP16 mandatory simple-baseline candidate) per-mode API verification block + D-C7-5 PyTorch-Jetson-wheel-pin.
|
||||
|
||||
---
|
||||
|
||||
### Source #102 — Ultralytics YOLO26 benchmark suite on Jetson Orin Nano Super (April 2026)
|
||||
|
||||
**Title**: Update NVIDIA Jetson Orin Nano Super benchmarks with YOLO26 (Ultralytics 8.4.33; commit 8d4e6e8 April 2026)
|
||||
**Tier**: L1 — official authoritative benchmark suite (Ultralytics is the canonical YOLO maintainer)
|
||||
**URL**: <https://github.com/ultralytics/ultralytics/pull/24097> + <https://github.com/ultralytics/ultralytics/commit/8d4e6e841c89f6598b322695cb2bc816eeba8b93>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ Web search results explicitly cite the per-export-format inference times measured on Jetson Orin Nano Super.
|
||||
|
||||
**Key data extracted (YOLO26n on Jetson Orin Nano Super, April 2026 measurement)**:
|
||||
|
||||
| Export format | Inference time (ms) | mAP50-95 | Speedup vs FP32 | Accuracy delta vs FP16 |
|---|---|---|---|---|
| TensorRT FP32 | 7.53 | 0.4770 | 1.00× | — |
| TensorRT FP16 | 4.57 | 0.4800 | 1.65× | baseline (slightly higher than FP32 due to noise) |
| TensorRT INT8 | 3.80 | 0.4490 | 1.98× | **-6.5% mAP50-95** |
|
||||
|
||||
**Key data extracted (YOLOv8s on Jetson Orin Nano, NVIDIA forum)**:
|
||||
- **INT8**: ~157 QPS (~6.4 ms/inference)
|
||||
- **FP16**: ~103 QPS (~9.7 ms/inference)
|
||||
- **INT8 vs FP16 speedup**: ~1.5× (vs ~1.20× on YOLO26n — model architecture and memory bandwidth dependent)
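
The derived figures in this source (speedups, the INT8 accuracy delta, and the QPS-to-latency conversions) follow from simple ratios. The sketch below recomputes them from the quoted measurements; the helper names are illustrative only.

```python
# (latency in ms, mAP50-95) for YOLO26n, copied from the table above.
YOLO26N = {
    "FP32": (7.53, 0.4770),
    "FP16": (4.57, 0.4800),
    "INT8": (3.80, 0.4490),
}


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def relative_delta_pct(baseline, candidate):
    """Relative change of candidate vs baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def qps_to_ms(qps):
    """Convert throughput (queries per second) to per-inference latency in ms."""
    return 1000.0 / qps


fp16_vs_fp32 = speedup(YOLO26N["FP32"][0], YOLO26N["FP16"][0])
int8_vs_fp32 = speedup(YOLO26N["FP32"][0], YOLO26N["INT8"][0])
int8_vs_fp16 = speedup(YOLO26N["FP16"][0], YOLO26N["INT8"][0])
int8_map_delta = relative_delta_pct(YOLO26N["FP16"][1], YOLO26N["INT8"][1])

# YOLOv8s forum figures: INT8 ~157 QPS, FP16 ~103 QPS.
yolov8s_int8_ms = qps_to_ms(157)
yolov8s_fp16_ms = qps_to_ms(103)
yolov8s_int8_vs_fp16 = 157 / 103
```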
|
||||
|
||||
**Use**: backs Fact #94 (TensorRT) latency claims for object-detection-class CNN backbones on Jetson Orin Nano Super; provides empirical anchor for the engine's "INT8 primary + FP16 fallback" precision strategy. Caveat: YOLO is a detection network; feature-matching networks (LightGlue / DISK / XFeat) are known to be more quantization-sensitive (see Source #103).
|
||||
|
||||
---
|
||||
|
||||
### Source #103 — LightGlue ONNX Runtime + TensorRT acceleration (canonical reference) + FP8 ModelOpt quantization findings (Fabio Sim's Journal)
|
||||
|
||||
**Title**: Accelerating LightGlue Inference with ONNX Runtime and TensorRT (Fabio Sim's Journal, canonical author of `fabio-sim/LightGlue-ONNX`) + FP8 Quantized LightGlue in TensorRT with NVIDIA Model Optimizer (subsequent post)
|
||||
**Tier**: L1 — canonical author of the canonical LightGlue ONNX/TensorRT export pathway (already cited as Source #73 in C3 row)
|
||||
**URL**: <https://fabio-sim.github.io/blog/accelerating-lightglue-inference-onnx-runtime-tensorrt/> + <https://fabio-sim.github.io/blog/fp8-quantized-lightglue-tensorrt-nvidia-model-optimizer/> + <https://github.com/qdLMF/LightGlue-with-FlashAttentionV2-TensorRT> (community Jetson Orin NX TensorRT 8.5.2 + FlashAttentionV2 plugin reference implementation)
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ Web search results explicitly cite the 2-4× ONNX Runtime + TensorRT speedup over compiled PyTorch and the FP8 5.97× / 0.32× engine-size results.
|
||||
|
||||
**Key data extracted**:
|
||||
- **LightGlue (transformer-based feature matcher) — ONNX Runtime + TensorRT inference**: 2-4× speedup over compiled PyTorch across various batch sizes and sequence lengths.
|
||||
- **FP8 quantized LightGlue (NVIDIA ModelOpt) on Hopper/Ada/Blackwell**:
  - Engine size ~0.32× of FP32 (~68% smaller).
  - Up to 5.97× speedup vs FP32.
  - **Material accuracy degradation**: "match counts dropped. Sometimes they dropped hard." This is qualitatively different from YOLO-class detection networks where INT8 is well-tolerated.
|
||||
- **FP8 hardware support**: requires Hopper / Ada / Blackwell architecture. **Jetson Orin Nano Super is Ampere (SM 87) — NOT FP8-native**. FP8 ModelOpt path applies only via INT8 emulation fallback on Ampere.
|
||||
- **Two FP8 formats**: E4M3 (4 exponent bits + 3 mantissa bits, better precision for activations) + E5M2 (5 exponent bits + 2 mantissa bits, better dynamic range for gradients).
|
||||
- **Community Jetson reference implementation**: `qdLMF/LightGlue-with-FlashAttentionV2-TensorRT` deploys on Jetson Orin NX 8 GB with TensorRT 8.5.2 + custom FlashAttentionV2 plugin.
|
||||
|
||||
**Use**: backs Fact #94 (TensorRT) feature-matching-network INT8 caveat; backs the "INT8-only candidates Experimental until calibration data exists" engine ruling per user-pinned `c7_quantization=A` scope; raises Plan-phase gate D-C7-6 INT8-vs-FP16-per-model-family-precision-policy.
|
||||
|
||||
---
|
||||
|
||||
### Source #104 — JetPack SDK release notes (NVIDIA official) — JetPack 6.0 / 6.1 / 6.2 version matrix
|
||||
|
||||
**Title**: NVIDIA JetPack 6.x SDK Release Notes — TensorRT/CUDA/cuDNN versions per release; Super Mode introduction in JetPack 6.2 (January 2025)
|
||||
**Tier**: L1 — official authoritative release notes (NVIDIA Developer)
|
||||
**URL**: <https://developer.nvidia.com/embedded/jetpack-sdk-60> + <https://developer.nvidia.com/embedded/jetpack-sdk-61> + <https://developer.nvidia.com/embedded/jetpack-sdk-62> + <https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/>
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ Web search results explicitly enumerate TensorRT / CUDA / cuDNN per JetPack release.
|
||||
|
||||
**Key data extracted**:
|
||||
|
||||
| JetPack | CUDA | TensorRT | cuDNN | Super Mode | Released |
|---|---|---|---|---|---|
| 6.0 | 12.2 | 8.6 | 8.9 | No | early 2024 |
| 6.1 | 12.6 | 10.3 | 9.3 | MAXN mode (dev kit only) | mid-2024 |
| **6.2** | **12.6** | **10.3** | **9.3** | **YES — Orin Nano Super + Orin NX production modules** | **2025-01-16** |
|
||||
|
||||
- **Super Mode performance gains** (vs base Orin Nano): up to 2× higher generative AI inference performance, 70% AI TOPS increase, 50% memory bandwidth boost.
|
||||
- **TensorRT 10.3** is the canonical inference runtime version for JetPack 6.1 / 6.2 deployments. Major API upgrade from TensorRT 8.x → 10.x — `IInt8EntropyCalibrator2` API surface is preserved; `INetworkDefinition` and `IBuilderConfig` semantics unchanged.
|
||||
|
||||
**Use**: pins the project's target software stack to **JetPack 6.2 + CUDA 12.6 + TensorRT 10.3 + cuDNN 9.3 + Super Mode enabled** for the Jetson Orin Nano Super target hardware. Backs Facts #94, #95, #96 deployability claims.
|
||||
|
||||
---
|
||||
|
||||
### Source #105 — TensorRT-on-Jetson canonical install constraints (Ultralytics issue reports + NVIDIA forum)
|
||||
|
||||
**Title**: TensorRT 10.x on Jetson Orin Nano — install path, hardware-specificity, memory-pressure-during-build constraints
|
||||
**Tier**: L2 — community-reported issues with NVIDIA-acknowledged root causes (high signal-to-noise on canonical constraints)
|
||||
**URL**: <https://github.com/ultralytics/ultralytics/issues/18882> ("TensorRT does not currently build wheels for Tegra systems") + <https://forums.developer.nvidia.com/t/tensorrt-10-7-0-on-orin-nano/364236> (SM 87 compute-capability mismatch) + <https://github.com/ultralytics/ultralytics/issues/18730> (laptop-GPU-built engine cannot load on Jetson) + <https://github.com/ultralytics/ultralytics/issues/21281> (TensorRT export memory pressure on Orin AGX)
|
||||
**Access date**: 2026-05-08
|
||||
**Direct verification**: ✅ Web search returned direct issue links with NVIDIA-confirmed root causes.
|
||||
|
||||
**Key constraints extracted** (CRITICAL for C7 deployment design):
|
||||
|
||||
1. **TensorRT Python wheels are NOT installed via pip on Jetson Tegra**. Standard `pip install tensorrt` raises: `RuntimeError: TensorRT does not currently build wheels for Tegra systems`. The canonical install path is the JetPack-bundled TensorRT (already present after `apt install nvidia-jetpack`), accessed via the system Python at `/usr/lib/python3.10/dist-packages/tensorrt`.
|
||||
2. **TensorRT engines are hardware-specific** — engines built against a laptop / dev-machine GPU CANNOT be loaded on the Jetson at runtime. **Engines must be built directly on the Jetson target**.
|
||||
3. **GPU compute capability mismatch is silent at build-time, fatal at load-time**: laptop GPUs (e.g., RTX 4090 = SM 89) and Jetson Orin Nano Super (SM 87) produce incompatible engines; the build emits no error, the load logs `Target GPU SM 87 is not supported by this TensorRT release` — version-and-SM-compatibility matrix must be respected.
|
||||
4. **TensorRT engine builds on Jetson under memory pressure can segfault during tactic profiling** (8 GB of shared CPU+GPU memory is tight; a rich layer-fusion search drives peak RAM use while tactics are being profiled). Mitigation: cap the builder workspace at a fraction of the budget (e.g., 1-2 GB) via `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, <bytes>)`, which replaces the `config.max_workspace_size` attribute removed in TensorRT 10.x, and avoid concurrent inference / Postgres / FAISS during builds.
|
||||
5. **JetPack 6.x ships the canonical TensorRT version** (TensorRT 10.3 for JP 6.1/6.2 per Source #104); upgrading TensorRT independently of JetPack is not officially supported.
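
One way to respect constraints 2 and 3 is to key every serialized engine by the GPU architecture and TensorRT version it was built for, so an engine built elsewhere is never picked up by accident. The following is a sketch under assumptions: `engine_filename`, `find_engine`, the file-name layout, and the model name are all hypothetical and not part of any NVIDIA tooling.

```python
from pathlib import Path


def engine_filename(model_name, precision, sm, trt_version):
    """Build an engine file name that encodes precision, SM, and TensorRT version."""
    safe_version = trt_version.replace(".", "_")
    return f"{model_name}.{precision}.sm{sm}.trt{safe_version}.engine"


def find_engine(cache_dir, model_name, precision, sm, trt_version):
    """Return the matching engine path, or None when it must be (re)built on this target."""
    path = Path(cache_dir) / engine_filename(model_name, precision, sm, trt_version)
    return path if path.is_file() else None


# Example: the Orin Nano Super (SM 87) with TensorRT 10.3 and a hypothetical model name.
name_on_jetson = engine_filename("lightglue", "fp16", 87, "10.3")
name_on_laptop = engine_filename("lightglue", "fp16", 89, "10.3")
```

Because the two names differ, a laptop-built engine copied into the cache is ignored and the Jetson falls through to an on-device build.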
|
||||
|
||||
**Use**: drives D-C7-7 build-on-Jetson-vs-prebuilt-engine-shipping-strategy + D-C7-8 max-workspace-size-cap-for-build-stability + D-C7-9 SM-compatibility-version-pin.
|
||||
|
||||
---
|
||||
|
||||
(Subsequent sources #106+ added during fact extraction below as candidate-specific evidence is gathered. Closure target: 3 candidate rows + 1 cross-cutting integration matrix.)
|
||||
@@ -0,0 +1,97 @@
|
||||
# Source Registry — C8: MAVLink / MSP2 FC adapter
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation). C8 batch 1 sources for the FC adapter (per-FC adapter pattern verified at SQ6 closure: ArduPilot Plane via MAVLink `GPS_INPUT`, iNav via `MSP2_SENSOR_GPS` primary OR UBX-impersonation alternate). Confidence labels per `references/source-tiering.md`. Cross-references back to SQ6 fact card sources (#4, #9, #10, #12, #13, #15) where the iNav inbound-handler reality and MSP2/UBX transport options were originally established.
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling component categories: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache](C6_tile_cache_spatial_index.md), [C7 Inference runtime](C7_inference_runtime.md). Cross-cuts: [SQ6 external positioning](SQ6_external_positioning.md).
|
||||
|
||||
## Sources
|
||||
|
||||
### Source #106 — ArduPilot Pymavlink (context7-indexed `/ardupilot/pymavlink`)
|
||||
- **Tier**: L1 (canonical Python MAVLink implementation maintained by ArduPilot)
|
||||
- **Found via**: context7 `resolve-library-id` → `/ardupilot/pymavlink` → `query-docs` for GPS_INPUT send patterns
|
||||
- **Library posture**: 32 code snippets indexed in context7 (Source Reputation: High); coverage emphasizes the JavaScript MAVLink generator output, with thinner Python-side examples in context7 — supplementary primary sources (canonical pymavlink GitHub README + ArduPilot GPS_INPUT dev docs Source #107) carry the canonical Python `master.mav.gps_input_send(...)` send pattern.
|
||||
- **License**: LGPL v3 (pymavlink itself); MAVLink generated dialects are MIT — the project's runtime dependency is on the LGPL pymavlink Python package. **Compatible with project's Apache-2.0 dual-use track**: LGPL allows linking from a non-LGPL application without "infecting" application license; the only obligation is to publish/redistribute any modifications to pymavlink itself (project does not modify pymavlink), and to allow users to relink against an updated pymavlink (trivially satisfied for an open-source / company-internal deployment with published `requirements.txt`).
|
||||
- **Critical-novelty-sensitivity**: Established baseline; no time window — pymavlink has been the canonical Python MAVLink stack since 2010+, and `GPS_INPUT` (msg 232) has been in `common.xml` since 2017 ArduPilot dev iteration.
|
||||
- **Per-mode capability verification (context7 + SQ6 Source #4 AP_GPS_MAV.cpp cross-cite)**: ✅ `GPS_INPUT` decoder confirmed in AP_GPS_MAV.cpp master per SQ6 Fact #1; Python sender uses `master = mavutil.mavlink_connection(...)` + `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` per pymavlink generated dialect.
|
||||
- **Used to support**: Fact #97 (ArduPilot Plane FC adapter primary candidate).
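
The positional argument order of `gps_input_send(...)` quoted above can be prepared with a small helper that also applies the unit scaling. This is a hedged sketch: `build_gps_input_fields` is a hypothetical name, the scaling (degrees × 1e7 for `lat`/`lon`, metres and m/s for the float fields, `yaw` in centidegrees with 0 meaning "not available") reflects my reading of the MAVLink common dialect and should be checked against it, and the defaults for `hdop`, `vdop`, and `satellites_visible` are arbitrary placeholders.

```python
def build_gps_input_fields(time_usec, time_week, time_week_ms,
                           lat_deg, lon_deg, alt_m, vel_ned_mps,
                           horiz_accuracy_m, vert_accuracy_m, speed_accuracy_mps,
                           fix_type=3, hdop=1.0, vdop=1.0, satellites_visible=10,
                           gps_id=0, ignore_flags=0, yaw_cdeg=0):
    """Return GPS_INPUT fields as a dict ordered like the gps_input_send() signature."""
    vn, ve, vd = vel_ned_mps
    return {
        "time_usec": int(time_usec),
        "gps_id": gps_id,
        "ignore_flags": ignore_flags,
        "time_week_ms": int(time_week_ms),
        "time_week": int(time_week),
        "fix_type": fix_type,
        "lat": round(lat_deg * 1e7),   # assumed int32 degE7
        "lon": round(lon_deg * 1e7),   # assumed int32 degE7
        "alt": float(alt_m),           # assumed metres
        "hdop": float(hdop),           # placeholder default
        "vdop": float(vdop),           # placeholder default
        "vn": float(vn),
        "ve": float(ve),
        "vd": float(vd),
        "speed_accuracy": float(speed_accuracy_mps),
        "horiz_accuracy": float(horiz_accuracy_m),
        "vert_accuracy": float(vert_accuracy_m),
        "satellites_visible": satellites_visible,  # placeholder default
        "yaw": yaw_cdeg,               # assumed centidegrees, 0 = not available
    }


fields = build_gps_input_fields(
    time_usec=1, time_week=2400, time_week_ms=1000,
    lat_deg=50.4501234, lon_deg=-30.5234, alt_m=120.0,
    vel_ned_mps=(1.0, 2.0, -0.5),
    horiz_accuracy_m=5.0, vert_accuracy_m=8.0, speed_accuracy_mps=0.5,
)
# With a live pymavlink connection this would be sent as:
#   master.mav.gps_input_send(*fields.values())
```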
|
||||
|
||||
### Source #107 — ArduPilot Plane Non-GPS Position Estimation + MAVProxy GPS Input module documentation
|
||||
- **Tier**: L1 (official ArduPilot dev docs portal; documented configuration + canonical injection example)
|
||||
- **Found via**: web search for `pymavlink GPS_INPUT msg 232 example ArduPilot Plane non-GPS external positioning companion computer 2025`
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **URLs**:
|
||||
- https://ardupilot.org/dev/docs/mavlink-nongps-position-estimation.html
|
||||
- https://ardupilot.org/plane/docs/common-non-gps-navigation-landing-page.html
|
||||
- https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
|
||||
- https://ardupilot.org/plane/docs/common-companion-computers.html
|
||||
- **Critical configuration captured**: `GPS1_TYPE = 14` (MAVLink) is required on the FC for `GPS_INPUT` ingestion. Without this parameter set, AP_GPS will not accept the message. `EK3_SRC1_POSXY = 3` (GPS) selects the GPS_INPUT-fed virtual GPS as the primary horizontal-position source. Per ArduPilot dev docs, the **preferred method** for non-GPS navigation is `ODOMETRY` or `VISION_POSITION_ESTIMATE` at ≥4 Hz — but `GPS_INPUT` remains supported and is the right choice when the project's outcome contract is "WGS84 coordinates as a real-GPS replacement" (AC-4.3 wording aligns with GPS_INPUT semantics, not ODOMETRY semantics).
|
||||
- **Cross-cite**: SQ6 Fact #1 (AP_GPS_MAV.cpp ingestion path) + SQ6 Fact #4 (`ODOMETRY`-velocity-only NOT supported) — together these pin `GPS_INPUT` as the right transport for the project's `{satellite_anchored, visual_propagated, dead_reckoned}` source-label scheme.
|
||||
- **Per-mode capability verification**: ✅ All required ACs (AC-4.3 / AC-NEW-2 / AC-NEW-4 / AC-NEW-8) map directly into GPS_INPUT field semantics per SQ6 working summary table.
|
||||
|
||||
### Source #108 — pyubx2 (context7-indexed `/semuconsulting/pyubx2` + canonical GitHub README)
|
||||
- **Tier**: L1 (canonical Python UBX/NMEA/RTCM3 parser; benchmark score 86.8 in context7; 139 code snippets)
|
||||
- **Found via**: context7 `resolve-library-id` → `/semuconsulting/pyubx2` → `query-docs` for UBX-NAV-PVT message construction with full attribute control + serialize-to-bytes pattern for UART transmission
|
||||
- **Library posture**: BSD-3-Clause license (clean, dual-use compatible); semuconsulting publishes both the canonical GitHub repo + comprehensive readthedocs.io documentation also indexed in context7 as `/websites/semuconsulting_pyubx2` (239 additional code snippets, benchmark 85.2). The library supports `UBXMessage(ubxClass, ubxID, mode, **kwargs)` constructor with three modes: `GET (0x00)` for output from the receiver, `SET (0x01)` for command input, `POLL (0x02)` for query input. NAV-PVT belongs to the GET output set.
|
||||
- **Critical-novelty-sensitivity**: Library/SDK API behaviour — must reflect currently shipped version; semuconsulting/pyubx2 is daily-active (last released 2025).
|
||||
- **Per-mode capability verification (context7-confirmed)**: ✅ NAV-PVT message construction with all UBX-NAV-PVT fields supported as keyword arguments per `UBXMessage('NAV', 'NAV-PVT', GET, iTOW=..., year=..., lon=..., lat=..., height=..., hMSL=..., hAcc=..., vAcc=..., velN=..., velE=..., velD=..., gSpeed=..., headMot=..., sAcc=..., headAcc=..., pDOP=..., fixType=..., flags=..., numSV=..., valid=...)`. ✅ `serialize()` method returns the full UBX wire-format bytestring (sync-bytes 0xB5 0x62 + class + ID + length + payload + 8-bit Fletcher checksum). ✅ `parsebitfield=1` mode allows individual bit attributes for `flags` (e.g., `gnssFixOK`, `diffSoln`, `psmState`) and `valid` (e.g., `validDate`, `validTime`, `fullyResolved`, `validMag`) — required for the impersonation path to set the `gnssFixOK` bit that iNav's `gpsMapFixType()` validates.
|
||||
- **Used to support**: Fact #98 (iNav UBX impersonation alternate candidate).
|
||||
|
||||
### Source #109 — u-blox NEO-M9N Integration Manual (UBX-19014286) + u-blox 8/M8 Receiver Description (UBX-13003221) — UBX-NAV-PVT canonical specification
|
||||
- **Tier**: L1 (vendor-authoritative protocol specification PDFs)
|
||||
- **Found via**: web search for `UBX-NAV-PVT frame structure spec u-blox protocol M8 M9 fix type fabricate inject iNav 2025`
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **URLs**:
|
||||
- https://content.u-blox.com/sites/default/files/NEO-M9N_Integrationmanual_UBX-19014286.pdf
|
||||
- https://content.u-blox.com/sites/default/files/products/documents/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf
|
||||
- **Frame structure captured**: NAV-PVT (class=0x01, ID=0x07) carries 92-byte payload — `iTOW (u32 ms)` + `year (u16)` + `month/day/hour/min/sec (u8 each)` + `valid (u8 bitmask)` + `tAcc (u32 ns)` + `nano (i32 ns)` + `fixType (u8 enum: 0=NoFix, 1=DeadReck, 2=2D, 3=3D, 4=GNSS+DR, 5=TimeOnly)` + `flags (u8 bitmask incl. gnssFixOK bit 0)` + `flags2 (u8)` + `numSV (u8)` + `lon (i32 deg×1e-7)` + `lat (i32 deg×1e-7)` + `height (i32 mm above ellipsoid)` + `hMSL (i32 mm above mean sea level)` + `hAcc (u32 mm)` + `vAcc (u32 mm)` + `velN/velE/velD (i32 each mm/s)` + `gSpeed (i32 mm/s)` + `headMot (i32 deg×1e-5)` + `sAcc (u32 mm/s)` + `headAcc (u32 deg×1e-5)` + `pDOP (u16 ×0.01)` + reserved bytes + `headVeh (i32)` + `magDec (i16)` + `magAcc (u16)`. M9N supersedes M8 with refined NAV-PVT semantics; both are accepted by iNav 9.0 (per Source #11 in SQ6 — UBX ≥ 15.00 protocol version).
|
||||
- **Critical-novelty-sensitivity**: Established baseline + library/SDK API behaviour — u-blox NAV-PVT is a stable protocol surface since u-blox 8 (2014); minor field semantics evolve across vendor protocol versions, so exact wire format must be checked against the iNav-target version (iNav 9.0 expects ≥ 15.00).
|
||||
- **Per-mode capability verification**: ✅ NAV-PVT contains all fields needed for iNav's `gpsMapFixType()` validation (Source #110 cross-cite): `flags` byte bit 0 `gnssFixOK` + `fixType` enum + `numSV` + `hAcc/vAcc` for AC-NEW-4 covariance honesty.
|
||||
- **Used to support**: Fact #98 (iNav UBX impersonation alternate candidate) NAV-PVT frame fabrication spec.
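
The frame layout captured above can be assembled with the standard library alone, which is useful as a cross-check against pyubx2 output. The sketch below is an assumption-laden illustration: the function names are hypothetical, the six reserved bytes are zero-filled, `tAcc`, `nano`, `headMot`, `sAcc`, `headAcc`, `headVeh`, `magDec`, and `magAcc` are set to 0, and `pDOP` is a placeholder of 1.00.

```python
import math
import struct

UBX_SYNC = b"\xb5\x62"
NAV_CLASS, NAV_PVT_ID = 0x01, 0x07
# Little-endian NAV-PVT payload following the field list above (92 bytes, 6 reserved).
NAV_PVT_FORMAT = "<IH5BBIiBBBB4i2I5i2IH6xihH"


def ubx_checksum(body):
    """8-bit Fletcher checksum over class, ID, length, and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class, msg_id, payload):
    """Wrap a payload in sync bytes, header, and checksum."""
    body = struct.pack("<BBH", msg_class, msg_id, len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)


def nav_pvt_payload(itow_ms, utc, lat_deg, lon_deg, height_mm, hmsl_mm,
                    h_acc_mm, v_acc_mm, vel_ned_mm_s, num_sv,
                    fix_type=3, flags=0x01, valid=0x07):
    """Pack a NAV-PVT payload; flags bit 0 is gnssFixOK, valid 0x07 sets date/time/resolved."""
    year, month, day, hour, minute, second = utc
    vel_n, vel_e, vel_d = vel_ned_mm_s
    g_speed = round(math.hypot(vel_n, vel_e))
    return struct.pack(
        NAV_PVT_FORMAT,
        itow_ms, year, month, day, hour, minute, second, valid,
        0, 0,                              # tAcc, nano (placeholders)
        fix_type, flags, 0, num_sv,        # flags2 = 0
        round(lon_deg * 1e7), round(lat_deg * 1e7), height_mm, hmsl_mm,
        h_acc_mm, v_acc_mm,
        vel_n, vel_e, vel_d, g_speed, 0,   # headMot = 0 (placeholder)
        0, 0,                              # sAcc, headAcc (placeholders)
        100,                               # pDOP = 1.00 (placeholder)
        0, 0, 0,                           # headVeh, magDec, magAcc
    )


payload = nav_pvt_payload(
    itow_ms=1000, utc=(2026, 5, 8, 12, 0, 0),
    lat_deg=50.45, lon_deg=30.52, height_mm=150000, hmsl_mm=120000,
    h_acc_mm=5000, v_acc_mm=8000, vel_ned_mm_s=(3000, 4000, 0), num_sv=12,
)
frame = ubx_frame(NAV_CLASS, NAV_PVT_ID, payload)
```

The resulting frame is 100 bytes: 2 sync bytes, a 4-byte header, the 92-byte payload, and the 2-byte checksum.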
|
||||
|
||||
### Source #110 — iNav `gps_ublox.c` source (master, GitHub) — UBX validation gates that the impersonation must pass
|
||||
- **Tier**: L1 (canonical iNav firmware source, master branch, accessed via cached web fetch)
|
||||
- **Found via**: web search for `iNav GPS UBX validation fixType numSat hDOP threshold reject GNSS spoofing companion computer 2025`
|
||||
- **URL**: https://github.com/iNavFlight/inav/blob/master/src/main/io/gps_ublox.c
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **Critical-novelty-sensitivity**: Library/SDK API behaviour — must reflect current shipped iNav version. iNav 9.0 master (post-2025-12-11 wiki update per SQ6 Source #10) confirmed via direct file read.
|
||||
- **Validation logic captured (line-numbered evidence)**:
|
||||
- **Lines 215-220**: `gpsMapFixType(fixValid, ubloxFixType)` returns `GPS_FIX_2D` if `fixValid && ubloxFixType == FIX_2D`, returns `GPS_FIX_3D` if `fixValid && ubloxFixType == FIX_3D`, otherwise `GPS_NO_FIX`. **THIS IS THE GATE** the impersonation must pass.
|
||||
- **Line 654**: NAV-PVT path computes `next_fix_type = gpsMapFixType(_buffer.pvt.fix_status & NAV_STATUS_FIX_VALID, _buffer.pvt.fix_type)`. The `fix_status & NAV_STATUS_FIX_VALID` masks the lowest bit of NAV-PVT's `flags` byte (bit 0 = `gnssFixOK`).
|
||||
- **Lines 656-683**: NAV-PVT-driven full state population including `lon (1e-7 deg)`, `lat (1e-7 deg)`, `altitude_msl (mm)`, NED velocity (mm/s converted to cm/s), `speed_2d (mm/s)`, `heading_2d (deg×1e-5 → deg×10)`, `satellites`, `horizontal_accuracy (mm)`, `vertical_accuracy (mm)`, `position_DOP`, valid date/time bits.
|
||||
- **Lines 1024-1060**: Configuration logic — for u-blox version ≥ 15.0 (iNav 9.0+), iNav configures NAV-PVT-only via `configureMSG(MSG_CLASS_UBX, MSG_PVT, 1)`. For older receivers, it configures the legacy NAV-POSLLH + NAV-SOL + NAV-VELNED + NAV-TIMEUTC quad. **Implication**: the companion impersonator should advertise version ≥ 15.0 via MON-VER (CLASS=0x0A, ID=0x04) to drive iNav into the simpler NAV-PVT-only protocol.
|
||||
- **Per-mode capability verification**: ✅ Validation gate fully decoded; impersonation viability confirmed at the firmware-source level (no opaque downstream filter discovered).
|
||||
- **Used to support**: Fact #98 — provides the iNav-firmware-side validation contract that the UBX impersonation must satisfy.
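
The validation gate described above is small enough to mirror in the companion's test suite. The following Python paraphrase is a sketch based only on the logic quoted in this entry, not a port of the firmware: the constant values for the u-blox fix types follow the `fixType` enum listed under Source #109, and the string return values stand in for iNav's enum.

```python
UBX_FIX_2D = 2
UBX_FIX_3D = 3
NAV_STATUS_FIX_VALID = 0x01  # bit 0 of the NAV-PVT flags byte (gnssFixOK)


def gps_map_fix_type(fix_valid, ublox_fix_type):
    """Mirror of the described gate: only a valid 2D or 3D fix is accepted."""
    if fix_valid and ublox_fix_type == UBX_FIX_2D:
        return "GPS_FIX_2D"
    if fix_valid and ublox_fix_type == UBX_FIX_3D:
        return "GPS_FIX_3D"
    return "GPS_NO_FIX"


def fix_from_nav_pvt(flags, fix_type):
    """Apply the gate to the raw NAV-PVT flags byte and fixType value."""
    return gps_map_fix_type(bool(flags & NAV_STATUS_FIX_VALID), fix_type)
```

Under this reading, a frame with `fixType = 3` but `gnssFixOK` cleared is treated as no fix, and so is any `fixType` other than 2 or 3 even when `gnssFixOK` is set.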
|
||||
|
||||
### Source #111 — iNav `docs/development/msp/README.md` (master, GitHub) — MSP2_SENSOR_GPS canonical payload specification
|
||||
- **Tier**: L1 (canonical iNav protocol-reference documentation, master branch, accessed via cached web fetch)
|
||||
- **Found via**: web search for `MSP2_SENSOR_GPS Python library iNav msp2 protocol companion computer external GPS injection 2025 2026`
|
||||
- **URL**: https://github.com/iNavFlight/inav/blob/master/docs/development/msp/README.md
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **Payload structure captured (line 2999-3031 of the master README)**: `MSP2_SENSOR_GPS (7939 / 0x1F03)` — request payload 36 bytes containing `instance (u8)` + `gpsWeek (u16)` + `msTOW (u32 ms)` + `fixType (u8 = gpsFixType_e)` + `satellitesInView (u8)` + `hPosAccuracy (u16 mm)` + `vPosAccuracy (u16 mm)` + `hVelAccuracy (u16 cm/s)` + `hdop (u16 ×0.01)` + `longitude (i32 deg×1e7)` + `latitude (i32 deg×1e7)` + `mslAltitude (i32 cm)` + `nedVelNorth (i32 cm/s)` + `nedVelEast (i32 cm/s)` + `nedVelDown (i32 cm/s)` + `groundCourse (u16 deg×100)` + `trueYaw (u16 deg×100, 65535 = unavailable)` + `year (u16)` + `month/day/hour/min/sec (u8 each)`. **Reply payload: None.** **Notes: Requires `USE_GPS_PROTO_MSP`. Calls `mspGPSReceiveNewData()`.**
|
||||
- **Critical-novelty-sensitivity**: Library/SDK API behaviour — verified against iNav master (post-9.0).
|
||||
- **Per-mode capability verification**: ✅ Full payload spec covers all AC-NEW-4 covariance honesty fields (`hPosAccuracy`, `vPosAccuracy`, `hVelAccuracy`); ✅ AC-NEW-8 graceful-degrade signal carried via `fixType` enum (`gpsFixType_e`) — companion can emit `GPS_NO_FIX` (0) or `GPS_FIX_2D` (1) for the "covariance >100 m" / "covariance >500 m" thresholds; ✅ AC-1.4 95% covariance proxy carried in `hPosAccuracy`.
|
||||
- **Used to support**: Fact #99 (iNav MSP2_SENSOR_GPS primary candidate).
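
The payload fields listed above map directly onto a packed little-endian struct. The sketch below packs them in the listed order with the listed units; `pack_msp2_sensor_gps` and `clamp_u16` are hypothetical helpers. Note that the field widths as listed sum to 52 bytes, not the 36 bytes quoted in the entry, so the payload length should be re-checked against the firmware struct before relying on either figure.

```python
import struct

# Field order and widths as listed in the payload description (little-endian, packed).
MSP2_SENSOR_GPS_STRUCT = struct.Struct("<BHIBBHHHHiiiiiiHHHBBBBB")
TRUE_YAW_UNAVAILABLE = 0xFFFF


def clamp_u16(value):
    """Round and saturate to the unsigned 16-bit range."""
    return max(0, min(0xFFFF, round(value)))


def pack_msp2_sensor_gps(gps_week, ms_tow, fix_type, sats,
                         h_acc_m, v_acc_m, h_vel_acc_mps, hdop,
                         lat_deg, lon_deg, alt_msl_m, vel_ned_mps,
                         course_deg, utc, true_yaw_deg=None, instance=0):
    """Pack one MSP2_SENSOR_GPS payload from SI-unit inputs."""
    vel_n, vel_e, vel_d = (round(v * 100) for v in vel_ned_mps)  # m/s -> cm/s
    year, month, day, hour, minute, second = utc
    if true_yaw_deg is None:
        yaw = TRUE_YAW_UNAVAILABLE
    else:
        yaw = round(true_yaw_deg * 100) % 36000
    return MSP2_SENSOR_GPS_STRUCT.pack(
        instance, gps_week, ms_tow, fix_type, sats,
        clamp_u16(h_acc_m * 1000),        # hPosAccuracy, mm
        clamp_u16(v_acc_m * 1000),        # vPosAccuracy, mm
        clamp_u16(h_vel_acc_mps * 100),   # hVelAccuracy, cm/s
        clamp_u16(hdop * 100),            # hdop x0.01
        round(lon_deg * 1e7), round(lat_deg * 1e7),
        round(alt_msl_m * 100),           # mslAltitude, cm
        vel_n, vel_e, vel_d,
        round(course_deg * 100) % 36000,  # groundCourse, deg x100
        yaw,
        year, month, day, hour, minute, second,
    )


gps_payload = pack_msp2_sensor_gps(
    gps_week=2400, ms_tow=1000, fix_type=2, sats=12,
    h_acc_m=100.0, v_acc_m=8.0, h_vel_acc_mps=0.5, hdop=1.2,
    lat_deg=50.45, lon_deg=30.52, alt_msl_m=120.0,
    vel_ned_mps=(3.0, 4.0, 0.0), course_deg=53.13,
    utc=(2026, 5, 8, 12, 0, 0),
)
```

Saturating the accuracy fields matters here: a 100 m horizontal accuracy is 100000 mm, which does not fit in a `u16` and is clamped to 65535.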
|
||||
|
||||
### Source #112 — Python MSP2 implementations: YAMSPy + INAV-Toolkit `inav_msp.py`
|
||||
- **Tier**: L2 (community implementations; NOT vendor-canonical but actively maintained)
|
||||
- **Found via**: web search for Python MSP2_SENSOR_GPS libraries; iNav Issue #4465 confirms YAMSPy as community-recommended; agoliveira/INAV-Toolkit confirmed via direct GitHub source read
|
||||
- **URLs**:
|
||||
- YAMSPy mention: https://github.com/iNavFlight/inav/issues/4465
|
||||
- INAV-Toolkit `inav_msp.py`: https://github.com/agoliveira/INAV-Toolkit/blob/5c4ef789068399b4dc7461b71c6f71c25aef5e4e/inav_msp.py
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **Library posture**:
|
||||
- **YAMSPy** (`thecognifly/YAMSPy`): MIT-licensed Python library with explicit MSP V2 support; community-blessed for iNav external-device communication per the iNav issue thread.
|
||||
- **INAV-Toolkit `inav_msp.py`**: 951-line MIT-licensed module implementing `msp_v2_encode(cmd, payload)` + `msp_v2_decode(buffer)` with CRC-8 DVB-S2 checksumming + serial transport. Direct primary-source implementation reference for MSP V2 frame construction.
|
||||
- **Critical-novelty-sensitivity**: Library/SDK API behaviour — both libraries are recent (post-2024 commits). **Risk**: community libraries may lag the iNav protocol surface (e.g., MSP V2 sensor message range 0x1F00-0x1FFF was added later than the original MSP V2 baseline). The project may need to either (a) extend the chosen community library with MSP2_SENSOR_GPS-specific encoding helpers, or (b) implement a thin custom encoder using the canonical msp_v2_encode primitive — both paths verified feasible from primary sources.
|
||||
- **License notes**: MIT throughout — clean dual-use compatible.
|
||||
- **Per-mode capability verification**: ⚠️ MSP V2 frame envelope (0x24 + 'X' + 0x3C + flag + cmd_lo + cmd_hi + len_lo + len_hi + payload + CRC8-DVB-S2) confirmed via INAV-Toolkit primary source; ✅ MSP2_SENSOR_GPS payload structure confirmed via Source #111. Combining the two yields a complete companion-side encoder for the iNav primary path.
|
||||
- **Used to support**: Fact #99 (iNav MSP2_SENSOR_GPS primary candidate, Python implementation path).
|
||||
|
||||
### Source #113 — iNav `src/main/msp/msp_protocol_v2_sensor.h` (master, GitHub) — MSP2 sensor command-ID range
|
||||
- **Tier**: L1 (canonical iNav firmware source, master branch)
|
||||
- **Found via**: web search co-result with Source #112; opens via the `msp_protocol_v2_sensor.h` direct link
|
||||
- **URL**: https://github.com/iNavFlight/inav/blob/master/src/main/msp/msp_protocol_v2_sensor.h
|
||||
- **Date accessed**: 2026-05-08
|
||||
- **Critical fact captured**: `MSP2_SENSOR_GPS = 0x1F03` (= 7939 decimal); MSP V2 sensor-message range `0x1F00-0x1FFF` is reserved for sensor injection plugins. iNav 9.0 master expectation: MSP2 frame must use the MSP V2 envelope (sync = 0x24 0x58 0x3C; flag = 0x00; cmd = LE 16-bit; len = LE 16-bit; CRC = CRC-8 DVB-S2 over flag through end of payload).
|
||||
- **Per-mode capability verification**: ✅ MSP2_SENSOR_GPS = 0x1F03 confirmed at source; ✅ MSP V2 envelope spec confirmed.
|
||||
- **Used to support**: Fact #99 — provides the canonical MSP V2 sensor-message-range definition.
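
The envelope described above can be sketched in a few lines, assuming only what the entry states: sync bytes 0x24 0x58 0x3C, a one-byte flag, little-endian 16-bit command and length, and a CRC-8 DVB-S2 (polynomial 0xD5) over flag through end of payload. The names `crc8_dvb_s2` and `encode_msp_v2_frame` are hypothetical and are not the INAV-Toolkit functions.

```python
import struct

MSP2_SENSOR_GPS = 0x1F03  # command ID quoted in the entry above


def crc8_dvb_s2(data, crc=0):
    """Bitwise CRC-8 with polynomial 0xD5 (DVB-S2), initial value 0."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_msp_v2_frame(cmd, payload=b"", flag=0):
    """Build '$X<' + flag + cmd (LE16) + len (LE16) + payload + CRC."""
    body = struct.pack("<BHH", flag, cmd, len(payload)) + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


# A 52-byte placeholder payload stands in for a packed GPS sample.
gps_frame = encode_msp_v2_frame(MSP2_SENSOR_GPS, bytes(52))
```

The frame overhead is 9 bytes (3 sync, 5 header, 1 CRC), so a frame carrying a 52-byte payload is 61 bytes on the wire. For command 100 with an empty payload the encoder produces `24 58 3c 00 64 00 00 00 8f`.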
|
||||
@@ -0,0 +1,179 @@
|
||||
# Source Registry — SQ1 — Existing / competitor GPS-denied UAV navigation systems
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> This file replaces a section of the previous monolithic `01_source_registry.md`. See `00_summary.md` for the full category index. Investigation order is tracked in `../00_question_decomposition.md` and the cross-category Investigation Status table in `00_summary.md`.
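
The time windows above reduce to a simple date comparison. The sketch below assumes that "within N months" means at most N whole calendar months between publication and access; that interpretation, and the function names, are illustrative assumptions rather than part of the methodology text.

```python
from datetime import date


def months_between(published, accessed):
    """Whole calendar months elapsed from publication to access."""
    months = (accessed.year - published.year) * 12 + (accessed.month - published.month)
    if accessed.day < published.day:
        months -= 1  # the final month is not yet complete
    return months


def timeliness(published, accessed, established_baseline=False):
    """Classify a source against the 6-month and 18-month windows."""
    if established_baseline:
        return "no time window"
    age = months_between(published, accessed)
    if age <= 6:
        return "within 6-month critical-novelty window"
    if age <= 18:
        return "within 18-month authority window"
    return "outside both windows"


# Dates taken from Sources #25 and #27 in this file.
status_25 = timeliness(date(2026, 1, 28), date(2026, 5, 7))
status_27 = timeliness(date(2025, 5, 17), date(2026, 5, 7))
```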
|
||||
|
||||
---
|
||||
|
||||
### Source #25
|
||||
- **Title**: Twist Robotics develops OSCAR — a GPS-independent visual navigation system for drones resistant to electronic warfare equipment
|
||||
- **Link**: https://www.pravda.com.ua/eng/news/2026/01/28/8018266/
|
||||
- **Tier**: L2 (national newspaper of record reporting on a Technology Forces of Ukraine release; primary press is the Technology Forces of Ukraine FB post)
|
||||
- **Publication Date**: 2026-01-28 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (within 6-month critical-novelty window)
|
||||
- **Target Audience**: Ukraine-deployment practitioners; UAV companion-system designers
|
||||
- **Research Boundary Match**: **Full match** — Ukrainian fixed-wing-class UAV, GPS-denied, vision-based, deployed in active conflict
|
||||
- **Summary**: Twist Robotics (UA) deployed OSCAR ("Optical System of Coordinates with Automatic Relocalisation") — camera + landmark-matching + map → the autopilot ingests the output as a "reliable GPS signal". Vendor claims: 20 m accuracy without cumulative error, day/night/fog operation, 500,000 km logged across 25,000 combat missions over 24 months of development, AI-augmented + Obrii proprietary simulator for training. Note: hardware photo shows active cooling on the module — implies non-trivial compute (probably Jetson-class). **No public independent benchmark.** Closest deployed peer system to this project.
|
||||
- **Related Sub-question**: SQ1 (closest peer); also informs SQ8 (anti-spoofing claims), SQ9 (synthesis)
|
||||
|
||||
|
||||
### Source #26
|
||||
- **Title**: Ukraine Gives Drones Vision-Based Navigation to Push Past Heavy Jamming — The Defense Post
|
||||
- **Link**: https://thedefensepost.com/2026/01/29/ukraine-drones-vision-navigation/
|
||||
- **Tier**: L2 (defense-trade publication; corroborates Source #25 with a second-party byline)
|
||||
- **Publication Date**: 2026-01-29 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense-policy / procurement readership
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms OSCAR is operational, terrain-imagery-against-mapped-landmarks pattern, autopilot-ingestion. Adds "live imagery" framing. No new technical detail beyond Source #25.
|
||||
- **Related Sub-question**: SQ1
|
||||
|
||||
|
||||
### Source #27
|
||||
- **Title**: Ukraine's Ruta Missile Drone Will Get an EW-Immune Navigation System — Defense Express
|
||||
- **Link**: https://en.defence-ua.com/weapon_and_tech/ukraines_ruta_missile_drone_will_get_an_ew_immune_navigation_system-14541.html
|
||||
- **Tier**: L2 (defense-trade publication, Ukraine-domestic)
|
||||
- **Publication Date**: 2025-05-17 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (within 18-month authority window)
|
||||
- **Target Audience**: Defense-procurement / industry analysts
|
||||
- **Research Boundary Match**: Partial — operational profile (cruise-missile-class, terminal guidance) differs from our 8-h fixed-wing surveillance/strike profile; technique class is closely related (DSMAC pattern)
|
||||
- **Summary**: Destinus Ruta (Ukrainian-Swiss origin; ~300 km strike range, miniature cruise missile) will integrate a navigation system from UAV Navigation (Spanish, Grupo Oesía). Defense Express infers DSMAC-style operating principle: "takes images of surface mid-flight, identifies location through comparison with reference". Vendor announcement notes validation in Ukrainian combat conditions including GNSS-denied / jamming / spoofing. Establishes that the cruise-missile-tier vision-nav pattern is now being miniaturised for ~300 km strike drones.
|
||||
- **Related Sub-question**: SQ1 (commercial/military landscape)
|
||||
|
||||
|
||||
### Source #28
|
||||
- **Title**: Kilometer-Scale GNSS-Denied UAV Navigation via Heightmap Gradients: A Winning System from the SPRIN-D Challenge
|
||||
- **Link**: https://arxiv.org/abs/2510.01348
|
||||
- **Tier**: L1 (peer-style preprint, full system description, real flight data, competition results)
|
||||
- **Publication Date**: October 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: arXiv v1 (2510.01348v1)
|
||||
- **Target Audience**: GNSS-denied UAV system designers (academic + practitioner)
|
||||
- **Research Boundary Match**: **Partial — different regime.** Multirotor (≤25 kg), <25 m AGL, LiDAR-equipped, no satellite-tile basemap; 9 km waypoint mission. Our project is fixed-wing, ~1 km AGL, no LiDAR, monocular + sat-tile basemap. **Architectural pattern transfers; specific algorithm does NOT** (heightmap gradients require LiDAR).
|
||||
- **Summary**: CTU Prague team won SPRIN-D Funke Fully Autonomous Flight Challenge with: VIO (OpenVINS) + LiDAR-derived local heightmap + gradient template matching against open-data DEM + clustered K-means particle filter, all on Intel NUC i7 16 GB CPU-only (no GPU). Achieved RMSE <11 m over kilometer-scale flights vs ≤53 m for raw odometry. Critical observations explicitly stated:
|
||||
- **RTAB-Map and ORB-SLAM3 both fail** beyond 1 km / above 2 m/s flight (compute/memory) and ORB-SLAM3 loses tracking in textureless areas — directly applicable to our 17 m/s cruise over agricultural steppe.
|
||||
- **"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."** This is a low-altitude (<25 m AGL) finding; our 1 km AGL operates in the high-altitude regime where the same paper notes RGB sat-matching "works reasonably well" (refs [5][6]).
|
||||
- Lesson: "ability to recover from periods of high uncertainty and re-localize is more critical than maintaining consistently low instantaneous RMSE." Direct architectural input for AC-NEW-2 / AC-NEW-8.
|
||||
- Lesson: IMU-from-airframe vibration isolation is mission-critical for VIO usability.
|
||||
- Lesson: magnetometer is unreliable near steel-reinforced structures; sensor-fusion is essential for heading robustness.
|
||||
- **Related Sub-question**: SQ1 + SQ5 (failure modes for VIO/SLAM at speed) + SQ2 (canonical pipeline)
|
||||
|
||||
|
||||
### Source #29
|
||||
- **Title**: Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints
|
||||
- **Link**: https://arxiv.org/abs/2506.09748 (PDF: https://arxiv.org/pdf/2506.09748)
|
||||
- **Tier**: L1 (peer-submitted preprint, IEEE-bound, with public CS-UAV dataset)
|
||||
- **Publication Date**: June 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (published ~11 months before access: outside the 6-month critical-novelty window for SOTA claims, within the 18-month authority window)
|
||||
- **Version Info**: arXiv v1 (2506.09748v1)
|
||||
- **Target Audience**: Academic SOTA researchers + UAV-localization implementers
|
||||
- **Research Boundary Match**: **Full match** — exact same problem (UAV absolute visual localization in GNSS-denied conditions, downward-facing camera, satellite reference)
|
||||
- **Summary**: 2025 SOTA pipeline: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation), (2) Semantic-Aware and Structure-Constrained Matching Module using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv, (3) lightweight fine-grained module for pixel-level. Constructs UAV absolute visual-loc pipeline **without VIO/relative-loc dependence** (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV. **Direct relevance**: this is a candidate template for our C2 (VPR) + C3 (cross-domain registration) components, but DINOv2 is a heavyweight foundation model — must be benchmarked under our 25 W / 8 GB Jetson Orin Nano envelope before selection (handed off to SQ3/SQ4 + SQ5 for that component).
|
||||
- **Related Sub-question**: SQ1 (academic SOTA), SQ3+SQ4 (C2/C3 candidates), SQ5 (Jetson-on-Foundation-Model failure mode)
|
||||
|
||||
|
||||
### Source #30
|
||||
- **Title**: Raptor — GPS-Denied UAV Navigation & Coordinate Extraction (Vantor product page; Guide / Sync / Ace suite)
|
||||
- **Link**: https://www.vantor.com/product/mission-solutions/raptor/
|
||||
- **Tier**: L2 (vendor product spec; primary for the product itself, not for independent benchmark numbers)
|
||||
- **Publication Date**: live (accessed 2026-05-07; references Mar 2026 + Dec 2025 + Sep 2025 partner blog posts indicating active product line)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense / commercial / industrial UAV integrators
|
||||
- **Research Boundary Match**: **Full match** — vision-based aerial position software using existing camera + 3D terrain data, deployable on commodity hardware
|
||||
- **Summary**: Vantor Raptor product family: **Guide** (on-drone vision-based positioning, demonstrated <7 m absolute accuracy in all dimensions, day/night/low-altitude, runs on commodity HW); **Sync** (georegisters live drone video against 3D terrain in real time, <3 m coordinate extraction); **Ace** (laptop-side coordinate extraction at <3 m). Backbone: Vantor's "100 million-plus sq km of highly accurate 3D terrain data, regularly updated" (Vivid Terrain, 3 m accuracy). Inertial Labs partnership (VINS-integrated Raptor Guide). Use cases include joint multi-domain ops, large-scale autonomous delivery, search-and-rescue. **This is the closest production-grade commercial peer to the project's architecture (sat-basemap-as-service + on-drone vision).**
|
||||
- **Related Sub-question**: SQ1 (commercial), SQ3+SQ4 (commercial alternatives to building C2/C3 ourselves), SQ8 (basemap as a service vs offline cache)
|
||||
|
||||
|
||||
### Source #31
|
||||
- **Title**: Auterion successfully completes Artemis program to deliver long-range deep strike drone (press release)
|
||||
- **Link**: https://auterion.com/auterion-successfully-completes-artemis-program-to-deliver-long-range-deep-strike-drone/
|
||||
- **Tier**: L1 (official vendor press release)
|
||||
- **Publication Date**: 2025-10-15 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Defense-procurement; UAV-integration architects
|
||||
- **Research Boundary Match**: **Full match** — fixed-wing-class one-way attack drone with Ukraine-validated GPS-denied navigation; the system architecture is directly comparable
|
||||
- **Summary**: Auterion Artemis (DIU project, completed Oct 2025) = Shahed-style design developed in Ukraine; up to 1,000-mile range; up to 40 kg warhead; runs on Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. **Establishes that the integration pattern (companion-class autopilot + visual navigation + terminal guidance) is shipping at production scale to a US defense customer.** Open architecture, manufacturing in US/UA/DE.
|
||||
- **Related Sub-question**: SQ1
|
||||
|
||||
|
||||
### Source #32
|
||||
- **Title**: Bring AI and computer vision to small autonomous systems — Auterion Skynode S product page
|
||||
- **Link**: https://auterion.com/product/skynode-s
|
||||
- **Tier**: L2 (vendor product spec)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Small-UAS integrators
|
||||
- **Research Boundary Match**: Full match (companion-class autopilot with NPU)
|
||||
- **Summary**: Auterion Skynode S = compact mission computer with **dedicated Neural Processing Unit** for AI / computer-vision applications on small UAS. Architecturally the same niche our Jetson Orin Nano Super sits in (companion compute + autopilot integration), but with Auterion's PX4 fork pre-integrated. Hardware/runtime envelope is comparable; the product establishes that this is a product category, not a one-off integration.
|
||||
- **Related Sub-question**: SQ1, SQ7 (alternate companion HW for adjacent context)
|
||||
|
||||
|
||||
### Source #33
|
||||
- **Title**: snktshrma/ngps_flight — Next-Generation Positioning System for ArduPilot (GSoC 2024)
|
||||
- **Link**: https://github.com/snktshrma/ngps_flight (sibling: https://github.com/snktshrma/ap_nongps)
|
||||
- **Tier**: L1 (open-source code repository, published GSoC project under ArduPilot organisation)
|
||||
- **Publication Date**: GSoC 2024 timeframe (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: GSoC 2024 prototype (research-grade, not production firmware)
|
||||
- **Target Audience**: ArduPilot integrators building visual-positioning companion stacks
|
||||
- **Research Boundary Match**: **Full match — closest open-source peer to our exact pipeline.** ArduPilot, downward-facing camera, satellite-image reference, deep-learning matching, fused with VIO, fed back to autopilot.
|
||||
- **Summary**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**); **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates); **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF via `VISION_POSITION_ESTIMATE` (per the related issue #23471 framing). **This is the architectural template** the project should explicitly compare against — same component split as our C1+C2+C3+C5+C8 stack.
|
||||
- Caveats: (a) GSoC prototype, not production-hardened; (b) uses `VISION_POSITION_ESTIMATE`, which on AP requires an EKF source set (2 or 3) with `EK3_SRCn_POSXY` = 6 (ExternalNav); our SQ6 conclusion picked `GPS_INPUT` as primary AP path because it carries `horiz_accuracy` directly and supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET` — must compare the trade-off in design phase; (c) no documented spoofing-defence integration; (d) no documented covariance-honesty contract.
|
||||
- **Related Sub-question**: SQ1 (closest open-source peer), SQ2 (canonical-pipeline confirmation), SQ3+SQ4 (architectural template for component selection), SQ6 (alternate AP transport: `VISION_POSITION_ESTIMATE` vs `GPS_INPUT`)
|
||||
|
||||
|
||||
### Source #34
|
||||
- **Title**: AerialExtreMatch — A Benchmark for Extreme-View Image Matching and Localization (project page + GitHub + Hugging Face dataset)
|
||||
- **Link**: https://xecades.github.io/AerialExtreMatch/ ; https://github.com/Xecades/AerialExtreMatch ; https://huggingface.co/datasets/Xecades/AerialExtreMatch-Localization
|
||||
- **Tier**: L1 (peer-reviewed benchmark with public dataset, code, model checkpoints; OpenReview submission)
|
||||
- **Publication Date**: 2025 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Academic + practitioner image-matching evaluators
|
||||
- **Research Boundary Match**: **Full match** for cross-source UAV-satellite image matching evaluation
|
||||
- **Summary**: 2025 benchmark with: 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); ~30,000 evaluation pairs in 32 difficulty levels stratified by overlap (4 bins: <20/20-40/40-60/>60%), pitch difference (4 bins: 50–55, 55–60, 60–65, 65–70°), and scale (2 bins: 1-2×, >2×); a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. Evaluates 16 representative detector-based + detector-free image matching methods. **This is the academic benchmark our C2+C3 candidate selection must publish numbers against.**
|
||||
- **Related Sub-question**: SQ1 (academic landscape), SQ7 (datasets)
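
The 32 difficulty levels follow from crossing the three stratification axes quoted in the summary (4 overlap bins × 4 pitch-difference bins × 2 scale bins). A short enumeration, with bin labels copied from the summary and variable names that are purely illustrative:

```python
from itertools import product

overlap_bins = ["<20%", "20-40%", "40-60%", ">60%"]
pitch_diff_bins = ["50-55°", "55-60°", "60-65°", "65-70°"]
scale_bins = ["1-2x", ">2x"]

# One difficulty level per (overlap, pitch difference, scale) combination.
difficulty_levels = list(product(overlap_bins, pitch_diff_bins, scale_bins))
```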
|
||||
|
||||
|
||||
### Source #35
|
||||
- **Title**: DARPA Fast Lightweight Autonomy (FLA) program page + Test-and-Evaluation review (arXiv 2504.08122)
|
||||
- **Link**: https://www.darpa.mil/research/programs/fast-lightweight-autonomy ; https://arxiv.org/abs/2504.08122
|
||||
- **Tier**: L1 (DARPA program page + 2025 academic review of program results)
|
||||
- **Publication Date**: program 2015–2018 (concluded); review 2025-04 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Foundational reference; review is current (within 18-month authority window)
|
||||
- **Target Audience**: Defense-program historians + indoor-low-altitude GPS-denied autonomy researchers
|
||||
- **Research Boundary Match**: **Partial — different regime.** FLA = small quadcopters at ≤20 m/s in cluttered indoor/outdoor with onboard sensing only, no satellite-tile basemap. Our project is fixed-wing, ~17 m/s, 1 km AGL, with sat-tile basemap.
|
||||
- **Summary**: Foundational US-defense lineage for GPS-denied autonomy (2015–2018, complete). Set the template for "small UAV + onboard sensors + onboard compute → autonomous obstacle-avoidance + navigation without datalink/GPS". Phase 1 in Florida 2017; Phase 2 in Georgia 2018. The 2025 retrospective (arXiv 2504.08122) reviews FLA's testing methodology and Phase 1 results. Companion 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) is requesting visual-position-and-navigation capability for sUAS in GPS-denied environments — procurement demand for this capability is now active.
|
||||
- **Related Sub-question**: SQ1 (defense-program lineage)
|
||||
|
||||
|
||||
### Source #36
|
||||
- **Title**: DSMAC / TERCOM lineage — DTIC ADA315439 (Scene Matching Missile Guidance Technologies) + Wikipedia / SPIE references
|
||||
- **Link**: https://apps.dtic.mil/sti/tr/pdf/ADA315439.pdf ; https://en.wikipedia.org/wiki/DSMAC ; https://www.spiedigitallibrary.org/conference-proceedings-of-spie/0238/1/Terrain-Contour-Matching-TERCOM-A-Cruise-Missile-Guidance-Aid/10.1117/12.959127.short
|
||||
- **Tier**: L1 (DTIC unclassified technical report) + L2 (encyclopedia/SPIE proceedings)
|
||||
- **Publication Date**: DTIC: 1996; SPIE: 1980; Wikipedia: live
|
||||
- **Timeliness Status**: Foundational baseline (no time window per Step 0.5 — established classical algorithms)
|
||||
- **Target Audience**: Cruise-missile-class designers; analogues for downward-vision navigation
|
||||
- **Research Boundary Match**: **Partial — different regime** (cruise missile, terminal guidance). Architectural pattern (pre-cached scene reference + downward camera + correlation matching) is the direct ancestor of our C3 pipeline.
|
||||
- **Summary**: DSMAC = electro-optical camera correlated against pre-stored reference scenes (often from satellite reconnaissance), achieving 3–10 m terminal accuracy. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight; DSMAC for terminal. CEP without DSMAC: ~30 m; with DSMAC: "only meters". Gulf War 1991: >80% of 280 launched Tomahawks hit target. **Establishes that downward-vision-against-pre-stored-imagery is a 40+ year-old well-characterised technique class with documented accuracy bounds; our project's claim of <500 m / 99.9% reliability is achievable in the same technique class.**
|
||||
- **Related Sub-question**: SQ1 (lineage), SQ8 (baseline accuracy expectations)
|
||||
|
||||
|
||||
### Source #37
|
||||
- **Title**: Electronic Warfare in Ukraine: The Invisible Battle — Ukraine War Analytics
|
||||
- **Link**: https://ukraine-war-analytics.com/analysis/electronic-warfare-ukraine.html
|
||||
- **Tier**: L3 (analytical aggregator; primary-source numbers cite vendor / OSINT reports)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (operational-context reference)
|
||||
- **Target Audience**: Ukraine-deployment practitioners
|
||||
- **Research Boundary Match**: Full match (operational geography, threat environment)
|
||||
- **Summary**: Operational-context anchor: Russian EW systems including Pole-21 GPS jammers (25+ km range) plus spoofing capabilities have driven ~70% of small-tactical-UAV losses to EW across the conflict. Twist Robotics' OSCAR cites the same approximate number (~75% of small tactical UAV losses to EW at the front per Source #25). **Confirms the demand-side number is consistent across two independent reporting chains.**
|
||||
- **Related Sub-question**: SQ1 (Ukraine practitioner perspective)
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Canonical pipeline decomposition
|
||||
@@ -0,0 +1,74 @@
|
||||
# Source Registry — SQ2 — Canonical pipeline decomposition
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> This file replaces a section of the previous monolithic `01_source_registry.md`. See `00_summary.md` for the full category index. Investigation order is tracked in `../00_question_decomposition.md` and the cross-category Investigation Status table in `00_summary.md`.
|
||||
|
||||
---
|
||||
|
||||
### Source #38
|
||||
- **Title**: Visual Place Recognition for Aerial Imagery: A Survey (Moskalenko, Kornilova, Ferrer — Skoltech)
|
||||
- **Link**: https://arxiv.org/abs/2406.00885 (v2)
|
||||
- **Tier**: L1 (peer-reviewed survey, accepted in Robotics and Autonomous Systems; companion benchmark code: https://github.com/prime-slam/aero-vloc)
|
||||
- **Publication Date**: arXiv 2024-06; v2 update through 2024
|
||||
- **Timeliness Status**: Currently valid (within 18-month authority window for established surveys; specific candidate latency numbers will need cross-validation against newer Jetson-class hardware reports)
|
||||
- **Target Audience**: Aerial-VPR practitioners + UAV navigation system architects
|
||||
- **Research Boundary Match**: **Full match** for the offline-cache visual geo-localization decomposition (aerial-nadir UAV vs. satellite tile basemap)
|
||||
- **Summary**: Authoritative two-stage pipeline definition (verbatim): "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: **global localization (or Visual Place Recognition, VPR) and local alignment**. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Re-ranking is treated as an integral sub-stage of VPR for aerial data because of agricultural/urban grid repetition. Local alignment = SuperPoint/keypoint detector → LightGlue/SuperGlue/SelaVPR matcher → cv2.findHomography → cv2.perspectiveTransform → Web-Mercator coordinate conversion. **Practitioner-critical runtime numbers (RTX 3090, NOT Jetson)**: AnyLoc descriptor calculation = 0.37–0.84 s/frame (huge ViT-G DINOv2); MixVPR / SALAD = 0.05–0.20 s; SelaVPR = 0.04 s; SuperGlue re-rank = 15–25 s on top-100 candidates; LightGlue re-rank = ~1 s; SelaVPR re-rank = <0.1 s. Memory: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles; SelaVPR = <0.2 GB. Final commentary: "While our methodology alone may not provide comprehensive robustness, it can be effectively augmented with additional sensors, such as inertial measurement units (IMUs). This integration enhances its utility for Visual Inertial Odometry (VIO) and Simultaneous Localization and Mapping (SLAM) systems, particularly for periodic location refinement and loop closure tasks. Additionally, our methodology could serve as a dependable emergency localization fallback in the event of an unexpected GNSS signal loss." → **Validates the project's IMU/VIO + sat-anchor architecture as the canonical extension of the survey's two-stage core.**
|
||||
- **Related Sub-question**: SQ2 (canonical decomposition), SQ3+SQ4 (C2/C3 candidate latency budgets), SQ5 (foundation-model-on-Jetson failure mode)
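
The tail of the local-alignment chain (homography → point transform → Web-Mercator conversion) can be illustrated without OpenCV. The sketch below assumes the homography `H` has already been estimated (for example by `cv2.findHomography`) and maps query-image pixels into the pixel frame of a 256-pixel XYZ tile; the tile size, the tile indexing scheme, and both function names are assumptions for illustration, not details from the survey.

```python
import math

TILE_SIZE = 256  # assumed slippy-map tile size in pixels


def apply_homography(H, x, y):
    """Map one point through a 3x3 homography (what perspectiveTransform does per point)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def tile_pixel_to_latlon(zoom, tile_x, tile_y, px, py):
    """Convert a pixel inside an XYZ Web-Mercator tile to (lat, lon) in degrees."""
    n = 2 ** zoom
    x = tile_x + px / TILE_SIZE   # fractional tile column
    y = tile_y + py / TILE_SIZE   # fractional tile row
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lat, lon


# Query-image centre (64, 128) shifted 64 px right by a pure-translation homography.
H_shift = [[1.0, 0.0, 64.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tile_px = apply_homography(H_shift, 64.0, 128.0)
centre_latlon = tile_pixel_to_latlon(0, 0, 0, *tile_px)
```

In a real pipeline the input point would be the query-image centre (or principal point) and the tile indices would come from the retrieved basemap tile.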
|
||||
|
||||
|
||||
### Source #39
|
||||
- **Title**: Cross-View Geo-Localization: A Survey (Durgam, Paheding, Dhiman, Devabhaktuni — U. Maine / Fairfield / ISU)
|
||||
- **Link**: https://arxiv.org/abs/2406.09722 (v1)
|
||||
- **Tier**: L1 (peer-style preprint, journal-bound — Expert Systems with Applications)
|
||||
- **Publication Date**: arXiv 2024-06
|
||||
- **Timeliness Status**: Currently valid (≤18 months for survey-of-deep-learning architectures)
|
||||
- **Target Audience**: Cross-view (ground↔aerial) geo-localization researchers; partial overlap with our aerial↔satellite pipeline
|
||||
- **Research Boundary Match**: **Partial — different cross-view setup** (the survey focuses on ground panorama → aerial overhead; ours is aerial nadir → satellite ortho). The pipeline-shape lessons transfer; the polar-transform / Siamese-network / GAN-based view-synthesis lessons do NOT directly apply because our two views are both top-down.
|
||||
- **Summary**: Confirms the canonical pipeline decomposition (feature extraction → cross-view matching → similarity-driven retrieval) is the dominant pattern across 2015–2024 SOTA. Establishes the historical lineage: pixel-wise (Sheikh 2003) → feature-based (Lin 2013) → CNN/triplet-loss (Tian 2017) → Siamese+GAN (Hu 2018) → polar-transform (Shi 2019) → CosPlace/EigenPlaces (2022–2023) → DINOv2-class (AnyLoc 2023) → Transformer-only (TransGeo 2022, MGTL 2022) → multi-method fusion (2023+). Backbone comparison table establishes that ViT/DINOv2 is the current SOTA backbone; ResNet-class is the established production baseline; SIFT/SURF/PHOW remain the handcrafted baseline. **Confirms our component-area split (C2 VPR + C3 cross-domain matching) is canonical and matches the survey's two-axis organization (backbone × matching strategy).**
|
||||
- **Related Sub-question**: SQ2 (decomposition lineage), SQ3+SQ4 (C2 candidate landscape)
|
||||
|
||||
|
||||
### Source #40
|
||||
- **Title**: OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata (Dhaouadi, Marin, Meier, Kaiser, Cremers — DeepScenario / TU Munich / MCML)
|
||||
- **Link**: https://arxiv.org/abs/2509.18350 ; project page https://deepscenario.github.io/OrthoLoC
|
||||
- **Tier**: L1 (peer-style preprint with public dataset, code, model checkpoints; 16,425 UAV images Germany+US, full 6-DoF ground truth)
|
||||
- **Publication Date**: arXiv 2025-09
|
||||
- **Timeliness Status**: Currently valid (published ~8 months before the 2026-05 registry pass: just outside the 6-month critical-novelty window for SOTA aerial-localization claims, within the 18-month authority window)
|
||||
- **Target Audience**: UAV-localization implementers + system architects building on Digital Orthophotos (DOP) + Digital Surface Models (DSM)
|
||||
- **Research Boundary Match**: **Full match — direct paradigm match** to our project: "lightweight orthographic representations" instead of 3D meshes; "increasingly accessible through free releases by governmental authorities"; "no internet connection or GNSS/GPS support" — exactly the project's constraint envelope.
|
||||
- **Summary**: **Most directly applicable SQ2 source.** Defines the 6-DoF localization pipeline using 2.5D geodata: (1) match query UAV image against DOP (orthophoto raster) using state-of-the-art matchers; (2) lift each 2D match in the DOP to 3D using the corresponding DSM elevation; (3) PnP+RANSAC (RANSAC-EPnP, 5-pixel inlier threshold) → initial pose; (4) Levenberg-Marquardt joint refinement of intrinsics + extrinsics; (5) **AdHoP refinement**: estimate homography from initial 2D-2D correspondences via DLT+RANSAC, warp the DOP to better match the query's perspective, re-match, map back via H⁻¹, lift to 3D, refine pose; accept refinement only if reprojection error decreases. **Quantitative results** on 16.4k images, 47 locations: best matcher = GIM+DKM achieves 75.4% recall at 1m-1° threshold (sparse SP+SG = 64.4%, sparse SP+LG = 64.2%, MASt3R = 63.5%, RoMa+AdHoP = 54.6%, XFeat*+AdHoP = 59.8%; LoFTR / eLoFTR / XoFTR all <23% recall). AdHoP yields ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods (XFeat* → 95% matching improvement; DKM → 63% translation reduction; RoMa → 1m-1° recall +23%). **Performance factors** explicitly characterized: (a) **cross-domain DOPs (visual gap only) cause ~3× translation error increase** even on best method; (b) **cross-domain DOPs+DSMs (visual + structural gap) cause ~7× translation error increase** (0.16 m → 1.12 m for GIM+DKM+AdHoP) — **this is exactly the war-zone scene-change scenario AC-3.x covers**; (c) **20% covisibility floor** between query and reference; below it localization fails; (d) **Calibration is fundamentally ambiguous** between focal length and translation → camera intrinsics MUST be calibrated upstream, not jointly optimized in flight. (e) Resolution: scaling images to 30% of original (~300 px) still works; geodata at 13 m/pixel is the floor, with degradation below.
|
||||
- **Related Sub-question**: SQ2 (canonical pipeline + AdHoP refinement loop), SQ3+SQ4 (C3 matcher candidate ranks), SQ5 (war-zone scene-change failure mode), SQ8 (covisibility safety gate)
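
Step (2) of the pipeline summarized above, lifting each 2D match in the DOP to 3D using the DSM elevation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a north-up raster with an origin-plus-pixel-size georeference, a DSM aligned pixel-for-pixel with the DOP, and nearest-cell height sampling. The names `GeoRaster` and `lift_matches_to_3d` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class GeoRaster:
    """North-up DSM assumed to share the DOP's pixel grid (illustrative only)."""
    origin_east: float    # easting of the top-left pixel corner, metres
    origin_north: float   # northing of the top-left pixel corner, metres
    pixel_size: float     # ground sampling distance, metres per pixel
    heights: Sequence[Sequence[float]]  # heights[row][col], metres


def lift_matches_to_3d(
    dop_pixels: Sequence[Tuple[float, float]], dsm: GeoRaster
) -> List[Tuple[float, float, float]]:
    """Turn (col, row) match locations in the DOP into (east, north, up) points."""
    rows, cols = len(dsm.heights), len(dsm.heights[0])
    points = []
    for col, row in dop_pixels:
        r, c = math.floor(row), math.floor(col)  # DSM cell containing the match
        if not (0 <= r < rows and 0 <= c < cols):
            continue  # match fell outside the DSM tile, so it cannot be lifted
        east = dsm.origin_east + col * dsm.pixel_size
        north = dsm.origin_north - row * dsm.pixel_size  # rows grow southward
        points.append((east, north, dsm.heights[r][c]))
    return points
```

The resulting 3D points, paired with the matching query-image pixels, are what a PnP+RANSAC solver (for example OpenCV's `solvePnPRansac`) would consume in step (3).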
|
||||
|
||||
|
||||
### Source #41
|
||||
- **Title**: Exploring the best way for UAV visual localization under Low-altitude Multi-view Observation Condition: a Benchmark — AnyVisLoc (Ye, Teng, Chen, Li, Liu, Yu, Tan — NUDT / Macao Polytechnic)
|
||||
- **Link**: https://arxiv.org/abs/2503.10692 ; benchmark code https://github.com/UAV-AVL/Benchmark
|
||||
- **Tier**: L1 (peer-style preprint with public 18,000-image dataset across 15 Chinese cities, multi-pitch / multi-altitude / multi-scene, with both aerial-photogrammetry AND satellite reference maps)
|
||||
- **Publication Date**: arXiv 2025-03 (within 6-month critical-novelty window)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: Aerial AVL practitioners; UAV-system designers facing pitch/altitude/yaw uncertainty
|
||||
- **Research Boundary Match**: **Partial — different altitude regime** (the benchmark covers 30–300 m AGL, ours is ~1 km AGL); pitch range is 20–90° (ours is mostly nadir, ~80–90°). Lessons on the **pipeline structure, retrieval-vs-matching trade-offs, sensor-prior noise tolerance, and aerial-vs-satellite reference-map gap** transfer directly.
|
||||
- **Summary**: Independently confirms the same pipeline as Source #40: image retrieval (rough position) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC. Best baseline = CAMP (retrieval) + RoMa (dense matcher) + Top-N re-rank → 74.1% A@5m on aerial photogrammetry map, 18.5% A@5m on satellite map (ALOS 30m DSM). **Critical AC-quantitative findings**: (a) **Aerial map vs satellite map**: 4× accuracy gap at A@5m (74.1% vs 18.5%), driven by satellite-DSM coarseness (ALOS 30m vs aerial 0.94m) and modality difference. **Direct relevance**: project's offline cache is satellite tiles ≥0.5 m/px without DSM; this places us between the two data points (better than ALOS 30m, worse than aerial photogrammetry), so exact accuracy must be re-established once tile resolution is pinned. (b) **Yaw prior noise**: σ ≤ 5° → no impact; σ = 10° → 1.9% A@5m drop; σ = 30° → 4.1% drop; σ = 50° → 13.7% drop; σ = 60° → 25.7% drop. **Implication for project's C1+C5+IMU**: companion-side yaw estimate must hold σ < 10°. (c) **Pitch prior noise**: σ < 5° → no impact; σ ≥ 7° causes ~1–5% drops. (d) **Pitch angle**: smaller pitch (more oblique) → lower accuracy; nadir is best. Project's nadir-fixed camera at 1 km AGL is consistent with the benchmark's most-favourable regime. (e) **Sparse vs dense matchers**: SP+LightGlue+GIM+k2s = 75.4% A@10m at 105 ms/frame; RoMa = 81.3% A@10m at 659 ms/frame. **Implication for project's C7 Jetson runtime**: the dense matcher is ~6 percentage points more accurate at A@10m but ~6× slower (659 vs 105 ms/frame) → SP+LightGlue-class is the production sweet spot under our 400 ms budget. (f) **Re-ranking strategy**: Top-N re-rank by inlier count = best accuracy/cost trade-off (62.2% A@5m at 0.8 s/frame on RTX 3090). Match-without-retrieval = catastrophic (34.3% A@5m, search space too large).
|
||||
- **Related Sub-question**: SQ2 (pipeline + sensor-prior tolerance), SQ3+SQ4 (C2 retrieval-vs-matcher trade-offs, C5 IMU prior contract), SQ5 (war-zone reference-map staleness failure mode), SQ7 (aerial-vs-satellite reference benchmarks)
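
The runtime trade-off in finding (e) reduces to simple arithmetic. The sketch below uses only the two data points quoted in the summary and the project's 400 ms budget; the latencies are the benchmark's own figures and would need to be re-measured on the target hardware. The helper name `pick_matcher` is hypothetical.

```python
# (name, A@10m accuracy in %, latency in ms/frame) as quoted in the summary above.
CANDIDATES = [
    ("SP+LightGlue+GIM+k2s", 75.4, 105.0),
    ("RoMa", 81.3, 659.0),
]


def pick_matcher(candidates, budget_ms):
    """Return the most accurate candidate whose latency fits the budget, or None."""
    feasible = [c for c in candidates if c[2] <= budget_ms]
    return max(feasible, key=lambda c: c[1]) if feasible else None


sparse, dense = CANDIDATES
accuracy_gain_points = dense[1] - sparse[1]  # percentage points, not a ratio
slowdown_factor = dense[2] / sparse[2]       # latency ratio
best = pick_matcher(CANDIDATES, budget_ms=400.0)
```

Under these numbers the dense matcher buys about 5.9 percentage points of A@10m for roughly 6.3 times the latency, and only the sparse candidate fits the 400 ms budget.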
|
||||
|
||||
|
||||
### Source #42
|
||||
- **Title**: Survey on absolute visual localization techniques for low-altitude unmanned aerial vehicles (Ye, Chen, Teng, Li, Yang, Song, Yu — NUDT, College of Aerospace Science)
|
||||
- **Link**: https://www.sciopen.com/article/10.11887/j.issn.1001-2486.25120033 ; DOI 10.11887/j.issn.1001-2486.25120033
|
||||
- **Tier**: L1 (peer-reviewed Chinese journal — Journal of National University of Defense Technology, vol 48 issue 2, 2026; same lab as Source #41 with overlapping authorship — confirmed cross-validation, not duplicative)
|
||||
- **Publication Date**: 2026-04-01 (within 6-month critical-novelty window)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: UAV-system architects + Chinese-defense-research community
|
||||
- **Research Boundary Match**: **Full match** (low-altitude UAV AVL is the survey's exact subject)
|
||||
- **Summary**: Survey-level confirmation of the canonical "**retrieval-matching-pose estimation**" hierarchical framework. Verbatim claim: "the hierarchical framework balances search efficiency, positioning accuracy, and scene generalization, becoming a robust technical path for low-altitude long-endurance absolute localization." Compares the framework against alternatives that are explicitly rejected: (a) relative visual localization (cumulative errors — VIO/SLAM only); (b) end-to-end direct localization (poor generalization); (c) map-free localization (scene-dependent). Sub-component evolution per stage: (a) retrieval = template-matching (SAD/SSD/NCC) → BoW/VLAD → deep-learning (annular/dense feature segmentation, contrastive InfoNCE, self-supervised); (b) matching = SIFT/SURF/ORB → SuperPoint+LightGlue/RoMa (sparse / semi-dense / dense); (c) pose estimation = PnP variants + RANSAC + IMU prior fusion. **Identifies four open challenges** that align with project risks: (i) cross-domain generalization (war-zone scene change); (ii) real-time inference on edge platforms (Jetson); (iii) robustness to complex environments (cropland, snow, low texture); (iv) high-quality datasets (the same gap our project's AC-NEW-7 / cache provisioning works around). **Lightweight-model-design-for-edge-deployment is named as a primary future-research direction** — directly validates project's Jetson Orin Nano constraint as a recognized field-level challenge, not a project-specific oddity.
|
||||
- **Related Sub-question**: SQ2 (framework canonicalness), SQ3+SQ4 (per-component evolution), SQ5 (named open challenges align with project risks)
|
||||
|
||||
---
|
||||
|
||||
## SQ3+SQ4 / C1 (Visual / Visual-Inertial Odometry) — Candidate enumeration
|
||||
@@ -0,0 +1,320 @@
|
||||
# Source Registry — SQ6 — ArduPilot Plane vs iNav external positioning
|
||||
|
||||
> Mode A Phase 2 — engine Step 2 (Source Tiering & Exhaustive Web Investigation).
|
||||
> Critical-novelty sensitivity per Step 0.5 in `../00_question_decomposition.md`. Time windows applied:
|
||||
> - **Lead-candidate / SOTA claims**: prefer sources within last 6 months; up to 18 months if older is the official authority.
|
||||
> - **Library/SDK API behaviour**: must reflect the currently shipped version at search time (`context7` mandatory per lead candidate).
|
||||
> - **Established baselines** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window.
|
||||
>
|
||||
> This file replaces a section of the previous monolithic `01_source_registry.md`. See `00_summary.md` for the full category index. Investigation order is tracked in `../00_question_decomposition.md` and the cross-category Investigation Status table in `00_summary.md`.
|
||||
|
||||
---
|
||||
|
||||
### Source #1
|
||||
- **Title**: Non-GPS Navigation — Plane documentation
|
||||
- **Link**: https://ardupilot.org/plane/docs/common-non-gps-navigation-landing-page.html
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live docs (current ArduPilot stable, accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: ArduPilot 4.7+ (persistent origin storage); applies to current Plane stable
|
||||
- **Target Audience**: ArduPilot Plane operators / developers
|
||||
- **Research Boundary Match**: Full match (fixed-wing, ArduPilot Plane is in scope)
|
||||
- **Summary**: Lists supported non-GPS navigation systems for Plane. Notes that boards with <1MB flash still support `GPS_INPUT` even when they cannot run other non-GPS messages. Also notes that low-altitude non-GPS navigation is generally not applicable to non-VTOL Plane; that note does not constrain `GPS_INPUT` used as an external GPS replacement.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #2
|
||||
- **Title**: GPS / Non-GPS Transitions — Plane documentation
|
||||
- **Link**: https://ardupilot.org/plane/docs/common-non-gps-to-gps.html
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live docs (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: EKF3 (default since AP 4.0+)
|
||||
- **Target Audience**: ArduPilot operators using mixed GPS / non-GPS sources
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Documents the EKF3 source-set mechanism (`EK3_SRC1..3_POSXY/VELXY/POSZ/VELZ/YAW`): three source sets, switchable via RC aux switch (option 90 "EKF Pos Source"), `MAV_CMD_SET_EKF_SOURCE_SET`, or a Lua script. The non-GPS path is named explicitly as ExternalNav (option 6); `GPS_INPUT` is treated as a GPS source (set 1).
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #3
|
||||
- **Title**: EKF Source Selection and Switching — Plane documentation
|
||||
- **Link**: https://ardupilot.org/plane/docs/common-ekf-sources.html
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live docs (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: EKF3 stable
|
||||
- **Target Audience**: ArduPilot operators / developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative parameter reference for `EK3_SRCx_*` (POSXY/VELXY/POSZ/VELZ/YAW). Important caveat: "Ground stations or companion computers may set the source by sending a `MAV_CMD_SET_EKF_SOURCE_SET` mavlink command **but no GCSs are currently known to implement this**." Source-set switching from companion is supported by AP, not by stock GCS UI. Mentions ExternalNAV/OpticalFlow transition options via `EK3_SRC_OPTIONS` bit 1.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #4
|
||||
- **Title**: ArduPilot AP_GPS_MAV.cpp (master)
|
||||
- **Link**: https://raw.githubusercontent.com/ArduPilot/ardupilot/master/libraries/AP_GPS/AP_GPS_MAV.cpp
|
||||
- **Tier**: L1 (source code)
|
||||
- **Publication Date**: master HEAD (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master branch
|
||||
- **Target Audience**: ArduPilot developers, integrators of external GPS via MAVLink
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative implementation of `MAVLINK_MSG_ID_GPS_INPUT` ingestion into AP_GPS state. Decodes lat/lon/alt, hdop/vdop, velocity (vn/ve/vd), speed/horizontal/vertical accuracy, yaw. Honors `gps_id` (multi-GPS instance), `ignore_flags` bitmask (ALT, HDOP, VDOP, VEL_HORIZ, VEL_VERT, SPEED_ACCURACY, HORIZONTAL_ACCURACY, VERTICAL_ACCURACY). Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping. Yaw uses `0` as "not provided" sentinel. Only `GPS_INPUT` is handled by this driver — `VISION_POSITION_ESTIMATE` / `ODOMETRY` go via the external-nav driver, not AP_GPS_MAV.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #5
|
||||
- **Title**: ArduPilot PR #28750 — AP_NavEKF3: added two more EK3_OPTION bits (GPS-denied testing)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/pull/28750
|
||||
- **Tier**: L2 (development PR, ArduPilot core team)
|
||||
- **Publication Date**: 2024 (accessed via search 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master / pending stable branch propagation
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Adds new `EK3_OPTION` bits to allow easier GPS-denied testing of EKF3, including an aux-switch / MAVLink command path to disable GPS use. Confirms ongoing 2024-2025 work on GPS-denied robustness.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #6
|
||||
- **Title**: ArduPilot Issue #15859 — EKF3: improve source switching (GPS<->NonGPS)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/15859
|
||||
- **Tier**: L4 (issue tracker — open enhancement list)
|
||||
- **Publication Date**: ongoing (long-running issue, accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid (still open per dev docs reference)
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative list of planned improvements for source-switching. Linked from the L1 GPS-Non-GPS Transitions page. Indicates current source switching has known rough edges acknowledged by the core team.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #7
|
||||
- **Title**: ArduPilot Issue #27193 — EK3 Source Switching wrong frame for GUIDED commands SOLVED
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/27193
|
||||
- **Tier**: L4 (issue tracker, resolved)
|
||||
- **Publication Date**: 2024 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Reference only (resolved as user-config)
|
||||
- **Target Audience**: ArduPilot operators using GPS↔Vision source switching
|
||||
- **Research Boundary Match**: Partial overlap (Copter context but the bug was in shared SET_POSITION_TARGET_GLOBAL_INT path)
|
||||
- **Summary**: Documented frame-interpretation issue when companion switches source set 1 (GPS) → set 3 (VISION_POSITION_ESTIMATES) and back. Resolved as configuration not code, but illustrates the kind of edge case to validate in SITL for AC-NEW-2 promotion.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #8
|
||||
- **Title**: ArduPilot Issue #23485 — AP_NavEKF3: support fusing only External Nav Velocities (without position)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/issues/23485
|
||||
- **Tier**: L4 (open enhancement)
|
||||
- **Publication Date**: ongoing (open when accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms current limitation: ODOMETRY without position causes position-estimate timeout / failsafe. Implies the project's `visual_propagated` path (VO without satellite anchor) cannot be expressed as ODOMETRY-velocity-only on current AP — must be sent as full GPS_INPUT with widened covariance.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #9
|
||||
- **Title**: iNavFlight/inav — telemetry/mavlink.c (master, processMAVLinkIncomingTelemetry)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/telemetry/mavlink.c
|
||||
- **Tier**: L1 (source code, authoritative)
|
||||
- **Publication Date**: master HEAD (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master (post-9.0)
|
||||
- **Target Audience**: iNav developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative inbound MAVLink switch (lines ~1334–1390). Handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **None of `GPS_INPUT`, `VISION_POSITION_ESTIMATE`, `ODOMETRY`, `GLOBAL_POSITION_INT`, or `GPS_RAW_INT` is accepted as an input.** The wiki page (Source #10) confirms this.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #10
|
||||
- **Title**: iNav Wiki — MAVLink (frogmane edited 2025-12-11)
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/Mavlink
|
||||
- **Tier**: L1 (project wiki)
|
||||
- **Publication Date**: 2025-12-11
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0 / 9.0 era
|
||||
- **Target Audience**: iNav users / integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative inbound/outbound MAVLink message lists. "Limited command support: Commands that are not implemented are ignored." Explicitly enumerates the supported incoming list (matches Source #9). Confirms iNav MAVLink is "intended primarily for simple telemetry and operation" and "not 100% compatible".
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #11
|
||||
- **Title**: iNav Wiki — GPS and Compass setup
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/GPS-and-Compass-setup
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live wiki (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 7.0+ (UBX-only); 9.0 requires UBX protocol ≥15.00
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: From iNav 7.0 NMEA was removed; only UBX is supported. Recommends u-blox M8/M9/M10 with protocol ≥15.00. Sets up the constraint for any UBX-emulation path the companion would take.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #12
|
||||
- **Title**: iNavFlight/inav docs/development/msp/README.md (MSP message reference)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/development/msp/README.md
|
||||
- **Tier**: L1 (project docs)
|
||||
- **Publication Date**: live (master, accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master
|
||||
- **Target Audience**: iNav developers / integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative spec for `MSP_SET_RAW_GPS (201)` and `MSP2_SENSOR_GPS (7939)`. `MSP_SET_RAW_GPS` is 14-byte, lossy (no covariance, no per-axis velocity, altitude in meters with cm internal mismatch — bug fixed in 5.0.0 per issue #8336). `MSP2_SENSOR_GPS` is the newer plugin-style message with `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` (mm and cm/s), `hdop`, NED velocity components, `trueYaw`, GPS week + time-of-week, fix type, satellite count. Requires `USE_GPS_PROTO_MSP` build flag and routes through `mspGPSReceiveNewData()` (the GPS_PROVIDER_MSP driver path).
|
||||
- **Related Sub-question**: SQ6
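
For orientation, the transport side of the `MSP2_SENSOR_GPS` path can be sketched as an MSPv2 frame encoder. This is a minimal sketch, assuming the standard MSPv2 framing (`$X<` header, flag, little-endian function ID and payload size, CRC-8/DVB-S2 trailer). The payload is treated as opaque bytes because its field order must be taken from the MSP reference itself and is not reproduced here; `msp_v2_frame` is a hypothetical helper name.

```python
import struct

MSP2_SENSOR_GPS = 7939  # function ID quoted in the summary above


def crc8_dvb_s2(data: bytes) -> int:
    """CRC-8/DVB-S2 (polynomial 0xD5, initial value 0), the MSPv2 checksum."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ 0xD5) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def msp_v2_frame(function_id: int, payload: bytes, flag: int = 0) -> bytes:
    """Wrap an opaque payload in an MSPv2 frame addressed to the flight controller."""
    # The checksum covers flag, function ID, payload size, and payload.
    body = struct.pack("<BHH", flag, function_id, len(payload)) + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])
```

A companion would build the payload according to the documented `MSP2_SENSOR_GPS` layout, pass it to a function like this, and write the result to the UART configured for MSP.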
|
||||
|
||||
|
||||
### Source #13
|
||||
- **Title**: iNavFlight/inav src/main/io/gps.c + src/main/target/common.h (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/src/main/target/common.h
|
||||
- **Tier**: L1 (source code)
|
||||
- **Publication Date**: master (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master
|
||||
- **Target Audience**: iNav developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: `USE_GPS_PROTO_MSP` is enabled by default in the common target configuration; on default builds the MSP GPS provider (`GPS_PROVIDER_MSP`) is registered with `gpsRestartMSP` / `gpsHandleMSP`. Confirms the MSP2_SENSOR_GPS path is reachable on stock iNav firmware without custom builds.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #14
|
||||
- **Title**: iNav Issue #10141 — dual GPS support
|
||||
- **Link**: https://github.com/iNavFlight/inav/issues/10141
|
||||
- **Tier**: L4 (open feature request)
|
||||
- **Publication Date**: ongoing (open when accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Confirms iNav does **not** support dual-GPS / primary-secondary failover. Open enhancement; no implementation in 8.0 / 9.0. Architectural implication: companion must be the sole GPS source for iNav (not a backup to a real GPS connected directly to FC).
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #15
|
||||
- **Title**: iNav docs/GPS_fix_estimation.md (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/GPS_fix_estimation.md
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0+
|
||||
- **Target Audience**: iNav fixed-wing operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: iNav's internal dead-reckoning ("GPS fix estimation") for fixed-wing. Uses gyro/accel/baro/(mag/pitot). RTH-only intent. **Explicitly states: "Not a solution for GPS spoofing (GPS output is not validated in INAV)"** — iNav has no internal anti-spoofing, so anti-spoofing is fully the companion's responsibility. Two settings: `inav_allow_gps_fix_estimation` (RTH-with-no-GPS) and `inav_allow_dead_reckoning` (short-outage tolerance) — both default OFF. `failsafe_gps_fix_estimation_delay` controls mission-vs-RTH tradeoff (default 7 s).
|
||||
- **Related Sub-question**: SQ6 (dead-reckoning fallback) + SQ8 (anti-spoofing implication)
|
||||
|
||||
|
||||
### Source #16
|
||||
- **Title**: iNav docs/Settings.md (master)
|
||||
- **Link**: https://github.com/iNavFlight/inav/blob/master/docs/Settings.md
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: master (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav master
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Authoritative parameter list. Confirms `inav_allow_dead_reckoning` (line 2081, default OFF) ≠ `inav_allow_gps_fix_estimation` (line 2091, default OFF). The two settings address different scenarios. `failsafe_gps_fix_estimation_delay` (line 1041, default 7 s) governs mission-abort timing.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #17
|
||||
- **Title**: iNav Issue #10588 — Weird behaviour in DeadReckoning mode while GPS outage is not constant
|
||||
- **Link**: https://github.com/iNavFlight/inav/issues/10588
|
||||
- **Tier**: L4 (open issue, 2025)
|
||||
- **Publication Date**: 2025
|
||||
- **Timeliness Status**: Currently valid (open)
|
||||
- **Target Audience**: iNav operators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Documented stability bug: intermittent GPS outages cause porpoising and motor bursts in dead-reckoning. Cited recommendation: "GPS should be rejected if providing erroneous coordinates rather than no fix." Risk for AC-NEW-8 (visual blackout + spoofed GPS) on iNav: do NOT rely on iNav's dead-reckoning for the spoof-active failsafe path; companion must actively suppress its own MSP feed and accept that iNav may misbehave during the gap. Better: continue feeding companion-IMU-propagated position with growing covariance via MSP2_SENSOR_GPS so iNav never enters its dead-reckoning state.
|
||||
- **Related Sub-question**: SQ6 + AC-NEW-8 design implication
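
The "growing covariance" recommendation above can be illustrated with a simple uncertainty model. This is a sketch under stated assumptions, not a measured error model: it assumes drift grows linearly with time since the last trusted fix and is independent of the error at that fix. The function names and the drift-rate parameter are hypothetical and would have to be calibrated from flight data.

```python
import math


def propagated_sigma_m(sigma_at_last_fix_m: float, seconds_since_fix: float,
                       drift_rate_m_per_s: float) -> float:
    """1-sigma horizontal uncertainty that never shrinks while no new fix arrives."""
    if seconds_since_fix < 0:
        raise ValueError("seconds_since_fix must be non-negative")
    drift = drift_rate_m_per_s * seconds_since_fix
    # Root-sum-square: last-fix error and accumulated drift treated as independent.
    return math.hypot(sigma_at_last_fix_m, drift)


def to_millimetres(sigma_m: float) -> int:
    """Convert to integer millimetres, rounding up so the report is never optimistic."""
    return math.ceil(sigma_m * 1000.0)
```

The millimetre conversion reflects the units Source #12 lists for the position-accuracy fields; any clamping to the wire field's width is left out here and would need to follow the MSP reference.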
|
||||
|
||||
|
||||
### Source #18
|
||||
- **Title**: iNav Release 8.0.0 (highlights, Dec 2024)
|
||||
- **Link**: https://github.com/iNavFlight/inav/releases/tag/8.0.0
|
||||
- **Tier**: L1 (project release notes)
|
||||
- **Publication Date**: late 2024 / early 2025
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 8.0
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Introduces fixed-wing GPS fix estimation (dead reckoning RTH-only) — the milestone for #8347. No new external-positioning inbound MAVLink in 8.0. Confirms iNav's 2024–2025 trajectory has not added a `GPS_INPUT`-equivalent inbound interface.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #19
|
||||
- **Title**: iNav Release 9.0.0 / 9.0.1 + 9.0.0 Release Notes wiki
|
||||
- **Link**: https://github.com/iNavFlight/inav/wiki/9.0.0-Release-Notes
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2025-2026
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: iNav 9.0.x
|
||||
- **Target Audience**: iNav users
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: New in 9.0: pitot APA/TPA, position estimator improvements, MSP_REBOOT DFU, GCS NAV via `COMMAND_INT` `MAV_CMD_DO_REPOSITION`. **No** new external-positioning inbound MAVLink. UBX <15.00 dropped. Confirms iNav 9.x continues the same external-positioning architecture as 8.x.
|
||||
- **Related Sub-question**: SQ6
|
||||
|
||||
|
||||
### Source #20
|
||||
- **Title**: MAVLink common message set — GPS_RAW_INT (24)
|
||||
- **Link**: https://mavlink.io/en/messages/common.html
|
||||
- **Tier**: L1 (MAVLink spec, live)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: MAVLink common, current
|
||||
- **Target Audience**: MAVLink integrators
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Current published `GPS_RAW_INT` extension fields: `alt_ellipsoid`, `h_acc` (mm), `v_acc` (mm), `vel_acc` (mm/s), `hdg_acc` (degE5), `yaw` (cdeg). **No spoofing/jamming/integrity bitfield is present in `GPS_RAW_INT` at the time of access**, despite PR #2110 having been merged for spoofing/integrity reporting. Spoofing/integrity may live in a separate message (`GPS_INTEGRITY` or similar — to be verified in SQ8). For now, spoof-detection signals available to companion from FC are limited at the message-shape level; FC-side textual signals (`STATUSTEXT`) and `NAMED_VALUE_INT` are the documented practical path.
|
||||
- **Related Sub-question**: SQ6 + SQ8
|
||||
|
||||
|
||||
### Source #21
|
||||
- **Title**: MAVLink PR #2110 — gps: add status and integrity information
|
||||
- **Link**: https://github.com/mavlink/mavlink/pull/2110
|
||||
- **Tier**: L2 (protocol PR with cross-project sign-off)
|
||||
- **Publication Date**: merged (accessed via search 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: MAVLink common
|
||||
- **Target Audience**: MAVLink integrators across PX4 / ArduPilot / QGC / Mission Planner
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Adds GNSS status / integrity reporting (jamming/spoofing/error) at the protocol level. Cross-project sign-off across PX4, ArduPilot, QGC, Mission Planner. Field-level breakdown to be cross-checked in SQ8 against the dialect XML — current `common.html` does not show those fields inside `GPS_RAW_INT` itself, suggesting they live in a sibling message (likely `GPS_INTEGRITY` or `GPS_STATUS_EXT`).
|
||||
- **Related Sub-question**: SQ6 → defer to SQ8 for the precise message name and field set ArduPilot uses to expose spoofing.
|
||||
|
||||
|
||||
### Source #22
|
||||
- **Title**: AirDroper — GNSS Spoofing Filter (companion device, MAVLink2 NAMED_VALUE_INT pattern)
|
||||
- **Link**: https://gps.airdroper.org/
|
||||
- **Tier**: L3 (vendor product page; design pattern reference, not protocol authority)
|
||||
- **Publication Date**: live (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Target Audience**: ArduPilot integrators considering anti-spoofing
|
||||
- **Research Boundary Match**: Reference only (vendor's specific algorithm not relevant; the integration pattern is)
|
||||
- **Summary**: Establishes a precedent that "companion-runs-spoofing-detection → publishes confidence to GCS as MAVLink2 `NAMED_VALUE_INT`, logged to dataflash" is a real-world integration pattern with ArduPilot, not novel to this project. Useful for SQ8.
|
||||
- **Related Sub-question**: SQ8 (referenced from SQ6)
|
||||
|
||||
|
||||
### Source #23
|
||||
- **Title**: ArduPilot PR #24135 — Add option to make EKF3 more robust to bad IMU and lagged GPS data
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/pull/24135
|
||||
- **Tier**: L2 (development PR)
|
||||
- **Publication Date**: 2023-2024 (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master / propagated to stable
|
||||
- **Target Audience**: ArduPilot developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Introduces `EK3_GLITCH_RADIUS` parameter — soft outlier rejection: instead of dropping a GPS measurement that fails innovation gating, the EKF inflates innovation variance to the minimum that just passes, effectively de-weighting the measurement. Implication for AC-NEW-4 (false-position safety): the project's covariance honesty contract on `GPS_INPUT.horiz_accuracy` is the ONLY way for AP's EKF to detect and de-weight a bad estimate; under-reporting collapses this safety net.
|
||||
- **Related Sub-question**: SQ6 + AC-NEW-4 design implication
|
||||
|
||||
|
||||
### Source #24
|
||||
- **Title**: ArduPilot AP_NavEKF3 — VehicleStatus.cpp + AP_NavEKF3.cpp (master)
|
||||
- **Link**: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3_VehicleStatus.cpp ; https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_NavEKF3/AP_NavEKF3.cpp
|
||||
- **Tier**: L1 (source code)
|
||||
- **Publication Date**: master HEAD (accessed 2026-05-07)
|
||||
- **Timeliness Status**: Currently valid
|
||||
- **Version Info**: master
|
||||
- **Target Audience**: ArduPilot EKF3 developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: EKF3 quality control: (a) ground-stationary GPS drift check ≤ 3 m (gated by `_gpsCheckScaler`); (b) innovation gating per `POS_I_GATE` / `VEL_I_GATE`; (c) soft de-weighting via `EK3_GLITCH_RADIUS` (Source #23). Confirms AP's covariance-driven quality path actually exists; companion-supplied `horiz_accuracy` flows into this chain.
|
||||
- **Related Sub-question**: SQ6 (full file analysis deferred to design phase)
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Existing / competitor GPS-denied UAV navigation systems
|
||||
@@ -1,543 +0,0 @@
|
||||
# Fact Cards
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `01_source_registry.md`. Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential).
|
||||
>
|
||||
> Bound to sub-questions in `00_question_decomposition.md`. Many SQ6 facts also bind directly to the Project Constraint Matrix (`acceptance_criteria.md` / `restrictions.md`); per the engine's "Per-Mode API Capability Verification" rule, MAVLink/MSP messages are treated as candidate **modes** and are bound `Pass/Fail/Verify/N/A` against numbered ACs and restrictions.
|
||||
|
||||
---
|
||||
|
||||
## SQ6 — ArduPilot Plane vs iNav external positioning
|
||||
|
||||
### Fact #1 — ArduPilot Plane EKF3 ingests `GPS_INPUT` (MAVLink ID 232) as a first-class GPS source
|
||||
- **Statement**: ArduPilot's `AP_GPS_MAV` driver (master) decodes `MAVLINK_MSG_ID_GPS_INPUT` and stores the resulting state into the GPS slot identified by `gps_id`. Decoded fields: lat/lon (degE7), alt (m on the wire → cm internally), hdop/vdop, velocity (vn/ve/vd, m/s), speed/horizontal/vertical accuracy (m/s for speed, m for position), yaw (cdeg, `0` sentinel = "not provided"). Honors `ignore_flags` for ALT/HDOP/VDOP/VEL_HORIZ/VEL_VERT/SPEED_ACCURACY/HORIZONTAL_ACCURACY/VERTICAL_ACCURACY. Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping.
|
||||
- **Source**: Source #4 (AP_GPS_MAV.cpp master), Source #1 (Plane Non-GPS Navigation docs)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: ArduPilot Plane operators / developers
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8 (FC adapter), C5 (estimator covariance contract)
|
||||
- **Fit Impact**: **supports selection** — ArduPilot side of AC-4.3 is satisfied by `GPS_INPUT` as the primary external-positioning message; covariance fields (`horiz_accuracy`, `vert_accuracy`, `speed_accuracy`) are wired through.
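
A C8 adapter could assemble the `GPS_INPUT` field values along the following lines. This is a hedged sketch in plain Python, not a definitive implementation: the flag values and units follow the MAVLink common dialect's `GPS_INPUT` and `GPS_INPUT_IGNORE_FLAGS` definitions, while `build_gps_input` and its argument names are hypothetical. It marks HDOP/VDOP as not provided and leaves the accuracy fields active so they reach the EKF.

```python
from enum import IntFlag


class GpsInputIgnore(IntFlag):
    """GPS_INPUT_IGNORE_FLAGS values from the MAVLink common dialect."""
    ALT = 1
    HDOP = 2
    VDOP = 4
    VEL_HORIZ = 8
    VEL_VERT = 16
    SPEED_ACCURACY = 32
    HORIZONTAL_ACCURACY = 64
    VERTICAL_ACCURACY = 128


def build_gps_input(lat_deg, lon_deg, alt_m, vel_ned_m_s, horiz_acc_m,
                    vert_acc_m, speed_acc_m_s, time_week, time_week_ms,
                    yaw_deg=None, gps_id=0):
    """Assemble GPS_INPUT field values; DOP fields are flagged as not provided."""
    vn, ve, vd = vel_ned_m_s
    if yaw_deg is None:
        yaw_cdeg = 0  # 0 is the "not provided" sentinel
    else:
        yaw_cdeg = round(yaw_deg * 100) % 36000 or 36000  # north is sent as 36000
    return {
        "gps_id": gps_id,
        "ignore_flags": int(GpsInputIgnore.HDOP | GpsInputIgnore.VDOP),
        "time_week": time_week,        # must be > 0 for jitter correction
        "time_week_ms": time_week_ms,
        "fix_type": 3,                 # 3D fix, the minimum noted in the statement
        "lat": round(lat_deg * 1e7),   # degE7
        "lon": round(lon_deg * 1e7),   # degE7
        "alt": alt_m,                  # metres
        "vn": vn, "ve": ve, "vd": vd,  # m/s, NED
        "speed_accuracy": speed_acc_m_s,
        "horiz_accuracy": horiz_acc_m,
        "vert_accuracy": vert_acc_m,
        "yaw": yaw_cdeg,
    }
```

With pymavlink, these values would be passed to `gps_input_send` on the connection's `mav` object; that call is left out here so the sketch stays dependency-free.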
|
||||
|
||||
### Fact #2 — ArduPilot's covariance honesty (AC-NEW-4) is enforced via the `horiz_accuracy` field of `GPS_INPUT`
|
||||
- **Statement**: When `GPS_INPUT_IGNORE_FLAG_HORIZONTAL_ACCURACY` is unset, AP_GPS stores `packet.horiz_accuracy` into `state.horizontal_accuracy` and sets `state.have_horizontal_accuracy = true`. EKF3's quality chain consumes this via (a) ground-stationary 3 m drift check (`_gpsCheckScaler`-modulated), (b) innovation gating (`POS_I_GATE`/`VEL_I_GATE`), (c) soft de-weighting via `EK3_GLITCH_RADIUS` (PR #24135). Under-reporting `horiz_accuracy` defeats these gates — exactly the AC-NEW-4 risk the project flagged.
|
||||
- **Source**: Source #4, Source #23 (PR #24135), Source #24 (AP_NavEKF3 master)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers writing the C5 estimator → C8 adapter
|
||||
- **Confidence**: ✅ (source code + L1 docs); ⚠️ for the precise innovation-gate mechanics (deferred to design-phase SITL tuning)
|
||||
- **Related Dimension**: C5 covariance, AC-NEW-4
|
||||
- **Fit Impact**: **architectural constraint** — the C5 estimator MUST publish honest `horiz_accuracy` (not optimistic) for AP's EKF3 quality chain to function. Aligns directly with AC-1.4 / AC-NEW-4.
|
||||
|
||||
### Fact #3 — ArduPilot supports runtime EKF source-set switching from companion via `MAV_CMD_SET_EKF_SOURCE_SET`
|
||||
- **Statement**: EKF3 supports up to three source sets (`EK3_SRC1..3_*`). A companion can request a switch by sending `MAV_CMD_SET_EKF_SOURCE_SET`. Alternative paths: RC aux-switch option 90 ("EKF Pos Source"), Lua scripts (e.g., `ahrs-source.lua`). **Caveat from L1 docs**: "no GCSs are currently known to implement this" — companion-driven switching works at the firmware level but is not exposed in stock GCS UIs.
|
||||
- **Source**: Source #2, Source #3
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers handling AC-NEW-2 spoof-promotion path on ArduPilot
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8 + AC-NEW-2
|
||||
- **Fit Impact**: **supports selection** — AP allows the project to model two source sets (set 1 = real GPS, set 2 = onboard `GPS_INPUT`) and switch automatically. Keeps companion lightweight; switching does not require the companion to suppress real-GPS itself.
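A sketch of the two-source-set model described above, under stated assumptions: set 1 is configured for the physical receiver and set 2 for the companion's `GPS_INPUT`, and the requested set travels in `param1` of a `COMMAND_LONG`. The numeric command ID is deliberately left as a caller-supplied value to be read from the generated MAVLink dialect; `CommandLong`, `select_source_set`, and `ekf_source_set_command` are hypothetical names.

```python
from dataclasses import dataclass

REAL_GPS_SET = 1     # assumption: EK3_SRC1_* points at the physical receiver
ONBOARD_GPS_SET = 2  # assumption: EK3_SRC2_* points at the companion's GPS_INPUT


@dataclass(frozen=True)
class CommandLong:
    """Plain container for the fields a COMMAND_LONG sender would need."""
    command: int
    params: tuple  # always seven floats, param1..param7


def select_source_set(spoof_suspected: bool) -> int:
    """Pick the EKF source set the companion wants active."""
    return ONBOARD_GPS_SET if spoof_suspected else REAL_GPS_SET


def ekf_source_set_command(command_id: int, source_set: int) -> CommandLong:
    """Build the switch request; EKF3 supports source sets 1..3 (Fact #3)."""
    if source_set not in (1, 2, 3):
        raise ValueError(f"source set must be 1, 2 or 3, got {source_set}")
    return CommandLong(command_id, (float(source_set),) + (0.0,) * 6)
```

Keeping the decision (`select_source_set`) separate from message construction makes the AC-NEW-2 latency measurable at a single call boundary in SITL.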
|
||||
|
||||
### Fact #4 — ArduPilot ODOMETRY-velocity-only fusion is currently NOT supported (open enhancement)
|
||||
- **Statement**: Issue #23485 confirms current limitation: feeding `ODOMETRY` without position causes EKF position-estimate timeout / failsafe. Implication: the project's `visual_propagated` mode (VO drift between satellite anchors, no global position) **cannot be expressed as ODOMETRY-velocity-only on current AP** — must be sent as a full `GPS_INPUT` with covariance widened to reflect drift uncertainty.
|
||||
- **Source**: Source #8
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers
|
||||
- **Confidence**: ✅ (enhancement request still open as of the access date)
|
||||
- **Related Dimension**: C5 + C8 + AC-1.3 (`visual_propagated` label) + AC-1.4 (covariance ellipse)
|
||||
- **Fit Impact**: **architectural constraint** — `visual_propagated` and `dead_reckoned` labels both ride `GPS_INPUT` with growing `horiz_accuracy`, NOT a separate `ODOMETRY` channel. Single-message contract = simpler. AC-NEW-8 thresholds (`horiz_accuracy = 999.0` for "no fix") map directly.
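A sketch of the single-message contract, assuming a simple linear drift model: `horiz_accuracy` grows with distance flown since the last satellite anchor and is capped at the `999.0` sentinel. The drift fraction is a hypothetical tuning value, not a measured one. The 100 m threshold for degrading to a 2D fix follows the AC-NEW-8 language quoted elsewhere in this document, and the `fix_type` values use the MAVLink `GPS_INPUT` convention (0-1 = no fix, 2 = 2D, 3 = 3D).

```python
from typing import Optional, Tuple

NO_FIX_SENTINEL_M = 999.0    # "no fix" value named in Fact #4
DEGRADE_THRESHOLD_M = 100.0  # AC-NEW-8: 2D fix or worse above this covariance


def propagated_horiz_accuracy(anchor_accuracy_m: float,
                              drift_fraction: float,
                              distance_since_anchor_m: float) -> float:
    """Widen the reported accuracy linearly with distance since the last anchor.

    The linear model and `drift_fraction` are illustrative assumptions.
    """
    grown = anchor_accuracy_m + drift_fraction * distance_since_anchor_m
    return min(grown, NO_FIX_SENTINEL_M)


def gps_input_quality(horiz_accuracy_m: Optional[float]) -> Tuple[int, float]:
    """Map the estimator's accuracy to (fix_type, horiz_accuracy) for GPS_INPUT."""
    if horiz_accuracy_m is None or horiz_accuracy_m >= NO_FIX_SENTINEL_M:
        return 0, NO_FIX_SENTINEL_M           # no usable estimate
    if horiz_accuracy_m > DEGRADE_THRESHOLD_M:
        return 2, horiz_accuracy_m            # degraded: report 2D fix
    return 3, horiz_accuracy_m                # healthy: report 3D fix
```

For example, a 5 m anchor accuracy with 2% drift over 1 km yields 25 m, which still reports a 3D fix; the same drift over 5 km crosses the 100 m threshold and degrades to a 2D fix.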
|
||||
|
||||
### Fact #5 — iNav firmware (master, post-9.0) has NO inbound MAVLink handler for any external-positioning message
|
||||
- **Statement**: Authoritative inbound switch in `src/main/telemetry/mavlink.c::processMAVLinkIncomingTelemetry` (master) handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub reply), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **No `GPS_INPUT`, `VISION_POSITION_ESTIMATE`, `ODOMETRY`, `GLOBAL_POSITION_INT`, or `GPS_RAW_INT` are accepted as inputs.** Wiki page (Source #10) confirms: "Limited command support: Commands that are not implemented are ignored."
|
||||
- **Source**: Source #9 (master code), Source #10 (wiki, edited 2025-12-11)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers + AC-4.3 author
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, AC-4.3
|
||||
- **Fit Impact**: **DISQUALIFIES the literal AC-4.3 wording** ("the standard external-positioning message type(s) accepted by ArduPilot AND iNav"). No single MAVLink external-positioning message is accepted by both FCs. Project must adopt a per-FC adapter design and AC-4.3 must be revised to acknowledge two transports.
|
||||
|
||||
### Fact #6 — iNav accepts external GPS injection via two MSP paths; `MSP2_SENSOR_GPS` is the covariance-rich path
|
||||
- **Statement**: `MSP_SET_RAW_GPS (201)` (legacy MSP1, 14 bytes): fixType, numSat, lat, lon, alt (m, internal cm), speed (cm/s). **No covariance, no per-axis velocity, no yaw.** `MSP2_SENSOR_GPS (7939, MSPv2 sensor plugin)`: instance, gpsWeek, msTOW, fixType, satellitesInView, hPosAccuracy (mm), vPosAccuracy (mm), hVelAccuracy (cm/s), hdop, lat, lon, mslAltitude (cm), nedVelNorth/East/Down (cm/s), groundCourse (cdeg×100), trueYaw (cdeg×100), date+time. Routes through `mspGPSReceiveNewData()` via `GPS_PROVIDER_MSP`. Requires build flag `USE_GPS_PROTO_MSP` — **enabled by default in iNav's `target/common.h`**, so stock firmware reaches this path.
|
||||
- **Source**: Source #12 (MSP message reference, master), Source #13 (target/common.h master + gps.c provider table)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers (C8 adapter, MSP transport)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, C5 covariance contract
|
||||
- **Fit Impact**: **supports selection** of `MSP2_SENSOR_GPS` for the iNav adapter. Covariance fields (`hPosAccuracy`, `vPosAccuracy`, `hVelAccuracy`) align semantically with `GPS_INPUT.horiz_accuracy` / `vert_accuracy` / `speed_accuracy`, but unit conversions differ (mm vs m). The C8 adapter must therefore be FC-aware, not protocol-monomorphic.
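The unit mismatch can be sketched as a small conversion layer, assuming the estimator works in metres and m/s (the `GPS_INPUT` convention) and the iNav side wants the units recorded in this fact (mm for position accuracy, cm/s for velocity accuracy). Rounding up rather than to nearest is a design assumption made here so that integer conversion never under-reports uncertainty. Field widths and clamping are not modelled, and the function names are hypothetical.

```python
import math


def m_to_mm_ceil(value_m: float) -> int:
    """Metres to whole millimetres, rounded up (never under-report)."""
    if value_m < 0:
        raise ValueError("accuracy must be non-negative")
    return math.ceil(value_m * 1000.0)


def mps_to_cmps_ceil(value_mps: float) -> int:
    """Metres per second to whole centimetres per second, rounded up."""
    if value_mps < 0:
        raise ValueError("accuracy must be non-negative")
    return math.ceil(value_mps * 100.0)


def msp2_accuracy_fields(horiz_accuracy_m: float,
                         vert_accuracy_m: float,
                         speed_accuracy_mps: float) -> dict:
    """Translate GPS_INPUT-style accuracies into MSP2_SENSOR_GPS field values."""
    return {
        "hPosAccuracy": m_to_mm_ceil(horiz_accuracy_m),
        "vPosAccuracy": m_to_mm_ceil(vert_accuracy_m),
        "hVelAccuracy": mps_to_cmps_ceil(speed_accuracy_mps),
    }
```

Keeping the conversion in one place means the C5 estimator can publish a single covariance contract while each FC adapter owns its own units.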
|
||||
|
||||
### Fact #7 — iNav does NOT support dual-GPS arbitration; companion must be the SOLE GPS source
|
||||
- **Statement**: Issue #10141 is an open feature request for dual-GPS support. Current iNav (master incl. 9.0.x) has single-GPS architecture with one UART selected as the GPS port. There is no primary/secondary failover and no per-instance arbitration in the nav stack.
|
||||
- **Source**: Source #14
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers (architecture)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, C5, AC-NEW-2 (spoof promotion)
|
||||
- **Fit Impact**: **architectural constraint** — on iNav, real GPS receivers must NOT be wired directly to the FC. Real GPS goes to the companion; the companion fuses (or rejects) it and emits the single iNav-facing feed via MSP2_SENSOR_GPS (or via a UBX-emulation UART). AC-NEW-2 latency on iNav = companion's internal reaction time only; iNav does not participate in source switching at all.
|
||||
|
||||
### Fact #8 — iNav explicitly does NOT validate GPS for spoofing; anti-spoofing is fully the companion's responsibility
|
||||
- **Statement**: iNav's `docs/GPS_fix_estimation.md` states verbatim: "Not a solution for GPS spoofing (GPS output is not validated in INAV)." Combined with Fact #7, the architectural conclusion on iNav: companion = anti-spoofing oracle + nav-camera estimator + IMU-propagation source, all collapsed into the single MSP2_SENSOR_GPS feed.
|
||||
- **Source**: Source #15
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers; AC-NEW-2 / AC-3.5 / AC-NEW-8 owners
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: AC-NEW-2, AC-3.5, AC-NEW-8
|
||||
- **Fit Impact**: **supports selection** of "companion as iNav's only GPS"; **disqualifies** any architecture that relies on iNav-side spoof detection for AC-NEW-2 reaction.
|
||||
|
||||
### Fact #9 — iNav dead-reckoning has documented stability bugs under intermittent feeds; AC-NEW-8 must avoid letting iNav enter dead-reckoning
|
||||
- **Statement**: Issue #10588 documents porpoising and motor-burst behaviour during intermittent GPS outages on iNav fixed-wing dead-reckoning. The community recommendation captured in the issue: "GPS should be rejected if providing erroneous coordinates rather than no fix." `inav_allow_dead_reckoning` (default OFF) and `inav_allow_gps_fix_estimation` (default OFF) are both fixed-state booleans — entering dead-reckoning mid-flight is a discrete transition, not a smooth degrade.
|
||||
- **Source**: Source #15, Source #16 (Settings.md), Source #17 (#10588)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers; AC-NEW-8 owner
|
||||
- **Confidence**: ✅ for setting names; ⚠️ for severity of stability bug (single open issue)
|
||||
- **Related Dimension**: AC-NEW-8, AC-3.5, C8
|
||||
- **Fit Impact**: **architectural constraint** — on iNav, the AC-NEW-8 path must keep emitting `MSP2_SENSOR_GPS` with growing `hPosAccuracy` rather than letting the feed drop and iNav switch to dead-reckoning. "No fix" on iNav must be expressed via the `fixType` field of `MSP2_SENSOR_GPS`, not by silence. Apart from `fixType`, the horizontal/vertical accuracy fields are the only degradation signal available, and iNav has no equivalent of the AP `horiz_accuracy = 999.0` "no fix" sentinel, so it must be verified which `fixType` enum values iNav treats as no-fix.
|
||||
|
||||
### Fact #10 — iNav supports UBX-only over UART (NMEA dropped in 7.0); UBX emulation is a viable third transport
|
||||
- **Statement**: iNav 7.0 removed NMEA. iNav now supports only the u-blox UBX protocol, and 9.0+ requires protocol version ≥ 15.00. Recommended physical receivers: u-blox M8/M9/M10. The companion can implement a UBX-emulation writer on the iNav GPS UART (NAV-PVT mandatory; NAV-DOP optional). UBX carries `hAcc`/`vAcc`/`headAcc`/velocity components, so covariance honesty is preserved.
|
||||
- **Source**: Source #11 (iNav GPS-and-Compass-setup wiki)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers (transport-choice)
|
||||
- **Confidence**: ✅ for UBX-only; ⚠️ for "minimum NAV-* set": the canonical u-blox protocol spec (Source filed in agent-tools as `fd8513f8-...txt`) plus iNav's `gps_ublox.c` determine the precise message set; **resolving it requires a follow-up search before final selection**.
|
||||
- **Related Dimension**: C8 transport choice
|
||||
- **Fit Impact**: **alternate candidate, NOT YET SELECTED** — UBX path bypasses MSP queueing/arbitration concerns and treats the companion as a normal GPS to iNav. Trade-off: implementation cost (UBX writer + correct ACK behaviour) vs. MSP path (already-designed wire format, but iNav-specific).
|
||||
|
||||
---
|
||||
|
||||
## SQ6 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Per-FC adapter design is unavoidable (single-message AC-4.3 wording is unsatisfiable)
|
||||
|
||||
| FC | Inbound external-positioning transport | Message | Covariance fields | Per-axis velocity | Yaw | Source-switching from companion |
|---|---|---|---|---|---|---|
| **ArduPilot Plane** | MAVLink (TELEM/USB/UDP serial) | `GPS_INPUT` (id 232), primary | `horiz_accuracy` (m), `vert_accuracy` (m), `speed_accuracy` (m/s) | `vn`, `ve`, `vd` (m/s) | `yaw` (cdeg; 0 = not provided) | `MAV_CMD_SET_EKF_SOURCE_SET` (firmware supports it; stock GCS UIs do not; companion-driven switching is OK) |
| **iNav** | MSP2 (UART/USB) | `MSP2_SENSOR_GPS` (id 7939), primary candidate | `hPosAccuracy` (mm), `vPosAccuracy` (mm), `hVelAccuracy` (cm/s) | `nedVelNorth/East/Down` (cm/s) | `trueYaw` (cdeg×100) | **N/A**: iNav has a single-GPS architecture; companion = sole GPS source |
| iNav alt 1 | MSP1 | `MSP_SET_RAW_GPS` (id 201), **rejected for production** | none | none | none | N/A |
| iNav alt 2 | UART | UBX emulation (NAV-PVT etc.), **alternate candidate, requires NAV-* subset verification** | UBX `hAcc`/`vAcc`/`headAcc` (mm/cm/scale) | NED in NAV-PVT | yes | N/A |
|
||||
|
||||
**Selection (preliminary, pending Step 7.5 component-fit gate):**
|
||||
- **AP path**: `GPS_INPUT` — Selected (lead).
|
||||
- **iNav path**: `MSP2_SENSOR_GPS` — Selected (lead). UBX-emulation kept as fallback if MSP2_SENSOR_GPS proves rate-limited or quality-flag-lossy.
|
||||
|
||||
### AC / Restriction binding (per-mode, Per-Mode API Capability Verification rule)
|
||||
|
||||
| Numbered AC / Restriction | AP `GPS_INPUT` | iNav `MSP2_SENSOR_GPS` | iNav `MSP_SET_RAW_GPS` |
|---|---|---|---|
| AC-1.4 (95% cov + source label `{satellite_anchored, visual_propagated, dead_reckoned}`) | **Pass** (`horiz_accuracy` carries 95% covariance proxy; source label is companion-side metadata, not in MAVLink; emit via STATUSTEXT/NAMED_VALUE_FLOAT) | **Pass** (`hPosAccuracy` = covariance proxy; same off-band source-label channel) | **Fail** (no covariance field → cannot publish 95% ellipse) |
| AC-NEW-4 (false-position safety budget; covariance honesty) | **Pass** (de-weighted via `EK3_GLITCH_RADIUS` if covariance is honest) | **Verify** (need to confirm iNav nav-stack actually uses `hPosAccuracy` for outlier handling; pre-Step-7.5 follow-up) | **Fail** |
| AC-NEW-2 (<3 s p95 spoof promotion) | **Verify** via SITL (`MAV_CMD_SET_EKF_SOURCE_SET` round-trip latency under load) | **Pass** by architecture (companion is sole GPS, no FC-side switch needed) | Pass-by-arch but Fails AC-1.4 |
| AC-NEW-8 (visual-blackout + spoofed GPS failsafe; covariance growth + degraded fix levels) | **Pass** (`fix_type` 0/1/2 + `horiz_accuracy=999.0` documented sentinel maps to AC-NEW-8 thresholds) | **Verify** (iNav's `fixType` enum mapping for "no fix"; pre-Step-7.5 follow-up) | **Fail** (no graceful degrade signal) |
| AC-3.5 (label switch within ≤1 frame OR ≤400 ms; reject spoofed GPS as input) | **Pass** by architecture (EKF source switch + STATUSTEXT) | **Pass** by architecture (companion suppresses spoofed-GPS contribution upstream) | Pass-by-arch but Fails AC-1.4 |
| AC-4.3 (FC accepts the chosen messages) | **Pass** | **Pass** (default build, `USE_GPS_PROTO_MSP` on) | **Pass** but Fails AC-1.4; discard |
| Restriction "Supported FCs: ArduPilot, iNav (both via standard MAVLink)" | **Pass** | **Fail** of "via standard MAVLink": the restriction's literal wording is incorrect because iNav has no inbound MAVLink external-positioning. The restriction must be revised to "ArduPilot via MAVLink GPS_INPUT; iNav via MSP2_SENSOR_GPS". | n/a |
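The binding matrix can be read mechanically. The sketch below encodes only the numbered-AC rows (the restriction row concerns wording, not candidate fitness) and applies an assumed gate: any `Fail` rejects a candidate, any remaining `Verify` leaves it open. The gate rule and the names `BINDINGS` and `gate` are illustrative assumptions, not the engine's actual rule implementation.

```python
# Verdicts copied from the AC rows of the binding table.
BINDINGS = {
    "AP GPS_INPUT": {
        "AC-1.4": "Pass", "AC-NEW-4": "Pass", "AC-NEW-2": "Verify",
        "AC-NEW-8": "Pass", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
    "iNav MSP2_SENSOR_GPS": {
        "AC-1.4": "Pass", "AC-NEW-4": "Verify", "AC-NEW-2": "Pass",
        "AC-NEW-8": "Verify", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
    "iNav MSP_SET_RAW_GPS": {
        "AC-1.4": "Fail", "AC-NEW-4": "Fail", "AC-NEW-2": "Pass",
        "AC-NEW-8": "Fail", "AC-3.5": "Pass", "AC-4.3": "Pass",
    },
}


def gate(verdicts: dict) -> str:
    """Assumed gate: Fail dominates, then Verify, otherwise accepted."""
    values = set(verdicts.values())
    if "Fail" in values:
        return "rejected"
    if "Verify" in values:
        return "open"
    return "accepted"


def blocking_items(verdicts: dict) -> list:
    """ACs that are not yet a clean Pass, in table order."""
    return [ac for ac, verdict in verdicts.items() if verdict != "Pass"]
```

Under this gate both lead candidates stay open until their `Verify` items close, and `MSP_SET_RAW_GPS` is rejected on AC-1.4, AC-NEW-4, and AC-NEW-8.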
|
||||
|
||||
### Required AC / Restrictions edits flagged for user review
|
||||
|
||||
1. **AC-4.3** — current text says "the standard external-positioning message type(s) accepted by ArduPilot and iNav". Reality: no single message type is accepted by both. **Proposed revision** (outcome-shaped, IEEE-830-style): "WGS84 coordinates are delivered to each supported FC via that FC's documented external-positioning interface — MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav. Honest covariance is carried in the field each FC uses for outlier rejection (under-reported covariance is a defect — see AC-NEW-4). Source-label semantics per AC-1.4 are emitted out-of-band (FC-appropriate STATUSTEXT / NAMED_VALUE_FLOAT / equivalent)."
|
||||
2. **Restriction "Communication protocol (pinned): MAVLink for both FC and GCS"** — incorrect for iNav. **Proposed revision**: "Communication protocol: MAVLink for ArduPilot Plane and for QGroundControl GCS; MSP2 for iNav (UART or USB transport). MAVLink remains the GCS-facing protocol for both FCs." (iNav still emits MAVLink telemetry outbound to QGC; this is preserved.)
|
||||
3. **AC-NEW-2** — keep numerical budget (<3 s p95) but split per-FC validation: ArduPilot validation = SITL round-trip of `MAV_CMD_SET_EKF_SOURCE_SET` from companion under spoof injection; iNav validation = companion-internal reaction time (companion-only metric — iNav doesn't participate).
|
||||
4. **AC-NEW-8** — language "fix-quality 2D fix or worse when covariance > 100 m" maps to `GPS_INPUT.fix_type` for AP. iNav's `fixType` enum mapping (per `gpsFixType_e` in iNav's enums-reference) must be confirmed at design time before this AC is testable on iNav.
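For the AC-NEW-2 split in item 3, both per-FC validations reduce to the same check on a list of measured promotion latencies. A minimal sketch, assuming the nearest-rank definition of the 95th percentile (the AC text does not say which percentile estimator applies) and hypothetical function names:

```python
BUDGET_S = 3.0  # AC-NEW-2 numerical budget: p95 below 3 s


def p95_nearest_rank(samples_s: list) -> float:
    """95th percentile by the nearest-rank method."""
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def meets_ac_new_2(samples_s: list) -> bool:
    """True when the measured p95 promotion latency is under the budget."""
    return p95_nearest_rank(samples_s) < BUDGET_S
```

On ArduPilot the samples would come from SITL round-trips of the source-set switch; on iNav they would be companion-internal reaction times, since the FC does not participate.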
|
||||
|
||||
### Open follow-up probes (deferred to SQ8 + design phase, NOT blocking SQ6 closure)
|
||||
|
||||
- **(SQ8)** Confirm the precise MAVLink message + field set ArduPilot exposes for spoofing/jamming integrity reports (PR #2110 merged, but `GPS_RAW_INT` in current published common.xml shows no spoofing bits — likely lives in a sibling message such as `GPS_INTEGRITY`). This is the FC→companion direction needed for AC-NEW-2's input side and AC-3.5's spoofing detection.
|
||||
- **(SQ8)** UBX-emulation minimum NAV-* subset for iNav 9.0 (UBX ≥ 15.00). Authoritative inputs: u-blox protocol spec (cached) + iNav `gps_ublox.c` (cached). Expected output: a "minimum companion-side UBX writer" definition.
|
||||
- **(design)** SITL parameter sets for both FCs for AC-NEW-2 / AC-NEW-8 validation. Out of research scope.
|
||||
- **(design)** Verify iNav nav-stack consumption of `MSP2_SENSOR_GPS.hPosAccuracy` for outlier handling (read `src/main/io/gps_msp.c` / `mspGPSReceiveNewData` in design phase, not research phase).
|
||||
|
||||
### Boundary check: this SQ6 is saturated for the architectural decision
|
||||
|
||||
Saturation signals observed: ArduPilot side covered by L1 docs + L1 source code; iNav side covered by L1 source code (master) + L1 wiki (edited 2025-12-11) + L1 release notes (8.0/9.0). Three independent rounds of search yielded the same architectural conclusion (no inbound external-positioning MAVLink on iNav). Last queries returned no novel facts. Per `references/source-tiering.md` "Search saturation rule" → SQ6 is closed pending the SQ8 follow-up probes above; user decision required on the AC/restriction edits before further architectural work.
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Existing / competitor GPS-denied UAV navigation systems
|
||||
|
||||
### Fact #11 — Twist Robotics OSCAR is a deployed Ukrainian peer system in the same architectural class as this project
|
||||
- **Statement**: Twist Robotics (Ukraine) has a fielded camera + map-matching navigation module called OSCAR (Optical System of Coordinates with Automatic Relocalisation). The vendor states the system "captures the terrain, identifies landmarks, compares them with a map, determines coordinates, and transmits them to the autopilot as a reliable GPS signal", which is the same five-stage architecture this project is building. Vendor-stated specs: ≤20 m accuracy without cumulative error, day/night/fog operation, and operational deployment of "more than 500,000 km across 25,000 combat missions over 24 months". Hardware includes active cooling, indicating non-trivial onboard compute (likely Jetson-class). **No public independent benchmark of the 20 m figure exists.**
|
||||
- **Source**: Source #25, Source #26
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + AC owners (existence-of-peer evidence, not implementation guide)
|
||||
- **Confidence**: ✅ for "deployed at scale on Ukrainian combat platforms"; ⚠️ for "20 m accuracy" (vendor self-report); ❓ for "fully resistant to spoofing and jamming" (claim not independently verified)
|
||||
- **Related Dimension**: SQ1, SQ8 (anti-spoofing claim audit), SQ9 (synthesis — ours must beat or at least match this in the operational regime)
|
||||
- **Fit Impact**: **establishes feasibility floor** — a Ukrainian peer is operating a similar architecture against the same threat environment our system targets. Project framing must explicitly differentiate (e.g., 1 km AGL vs unspecified OSCAR altitude; 8 h endurance vs unspecified OSCAR endurance; AC-NEW-4 honest covariance contract vs OSCAR's unspecified covariance reporting).
|
||||
|
||||
### Fact #12 — Auterion Artemis is a production-shipping fixed-wing one-way attack drone with Ukraine-validated GPS-denied navigation, defining the production benchmark for this class
|
||||
- **Statement**: Auterion completed the US Defense Innovation Unit Artemis program in October 2025, delivering a Shahed-class deep-strike drone with up to 1,000-mile range and up to 40 kg warhead, running on **Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance**. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. Manufacturing is being established in US, UA, and DE; Auterion is offering the system to the US Department of War and allied nations.
|
||||
- **Source**: Source #31; Source #32 confirms Skynode S sibling architecture (NPU-equipped companion).
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects (production-pattern reference)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (closest commercial production peer), SQ9 (architecture template)
|
||||
- **Fit Impact**: **establishes production reference architecture** — companion-class autopilot + visual navigation + terminal guidance is shipping at production scale to a US defense customer. Implication: building a per-FC adapter (project decision in SQ6) is consistent with what production stacks already do; integrating against the Artemis architecture is realistic; competing on price + Ukraine-specific operational tuning + AC-NEW-4 honest-covariance contract is a viable differentiation.
|
||||
|
||||
### Fact #13 — Vantor Raptor is a production COTS visual-GPS-replacement software suite, demonstrating that "branded sat-tile basemap + on-drone vision software" is a viable commercial pattern
|
||||
- **Statement**: Vantor Raptor product family (Guide / Sync / Ace) provides vision-based GPS replacement using the drone's existing camera plus Vantor's "100 million-plus sq km of highly accurate 3D terrain data" (Vivid Terrain, vendor-stated 3 m accuracy). Vendor-demonstrated absolute accuracy: **<7 m in all dimensions** for aerial position (Guide), **<3 m** for ground coordinate extraction (Sync, Ace). Works at night and at low altitudes. Platform-agnostic, deployable on commodity hardware, integrates with existing onboard cameras. Inertial Labs has published a VINS-integrated Raptor Guide white paper. Recent partnerships: Niantic Spatial (Dec 2025) for unified air-to-ground positioning in GPS-denied areas; Maxar partnership with AIDC (Sep 2025) for Taiwan UAV resilience against GPS interference.
|
||||
- **Source**: Source #30
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Architecture / business decision-makers (build-vs-buy framing)
|
||||
- **Confidence**: ✅ for product existence + claimed accuracy bounds (vendor primary); ⚠️ for whether Vantor's commercial accuracy figures hold under the project's specific Ukrainian-steppe + active-conflict-tile-staleness conditions
|
||||
- **Related Dimension**: SQ1 (commercial), C2/C3 (commercial alternatives to building ourselves), SQ8 (basemap as a service vs offline cache)
|
||||
- **Fit Impact**: **build-vs-buy lens** — Raptor Guide's <7 m claim is *better* than the project's AC-1.1 budget (≤80 m / 95% under AC-1.1.1), so it's not a disqualifier on accuracy. Reasons we still build vs buy: (a) Vantor is a US vendor; export / dual-use licensing into the Ukrainian battlefield is uncertain; (b) restrictions specify offline cache from the project's own Azaion Suite Satellite Service (AC-2.x), not Vantor's Vivid Terrain — replacing the basemap is non-negotiable; (c) covariance honesty contract (AC-NEW-4) and source-label contract (AC-1.4) are project-specific and may not be exposed by Vantor's API. **Outcome**: keep Raptor as a competitive comparator in `solution_draft01`, NOT as a candidate component to integrate.
|
||||
|
||||
### Fact #14 — snktshrma/ngps_flight (NGPS — ArduPilot GSoC 2024) is the closest open-source pipeline match to this project's exact C1+C2+C3+C5+C8 stack
|
||||
- **Statement**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**, deep-learning-based feature matching), **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates), **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF (per related ArduPilot issue #23471, this is via `VISION_POSITION_ESTIMATE` requiring EKF source-set 2/3 with `EK3_SRC*_POSXY=Vision`). Project is published under ArduPilot's GSoC 2024 program. Sibling `ap_nongps` is an earlier OpenCV-based prototype.
|
||||
- **Source**: Source #33
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer
|
||||
- **Confidence**: ✅ for project existence, component breakdown, and matcher choice (LightGlue+SuperPoint); ⚠️ for runtime behaviour under our exact constraints (Jetson Orin Nano, 1 km AGL, 17 m/s, 3 fps); ❓ for production hardening / covariance honesty / spoof-defence (none documented)
|
||||
- **Related Dimension**: SQ1 (closest open-source peer), SQ2 (canonical pipeline confirmation), SQ3+SQ4 (architectural template for component candidate matrix), SQ6 (alternate AP transport debate)
|
||||
- **Fit Impact**: **architectural template** — confirms the project's split (C1 VIO ↔ C2/C3 visual absolute ↔ C5 fusion ↔ C8 FC adapter) is canonical, not novel. Two concrete deltas:
|
||||
1. **Transport choice on AP**: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT` because it carries `horiz_accuracy` directly, supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET`, and avoids EKF-source-set reconfiguration. The trade-off (NGPS's path vs SQ6's pick) must be re-examined at design time before final AP-transport selection.
|
||||
2. **Estimator choice**: NGPS uses UKF; SQ3/SQ4 will compare UKF vs ESKF vs MSCKF vs factor-graph (GTSAM) on the same matrix.
|
||||
|
||||
### Fact #15 — RGB satellite-image matching as a *low-altitude* (<25 m AGL) localization technique is unreliable per the SPRIN-D Challenge; our 1 km AGL profile operates in the regime where the same authors note it "works reasonably well"
|
||||
- **Statement**: The CTU Prague team's SPRIN-D winning paper directly states: *"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."* (referring to <25 m AGL). The paper's related-work review separately notes that *"high-altitude matching... works reasonably well, but at low altitudes (25 m) the viewpoint differs drastically, making roofs, facades, and vegetation inconsistent with satellite imagery."* The project operates at ≤1 km AGL — which is the *high-altitude* regime in the paper's terminology — making RGB sat-matching the appropriate technique class. The paper's CPU-only winning method (LiDAR heightmap-gradients + clustered particle filter) is **not** transferable to our hardware: our project has no LiDAR.
|
||||
- **Source**: Source #28
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer + Domain expert
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1, SQ5 (failure modes), SQ2 (canonical pipeline)
|
||||
- **Fit Impact**: **disambiguates a potentially-disqualifying lesson** — the CTU paper's "RGB sat-matching is unreliable" finding does NOT disqualify our approach because the failure was caused by low-altitude viewpoint mismatch, which our 1 km AGL regime does not have. This must be cited explicitly in `solution_draft01` to pre-empt the natural objection from anyone who reads the paper. Separately, the CTU paper's specific lessons are still binding: VIO degrades catastrophically without IMU vibration isolation; magnetometer is unreliable near steel/concrete; "ability to recover from periods of high uncertainty and re-localize" matters more than instantaneous RMSE — this last lesson is a direct architectural input for AC-NEW-2 / AC-NEW-8.
|
||||
|
||||
### Fact #16 — RTAB-Map and ORB-SLAM3 both degrade beyond 1 km, and RTAB-Map loses odometry above 2 m/s, in the SPRIN-D team's simulator tests; our cruise profile (≤17 m/s, kilometers between satellite anchors) excludes both as primary candidates
|
||||
- **Statement**: The SPRIN-D paper states: *"We tested state-of-the-art visual SLAM systems such as RTAB-Map and ORB-SLAM3 in a high-fidelity simulator, and found that both performance degraded significantly in a long-range scenario (beyond 1 km), as their memory and compute demands grow with the size of the environment. Moreover, RTAB-Map was unable to maintain quality odometry in faster flight speeds (beyond 2 m/s), while ORB-SLAM3 suffered from tracking loss in textureless areas."*
|
||||
- **Source**: Source #28
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer (component selection for C1)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1, SQ3+SQ4 component C1 (VO/VIO), SQ5 (failure modes)
|
||||
- **Fit Impact**: **prunes the C1 candidate landscape** — RTAB-Map and ORB-SLAM3 should not be pursued as C1 leads. Plausible C1 leads remain: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline (KLT + RANSAC homography). NGPS (Fact #14) uses `ap_vips` = OpenVINS-class VIO — confirming an aligned community choice. Final C1 selection happens in SQ3+SQ4.
|
||||
|
||||
### Fact #17 — DSMAC + TERCOM lineage: pre-cached scene matching for downward-looking navigation is a 40+ year deployed technique class with documented sub-10 m terminal accuracy
|
||||
- **Statement**: DSMAC (Digital Scene Matching Area Correlator) is an autonomous missile-guidance system based on area correlation of sensed downward-camera ground scenes against pre-stored reference imagery (often satellite reconnaissance). It achieves 3–10 m terminal accuracy by correlating buildings, road intersections, and distinctive terrain landmarks. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight + DSMAC for terminal guidance reduces CEP from ~30 m to "only meters". Documented combat record: 1991 Gulf War, >80% of 280 launched Tomahawks hit target. Recent miniaturisation: Destinus Ruta (300 km strike-class) is integrating UAV Navigation's (Spanish, Grupo Oesía) DSMAC-class system, validated in Ukrainian combat conditions including GNSS-denied / jamming / spoofing.
|
||||
- **Source**: Source #36, Source #27
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Domain expert + Decision-maker
|
||||
- **Confidence**: ✅ for the lineage and Tomahawk performance numbers (DTIC + open-source); ⚠️ for the Ruta-specific "DSMAC operating principle" inference (Defense Express analyst inference, not vendor disclosure)
|
||||
- **Related Dimension**: SQ1 (lineage), SQ8 (baseline accuracy expectations for AC-1.1.1 80 m / AC-NEW-4 false-position budget)
|
||||
- **Fit Impact**: **establishes baseline accuracy expectations** — the technique class has documented sub-10 m accuracy in the cruise-missile-terminal regime. Our budget (AC-1.1.1: <80 m at 1 km AGL with ≥0.5 m/px tiles) is loose by comparison, indicating that the AC budget is *not* aggressive against the technique-class baseline — it is aggressive against the Jetson Orin Nano + 8-h-continuous + 25 W envelope. **Implication for AC-NEW-4**: claiming P(error >500 m) <0.1% per flight is consistent with the DSMAC-lineage class; an honestly-reported failure rate at this level is realistic, not unprecedented.
|
||||
|
||||
### Fact #18 — Hierarchical Image Matching (arXiv 2506.09748, June 2025) is a current academic SOTA pipeline for our exact problem, but uses DINOv2 — a heavyweight foundation model that must be benchmarked under our 25 W / 8 GB Jetson envelope before any selection
|
||||
- **Statement**: 2025 academic SOTA pipeline structure: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation); (2) Semantic-Aware and Structure-Constrained Matching Module (SASCM) using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv; (3) lightweight fine-grained matching module for pixel-level. Constructs UAV absolute visual localization without VIO/relative-localization dependence (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV dataset claims superior accuracy under cross-source and cross-temporal variation.
|
||||
- **Source**: Source #29
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer + Domain expert
|
||||
- **Confidence**: ✅ for pipeline structure and method; ⚠️ for "superior" claim (single-paper benchmark; AerialExtreMatch evaluates 16 methods with broader rigor — Source #34 is the better cross-method ranker); ❓ for Jetson-Orin-Nano runtime (no published number)
|
||||
- **Related Dimension**: SQ1 (academic SOTA), C2 (VPR), C3 (cross-domain registration), SQ5 (foundation-model-on-Jetson failure mode)
|
||||
- **Fit Impact**: **academic-SOTA snapshot, candidate template** — the retrieval → semantic-aware coarse → fine-grained pipeline is a candidate template for our C2+C3, but DINOv2 introduces a Jetson-deployment risk that must be quantified before commitment. Candidate-level decision: include DINOv2-based pipelines (AnyLoc, BoQ, this paper's SASCM) in the C2/C3 candidate matrix with mandatory MVE on Jetson Orin Nano under our exact frame size and 3 fps cadence. Reject DINOv2 if total inference latency cannot be brought under (400 ms - other-stages budget) at INT8 / fp16. Per Source #28 lesson, classical matchers (LightGlue+SuperPoint as in NGPS) should also be in the matrix as the "simple baseline / known-Jetson-runnable" option.
|
||||
|
||||
### Fact #19 — AerialExtreMatch (2025) is the academic benchmark our C2+C3 candidate matrix must publish numbers against, with 32 difficulty-stratified cells exposing exactly the cross-source / cross-pitch / cross-scale failure modes our project will face
|
||||
- **Statement**: AerialExtreMatch publishes (a) 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); (b) ~30,000 evaluation pairs in **32 difficulty levels** stratified by overlap (4 bins: <20%, 20–40%, 40–60%, >60%), pitch difference (4 bins: 50–55°, 55–60°, 60–65°, 65–70°), and scale variation (2 bins: 1–2×, >2×); (c) a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. The benchmark evaluates 16 representative detector-based and detector-free image matching methods.
|
||||
- **Source**: Source #34
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Domain expert + Implementer
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (academic landscape), SQ7 (datasets), C2 (VPR), C3 (cross-domain registration)
|
||||
- **Fit Impact**: **defines the C2/C3 evaluation matrix** — every C2/C3 candidate going into `solution_draft01` must report numbers on AerialExtreMatch's 32 difficulty cells, with at least the high-pitch (65–70°) and high-scale (>2×) cells representing our worst-case (UAV vs satellite tile geometry mismatch + ortho-rectification residual). The dataset's real-world UAV-localization split with both UAV-orthomosaic AND satellite-map references mirrors our project's offline-cache-tile semantics directly.
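
The 32-cell grid described in this fact can be enumerated mechanically, which is handy when laying out the reporting table for each C2/C3 candidate. The following is a minimal sketch, not the benchmark's own tooling: the bin labels are transcribed from the Statement above, while the function names and the reading of "worst case" as the intersection of the highest pitch bin and the >2× scale bin are assumptions made for illustration.

```python
from itertools import product

# Bin labels transcribed from the AerialExtreMatch summary above.
OVERLAP_BINS = ["<20%", "20-40%", "40-60%", ">60%"]
PITCH_BINS = ["50-55deg", "55-60deg", "60-65deg", "65-70deg"]
SCALE_BINS = ["1-2x", ">2x"]


def difficulty_cells():
    """Return every (overlap, pitch, scale) cell of the evaluation grid."""
    return list(product(OVERLAP_BINS, PITCH_BINS, SCALE_BINS))


def worst_case_cells(cells):
    """Cells with the highest pitch bin AND the >2x scale bin.

    Assumption: 'worst case' is read as the intersection of the two
    conditions named in the Fit Impact; a union would also be defensible.
    """
    return [c for c in cells if c[1] == "65-70deg" and c[2] == ">2x"]


cells = difficulty_cells()
print(len(cells))  # 4 overlap x 4 pitch x 2 scale = 32
for cell in worst_case_cells(cells):
    print(cell)
```

Under this reading the worst-case subset holds four cells, one per overlap bin, which keeps the mandatory reporting set small while still covering the full overlap range.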
|
||||
|
||||
### Fact #20 — DARPA FLA + USAF SBIR establish the US-defense-program tailwind, but do not directly validate the project's specific regime (fixed-wing, ~1 km AGL, sat-tile basemap, 8-h endurance)
|
||||
- **Statement**: DARPA Fast Lightweight Autonomy (FLA) program ran 2015–2018 (Phase 1 Florida 2017; Phase 2 Georgia 2018; complete). It focused on small-quadcopter autonomy at ≤20 m/s through cluttered indoor/outdoor environments using onboard cameras + LIDAR + sonar + IMU, with no GPS, datalink, or pilot. A 2025 retrospective (arXiv 2504.08122) reviews FLA testing methodology and Phase 1 results. A 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) requests visual position and navigation capability for sUAS in GPS-denied environments, confirming that the regulatory + funding environment for this category was active as of 2025.
|
||||
- **Source**: Source #35
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Decision-maker + Domain expert
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (defense-program lineage)
|
||||
- **Fit Impact**: **context only, no direct candidate gain** — FLA pre-dates the project's specific regime by 8 years, focused on a different platform (multirotor) and altitude (low-altitude obstacle avoidance, not 1 km AGL nadir-camera satellite-anchor). Useful only to establish lineage and context. The USAF SBIR datapoint is more directly relevant: confirms that an active US-defense-funded need exists for sUAS visual position + navigation in GPS-denied environments — i.e., the project's market exists outside Ukraine.
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Existing-systems landscape (6 named-and-evidenced peer / adjacent systems)
|
||||
|
||||
| System | Class | Operational regime | Closest match dimension | Closest mismatch dimension | Status as evidence |
|
||||
|---|---|---|---|---|---|
|
||||
| **Twist Robotics OSCAR** (UA) | Deployed Ukrainian peer | Combat-deployed, fixed-wing-class, GPS-denied vision-nav | **Same architecture, same threat environment** | Altitude / endurance / FC / accuracy contract not publicly specified | Closest peer for "feasibility floor" |
|
||||
| **Auterion Artemis** | Production COTS one-way attack drone | Shahed-class, 1000-mile range, 40 kg warhead, Ukraine-validated GPS-denied nav | Same architectural pattern (Skynode + Visual Navigation + terminal guidance) | One-way attack vs reusable; no covariance/source-label contract published | Closest production reference architecture |
|
||||
| **Vantor Raptor (Guide / Sync / Ace)** | Production COTS software suite | Vision-based GPS replacement on existing drone camera + Vivid Terrain 3D basemap | Visual-position software pattern | Vendor-managed sat-tile basemap is not the project's Azaion Suite Satellite Service; no AC-NEW-4 / AC-1.4 contract | Closest commercial peer for "build-vs-buy" framing |
|
||||
| **snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024)** | Open-source research prototype | LightGlue+SuperPoint+UKF+`VISION_POSITION_ESTIMATE` to AP | **Same component split, same FC family** | GSoC prototype, not production; no spoof defence; no covariance honesty | **Closest open-source pipeline match — explicit architectural template** |
|
||||
| **CTU Prague SPRIN-D winner** | Academic / competition | Multirotor, ≤25 m AGL, LiDAR + heightmap gradient + particle filter on CPU | "Recover-from-uncertainty > low-instantaneous-RMSE" lesson; VIO discipline | LiDAR-required, low-altitude regime, no sat-tile basemap | Architectural-pattern reference + cautionary tale |
|
||||
| **Destinus Ruta + UAV Navigation** | Production miniaturised cruise missile | 300 km strike, DSMAC-class, Ukraine-combat-validated | Pre-cached basemap + visual matching + autopilot ingestion | One-way attack, terminal guidance, no covariance contract | Shows DSMAC-class miniaturised into UAV tier |
|
||||
|
||||
### Per-perspective coverage
|
||||
|
||||
| Perspective | Facts supporting | Saturation status |
|
||||
|---|---|---|
|
||||
| **Implementer / Engineer** | Fact #14 (NGPS), Fact #16 (SLAM failure modes), Fact #18 (DINOv2 risk) | Saturated for SQ1 — deeper component-level deep-dives go to SQ3/SQ4 |
|
||||
| **Practitioner / Field (Ukraine)** | Fact #11 (OSCAR), Source #37 (~70% UAV losses to EW), Source #27 (Ruta + UAV Navigation Ukraine combat validation) | Saturated for SQ1 |
|
||||
| **Domain expert / Academic** | Fact #18 (Hierarchical Matching SOTA), Fact #19 (AerialExtreMatch benchmark), Fact #15 (SPRIN-D regime distinction) | Saturated for SQ1 — academic SOTA benchmarking handed off to SQ3/SQ4 + SQ7 |
|
||||
| **Contrarian / Devil's advocate** | Fact #15 (low-altitude RGB matching unreliable lesson), Fact #16 (RTAB-Map / ORB-SLAM3 disqualified), Fact #18 (DINOv2-on-Jetson risk) | Saturated for SQ1 |
|
||||
| **Decision-maker / Business** | Fact #12 (production-ready Auterion), Fact #13 (commercial Vantor build-vs-buy framing), Fact #20 (USAF SBIR market context) | Saturated for SQ1 |
|
||||
|
||||
### Architectural conclusions for `solution_draft01`
|
||||
|
||||
1. **Build-vs-buy stance**: build. Vantor Raptor and Auterion Visual Navigation are commercially superior on hardening + integration but neither exposes the covariance honesty contract (AC-NEW-4) nor uses the project-specified Azaion Suite Satellite Service tile cache (AC-2.x); both are dual-use export risks for the Ukrainian battlefield. NGPS (Fact #14) is the open-source architectural template to learn from but is a GSoC research prototype lacking production hardening, spoof defence, and the covariance-honesty contract. Architectural conclusion: build with NGPS as the template, with project-specific contracts (AC-NEW-4, AC-1.4, AC-NEW-7) and per-FC adapter (SQ6 conclusion) layered on top.
|
||||
2. **Differentiation from OSCAR (Twist Robotics)** must be made explicit in `solution_draft01`: (a) honest covariance contract per AC-NEW-4; (b) explicit `{satellite_anchored, visual_propagated, dead_reckoned}` source-label contract per AC-1.4; (c) AC-NEW-7 cache-poisoning safety budget on tile write-back; (d) ArduPilot Plane + iNav both supported per project's revised AC-4.3.
|
||||
3. **Pipeline canonicalness**: the C1+C2+C3+C4+C5+C8 split is canonical (NGPS + the 2025 hierarchical-matching paper + SPRIN-D winner all use the same shape; only the specific algorithm choices differ). SQ2 will sanity-check this against one more pipeline-survey paper, but this is essentially a low-risk question now.
|
||||
4. **Component-pruning** carried into SQ3/SQ4:
|
||||
- C1: **prune RTAB-Map and ORB-SLAM3** as primary candidates per Fact #16. Carry: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline.
|
||||
- C2/C3: **mandatorily benchmark** any DINOv2-based candidate (AnyLoc, BoQ, SASCM-style) against AerialExtreMatch at our pitch / scale / overlap regime AND against Jetson Orin Nano latency budget (per Fact #18). Maintain LightGlue+SuperPoint as the "simple-baseline / known-Jetson-runnable" option per NGPS precedent.
|
||||
- C8 transport: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT`. Re-examine the trade-off in design phase, but SQ6's selection stands for the research draft.
|
||||
5. **Lessons from SPRIN-D winner that must propagate to `solution_draft01`**:
|
||||
- "Ability to recover from periods of high uncertainty and re-localize" > "low instantaneous RMSE" — directly informs AC-NEW-2 / AC-NEW-8.
|
||||
- VIO requires mechanically-decoupled IMU; this is a hardware-integration constraint, not a software issue.
|
||||
- Magnetometer is unreliable near steel/concrete; sensor fusion of heading sources is essential.
|
||||
- "No single sensor can be fully relied upon" — directly supports our IMU+camera+sat-tile multi-source posture.
|
||||
|
||||
### Open follow-ups (deferred to later sub-questions)
|
||||
|
||||
- **(SQ8)** Independent verification of OSCAR's "fully resistant to spoofing/jamming" claim — if available. Otherwise, Twist Robotics's claim remains a vendor-only signal.
|
||||
- **(SQ8)** Vantor Raptor and Auterion Visual Navigation's covariance reporting behaviour — for benchmarking AC-NEW-4 compliance.
|
||||
- **(SQ3+SQ4 / C2)** AnyLoc / BoQ / DINOv2-VLAD / MixVPR / EigenPlaces / NetVLAD on AerialExtreMatch for cross-source aerial — already in C2 search plan; SQ1 just confirmed they're the right candidate set.
|
||||
- **(SQ3+SQ4 / C3)** LightGlue / LoFTR / RoMa / DKM / MASt3R + classical SIFT+RANSAC + XFeat on AerialExtreMatch — already in C3 search plan; SQ1 confirms shape.
|
||||
- **(SQ7)** AerialExtreMatch + AerialVL + CS-UAV + RealUAV/SAVL + UAV-VisLoc as the dataset shortlist for our cross-validation — confirmed by SQ1 hits.
|
||||
|
||||
### Boundary check: SQ1 is saturated
|
||||
|
||||
Saturation signals observed: all 5 perspectives in the coverage table saturated, ≥3 high-confidence facts per perspective, last 3 search rounds (Anduril Iris detail probe, ArduPilot prior-art probe, DSMAC lineage probe) yielded only one new substantive datapoint (NGPS) and confirmed already-known patterns. No unresolved contradictions. Per `references/source-tiering.md` "Search saturation rule" → SQ1 is closed.
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Canonical pipeline decomposition (sanity-check)
|
||||
|
||||
### Fact #21 — The canonical pipeline for offline-cache visual geo-localization is two-stage: global VPR retrieval, then local alignment (image matching → pose)
|
||||
- **Statement**: Source #38 (Skoltech aerial-VPR survey) defines the field's canonical pipeline verbatim: "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: global localization (or Visual Place Recognition, VPR) and local alignment. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Source #42 (NUDT 2026 absolute-VL survey) names the same shape "**retrieval → matching → pose-estimation hierarchical framework**" and explicitly contrasts it against three rejected alternatives: (a) relative-only VIO/SLAM (cumulative error), (b) end-to-end direct localization (poor generalization), (c) map-free localization (scene-dependent). Source #39 (U.Maine cross-view survey) traces the same lineage from 2003 pixel-wise template-matching → 2013 hand-engineered features → 2017 CNN/triplet-loss → 2018+ Siamese/GAN → 2022+ Transformer → 2023 DINOv2-class. Source #41 (AnyVisLoc benchmark) implements this hierarchy as: image retrieval (rough) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC, with **Top-N re-rank by inlier count** as a critical fourth stage between matching and pose.
|
||||
- **Source**: Source #38, Source #39, Source #41, Source #42
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Architects of `solution_draft01`
|
||||
- **Confidence**: ✅ (four independent surveys/benchmarks converge)
|
||||
- **Related Dimension**: SQ2, C2 (VPR), C3 (cross-domain matching), C4 (pose estimation)
|
||||
- **Fit Impact**: **confirms** the project's C1–C10 decomposition is canonical for the **C2 → C3 → C4** chain. The component split is not novel; the project's contribution is the **integration discipline** (covariance honesty AC-NEW-4, source-label contract AC-1.4, offline-cache safety AC-NEW-7) layered on top. **Augment** the existing decomposition with an explicit "Top-N re-rank by inlier count" stage between C3 and C4 (currently implicit).
|
||||
|
||||
### Fact #22 — AdHoP (Adaptive Homography Preconditioning) is a method-agnostic post-matching refinement loop that improves translation accuracy by ~30% average and up to 63% for previously-underperforming methods, at the cost of a second matching pass
|
||||
- **Statement**: Source #40 (OrthoLoC benchmark, Sep 2025): from initial 2D-2D query↔orthophoto correspondences, estimate a homography H via DLT+RANSAC, warp the orthophoto with H to better match the query's perspective (reducing residual perspective gap), re-match in this warped frame, then map the new correspondences back to the original orthophoto via H⁻¹, lift to 3D using DSM, and run PnP+RANSAC + Levenberg-Marquardt refinement. Accept the AdHoP-refined pose only if reprojection error decreases vs. the non-refined pose. **Quantitative effects** (16,425 images, 47 locations, 1m-1° threshold): GIM+DKM 75.4% recall (best); AdHoP-refined methods see ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods AdHoP yields up to 95% matching improvement (XFeat*) or 63% translation reduction (DKM); for RoMa, AdHoP lifts 1m-1° recall by +23 points (54.6% → 77.6%-class). **Cross-domain regime** (war-zone-equivalent: scene change between query and reference): translation error increases ~3× when only the visual modality differs, ~7× when both visual and structural (DSM) gaps exist (0.16 m → 1.12 m for GIM+DKM+AdHoP). **Method-agnostic** — works on top of any 2D-2D matcher.
|
||||
- **Source**: Source #40
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C3/C4 implementers
|
||||
- **Confidence**: ✅ for headline numbers (single-paper, but published dataset + open code + reproducible per repo)
|
||||
- **Related Dimension**: SQ2 (new sub-stage), C3 (matcher), C4 (pose), SQ5 (cross-domain failure mode)
|
||||
- **Fit Impact**: **adds a new sub-stage** between C3 and C4. Decision for `solution_draft01`: include AdHoP-class refinement as an **optional** stage gated on Jetson Orin Nano latency budget — if (single-pass match latency × 2) + homography estimation + reprojection check fits under (400 ms - other-stages), include it; otherwise reserve as offline-replay-time refinement. Cross-domain 3× translation-error penalty is a **direct AC-NEW-4 calibration input** — companion-side covariance must inflate proportionally when scene-change detection (deferred to SQ8) flags a stale tile.
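
The latency gate and the acceptance rule for AdHoP can be written down as two small checks. This is a sketch under stated assumptions: the 400 ms frame budget and the "accept only if reprojection error decreases" rule come from the text above, whereas `StageLatency`, its field names, and the example timings are hypothetical placeholders that would have to be replaced by on-Jetson measurements.

```python
from dataclasses import dataclass

FRAME_BUDGET_MS = 400.0  # per-frame budget quoted in this document


@dataclass
class StageLatency:
    """Hypothetical per-stage timings in milliseconds (to be measured on-device)."""
    match_ms: float          # one matching pass
    homography_ms: float     # DLT + RANSAC homography estimate
    reproj_check_ms: float   # reprojection-error comparison
    other_stages_ms: float   # everything else in the frame


def adhop_fits_budget(lat: StageLatency, budget_ms: float = FRAME_BUDGET_MS) -> bool:
    """True if (2 x match) + homography + reprojection check fits in the
    time left over after the other stages."""
    adhop_cost = 2.0 * lat.match_ms + lat.homography_ms + lat.reproj_check_ms
    return adhop_cost <= budget_ms - lat.other_stages_ms


def choose_pose(initial_pose, initial_err, refined_pose, refined_err):
    """Keep the AdHoP-refined pose only if its reprojection error is lower."""
    return refined_pose if refined_err < initial_err else initial_pose


# Illustrative numbers only, not measurements.
print(adhop_fits_budget(StageLatency(100.0, 10.0, 5.0, 150.0)))
print(choose_pose("pose_a", 2.4, "pose_b", 1.1))
```

If `adhop_fits_budget` returns `False` for the measured timings, the text above implies reserving AdHoP for offline-replay refinement instead of the in-flight path.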
|
||||
|
||||
### Fact #23 — 6-DoF aerial-to-satellite localization requires DSM (Digital Surface Model) elevation data; without DSM, the system collapses to 3-DoF (position + 1 rotation) or must compute attitude purely from IMU/VIO
|
||||
- **Statement**: Source #40 OrthoLoC explicitly: "Our pipeline matches the query image with the DOP, lifts the matched 2D points in DOP to 3D using the DSM, and then estimates the camera pose using PnP and RANSAC." Without the DSM lift, the matcher produces 2D↔2D correspondences that constrain a homography (which encodes 3-DoF for a planar scene + planar camera) but **not** the full 6-DoF camera pose. Source #41 AnyVisLoc independently confirms by measuring: aerial-photogrammetry map (with paired DSM at 0.94 m/px) achieves 74.1% A@5m; satellite map (with ALOS 30 m DSM) achieves only 18.5% A@5m — a 4× accuracy collapse driven by DSM coarseness. The project's offline cache from the Azaion Suite Satellite Service is currently specified as **2D ortho tiles only** (no DSM commitment in restrictions.md or AC). **Three architectural responses** are available: (a) **3-DoF acceptance** — fix attitude from IMU/VIO, treat the matcher output as a homography-only constraint, ignore DSM; sacrifices the up-to-2× higher accuracy reported when DSM is present, but stays within current cache contract; (b) **Request DSM tiles from the Suite Sat Service** — adds C2 cache schema work + a Suite Sat Service contract change; preserves 6-DoF accuracy; (c) **IMU/VIO-only attitude + 2D-2D matching translation** — same as (a) but explicitly contracts the IMU/VIO module to provide attitude with σ ≤ 5° (per Fact #24); operationally identical to (a), differs only in how the contract is written.
|
||||
- **Source**: Source #40, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + Suite Sat Service stakeholder + AC owner
|
||||
- **Confidence**: ✅ for the architectural claim; ✅ for the 4× accuracy collapse number
|
||||
- **Related Dimension**: SQ2 (decomposition), C2 (cache schema), C3 (matcher output contract), C4 (pose), C5 (estimator), C6 (IMU/VIO contract), AC-1.1 / AC-1.1.1 (accuracy budget)
|
||||
- **Fit Impact**: **architectural decision required, surfaced for user.** The current restrictions.md (no DSM commitment) implicitly forces option (a) or (c). The accuracy budget AC-1.1.1 (≤80 m at 1 km AGL) is loose enough that 3-DoF + IMU-attitude almost certainly satisfies it on a per-frame basis (per Fact #21 and DSMAC-class lineage in Fact #17), but **requires explicit acknowledgement** in the architecture before commitment. **Proposed default** for `solution_draft01`: option (c) — fix attitude from IMU/VIO with documented σ ≤ 5° contract on yaw, σ ≤ 5° on pitch (per Fact #24), translation from 2D-2D matching + camera pose. Flag option (b) as a "Suite Sat Service follow-up" if 6-DoF accuracy ever becomes a hard requirement.
|
||||
|
||||
### Fact #24 — IMU-derived yaw and pitch priors with σ ≤ 5° are required for the matching+PnP stack to hit benchmark accuracy; yaw σ = 10° costs 1.9% A@5m, σ = 30° costs 4.1%, σ = 50° costs 13.7%, and σ = 60° costs 25.7%
|
||||
- **Statement**: Source #41 AnyVisLoc systematically perturbs yaw and pitch priors and measures localization accuracy collapse. Yaw: σ = 5° → no impact; σ = 10° → −1.9% A@5m; σ = 30° → −4.1%; σ = 50° → −13.7%; σ = 60° → −25.7%. Pitch: σ < 5° → no impact; σ ≥ 7° → 1–5% drops. The benchmark is conducted at low altitude (30–300 m AGL) with 20–90° pitch range; lessons transfer to our 1 km AGL nadir-camera regime in the **direction** but the magnitudes may be lower at 1 km AGL because nadir geometry is less yaw-sensitive than oblique. Conservatively adopting the benchmark numbers gives a hard contract: **IMU/VIO must deliver yaw with σ ≤ 5° and pitch with σ ≤ 5° to the matcher** (1σ, not 95%, since the benchmark is single-σ). Pitch is naturally tighter on a nadir-fixed camera (mechanically constrained); yaw is the binding constraint and is the typical IMU/magnetometer failure mode (per SPRIN-D lesson Fact #15).
|
||||
- **Source**: Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 (VIO) implementer + C5 (estimator) implementer
|
||||
- **Confidence**: ✅ for the AnyVisLoc numbers; ⚠️ for direct transfer to 1 km AGL nadir regime (magnitudes likely smaller at our altitude/pitch — direction is conservative)
|
||||
- **Related Dimension**: SQ2 (sensor-prior contract), C1 (VIO output contract), C5 (estimator), C6 (IMU)
|
||||
- **Fit Impact**: **architectural contract** for `solution_draft01`: the C1 module's published contract to the C2/C3 stack is yaw σ ≤ 5° AND pitch σ ≤ 5°. Magnetometer-only yaw is **insufficient** by the SPRIN-D lesson (Fact #15) — VIO must contribute. **Adds a constraint** that flows back to the C6 IMU integration: IMU mechanical isolation per SPRIN-D Fact #15 is required; magnetometer + GPS-yaw startup alignment at the airbase (before take-off, while real GPS is healthy) is part of the boot sequence.
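
To make the sensor-prior contract checkable, the yaw-perturbation figures quoted above can be held in a small lookup. This is a minimal sketch: the five data points are the Source #41 values as cited in the Statement, but the piecewise-linear interpolation between them, the clamping beyond 60°, and the function names are assumptions added for illustration, not part of the benchmark.

```python
from bisect import bisect_right

# (yaw sigma in degrees, A@5m drop in %) as quoted from Source #41 above.
YAW_DROP_POINTS = [(5.0, 0.0), (10.0, 1.9), (30.0, 4.1), (50.0, 13.7), (60.0, 25.7)]
CONTRACT_SIGMA_DEG = 5.0  # 1-sigma contract on both yaw and pitch


def expected_drop(yaw_sigma_deg: float) -> float:
    """Estimated A@5m drop for a given yaw sigma.

    Assumption: linear interpolation between the measured points, and
    clamping to the last measured value beyond 60 degrees (no data there).
    """
    xs = [x for x, _ in YAW_DROP_POINTS]
    ys = [y for _, y in YAW_DROP_POINTS]
    if yaw_sigma_deg <= xs[0]:
        return 0.0
    if yaw_sigma_deg >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, yaw_sigma_deg)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (yaw_sigma_deg - x0) / (x1 - x0)


def meets_contract(yaw_sigma_deg: float, pitch_sigma_deg: float) -> bool:
    """True if both attitude priors satisfy the sigma <= 5 degree contract."""
    return yaw_sigma_deg <= CONTRACT_SIGMA_DEG and pitch_sigma_deg <= CONTRACT_SIGMA_DEG


for sigma in (5.0, 10.0, 20.0, 60.0):
    print(sigma, round(expected_drop(sigma), 2))
print(meets_contract(4.0, 3.0), meets_contract(8.0, 3.0))
```

Because the benchmark was run at 30–300 m AGL, the interpolated values are best read as a conservative upper bound for the 1 km AGL nadir regime, in line with the confidence note above.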
|
||||
|
||||
### Fact #25 — Top-N re-ranking by inlier count is the dominant accuracy/cost trade-off; pure-matching-without-retrieval is catastrophic (A@5m collapses from 62.2% to 34.3% with the same matcher)
|
||||
- **Statement**: Source #41 AnyVisLoc and Source #38 Skoltech survey both quantify the value of retrieval as a search-space reducer for matching. Source #41 explicitly: "Top-N re-rank by inlier count is the best accuracy/cost trade-off" → 62.2% A@5m at 0.8 s/frame on RTX 3090. **Without retrieval** (pure exhaustive matching against the cache): 34.3% A@5m, i.e. roughly **55%** of the re-ranked accuracy (34.3 / 62.2) at infeasible compute cost. Source #38 measures sparse-VPR re-ranking specifically: AnyLoc descriptor + SuperGlue re-rank on top-100 candidates = 15–25 s/frame on RTX 3090 (catastrophic for our 400 ms budget); LightGlue re-rank ≈ 1 s/frame (still over budget); SelaVPR re-rank < 0.1 s/frame (in-budget on RTX 3090, must be re-tested on Jetson Orin Nano). **Re-ranking budget** = (frame budget) − (descriptor extraction) − (initial top-N retrieval) − (matcher pose estimation) − (AdHoP if included).
|
||||
- **Source**: Source #38, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C2 implementer
|
||||
- **Confidence**: ✅ (two-source convergence on the qualitative claim; quantitative numbers are RTX-3090-specific and must be Jetson-MVE'd)
|
||||
- **Related Dimension**: SQ2 (pipeline structure), C2 (VPR), C3 (matcher), SQ3+SQ4 (Jetson MVE)
|
||||
- **Fit Impact**: **mandates** Top-N re-rank by inlier count as a stage in `solution_draft01`. The choice of N (typically 5–20 in the literature) is a trade-off deferred to the SQ3+SQ4 candidate matrix, not settled in SQ2.
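
The re-ranking stage and its time budget can be sketched in a few lines. The budget formula is the one given in the Statement above. Everything else here is a hypothetical illustration: `rerank_by_inliers`, the `count_inliers` callback, the tile identifiers, and the example timings are assumed names and values, and the callback stands in for whatever matcher SQ3+SQ4 eventually selects.

```python
from typing import Callable, List, Tuple


def rerank_by_inliers(
    candidates: List[str],
    count_inliers: Callable[[str], int],
    top_n: int,
) -> List[Tuple[str, int]]:
    """Match only the first top_n retrieved tiles and sort them by inlier count.

    Assumption: `candidates` arrives in retrieval order, best first.
    """
    shortlist = candidates[:top_n]
    scored = [(tile_id, count_inliers(tile_id)) for tile_id in shortlist]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def rerank_budget_ms(frame_ms, descriptor_ms, retrieval_ms, pose_ms, adhop_ms=0.0):
    """Re-ranking budget = frame budget minus every other stage."""
    return frame_ms - descriptor_ms - retrieval_ms - pose_ms - adhop_ms


# Illustrative inlier counts for four retrieved tiles (not real data).
fake_inliers = {"t3": 40, "t7": 120, "t1": 15, "t9": 300}
ranked = rerank_by_inliers(["t3", "t7", "t1", "t9"], fake_inliers.__getitem__, top_n=3)
print(ranked)
print(rerank_budget_ms(400.0, 60.0, 20.0, 40.0))
```

Note that tile `t9` is never matched in this example because it falls outside the top 3: a larger N costs more matcher calls but can rescue a correct tile that retrieval ranked poorly, which is the trade-off the hyperparameter sweep has to settle.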
|
||||
|
||||
### Fact #26 — High-accuracy SOTA models (AnyLoc + SuperGlue + RoMa-class) are NOT viable on Jetson Orin Nano under the 400 ms p95 budget; lightweight VPR (MixVPR / SALAD / SelaVPR-class) + lightweight matchers (LightGlue / XFeat-class) are the only candidates that survive a basic latency pre-screen
|
||||
- **Statement**: Two independent runtime measurements on RTX 3090 (≥10× faster than Jetson Orin Nano in dense matrix ops): Source #38 — AnyLoc descriptor calculation 0.37–0.84 s/frame (huge ViT-G DINOv2); SuperGlue re-rank 15–25 s/frame on top-100; LightGlue re-rank ~1 s/frame; SelaVPR re-rank < 0.1 s/frame. Source #41 — RoMa dense matcher 659 ms/frame; SP+LightGlue+GIM sparse 105 ms/frame; ratio = 6.3×. **Memory**: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles (out of 8 GB Jetson Orin Nano envelope before model weights); SelaVPR descriptors < 0.2 GB. Pre-screen conclusion: AnyLoc / SuperGlue / RoMa-class are **disqualified** on the Jetson Orin Nano at 3 fps unless heavy quantization (INT8) reduces them ≥10×, which is not yet established for our latency target on this hardware. Surviving candidates from the literature: **VPR**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD-class; **matchers**: LightGlue, XFeat, XFeat*, SP+LightGlue. **Disqualification is preliminary** — final go/no-go happens at SQ3+SQ4 with on-Jetson MVE per `references/mode-A-mve-rules.md`.
|
||||
- **Source**: Source #38, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: C2 + C3 implementer; SQ3+SQ4 candidate-matrix author
|
||||
- **Confidence**: ✅ for RTX-3090 numbers; ⚠️ for direct Jetson translation (Jetson Orin Nano AI score is well-published; ratio is conservative)
|
||||
- **Related Dimension**: SQ2 (Jetson budget feasibility), SQ3+SQ4 (candidate pre-screen), SQ5 (foundation-model-on-edge failure mode), C2, C3, C7 (Jetson runtime)
|
||||
- **Fit Impact**: **prunes the SQ3+SQ4 candidate matrix BEFORE expensive Jetson MVE.** Candidates entering SQ3+SQ4 with mandatory Jetson MVE: (C2 VPR) MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD; (C3 matcher) LightGlue, XFeat, XFeat*, SP+LightGlue. Candidates that need Jetson INT8 quant before they earn an MVE slot: AnyLoc, BoQ, DINOv2-VLAD (must demonstrate INT8 build path with vendor-validated accuracy preservation). Candidates pruned outright: RoMa dense, SuperGlue, MASt3R (latency).
|
||||
|
||||
### Fact #27 — A 20% covisibility floor between query frame and reference tile is required for localization to succeed; below it, ALL methods fail regardless of matcher quality
|
||||
- **Statement**: Source #40 OrthoLoC: "When the covisibility between the UAV image and the orthographic geodata is too small (less than ~20%), the localization fails for all methods regardless of matcher quality." This is a geometric floor, not a method-specific limit. The implication for the project: any tile-cache design that allows a query to fall outside 20% covisibility with the **best available** cached tile must also include a **runtime covisibility-check + graceful degrade** to `visual_propagated` mode (per AC-1.4 source label). This is a runtime condition, not a one-time setup parameter.
|
||||
- **Source**: Source #40
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: C2 (cache scheduler) + C5 (estimator) + AC-1.4 owner
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ2 (boundary condition), C2 (tile cache), C5 (estimator state machine), AC-1.4
|
||||
- **Fit Impact**: **adds a runtime invariant** to `solution_draft01`: tile selection must guarantee ≥20% covisibility OR explicitly emit the `visual_propagated` source label per AC-1.4 with covariance widened per AC-NEW-4. This becomes a hard constraint on the C2 cache schema (must support tile-extent metadata) and a runtime check before invoking C3 matcher.
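
The runtime invariant can be sketched as a gate that runs before the C3 matcher. This is an illustration under explicit assumptions: the 20% floor and the `visual_propagated` label come from the text above, but the axis-aligned `Box` footprint model, the definition of covisibility as the fraction of the query footprint covered by the tile extent, and all function names are hypothetical simplifications (a real footprint is a projected quadrilateral, not a box).

```python
from typing import NamedTuple, Optional, Tuple

COVISIBILITY_FLOOR = 0.20  # geometric floor reported by Source #40


class Box(NamedTuple):
    """Axis-aligned ground extent in a local metric frame (assumed model)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def covisibility(query: Box, tile: Box) -> float:
    """Fraction of the query footprint area covered by the tile extent."""
    w = min(query.max_x, tile.max_x) - max(query.min_x, tile.min_x)
    h = min(query.max_y, tile.max_y) - max(query.min_y, tile.min_y)
    q_area = (query.max_x - query.min_x) * (query.max_y - query.min_y)
    if w <= 0 or h <= 0 or q_area <= 0:
        return 0.0
    return (w * h) / q_area


def gate_matcher(query: Box, tile: Box) -> Tuple[bool, Optional[str]]:
    """Return (run_matcher, fallback_label).

    Below the floor the matcher is skipped and the AC-1.4 label
    `visual_propagated` is emitted instead.
    """
    if covisibility(query, tile) >= COVISIBILITY_FLOOR:
        return True, None
    return False, "visual_propagated"


query = Box(0.0, 0.0, 100.0, 100.0)
print(gate_matcher(query, Box(50.0, 0.0, 200.0, 100.0)))
print(gate_matcher(query, Box(90.0, 0.0, 200.0, 100.0)))
```

The gate only decides whether matching is worth attempting. Widening the covariance on the fallback path, as AC-NEW-4 requires, would be handled by the estimator and is not modelled here.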
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Pipeline-component coverage table (existing C1–C10 vs. survey-listed components)
|
||||
|
||||
| Survey/benchmark canonical stage | Project component (current) | Coverage status | Required action |
|
||||
|---|---|---|---|
|
||||
| Image retrieval (global VPR) | **C2 — Visual Place Recognition** | ✅ covered | No change |
|
||||
| Re-ranking (top-N inlier-based) | (currently implicit, inside C2 or C3) | ⚠️ implicit | **Promote to explicit sub-stage** (`C2.5` or `C3.0`) in `solution_draft01` |
|
||||
| Local image matching (2D-2D, sparse or dense) | **C3 — Cross-domain registration** | ✅ covered | Add Top-N re-rank-by-inlier-count requirement |
|
||||
| AdHoP-style perspective preconditioning | (not represented) | ❌ missing | **Add as optional sub-stage** between C3 and C4, gated on Jetson latency budget |
|
||||
| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | ❌ architectural decision required | **Decision required from user** — see below |
|
||||
| Pose estimation (PnP + RANSAC + LM) | **C4 — Pose estimation** | ✅ covered | No change |
|
||||
| State estimator / fusion (UKF / ESKF / MSCKF / factor graph) | **C5 — Estimator / fusion** | ✅ covered | Augmented with covariance-honesty contract from AC-NEW-4 |
|
||||
| IMU + VIO contract | **C1 — VO/VIO** + **C6 — IMU integration** | ✅ covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract from Fact #24 |
|
||||
| Tile cache + scheduler | **C2 — VPR tile cache** + **C9 — Cache hygiene** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27) |
|
||||
| Anti-spoof / source-switch | **C7 — Spoof detection** + **C8 — FC adapter** | ✅ covered | Already addressed in SQ6 |
|
||||
| Health monitoring / safety | **C10 — Safety / health monitoring** | ✅ covered | Already addressed |
|
||||
|
||||
### Architectural decisions surfaced (require user resolution before SQ3+SQ4 starts)
|
||||
|
||||
1. **DSM dependency on the Suite Sat Service tile cache** (per Fact #23). Three options:
|
||||
- **(a) 3-DoF acceptance** — accept that without DSM, only position is recovered from matching; attitude is fixed by IMU/VIO with no satellite-tile cross-check. Lowest project scope. Requires AC budget verification (likely passes AC-1.1.1).
|
||||
- **(b) Request DSM tiles** — Suite Sat Service contract change. Highest accuracy. Adds ~1 cycle to delivery. Recommended if 6-DoF accuracy ever becomes a hard AC.
|
||||
- **(c) IMU/VIO-attitude + 2D-2D matching translation** — operationally identical to (a) but contracts the IMU/VIO module explicitly with σ ≤ 5° yaw / pitch (Fact #24).
|
||||
- **Recommended default**: **(c)** — explicit IMU/VIO contract; fall back to (b) if AC tightens.
|
||||
|
||||
2. **AdHoP refinement loop** (per Fact #22). Three options:
|
||||
- **(a) Always-on** — included in every frame; Jetson budget must accommodate 2× matching latency.
|
||||
- **(b) Conditional** — only when initial reprojection error exceeds a threshold; gated on per-frame budget.
|
||||
- **(c) Off (initial release)** — relegate to offline-replay refinement.
|
||||
- **Recommended default**: **(b) Conditional** — fits within latency variance budget while capturing the cross-domain accuracy gain.
|
||||
|
||||
3. **Top-N re-rank promotion to explicit pipeline sub-stage** (per Fact #25). Recommendation: promote to a named sub-stage in `solution_draft01` with N as an SQ3+SQ4 hyperparameter sweep target.
|
||||
|
||||
### Component-pruning carried into SQ3+SQ4
|
||||
|
||||
- **C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
|
||||
- **C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path**: AnyLoc, BoQ, DINOv2-VLAD.
|
||||
- **C2 candidates pruned**: SuperGlue-as-reranker (latency).
|
||||
- **C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template).
|
||||
- **C3 candidates pruned**: RoMa, MASt3R, DKM (dense matcher latency on Jetson).
|
||||
- **C3 candidates as "AerialExtreMatch reference points" only, NOT for production**: GIM+DKM, GIM+LightGlue (per Source #40, used as accuracy benchmark only).
|
||||
|
||||
### Boundary check: SQ2 is saturated
|
||||
|
||||
Saturation signals observed: (a) five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey) converge on the **same** "retrieval → matching → pose-estimation hierarchical framework" as canonical; (b) two independent runtime sources (Skoltech survey on RTX 3090; AnyVisLoc on RTX 3090 with explicit dense-vs-sparse breakdown) agree on the relative cost ordering of model classes; (c) the AdHoP value rests on Source #40 alone, but with reproducible code and dataset (single-source-but-strong evidence); (d) the covisibility and sensor-prior thresholds are each quantified by a published benchmark (Source #40 and Source #41 respectively). Two outstanding decisions are flagged for user; neither blocks SQ2's saturation status, both block SQ3+SQ4 start. Per `references/source-tiering.md` "Search saturation rule" → SQ2 is closed pending user decisions on DSM dependency + AdHoP gating.
|
||||
|
||||
---
|
||||
|
||||
## SQ3+SQ4 / C1 — Visual / Visual-Inertial Odometry candidate enumeration
|
||||
|
||||
> **Project's pinned mode for every C1 candidate (binding)**: monocular ADTi 20MP nav camera @ 3 fps + IMU from FC over MAVLink @ ≥100 Hz, on Jetson Orin Nano Super (JetPack/CUDA/TensorRT, 8 GB shared LPDDR5, 25 W TDP), producing relative 6-DoF metric pose between consecutive frames + per-axis covariance, with attitude (yaw + pitch) hard-contract σ ≤ 5° at 1 σ (Fact #24), output cadence ≥3 Hz, no in-flight network, license compatible with onboard-binary distribution to a dual-use customer.
|
||||
>
|
||||
> Per the engine's "Per-Mode API Capability Verification" rule, any candidate marked `Selected` requires a `context7` lookup (mode enum + project's exact mode runnable example + disqualifier probe) AND a per-numbered-Restriction × per-numbered-AC sub-matrix. **This session covers candidate enumeration + preliminary applicability assessment only**; `context7` verification and the structured sub-matrix are deferred to the next session per the autodev context budget heuristic.
|
||||
|
||||
### Fact #28 — VINS-Mono is a canonical monocular-only sliding-window VIO with a working Jetson-Nano deployment record but no GitHub release and ~24-month-old master branch
|
||||
- **Statement**: VINS-Mono is the canonical mono+IMU sliding-window VIO from HKUST-Aerial-Robotics (Qin, Li, Shen — IEEE T-RO 2018). Features: efficient IMU pre-integration, automatic initialization, online camera-IMU spatial + temporal calibration, failure detection + recovery, DBoW2 loop detection, global pose-graph optimization. Output: metric-scale 6-DoF pose at IMU rate. **Repository state**: master-branch only (no tagged releases), 5,829 stars; last meaningful master-branch commit 2024-02-25 with a 2024-05-23 simulation-data commit. **Jetson record**: a 2021 IEICE paper (zinuok / KAIST) demonstrated VINS-Mono real-time on the original Jetson Nano (much weaker than Orin Nano Super) for MAV state estimation; a 2024 arXiv paper (2406.13345) showed an enhanced VINS-Mono variant achieving 50 FPS on a Raspberry Pi CM4 with on-sensor accelerated optical flow. **License**: GPL-3.0 (copyleft viral) — distribution of the onboard binary requires source disclosure for the entire linked binary and triggers GPL-3 anti-tivoization clauses for embedded firmware.
|
||||
- **Source**: Source #43 (canonical), Source #46 (KAIST Jetson benchmark), Source #43-linked LICENCE for license confirmation
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Confidence**: ✅ for algorithm class, mode support, and Jetson Nano feasibility; ⚠️ for Jetson Orin Nano Super specific latency (no direct measurement — but Orin Nano Super >> Jetson Nano, so feasibility is virtually certain); ⚠️ for the maintenance-status risk implied by ~24-month-old master branch.
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Established-production candidate
|
||||
- **Fit Impact**: **carry as lead candidate, conditional on user license decision.** Algorithmic fit is excellent (canonical mono+IMU VIO with metric scale and covariance); maintenance status is borderline; **GPL-3.0 license is a project-level decision required from the user** before this candidate can be marked Selected — see "C1 Open Decisions" section below.
|
||||
|
||||
### Fact #29 — VINS-Fusion is a multi-sensor superset of VINS-Mono but its monocular+IMU mode failed to run on Jetson TX2 in a 2021 KAIST benchmark; Orin Nano Super feasibility unverified
|
||||
- **Statement**: VINS-Fusion (Qin, Cao, Pan, Shen; extension of VINS-Mono) supports four documented sensor configurations: stereo+IMU, mono+IMU, stereo only, and +GPS fusion (toy example). It was the top-ranked open-source stereo algorithm on KITTI Odometry as of January 2019. **Repository state**: 4,476 stars; last update 2024-05-23; same master-branch-only convention. **Jetson record**: KAIST 2021 benchmark (Source #46): on Jetson TX2, both **VINS-Fusion (CPU) and VINS-Fusion-imu fail to run** due to insufficient memory and CPU; VINS-Fusion-gpu (GPU-accelerated front-end) runs on TX2. Orin Nano Super has the same memory capacity as TX2 but faster memory (8 GB LPDDR5 shared vs TX2's 8 GB LPDDR4 shared) and a stronger CPU/GPU, but the project's onboard stack is *co-resident* with C2 VPR + C3 matcher + C5 estimator + C6 cache → memory pressure on the VINS-Fusion-imu path is plausible. **License**: GPL-3.0, same dual-use distribution constraint as VINS-Mono.
|
||||
- **Source**: Source #44 (canonical), Source #46 (KAIST Jetson benchmark)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Confidence**: ✅ for the multi-sensor mode support and KITTI ranking; ✅ for the 2021 TX2 failure-to-run finding; ⚠️ for Orin Nano Super viability (stronger CPU/GPU than TX2 but the same 8 GB memory capacity; not yet measured).
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Open-source candidate
|
||||
- **Fit Impact**: **carry as alternate candidate, with mandatory Jetson Orin Nano Super MVE before promotion.** VINS-Mono's narrower scope (mono+IMU only, no stereo overhead) makes VINS-Mono the preferred lead within the HKUST-Aerial-Robotics family; VINS-Fusion's multi-sensor coverage is a distractor for our pinned mode. **GPL-3.0 license decision is the same as VINS-Mono** — see "C1 Open Decisions".
|
||||
|
||||
### Fact #30 — OpenVINS is the most actively maintained MSCKF-class VIO and runs on Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble with documented build adjustments; latency 270 ms on Xavier NX needs Orin-Nano-Super MVE
|
||||
- **Statement**: OpenVINS (rpng, U. Delaware — Geneva, Eckenhoff, Lee, Yang, Huang — ICRA 2020) is a modular MSCKF (Multi-State Constraint Kalman Filter) implementation that fuses IMU state with sparse visual feature tracks via the Mourikis-Roumeliotis 2007 sliding-window MSCKF. **Mode support**: monocular, stereo, multi-camera (1–N) + IMU; mono+IMU is a documented first-class configuration. Supports SLAM features (in-state landmarks) plus pure MSCKF features. **Jetson Orin Nano evidence**: rpng/open_vins issue #421 (Genozen, Feb 2024, closed) confirms OpenVINS ROS 2 builds on Jetson Orin Nano Dev Kit + JetPack 6 + Ubuntu 22.04 + ROS 2 Humble after one build patch (`#include <opencv2/aruco.hpp>` with newer OpenCV); fdcl-gwu/openvins_jetson_realsense (Nov 2025) provides a complete setup guide for Jetson Orin Nano + Intel RealSense + librealsense compiled-from-source + `--parallel-workers 1` build to avoid memory issues. **Latency record**: rpng/open_vins issue #164 — ~270 ms latency on Jetson Xavier NX (4 cores, 40% CPU utilisation). Recommended optimisations: subscriber queue size 1, Release builds with ARM-specific optimization flags (e.g., `armv8.2-a`), reduced camera resolution, prefer `odometry` topic over `pose_imu`. **License**: GPL-3.0, same dual-use distribution constraint as VINS-Mono / VINS-Fusion. Stars 2,828; 30 contributors; 12 releases; latest tag v2.7 (June 2023) but master branch active through 2024–2025 issue threads.
|
||||
- **Source**: Source #45 (canonical + LICENSE + docs.openvins.com), Source #46 (KAIST Jetson benchmark for class-level CPU/memory profile), agent-tools record `29ebf728...txt` (Jetson Orin Nano build evidence)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Confidence**: ✅ for mode support, MSCKF formulation, and Jetson Orin Nano build feasibility; ⚠️ for steady-state latency on Orin Nano Super under our 5472×3648 nav frames: the KAIST benchmark used 640×480, so the ~65× pixel count (roughly 8× per axis) is a yellow flag.
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Established-production candidate
|
||||
- **Fit Impact**: **carry as lead candidate, conditional on user license decision.** OpenVINS has the most documented Jetson-Orin-Nano build path of the three GPL-3.0 candidates; MSCKF formulation is more memory-efficient than VINS-Mono's full sliding-window optimisation, which is a meaningful advantage under co-resident-process memory pressure. **GPL-3.0 license decision is the same as VINS-Mono / VINS-Fusion**.
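The resolution gap flagged in the Confidence bullet can be checked with a few lines of arithmetic. This is only a sketch using the two resolutions quoted there; the helper names are illustrative, and the per-axis factor assumes uniform downscaling.

```python
# Pixel-count comparison between the project's nav frames and the benchmark frames.
# Both resolutions are the ones quoted in the Confidence bullet above.
NAV_W, NAV_H = 5472, 3648      # project nav camera frame
BENCH_W, BENCH_H = 640, 480    # resolution attributed to the KAIST benchmark


def pixel_ratio(w_a, h_a, w_b, h_b):
    """Return how many times more pixels frame A has than frame B."""
    return (w_a * h_a) / (w_b * h_b)


def downscale_factor_to_match(w_a, h_a, w_b, h_b):
    """Uniform per-axis downscale that brings frame A to frame B's pixel count."""
    return pixel_ratio(w_a, h_a, w_b, h_b) ** 0.5


ratio = pixel_ratio(NAV_W, NAV_H, BENCH_W, BENCH_H)
scale = downscale_factor_to_match(NAV_W, NAV_H, BENCH_W, BENCH_H)
print(f"pixel ratio: {ratio:.1f}x, per-axis downscale to match: {scale:.2f}x")
```

The per-axis factor is a rough indication of how much downsampling the front-end would need before the benchmark's latency figures become comparable; it is not a latency prediction.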
|
||||
|
||||
### Fact #31 — OKVIS2 is the most actively maintained VI-SLAM in the BSD-permissive license bucket; OKVIS2-X (T-RO 2025) extends it with optional GNSS fusion that is architecturally aligned with the project's spoof-promotion path
|
||||
- **Statement**: OKVIS2 (Leutenegger — arXiv 2022, ETH/Imperial/TUM Smart Robotics Lab) is a factor-graph VI-SLAM with bounded-size optimization. Algorithmic novelty: pose-graph edges from marginalised observations are "seamlessly turned back into observations" upon loop closure, reviving old landmarks and reprojection errors. Includes lightweight CNN segmentation for dynamic-region removal. **Mode support**: monocular and multi-camera + IMU; mono+IMU is a documented first-class configuration. **Successor OKVIS2-X (Boche, Jung, Laina, Leutenegger — IEEE T-RO 2025 vol 41 pp 6064–6083, DOI 10.1109/TRO.2025.3619051; arXiv 2510.04612, Oct 2025)** generalises the core to fuse multi-camera + IMU + optional GNSS receiver + LiDAR or depth. The OKVIS2-X GNSS-fusion mode (lineage: Visual-Inertial SLAM with Tightly-Coupled Dropout-Tolerant GPS Fusion, IROS 2022) directly mirrors the project's "VIO that may opportunistically fuse a non-spoofed GPS update when promotion completes" pattern (AC-NEW-2). **Repository state**: ethz-mrl/OKVIS2-X created 2025-09-23, last push 2026-03-17, 295 stars, 2 active contributors (bochsim, SebsBarbas). **License**: 3-clause BSD on the LICENSE file (GitHub UI shows "Other (NOASSERTION)" but the file is canonical 3-clause BSD per ASL-ETH Zurich convention) — permissive, no dual-use distribution friction.
|
||||
- **Source**: Source #47 (OKVIS2 canonical), Source #48 (OKVIS2-X T-RO 2025)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 / C5 implementer
|
||||
- **Confidence**: ✅ for algorithm, mode support, license, T-RO 2025 publication, repository activity; ⚠️ for Jetson Orin Nano runtime — no direct Jetson Orin Nano benchmark located; OKVIS2's factor-graph backend is plausibly heavier than OpenVINS' MSCKF on memory but lighter than Kimera (Kimera also produces a 3D mesh + semantic mesher, OKVIS2 does not).
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Open-source-permissive lead candidate; potential C1+C5+C8 unified factor-graph design
|
||||
- **Fit Impact**: **strong lead candidate by license + maintenance + GNSS-fusion alignment.** If license permissiveness is a priority, OKVIS2 + OKVIS2-X is the natural choice. The OKVIS2-X factor-graph also opens a design path where C5 (state estimator) collapses INTO C1 (the same factor graph absorbs sat-anchor measurements as constraints) — would simplify the pipeline at the cost of departing from the C1/C5 split, which is a Step-7.5 / `solution_draft01` design decision, not a SQ3+SQ4 question. **Pending Jetson Orin Nano Super MVE.**
|
||||
|
||||
### Fact #32 — Kimera-VIO is BSD-permissive but resource-heavy; KAIST benchmark found Kimera had the highest memory usage among VIOs tested and failed Xavier-NX-class memory under multi-process load
|
||||
- **Statement**: Kimera-VIO (MIT-SPARK; Rosinol, Abate, Chang, Carlone; ICRA 2020) is a VI-SLAM pipeline with frontend + backend (factor-graph optimization with iSAM2 in GTSAM) + 3D mesher + pose-graph optimizer. Mode support: stereo+IMU primary, mono+IMU optional but documented. **License**: BSD 2-Clause "Simplified" (LICENSE.BSD on the repo), permissive. **Maintenance**: active issue/PR threads through Dec 2024 / Feb 2025 covering ROS 2 integration, mono-inertial discussion, dependency management. **Resource profile** (Source #46 KAIST 2021 benchmark): Kimera had the highest memory usage among the 9 algorithms tested (numerous computations per keyframe); Kimera failed to fit on Xavier NX-class memory under sustained multi-process load. The 3D mesh + semantic-label outputs are unused by the project's narrow C1 mandate (relative 6-DoF + covariance only), so Kimera's overhead is unjustified vs OKVIS2 / OpenVINS for our use case.
|
||||
- **Source**: Source #49 (Kimera canonical + LICENSE.BSD), Source #46 (KAIST Jetson benchmark)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects (build-vs-buy, mesh-feature decision)
|
||||
- **Confidence**: ✅ for algorithm, license, maintenance status; ✅ for the Source #46 finding (KAIST 2021); ⚠️ for whether Orin Nano Super's faster memory + Ampere GPU lifts Kimera into feasibility: the Source #46 failure was on Xavier NX 8 GB shared, the same memory budget as Orin Nano Super, although Orin Nano Super has higher per-core throughput.
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Open-source-permissive secondary candidate
|
||||
- **Fit Impact**: **carry as fallback only, not lead.** Kimera's permissive license is attractive but its resource overhead (especially the unused 3D mesh + semantic mesher) is a poor fit under co-resident process pressure. Use as a conservative secondary fallback if OKVIS2 unexpectedly fails Jetson MVE. **Status**: not lead.
|
||||
|
||||
### Fact #33 — DROID-SLAM is disqualified by AC-4.2: ≥11 GB GPU VRAM inference budget exceeds the project's 8 GB shared LPDDR5; further, DROID-SLAM is monocular VO/SLAM without IMU fusion and would require an external metric-scale wrapper
|
||||
- **Statement**: DROID-SLAM (princeton-vl, Teed & Deng — NeurIPS 2021; arXiv 2108.10869) requires ≥11 GB GPU memory to run inference per the official README; training requires ≥24 GB on 4× RTX 3090. Issue #121 confirms that even with 128 GB system RAM and 16 GB VRAM (RTX 4080), users hit very large RAM consumption quickly. Algorithmically, DROID-SLAM is **monocular VO/SLAM** with recurrent dense bundle adjustment over a complete history of camera poses — no native IMU fusion; output pose is in arbitrary scale (no metric scale recovery without external alignment). DPV-SLAM (ECCV 2024, princeton-vl) is the lighter successor at ~4–5 GB GPU memory; DPVO (NeurIPS 2023, princeton-vl) is even lighter at ~3 GB, but neither natively integrates IMU.
|
||||
- **Source**: Source #50 (DROID-SLAM canonical), Source #51 (DPVO / DPV-SLAM successor), Source #52 (DPVO-QAT++ memory measurement)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 disqualified candidate
|
||||
- **Fit Impact**: **DISQUALIFIED outright.** AC-4.2 sets the 8 GB shared CPU+GPU memory budget; DROID-SLAM's ≥11 GB GPU-only requirement violates it before adding co-resident C2/C3/C5/C6 processes. Cite as "what the project cannot afford" in `solution_draft01` to pre-empt obvious questions.
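The disqualification reduces to a budget comparison. The sketch below assumes the GPU figures quoted in the Statement (taking the upper end of "~4–5 GB" for DPV-SLAM) and the 8 GB shared budget attributed to AC-4.2; the `Footprint` type and `headroom_gb` helper are illustrative names, not project code.

```python
from dataclasses import dataclass

BUDGET_GB = 8.0  # shared CPU+GPU memory budget attributed to AC-4.2


@dataclass(frozen=True)
class Footprint:
    name: str
    gpu_gb: float  # GPU memory figure quoted in the fact card


# Figures as quoted in the Statement; 5.0 is the upper end of "~4-5 GB".
CANDIDATES = [
    Footprint("DROID-SLAM", 11.0),
    Footprint("DPV-SLAM", 5.0),
    Footprint("DPVO", 3.0),
]


def headroom_gb(candidate, budget_gb=BUDGET_GB):
    """Memory left for co-resident processes; negative means the candidate alone overflows."""
    return budget_gb - candidate.gpu_gb


for c in CANDIDATES:
    h = headroom_gb(c)
    verdict = "exceeds budget on its own" if h < 0 else f"{h:.1f} GB left for everything else"
    print(f"{c.name}: {verdict}")
```

Positive headroom is not a pass: the remainder still has to hold the co-resident C2/C3/C5/C6 processes and the OS, which is why the lighter successors stay behind a Jetson MVE gate.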
|
||||
|
||||
### Fact #34 — DPVO is monocular VO only (no IMU fusion); it can fit a Jetson-suitable memory footprint with QAT but cannot satisfy the C1 VIO mandate alone — would need an external IMU + metric-scale wrapper
|
||||
- **Statement**: DPVO (Teed, Lipson, Deng; NeurIPS 2023; ECCV 2024 DPV-SLAM successor) is a deep-learning monocular VO with sparse patch tracking + differentiable bundle adjustment. **Mode**: monocular VO only; no IMU fusion in the published paper or repository; output pose is in arbitrary scale. Memory footprint: DPVO ~3 GB GPU, DPV-SLAM ~4–5 GB GPU on standard hardware; DPVO-QAT++ (arXiv 2511.12653, Cheng Liao, Nov 2025) reduces peak reserved memory to 1.02 GB on RTX 4060 (8 GB) via fused-CUDA INT8 fake-quantization while preserving ATE on TartanAir/EuRoC. **License**: MIT (permissive). Repository: 989 stars; last update 2024-10-12. **Crucial gap**: DPVO does NOT meet the C1 mandate of a "VIO that produces metric-scale 6-DoF + attitude with σ ≤ 5°". For the project to use DPVO as the *VO half* of C1, an additional IMU+scale-fusion module (loosely-coupled ESKF with VO velocity / displacement priors) must be designed; alternatively, DPVO's pose can feed C5 directly as a relative-displacement constraint, with attitude served separately by FC IMU integration. **Jetson Orin Nano runtime evidence**: indirect only; DPVO-QAT++ benchmarks on an RTX 4060 desktop, NOT Jetson Orin Nano. The two GPUs are different architectures (RTX 4060 is Ada Lovelace, Orin Nano Super is Ampere) and the Orin Nano Super's GPU is smaller, so direct extrapolation is not safe; Jetson MVE required.
|
||||
- **Source**: Source #51 (DPVO / DPV-SLAM canonical), Source #52 (DPVO-QAT++ Nov 2025)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 / C5 implementer
|
||||
- **Confidence**: ✅ for "VO only, no IMU fusion" and the memory footprints; ⚠️ for Jetson Orin Nano direct runtime (no measurement); ⚠️ for the operational complexity of the QAT pipeline (teacher-student distillation training is a significant prerequisite vs the classical VINS-* / OpenVINS / OKVIS2 candidates).
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 conditional candidate (VO not VIO; needs external IMU wrapper)
|
||||
- **Fit Impact**: **NOT a drop-in C1 candidate; conditional fit only.** DPVO is **not** a substitute for VINS-Mono / OpenVINS / OKVIS2 — it is a candidate for the *VO half* of a hybrid design where C5 (estimator) absorbs IMU and DPVO provides relative-pose priors. This adds design complexity and is **not preferred** unless one of the established VIO candidates fails Jetson MVE for memory reasons. **Status**: secondary, conditional.
|
||||
|
||||
### Fact #35 — Pure VO baseline (KLT optical flow + 5-point essential matrix or homography RANSAC) is the project's mandatory simple-baseline candidate and is the de-facto fallback when learning-based methods fail on Jetson-budget constraints
|
||||
- **Statement**: The classical pipeline runs Shi-Tomasi or FAST corner detection → KLT pyramidal optical flow tracking (`cv::calcOpticalFlowPyrLK`) → 5-point essential matrix (Nistér, `cv::findEssentialMat`) or homography RANSAC (`cv::findHomography`) → relative pose with arbitrary scale → metric-scale alignment via external IMU integration. It is the foundational visual-odometry pipeline implemented in OpenCV samples and pedagogical repositories. For the project's nadir-down UAV at 1 km AGL over Ukrainian steppe (predominantly planar terrain, low relief), the **homography path is geometrically appropriate** (a plane induces a homography between two views); for non-planar relief, the **essential-matrix path is appropriate** at a small overhead. License: public domain / OpenCV-Apache-2.0 / MIT (whatever reference implementation is chosen), permissive. Reference: representative public implementations are Monocular-Video-Odometery (MIT, alishobeiri 2018) and Monocular-Visual-Odometry (Yacynte), reported at translation error 0.94% / rotation error 0.015°/m on the KITTI dataset.
|
||||
- **Source**: Source #53 (OpenCV docs + reference implementations)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer + risk reviewer
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 Simple-baseline candidate (mandatory per Component Option Breadth rule)
|
||||
- **Fit Impact**: **carry as the project's `Simple baseline / known-runnable / known-failure-mode` C1 fallback.** Not a lead, but mandatory presence. Failure modes: (a) low-texture cropland / snow → KLT track loss; (b) sharp turns → low-overlap homography degeneracy; (c) no native IMU fusion → must wrap with external metric-scale alignment (same wrapper as DPVO). **Status**: simple-baseline reference; cited in `solution_draft01` to anchor the failure analysis.
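The claim that a plane induces a homography between two views can be illustrated numerically. The following is a minimal pure-Python sketch under a pinhole model, not the project's implementation: the focal length, yaw angle, and translation are hypothetical values chosen for illustration, and the principal point is assumed to sit at the centre of the 5472×3648 frame. Only the 1 km ground distance comes from the card. With the plane written as n·X1 = d in the first camera frame and the second view defined by X2 = R·X1 + t, the induced homography is H = K (R + t nᵀ / d) K⁻¹.

```python
import math


def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(a, v):
    """3x3 matrix times 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def project(K, X):
    """Pinhole projection of a camera-frame point X to pixel coordinates."""
    u, v, w = matvec(K, X)
    return (u / w, v / w)


def plane_homography(K, K_inv, R, t, n, d):
    """H = K (R + t n^T / d) K^-1 for the plane n . X1 = d, with X2 = R X1 + t."""
    M = [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul(matmul(K, M), K_inv)


def apply_homography(H, px):
    """Map a pixel through H and dehomogenise."""
    u, v, w = matvec(H, [px[0], px[1], 1.0])
    return (u / w, v / w)


# Hypothetical intrinsics: focal length in pixels is an assumption.
F, CX, CY = 4000.0, 2736.0, 1824.0
K = [[F, 0.0, CX], [0.0, F, CY], [0.0, 0.0, 1.0]]
K_INV = [[1.0 / F, 0.0, -CX / F], [0.0, 1.0 / F, -CY / F], [0.0, 0.0, 1.0]]

# Hypothetical inter-frame motion: 2 degrees of yaw plus a ~30 m translation.
YAW = math.radians(2.0)
R = [[math.cos(YAW), -math.sin(YAW), 0.0],
     [math.sin(YAW), math.cos(YAW), 0.0],
     [0.0, 0.0, 1.0]]
T = [-30.0, 5.0, 0.0]

# Nadir camera: the ground plane is Z = 1000 m in the first camera frame.
N, D = [0.0, 0.0, 1.0], 1000.0
H = plane_homography(K, K_INV, R, T, N, D)


def transfer_error_px(X1):
    """Pixel distance between the homography prediction and the true second-view projection."""
    x1 = project(K, X1)
    X2 = [r + t for r, t in zip(matvec(R, X1), T)]
    x2 = project(K, X2)
    hx = apply_homography(H, x1)
    return math.hypot(hx[0] - x2[0], hx[1] - x2[1])


on_plane_err = transfer_error_px([120.0, -80.0, 1000.0])   # point on the ground plane
off_plane_err = transfer_error_px([120.0, -80.0, 900.0])   # point 100 m above the plane
print(f"on-plane error: {on_plane_err:.6f} px, off-plane error: {off_plane_err:.2f} px")
```

Points on the plane transfer essentially exactly, while a point 100 m off the plane picks up a multi-pixel error under these assumed values. That parallax is the reason the essential-matrix path is kept for non-planar relief.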
|
||||
|
||||
### Fact #36 — Step-0.5-time-window assessment: VINS-Mono / VINS-Fusion master branches are at the Critical-novelty 18-month boundary; OpenVINS and OKVIS2 are within window; DPVO is borderline; the established baselines (KLT + RANSAC) are exempt
|
||||
- **Statement**: Per the Step 0.5 timeliness assessment in `00_question_decomposition.md`, Critical-novelty topics require sources within 6 months for SOTA claims and 18 months for established libraries' API behaviour. Audit at access time 2026-05-07: VINS-Mono master last meaningful commit 2024-02-25 → ~26 months → **outside the 18-month window**; VINS-Fusion 2024-05-23 → ~24 months → outside the window; OpenVINS master active (issue threads through Feb 2025) and v2.7 release June 2023 → ~35 months for the tagged release but master in stable maintenance → within de-facto window for an established library; OKVIS2-X push 2026-03-17 → ~2 months → **fully within window**; DPVO last code update 2024-10-12 → ~19 months → just over the window, although the DPV-SLAM (ECCV 2024) and DPVO-QAT++ (Nov 2025) follow-ups show the algorithm class is still under active research; KLT / 5-point / RANSAC / homography → established baselines per Step 0.5 → **no time window applies**. **Implication**: VINS-Mono / VINS-Fusion fall into the "older than 18 months but classical authoritative reference" bucket: Step 0.5 strictly allows up to 18 months, but downstream forks (vins-mono-android, embedded variants) and the IEEE T-RO 2018 publication keep the algorithm class in active community use. Recommended treatment: **keep as candidates but require live MVE on Jetson Orin Nano Super before promotion to Selected**, to revalidate against the current OpenCV / Ceres / ROS 2 stack.
|
||||
- **Source**: Source #43, Source #44, Source #45, Source #47, Source #48, Source #51 (timeliness audit per source)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Step-7.5 reviewer + System architects
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 candidate-pool integrity
|
||||
- **Fit Impact**: **applies a conservative timeliness gate: VINS-Mono, VINS-Fusion, and DPVO each require an Orin-Nano-Super MVE before being marked Selected**, since their master-branch staleness pushes them out of the Critical-novelty 18-month window. OpenVINS / OKVIS2 / OKVIS2-X / Kimera are within window via active issue threads or recent releases.
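The timeliness gate can be reproduced from the dates quoted in the Statement. This is a sketch only: the mean-month length (365.25 / 12 days) is an assumption made here to convert days to months, and the dictionary and function names are illustrative.

```python
from datetime import date

AUDIT_DATE = date(2026, 5, 7)   # access time of the audit
WINDOW_MONTHS = 18              # established-library window from Step 0.5
DAYS_PER_MONTH = 365.25 / 12    # assumption: mean Gregorian month length

# Last code updates as quoted in the Statement above.
LAST_CODE_UPDATE = {
    "VINS-Mono": date(2024, 2, 25),
    "VINS-Fusion": date(2024, 5, 23),
    "OKVIS2-X": date(2026, 3, 17),
    "DPVO": date(2024, 10, 12),
}


def age_months(last_update, audit=AUDIT_DATE):
    """Age of the last update in (fractional) mean months at audit time."""
    return (audit - last_update).days / DAYS_PER_MONTH


def outside_window(last_update, window=WINDOW_MONTHS):
    """True when the last update is older than the timeliness window."""
    return age_months(last_update) > window


for name, last in LAST_CODE_UPDATE.items():
    flag = "outside window, MVE gate applies" if outside_window(last) else "within window"
    print(f"{name}: {age_months(last):.1f} months, {flag}")
```

OpenVINS is left out on purpose: its status rests on issue-thread activity rather than a single commit date, which a date comparison like this cannot capture.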
|
||||
|
||||
### C1 Component Applicability Gate — preliminary table (this session; structured Restrictions×AC sub-matrix per candidate is next session's work)
|
||||
|
||||
| Candidate | Mode (project) | License | Active maintenance? | Jetson Orin Nano Super runnable? | Native IMU fusion? | Native metric scale? | License blocks dual-use? | Preliminary status |
|
||||
|---|---|---|---|---|---|---|---|---|
|
||||
| **VINS-Mono** | mono+IMU | GPL-3.0 (copyleft) | ⚠️ borderline (24 mo) | ✅ proven on Jetson Nano (2021) → Orin Nano Super virtually certain | ✅ | ✅ | **⚠️ Verify with user** | Lead candidate **conditional on user license decision** + Orin-Nano-Super MVE |
|
||||
| **VINS-Fusion** | mono+IMU (mode) | GPL-3.0 | ⚠️ borderline (24 mo) | ⚠️ failed on TX2 (KAIST 2021); Orin Nano Super untested | ✅ | ✅ | **⚠️ Verify with user** | Alternate, secondary to VINS-Mono within HKUST family |
|
||||
| **OpenVINS** | mono+IMU | GPL-3.0 | ✅ active master | ✅ build confirmed on Orin Nano Dev Kit + JetPack 6 (2024 + 2025 community evidence); ~270 ms latency on Xavier NX | ✅ MSCKF | ✅ | **⚠️ Verify with user** | **Lead candidate** **conditional on user license decision** (best Jetson-Orin-Nano evidence + most maintained of the GPL-3 trio) |
|
||||
| **OKVIS2 / OKVIS2-X** | mono+IMU (+ optional GNSS) | BSD-3 | ✅ very active (2026 pushes) | ⚠️ no direct Jetson Orin Nano measurement; factor-graph backbone plausibly heavier than MSCKF | ✅ | ✅ | ✅ no | **Lead candidate by license + maintenance + spoof-promotion architectural alignment**, pending Jetson MVE |
|
||||
| **Kimera-VIO** | mono+IMU (optional) | BSD-2 | ✅ active | ⚠️ failed on Xavier NX 8 GB shared under multi-process (KAIST 2021) | ✅ | ✅ | ✅ no | Fallback secondary; resource overhead poor fit for project |
|
||||
| **DROID-SLAM** | mono VO/SLAM only | BSD (per project repo) | reference baseline | ❌ ≥11 GB GPU VRAM > 8 GB AC-4.2 budget | ❌ | ❌ (arbitrary scale) | n/a | **DISQUALIFIED** by AC-4.2 |
|
||||
| **DPVO / DPV-SLAM** | mono VO only | MIT | ⚠️ borderline (19 mo on code, ECCV 2024 paper) | ⚠️ DPVO-QAT++ (Nov 2025) shows 1.02 GB peak on RTX 4060 desktop; Jetson Orin Nano untested | ❌ (needs external IMU wrapper) | ❌ (needs external scale alignment) | ✅ no | Conditional secondary — VO half of a hybrid C1+C5 design only; not a drop-in VIO replacement |
|
||||
| **Pure VO baseline (KLT + 5pt RANSAC / homography)** | mono VO only | OpenCV-Apache-2.0 / MIT | ✅ foundational (no time window) | ✅ runs on any Jetson | ❌ (needs external IMU wrapper) | ❌ (needs external scale alignment) | ✅ no | **Mandatory simple-baseline reference** per Component Option Breadth rule |
|
||||
|
||||
**Surviving lead candidates (preliminary)**, in priority order based on this session's evidence:
|
||||
1. **OpenVINS** (GPL-3.0, MSCKF, best Jetson Orin Nano evidence) — pending user license decision + Orin-Nano-Super MVE
|
||||
2. **OKVIS2 / OKVIS2-X** (BSD-3, factor-graph + GNSS-fusion alignment, most active maintenance) — pending Jetson MVE
|
||||
3. **VINS-Mono** (GPL-3.0, sliding-window optimization, proven on Jetson Nano) — pending user license decision + Orin-Nano-Super MVE
|
||||
4. **Pure VO baseline** (mandatory simple-baseline; runtime guaranteed; carries the project as a graceful fallback)
|
||||
|
||||
**Disqualified outright**: DROID-SLAM (AC-4.2 memory budget), RTAB-Map and ORB-SLAM3 (already pruned by Fact #16).
|
||||
|
||||
**Conditional / not-direct-fit**: DPVO / DPV-SLAM (VO not VIO, needs external IMU wrapper), Kimera-VIO (resource overhead unjustified for narrow C1 mandate).
|
||||
|
||||
### C1 Open Decisions (to be resolved before SQ3+SQ4 closure)
|
||||
|
||||
**Decision D-C1-1 — GPL-3.0 license posture for the onboard binary** (BLOCKING for the GPL-3.0 trio: VINS-Mono / VINS-Fusion / OpenVINS).
|
||||
- The three most established VIO candidates (VINS-Mono / VINS-Fusion / OpenVINS) are GPL-3.0 (viral copyleft).
|
||||
- For dual-use UAV deployment, GPL-3 binary distribution to a customer triggers obligations: source-code disclosure for the entire linked binary, anti-tivoization clauses for embedded firmware updates, viral effect on any proprietary code linked into the same binary.
|
||||
- BSD/MIT alternatives exist (OKVIS2 BSD-3, Kimera BSD-2, DPVO MIT, pure-VO baseline OpenCV-Apache-2.0), but each comes with secondary trade-offs (Jetson MVE risk, missing IMU fusion, resource overhead).
|
||||
- Three options for the user:
|
||||
- **(a)** Accept GPL-3.0 — distribution model = release source on customer request; or operate the system as a service rather than transferring binaries. Lowest-risk algorithmic path (most-tested candidates).
|
||||
- **(b)** Restrict to permissive licenses only (BSD/MIT) — lead candidate becomes OKVIS2; carries Jetson MVE risk.
|
||||
- **(c)** Keep both options open through the design phase — make the final license decision after the Jetson Orin Nano MVE results are in.
|
||||
- **Recommended default**: **(c)** — defer the binary commitment until empirical evidence on Jetson Orin Nano. This is recorded as a flagged decision; SQ3+SQ4 candidate matrix will carry both license families to Step 7.5.
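How the three options reshape the shortlist can be sketched as a simple filter. The ranking and license labels below are copied from the preliminary "Surviving lead candidates" list; the `Candidate` type, the `copyleft` flag, and the `shortlist` function are hypothetical names introduced for illustration, not an agreed project interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    license: str
    copyleft: bool  # True for the GPL-3.0 family


# Preliminary priority order from the "Surviving lead candidates" list.
RANKED = [
    Candidate("OpenVINS", "GPL-3.0", True),
    Candidate("OKVIS2 / OKVIS2-X", "BSD-3", False),
    Candidate("VINS-Mono", "GPL-3.0", True),
    Candidate("Pure VO baseline", "OpenCV-Apache-2.0 / MIT", False),
]


def shortlist(option):
    """Candidates carried forward under D-C1-1 option 'a', 'b' or 'c'."""
    if option == "b":
        # Permissive licenses only: the GPL-3.0 candidates drop out.
        return [c for c in RANKED if not c.copyleft]
    if option in ("a", "c"):
        # Accept GPL-3.0, or defer the decision: both families stay in play.
        return list(RANKED)
    raise ValueError(f"unknown option: {option!r}")


for opt in ("a", "b", "c"):
    print(opt, "->", [c.name for c in shortlist(opt)])
```

Under option (b) the first surviving entry is OKVIS2 / OKVIS2-X, matching the lead named in that option; options (a) and (c) differ only in when the license commitment is made, not in which candidates enter the Jetson MVE.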
|
||||
|
||||
**Decision D-C1-2 — Acceptance of Jetson Orin Nano MVE as a Step-7.5 prerequisite** (procedural).
|
||||
- Per the Per-Mode API Capability Verification rule, every lead candidate library/SDK requires `context7` (or equivalent docs) lookup + a Minimum Viable Example for the project's pinned mode + per-numbered-Restriction × per-numbered-AC sub-matrix.
|
||||
- The Component Applicability Gate above is **preliminary** — it documents enumeration evidence but does NOT yet contain `context7` per-mode capability verification or the structured sub-matrix.
|
||||
- **Next session's mandatory work**: `context7` lookup (3 mandatory queries) for OpenVINS / OKVIS2 / VINS-Mono; per-Restriction × per-AC sub-matrix per candidate; the same for the simple-baseline path; record into `02_fact_cards.md` per the engine template + `06_component_fit_matrix.md` per Step 7.5.
|
||||
|
||||
### C1 Boundary check: candidate enumeration is saturated for this session
|
||||
|
||||
Saturation signals observed: (a) all 7 named candidates from `00_question_decomposition.md` C1 row enumerated with at least one canonical L1 source per candidate; (b) Jetson-class evidence located for OpenVINS (direct: Orin Nano build, Xavier NX latency) and VINS-Mono (indirect: Jetson Nano + RPi CM4); other candidates carry "MVE required" gates explicitly; (c) license diversity covered (GPL-3.0 trio + BSD-permissive duo + MIT + permissive-baseline); (d) explicit disqualifications recorded with cited evidence (DROID-SLAM, RTAB-Map, ORB-SLAM3). **Open**: per-mode `context7` verification (BLOCKING per rule) + Restrictions×AC sub-matrices (BLOCKING per Step 7.5), both explicitly deferred to next session.
|
||||
@@ -0,0 +1,50 @@
|
||||
# Fact Cards — Index & Summary
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `../01_source_registry/` (see `../01_source_registry/00_summary.md` for index). Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential).
|
||||
>
|
||||
> Bound to sub-questions in `../00_question_decomposition.md`. Many SQ6 facts also bind directly to the Project Constraint Matrix (`../../00_problem/acceptance_criteria.md` / `../../00_problem/restrictions.md`); per the engine's "Per-Mode API Capability Verification" rule, MAVLink/MSP messages are treated as candidate **modes** and are bound `Pass/Fail/Verify/N/A` against numbered ACs and restrictions.
|
||||
|
||||
This folder replaces the previous monolithic `02_fact_cards.md` (1480 lines, too large to navigate). Each category lives in its own file. Open the file matching the area you need — every fact and conclusion is preserved verbatim.
|
||||
|
||||
---
|
||||
|
||||
## Category index
|
||||
|
||||
| File | Sub-question / Component | Facts (count) | Scope summary |
|
||||
| --- | --- | --- | --- |
|
||||
| [`SQ6_fc_external_positioning.md`](SQ6_fc_external_positioning.md) | **SQ6** — ArduPilot Plane vs iNav external positioning | #1–#10 (10 facts) | MAVLink `GPS_INPUT` (232) ingestion in EKF3, iNav MSP `MSP2_SENSOR_GPS` ingestion via INAV BlackBox, covariance honesty, lane-fusion / lane-switch on (NSats, HDOP, fix_type), spoof-promotion via UBX emulation, dead-reckoning behaviour, `EK3_GPS_CHECK` bit-mask gates. Working conclusions: ArduPilot is the cooperative path, iNav requires UBX impersonation. |
|
||||
| [`SQ1_existing_systems.md`](SQ1_existing_systems.md) | **SQ1** — Existing / competitor GPS-denied UAV navigation systems | #11–#20 (10 facts) | Twist Robotics OSCAR (Ukrainian peer), Auterion Artemis OS, Vantor Raptor, NGPS class systems, SPRIN-D winner, RTAB-Map / ORB-SLAM3 pruning rationale, DSMAC/TERCOM lineage, hierarchical retrieval-matching SOTA, AerialExtreMatch benchmark, DARPA FLA + USAF SBIR programs. Working conclusions: VPR-anchored hybrid pipeline is canonical. |
|
||||
| [`SQ2_canonical_pipeline.md`](SQ2_canonical_pipeline.md) | **SQ2** — Canonical GPS-denied pipeline & SOTA components | #21–#27 (7 facts) | Two-stage canonical pipeline (global VPR → local alignment → PnP-RANSAC → EKF), end-to-end visual-localization rejection (poor generalization, no covariance), cross-domain sat ↔ UAV registration, hardware MVE doctrine, Top-N inlier re-rank gate. Working conclusions: VIO + VPR + Matcher + PnP + EKF is the design floor. |
|
||||
| [`C1_vio.md`](C1_vio.md) | **C1** — Visual / Visual-Inertial Odometry | Candidate enumeration + decisions | VINS-Mono (GPL-3.0 lead candidate), VINS-Fusion (GPL-3.0 alternate), OpenVINS (GPL-3.0), OKVIS2 (BSD), Kimera-VIO (BSD), DROID-SLAM (BSD non-VIO), DPVO (MIT non-VIO), KLT+RANSAC (simple-baseline fallback). Decisions: D-C1-1 license posture, D-C1-2 Jetson Orin Nano MVE prerequisite. |
|
||||
| [`C2_vpr.md`](C2_vpr.md) | **C2** — Visual Place Recognition | Candidate enumeration + decisions | MixVPR, SALAD (GPL-3.0 disqualifier), SelaVPR, NetVLAD, EigenPlaces, AnyLoc, BoQ, DINOv2-VLAD. Decisions: D-C2-1 aerial-domain training, D-C2-2 cache budget, D-C2-3 input resolution shape, D-C2-N TensorRT export gate. |
|
||||
| [`C3_matchers.md`](C3_matchers.md) | **C3** — Cross-domain registration (Matchers) | Candidate enumeration + decisions | SP+LightGlue (Magic Leap noncommercial disqualifier on canonical SP), DISK+LightGlue (RECOMMENDED-PRIMARY-MITIGATION), ALIKED+LightGlue, XFeat (alternate-modern lead), SuperGlue+SuperPoint (deprecated by LightGlue authors), MASt3R (CC-BY-NC), RoMa, DKM, LoFTR. Decisions: D-C3-1 modern-competitive lead, D-C3-2 ONNX/TensorRT export path, D-C3-6 re-rank strategy. |
|
||||
| [`C4_pose_estimation.md`](C4_pose_estimation.md) | **C4** — Pose estimation (PnP + RANSAC + LM) | #52–#54 (3 facts, in progress) | OpenCV `cv::solvePnPRansac` mandatory simple-baseline (Apache-2.0 throughout, 9 SolvePnPMethod enum values with 2 BROKEN, paired `solvePnPRefineLM`/`solvePnPRefineVVS`/`solvePnPGeneric`, 7 USAC RANSAC variants); OpenGV modern-competitive-lead-richer-minimal-solver (BSD-3-Clause-equivalent NOASSERTION-SPDX-detector contingent + ~3-year stale + 4 algorithm-selectable RANSAC enums [KNEIP/GAO/EPNP/GP3P] + 2 P3P variants + UPnP global-optimal + GP3P generalized-camera; NO planar-scene dedicated solver vs OpenCV's IPPE); GTSAM modern-competitive-lead-covariance-honest (BSD-3-Clause clean throughout, daily-active maintenance, **NATIVE 6×6 pose covariance via `Marginals.marginalCovariance` — only C4 candidate to satisfy AC-NEW-4 NATIVELY**, no native RANSAC, ~50-200 MB footprint, tight AC-4.1 latency margin). Decisions: D-C4-1 (carry-forward) 2D-3D-lift; D-C4-2 (NEW + UPDATED) covariance-recovery-strategy; D-C4-3 (NEW) OpenGV license-clearance-verification; D-C4-4 (NEW) OpenGV maintenance-staleness-mitigation. Subsequent candidates pending: Theia / Ceres-only (likely deferrable — D-C4 row may already have sufficient coverage). |
|
||||
| [`C5_state_estimator.md`](C5_state_estimator.md) | **C5** — State estimator / sensor fusion | #88–#89 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | Manual ESKF reference (Solà 2017 canonical aerial/quaternion arXiv preprint — public-domain canonical equations + project-side custom implementation under project's Apache-2.0; mandatory simple-baseline; trivial dependency footprint at ~kilobytes of NumPy/SciPy code; native 6×6 covariance via analytic Jacobian propagation per Solà §6 canonical recipe; quaternion-correct attitude integration on SO(3) via small-angle approximation in error-state; **fastest C5 candidate by an order of magnitude** at ~5-15 ms per update on Jetson CPU); GTSAM `iSAM2` + `CombinedImuFactor` (Forster et al. RSS 2015) + `PreintegratedCombinedMeasurements` + `BetweenFactorPose3` + `GenericProjectionFactorCal3DS2` + `PriorFactorPose3` + smart projection factors + `Marginals.marginalCovariance` + `gtsam_unstable.IncrementalFixedLagSmoother` modern-competitive-lead-factor-graph (clean BSD-3-Clause throughout, daily-active maintenance with last-pushed 2026-05-08T13:00:22Z = TODAY at access time, **architecturally couples with C4 Fact #54 via shared GTSAM substrate**, native 6×6 posterior covariance via `Marginals` — same NATIVE AC-NEW-4 satisfaction pathway as C4 Fact #54, IMU pre-integration via Forster et al. RSS 2015 `CombinedImuFactor` 6-key per-keyframe-pair factor with bias evolution for asynchronous IMU+camera fusion at ~100-200 Hz IMU + 3 Hz camera, ~50-200 MB footprint, incremental smoothing via iSAM2 amortizes per-frame cost, **NATIVE AC-4.5 look-back refinement** unique among C5 candidates). 
Decisions: D-C5-1 (NEW) reference-implementation-license-verification; D-C5-2 (NEW) long-cruise-observability-strategy; D-C5-3 (NEW) sliding-window-primitive-choice; D-C5-4 (NEW) IMU-gap-handling-strategy; D-C5-5 (NEW) factor-density-choice (recommended D-C5-5 = (c) couples C4 Fact #54 D-C4-2 = (b) with C5 Fact #89 architectural integration via shared GTSAM substrate). |
|
||||
| [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) | **C6** — Tile cache + spatial index | #92–#93 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: Manual mirror of existing parent-suite `satellite-provider` pattern (verified directly via Source #92 filesystem read at /Users/obezdienie001/dev/azaion/suite/satellite-provider/) — PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` for geographic spatial-grid range queries + `bytea` descriptor blobs + app-side FAISS `IndexHNSWFlat(d, M=32)` loaded at takeoff via `faiss.read_index` for descriptor ANN + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.{image_type}` slippy-map convention; clean PostgreSQL License + MIT + LGPL/MIT-Apache; trivial dependency footprint (no Postgres extensions); empirically-confirmed Postgres-on-Jetson viability per Source #97 March 2026 article (CPU cores limiting, NOT memory); ~6-54 ms per cache hit comfortably within AC-4.1 400 ms p95 budget; ~700 MB-1.5 GB total memory footprint within AC-4.2 8 GB budget. **Cand 2 DEFERRED secondary**: PostgreSQL + PostGIS 3.4 GiST on `geography(POINT,4326)` with KNN distance ordering (`<->`) + pgvector 0.7+ HNSW for descriptor ANN + same filesystem tile storage; native KNN + radius + combined-SQL capabilities are real improvements BUT 5-10× slower geographic lookup than Cand 1 + heavier dependency (~50-100 MB additional memory + ~50-200 MB additional disk install) + PostGIS GPL-2.0-or-later license-complexity (CONTINGENT REJECT under D-C1-1 = (b) BSD/permissive-only-track) + DIVERGENT from suite pattern + improvements marginal-to-negative in project's pinned 3 Hz spatial-grid query operating context. **Comparative-improvement-vs-Cand-1 verdict**: per user's session-start "significant-improvement-only" bar, no material justification to deviate from existing satellite-provider pattern. 
Decisions: D-C6-1 (NEW) descriptor-storage-format choice (halfvec recommended); D-C6-2 (NEW Cand-1-only) FAISS index variant choice (IndexHNSWFlat M=32 recommended); D-C6-3 (NEW Cand-1-only CROSS-COMPONENT with C10) descriptor-cache-rebuild-trigger strategy (periodic-during-C10-pre-flight recommended); D-C6-4 (NEW Cand-1-only) geographic-spatial-grid radius (dynamic recommended); D-C6-5 (NEW Cand-2-only contingent) Jetson PostGIS+pgvector co-installation Plan-phase verification (verify-on-Jetson-MVE recommended); D-C6-6 (NEW Cand-2-only contingent) pgvector descriptor-storage-type choice (halfvec recommended); D-C6-7 (NEW CROSS-COMPONENT affects parent-suite satellite-provider) cascade-changes-back-to-suite strategy (leave-unchanged recommended given Cand 1 closure verdict). |
|
||||
| [`C7_inference_runtime.md`](C7_inference_runtime.md) | **C7** — On-Jetson inference runtime | #94–#96 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY**: TensorRT native — JetPack 6.2 bundled TensorRT 10.3 + `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision + engines built directly on Jetson Orin Nano Super SM 87 (Apache-2.0 in TensorRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 YOLO26 benchmark; engines tied to SM 87 hardware-specific per Source #105 — must be built on deployed Jetson via D-C7-7); **Cand 2 modern-competitive-lead-cross-architecture-portability**: ONNX Runtime + TensorRT EP — `onnxruntime-gpu` via Jetson AI Lab JP6/CU126 wheel index + `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback (MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts; `pip install onnxruntime-gpu` does NOT work on Jetson — needs Jetson AI Lab community wheel via D-C7-3 + numpy<2.0.0 pin via D-C7-4); **Cand 3 mandatory simple-baseline**: pure PyTorch FP16 — `torch.amp.autocast` + `model.half()` + Jetson AI Lab PyTorch 2.5 ARM64 wheel (BSD-3-Clause throughout; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines; standard `pip install torch` lacks CUDA on Jetson — needs Jetson AI Lab wheel via D-C7-5). **Cross-cutting precision policy** (D-C7-6 NEW CROSS-COMPONENT, affects C2+C3+C1+C7): VPR backbones (CNN-class MixVPR/EigenPlaces/NetVLAD) → INT8+FP16 mixed; ViT-class VPR (SelaVPR DINOv2-L; conditional AnyLoc/BoQ/DINOv2-VLAD) → FP16-only initially, INT8 deferred to Jetson MVE per D-C2-5; matchers (LightGlue with SP/DISK/ALIKED, XFeat, XFeat+LighterGlue) → **FP16-only — NO INT8** per Source #103 quantization-sensitivity finding (LightGlue FP8 ModelOpt collapsed match counts); learned VIO frontends → FP16-only initially. 
**Triton/DeepStream/CUDA-Python custom kernels considered-and-rejected** (server/video-pipeline class + out-of-budget for embedded 8 h mission) per c7_overkill_options scope choice. Decisions: D-C7-1 (NEW Cand-1-only CROSS-COMPONENT with C9) calibration-dataset-strategy (AerialVL S03 + AerialExtreMatch recommended); D-C7-2 (NEW Cand-1-only) TensorRT mixed-precision flag matrix (per-family policy per D-C7-6 recommended); D-C7-3 (NEW Cand-2-only) ORT-Jetson-wheel-index-pin (mirror to project artifact registry + cu126 recommended); D-C7-4 (NEW Cand-2-only) numpy-version-pin (`numpy<2.0.0` recommended); D-C7-5 (NEW Cand-3-only) PyTorch-Jetson-wheel-pin (PyTorch 2.5 + torchvision 0.20 recommended); D-C7-6 (NEW CROSS-COMPONENT C2+C3+C1+C7) INT8-vs-FP16-per-model-family-precision-policy (per-family policy recommended); D-C7-7 (NEW Cand-1-only CROSS-COMPONENT with C10) engine-build-on-Jetson-vs-prebuilt strategy (primary build-on-target + reference-Jetson fallback recommended); D-C7-8 (NEW Cand-1-only) `config.max_workspace_size` cap (1 GB safe default recommended); D-C7-9 (NEW Cand-1-only) TensorRT version pin within JetPack lifecycle (JetPack 6.2 + TensorRT 10.3 recommended). |
|
||||
| [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | **C10** — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; only D-C6-3 + D-C7-7 confirmation pipelines researched here, operator tooling design deferred to Plan-phase) | #100–#101 (2 facts, **batch 1 closed at 2/N 2026-05-08**) | **D-C6-3 confirmation (Fact #100)**: descriptor-cache rebuild trigger + atomic-write strategy via direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` (write-temp + `fsync` + atomic rename) + content-hash (SHA-256) verification gate at takeoff load + `IO_FLAG_MMAP_IFC` mmap load with `madvise(MADV_WILLNEED)` pre-fault + manifest-hash-driven rebuild trigger; FAISS MIT + atomicwrites MIT throughout; FAISS warns "no internal integrity check, expects validated input" — MITIGATED by content-hash gate at takeoff (binds AC-NEW-7 cache-poisoning safety); rebuild-while-not-flying constraint per restrictions.md. **D-C7-7 confirmation (Fact #101)**: hybrid TensorRT engine-build orchestration — Polygraphy CLI primary for INT8-calibrating builds (`polygraphy convert --int8 --calib-cache=<path> ...` Apache-2.0 + Calibrator API replaces hand-written `IInt8EntropyCalibrator2`) + `trtexec` for fast cache-reuse rebuilds (`--fp16 --int8 --calib=<existing_cache>`) + direct `IBuilderConfig` Python API as escape hatch for unusual models (LightGlue dynamic-shape profiles); calibration cache binary-blob reuse keyed by `SHA-256(calib_corpus)` per D-C10-6; engines tied to SM 87 hardware-specific per Source #105 → must be built on deployed Jetson per D-C7-7 closure (D-C10-8 reference-Jetson-at-HQ + deployed-Jetson-copy-to-archive prebuilt-fallback venue); self-describing filename schema `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` per D-C10-7; binds AC-4.1/4.2 latency+memory budgets via D-C7-2 mixed-precision flag matrix + D-C7-1 calibration corpus closure. |
|
||||
| [`C8_fc_adapter.md`](C8_fc_adapter.md) | **C8** — MAVLink / MSP2 FC adapter | #97–#99 (3 facts, **batch 1 closed at 3/N 2026-05-08**) | **Cand 1 RECOMMENDED PRIMARY for ArduPilot**: pymavlink → MAVLink `GPS_INPUT` (msg 232) cooperative-path; `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` periodic injection at 5 Hz over MAVLink (UART/USB/UDP per D-C8-1); FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` (verified Source #4 SQ6 + Source #106 + Source #107); pymavlink LGPL-3.0 linkable from Apache-2.0 app per LGPL §6 (D-C8-3 mitigation). **Cand 2 RECOMMENDED PRIMARY for iNav**: `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) via Python MSP V2 (YAMSPy or INAV-Toolkit `msp_v2_encode`); `mspGPSReceiveNewData()` direct passthrough (no validation gate beyond data parse); covariance fields `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy`/`vert_accuracy`/`speed_accuracy`; YAMSPy + INAV-Toolkit MIT throughout; `USE_GPS_PROTO_MSP` enabled by default in iNav target/common.h (verified Source #111 + #112 + #113); locked SQ6 + AC-4.3 + restrictions.md transport. **Cand 3 DEFERRED secondary for iNav**: UBX impersonation via pyubx2 NAV-PVT — forging u-blox NAV-PVT frames through standard GPS pipeline; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` (gnssFixOK) AND `fixType ∈ {2,3}` per Source #110 `gps_ublox.c` lines 215-220 + 654; pyubx2 BSD-3-Clause clean dual-use; **does NOT clear user's "significant-improvement-only" bar over Cand 2** — richer protocol surface (NAV-PVT periodic + NAV-VER startup + CFG-MSG/CFG-RATE ACK behaviour) + AC-NEW-7 forgery posture + stricter validation gate + AP-path field-name divergence outweigh pyubx2 library-maturity advantage. 
**Mid-batch correction**: I caught a contradiction between my own initial AskQuestion phrasing ("UBX impersonation as ONLY iNav path") and locked SQ6 + AC-4.3 + restrictions.md verdicts; user re-locked scope via `c8_inav_recovery=B` to evaluate both as parallel candidates. Decisions: D-C8-1 (NEW Cand-1-only) pymavlink connection-string transport choice (env-driven default-UART recommended); D-C8-2 (NEW Cand-1-only CROSS-COMPONENT with AC-NEW-2) `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch ownership pattern (companion publishes to source-set 2 + auto-switches FC recommended); D-C8-3 (NEW Cand-1-only) pymavlink LGPL-3.0 license-posture verification (bundle-unmodified-with-version-pin recommended); D-C8-4 (NEW Cand-2-only) Python MSP V2 implementation choice (YAMSPy primary + thin custom encoder fallback recommended); D-C8-5 (NEW Cand-2-only) MSP2_SENSOR_GPS injection rate (5 Hz periodic recommended); D-C8-6 (NEW Cand-3-only contingent) UBX-version-advertisement strategy (advertise version ≥ 15.0 recommended); D-C8-7 (NEW Cand-3-only contingent CROSS-COMPONENT with AC-NEW-7) AC-NEW-7 audit-trail posture for UBX impersonation (explicit FDR audit entry recommended); D-C8-8 (NEW CROSS-COMPONENT C5+C8) covariance-honesty cross-FC enforcement strategy (per-FC unit conversion recommended via 95% confidence ellipse semi-major axis from C5 GTSAM `Marginals.marginalCovariance`). |
|
||||
|
||||
**Cross-cutting consumers** (do not duplicate facts here, just point in):
|
||||
- The Component Fit Matrix (`../06_component_fit_matrix/`) cites every fact here by `Fact #N` or by candidate row.
|
||||
|
||||
---
|
||||
|
||||
## Confidence-label legend
|
||||
|
||||
| Label | Meaning | Source class |
| --- | --- | --- |
| ✅ High | Source code / official spec / canonical repo verified | L1 (primary code, official docs, published benchmarks) |
| ⚠️ Medium | Authoritative but with stated caveat (out-of-date version, partial coverage, single-source confirmation) | L1 / L2 |
| ❓ Low | Inferential or extrapolated (vendor blog, secondary commentary, candidate not yet runtime-verified on target hardware) | L3 / L4 |
|
||||
|
||||
Whenever a candidate is marked **Selected** in `../06_component_fit_matrix/`, its row depends on at least one ✅ High fact in the corresponding C-file plus a `context7` per-mode API capability verification.
|
||||
|
||||
---
|
||||
|
||||
## Editing rules
|
||||
|
||||
1. Add new facts only inside their owning category file. Cross-reference siblings; do not duplicate text.
2. Each fact keeps the existing schema: `### Fact #N — title`, `**Statement**`, `**Source**`, `**Phase**`, `**Confidence**`, `**Sub-Question Binding**`, `**Implication**`.
3. When extending C-rows, also touch the corresponding component file in `../06_component_fit_matrix/` so the matrix stays in sync.
4. Working conclusions and decisions (`D-Cx-y`) live at the bottom of their owning file, not here.
|
||||
@@ -0,0 +1,261 @@
|
||||
# Fact Cards — C10: Pre-flight cache provisioning (cross-coupling minimal scope)
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Bound to sub-questions in `../00_question_decomposition.md` line 78 (C10 = "Pre-flight cache provisioning + sector classification + freshness pipeline" with 2026-05-08 user-locked CROSS-COUPLING MINIMAL scope per `c10_scope=C` — see "C10 Scope Restructure" section). Sources for C10 cluster live in [`../01_source_registry/C10_preflight_provisioning.md`](../01_source_registry/C10_preflight_provisioning.md).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache + spatial index](C6_tile_cache_spatial_index.md), [C7 On-Jetson inference runtime](C7_inference_runtime.md), [C8 MAVLink/MSP2 FC adapter](C8_fc_adapter.md). Cross-component gates: [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## Scope summary
|
||||
|
||||
C10 batch 1 closed at 2/N on 2026-05-08. **Fact #100** = D-C6-3 confirmation pipeline (descriptor-cache rebuild trigger orchestration for the FAISS HNSW index built during C10 pre-flight provisioning + serialized via `faiss.write_index` + atomic-write + content-hash + manifest-driven rebuild trigger + load-at-takeoff via `faiss.read_index` or memory-mapped via `IO_FLAG_MMAP_IFC`). **Fact #101** = D-C7-7 confirmation pipeline (TensorRT engine-build orchestration via Polygraphy CLI primary + `trtexec` simpler fallback + direct `IBuilderConfig` Python API for reference-Jetson-prebuilt-engine generation; calibration corpus shipping mechanism per D-C7-1 closure). User-pinned scope: cross-coupling-minimal — operator CLI/desktop tooling, sector classification heuristics, and freshness pipeline workflow are **deferred to Plan-phase**.
|
||||
|
||||
---
|
||||
|
||||
### Fact #100 — D-C6-3 confirmation: descriptor-cache rebuild trigger pipeline orchestrated via direct `faiss.write_index` / `faiss.read_index` Python API + atomic-write + content-hash + manifest-driven rebuild trigger + optional `IO_FLAG_MMAP_IFC` load
|
||||
|
||||
**Statement**: For C10 (pre-flight cache provisioning, cross-coupling minimal scope), the D-C6-3 descriptor-cache rebuild trigger pipeline (Recommendation = `periodic rebuild during C10 pre-flight provisioning`) is operationalized as the direct FAISS Python API wrapped in a thin project-side orchestration module:
|
||||
|
||||
- **Build pipeline (per pre-flight, manifest-hash-driven)**:
|
||||
1. C10 pre-flight CLI computes `manifest_hash = sha256(descriptor_blobs.sha256, descriptor_dim, faiss_M, ef_construction, vpr_model_sha256)` over the inputs that would change the index content.
|
||||
2. Compare to `manifest_hash_prev` recorded in `/var/lib/onboard/cache/faiss/manifest.json` from the last successful build.
|
||||
3. If `manifest_hash != manifest_hash_prev` (or if `manifest.json` is missing): rebuild the FAISS index. Otherwise: skip.
|
||||
4. Rebuild = `index = faiss.IndexHNSWFlat(d=descriptor_dim, M=faiss_M)` (per D-C6-2 = `IndexHNSWFlat M=32` recommendation) → `index.hnsw.efConstruction = 40` (per Source #96 / Source #114 / C6 Fact #92 canonical pattern) → `index.add(descriptor_blobs)` → write to disk via the atomic-write wrapper (next bullet).
|
||||
5. Write the index through the atomic-write wrapper:

```python
# Sketch: index, target_path, manifest_path and the manifest fields come from steps 1-4.
# May use the python-atomicwrites package or be hand-rolled per Source #116.
import datetime
import hashlib
import json
import os

import faiss


def fsync_path(path, flags=os.O_RDONLY):
    fd = os.open(path, flags)
    try:
        os.fsync(fd)                    # flush content + metadata to disk
    finally:
        os.close(fd)


def atomic_replace(temp_path, target_path):
    fsync_path(temp_path)
    os.replace(temp_path, target_path)  # POSIX atomic rename (same filesystem)
    parent = os.path.dirname(target_path) or "."
    fsync_path(parent, os.O_RDONLY | os.O_DIRECTORY)  # flush directory entry change


faiss.write_index(index, target_path + ".tmp")  # FAISS writes serialized binary
atomic_replace(target_path + ".tmp", target_path)
with open(target_path, "rb") as f:
    content_hash = hashlib.sha256(f.read()).hexdigest()
manifest = {"manifest_hash": manifest_hash,
            "content_hash": content_hash,
            "descriptor_dim": descriptor_dim,
            "faiss_M": faiss_M,
            "ef_construction": ef_construction,
            "n_tiles": index.ntotal,
            "build_iso8601": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "vpr_model_sha256": vpr_model_sha256,
            "build_duration_sec": build_duration_sec}
with open(manifest_path + ".tmp", "w", encoding="utf-8") as f:
    json.dump(manifest, f)
atomic_replace(manifest_path + ".tmp", manifest_path)
```
|
||||
6. C10 also records the build event into the AC-NEW-3 FDR record: `(model="faiss_hnsw", manifest_hash, content_hash, build_duration_sec, n_tiles, descriptor_dim)`.
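
Steps 1-3 of the build pipeline can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names `compute_manifest_hash` and `needs_rebuild`, and the choice of canonical JSON (sorted keys) as the hashing input, are assumptions made here so that the same inputs always produce the same hash.

```python
import hashlib
import json
from pathlib import Path


def compute_manifest_hash(descriptor_blobs_sha256: str, descriptor_dim: int,
                          faiss_m: int, ef_construction: int,
                          vpr_model_sha256: str) -> str:
    """Hash the inputs that change index content (hypothetical encoding)."""
    payload = {
        "descriptor_blobs_sha256": descriptor_blobs_sha256,
        "descriptor_dim": descriptor_dim,
        "faiss_M": faiss_m,
        "ef_construction": ef_construction,
        "vpr_model_sha256": vpr_model_sha256,
    }
    # Sorted keys + fixed separators give a byte-stable encoding.
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def needs_rebuild(manifest_path: Path, manifest_hash: str) -> bool:
    """True when manifest.json is missing, unreadable, or records another hash."""
    try:
        previous = json.loads(manifest_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return True
    if not isinstance(previous, dict):
        return True
    return previous.get("manifest_hash") != manifest_hash
```

Treating an unreadable manifest as "rebuild" errs on the safe side: a corrupt manifest costs one extra build rather than a stale index.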
|
||||
|
||||
- **Load pipeline (per takeoff)**:
|
||||
1. Read `/var/lib/onboard/cache/faiss/manifest.json` → recover `expected_content_hash`.
|
||||
2. Compute `actual_content_hash = sha256(open(target_path, 'rb').read())` (single-pass file read; ~0.5-2 s on JetPack 6 ARM64 NVMe per ~430 MB halfvec file at 2048-D × 100K tiles per Source #115 size formula).
|
||||
3. Compare: if `actual != expected` → REJECT the cache; emit `STARTUP_FAULT_FAISS_CACHE_HASH_MISMATCH` MAVLink STATUSTEXT to QGC; refuse takeoff (per AC-NEW-7 cache-poisoning safety budget — never silently load a tampered cache file).
|
||||
4. Otherwise: `index = faiss.read_index(target_path, faiss.IO_FLAG_MMAP_IFC)` (memory-mapped load — zero-copy; <1 s wall-time for the syscall to set up mmap regardless of file size; per Source #114 supports HNSW + IndexFlatCodes-derived classes via the `IO_FLAG_MMAP_IFC` flag).
|
||||
5. Optional: warmup query at takeoff (issue ~10 dummy `index.search(rand_query, k=10)` calls) to prime the kernel page cache — smooths post-load p99 latency per Source #115 Issue #622 observation.
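
The hash gate in steps 1-3 of the load pipeline might look like the sketch below. It streams the file in chunks instead of reading it in one call, which is an assumption made here to keep peak memory flat for a file of several hundred MB. The names `sha256_file` and `cache_is_trusted` are hypothetical. The caller is expected to emit the startup fault and refuse takeoff on `False`, and to call `faiss.read_index` only on `True`.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_is_trusted(index_path: Path, manifest_path: Path) -> bool:
    """Compare the index file's hash with the manifest's content_hash.

    Any missing or malformed input counts as untrusted.
    """
    try:
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        expected = manifest["content_hash"]
        actual = sha256_file(index_path)
    except (OSError, json.JSONDecodeError, KeyError, TypeError):
        return False
    return actual == expected
```

Folding every failure mode into `False` keeps the takeoff decision binary: the cache either verifies or it does not load.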
|
||||
|
||||
- **Pinned input/output contract**:
|
||||
- inputs: `descriptor_blobs[*]` per tile (numpy.ndarray of shape `(n_tiles, descriptor_dim)` and dtype float32 or halfvec per D-C6-1) computed by C10 pre-flight via running C2 VPR backbone over each cached tile image; `vpr_model_sha256` (the C2 VPR model artifact hash) — feeds into `manifest_hash` so a model-swap forces an index rebuild.
|
||||
- outputs: `<faiss_cache_dir>/v_<descriptor_dim>_M<HNSW_M>.index` (FAISS binary serialization per Source #114) + `<faiss_cache_dir>/manifest.json` (project-defined JSON manifest with content-hash + build provenance).
|
||||
- runtime: the pre-flight build runs on the operator workstation or on the deployed Jetson. D-C7-7 recommends build-on-target for TensorRT engines because they are tied to SM 87, but that constraint does not apply to FAISS: HNSW serialization is hardware-agnostic, so the index can be built once on any x86/ARM machine and shipped. Running the build on the deployed Jetson simply reuses the same workflow. Load runs on the deployed Jetson at takeoff via the `faiss.read_index` Python call.
|
||||
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: `descriptor_blobs: numpy.ndarray of shape (n_tiles, descriptor_dim) and dtype float32 or halfvec`; `descriptor_dim: int ∈ {256, 512, 1024, 2048, 4096}` per D-C2-9/10/6 final lock; `faiss_M: int = 32` per D-C6-2 lock; `ef_construction: int = 40` per Source #96 + C6 Fact #92 canonical pattern; `vpr_model_sha256: str` for manifest-hash binding
|
||||
- outputs: serialized FAISS index file at canonical path `<faiss_cache_dir>/v_<descriptor_dim>_M<HNSW_M>.index` + manifest.json with content-hash + build provenance + per-takeoff load latency <5 s (mmap path: <1 s; full-load path at 100K × 2048-D halfvec = ~430 MB / SATA SSD ~500 MB/s = ~0.9 s + page-cache warmup ~1-2 s)
|
||||
- runtime: FAISS-CPU 1.7+ ARM64 wheel via `pip install faiss-cpu` on JetPack 6 + Python 3.10 + NumPy<2.0.0 (per D-C7-4 cross-coupled numpy-version-pin from C7 batch 1 — same pinning applies here since FAISS-CPU shares the numpy ABI dependency)
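
The pinned inputs above could be captured in a small validated parameter object, sketched here under stated assumptions: the class name `IndexBuildParams`, its field names, and the validation rules are illustrative only. The allowed dimensions, the defaults `M=32` and `efConstruction=40`, and the filename pattern are taken from the pins listed above.

```python
from dataclasses import dataclass

# Allowed descriptor dimensions as listed in the mode pinning.
ALLOWED_DIMS = frozenset({256, 512, 1024, 2048, 4096})
HEX_DIGITS = set("0123456789abcdef")


@dataclass(frozen=True)
class IndexBuildParams:
    """Hypothetical container for the pinned FAISS build inputs."""
    descriptor_dim: int
    vpr_model_sha256: str
    faiss_m: int = 32
    ef_construction: int = 40

    def __post_init__(self):
        if self.descriptor_dim not in ALLOWED_DIMS:
            raise ValueError(f"unsupported descriptor_dim: {self.descriptor_dim}")
        sha = self.vpr_model_sha256.lower()
        if len(sha) != 64 or not set(sha) <= HEX_DIGITS:
            raise ValueError("vpr_model_sha256 must be 64 hex characters")
        if self.faiss_m <= 0 or self.ef_construction <= 0:
            raise ValueError("faiss_m and ef_construction must be positive")

    def index_filename(self) -> str:
        # Follows the v_<descriptor_dim>_M<HNSW_M>.index naming convention.
        return f"v_{self.descriptor_dim}_M{self.faiss_m}.index"
```

Rejecting bad parameters at construction time means a typo in the pre-flight configuration fails before any index bytes are written.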
|
||||
|
||||
**Source**:
|
||||
- Primary FAISS API: Source #114 (`faiss.write_index` / `faiss.read_index` + `IO_FLAG_MMAP_IFC` flag + explicit security warning — canonical FAISS GitHub Wiki + context7 indexed at `/facebookresearch/faiss`)
|
||||
- File-size + load-latency formula: Source #115 (FAISS GitHub Discussions #3953 + canonical `IndexHNSWFlat` C++ API docs cross-cite — per-vector cost formula `(vector_dim × 4) + (M × 4 × 2) + overhead`)
|
||||
- Atomic-write pattern: Source #116 (gocept blog reliable Python file updates + python-atomicwrites docs + Python tracker Issue 8604 — write-temp + fsync + atomic rename + parent-dir fsync canonical pattern; aligns with POSIX `rename(2)` atomicity guarantee)
|
||||
- Cross-cite: C6 Fact #92 (D-C6-3 originating recommendation = periodic rebuild during C10 pre-flight + `faiss.write_index`), C7 Fact #94 (D-C7-1 calibration-dataset-strategy closure that drives the `vpr_model_sha256` provenance binding)
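
The per-vector cost formula cited from Source #115 can be checked with a few lines of arithmetic. The sketch below leaves out the unspecified `overhead` term and makes the bytes-per-component value a parameter. That parameter is an assumption introduced here: the formula as cited uses 4 bytes per component, whereas the ~430 MB figure quoted for 2048-D × 100K tiles matches 2 bytes per component.

```python
def hnsw_flat_bytes_per_vector(dim: int, m: int = 32,
                               bytes_per_component: int = 4) -> int:
    """(vector_dim x bytes_per_component) + (M x 4 x 2), overhead omitted."""
    return dim * bytes_per_component + m * 4 * 2


def index_size_mb(n_vectors: int, dim: int, m: int = 32,
                  bytes_per_component: int = 4) -> float:
    """Approximate index size in decimal megabytes (10**6 bytes)."""
    return n_vectors * hnsw_flat_bytes_per_vector(dim, m, bytes_per_component) / 1e6


# 100K tiles, M=32:
#   2048-D at 2 bytes/component -> 435.2 MB (the "~430 MB" figure)
#   2048-D at 4 bytes/component -> 844.8 MB
#   256-D  at 2 bytes/component ->  76.8 MB
#   512-D  at 2 bytes/component -> 128.0 MB
for dim, width in [(2048, 2), (2048, 4), (256, 2), (512, 2)]:
    print(dim, width, index_size_mb(100_000, dim, 32, width))
```

At a sustained 500 MB/s, the 435.2 MB case reads in roughly 0.87 s, which lines up with the ~0.9 s full-load estimate in the mode pinning.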
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High — all evidence is L1/L2 with direct API verification; security-warning-driven content-hash gate is the project-side mitigation for the documented FAISS warning; atomic-write pattern is canonical POSIX semantics; FAISS load latency at the project's pinned descriptor dimensions comfortably fits the <5 s takeoff budget via either full-load or mmap path.
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C10 row in `../06_component_fit_matrix/C10_preflight_provisioning.md` (this fact populates the D-C6-3 confirmation candidate row)
|
||||
- D-C6-3 cross-coupling: closes the C6 ↔ C10 cross-component gate inherited from C6 Fact #92 (`Plan-phase architect + C10 owner` joint ownership)
|
||||
- AC-NEW-7 (cache-poisoning safety budget): the content-hash verification gate at takeoff is the project-side mitigation for FAISS's documented "no internal integrity check" warning; binds to AC-NEW-7's per-flight forgery-detection contract
|
||||
- AC-3.3 (re-localization stability): atomic-write + content-hash gate guarantees same-cache-content → same-cache-load → same-result determinism across reboots and pre-flight rebuilds
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|---|---|---|
| **R-NEW-2 no cloud at flight** | ✅ PASS | All FAISS read/write operations are local; `faiss.read_index` opens a local file; no network calls. |
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ✅ PASS | FAISS-CPU ARM64 wheels are available via `pip install faiss-cpu` (cross-cite C6 Fact #92 + Source #97); no Jetson-specific issues with `faiss.write_index` / `faiss.read_index` / `IO_FLAG_MMAP_IFC` (canonical FAISS Python API works identically on ARM64). |
| **AC-1.x position accuracy** | N/A | Cache file write/read is downstream of accuracy; this fact concerns the descriptor-cache provenance layer. |
| **AC-3.3 re-localization stability** | ✅ PASS | Atomic-write + content-hash gate guarantees deterministic cache load across reboots; rebuild only when manifest hash changes; no silent cache mutation at runtime. |
| **AC-3.4 operator re-loc hint** | ✅ PASS | Operator re-loc hint uses the same loaded FAISS index (no rebuild required at runtime); content-hash gate at takeoff suffices. |
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | N/A | This is pre-flight + takeoff-load, NOT runtime per-frame. Runtime per-frame latency is governed by C6 Fact #92 (~6-54 ms per cache hit). |
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | FAISS index in-memory footprint at the project's pinned descriptor dimensions: ~430 MB at 2048-D halfvec × 100K tiles per Source #115 formula (well within C6 Fact #92's 700 MB-1.5 GB Postgres+FAISS+cache subtotal). With `IO_FLAG_MMAP_IFC` the index is mmap'd from disk on demand; peak RSS reduces further at the cost of a page-fault per first-time access. |
| **AC-4.5 look-back refinement** | N/A | Pre-flight cache + takeoff load are forward-only events. |
| **AC-8.3 10 GB persistent tile cache budget** | ✅ PASS | FAISS index file size at the project's pinned descriptor dimensions: ~430 MB at 2048-D halfvec × 100K tiles + ~80-160 MB at 256-D/512-D halfvec for smaller VPR backbones; fits comfortably within the 10 GB cache budget (well under 5% even at the largest 2048-D variant). |
| **AC-NEW-1 cold-start TTFF (<30 s p95)** | ✅ PASS | Takeoff-load via mmap path: <1 s; full-load path at 430 MB file: ~0.9-2 s; well within the AC-NEW-1 30-second cold-start TTFF budget. Content-hash gate adds ~0.5-2 s for the 430 MB SHA-256 pass; together <5 s, comfortably within budget. |
| **AC-NEW-3 (FDR)** | ✅ PASS | Per-rebuild manifest entry (manifest_hash, content_hash, build_duration_sec, n_tiles, descriptor_dim, vpr_model_sha256) is recordable as an FDR field; per-takeoff load-latency + hash-verification result are recordable as FDR fields. |
| **AC-NEW-4 covariance honesty** | N/A | Pre-flight pipeline is upstream of the C5 estimator; covariance honesty is C5's contract. |
| **AC-NEW-7 cache-poisoning safety budget** | ✅ PASS at the FAISS-cache layer | Content-hash gate at takeoff load REJECTS cache files that don't match the manifest (per Source #114 explicit security warning); atomic-write pattern (Source #116) prevents partial-write corruption from masquerading as a valid cache; manifest-hash-driven rebuild triggers ensure that a model swap forces a rebuild with new content hash. **Cross-flight cache poisoning** (per AC-NEW-7's "P(geo-misalign >30 m) <1%" budget) is upstream of C10: it's the C6 Fact #92 + AC-8.4 mid-flight tile generation responsibility plus the Suite Service voting layer per AC-NEW-7 external-dependency note. |
| **AC-NEW-8 blackout failsafe** | ✅ PASS | Pre-flight pipeline doesn't run during flight; if the FAISS cache is corrupt at takeoff, the cache-hash-mismatch gate refuses takeoff (which is safer than launching with a bad cache). C5 demotion to `dead_reckoned` is the runtime failsafe path, not the pre-flight one. |
|
||||
|
||||
**Strengths** (positive structural advantages):
|
||||
1. **Direct FAISS API — minimal abstraction surface**. No additional library dependency beyond FAISS-CPU (already required by C6 Fact #92); no orchestration framework to maintain. The atomic-write wrapper is ~30 lines of Python; trivially auditable; works identically across operator workstation + deployed Jetson environments.
|
||||
2. **Manifest-hash-driven rebuild trigger** — idempotent (skip rebuild if no change); minimum-rebuild semantics (rebuild only when descriptor_blobs OR vpr_model_sha256 OR descriptor_dim changes); aligns naturally with C10 pre-flight workflow (descriptor blobs change when tiles are pulled/refreshed; VPR model changes only on dev-side model swap).
|
||||
3. **Content-hash verification gate at takeoff** — operationalizes the FAISS security warning as project-side AC-NEW-7 coverage; never silently loads a tampered cache file.
|
||||
4. **Atomic-write pattern guarantees crash safety** — power loss or process kill mid-build leaves the previous valid cache file intact (per POSIX `rename(2)` atomicity); next pre-flight rebuild detects the manifest mismatch and rebuilds cleanly.
5. **Optional mmap load path (`IO_FLAG_MMAP_IFC`)** — zero-copy load syscall completes in <1 s regardless of file size; reduces takeoff RSS pressure; canonical FAISS HNSW + IndexFlatCodes-derived support per Source #114.
6. **Hardware-agnostic FAISS serialization** — index can be built on the operator workstation (x86) and shipped to the Jetson (ARM64) without rebuild (vs C7's SM 87 hardware-tying constraint for TensorRT engines). Useful for the prebuilt-fallback path.
7. **License clean throughout** — FAISS (MIT); python-atomicwrites if used (MIT); no GPL contagion path on this orchestration layer.

**Negative-but-mitigable structural findings**:

8. **No FAISS-internal integrity check on `read_index`** (per Source #114 explicit warning) — must be mitigated project-side via the content-hash gate above. Without that gate, AC-NEW-7 fails. **Mitigation**: project-side ~5 lines of Python (open file → SHA-256 → compare to manifest) before the `read_index` call; cost ~0.5-2 s at takeoff for a 430 MB cache file.
9. **Atomic-write pattern is project-side, not FAISS-internal** — must be hand-rolled or via `python-atomicwrites`. **Mitigation**: ~30 lines of Python; well-documented canonical POSIX pattern per Source #116; trivially auditable.
10. **Manifest-hash binding requires VPR model SHA-256** — implies the C2 VPR model artifact has a stable SHA-256 (i.e., a versioned ONNX-or-engine file is checked into the cache directory or referenced from a versioned URI). **Mitigation**: standard ML artifact versioning; aligns with the C7 Fact #94 + C7 Fact #95 + C7 Fact #96 ONNX export pathway (each ONNX export is a binary file with a deterministic hash).
11. **Mmap path RAM behavior depends on OS page cache pressure** — if other workloads consume RAM, mmap'd FAISS index pages may be evicted and re-faulted at runtime, adding ~1-5 ms per evicted page-fault to per-frame query latency. **Mitigation**: `mlock` / `madvise(MADV_WILLNEED)` syscalls available in Python via `mmap.MADV_WILLNEED` to pre-fault the pages; cost: one-time at takeoff (~1-2 s for the 430 MB file). At 8 GB shared budget (with C6 Fact #92's 700 MB-1.5 GB total subtotal) there's ample headroom for keeping the mmap'd index resident.
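The content-hash gate (finding 8) and the write-temp + fsync + rename pattern (finding 9) can be sketched together using only the standard library. This is a minimal illustration under stated assumptions, not the project's wrapper: the helper names `sha256_file`, `atomic_write_bytes`, and `verify_cache` are hypothetical, and `payload` stands in for whatever bytes the FAISS serialization step produces.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so a large cache never sits fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write_bytes(dest, payload):
    """Write-temp + fsync + rename: readers see the old file or the new one."""
    dest = Path(dest)
    # The temp file lives in the destination directory so the rename stays
    # on one filesystem (a cross-filesystem rename is not atomic).
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, dest)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    # fsync the directory so the rename itself survives power loss (POSIX only).
    dir_fd = os.open(dest.parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def verify_cache(path, expected_sha256):
    """Return True only when the on-disk cache matches the manifest hash."""
    return os.path.exists(path) and sha256_file(path) == expected_sha256
```

A takeoff loader would call `verify_cache(path, manifest_sha256)` first and hand the path to `read_index` only on success; on a mismatch it would refuse to load, in line with the D-C10-3 recommendation.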

**Caveats / open Plan-phase decisions raised** (D-C10-N gates):
- **D-C10-1 NEW** — descriptor-cache rebuild trigger choice (manifest-hash-driven [recommended] / always-rebuild-every-pre-flight / operator-manual flag): trade-off between idempotency vs simplicity vs operator control. **Recommendation**: D-C10-1 = (a) manifest-hash-driven (idempotent + minimum-rebuild + operator-manual override flag `--force-rebuild` available).
- **D-C10-2 NEW** — descriptor-cache atomic-write strategy (write-temp+fsync+rename hand-rolled / `python-atomicwrites` package / accept-non-atomic-write-and-pray): trade-off between dependency surface vs implementation cost vs crash safety. **Recommendation**: D-C10-2 = (b) `python-atomicwrites` (MIT, ~zero-cost dependency, cross-platform, well-tested in production); fallback (a) hand-rolled if dependency-policy gate prefers in-tree.
- **D-C10-3 NEW (CROSS-COMPONENT with AC-NEW-7)** — content-hash verification gate at takeoff load (yes — REJECT cache + STATUSTEXT + refuse takeoff [recommended] / yes — WARN + load anyway / no — trust filesystem): trade-off between safety vs availability vs operator-friction. **Recommendation**: D-C10-3 = (a) reject-and-refuse-takeoff; AC-NEW-7 cache-poisoning budget makes silent acceptance unsafe; operator can re-run pre-flight with `--force-rebuild` to cleanly recover.
- **D-C10-4 NEW** — descriptor-cache load path (full-`read_index` / mmap via `IO_FLAG_MMAP_IFC` [recommended] / both available via env flag): trade-off between determinism (full-load is fully resident; mmap RSS depends on page cache) vs takeoff latency (mmap is faster) vs runtime page-fault sensitivity. **Recommendation**: D-C10-4 = (b) mmap with optional `madvise(MADV_WILLNEED)` pre-fault at takeoff (~1-2 s additional cost; eliminates runtime page-faults for the lifetime of the flight) OR (c) both available for Plan-phase Jetson MVE comparison.
---

### Fact #101 — D-C7-7 confirmation: TensorRT engine-build pipeline orchestrated via Polygraphy CLI (primary) + `trtexec` (simpler fallback) + direct `IBuilderConfig` Python API (reference-Jetson-prebuilt-engine fallback generation)
**Statement**: For C10 (pre-flight cache provisioning, cross-coupling minimal scope), the D-C7-7 TensorRT engine-build pipeline (Recommendation = `primary build-on-deployed-Jetson during pre-flight + reference-Jetson-built engines as fallback`) is operationalized as a three-tool orchestration matrix:
- **Primary path: Polygraphy CLI on the deployed Jetson during pre-flight** (per D-C7-7 = primary build-on-target):

  ```bash
  polygraphy convert <model>.onnx \
    --int8 --fp16 \
    --data-loader-script ./calib_data_loader.py \
    --calibration-cache <calib_cache_dir>/<model>_calib.cache \
    --workspace=1000000000 \
    -o <engine_cache_dir>/<model>_sm87_jp62_trt103_<precision>.engine
  ```

  - First build per-model: `--data-loader-script` reads the project's pinned calibration corpus per D-C7-1 closure (real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles; ~500-1500 representative samples per Source #120) and runs INT8 calibration; the resulting calibration scales are written to `--calibration-cache` for subsequent builds.
  - Subsequent rebuilds (when calibration corpus is unchanged): `polygraphy convert ... --calibration-cache <existing_cache>` — calibration step is skipped per Source #117 ("If the provided path does exist, it will be read and int8 calibration will be skipped during engine building").
  - Per-model precision flags follow D-C7-2 / D-C7-6 cross-component policy: VPR backbones (CNN-class) → `--int8 --fp16`; ViT-class VPR + matchers + learned VIO → `--fp16` only (NO `--int8`).
  - `--workspace=1000000000` (1 GB cap) per D-C7-8 lock to prevent tactic-profile segfault on 8 GB shared budget.
  - On-disk engine filename incorporates SM 87 + JetPack 6.2 + TRT 10.3 + precision tag (per D-C7-9 lock) so the runtime can reject a cached engine that was built for a different SM/JP/TRT/precision combination.
- **Simpler fallback: `trtexec` CLI** (when calibration cache already exists or for ad-hoc/emergency rebuilds):

  ```bash
  trtexec --onnx=<model>.onnx \
    --saveEngine=<engine_cache_dir>/<model>_sm87_jp62_trt103_<precision>.engine \
    --fp16 --int8 \
    --calib=<calib_cache_dir>/<model>_calib.cache \
    --shapes=input:1x3x224x224 \
    --workspace=1000
  ```

  - Faster invocation (no Python imports; single C++ binary).
  - Calibration cache file format is interoperable with Polygraphy's per Source #119 — caches built by Polygraphy are loadable by `trtexec` and vice versa.
  - Used as fallback when Polygraphy is unavailable (e.g., minimal install) OR for reference-Jetson-prebuilt-engine generation when no calibration data shipping is needed.
  - Critical caveat: `trtexec --int8` without `--calib` falls back to RANDOM data calibration → ~5-15% INT8 accuracy collapse → forbidden in the project's C10 path (always supply `--calib` from the existing calibration cache).
- **Reference-Jetson-prebuilt-engine fallback generation** (per D-C7-7 fallback path, for emergency provisioning): direct TensorRT `IBuilderConfig` + `IInt8EntropyCalibrator2` Python API per Source #121 — used when Polygraphy's `--data-loader-script` abstraction is too rigid for an unusual model (e.g., LightGlue with dynamic-shape inputs requiring a custom calibration profile per D-C3-2 + D-C3-3). Output: a versioned `.engine` file shipped to the deployed Jetson alongside the calibration cache file. The deployed Jetson at takeoff loads this prebuilt engine via `IRuntime.deserializeCudaEngine` (no on-Jetson rebuild required for the fallback path).
- **Manifest-hash + content-hash + atomic-write** (same pattern as Fact #100):
  - `manifest_hash = sha256(model_onnx.sha256, calibration_corpus.sha256, precision_mode, sm_version, jp_version, trt_version)` per engine.
  - `content_hash = sha256(<engine>.engine)` after build.
  - Atomic-write wrapper around the engine file output (Polygraphy + trtexec both write to a temp path inside their respective CLIs, but the project-side wrapper enforces the rename-into-position step on top to maintain crash safety across the broader pre-flight workflow).
  - Per-engine manifest entry recorded in `<engine_cache_dir>/manifest.json`: `(model, precision_mode, calib_corpus_sha256, build_iso8601, build_duration_sec, content_hash, sm_version, jp_version, trt_version)`.
- **Pinned input/output contract**:
  - inputs: `<model>.onnx` per inference target (C2 VPR backbone + C3 matcher + optional C1 learned VIO frontend, exported on the dev machine via `torch.onnx.export`); `calibration_corpus` per D-C7-1 closure (real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles in NumPy `.npy` or Torch `.pt` tensor format); `<calib_cache>` per Polygraphy/trtexec INT8 calibration cache file (project-side ships the calibration corpus + the calibration cache; cache is reusable across rebuilds when the corpus hash is unchanged).
  - outputs: per-model `.engine` file at canonical path `<engine_cache_dir>/<model>_sm87_jp62_trt103_<precision>.engine` + per-engine manifest entry in `<engine_cache_dir>/manifest.json` + AC-NEW-3 FDR record.
  - runtime context: pre-flight build runs ON the deployed Jetson Orin Nano Super (per D-C7-7 = primary build-on-target — per Source #105 SM 87 hardware-tying constraint). Reference-Jetson-prebuilt-engine fallback runs on a known-good HQ Jetson (same SM 87 / JetPack 6.2 / TensorRT 10.3 — per D-C7-9 lock).
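The manifest-hash binding described above can be sketched as follows. The field list comes from the `manifest_hash` definition in this fact; the canonical-JSON serialization, the helper names `manifest_hash` and `needs_rebuild`, and the idea of storing a `manifest_hash` key inside each manifest entry are assumptions made for illustration only.

```python
import hashlib
import json


def manifest_hash(model_onnx_sha256, calib_corpus_sha256, precision_mode,
                  sm_version, jp_version, trt_version):
    """Hash the build inputs; any change yields a new hash and forces a rebuild."""
    fields = {
        "model_onnx_sha256": model_onnx_sha256,
        "calib_corpus_sha256": calib_corpus_sha256,
        "precision_mode": precision_mode,
        "sm_version": sm_version,
        "jp_version": jp_version,
        "trt_version": trt_version,
    }
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    # across runs and machines. This encoding is an assumption of the sketch.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(manifest, engine_name, current_hash, force=False):
    """Rebuild when forced, when no entry exists, or when the inputs changed."""
    entry = manifest.get(engine_name)
    return force or entry is None or entry.get("manifest_hash") != current_hash
```

With this shape, the `--force-trt-rebuild` override maps onto `force=True`, and an unchanged ONNX model plus unchanged calibration corpus leaves the cached engine in place.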

**Mode pinning** (per-mode API verification rule):
- inputs: `<model>.onnx: bytes` (ONNX graph from `torch.onnx.export`); `calibration_corpus: numpy.ndarray of shape [N=500-1500, C=3, H=224-320, W=224-320] and dtype float32 normalized to [0, 1]` per project's pinned VPR + matcher input shapes per D-C2-3 / D-C2-5 / D-C3-3; `precision_mode: str ∈ {'int8+fp16', 'fp16'}` per D-C7-6 per-family policy
- outputs: serialized TensorRT engine file `.engine` + calibration cache file `.cache` (interoperable between Polygraphy and trtexec per Source #119) + manifest entry
- runtime: TensorRT 10.3 + CUDA 12.6 + cuDNN 9.3 on JetPack 6.2 + Polygraphy bundled with TensorRT distribution OR `pip install nvidia-pyindex && pip install polygraphy` (Polygraphy is pure Python; ARM64 Python + TensorRT Python bindings sufficient)
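As an illustration of how the pinned `precision_mode` values could drive the Polygraphy invocation, the sketch below maps each mode to the flags quoted in this fact and assembles an argument list. It uses only flags that appear in the command above; the function name `polygraphy_convert_argv` and its parameters are hypothetical, and nothing is executed.

```python
# Per-family precision policy per D-C7-6: CNN-class VPR backbones get
# INT8 + FP16, everything else (ViT-class VPR, matchers, learned VIO) FP16 only.
PRECISION_FLAGS = {
    "int8+fp16": ["--int8", "--fp16"],
    "fp16": ["--fp16"],
}


def polygraphy_convert_argv(onnx_path, engine_path, precision_mode,
                            calib_cache=None, data_loader_script=None,
                            workspace_bytes=1_000_000_000):
    """Build the argv for `polygraphy convert` without running it."""
    if precision_mode not in PRECISION_FLAGS:
        raise ValueError(f"unsupported precision_mode: {precision_mode!r}")
    argv = ["polygraphy", "convert", onnx_path, *PRECISION_FLAGS[precision_mode]]
    if precision_mode == "int8+fp16":
        # INT8 needs calibration data or an existing calibration cache.
        if data_loader_script is None and calib_cache is None:
            raise ValueError("INT8 build needs a data loader script or a calibration cache")
        if data_loader_script is not None:
            argv += ["--data-loader-script", data_loader_script]
        if calib_cache is not None:
            argv += ["--calibration-cache", calib_cache]
    argv += [f"--workspace={workspace_bytes}", "-o", engine_path]
    return argv
```

A pre-flight orchestrator could pass the result to `subprocess.run(argv, check=True)`; keeping argument assembly separate from execution makes the per-family policy testable without a Jetson.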

**Source**:
- Primary Polygraphy CLI: Source #117 NVIDIA/TensorRT GitHub `tools/Polygraphy/examples/cli/convert/01_int8_calibration_in_tensorrt/README.md` + canonical Polygraphy docs context7 indexed at `/websites/nvidia_deeplearning_tensorrt_static_polygraphy` (1041 code snippets, Source Reputation High)
- Polygraphy `Calibrator` class API: Source #118 canonical NVIDIA TensorRT/Polygraphy SDK documentation (entropy/min-max algo defaults, dynamic-shapes calibration profile, data-loader-script + calibration-cache CLI flags)
- `trtexec` CLI: Source #119 canonical NVIDIA TensorRT SDK documentation (`--onnx --saveEngine --int8 --fp16 --calib --shapes --workspace` flag set; calibration cache format interoperability with Polygraphy)
- Calibration corpus size guidance: Source #120 vendor-aligned engineering guide (500-1000 image recommendation; cross-cite to project's D-C7-1 closure 500-1500 sample range)
- Direct `IBuilderConfig` Python API: Source #121 (cross-cite from C7 batch 1 Source #102 + Source #105) — used for reference-Jetson-prebuilt-engine fallback generation
- Cross-cite: C7 Fact #94 (D-C7-7 originating recommendation = primary build-on-deployed-Jetson + fallback prebuilt; D-C7-8 = 1 GB workspace; D-C7-9 = JetPack 6.2 + TRT 10.3 lock); C7 Fact #94 (D-C7-1 closure = real UAV nadir flight footage as calibration corpus distribution; specific fixture pin delegated to Test Spec)

**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)

**Confidence**: ✅ High for Polygraphy + trtexec API capability verification (L1 canonical NVIDIA docs); ✅ High for the orchestration pattern (canonical NVIDIA-blessed workflow per Source #117 README); ⚠️ Medium for the specific build-duration-on-Jetson-Orin-Nano-Super claim (extrapolated from C7 Fact #94 reference of "30-300 sec per model" + Source #105 constraints — exact build-duration depends on model complexity + INT8 calibration scope; needs Plan-phase Jetson MVE confirmation per D-C1-2)

**Sub-Question Binding**:
- SQ3+SQ4 → C10 row in `../06_component_fit_matrix/C10_preflight_provisioning.md` (this fact populates the D-C7-7 confirmation candidate row)
- D-C7-7 cross-coupling: closes the C7 ↔ C10 cross-component gate inherited from C7 Fact #94 (`Plan-phase architect + C10 owner` joint ownership)
- D-C7-1 closure (real UAV nadir flight footage corpus): C10 owns the calibration-corpus assembly at pre-flight; specific fixture-file pin remains delegated to Test Spec per the 2026-05-08 C9 / SQ7 restructure
- AC-NEW-1 (cold-start TTFF <30 s p95): pre-flight engine build is amortized across all takeoffs that use the same artifacts; takeoff-load via `IRuntime.deserializeCudaEngine` is ~100-500 ms per engine × 3-5 engines = ~0.3-2.5 s — well within 30 s budget
- AC-NEW-3 (FDR): per-engine manifest entry recorded as FDR field
- AC-NEW-7 (cache-poisoning safety): same content-hash + atomic-write pattern as Fact #100 protects the engine cache file against partial-write corruption

**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:

| Project Restriction / AC | Verdict | Evidence |
|---|---|---|
| **R-NEW-2 no cloud at flight** | ✅ PASS | All Polygraphy/trtexec invocations are local CLI subprocess calls; engine build runs entirely on the deployed Jetson. |
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ✅ PASS | Polygraphy is pure Python (works on ARM64 + Python 3.10); trtexec is bundled with TensorRT 10.3 in JetPack 6.2 (installed by default at `/usr/src/tensorrt/bin/trtexec`); both interoperate with the JetPack-bundled TensorRT 10.3 per Source #117 + Source #119. |
| **AC-1.x position accuracy** | N/A | Engine build is upstream of accuracy; this fact concerns the engine provenance layer. |
| **AC-3.x resilience** | N/A | Engine cache is a takeoff-load artifact; runtime resilience is C5/C8 responsibility. |
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | N/A | Engine build is pre-flight + takeoff-load, NOT runtime per-frame. Per-engine inference latency is governed by C7 Fact #94 / Fact #95 / Fact #96. |
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | Per Source #105 + D-C7-8: Polygraphy/trtexec engine build with `--workspace=1000` (1 GB cap) holds peak build-time memory at ~3-5 GB out of 8 GB shared (build-time peak; runtime is much lower per C7 Fact #94 ~50-150 MB shared library + ~50-300 MB per engine). Pre-flight build is performed when no other workloads are active, so the 5 GB peak is acceptable. |
| **AC-4.5 look-back refinement** | N/A | Engine build pipeline is forward-only. |
| **AC-8.3 10 GB persistent tile cache budget** | ✅ PASS | Engine `.engine` files at 10-200 MB each per C7 Fact #94 × 3-5 engines = ~30 MB-1 GB on disk (separate from the 10 GB tile cache; lives at `/var/lib/onboard/cache/trt/` or equivalent). Calibration cache files at 1-10 MB each are negligible. |
| **AC-NEW-1 cold-start TTFF (<30 s p95)** | ✅ PASS | Takeoff-load via `IRuntime.deserializeCudaEngine` is ~100-500 ms per engine × 3-5 engines = ~0.3-2.5 s; combined with FAISS load <5 s (Fact #100) and content-hash gates, the total is ~5-10 s, well within the 30 s budget. **Build is pre-flight, NOT during cold-start** — engines are pre-built during pre-flight provisioning and persisted across reboots. |
| **AC-NEW-3 (FDR)** | ✅ PASS | Per-engine manifest entry (model, precision_mode, calib_corpus_sha256, build_iso8601, build_duration_sec, content_hash, sm_version, jp_version, trt_version) is recordable as an FDR field per AC-NEW-3 forensic trail requirement. |
| **AC-NEW-4 covariance honesty** | N/A | Engine build pipeline is upstream of the C5 estimator. |
| **AC-NEW-7 cache-poisoning safety budget** | ✅ PASS at the engine-cache layer | Same content-hash + atomic-write pattern as Fact #100 (project-side wrapper around Polygraphy/trtexec output); engine-cache poisoning is detected at takeoff load via SHA-256 verification; manifest-hash binding guarantees that a calibration-corpus swap or ONNX-model swap forces a clean rebuild with new content hash. The reference-Jetson-prebuilt-engine fallback path uses a versioned `.engine` artifact that is signed/checksummed at the HQ source-of-truth (the project's release pipeline owns this signing). |
| **AC-NEW-8 blackout failsafe** | ✅ PASS | Engine cache is loaded at takeoff; if a content-hash mismatch is detected, takeoff is refused (same posture as Fact #100). C5 demotion to `dead_reckoned` is the runtime failsafe path, not the pre-flight one. |

**Strengths** (positive structural advantages):
1. **Polygraphy is the canonical NVIDIA-blessed orchestration tool** for TensorRT engine builds with INT8 calibration cache reuse — first-party support, multi-snippet docs coverage, production-mature; eliminates the need to write the calibrator + data-loader + builder-config glue code from scratch.
2. **Calibration cache reuse across rebuilds** — first build per-model takes ~30-300 sec including INT8 calibration (per C7 Fact #94 reference); subsequent rebuilds skip the calibration step (per Source #117 explicit "calibration will be skipped" semantics) — typically <30 sec even for the most complex matchers. Critical for fast iteration during the operator's pre-flight workflow.
3. **CLI interoperability between Polygraphy and trtexec** — the calibration cache file format is identical between the two tools per Source #119; the project can use Polygraphy for the canonical INT8-calibration-bearing build and trtexec for emergency/ad-hoc rebuilds without re-shipping calibration data.
4. **Mixed-precision flag matrix matches D-C7-2 / D-C7-6 cross-component policy** — `--int8 --fp16` is the canonical Polygraphy/trtexec invocation for the project's per-family mixed precision per Source #117 + Source #119.
5. **`--load-tactics` / `--save-tactics` for reference-Jetson-prebuilt-engine workflow** — Polygraphy supports replaying tactic-search results across multiple builds (per Source #118); the project can ship the tactic replay file alongside the prebuilt engine for fast on-Jetson rebuild without re-running tactic profiling.
6. **Direct `IBuilderConfig` Python API as escape hatch** — for unusual models requiring custom calibration profiles (e.g., LightGlue with dynamic-shape inputs per D-C3-2 + D-C3-3) the project can drop down to the direct TensorRT Python API per Source #121 without abandoning the orchestration framework.
7. **Pre-flight build amortized across all takeoffs** — engine cache is persistent; build runs only when calibration corpus or ONNX model changes (manifest-hash-driven); typical operator workflow is: build once at HQ ship → operator pulls fresh tile cache → operator triggers pre-flight (FAISS rebuild + maybe TRT rebuild if calibration-corpus refreshed) → takeoff.
8. **License clean throughout** — Polygraphy (Apache-2.0); TensorRT (Apache-2.0 in TensorRT 10.x per C7 Fact #94); python-atomicwrites (MIT); no GPL contagion path on this orchestration layer.

**Negative-but-mitigable structural findings**:

9. **First-build INT8 calibration takes 30-300 sec per model on Jetson** — large matcher models (e.g., LightGlue at K=1024 keypoints) can hit the upper end of this range. **Mitigation**: calibration cache reuse — once the cache is built, subsequent rebuilds are <30 sec; first build at HQ + ship cache to operator workstation pre-deployment.
10. **Engine cache is hardware-specific (SM 87)** per C7 Fact #94 + Source #105 — can't ship engines across Jetson hardware variants. **Mitigation**: D-C7-7 = (c) primary-build-on-target with reference-Jetson-prebuilt-engine fallback ONLY for SM 87 / JetPack 6.2 / TRT 10.3 combinations; the project's deployed fleet is uniform per restrictions.md (Jetson Orin Nano Super pinned).
11. **Polygraphy CLI requires `pip install polygraphy` separately if not bundled with TensorRT distribution** — minimal Jetson installs may need `pip install nvidia-pyindex && pip install polygraphy`. **Mitigation**: include in the project's pre-flight Docker image / OS image bake; verify at C10 setup.
12. **`trtexec --int8` without `--calib` falls back to random-data calibration** with documented ~5-15% INT8 accuracy collapse per Source #119. **Mitigation**: project-side wrapper around `trtexec` invocation enforces `--calib=<existing_cache>` non-empty as a precondition; reject the build otherwise with clear error message.
13. **Build-time peak memory ~3-5 GB out of 8 GB shared** per Source #105 constraint #4 + D-C7-8 — not safe to run pre-flight build concurrently with other heavy workloads (e.g., camera pipeline, FAISS build). **Mitigation**: pre-flight orchestration is sequential — build TRT engines one at a time, then FAISS index, then verification; takes ~5-15 min total at first-build (with calibration); ~1-3 min for subsequent rebuilds (cache-reused).
14. **Calibration-corpus shipping mechanism** — per D-C7-1 closure the corpus is real UAV nadir flight footage at ~1 km AGL; this corpus is several GB of tensor data. **Mitigation**: ship calibration corpus + calibration cache together as a versioned artifact bundle; ship cache only (not raw corpus) to operators when the cache is sufficient (i.e., fixture-pin from Test Spec is stable and operators don't need to recalibrate).
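The mitigation in finding 12 (refuse any `trtexec --int8` build that lacks a usable calibration cache) could look roughly like this. It is a sketch under stated assumptions: the wrapper name `trtexec_argv` and its parameters are hypothetical, only flags already quoted in this fact are emitted, and the command is assembled rather than run.

```python
import os


def trtexec_argv(onnx_path, engine_path, precision_mode, calib_cache=None,
                 shapes="input:1x3x224x224", workspace_mib=1000):
    """Build a trtexec argv, rejecting INT8 builds without a calibration cache."""
    argv = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}", "--fp16"]
    if precision_mode == "int8+fp16":
        # Precondition: the cache must exist and be non-empty, otherwise
        # the build would silently calibrate on random data.
        if (not calib_cache
                or not os.path.isfile(calib_cache)
                or os.path.getsize(calib_cache) == 0):
            raise ValueError("INT8 build refused: missing or empty calibration cache")
        argv += ["--int8", f"--calib={calib_cache}"]
    elif precision_mode != "fp16":
        raise ValueError(f"unsupported precision_mode: {precision_mode!r}")
    argv += [f"--shapes={shapes}", f"--workspace={workspace_mib}"]
    return argv
```

Raising before the subprocess is ever started keeps the failure visible in the pre-flight log instead of surfacing later as degraded INT8 accuracy.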

**Caveats / open Plan-phase decisions raised** (D-C10-N gates):
- **D-C10-5 NEW (CROSS-COMPONENT with C7)** — TensorRT engine-build orchestration tool choice (Polygraphy CLI primary [recommended] / `trtexec` CLI primary / direct `IBuilderConfig` Python API primary / hybrid: Polygraphy for INT8-calibrating builds + `trtexec` for cache-reuse rebuilds + direct API for unusual models): trade-off between orchestration sophistication vs install footprint vs flexibility. **Recommendation**: D-C10-5 = (d) hybrid — Polygraphy for INT8-calibrating builds (canonical NVIDIA tool, multi-snippet docs, supports custom data loaders); `trtexec` for cache-reuse fast rebuilds (single binary, no Python imports, faster invocation); direct `IBuilderConfig` Python API as escape hatch for unusual models (e.g., LightGlue dynamic shapes per D-C3-2 + D-C3-3).
- **D-C10-6 NEW (CROSS-COMPONENT with D-C7-1)** — TensorRT calibration-cache reuse strategy (always reuse if cache file exists [most-aggressive] / rebuild on calib-corpus SHA-256 change [recommended] / rebuild every pre-flight [most-conservative]): trade-off between rebuild cost vs calibration-data freshness vs operator-workflow simplicity. **Recommendation**: D-C10-6 = (b) rebuild on calib-corpus SHA-256 change — manifest-hash-driven rebuild trigger from Fact #100 pattern naturally extends to TRT engine cache; idempotent + minimum-rebuild + operator-manual override flag `--force-trt-rebuild` available.
- **D-C10-7 NEW** — TensorRT engine on-disk filename schema (`<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` [recommended] / hash-only filename / opaque content-addressable storage with separate manifest mapping): trade-off between operator-debuggability vs filesystem-simplicity vs versioning-rigor. **Recommendation**: D-C10-7 = (a) `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` self-describing filename + manifest.json side-cache; runtime can reject a cached engine that doesn't match the deployed Jetson's SM/JP/TRT combination with a clear error message at takeoff load.
- **D-C10-8 NEW** — TensorRT prebuilt-fallback engine generation venue (reference Jetson at HQ [recommended] / CI pipeline with Jetson-class runner / deployed Jetson copy-to-HQ-archive after first successful local build): trade-off between reproducibility vs CI cost vs reduced pre-flight risk. **Recommendation**: D-C10-8 = (a) reference Jetson at HQ + (c) deployed-Jetson-copy-to-archive on first successful local build for opportunistic redundancy; both venues use the same Polygraphy/trtexec pipeline so artifacts are interchangeable; HQ-built engines serve as authoritative fallbacks signed by the project's release pipeline.
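The self-describing filename schema recommended in D-C10-7 lends itself to a small parse-and-reject check at takeoff load. The sketch below assumes the version tags are digit-only (as in `sm87_jp62_trt103`) and that the precision tag is a lowercase alphanumeric token such as `fp16`; those formats, and the names `EngineTag`, `parse_engine_filename`, and `check_engine_matches_platform`, are illustrative assumptions rather than project definitions.

```python
import re
from typing import NamedTuple


class EngineTag(NamedTuple):
    model: str
    sm: str
    jp: str
    trt: str
    precision: str


# <model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine
_ENGINE_RE = re.compile(
    r"^(?P<model>.+)_sm(?P<sm>\d+)_jp(?P<jp>\d+)_trt(?P<trt>\d+)"
    r"_(?P<precision>[a-z0-9]+)\.engine$"
)


def engine_filename(tag):
    """Render the self-describing filename for an engine tag."""
    return f"{tag.model}_sm{tag.sm}_jp{tag.jp}_trt{tag.trt}_{tag.precision}.engine"


def parse_engine_filename(name):
    """Split a filename into its tag fields, or raise if it is off-schema."""
    match = _ENGINE_RE.match(name)
    if match is None:
        raise ValueError(f"engine filename does not follow the schema: {name!r}")
    return EngineTag(**match.groupdict())


def check_engine_matches_platform(name, sm, jp, trt):
    """Reject an engine whose SM/JetPack/TensorRT tags differ from the platform."""
    tag = parse_engine_filename(name)
    mismatches = [
        f"{field}: engine={got} platform={want}"
        for field, got, want in (("sm", tag.sm, sm), ("jp", tag.jp, jp), ("trt", tag.trt, trt))
        if got != want
    ]
    if mismatches:
        raise RuntimeError("engine rejected: " + "; ".join(mismatches))
    return tag
```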

---

## C10 — Working conclusions and decisions (compounded from Fact #100 + Fact #101 closures)

**Selected primary**:
- **D-C6-3 confirmation**: descriptor-cache rebuild trigger pipeline orchestrated via direct `faiss.write_index` / `faiss.read_index` Python API + `python-atomicwrites` (or hand-rolled atomic-write) + content-hash verification gate at takeoff + manifest-hash-driven rebuild trigger + optional `IO_FLAG_MMAP_IFC` mmap load path with `madvise(MADV_WILLNEED)` pre-fault. **Closes the C6 ↔ C10 cross-component gate.**
- **D-C7-7 confirmation**: TensorRT engine-build pipeline orchestrated via the **hybrid** tool matrix per D-C10-5 = (d): Polygraphy CLI for INT8-calibrating builds (primary) + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API for unusual models (LightGlue dynamic shapes). Reference-Jetson-prebuilt-engine fallback per D-C10-8 = (a)+(c). Calibration corpus per D-C7-1 closure (real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles; specific fixture-file pin delegated to Test Spec). **Closes the C7 ↔ C10 cross-component gate.**

**Decisions raised (D-C10-N gates)** — see [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md):
- **D-C10-1** (Fact #100) — descriptor-cache rebuild trigger choice: manifest-hash-driven / always-rebuild / operator-manual — RECOMMENDED manifest-hash-driven + `--force-rebuild` override
- **D-C10-2** (Fact #100) — descriptor-cache atomic-write strategy: hand-rolled / `python-atomicwrites` / no-atomic — RECOMMENDED `python-atomicwrites` (fallback hand-rolled if dependency-policy gate prefers in-tree)
- **D-C10-3** (Fact #100, CROSS-COMPONENT with AC-NEW-7) — content-hash verification gate at takeoff load: reject + STATUSTEXT + refuse takeoff / warn + load anyway / no — RECOMMENDED reject + STATUSTEXT + refuse takeoff
- **D-C10-4** (Fact #100) — descriptor-cache load path: full-`read_index` / mmap via `IO_FLAG_MMAP_IFC` / both via env flag — RECOMMENDED mmap with `madvise(MADV_WILLNEED)` pre-fault (or both for Plan-phase Jetson MVE)
- **D-C10-5** (Fact #101, CROSS-COMPONENT with C7) — TensorRT engine-build orchestration tool choice: Polygraphy primary / trtexec primary / direct API primary / hybrid — RECOMMENDED hybrid (Polygraphy + trtexec + direct API by use case)
- **D-C10-6** (Fact #101, CROSS-COMPONENT with D-C7-1) — TensorRT calibration-cache reuse strategy: always-reuse / rebuild-on-calib-corpus-SHA-256-change / rebuild-every-pre-flight — RECOMMENDED rebuild-on-calib-corpus-SHA-256-change + `--force-trt-rebuild` override
- **D-C10-7** (Fact #101) — TensorRT engine on-disk filename schema: self-describing `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` / hash-only / content-addressable + manifest — RECOMMENDED self-describing filename + manifest.json side-cache
- **D-C10-8** (Fact #101) — TensorRT prebuilt-fallback engine generation venue: reference Jetson at HQ / CI pipeline with Jetson-class runner / deployed-Jetson-copy-to-HQ-archive on first successful local build — RECOMMENDED reference Jetson at HQ + deployed-Jetson-copy-to-archive (opportunistic redundancy)

C10 batch 1 closed at 2/N on 2026-05-08 (cross-coupling minimal scope per `c10_scope=C` user choice). Operator CLI/desktop tooling, sector classification heuristics, freshness pipeline workflow remain **deferred to Plan-phase as `operator tooling design` out-of-research-scope**. **No further C10 batches required at the research layer** — D-C6-3 and D-C7-7 are now closed; remaining C10 questions are operational/UX, not architectural.

---
@@ -0,0 +1,396 @@
# Fact Cards — C1: Visual / Visual-Inertial Odometry
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `../01_source_registry/C1_vio.md` (see `../01_source_registry/00_summary.md` for index). Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential). Bound to sub-questions in `../00_question_decomposition.md`.
>
> Index: [`../00_summary.md`](../00_summary.md). Sibling categories: SQ6 ([FC external positioning](SQ6_fc_external_positioning.md)), SQ1 ([existing systems](SQ1_existing_systems.md)), SQ2 ([canonical pipeline](SQ2_canonical_pipeline.md)), C2 ([VPR](C2_vpr.md)), C3 ([matchers](C3_matchers.md)).

**Facts in this file**: VIO candidate enumeration (VINS-Mono, VINS-Fusion, OpenVINS, OKVIS2, Kimera-VIO, DROID-SLAM, DPVO, KLT+RANSAC baseline) + Plan-phase decisions D-C1-1, D-C1-2 + C1 working conclusions.

---

## SQ3+SQ4 / C1 — Visual / Visual-Inertial Odometry candidate enumeration
> **Project's pinned mode for every C1 candidate (binding)**: monocular ADTi 20MP nav camera @ 3 fps + IMU from FC over MAVLink @ ≥100 Hz, on Jetson Orin Nano Super (JetPack/CUDA/TensorRT, 8 GB shared LPDDR5, 25 W TDP), producing relative 6-DoF metric pose between consecutive frames + per-axis covariance, with attitude (yaw + pitch) hard-contract σ ≤ 5° at 1 σ (Fact #24), output cadence ≥3 Hz, no in-flight network, license compatible with onboard-binary distribution to a dual-use customer.
>
> Per the engine's "Per-Mode API Capability Verification" rule, any candidate marked `Selected` requires a `context7` lookup (mode enum + project's exact mode runnable example + disqualifier probe) AND a per-numbered-Restriction × per-numbered-AC sub-matrix. **This session covers candidate enumeration + preliminary applicability assessment only**; `context7` verification and the structured sub-matrix are deferred to the next session per the autodev context budget heuristic.
### Fact #28 — VINS-Mono is a canonical monocular-only sliding-window VIO with a working Jetson-Nano deployment record but no GitHub release and ~24-month-old master branch
- **Statement**: VINS-Mono is the canonical mono+IMU sliding-window VIO from HKUST-Aerial-Robotics (Qin, Li, Shen — IEEE T-RO 2018). Features: efficient IMU pre-integration, automatic initialization, online camera-IMU spatial + temporal calibration, failure detection + recovery, DBoW2 loop detection, global pose-graph optimization. Output: metric-scale 6-DoF pose at IMU rate. **Repository state**: master-branch only (no tagged releases), 5,829 stars; last meaningful master-branch commit 2024-02-25 with a 2024-05-23 simulation-data commit. **Jetson record**: a 2021 IEICE paper (zinuok / KAIST) demonstrated VINS-Mono real-time on the original Jetson Nano (much weaker than Orin Nano Super) for MAV state estimation; a 2024 arXiv paper (2406.13345) showed an enhanced VINS-Mono variant achieving 50 FPS on a Raspberry Pi CM4 with on-sensor accelerated optical flow. **License**: GPL-3.0 (copyleft viral) — distribution of the onboard binary requires source disclosure for the entire linked binary and triggers GPL-3 anti-tivoization clauses for embedded firmware.
- **Source**: Source #43 (canonical), Source #46 (KAIST Jetson benchmark), Source #43-linked LICENCE for license confirmation
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer
- **Confidence**: ✅ for algorithm class, mode support, and Jetson Nano feasibility; ⚠️ for Jetson Orin Nano Super specific latency (no direct measurement, but Orin Nano Super >> Jetson Nano, so feasibility is virtually certain); ⚠️ for the maintenance-status risk implied by the ~26-month-old master branch (last meaningful commit 2024-02-25, audited 2026-05-07).
- **Related Dimension**: SQ3+SQ4 / C1 Established-production candidate
- **Fit Impact**: **carry as lead candidate, conditional on user license decision.** Algorithmic fit is excellent (canonical mono+IMU VIO with metric scale and covariance); maintenance status is borderline; **GPL-3.0 license is a project-level decision required from the user** before this candidate can be marked Selected — see "C1 Open Decisions" section below.
### Fact #29 — VINS-Fusion is a multi-sensor superset of VINS-Mono but its monocular+IMU mode failed to run on Jetson TX2 in a 2021 KAIST benchmark; Orin Nano Super feasibility unverified
- **Statement**: VINS-Fusion (Qin, Cao, Pan, Shen — extension of VINS-Mono) supports four documented sensor configurations: stereo+IMU, mono+IMU, stereo only, +GPS-fusion (toy example). KITTI Odometry top-ranked open-source stereo algorithm as of January 2019. **Repository state**: 4,476 stars; last update 2024-05-23; same master-branch-only convention. **Jetson record**: KAIST 2021 benchmark (Source #46) — on Jetson TX2, both **VINS-Fusion (CPU) and VINS-Fusion-imu fail to run** due to insufficient memory and CPU; VINS-Fusion-gpu (GPU-accelerated front-end) runs on TX2. Orin Nano Super has the same 8 GB shared-memory capacity as TX2 (LPDDR5 vs TX2's LPDDR4, so higher bandwidth rather than more memory) and a stronger CPU/GPU, but the project's onboard stack is *co-resident* with C2 VPR + C3 matcher + C5 estimator + C6 cache → memory-pressure on the VINS-Fusion-imu path is plausible. **License**: GPL-3.0, same dual-use distribution constraint as VINS-Mono.
- **Source**: Source #44 (canonical), Source #46 (KAIST Jetson benchmark)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer
- **Confidence**: ✅ for the multi-sensor mode support and KITTI ranking; ✅ for the 2021 TX2 failure-to-run finding; ⚠️ for Orin Nano Super viability (between TX2 and Xavier NX in CPU/memory; not yet measured).
- **Related Dimension**: SQ3+SQ4 / C1 Open-source candidate
- **Fit Impact**: **carry as alternate candidate, with mandatory Jetson Orin Nano Super MVE before promotion.** VINS-Mono's narrower scope (mono+IMU only, no stereo overhead) makes VINS-Mono the preferred lead within the HKUST-Aerial-Robotics family; VINS-Fusion's multi-sensor coverage is a distractor for our pinned mode. **GPL-3.0 license decision is the same as VINS-Mono** — see "C1 Open Decisions".
### Fact #30 — OpenVINS is the most actively maintained MSCKF-class VIO and runs on Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble with documented build adjustments; the ~270 ms latency reported on Xavier NX must be re-measured in an Orin-Nano-Super MVE
- **Statement**: OpenVINS (rpng, U. Delaware — Geneva, Eckenhoff, Lee, Yang, Huang — ICRA 2020) is a modular MSCKF (Multi-State Constraint Kalman Filter) implementation that fuses IMU state with sparse visual feature tracks via the Mourikis-Roumeliotis 2007 sliding-window MSCKF. **Mode support**: monocular, stereo, multi-camera (1–N) + IMU; mono+IMU is a documented first-class configuration. Supports SLAM features (in-state landmarks) plus pure MSCKF features. **Jetson Orin Nano evidence**: rpng/open_vins issue #421 (Genozen, Feb 2024, closed) confirms OpenVINS ROS 2 builds on Jetson Orin Nano Dev Kit + JetPack 6 + Ubuntu 22.04 + ROS 2 Humble after one build patch (`#include <opencv2/aruco.hpp>` with newer OpenCV); fdcl-gwu/openvins_jetson_realsense (Nov 2025) provides a complete setup guide for Jetson Orin Nano + Intel RealSense + librealsense compiled-from-source + `--parallel-workers 1` build to avoid memory issues. **Latency record**: rpng/open_vins issue #164 — ~270 ms latency on Jetson Xavier NX (4 cores, 40% CPU utilisation). Recommended optimisations: subscriber queue size 1, Release builds with ARM-specific optimization flags (e.g., `armv8.2-a`), reduced camera resolution, prefer `odometry` topic over `pose_imu`. **License**: GPL-3.0, same dual-use distribution constraint as VINS-Mono / VINS-Fusion. Stars 2,828; 30 contributors; 12 releases; latest tag v2.7 (June 2023) but master branch active through 2024–2025 issue threads.
- **Source**: Source #45 (canonical + LICENSE + docs.openvins.com), Source #46 (KAIST Jetson benchmark for class-level CPU/memory profile), agent-tools record `29ebf728...txt` (Jetson Orin Nano build evidence)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer
- **Confidence**: ✅ for mode support, MSCKF formulation, and Jetson Orin Nano build feasibility; ⚠️ for steady-state latency on Orin Nano Super under our 5472×3648 nav frames: the KAIST benchmark used 640×480, so the ~65× pixel count (roughly 8× per axis) is a yellow flag.
- **Related Dimension**: SQ3+SQ4 / C1 Established-production candidate
- **Fit Impact**: **carry as lead candidate, conditional on user license decision.** OpenVINS has the most documented Jetson-Orin-Nano build path of the three GPL-3.0 candidates; MSCKF formulation is more memory-efficient than VINS-Mono's full sliding-window optimisation, which is a meaningful advantage under co-resident-process memory pressure. **GPL-3.0 license decision is the same as VINS-Mono / VINS-Fusion**.
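
The resolution gap between the 640×480 benchmark frames and the project's 5472×3648 nav frames can be quantified directly. The sketch below is only an arithmetic check; the helper names are illustrative, and the per-axis factor assumes a uniform downscale, which is not a project decision.

```python
# Resolutions are taken from the fact card; helper names are illustrative only.
NAV_FRAME = (5472, 3648)       # project nav camera, pixels (width, height)
BENCHMARK_FRAME = (640, 480)   # resolution cited for the benchmark runs


def pixel_count(size):
    """Total pixels for a (width, height) pair."""
    width, height = size
    return width * height


def pixel_ratio(full, reference):
    """How many times more pixels `full` has than `reference`."""
    return pixel_count(full) / pixel_count(reference)


def per_axis_factor(full, reference):
    """Uniform per-axis scale factor equivalent to the pixel ratio."""
    return pixel_ratio(full, reference) ** 0.5


if __name__ == "__main__":
    ratio = pixel_ratio(NAV_FRAME, BENCHMARK_FRAME)
    factor = per_axis_factor(NAV_FRAME, BENCHMARK_FRAME)
    print(f"pixel ratio: {ratio:.1f}x, per-axis factor: {factor:.2f}x")
```

The ratio comes out at about 65×, or roughly 8× per axis, which is why "reduced camera resolution" in the recommended optimisations is likely to matter for the Orin-Nano-Super MVE.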
### Fact #31 — OKVIS2 is the most actively maintained VI-SLAM in the BSD-permissive license bucket; OKVIS2-X (T-RO 2025) extends it with optional GNSS fusion that is architecturally aligned with the project's spoof-promotion path
- **Statement**: OKVIS2 (Leutenegger — arXiv 2022, ETH/Imperial/TUM Smart Robotics Lab) is a factor-graph VI-SLAM with bounded-size optimization. Algorithmic novelty: pose-graph edges from marginalised observations are "seamlessly turned back into observations" upon loop closure, reviving old landmarks and reprojection errors. Includes lightweight CNN segmentation for dynamic-region removal. **Mode support**: monocular and multi-camera + IMU; mono+IMU is a documented first-class configuration. **Successor OKVIS2-X (Boche, Jung, Laina, Leutenegger — IEEE T-RO 2025 vol 41 pp 6064–6083, DOI 10.1109/TRO.2025.3619051; arXiv 2510.04612, Oct 2025)** generalises the core to fuse multi-camera + IMU + optional GNSS receiver + LiDAR or depth. The OKVIS2-X GNSS-fusion mode (lineage: Visual-Inertial SLAM with Tightly-Coupled Dropout-Tolerant GPS Fusion, IROS 2022) directly mirrors the project's "VIO that may opportunistically fuse a non-spoofed GPS update when promotion completes" pattern (AC-NEW-2). **Repository state**: ethz-mrl/OKVIS2-X created 2025-09-23, last push 2026-03-17, 295 stars, 2 active contributors (bochsim, SebsBarbas). **License**: 3-clause BSD on the LICENSE file (GitHub UI shows "Other (NOASSERTION)" but the file is canonical 3-clause BSD per ASL-ETH Zurich convention) — permissive, no dual-use distribution friction.
- **Source**: Source #47 (OKVIS2 canonical), Source #48 (OKVIS2-X T-RO 2025)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 / C5 implementer
- **Confidence**: ✅ for algorithm, mode support, license, T-RO 2025 publication, repository activity; ⚠️ for Jetson Orin Nano runtime — no direct Jetson Orin Nano benchmark located; OKVIS2's factor-graph backend is plausibly heavier than OpenVINS' MSCKF on memory but lighter than Kimera (Kimera also produces a 3D mesh + semantic mesher, OKVIS2 does not).
- **Related Dimension**: SQ3+SQ4 / C1 Open-source-permissive lead candidate; potential C1+C5+C8 unified factor-graph design
- **Fit Impact**: **strong lead candidate by license + maintenance + GNSS-fusion alignment.** If license permissiveness is a priority, OKVIS2 + OKVIS2-X is the natural choice. The OKVIS2-X factor-graph also opens a design path where C5 (state estimator) collapses INTO C1 (the same factor graph absorbs sat-anchor measurements as constraints) — would simplify the pipeline at the cost of departing from the C1/C5 split, which is a Step-7.5 / `solution_draft01` design decision, not a SQ3+SQ4 question. **Pending Jetson Orin Nano Super MVE.**
### Fact #32 — Kimera-VIO is BSD-permissive but resource-heavy; KAIST benchmark found Kimera had the highest memory usage among VIOs tested and failed Xavier-NX-class memory under multi-process load
- **Statement**: Kimera-VIO (MIT-SPARK — Rosinol, Abate, Chang, Carlone — ICRA 2020) is a VI-SLAM pipeline with frontend + backend (factor-graph optimization with iSAM2 in GTSAM) + 3D mesher + pose-graph optimizer. Mode support: stereo+IMU primary, mono+IMU optional but documented. **License**: BSD 2-Clause "Simplified" (LICENSE.BSD on the repo) — permissive. **Maintenance**: active issue/PR threads through Dec 2024 / Feb 2025 covering ROS 2 integration, mono-inertial discussion, dependency management. **Resource profile** (Source #46 KAIST 2021 benchmark): Kimera had the highest memory usage among the 9 algorithms tested (numerous computations per keyframe); Kimera failed to fit on Xavier NX-class memory under sustained multi-process load. The 3D mesh + semantic-label outputs are unused by the project's narrow C1 mandate (relative 6-DoF + covariance only) — Kimera's overhead is unjustified vs OKVIS2 / OpenVINS for our use case.
- **Source**: Source #49 (Kimera canonical + LICENSE.BSD), Source #46 (KAIST Jetson benchmark)
- **Phase**: Phase 2
- **Target Audience**: System architects (build-vs-buy, mesh-feature decision)
- **Confidence**: ✅ for algorithm, license, maintenance status; ✅ for the Source #46 finding (KAIST 2021); ⚠️ for whether Orin Nano Super's faster memory + Ampere GPU lifts Kimera into feasibility: the Source #46 failure was on Xavier NX 8 GB shared, the same memory budget as Orin Nano Super, so only the higher per-core throughput and memory bandwidth differ.
- **Related Dimension**: SQ3+SQ4 / C1 Open-source-permissive secondary candidate
- **Fit Impact**: **carry as fallback only, not lead.** Kimera's permissive license is attractive but its resource overhead (especially the unused 3D mesh + semantic mesher) is a poor fit under co-resident process pressure. Use as a conservative secondary fallback if OKVIS2 unexpectedly fails Jetson MVE. **Status**: not lead.
### Fact #33 — DROID-SLAM is disqualified by AC-4.2: ≥11 GB GPU VRAM inference budget exceeds the project's 8 GB shared LPDDR5; further, DROID-SLAM is monocular VO/SLAM without IMU fusion and would require an external metric-scale wrapper
- **Statement**: DROID-SLAM (princeton-vl, Teed & Deng — NeurIPS 2021; arXiv 2108.10869) requires ≥11 GB GPU memory to run inference per the official README; training requires ≥24 GB on 4× RTX 3090. Issue #121 confirms that even with 128 GB system RAM and 16 GB VRAM (RTX 4080), users hit very large RAM consumption quickly. Algorithmically, DROID-SLAM is **monocular VO/SLAM** with recurrent dense bundle adjustment over a complete history of camera poses — no native IMU fusion; output pose is in arbitrary scale (no metric scale recovery without external alignment). DPV-SLAM (ECCV 2024, princeton-vl) is the lighter successor at ~4–5 GB GPU memory; DPVO (NeurIPS 2023, princeton-vl) is even lighter at ~3 GB, but neither natively integrates IMU.
- **Source**: Source #50 (DROID-SLAM canonical), Source #51 (DPVO / DPV-SLAM successor), Source #52 (DPVO-QAT++ memory measurement)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer
- **Confidence**: ✅
- **Related Dimension**: SQ3+SQ4 / C1 disqualified candidate
- **Fit Impact**: **DISQUALIFIED outright.** AC-4.2 sets the 8 GB shared CPU+GPU memory budget; DROID-SLAM's ≥11 GB GPU-only requirement violates it before adding co-resident C2/C3/C5/C6 processes. Cite as "what the project cannot afford" in `solution_draft01` to pre-empt obvious questions.
### Fact #34 — DPVO is monocular VO only (no IMU fusion); it can fit a Jetson-suitable memory footprint with QAT but cannot satisfy the C1 VIO mandate alone — would need an external IMU + metric-scale wrapper
- **Statement**: DPVO (Teed, Lipson, Deng — NeurIPS 2023; ECCV 2024 DPV-SLAM successor) is a deep-learning monocular VO with sparse patch tracking + differentiable bundle adjustment. **Mode**: monocular VO only — no IMU fusion in the published paper or repository; output pose is in arbitrary scale. Memory footprint: DPVO ~3 GB GPU, DPV-SLAM ~4–5 GB GPU on standard hardware; DPVO-QAT++ (arXiv 2511.12653, Cheng Liao, Nov 2025) reduces peak reserved memory to 1.02 GB on RTX 4060 (8 GB) via fused-CUDA INT8 fake-quantization while preserving ATE on TartanAir/EuRoC. **License**: MIT (permissive). Repository: 989 stars; last update 2024-10-12. **Crucial gap**: DPVO does NOT meet the C1 mandate of a "VIO that produces metric-scale 6-DoF + attitude with σ ≤ 5°" — for the project to use DPVO as the *VO half* of C1, an additional IMU+scale-fusion module (loosely-coupled ESKF with VO velocity / displacement priors) must be designed; alternatively, DPVO's pose can feed C5 directly as a relative-displacement constraint, with attitude served separately by FC IMU integration. **Jetson Orin Nano runtime evidence**: indirect — DPVO-QAT++ benchmarks on RTX 4060 desktop, NOT Jetson Orin Nano. The RTX 4060 is an Ada Lovelace desktop GPU, whereas the Orin Nano Super uses a much smaller Ampere-class integrated GPU, so direct extrapolation is not safe; a Jetson MVE is required.
- **Source**: Source #51 (DPVO / DPV-SLAM canonical), Source #52 (DPVO-QAT++ Nov 2025)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 / C5 implementer
- **Confidence**: ✅ for "VO only, no IMU fusion" and the memory footprints; ⚠️ for Jetson Orin Nano direct runtime (no measurement); ⚠️ for the operational complexity of the QAT pipeline (teacher-student distillation training is a significant prerequisite vs the classical VINS-* / OpenVINS / OKVIS2 candidates).
- **Related Dimension**: SQ3+SQ4 / C1 conditional candidate (VO not VIO; needs external IMU wrapper)
- **Fit Impact**: **NOT a drop-in C1 candidate; conditional fit only.** DPVO is **not** a substitute for VINS-Mono / OpenVINS / OKVIS2 — it is a candidate for the *VO half* of a hybrid design where C5 (estimator) absorbs IMU and DPVO provides relative-pose priors. This adds design complexity and is **not preferred** unless one of the established VIO candidates fails Jetson MVE for memory reasons. **Status**: secondary, conditional.
### Fact #35 — Pure VO baseline (KLT optical flow + 5-point essential matrix or homography RANSAC) is the project's mandatory simple-baseline candidate and is the de-facto fallback when learning-based methods fail on Jetson-budget constraints
- **Statement**: The classical pipeline — Shi-Tomasi or FAST corner detection → KLT pyramidal optical flow tracking (`cv::calcOpticalFlowPyrLK`) → 5-point essential matrix (Nister, `cv::findEssentialMat`) or homography RANSAC (`cv::findHomography`) → relative pose with arbitrary scale → metric-scale alignment via IMU integration externally — is the foundational visual-odometry pipeline implemented in OpenCV samples and pedagogical repositories. For the project's nadir-down UAV at 1 km AGL over Ukrainian steppe (predominantly planar terrain, low relief), the **homography path is geometrically appropriate** (a plane induces a homography between two views); for non-planar relief, the **essential-matrix path is appropriate** at a small overhead. License: public domain / OpenCV-Apache-2.0 / MIT (whatever reference implementation is chosen) — permissive. Reference: representative public Monocular-Video-Odometery (MIT, alishobeiri 2018), Monocular-Visual-Odometry (Yacynte) at translation error 0.94% / rotation error 0.015°/m on KITTI dataset.
- **Source**: Source #53 (OpenCV docs + reference implementations)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer + risk reviewer
- **Confidence**: ✅
- **Related Dimension**: SQ3+SQ4 / C1 Simple-baseline candidate (mandatory per Component Option Breadth rule)
- **Fit Impact**: **carry as the project's `Simple baseline / known-runnable / known-failure-mode` C1 fallback.** Not a lead, but mandatory presence. Failure modes: (a) low-texture cropland / snow → KLT track loss; (b) sharp turns → low-overlap homography degeneracy; (c) no native IMU fusion → must wrap with external metric-scale alignment (same wrapper as DPVO). **Status**: simple-baseline reference; cited in `solution_draft01` to anchor the failure analysis.
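
The claim that a plane induces a homography between two views can be checked numerically. The sketch below works in normalized image coordinates (camera intrinsics omitted) and uses the convention `X2 = R X1 + t` with the ground plane `n . X1 = d` expressed in the first camera frame. The yaw angle, translation, and ground point are hypothetical values chosen only for illustration; the 1000 m plane distance mirrors the 1 km AGL figure above.

```python
import math


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_about_optical_axis(angle_rad):
    """Rotation about the camera z axis (a heading change for a nadir camera)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def plane_induced_homography(R, t, n, d):
    """H = R + t n^T / d, valid for points X1 with n . X1 = d and X2 = R X1 + t."""
    return [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]


def to_image(point):
    """Perspective division to normalized image coordinates."""
    return (point[0] / point[2], point[1] / point[2])


# Hypothetical motion: flat ground 1000 m along the optical axis, small yaw and shift.
R = rotation_about_optical_axis(math.radians(5.0))
t = [-20.0, 5.0, 0.0]
n = [0.0, 0.0, 1.0]   # ground-plane normal in the first camera frame
d = 1000.0            # plane distance along n, metres

H = plane_induced_homography(R, t, n, d)

ground_point = [150.0, -80.0, 1000.0]   # lies on the plane (z = d)
view1 = to_image(ground_point)

# Second-view projection computed two ways: full 3D transform vs homography.
moved = [a + b for a, b in zip(mat_vec(R, ground_point), t)]
view2_direct = to_image(moved)
view2_via_h = to_image(mat_vec(H, [view1[0], view1[1], 1.0]))
```

Both routes give the same second-view coordinates for any point on the plane. For points off the plane the homography no longer holds, which is the case where the essential-matrix path applies.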
### Fact #36 — Step-0.5-time-window assessment: VINS-Mono / VINS-Fusion master branches are past the Critical-novelty 18-month boundary; OpenVINS and OKVIS2 are within window; DPVO is borderline; the established baselines (KLT + RANSAC) are exempt
- **Statement**: Per Step 0.5 timeliness assessment in `00_question_decomposition.md`, Critical-novelty topics require sources within 6 months for SOTA claims and 18 months for established libraries' API behaviour. Audit at access time 2026-05-07: VINS-Mono master last meaningful commit 2024-02-25 → ~26 months → **outside the 18-month window**; VINS-Fusion 2024-05-23 → ~24 months → also outside; OpenVINS master active (issue threads through Feb 2025) and v2.7 release June 2023 → ~35 months for the tagged release but master in stable maintenance → within de-facto window for an established library; OKVIS2-X push 2026-03-17 → ~2 months → **fully within window**; DPVO last code update 2024-10-12 → ~19 months → just over, though follow-on work (DPV-SLAM at ECCV 2024, DPVO-QAT++ in Nov 2025) shows the algorithm class is still under active development; KLT / 5-point / RANSAC / homography → established baselines per Step 0.5 → **no time window applies**. **Implication**: VINS-Mono / VINS-Fusion fall into the "older than 18 months but classical authoritative reference" bucket. Step 0.5 allows up to 18 months strictly, but downstream forks (vins-mono-android, embedded variants) and the IEEE T-RO 2018 publication keep the algorithm class in active community use. Recommended treatment: **keep as candidates but require live MVE on Jetson Orin Nano Super before promotion to Selected**, to revalidate against the current OpenCV / Ceres / ROS 2 stack.
- **Source**: Source #43, Source #44, Source #45, Source #47, Source #48, Source #51 (timeliness audit per source)
- **Phase**: Phase 2
- **Target Audience**: Step-7.5 reviewer + System architects
- **Confidence**: ✅
- **Related Dimension**: SQ3+SQ4 / C1 candidate-pool integrity
- **Fit Impact**: **applies a conservative timeliness gate: every C1 candidate from VINS-Mono / VINS-Fusion / DPVO requires an Orin-Nano-Super MVE before being marked Selected**, since their master-branch staleness pushes them out of the Critical-novelty 18-month window. OpenVINS / OKVIS2 / OKVIS2-X / Kimera are within window via active issue threads or recent releases.
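
The timeliness gate is date arithmetic against the audit date, so it can be reproduced with a short script. The sketch below uses only the dates quoted in this fact card; the average-month conversion (365.25 / 12 days) and the function names are assumptions made for illustration, not part of the Step 0.5 definition.

```python
from datetime import date

AUDIT_DATE = date(2026, 5, 7)    # access date stated in the fact card
WINDOW_MONTHS = 18               # established-library window per Step 0.5
DAYS_PER_MONTH = 365.25 / 12     # assumption: average month length

# Last code update per candidate, as quoted in the statement above.
LAST_UPDATE = {
    "VINS-Mono": date(2024, 2, 25),
    "VINS-Fusion": date(2024, 5, 23),
    "DPVO": date(2024, 10, 12),
    "OKVIS2-X": date(2026, 3, 17),
}


def age_in_months(last_update, audit_date=AUDIT_DATE):
    """Elapsed time between the last update and the audit, in average months."""
    return (audit_date - last_update).days / DAYS_PER_MONTH


def timeliness_audit(last_update_by_name, window_months=WINDOW_MONTHS):
    """Map each candidate to (age in months, within-window flag)."""
    report = {}
    for name, last_update in last_update_by_name.items():
        age = age_in_months(last_update)
        report[name] = (round(age, 1), age <= window_months)
    return report


if __name__ == "__main__":
    for name, (age, within) in timeliness_audit(LAST_UPDATE).items():
        status = "within window" if within else "outside window"
        print(f"{name}: {age} months, {status}")
```

Under these assumptions VINS-Mono, VINS-Fusion, and DPVO fall outside the 18-month window and OKVIS2-X falls inside it, matching the gate applied above.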
### C1 Component Applicability Gate — preliminary table (this session; structured Restrictions×AC sub-matrix per candidate is next session's work)
| Candidate | Mode (project) | License | Active maintenance? | Jetson Orin Nano Super runnable? | Native IMU fusion? | Native metric scale? | License blocks dual-use? | Preliminary status |
|---|---|---|---|---|---|---|---|---|
| **VINS-Mono** | mono+IMU | GPL-3.0 (copyleft) | ⚠️ borderline (~26 mo) | ✅ proven on Jetson Nano (2021) → Orin Nano Super virtually certain | ✅ | ✅ | **⚠️ Verify with user** | Lead candidate **conditional on user license decision** + Orin-Nano-Super MVE |
| **VINS-Fusion** | mono+IMU (mode) | GPL-3.0 | ⚠️ borderline (24 mo) | ⚠️ failed on TX2 (KAIST 2021); Orin Nano Super untested | ✅ | ✅ | **⚠️ Verify with user** | Alternate, secondary to VINS-Mono within HKUST family |
| **OpenVINS** | mono+IMU | GPL-3.0 | ✅ active master | ✅ build confirmed on Orin Nano Dev Kit + JetPack 6 (2024 + 2025 community evidence); ~270 ms latency on Xavier NX | ✅ MSCKF | ✅ | **⚠️ Verify with user** | **Lead candidate** **conditional on user license decision** (best Jetson-Orin-Nano evidence + most maintained of the GPL-3 trio) |
| **OKVIS2 / OKVIS2-X** | mono+IMU (+ optional GNSS) | BSD-3 | ✅ very active (2026 pushes) | ⚠️ no direct Jetson Orin Nano measurement; factor-graph backbone plausibly heavier than MSCKF | ✅ | ✅ | ✅ no | **Lead candidate by license + maintenance + spoof-promotion architectural alignment**, pending Jetson MVE |
| **Kimera-VIO** | mono+IMU (optional) | BSD-2 | ✅ active | ⚠️ failed on Xavier NX 8 GB shared under multi-process (KAIST 2021) | ✅ | ✅ | ✅ no | Fallback secondary; resource overhead poor fit for project |
| **DROID-SLAM** | mono VO/SLAM only | (project repo) | reference baseline | ❌ ≥11 GB GPU VRAM > 8 GB AC-4.2 budget | ❌ | ❌ (arbitrary scale) | n/a | **DISQUALIFIED** by AC-4.2 |
| **DPVO / DPV-SLAM** | mono VO only | MIT | ⚠️ borderline (19 mo on code, ECCV 2024 paper) | ⚠️ DPVO-QAT++ (Nov 2025) shows 1.02 GB peak on RTX 4060 desktop; Jetson Orin Nano untested | ❌ (needs external IMU wrapper) | ❌ (needs external scale alignment) | ✅ no | Conditional secondary — VO half of a hybrid C1+C5 design only; not a drop-in VIO replacement |
| **Pure VO baseline (KLT + 5pt RANSAC / homography)** | mono VO only | OpenCV-Apache-2.0 / MIT | ✅ foundational (no time window) | ✅ runs on any Jetson | ❌ (needs external IMU wrapper) | ❌ (needs external scale alignment) | ✅ no | **Mandatory simple-baseline reference** per Component Option Breadth rule |

**Surviving lead candidates (preliminary)**, in priority order based on this session's evidence:

1. **OpenVINS** (GPL-3.0, MSCKF, best Jetson Orin Nano evidence) — pending user license decision + Orin-Nano-Super MVE
2. **OKVIS2 / OKVIS2-X** (BSD-3, factor-graph + GNSS-fusion alignment, most active maintenance) — pending Jetson MVE
3. **VINS-Mono** (GPL-3.0, sliding-window optimization, proven on Jetson Nano) — pending user license decision + Orin-Nano-Super MVE
4. **Pure VO baseline** (mandatory simple-baseline; runtime guaranteed; carries the project as a graceful fallback)

**Disqualified outright**: DROID-SLAM (AC-4.2 memory budget), RTAB-Map and ORB-SLAM3 (already pruned by Fact #16).

**Conditional / not-direct-fit**: DPVO / DPV-SLAM (VO not VIO, needs external IMU wrapper), Kimera-VIO (resource overhead unjustified for narrow C1 mandate).
### C1 Open Decisions (to be resolved before SQ3+SQ4 closure)
**Decision D-C1-1 — GPL-3.0 license posture for the onboard binary** (BLOCKING for the GPL-3.0 trio: VINS-Mono / VINS-Fusion / OpenVINS).
- The three most established VIO candidates (VINS-Mono / VINS-Fusion / OpenVINS) are GPL-3.0 (viral copyleft).
- For dual-use UAV deployment, GPL-3 binary distribution to a customer triggers obligations: source-code disclosure for the entire linked binary, anti-tivoization clauses for embedded firmware updates, viral effect on any proprietary code linked into the same binary.
- BSD/MIT alternatives exist (OKVIS2 BSD-3, Kimera BSD-2, DPVO MIT, pure-VO baseline OpenCV-Apache-2.0), but each comes with secondary trade-offs (Jetson MVE risk, missing IMU fusion, resource overhead).
- Three options for the user:
  - **(a)** Accept GPL-3.0 — distribution model = release source on customer request; or operate the system as a service rather than transferring binaries. Lowest-risk algorithmic path (most-tested candidates).
  - **(b)** Restrict to permissive licenses only (BSD/MIT) — lead candidate becomes OKVIS2; carries Jetson MVE risk.
  - **(c)** Keep both options open through the design phase — make the final license decision after the Jetson Orin Nano MVE results are in.
- **Recommended default**: **(c)** — defer the binary commitment until empirical evidence on Jetson Orin Nano. This is recorded as a flagged decision; SQ3+SQ4 candidate matrix will carry both license families to Step 7.5.

**Decision D-C1-2 — Acceptance of Jetson Orin Nano MVE as a Step-7.5 prerequisite** (procedural).
- Per the Per-Mode API Capability Verification rule, every lead candidate library/SDK requires `context7` (or equivalent docs) lookup + a Minimum Viable Example for the project's pinned mode + per-numbered-Restriction × per-numbered-AC sub-matrix.
- The Component Applicability Gate above is **preliminary** — it documents enumeration evidence but does NOT yet contain `context7` per-mode capability verification or the structured sub-matrix.
- **Next session's mandatory work**: `context7` lookup (3 mandatory queries) for OpenVINS / OKVIS2 / VINS-Mono; per-Restriction × per-AC sub-matrix per candidate; the same for the simple-baseline path; record into `../02_fact_cards/C1_vio.md` per the engine template + `../06_component_fit_matrix/C1_vio.md` per Step 7.5.
### C1 Boundary check: candidate enumeration is saturated for this session
Saturation signals observed: (a) all 7 named candidates from `00_question_decomposition.md` C1 row enumerated with at least one canonical L1 source per candidate; (b) Jetson Orin Nano runtime evidence located for OpenVINS (direct) and VINS-Mono (Jetson Nano + RPi CM4); other candidates carry "MVE required" gates explicitly; (c) license diversity covered (GPL-3.0 trio + BSD-permissive duo + MIT + permissive-baseline); (d) explicit disqualifications recorded with cited evidence (DROID-SLAM, RTAB-Map, ORB-SLAM3). **Open**: per-mode `context7` verification (BLOCKING per rule) + Restrictions×AC sub-matrices (BLOCKING per Step 7.5) — explicitly deferred to next session.

---
## C1 — Per-Mode API Capability Verification (engine Step 2 — Mandatory `context7` lookup) [2026-05-08 session]
This section closes the per-mode API capability verification gate for the four C1 lead candidates. Each candidate has a pinned-mode statement, three documentary `context7` (or equivalent) queries answered, an MVE block, and a per-numbered-Restriction × per-numbered-AC sub-matrix. The candidates' final lead-promotion to "Selected" status remains gated by the dedicated Jetson Orin Nano Super hardware MVE (D-C1-2 deferred phase).
### Fact #37 — OpenVINS per-mode API capability verification (mono+IMU on Jetson Orin Nano Super) — DOCUMENTARY PASS; Jetson MVE pending
- **Statement**: OpenVINS (`/rpng/open_vins`, master) exposes monocular / stereo / multi-camera + IMU as first-class launch configurations via `subscribe.launch.py` declared launch arguments `use_stereo` (bool) and `max_cameras` (int). The project's **pinned mode** is monocular + IMU, selected via `use_stereo:=false max_cameras:=1` with `config:=` pointing to a project-tuned `estimator_config.yaml`. **Mode-enumeration query (1/3)**: confirms 3 sensor configurations at the launch layer; supported IMU intrinsic models = KALIBR + RPNG (per `propagation-analytical.dox`). **Pinned-mode runnable example query (2/3)**: confirms `ros2 launch ov_msckf subscribe.launch.py config:=euroc_mav` is the documented runnable example; `euroc_mav` defaults to stereo per `subscribe.launch.py` but `use_stereo:=false max_cameras:=1` selects mono-only at runtime — no source patch required. **Disqualifier-probe query (3/3)**: did NOT surface any documented sub-20-Hz validation, hard frame-rate floor, or hard image-resolution ceiling in the master docs; the documented Xavier-NX latency baseline (~270 ms per rpng/open_vins issue #164) is below the AC-4.1 400 ms p95 budget head-room **at 640×480** but unverified at the project's 5472×3648 nav frames. The Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble build patch is documented (rpng/open_vins issue #421 + fdcl-gwu/openvins_jetson_realsense). **Pinned-mode sentence**: "We will use **OpenVINS** in **monocular + IMU mode** with inputs `{1× ADTi 20MP nav frame stream + FC IMU via MAVLink/SCALED_IMU2}` and expect outputs `{6-DoF pose at IMU rate with covariance from MSCKF state, source label visual_propagated when no satellite anchor}` on `Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble)`."
- **Source**: Source #54 (context7), Source #45 (canonical OpenVINS), Source #46 (KAIST Jetson benchmark for class-level comparison)
- **Phase**: Phase 2
- **Target Audience**: System architects + C1 implementer + Step-7.5 reviewer
- **Confidence**: ✅ for mode-enumeration and runnable-example documentary evidence; ⚠️ for sub-20-Hz validation and 5472×3648 latency (no documentary evidence — Jetson MVE will resolve)
- **Related Dimension**: SQ3+SQ4 / C1 lead candidate — per-mode API capability verification gate
- **Fit Impact**: **DOCUMENTARY PASS for the per-mode API capability verification gate**; promotes OpenVINS to "lead candidate, documentary verification complete" status in `../06_component_fit_matrix/C1_vio.md` row. License-track decision (D-C1-1) still gates final Selected promotion (OpenVINS = GPL-3.0, lives in track A); Jetson Orin Nano Super hardware MVE (D-C1-2) still gates accuracy/latency/memory empirical promotion.
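
The pinned-mode selection described in the statement amounts to three launch-argument overrides on the documented example. The sketch below only assembles the command line from the arguments named in this fact card; it does not invoke ROS 2, the combined command has not been run here, and the function name is hypothetical. A project-tuned config would replace the `euroc_mav` default.

```python
import shlex


def openvins_mono_launch_args(config="euroc_mav"):
    """Build the argv for the documented launch file with mono-only overrides."""
    return [
        "ros2", "launch", "ov_msckf", "subscribe.launch.py",
        f"config:={config}",     # documented example config; swap in the project's own
        "use_stereo:=false",     # disable the stereo default
        "max_cameras:=1",        # single nav camera
    ]


# Shell-ready string, for logging or for pasting into an MVE run sheet.
command = shlex.join(openvins_mono_launch_args())

if __name__ == "__main__":
    print(command)
```

Keeping the overrides in one place makes it easier to record the exact invocation alongside the Jetson MVE results.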
### Fact #38 — VINS-Mono per-mode API capability verification (mono+IMU on Jetson Orin Nano Super) — DOCUMENTARY PASS WITH FRAME-RATE CAVEAT; Jetson MVE pending
- **Statement**: VINS-Mono (`HKUST-Aerial-Robotics/VINS-Mono`, master) is a single-mode system: "real-time SLAM framework for **Monocular Visual-Inertial Systems**" (README §1) — no mode enumeration is required because the pinned mode IS the only mode. **Mode-enumeration query (1/3)**: VINS-Mono is single-mode = mono+IMU; cross-source documentary evidence from VINS-Fusion `context7` confirms the same authors continue to ship `euroc_mono_imu_config.yaml` as a first-class config in the active fork (per the Per-Mode API rule, VINS-Fusion's mono+IMU mode is a separately-cataloged candidate, but the algorithmic core and required calibration surface are identical — see Fact #29). **Pinned-mode runnable example query (2/3)**: README §3.1.1 — `roslaunch vins_estimator euroc.launch` + EuRoC MH_01 bag is the canonical runnable example; supports online camera-IMU extrinsic calibration (`estimate_extrinsic:=2`), online temporal calibration (`estimate_td:=1`), and rolling-shutter cameras with documented calibration ceiling (`reprojection error <0.5 px`). Pinhole + MEI camera models supported. Camera intrinsics + IMU noise must be calibrated (Kalibr or equivalent). **Disqualifier-probe query (3/3)**: README §5.1 explicitly states *"The image should exceed 20Hz and IMU should exceed 100Hz."* This is a documentary minimum-rate recommendation, and **the project's 3 fps nav-camera target falls below it by ~6.7×**. See Fact #40 for the geometric analysis and the cross-cutting frame-rate-sensitivity finding. Ceres Solver dependency is pinned to v1.14.0 (build issues at ≥2.0.0 per README §1.2); JetPack-shipped Ceres versions need explicit verification. License: GPLv3 (README §8).
**Pinned-mode sentence**: "We will use **VINS-Mono** in **monocular + IMU mode** with inputs `{1× ADTi 20MP nav frame stream (target 3 fps; under documentary 20 Hz floor) + FC IMU via MAVLink/SCALED_IMU2}` and expect outputs `{6-DoF pose at IMU rate via sliding-window optimization with covariance from optimization Hessian, loop closure via DBoW2}` on `Jetson Orin Nano Super (8 GB shared, JetPack 6, Ceres v1.14.0 build)`."
|
||||
- **Source**: Source #55 (VINS-Mono README + VINS-Fusion context7 cross-source), Source #43 (canonical VINS-Mono), Source #46 (KAIST Jetson benchmark for class-level comparison)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer + Step-7.5 reviewer
|
||||
- **Confidence**: ✅ for mode-enumeration (single mode by construction) and runnable-example evidence; ⚠️ for sub-20-Hz operation (documentary minimum-rate recommendation contradicts project frame-rate target); ⚠️ for Ceres v1.14.0 vs JetPack 6 stock Ceres compatibility
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 lead candidate — per-mode API capability verification gate
|
||||
- **Fit Impact**: **DOCUMENTARY PASS WITH FRAME-RATE CAVEAT**. Per the engine rule's escalation tier, the candidate is downgraded from "documentary lead" to **"Experimental only — sub-20-Hz operation requires Jetson MVE validation"** until the deferred Jetson hardware MVE explicitly measures VINS-Mono at the project's 3 fps. License-track decision (D-C1-1) still gates final Selected promotion (VINS-Mono = GPL-3.0, lives in track A).
|
||||
|
||||
### Fact #39 — OKVIS2 per-mode API capability verification (mono+IMU on Jetson Orin Nano Super) — DOCUMENTARY PASS; Jetson MVE pending
|
||||
- **Statement**: OKVIS2 (`smartroboticslab/okvis2`, main) is a keyframe-based factor-graph VI-SLAM with multi-camera + IMU support; the README documents coordinate-frame contract (`W` world / `C_i` cameras / `S` IMU / `B` body), state representation (`T_WS` pose + velocity + gyro/accel biases), and a two-callback API (`setOptimisedGraphCallback` for batch updates incl. loop closure + `setImuCallback` for high-rate prediction). **Mode-enumeration query (1/3)**: README + example apps confirm modes = mono / stereo / multi-camera (i-th camera frame `C_i`) — IMU is mandatory (`okvis::ViSensorBase::setImuCallback` is required). The example apps are `okvis_app_synchronous` (dataset replay), `okvis_app_realsense` (live D435i/D455), `okvis_app_realsense_record` (recording). ROS 2 build is opt-in (`BUILD_ROS2=ON`); ROS 2 launch files: `okvis_node_realsense.launch.xml`, `okvis_node_realsense_publisher.launch.xml`, `okvis_node_subscriber.launch.xml`, `okvis_node_synchronous.launch.xml`. **Pinned-mode runnable example query (2/3)**: README "Running the demo application" + "Configuration files" section — `./okvis_app_synchronous <config>.yaml <EuRoC_MH_01_easy_dir>` is the canonical mono dataset-replay example; the EuRoC config in `config/` is the documentary mono+IMU launch reference. Configuration trade-off surface: "various options to trade-off accuracy and computational expense as well as to enable online calibration" — explicit acknowledgement of latency/accuracy tuning surface. **Disqualifier-probe query (3/3)**: README does NOT state an explicit minimum image rate (cf. VINS-Mono's 20 Hz). OKVIS2's keyframe-based architecture inherently selects only "informative" frames for optimization, which is a structural advantage at lower input frame rates compared to sliding-window optimization. Optional LibTorch sky-segmentation CNN (`USE_NN`) can be disabled with `USE_NN=OFF` to remove the Jetson LibTorch dependency. License: 3-clause BSD (README "License" section). 
Health warning: "good results (or results at all) may only be obtained with appropriate calibration" — Kalibr-based intrinsic + extrinsic + IMU noise + tight time sync mandatory (this is shared with all VI candidates). OKVIS2-X (T-RO 2025) extends with optional GNSS fusion — architecturally aligned with the project's spoof-promotion path (per Fact #31). **Pinned-mode sentence**: "We will use **OKVIS2** (with `BUILD_ROS2=ON USE_NN=OFF`) in **monocular + IMU mode** with inputs `{1× ADTi 20MP nav frame stream + FC IMU via MAVLink/SCALED_IMU2 → re-published to /okvis/cam0/image_raw + /okvis/imu0}` and expect outputs `{6-DoF pose with covariance from factor-graph optimization via setOptimisedGraphCallback + high-rate IMU-predicted state via setImuCallback}` on `Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble)`."
|
||||
- **Source**: Source #56 (OKVIS2 README), Source #47 (canonical OKVIS2 paper arXiv:2202.09199), Source #48 (OKVIS2-X T-RO 2025)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer + Step-7.5 reviewer
|
||||
- **Confidence**: ✅ for mode-enumeration, runnable-example, and lower-frame-rate-tolerance arguments; ⚠️ for direct 3 fps validation (no documentary measurement — Jetson MVE will resolve); ⚠️ for direct Jetson Orin Nano measurement (Fact #31 noted no direct measurement; community evidence less abundant than OpenVINS)
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 lead candidate — per-mode API capability verification gate
|
||||
- **Fit Impact**: **DOCUMENTARY PASS for the per-mode API capability verification gate**; promotes OKVIS2 to "lead candidate, documentary verification complete" status in the `../06_component_fit_matrix/C1_vio.md` row. OKVIS2 is the **only candidate** of the four leads whose architecture (keyframe-based) gives a structural argument for tolerating sub-20-Hz operation — this re-orders the per-license-track lead ranking (see Fact #41 locked-in defaults). License-track decision (D-C1-1) does NOT gate OKVIS2 (BSD-3 already permissive); Jetson Orin Nano Super hardware MVE (D-C1-2) still gates empirical accuracy/latency/memory promotion.
|
||||
|
||||
### Fact #40 — Cross-cutting C1 finding: project's 3 fps nav-camera target is below VINS-Mono's documented 20 Hz minimum-rate recommendation; affects all sliding-window VIO candidates; OKVIS2's keyframe architecture is the structural mitigant
|
||||
- **Statement**: VINS-Mono README §5.1 documents "The image should exceed 20Hz and IMU should exceed 100Hz" as the recommended minimum-rate operating envelope (Source #55). The project's nav-camera processing target is 3 fps per `00_question_decomposition.md` Project Constraint Matrix. **Geometric analysis**: at 60 km/h cruise (16.7 m/s), the aircraft advances 16.7 m/s × (1/3 s) ≈ 5.6 m between consecutive nav frames; at 1 km AGL with 12 cm/px GSD, that motion projects to ~46 px of in-image displacement (~0.84% of the 5472 px frame width) — **well within KLT-trackable range** for the nadir-down camera geometry, so the rate floor is NOT geometrically unreachable. **However**: the documented recommendation is about temporal-stability assumptions (motion-blur tolerance, IMU pre-integration noise growth, sliding-window optimisation Jacobian conditioning), not about geometric trackability. **Cross-candidate impact**: (a) **VINS-Mono** — sliding-window optimisation, full graph re-linearisation per keyframe, 20 Hz documentary recommendation explicitly violated by 6.7× → ⚠️ Experimental only until Jetson MVE measures actual sub-20-Hz behaviour; (b) **VINS-Fusion** — same algorithmic core as VINS-Mono mono+IMU mode, same caveat applies; (c) **OpenVINS** — MSCKF-based with sliding-window state + sparse feature constraints, has documented variable-rate tolerance via `init_imu_thresh`/`init_window_time` config, but no documentary sub-20-Hz validation surfaced in `context7` queries → ⚠️ Verify via Jetson MVE; (d) **OKVIS2** — keyframe-based, structurally selects only informative frames for optimization; the architecture is more naturally tolerant of variable / lower input rates → preferred candidate at low input frame rates; ✅ structural argument; (e) **Pure VO baseline** (KLT+RANSAC) — requires sufficient feature overlap between consecutive frames; at 0.84% in-image displacement this is well within KLT capture range; ✅ no rate-floor concern.
**Architectural alternative for design-phase consideration**: instead of binding all C1 candidates to 3 fps, the nav-camera input pipeline could fork — full-resolution 5472×3648 at 3 fps for VPR/satellite-anchor (C2/C3) and a binned/cropped 1368×912 (or 640×480) at higher rate (≥10 fps) into the VIO front-end. ADTi 20MP 20L V1 (APS-C) bandwidth at full-res caps near 5–7 fps over USB 3 (≈2–3 Gbit/s raw); binned modes typically run at 3–10× that rate. This is a Plan-time decision, not a research-time one, but the option must be carried into Plan and the Jetson MVE must measure both single-rate and dual-rate paths.
|
||||
- **Source**: Source #55 (VINS-Mono README §5.1), Source #43 (canonical), restrictions.md "Cameras" section + `00_question_decomposition.md` Project Constraint Matrix (3 fps target)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 implementer + Plan-phase reviewer + Jetson MVE owner
|
||||
- **Confidence**: ✅ for the documentary 20 Hz minimum-rate recommendation; ✅ for geometric trackability calculation; ⚠️ for the binned/dual-rate pipeline option (camera-bandwidth estimate is plausible but needs ADTi datasheet verification at Plan time)
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 frame-rate sensitivity (cross-candidate); SQ4 (per-candidate runtime envelope binding)
|
||||
- **Fit Impact**: **(a)** Re-orders the per-license-track candidate ranking — within the BSD/permissive track, OKVIS2 strengthens its lead via structural keyframe argument; within the GPL-3.0 track, OpenVINS retains lead over VINS-Mono on this specific dimension because MSCKF's variable-rate tolerance is more documented than VINS-Mono's full-window optimisation. **(b)** Adds a Plan-phase decision: **single-rate (3 fps to all consumers) vs dual-rate (binned high-rate to VIO + full-res 3 fps to VPR/satellite)** — this becomes an explicit deliverable for the Plan phase, not the Jetson MVE phase, because the nav-camera input pipeline shape feeds into both C1 and C2/C3 candidate scoring. **(c)** Marks all VINS-* candidates as ⚠️ Experimental-only until the deferred Jetson hardware MVE explicitly measures sub-20-Hz behaviour.
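
The geometric analysis in Fact #40 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the inputs (60 km/h cruise, 3 fps, 12 cm/px GSD, 5472 px frame width, 20 Hz documented recommendation) are the figures quoted in the Statement above.

```python
def interframe_displacement(speed_kmh: float, fps: float, gsd_m_per_px: float):
    """Return (ground motion in metres, image displacement in pixels) between frames.

    Assumes a nadir-down camera and straight, level flight, so ground motion
    maps linearly to image motion through the ground sampling distance (GSD).
    """
    speed_ms = speed_kmh / 3.6          # km/h -> m/s
    ground_m = speed_ms / fps           # distance flown during one frame interval
    return ground_m, ground_m / gsd_m_per_px


FRAME_WIDTH_PX = 5472                   # nav-camera width quoted in Fact #40
ground_m, disp_px = interframe_displacement(speed_kmh=60.0, fps=3.0, gsd_m_per_px=0.12)
fraction_of_width = disp_px / FRAME_WIDTH_PX
rate_gap = 20.0 / 3.0                   # documented 20 Hz recommendation vs 3 fps target

print(f"ground motion per frame : {ground_m:.2f} m")
print(f"image displacement      : {disp_px:.1f} px")
print(f"fraction of frame width : {100 * fraction_of_width:.2f} %")
print(f"rate gap vs 20 Hz       : {rate_gap:.1f}x")
```

Re-running the same function with a binned GSD or a higher frame rate gives the corresponding figures for the dual-rate pipeline option.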
|
||||
|
||||
### Fact #41 — D-C1-1 + D-C1-2 locked-in research-time defaults (after user-skipped clarification, 2026-05-08)
|
||||
- **Statement**: The user invoked `/autodev` and was presented with structured AskQuestion prompts for D-C1-1 (GPL-3.0 license posture) and D-C1-2 (Jetson MVE schedule); the user **skipped the questions with the directive "continue with the information you already have"**. Per autodev meta-rule "Critical Thinking", locked-in research-time defaults were selected to preserve maximum future optionality and to honour the documentary evidence already gathered: **D-C1-1 = (c) "Keep both license tracks open"** — rank GPL-3.0 leads (OpenVINS, VINS-Mono, VINS-Fusion) in parallel with BSD-permissive OKVIS2/OKVIS2-X; **carry both license tracks through Plan**; final license decision deferred to post-Jetson-MVE/Plan time when empirical evidence is available. **D-C1-2 = (b) "Defer Jetson MVE to a dedicated bring-up phase between research and Plan"** — research closes with documentary ranking + explicit "Jetson MVE pending" gates per candidate; the dedicated Jetson Orin Nano Super hardware MVE phase produces a single MVE artifact that promotes leads to "Selected" before Plan starts. The Plan phase MUST NOT lock a final C1 candidate before the deferred Jetson MVE artifact is produced and reviewed. **These defaults are explicitly tagged as user-deferred** — the user retains the right to revisit either decision at Plan time without losing the research artifact (both license tracks fully cataloged; both lead candidates carry full per-mode evidence).
|
||||
- **Source**: User clarification skip during 2026-05-08 `/autodev` invocation; autodev meta-rule "Critical Thinking"; greenfield-flow Step 14 (Plan) precondition rule
|
||||
- **Phase**: Phase 2 — process decision
|
||||
- **Target Audience**: System architects + Plan-phase reviewer + Step-7.5 reviewer
|
||||
- **Confidence**: ✅ (defaults selected and tagged as user-deferred; user can override at any later prompt)
|
||||
- **Related Dimension**: SQ3+SQ4 / C1 process gate; cross-cutting onto C2–C10 (license posture decision is project-wide, not C1-specific)
|
||||
- **Fit Impact**: **PROCESS GATE CLOSURE for C1**. Allows research to proceed past C1 to C2 (VPR) candidate enumeration without requiring user input now. The Plan phase MUST surface D-C1-1 again as a structured A/B/C decision before any C1 candidate is locked, AND MUST require the deferred Jetson MVE artifact as a precondition.
|
||||
|
||||
---
|
||||
|
||||
## C1 — Minimum Viable Example (MVE) Blocks
|
||||
|
||||
### MVE — OpenVINS in monocular + IMU mode
|
||||
- **Source**: Source #54 (context7 → `https://github.com/rpng/open_vins/blob/master/docs/gs-tutorial.dox` ROS 2 launch + `https://github.com/rpng/open_vins/blob/master/docs/gs-datasets.dox` EuRoC config), accessed 2026-05-08
|
||||
- **Inputs in the example**: EuRoC MAV stereo VI dataset (default `config:=euroc_mav` is stereo 2× cameras + IMU); the launch file declares `use_stereo` (default `true`) and `max_cameras` (default `2`) as runtime overrides; setting `use_stereo:=false max_cameras:=1` selects monocular operation against the same `estimator_config.yaml` parameter file with ROS topics `/cam0/image_raw` + `/imu0`
|
||||
- **Outputs in the example**: 6-DoF pose at IMU rate; ROS 1 publishes `/ov_msckf/poseimu`, `/ov_msckf/odomimu`, `/ov_msckf/pathimu`; ROS 2 publishes equivalent topics under the configured namespace
|
||||
- **Project inputs**: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) + FC IMU via MAVLink (SCALED_IMU2 at ≥100 Hz)
|
||||
- **Project outputs required**: 6-DoF pose at IMU rate with metric scale + 6×6 covariance + source label `visual_propagated` when no satellite anchor; AC-1.4-compliant 95% covariance ellipse; honest covariance per AC-NEW-4
|
||||
- **Match assessment**: ✅ exact mode match for **mono+IMU**; ⚠️ partial input shape (image resolution ≈7.3–7.6× larger per axis, ≈55× more pixels, than EuRoC's 752×480 → latency/memory unverified at full resolution); ⚠️ partial input rate (3 fps vs EuRoC's 20 Hz — see Fact #40)
|
||||
- **If ⚠️ or ❌**: docs do not explicitly disqualify the configuration. The launch surface (`use_stereo`, `max_cameras`, `config_path`) supports the project's mode without source patches. Resolution and rate are **runtime/Jetson-MVE concerns**, not API-mode concerns. → Status: **Documentary lead**; final promotion to "Selected" requires Jetson Orin Nano Super hardware MVE artifact (D-C1-2 deferred phase).
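
The input-shape gap between the project camera and the EuRoC reference can be quantified directly. The sketch below is a minimal illustration: the helper name is hypothetical, and the resolutions are the ones quoted in this document (5472×3648 project frame, 752×480 EuRoC frame, 1368×912 binned option from Fact #40).

```python
PROJECT_RES = (5472, 3648)   # ADTi 20MP nav frame, as quoted in this document
EUROC_RES = (752, 480)       # EuRoC MAV frame size, as quoted in this document
BINNED_RES = (1368, 912)     # binned option mentioned in Fact #40


def resolution_gap(big, small):
    """Return (width ratio, height ratio, pixel-count ratio) of big over small."""
    width_ratio = big[0] / small[0]
    height_ratio = big[1] / small[1]
    pixel_ratio = (big[0] * big[1]) / (small[0] * small[1])
    return width_ratio, height_ratio, pixel_ratio


full_w, full_h, full_px = resolution_gap(PROJECT_RES, EUROC_RES)
bin_w, bin_h, bin_px = resolution_gap(BINNED_RES, EUROC_RES)
binning_factor = PROJECT_RES[0] / BINNED_RES[0]

print(f"full-res vs EuRoC : {full_w:.2f}x wide, {full_h:.2f}x high, {full_px:.1f}x pixels")
print(f"binned vs EuRoC   : {bin_w:.2f}x wide, {bin_h:.2f}x high, {bin_px:.2f}x pixels")
print(f"binning factor    : {binning_factor:.0f}x per axis")
```

The per-axis ratio matters for feature-tracking search windows, while the pixel-count ratio is the better proxy for per-frame memory and front-end compute.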
|
||||
|
||||
### MVE — VINS-Mono in monocular + IMU mode (single mode by construction)
|
||||
- **Source**: Source #55 (VINS-Mono README §3.1.1 + cross-source VINS-Fusion `context7` `euroc_mono_imu_config.yaml`), accessed 2026-05-08
|
||||
- **Inputs in the example**: EuRoC MAV monocular VI dataset (the README explicitly notes "Although it contains stereo cameras, we only use one camera"); ROS topics with image rate >20 Hz and IMU rate >100 Hz per README §5.1; pinhole or MEI camera model with intrinsics + distortion calibrated; camera-IMU extrinsic + temporal calibration optional (online estimation supported via `estimate_extrinsic` and `estimate_td` params)
|
||||
- **Outputs in the example**: 6-DoF pose at IMU rate via sliding-window optimization with covariance from optimization Hessian; loop closure via DBoW2; pose-graph save/reuse via `s` keystroke
|
||||
- **Project inputs**: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps — **below documentary 20 Hz floor**) + FC IMU via MAVLink (SCALED_IMU2 at ≥100 Hz)
|
||||
- **Project outputs required**: same as OpenVINS MVE above
|
||||
- **Match assessment**: ✅ exact mode match (single-mode system, the project's pinned mode IS the only mode); ⚠️ partial input rate (3 fps vs documentary 20 Hz minimum recommendation per Fact #40); ⚠️ partial dependency stack (Ceres v1.14.0 vs JetPack 6 stock Ceres needs verification); ⚠️ partial input resolution (EuRoC 752×480 vs project 5472×3648)
|
||||
- **If ⚠️ or ❌**: README §5.1 *"The image should exceed 20Hz and IMU should exceed 100Hz"* — explicit documentary disqualifier for sub-20-Hz operation absent contrary measurement. Geometric analysis (Fact #40) shows in-image displacement at 3 fps is small (~0.84% of frame width) and KLT-trackable, but the documentary minimum is not validated by the upstream authors at this rate. → Status: **Experimental only** until Jetson MVE explicitly measures sub-20-Hz behaviour, OR until the Plan phase commits to the dual-rate camera pipeline (binned high-rate to VIO + full-res 3 fps to VPR — see Fact #40) which would put VINS-Mono back on a documentary lead path.
|
||||
|
||||
### MVE — OKVIS2 in monocular + IMU mode
|
||||
- **Source**: Source #56 (OKVIS2 README "Running the demo application" + "Building the project with ROS2" + arXiv:2202.09199), accessed 2026-05-08
|
||||
- **Inputs in the example**: EuRoC ASL/ETH dataset directory (e.g., MH_01_easy/) + a config file from the `config/` directory; alternative live input via Realsense D435i/D455 through `okvis_app_realsense`; the i-th camera frame `C_i` in the OKVIS coordinate model permits multi-camera operation but mono is supported when `C_0` is the only configured camera in the YAML
|
||||
- **Outputs in the example**: An `okvis::Trajectory` object that can be queried at any timestamp; updates delivered via `setOptimisedGraphCallback` (batch updates including loop closure) and high-rate prediction via `setImuCallback`; state `T_WS` (pose) + `v_W` (velocity) + `b_g`/`b_a` (gyro/accel biases)
|
||||
- **Project inputs**: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps) + FC IMU via MAVLink (SCALED_IMU2 at ≥100 Hz) → re-published to `/okvis/cam0/image_raw` + `/okvis/imu0` topics in the ROS 2 build path
|
||||
- **Project outputs required**: same as OpenVINS MVE above
|
||||
- **Match assessment**: ✅ exact mode match for **mono+IMU**; ✅ structural argument for sub-20-Hz tolerance (keyframe-based architecture per Fact #40); ⚠️ partial input shape (image resolution unverified at 5472×3648 — config files in `config/` are tuned for D435i/EuRoC resolutions); ⚠️ partial Jetson Orin Nano direct evidence (no community benchmark surfaced)
|
||||
- **If ⚠️ or ❌**: docs do not explicitly disqualify the configuration; the keyframe architecture is the structural mitigant for the project's frame-rate target. Optional LibTorch sky-segmentation can be disabled with `USE_NN=OFF` to remove the Jetson LibTorch dependency. → Status: **Documentary lead with structural advantage at sub-20-Hz**; final promotion to "Selected" requires Jetson Orin Nano Super hardware MVE artifact (D-C1-2 deferred phase).
|
||||
|
||||
### MVE — Pure VO baseline (KLT optical flow + 5-point essential matrix or homography RANSAC) — IMU-fusion external
|
||||
- **Source**: Source #53 (OpenCV `cv::calcOpticalFlowPyrLK` + `cv::findEssentialMat` + `cv::findHomography` + `cv::Rodrigues` + reference implementation `alishobeiri/Monocular-Video-Odometery` MIT 2018)
|
||||
- **Inputs in the example**: Sequence of monocular grayscale frames; OpenCV cookbook tutorial uses KITTI Odometry sequences (1241×376 at 10 fps, ground-plane motion); reference impl uses webcam at variable rate
|
||||
- **Outputs in the example**: Sequence of relative-pose 3×4 matrices `[R|t]` per frame pair (arbitrary scale via 5-point essential; metric scale recoverable via known scene structure or external IMU integration)
|
||||
- **Project inputs**: 1× ADTi 20MP nav frame stream (5472×3648, target 3 fps); FC IMU consumed by an **external metric-scale wrapper** (loosely-coupled ESKF that integrates IMU between visual updates and rescales the visual-odometry translation to metric units)
|
||||
- **Project outputs required**: same as VIO MVEs above; the external wrapper produces the C5-style covariance because pure VO has no native covariance
|
||||
- **Match assessment**: ⚠️ partial — the visual-odometry stage matches exactly (mono VO → relative pose); the IMU-fusion stage is **NOT in this candidate** and must be a separately-designed external module (loosely-coupled ESKF). At the C1 component scope, this candidate is "VO-only" and explicitly requires C5 to provide IMU fusion and covariance.
|
||||
- **If ⚠️ or ❌**: → Status: **Mandatory simple-baseline reference**, NOT a lead. Used to anchor failure-analysis discussion in `solution_draft01` and as a runnable fallback if all VIO candidates fail Jetson MVE. The external IMU-fusion wrapper for this candidate becomes part of C5 (state estimator) candidate scope, not C1.
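
The metric-scale step of the external wrapper can be illustrated in isolation. This is a deliberately simplified sketch, not the loosely-coupled ESKF itself: it assumes the 5-point essential-matrix stage yields only a translation direction, and that an IMU-integrated displacement over the same frame interval is available from elsewhere. The function name and sample values are hypothetical.

```python
import math


def rescale_translation(t_direction, imu_displacement_m):
    """Scale an up-to-scale VO translation so its length matches the IMU displacement.

    t_direction        : 3-vector from essential-matrix decomposition (arbitrary scale)
    imu_displacement_m : 3-vector displacement in metres integrated from the IMU
    Only the magnitude of the IMU displacement is used; the direction stays visual.
    """
    vo_norm = math.sqrt(sum(c * c for c in t_direction))
    if vo_norm == 0.0:
        raise ValueError("VO translation has zero length; scale is unobservable")
    imu_norm = math.sqrt(sum(c * c for c in imu_displacement_m))
    scale = imu_norm / vo_norm
    return [c * scale for c in t_direction]


# Hypothetical values: unit-length VO direction, IMU reports about 5.6 m of motion.
t_metric = rescale_translation([0.0, 0.6, 0.8], [0.1, 3.3, 4.5])
print([round(c, 3) for c in t_metric])
```

A production wrapper would estimate scale as a filter state with its own uncertainty rather than renormalising each frame pair, since raw IMU integration drifts with accelerometer bias.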
|
||||
|
||||
---
|
||||
|
||||
## C1 — Per-numbered-Restriction × Per-numbered-AC Sub-Matrix per Candidate
|
||||
|
||||
> Per the Per-Mode API Capability Verification rule, item 4: every numbered Restriction line and every numbered Acceptance Criterion is bound to one of `{Pass, Fail, Verify, N/A}` per candidate, with a one-line evidence cite. Lines marked N/A are out of C1 scope (handled by C2 / C3 / C4 / C5 / C6 / C7 / C8 / C9 / C10). Cells marked `Verify` block final "Selected" promotion until the Jetson Orin Nano Super hardware MVE phase resolves them.
|
||||
|
||||
### Sub-matrix legend
|
||||
|
||||
- **Pass**: pinned mode satisfies the line with cited documentary evidence
|
||||
- **Fail**: pinned mode contradicts the line with cited documentary evidence
|
||||
- **Verify**: no documentary evidence either way; deferred Jetson MVE phase will resolve
|
||||
- **N/A**: line is irrelevant to C1 (will be bound by C2/.../C10 in their respective rows)
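
The legend and the promotion rule can be expressed as a small data model. The sketch below is one possible encoding, not part of the research tooling: the `Binding` enum, the `promotion_status` helper, and the status strings are hypothetical. Treating any `Fail` as blocking is an assumption, and compound cells such as "Pass (with Verify)" would have to be recorded as `VERIFY` under this scheme.

```python
from enum import Enum


class Binding(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    VERIFY = "Verify"
    NA = "N/A"


def promotion_status(bindings: dict) -> str:
    """Summarise whether a candidate may be promoted to "Selected".

    bindings maps a line id (e.g. "AC-1.3") to its Binding for one candidate.
    """
    cells = set(bindings.values())
    if Binding.FAIL in cells:
        # Assumption: a documentary Fail blocks promotion outright.
        return "Blocked: documentary Fail"
    if Binding.VERIFY in cells:
        # Stated rule: Verify cells block promotion until the Jetson MVE resolves them.
        return "Blocked: Jetson MVE pending"
    return "Eligible for Selected"


# Three example cells for one candidate (values are illustrative).
example = {
    "AC-1.3": Binding.VERIFY,
    "AC-1.4": Binding.PASS,
    "AC-6.1": Binding.NA,
}
print(promotion_status(example))
```

Keeping the bindings in a structure like this makes it straightforward to list the open `Verify` lines that the deferred Jetson MVE has to close for each candidate.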
|
||||
|
||||
### Cross-cutting N/A lines (apply to ALL C1 candidates)
|
||||
|
||||
The following AC and Restriction lines are out of C1 scope and are marked N/A for every C1 candidate without per-candidate citation:
|
||||
|
||||
- **All of AC-2.1b** (satellite-anchor registration) — bound by C2 (VPR) + C3 (matcher) + C4 (PnP)
|
||||
- **All of AC-2.2 (cross-domain MRE branch)** — bound by C3 (matcher)
|
||||
- **AC-3.4** (operator re-loc hint) — bound by C8 (FC adapter) + C10 (operator UX)
|
||||
- **All of AC-6.x** (GCS telemetry) — bound by C8
|
||||
- **All of AC-7.x** (AI-camera object localization) — bound outside C1 entirely
|
||||
- **All of AC-8.x** (satellite reference imagery) — bound by C6 (tile cache) + C10 (provisioning)
|
||||
- **All of AC-NEW-3** (FDR records — except the "per-frame estimates with covariance + source-label" line which is a downstream pass-through of C1 output) — bound by C5 (state estimator emits the per-frame record) + system-wide FDR component
|
||||
- **All of AC-NEW-5** (operating environmental envelope: −20 °C to +50 °C, vibration, cooling) — bound by C7 (Jetson runtime / thermal scheduler) + system-wide thermal design
|
||||
- **All of AC-NEW-6** (imagery freshness enforcement) — bound by C6 + C10
|
||||
- **All of AC-NEW-7** (cache-poisoning safety budget) — bound by C5 + C6 + system-wide
|
||||
- **Restriction "Satellite Imagery" entire section** — bound by C6 + C10
|
||||
- **Restriction "Communication protocol (pinned)"** + **"Output to FC"** — bound by C8
|
||||
- **Restriction "Ground station"** — bound by C8
|
||||
|
||||
### OpenVINS — per-numbered binding (C1-relevant lines only; cross-cutting N/A above)
|
||||
|
||||
| Line | Binding | Evidence (one-line cite) |
|
||||
|---|---|---|
|
||||
| AC-1.3 (drift between anchors: <100 m visual-only / <50 m IMU-fused) | **Verify** | OpenVINS produces metric-scale 6-DoF + IMU-fused covariance; absolute drift between anchors is a function of nav-cam frame rate + texture + IMU bias — Jetson MVE on Derkachi flight required |
|
||||
| AC-1.4 (95% covariance ellipse + source label) | **Pass** | MSCKF produces native 6×6 covariance from filter state; source label is a downstream pipeline concern (C5) — OpenVINS provides the covariance input |
|
||||
| AC-2.1a (frame-to-frame registration ≥95% normal flight) | **Verify** | OpenVINS feature-tracking front-end (KLT-based) success rate at 3 fps × 5472×3648 nadir-down low-texture cropland — Jetson MVE on Derkachi flight required |
|
||||
| AC-2.2 (frame-to-frame MRE <1.0 px) | **Verify** | OpenVINS reports per-feature reprojection residuals via the MSCKF measurement model; aggregate MRE under nadir-down low-texture conditions — Jetson MVE measurement |
|
||||
| AC-3.1 (tolerate 350 m outliers ±20° tilt) | **Pass (with Verify scope)** | MSCKF outlier-rejection via Mahalanobis gating is documented; the 350 m / ±20° envelope is an integration boundary owned by C5 — OpenVINS provides the per-feature gate |
|
||||
| AC-3.2 (sharp turns <5% overlap, <200 m drift, <70° heading change) | **Verify** | OpenVINS has documented failure-detection + recovery; recovery via satellite-reference re-localization (AC-3.3) is owned by C2/C3 — OpenVINS must trigger the recovery path, MVE measurement of sharp-turn recovery on Derkachi flight |
|
||||
| AC-3.3 (≥3 disconnected segments via satellite re-localization) | **Pass** | OpenVINS has documented failure-detection + recovery API (`StateOptions`); the re-localization input is provided by C2/C3 |
|
||||
| AC-3.5 (visual blackout + spoofed GPS → dead_reckoned label, ≤400 ms) | **Verify** | OpenVINS internal mode promotion (`SLAM` ↔ `IMU-only propagation`) latency under feature-loss conditions — Jetson MVE measurement; the label-state transition is owned by C5 |
|
||||
| AC-4.1 (latency <400 ms p95) | **Verify** | Documented Xavier NX baseline ~270 ms at 640×480 (Source #45 issue #164); 5472×3648 + Jetson Orin Nano Super at 3 fps unverified — Jetson MVE measurement |
|
||||
| AC-4.2 (memory <8 GB shared) | **Verify** | MSCKF has lower memory footprint than full sliding-window optimization; Jetson Orin Nano Dev Kit build confirmed (Source #45 issue #421) but co-resident memory pressure with C2/C3/C5/C6 not measured |
|
||||
| AC-4.4 (frame-by-frame, no batching) | **Pass** | OpenVINS publishes pose at IMU rate (per Source #54 launch evidence); no batching by design |
|
||||
| AC-4.5 (corrections allowed) | **Pass** | MSCKF natively re-linearises in its sliding window; corrections via state augmentation are documented |
|
||||
| AC-5.1 (initialise from FC EKF's last valid GPS + IMU-extrapolated position) | **Pass** | OpenVINS supports custom initialisation via `init_options` (per Source #54 estimator config); the FC-EKF input is plumbed by C5/C8 |
|
||||
| AC-5.3 (re-initialise on companion reboot from FC IMU-extrapolated position) | **Pass** | Same mechanism as AC-5.1; AC-NEW-1 covers the timing constraint |
|
||||
| AC-NEW-1 (cold-start TTFF <30 s) | **Verify** | OpenVINS initialisation latency under co-resident process startup on Jetson Orin Nano Super — Jetson MVE measurement |
|
||||
| AC-NEW-3 (per-frame estimates with covariance + source-label feed FDR) | **Pass** | OpenVINS publishes pose+covariance at IMU rate; the source-label and FDR pipeline are downstream (C5 + system-wide) |
|
||||
| AC-NEW-4 (false-position safety budget — covariance honesty) | **Pass (with Verify)** | MSCKF produces filter-consistent 6×6 covariance; honest-covariance discipline is shared with C5 (which carries the contract to AC-4.3); covariance under-reporting in the presence of cross-domain matches is a known MSCKF failure mode (Fact #5 family) — Jetson MVE on Derkachi flight required for empirical floor |
|
||||
| AC-NEW-8 (visual blackout + GPS spoofing — IMU-only ≤30 s, label dead_reckoned) | **Pass** | OpenVINS has documented IMU-only propagation mode after visual feature loss; the failsafe-label transition is owned by C5 |
|
||||
| Restriction "Sharp turns are exceptions; consecutive photos may share <5% overlap" | **Verify** | Same as AC-3.2 — Jetson MVE measurement |
|
||||
| Restriction "Navigation camera (pinned): ADTi 20MP 20L V1, 5472×3648" | **Verify** | Image-resolution scaling (≈55× the pixel count of EuRoC's 752×480 baseline, ≈7.3–7.6× per axis) — Jetson MVE measurement of feature-extraction latency at full-res; binned/cropped path option per Fact #40 |
|
||||
| Restriction "Companion computer (pinned): Jetson Orin Nano Super, 8 GB shared" | **Verify** | Build confirmed (Source #45 issue #421); steady-state co-resident memory pressure unverified — Jetson MVE measurement |
|
||||
| Restriction "High-rate IMU available from FC via MAVLink" | **Pass** | OpenVINS consumes IMU at any rate ≥100 Hz; SCALED_IMU2 at FC's native rate (typically 100–400 Hz) satisfies this |
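
For AC-1.4, the step from an estimator's covariance to a 95% ellipse is standard Gaussian geometry. The sketch below assumes the horizontal 2×2 block of the 6×6 pose covariance has already been extracted and that the position error is approximately Gaussian; the function name and sample values are hypothetical.

```python
import math

# For a chi-square distribution with 2 degrees of freedom the CDF is
# 1 - exp(-x / 2), so the 95% quantile is -2 * ln(0.05), about 5.99.
CHI2_95_2DOF = -2.0 * math.log(0.05)


def ellipse_95(sxx: float, sxy: float, syy: float):
    """Return (semi-major m, semi-minor m, orientation rad) of the 95% ellipse.

    sxx, sxy, syy are the entries of a symmetric 2x2 position covariance in m^2.
    Orientation is the angle of the major axis measured from the x axis.
    """
    mean = 0.5 * (sxx + syy)
    root = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = mean + root
    lam_minor = max(mean - root, 0.0)   # clamp tiny negative values from round-off
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.sqrt(CHI2_95_2DOF * lam_major),
            math.sqrt(CHI2_95_2DOF * lam_minor),
            angle)


# Hypothetical covariance: 2 m standard deviation in x, 1 m in y, uncorrelated.
semi_major, semi_minor, angle = ellipse_95(4.0, 0.0, 1.0)
print(f"semi-major {semi_major:.2f} m, semi-minor {semi_minor:.2f} m, angle {angle:.2f} rad")
```

The ellipse is only as honest as the covariance fed into it, which is why the AC-NEW-4 row treats covariance consistency as a separate item to verify.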
|
||||
|
||||
### VINS-Mono — per-numbered binding (C1-relevant lines only; cross-cutting N/A above)
|
||||
|
||||
| Line | Binding | Evidence (one-line cite) |
|
||||
|---|---|---|
|
||||
| AC-1.3 (drift between anchors) | **Verify** | Same as OpenVINS; sliding-window optimisation has higher drift than MSCKF in low-texture per academic comparison — Jetson MVE measurement |
|
||||
| AC-1.4 (covariance ellipse + source label) | **Pass** | Sliding-window optimisation produces native covariance from optimization Hessian; source label is C5's concern |
|
||||
| AC-2.1a (frame-to-frame registration ≥95%) | **Fail (documentary) → Verify** | VINS-Mono README §5.1 documents 20 Hz minimum image rate; project's 3 fps is below this floor (Fact #40) → ⚠️ **Experimental only** until Jetson MVE explicitly validates sub-20-Hz operation |
|
||||
| AC-2.2 (MRE <1.0 px) | **Verify** | Same as OpenVINS; reprojection error under sub-20-Hz operation unverified |
|
||||
| AC-3.1 (tolerate 350 m outliers ±20° tilt) | **Pass (with Verify scope)** | VINS-Mono has documented failure-detection + recovery |
|
||||
| AC-3.2 (sharp turns) | **Verify** | Same as OpenVINS; under sub-20-Hz operation, sharp-turn recovery unverified — Jetson MVE measurement |
|
||||
| AC-3.3 (disconnected segments via satellite re-localization) | **Pass** | VINS-Mono has documented failure-recovery; pose-graph reuse via DBoW2 supports re-anchor |
|
||||
| AC-3.5 (visual blackout + spoofed GPS) | **Verify** | Same as OpenVINS |
|
||||
| AC-4.1 (latency <400 ms p95) | **Verify** | Documented on Jetson Nano (Source #43); Orin Nano Super is expected to meet the budget at comparable resolutions, but operation at 5472×3648 is unverified — Jetson MVE measurement |
|
||||
| AC-4.2 (memory <8 GB shared) | **Verify** | Same as OpenVINS |
|
||||
| AC-4.4 (frame-by-frame) | **Pass** | VINS-Mono publishes pose at IMU rate |
|
||||
| AC-4.5 (corrections allowed) | **Pass** | Sliding-window optimization re-linearises and supports corrections |
|
||||
| AC-5.1 (initialise from FC EKF) | **Pass** | VINS-Mono has automatic initialization via IMU pre-integration; custom-init from FC EKF is a wiring task |
|
||||
| AC-5.3 (re-initialise on reboot) | **Pass** | Same as AC-5.1 |
|
||||
| AC-NEW-1 (cold-start TTFF <30 s) | **Verify** | VINS-Mono automatic initialization typically takes seconds; Jetson MVE measurement |
|
||||
| AC-NEW-3 (per-frame estimates feed FDR) | **Pass** | Same as OpenVINS |
|
||||
| AC-NEW-4 (covariance honesty) | **Pass (with Verify)** | Same as OpenVINS; sliding-window optimization Hessian is a less-conservative covariance source than MSCKF in some failure modes |
|
||||
| AC-NEW-8 (visual blackout + GPS spoofing) | **Pass (with Verify)** | VINS-Mono has documented failure-detection and IMU-only propagation; failsafe-label transition is C5's |
|
||||
| Restriction "Sharp turns are exceptions" | **Verify** | Same as AC-3.2 |
|
||||
| Restriction "Navigation camera (pinned): 5472×3648" | **Verify** | Same as OpenVINS; **plus** the Fact #40 dual-rate option is an explicit Plan-time consideration to bring VINS-Mono back from Experimental to documentary lead |
|
||||
| Restriction "Companion computer: Jetson Orin Nano Super, 8 GB" | **Verify** | Same as OpenVINS; Ceres v1.14.0 vs JetPack 6 stock Ceres compatibility is an additional sub-verify item |
|
||||
| Restriction "High-rate IMU available from FC via MAVLink" | **Pass** | VINS-Mono consumes IMU at ≥100 Hz; satisfied |
|
||||
|
||||
### OKVIS2 / OKVIS2-X — per-numbered binding (C1-relevant lines only; cross-cutting N/A above)
|
||||
|
||||
| Line | Binding | Evidence (one-line cite) |
|
||||
|---|---|---|
|
||||
| AC-1.3 (drift between anchors) | **Verify** | Factor-graph back-end with loop closure should produce lower drift than non-loop VIO; specific Derkachi-flight measurement deferred to Jetson MVE |
|
||||
| AC-1.4 (covariance ellipse + source label) | **Pass** | OKVIS2 produces 6×6 covariance from factor-graph marginal; source label is C5's concern |
|
||||
| AC-2.1a (frame-to-frame registration ≥95%) | **Pass (structural argument) → Verify** | Keyframe-based selection is structurally tolerant of variable input rates (Fact #40); explicit 3 fps validation deferred to Jetson MVE |
|
||||
| AC-2.2 (MRE <1.0 px) | **Verify** | OKVIS2 has tight reprojection-error inlier rejection in its keyframe matching; aggregate MRE under nadir-down low-texture — Jetson MVE measurement |
|
||||
| AC-3.1 (tolerate 350 m outliers ±20° tilt) | **Pass** | OKVIS2 has Cauchy-loss robust factor graph that tolerates outliers; documented in arXiv:2202.09199 |
|
||||
| AC-3.2 (sharp turns) | **Pass (structural)** | Keyframe selection inherently skips uninformative sharp-turn frames; recovery via re-localization is owned by C2/C3 |
|
||||
| AC-3.3 (≥3 disconnected segments) | **Pass** | OKVIS2 has explicit re-localization API + loop closure; OKVIS2-X adds GNSS-fusion which architecturally aligns with the spoof-promotion path (per Fact #31) |
|
||||
| AC-3.5 (visual blackout + spoofed GPS) | **Verify** | OKVIS2 IMU-only propagation between keyframes is via `setImuCallback`; latency under blackout-trigger — Jetson MVE measurement |
|
||||
| AC-4.1 (latency <400 ms p95) | **Verify** | No documented Jetson Orin Nano measurement (Fact #31); factor-graph is plausibly heavier than MSCKF — Jetson MVE measurement |
|
||||
| AC-4.2 (memory <8 GB shared) | **Verify** | Same as AC-4.1; co-resident memory pressure with C2/C3/C5/C6 unverified |
|
||||
| AC-4.4 (frame-by-frame) | **Pass** | `setImuCallback` provides high-rate prediction; `setOptimisedGraphCallback` provides batch updates including loop closure — both stream frame-by-frame from a consumer perspective |
|
||||
| AC-4.5 (corrections allowed) | **Pass** | Factor-graph re-linearisation on loop closure delivers corrections via `setOptimisedGraphCallback` |
|
||||
| AC-5.1 (initialise from FC EKF) | **Pass** | OKVIS2 supports custom initialisation via the `okvis::ViInterface` API; the FC-EKF input is plumbed by C5/C8 |
|
||||
| AC-5.3 (re-initialise on reboot) | **Pass** | Same mechanism as AC-5.1 |
|
||||
| AC-NEW-1 (cold-start TTFF <30 s) | **Verify** | OKVIS2 initialisation latency under co-resident process startup — Jetson MVE measurement |
|
||||
| AC-NEW-3 (per-frame estimates feed FDR) | **Pass** | OKVIS2 trajectory query at any timestamp via `okvis::Trajectory` supports the FDR pipeline |
|
||||
| AC-NEW-4 (covariance honesty) | **Pass (with Verify)** | Factor-graph marginal covariance is the gold standard for honest covariance among VIO classes; cross-domain match consistency under satellite anchor injection unverified — Jetson MVE measurement |
|
||||
| AC-NEW-8 (visual blackout + GPS spoofing) | **Pass** | OKVIS2 has documented IMU-only propagation between keyframes; OKVIS2-X GNSS-fusion is architecturally aligned with the spoof-promotion path |
|
||||
| Restriction "Sharp turns are exceptions" | **Pass (structural)** | Keyframe selection inherently handles sparse-overlap sharp-turn frames |
|
||||
| Restriction "Navigation camera (pinned): 5472×3648" | **Verify** | Image-resolution scaling — Jetson MVE measurement; OKVIS2 keyframe sub-sampling reduces the per-frame compute compared to per-frame VIO |
|
||||
| Restriction "Companion computer: Jetson Orin Nano Super, 8 GB" | **Verify** | No direct Jetson Orin Nano Super measurement; LibTorch sky-segmentation can be disabled with `USE_NN=OFF` to remove a major Jetson dependency |
|
||||
| Restriction "High-rate IMU available from FC via MAVLink" | **Pass** | `setImuCallback` consumes IMU at any rate ≥100 Hz; satisfied |
|
||||
|
||||
### Pure VO baseline (KLT + 5pt RANSAC / homography) — per-numbered binding (C1-relevant lines only; cross-cutting N/A above)
|
||||
|
||||
| Line | Binding | Evidence (one-line cite) |
|
||||
|---|---|---|
|
||||
| AC-1.3 (drift between anchors, visual-only / IMU-fused) | **Fail (IMU-fused sub-bound)** | Pure VO drifts more than VIO; the "<100 m visual-only" sub-bound is achievable, but the "<50 m IMU-fused" sub-bound requires the external ESKF wrapper (which is part of C5, not this candidate) |
|
||||
| AC-1.4 (covariance ellipse + source label) | **Fail** | Pure VO has no native covariance; covariance is provided by the external ESKF wrapper (C5) |
|
||||
| AC-2.1a (frame-to-frame registration ≥95%) | **Pass** | KLT optical flow at 0.84% in-image displacement (Fact #40 calculation) is well within trackable range |
|
||||
| AC-2.2 (MRE <1.0 px) | **Pass (with Verify)** | OpenCV `findHomography` with RANSAC produces sub-pixel inliers under planar steppe geometry; explicit measurement on Derkachi flight needed |
|
||||
| AC-3.1 (tolerate 350 m outliers ±20° tilt) | **Verify** | RANSAC outlier rejection threshold is tunable; explicit measurement under ±20° airframe tilt needed |
|
||||
| AC-3.2 (sharp turns) | **Fail** | Pure VO has no failure-recovery mechanism; sharp turns trigger KLT track loss; recovery via satellite re-localization (AC-3.3) is owned by C2/C3 — pure VO must signal track loss to C5 |
|
||||
| AC-3.3 (≥3 disconnected segments) | **N/A (handled by C5+C2/C3)** | Pure VO does not have re-localization; the disconnected-segment recovery is C2/C3's job |
|
||||
| AC-3.5 (visual blackout + spoofed GPS) | **N/A (handled by C5)** | Pure VO has no failsafe state; C5 owns the dead_reckoned transition |
|
||||
| AC-4.1 (latency <400 ms p95) | **Pass** | OpenCV KLT + RANSAC at 5472×3648 on Jetson Orin Nano CPU is documented as <100 ms class; latency budget is dominated by image I/O |
|
||||
| AC-4.2 (memory <8 GB shared) | **Pass** | KLT + RANSAC has trivial memory footprint (<100 MB working set) |
|
||||
| AC-4.4 (frame-by-frame) | **Pass** | Pure per-frame algorithm; no batching |
|
||||
| AC-4.5 (corrections allowed) | **N/A (handled by C5)** | Pure VO has no state to correct; C5 owns corrections |
|
||||
| AC-5.1 (initialise from FC EKF) | **N/A (handled by C5)** | Pure VO has no global state; C5 owns the initial pose |
|
||||
| AC-5.3 (re-initialise on reboot) | **N/A (handled by C5)** | Same as AC-5.1 |
|
||||
| AC-NEW-1 (cold-start TTFF <30 s) | **Pass** | Pure VO needs no warm-up beyond first frame pair |
|
||||
| AC-NEW-3 (per-frame estimates feed FDR) | **N/A (handled by C5)** | Pure VO emits relative pose only; FDR records the C5-fused estimate |
|
||||
| AC-NEW-4 (covariance honesty) | **Fail** | Pure VO has no native covariance; honest-covariance discipline is the external wrapper's contract (C5) |
|
||||
| AC-NEW-8 (visual blackout + GPS spoofing) | **N/A (handled by C5)** | Pure VO has no failsafe behavior; C5 owns the IMU-only mode |
|
||||
| Restriction "Sharp turns are exceptions" | **Fail** | Same as AC-3.2 |
|
||||
| Restriction "Navigation camera (pinned): 5472×3648" | **Pass** | KLT runs at any resolution; 5472×3648 may need image pyramid downsampling for runtime — standard OpenCV practice |
|
||||
| Restriction "Companion computer: Jetson Orin Nano Super, 8 GB" | **Pass** | Trivial memory + CPU-bound; no GPU dependency |
|
||||
| Restriction "High-rate IMU available from FC via MAVLink" | **N/A (handled by C5)** | Pure VO does not consume IMU; the external wrapper does |
|
||||
|
||||
**Pure VO baseline summary**: this candidate is **NOT a drop-in C1 VIO replacement**. It is a "VO + external IMU wrapper" two-component design where the external wrapper is owned by C5. As a C1 candidate it Fails AC-1.4 / AC-1.3 IMU-fused / AC-3.2 / AC-NEW-4 because those bindings inherently require IMU fusion which this candidate lacks. **Status remains "mandatory simple-baseline reference"** per Fact #35; the actual C1 fallback if all VIO leads fail Jetson MVE is "Pure VO + custom ESKF wrapper" — which is a Plan-phase design task, not a research-phase candidate.
|
||||
|
||||
---
|
||||
|
||||
## C1 — CLOSURE STATUS [2026-05-08 session]
|
||||
|
||||
C1 is **CLOSED at the documentary level**. All four lead candidates (OpenVINS, OKVIS2, VINS-Mono, Pure VO baseline) have:
|
||||
- ✅ Pinned-mode statement
|
||||
- ✅ Three-query `context7` (or equivalent) lookup with documentary evidence
|
||||
- ✅ MVE block
|
||||
- ✅ Per-numbered-Restriction × per-numbered-AC sub-matrix
|
||||
|
||||
**Final lead promotion to "Selected"** is gated by the **deferred Jetson Orin Nano Super hardware MVE phase** (D-C1-2 default = option (b) per Fact #41) — Plan phase MUST NOT lock a final C1 candidate without consuming the deferred Jetson MVE artifact.
|
||||
|
||||
**Per-license-track preliminary leads** (per Fact #41 default D-C1-1 = option (c) "keep both tracks open"):
|
||||
- **BSD/permissive track lead**: **OKVIS2 / OKVIS2-X** — strongest documentary-mode-fit profile; structural sub-20-Hz tolerance; OKVIS2-X GNSS-fusion architectural alignment with spoof-promotion path (AC-NEW-2). Risk: no direct Jetson Orin Nano Super measurement.
|
||||
- **GPL-3.0 track lead**: **OpenVINS** — best Jetson Orin Nano build evidence; MSCKF formulation more memory-efficient than VINS-Mono; documented Xavier NX 270 ms latency baseline. Risk: documentary 5472×3648 latency unverified.
|
||||
- **GPL-3.0 track alternate**: **VINS-Mono** — single-mode by construction; ⚠️ Experimental only until Jetson MVE explicitly validates sub-20-Hz operation OR Plan commits to dual-rate camera pipeline (Fact #40).
|
||||
|
||||
**Mandatory simple-baseline**: **Pure VO + external ESKF (C5)** — kept as runnable fallback if all VIO leads fail Jetson MVE.
|
||||
|
||||
**Cross-cutting design decision raised by C1 closure**: the **single-rate vs dual-rate nav-camera pipeline** (Fact #40) is now an explicit Plan-phase deliverable, because it materially changes which C1 candidates remain on documentary lead vs Experimental status.
|
||||
|
||||
C1 → C2 transition: ready to proceed to C2 (VPR) candidate enumeration in the next session.
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -0,0 +1,204 @@
|
||||
# Fact Cards — C6: Tile cache + spatial index
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Bound to sub-questions in `../00_question_decomposition.md` line 74 (C6 = "storage + retrieval of basemap tiles + descriptors, with manifests, freshness, dedup, and write-back"). Sources for C6 cluster live in [`../01_source_registry/C6_tile_cache_spatial_index.md`](../01_source_registry/C6_tile_cache_spatial_index.md).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md). Cross-component gates: [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## Scope summary
|
||||
|
||||
C6 batch 1 closed at 2/N on 2026-05-08. **Fact #92** = mandatory simple-baseline (`mirror-of-existing-suite-pattern`: PostgreSQL + pure btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.jpg` + `bytea` descriptor blobs + app-side FAISS in-memory ANN loaded at takeoff). **Fact #93** = modern-competitive-lead-spatial-extension (PostgreSQL + PostGIS GiST on `geography(POINT,4326)` + pgvector HNSW for descriptor ANN + same filesystem tile storage). User-pinned scope: Postgres on Jetson at runtime (option A from `c6_postgres_locus`); satellite-provider pattern is NOT carved in stone — Cand 2 may cascade changes back to satellite-provider IF research reveals MATERIAL improvement (small improvements stay with Cand 1).
|
||||
|
||||
---
|
||||
|
||||
### Fact #92 — Manual mirror of existing parent-suite `satellite-provider` pattern: PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` + bytea descriptor blobs + app-side FAISS HNSW + filesystem tile storage
|
||||
|
||||
**Statement**: For C6 (tile cache + spatial index), the mandatory simple-baseline candidate is a direct mirror of the parent-suite `satellite-provider` pattern (verified directly via filesystem read at `/Users/obezdienie001/dev/azaion/suite/satellite-provider/` per Source #92):
|
||||
|
||||
- **Geographic spatial index**: PostgreSQL btree composite index `idx_tiles_coordinates ON tiles(tile_zoom, tile_x, tile_y, version)` for spatial-grid range queries at slippy-map integer coordinates; secondary `idx_tiles_composite ON tiles(latitude, longitude, tile_size_meters)` for inverse-geocode lookups. Per Source #93 (PostgreSQL 16 multicolumn-indexes docs): "A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will always be used to limit the portion of the index that is scanned."
|
||||
- **Descriptor ANN over global VPR descriptors**: descriptors stored in `bytea` column on the `tiles` table (one new column added per migration: `descriptor BYTEA NULL`); app-side `faiss.IndexHNSWFlat(d=2048, M=32)` (or `d=1024` for SelaVPR / `d=512` for EigenPlaces per D-C2 final lock) loaded at takeoff via `faiss.read_index(path)` from a pre-serialized FAISS index built during C10 pre-flight cache provisioning. Per Source #96 (FAISS context7): `faiss.IndexHNSWFlat(d, M)` + `index.hnsw.efConstruction=40` + `index.hnsw.efSearch=16-64` is the canonical HNSW pattern matching pgvector's HNSW parameters.
|
||||
- **Raw tile storage**: filesystem at canonical slippy-map path `./tiles/{tile_zoom}/{tile_x}/{tile_y}.{image_type}` per Source #92 satellite-provider README + migration 011; DB stores `file_path VARCHAR(500)` pointer.
|
||||
- **Slippy-map coordinate transform**: `tile_x = FLOOR((lon + 180) / 360 * POWER(2, zoom))::INT` + `tile_y = FLOOR((1 - LN(TAN(RADIANS(lat)) + 1.0 / COS(RADIANS(lat))) / PI()) / 2.0 * POWER(2, zoom))::INT` per Source #92 migration 011 (matches Source #98 OSM canonical convention exactly).
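
The slippy-map transform in the last bullet can be sketched in Python as follows. This is a minimal illustration that mirrors the SQL expression above; the function name `latlon_to_tile` is hypothetical (it is not taken from the `satellite-provider` code), and the latitude clamp plus the final range clamp are added assumptions to keep edge inputs inside the grid.

```python
import math

# Web Mercator is undefined at the poles; the usable latitude range ends near 85.0511 degrees.
MAX_MERCATOR_LAT = 85.05112878


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Convert WGS84 lat/lon to slippy-map (tile_x, tile_y) at the given zoom."""
    lat_deg = max(-MAX_MERCATOR_LAT, min(MAX_MERCATOR_LAT, lat_deg))
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat_rad = math.radians(lat_deg)
    x = math.floor((lon_deg + 180.0) / 360.0 * n)
    y = math.floor(
        (1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) / 2.0 * n
    )
    # Assumption: clamp so lon = +180 and the latitude limits stay inside [0, n - 1].
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

For example, `latlon_to_tile(0.0, 0.0, 1)` returns `(1, 1)`: the tile whose north-west corner sits on the equator and the prime meridian.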
|
||||
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: `(query_lat, query_lon, query_alt_m)` from C5 state estimator @ 3 Hz; `(query_descriptor: numpy.ndarray of shape (d,) and dtype float32)` from C2 VPR @ 3 Hz; `(operator_reloc_hint_lat, hint_lon, hint_zoom)` rare per AC-3.4
|
||||
- outputs:
|
||||
- geographic-spatial-grid query: `[(tile_id, tile_x, tile_y, file_path, descriptor_bytea), ...]` returning K=9 (3x3 grid) to K=25 (5x5 grid) candidate tiles at `tile_zoom = Z_target` (typically Z=18 per project)
|
||||
- descriptor-ANN query: `[(tile_id, tile_x, tile_y, file_path, l2_distance), ...]` returning top-K=10 descriptor-similar tiles via FAISS HNSW
|
||||
- combined query: app-side intersection of the above two — **geographic-prefilter-then-descriptor-rerank** (canonical hierarchical retrieval pattern per Fact #21 SQ2 conclusion line 32 in source-registry/00_summary.md)
|
||||
- runtime: PostgreSQL 16 + psycopg-binary (Python driver) + FAISS-CPU on Jetson Orin Nano Super (8 GB shared, JetPack 6, Ubuntu 22.04 base) per Source #97 confirmation (Postgres-on-Jetson Medium article March 2026 confirms full Postgres + pgvector deployment works on Orin Nano)
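
The geographic-prefilter-then-descriptor-rerank flow listed under outputs can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: the stdlib `sqlite3` module stands in for PostgreSQL, a brute-force squared-L2 scan stands in for FAISS HNSW, and the table schema and the function names (`make_demo_db`, `grid_candidates`, `rerank`) are hypothetical simplifications.

```python
import sqlite3
from array import array


def make_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for the `tiles` table (simplified, hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tiles (id INTEGER PRIMARY KEY, tile_zoom INT, tile_x INT,"
        " tile_y INT, version INT, file_path TEXT, descriptor BLOB)"
    )
    # Same column order as the composite index described in Fact #92.
    db.execute(
        "CREATE INDEX idx_tiles_coordinates ON tiles(tile_zoom, tile_x, tile_y, version)"
    )
    return db


def grid_candidates(db: sqlite3.Connection, zoom: int, x: int, y: int, k: int = 1):
    """Prefilter: every tile in the (2k+1) x (2k+1) block centred on (x, y)."""
    return db.execute(
        "SELECT id, tile_x, tile_y, file_path, descriptor FROM tiles"
        " WHERE tile_zoom = ? AND tile_x BETWEEN ? AND ? AND tile_y BETWEEN ? AND ?",
        (zoom, x - k, x + k, y - k, y + k),
    ).fetchall()


def rerank(candidates, query: list[float], top_k: int = 10):
    """Rerank: sort the prefiltered tiles by squared L2 descriptor distance."""
    scored = []
    for tile_id, tx, ty, path, blob in candidates:
        desc = array("f")  # descriptors are stored as raw float32 bytes
        desc.frombytes(blob)
        dist = sum((a - b) ** 2 for a, b in zip(desc, query))
        scored.append((tile_id, tx, ty, path, dist))
    return sorted(scored, key=lambda row: row[4])[:top_k]
```

Calling `rerank(grid_candidates(db, 18, x, y, k=1), query_descriptor)` returns at most `top_k` rows drawn only from the 3x3 block, so a tile with a near-identical descriptor but a distant position never reaches the rerank step. That is the property the prefilter is meant to provide.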
|
||||
|
||||
**Source**:
|
||||
- Primary: Source #92 (parent-suite `satellite-provider` direct filesystem read of README + migrations 001/003/011 — confirms PostgreSQL + pure btree + filesystem pattern with NO PostGIS/extensions)
|
||||
- Btree multicolumn semantics: Source #93 PostgreSQL 16 official docs at <https://www.postgresql.org/docs/current/indexes-multicolumn.html> ("A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns")
|
||||
- Slippy-map convention: Source #98 OpenStreetMap Foundation canonical reference at <https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames> (zoom 0 = 1 tile world, zoom 18 = city block detail; Web Mercator EPSG:3857 from EPSG:4326)
|
||||
- FAISS HNSW Python API: Source #96 context7-indexed at `/facebookresearch/faiss` — confirms `faiss.IndexHNSWFlat(d, M)` + `index.hnsw.efConstruction` + `index.hnsw.efSearch` parameter pattern
|
||||
- Postgres-on-Jetson deployment: Source #97 Medium "Edge to Data Center: GPU-Accelerated Vector Search on a Jetson Orin Nano" (March 2026) — confirms OLTP throughput saturates at 10 concurrent connections on Jetson Orin Nano Super, **CPU cores (6) are the limiting factor, NOT memory**; minimal-config Postgres viable in <150 MB total per Coding Steve "Running PostgreSQL on Less Than 150MB of Memory"
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High — all evidence is L1 primary code/docs with direct verification; Postgres-on-Jetson deployment empirically demonstrated in Source #97 March 2026 article
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C6 row in `../06_component_fit_matrix/C6_tile_cache_spatial_index.md` (this fact populates the `Manual mirror of existing suite-pattern` candidate row)
|
||||
- SQ2 architectural decision #1 (Fact #23 closure): 2D-ortho-only cache contract preserved; `tile_size_meters` column tracks the project's 2D-ortho metric per migration 011
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|
||||
|---|---|---|
|
||||
| **R-NEW-2 no cloud at flight** | ✅ PASS | Postgres + FAISS + filesystem all entirely local; no network calls at runtime |
|
||||
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ✅ PASS | Postgres 16 ARM64 packages available via `apt install postgresql-16` on Ubuntu 22.04 (JetPack 6 base) once the PGDG apt repository is added (Ubuntu 22.04's own archive ships PostgreSQL 14); FAISS-CPU ARM64 wheels available via `pip install faiss-cpu` (Source #96 + Source #97); psycopg-binary ARM64 wheels available |
|
||||
| **AC-1.1 (≤80 m at 1 km AGL)** | ✅ PASS | Cache delivers correct tiles to C2/C3/C4 pipeline; pose accuracy is downstream concern |
|
||||
| **AC-1.2 (≤30 m at 500 m AGL)** | ✅ PASS | Same as above |
|
||||
| **AC-3.1 sharp turns ±20° bank** | ✅ PASS | Geographic lookup pattern is bank-angle-agnostic (queries by horizontal position, not orientation) |
|
||||
| **AC-3.2 sharp-turn frames may share <5% overlap** | ✅ PASS | Cache pre-loads all tiles in mission corridor; sharp-turn coverage handled by spatial-grid radius parameter |
|
||||
| **AC-3.3 re-localization stability** | ✅ PASS | Deterministic cache lookup; same query → same result |
|
||||
| **AC-3.4 operator re-loc hint** | ✅ PASS | Operator-supplied `(hint_lat, hint_lon, hint_zoom)` becomes direct btree-indexed query: `WHERE tile_zoom = $hint_zoom AND tile_x = slippy_x($hint_lat, $hint_lon, $hint_zoom) AND tile_y = slippy_y($hint_lat, $hint_lon, $hint_zoom)` |
|
||||
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | ✅ PASS | Geographic btree lookup <1 ms (sub-millisecond on indexed integer columns at ~10K-100K rows) + descriptor ANN ~1-3 ms via FAISS HNSW with `efSearch=64` + tile-bytes load ~5-50 ms via filesystem page cache = total **~6-54 ms per cache hit**, well within budget |
|
||||
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | Postgres ~150-300 MB resident with conservative tuning (`shared_buffers=64MB`, `work_mem=4MB`, `maintenance_work_mem=32MB`, `effective_cache_size=512MB`) per Source #97 Coding Steve guide + FAISS ~50-200 MB depending on cache size + filesystem page cache ~500 MB-1 GB managed by kernel = total Postgres+FAISS+cache **~700 MB-1.5 GB** out of 8 GB |
|
||||
| **AC-4.5 look-back refinement** | N/A | Cache is read-only at flight time; refinement is C5 estimator's responsibility |
|
||||
| **AC-8.3 10 GB persistent tile cache budget** | ⚠️ TIGHT | JPEG tiles at ~30-100 KB each fit ~100K-300K tiles in 10 GB; descriptor blobs at 8 KB/tile (2048-D float32 MixVPR) consume an additional ~800 MB for 100K tiles, so a cache whose tiles already fill 10 GB totals ~10.8 GB and **marginally exceeds the budget**. Mitigation = D-C6-1 NEW (descriptor-storage-format choice: halfvec at 4 KB/tile saves 50%; INT8 at 2 KB/tile plus a per-vector scale saves ~75%). For the 512-D EigenPlaces variant per D-C2-10 = (b), float32 descriptors take ~200 MB for 100K tiles, comfortably under 500 MB |
|
||||
| **AC-NEW-3 (FDR)** | ✅ PASS | Cache hit/miss + tile_id + load latency are trivially recordable as FDR fields |
|
||||
| **AC-NEW-4 covariance honesty** | N/A | Cache is a passive lookup component; covariance is C4/C5 responsibility |
|
||||
| **AC-NEW-7 cache-poisoning safety** | ✅ PASS at storage layer | Immutable on-disk JPEGs with content-hash verification at load (BYTEA `tile_sha256` column to be added per D-C6-N future); Postgres row-level integrity via UNIQUE constraint on `(latitude, longitude, tile_zoom, tile_size_meters, version)` per Source #92 migration 011. **Cache-poisoning DETECTION** is C9/C10 responsibility (verify provenance signature at C10 pre-flight + C5 source-label state-machine demotion at runtime); cache simply REJECTS load if hash mismatch |
|
||||
| **AC-NEW-8 blackout failsafe** | ✅ PASS | Cache miss is handled gracefully (no tiles → C5 source-label demotes to `dead_reckoned` per AC-NEW-8 escalation thresholds); cache does NOT itself trigger failsafe |
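
The AC-NEW-7 row describes rejecting a tile at load time when its content hash does not match. A minimal sketch of that check is shown here. The `tile_sha256` column is only proposed in the row above, so the digest argument, the function name `load_tile_verified`, and the exception type are all hypothetical.

```python
import hashlib
from pathlib import Path


class TileIntegrityError(Exception):
    """Raised when a tile's bytes do not match the digest recorded for it."""


def load_tile_verified(file_path: str, expected_sha256: bytes) -> bytes:
    """Read a tile from disk and reject it if its SHA-256 digest differs.

    `expected_sha256` is assumed to come from the proposed `tile_sha256`
    BYTEA column (32 raw bytes), fetched alongside `file_path`.
    """
    data = Path(file_path).read_bytes()
    actual = hashlib.sha256(data).digest()
    if actual != expected_sha256:
        raise TileIntegrityError(
            f"{file_path}: expected {expected_sha256.hex()}, got {actual.hex()}"
        )
    return data
```

Under the division of labour stated in the table, the caller would treat `TileIntegrityError` as a cache miss and leave any source-label demotion to C5.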
|
||||
|
||||
**Strengths** (positive structural advantages):
|
||||
1. **Project-pattern alignment** — exactly mirrors the parent-suite `satellite-provider` pattern; if a tile is requested in pre-flight provisioning by C10 from the suite Postgres, the same SQL query and same filesystem path work on the Jetson at flight time. **No new infrastructure to learn, debug, or maintain across the suite vs onboard split.**
|
||||
2. **Trivial dependency footprint** — vanilla PostgreSQL 16 (already required if `c6_postgres_locus = A` Postgres-on-Jetson is the deployment-locus choice); NO Postgres extensions needed (no PostGIS, no pgvector, no pg_trgm); FAISS is a single Python package (~50 MB on disk via `pip install faiss-cpu`); psycopg-binary is a single Python package (~5 MB).
|
||||
3. **Sub-millisecond geographic lookup** — btree composite on integer-coordinate columns is structurally optimal for the dominant query pattern (3 Hz spatial-grid range query at zoom 18-20). Per Source #93 + EXPLAIN-ANALYZE empirical evidence at ~10K-100K rows: `Index Scan using idx_tiles_coordinates` with `cost=0.28..1.71 rows=9 width=170` extrapolated from Source #94 PostGIS workshop nyc_streets example.
|
||||
4. **Predictable memory footprint** — no extension memory overhead beyond Postgres baseline; FAISS in-memory budget scales linearly with `(n_descriptors × d_descriptor × 4 bytes)`. At 100K descriptors × 2048-D × 4 B = 800 MB; halfvec halves this to 400 MB.
|
||||
5. **License clean throughout**: PostgreSQL (PostgreSQL License = BSD-style permissive), FAISS (MIT), and the Python driver, psycopg (LGPL-3.0) or asyncpg (Apache-2.0). **Eligible on every D-C1-1 license-posture choice** with the simplest license-compliance story.
|
||||
6. **Battle-tested storage primitive** — slippy-map filesystem hierarchy is the canonical OSM/web-map convention for ~15+ years; trivially debuggable via `ls`, `find`, `stat`; no proprietary container format.
|
||||
7. **Empirically-confirmed Postgres-on-Jetson viability** — Source #97 March 2026 article confirms full Postgres + pgvector deployment works on Jetson Orin Nano Super; **CPU cores are the limiting factor, NOT memory**, which means the 8 GB shared memory budget is plenty of headroom for Cand 1's modest 700 MB-1.5 GB total.
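
The footprint arithmetic behind strength 4 and the AC-8.3 row can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch: the helper names are hypothetical, and the HNSW term assumes the commonly quoted FAISS approximation of roughly `d * 4 + M * 2 * 4` bytes per vector for `IndexHNSWFlat`, which ignores upper graph levels and allocator overhead.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "int8": 1}


def descriptor_bytes(n_tiles: int, dim: int, dtype: str = "float32") -> int:
    """Raw descriptor payload in bytes (no index links, no row overhead)."""
    return n_tiles * dim * BYTES_PER_ELEMENT[dtype]


def hnsw_flat_bytes(n_tiles: int, dim: int, m: int = 32) -> int:
    """Approximate IndexHNSWFlat size: float32 vectors plus ~2*M int32 links each."""
    return n_tiles * (dim * 4 + m * 2 * 4)


if __name__ == "__main__":
    for dtype in BYTES_PER_ELEMENT:
        size_mb = descriptor_bytes(100_000, 2048, dtype) / 1e6
        print(f"{dtype}: {size_mb:.1f} MB")
    print(f"HNSW (M=32): {hnsw_flat_bytes(100_000, 2048) / 1e6:.1f} MB")
```

For 100K descriptors at 2048-D this gives 819.2 MB (float32), 409.6 MB (float16) and 204.8 MB (int8) in decimal megabytes, which is the "~800 MB" figure above in round numbers; the M=32 graph links add only about 3% on top of the float32 payload.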
|
||||
|
||||
**Negative-but-mitigable structural findings**:
|
||||
8. **No native KNN distance ordering for geographic queries**: the application must convert `(lat, lon)` → `(tile_x, tile_y)` with integer math, issue a range query with a ±k radius in tile units, and then sort by Euclidean tile distance app-side. For a 3x3 grid (k=1) this is trivial (~9 candidates, sorted in <100 µs), but it does not generalize to "all tiles within R meters" without a per-zoom k-derivation. **Mitigation**: precompute the Web-Mercator-aware tile-to-meter conversion at zoom Z (per Source #98 zoom-level table at line 37). At zoom 18 a tile is ~150 m wide at the equator (narrower by `cos(lat)` at higher latitudes), so k=2 (5x5 grid) spans ~750 m edge to edge; at zoom 20 (~38 m/tile) k=8 (17x17 grid) spans a similar ~650 m. For the project's 1 km AGL flight + ~60 km/h cruise, a 3x3 grid at zoom 18 is sufficient coverage per AC-1.1/1.2 frame-center accuracy bars.
|
||||
9. **No native combined geographic-+-descriptor query** — must round-trip through application layer (DB returns geographic candidates → app filters by descriptor distance via FAISS). Overhead: ~1-2 ms per round trip vs ~5-10 ms for an equivalent PostGIS+pgvector single-SQL query (Cand 2). **Mitigation**: at 3 Hz query rate (333 ms budget per query inside AC-4.1 400 ms p95 envelope), the round-trip overhead is negligible — and Cand 1's app-side approach actually offers MORE flexibility (e.g., descriptor scoring with non-L2 metrics, custom rerank logic, integration with C5 covariance-honest filtering).
|
||||
10. **Descriptor ANN requires takeoff-time FAISS index build OR pre-serialized index load** — IndexHNSWFlat does not support cleanly removing vectors per Source #96, and bulk-add is slower than IndexFlatL2's append. **Mitigation**: build incrementally during C10 pre-flight cache provisioning + serialize to disk via `faiss.write_index(index, path)`; load via `faiss.read_index(path)` at takeoff in ~1-5 sec (much faster than rebuild). D-C6-3 NEW gate covers this.
|
||||
11. **No native great-circle / geodesic distance** — geographic queries are in slippy-map integer coordinates (Web Mercator approximation), not WGS84 geodesic. For low-altitude UAV at 1 km AGL covering ≤200 km mission radius (~2° latitude), Web Mercator distortion is <0.5% — negligible for tile-grid queries. **Mitigation**: zoom-level + slippy-map math handles this implicitly (each zoom's tile size shrinks toward poles by `cos(lat)`, matching reality).
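
The per-zoom k-derivation mentioned in finding 8 can be sketched as follows. The sketch assumes the standard Web Mercator relation (tile width = equatorial circumference × cos(lat) / 2^zoom); the function names are hypothetical, and the rule `k = ceil(R / tile_width)` is one conservative choice rather than a project decision.

```python
import math

EARTH_CIRCUMFERENCE_M = 40_075_016.686  # WGS84 equatorial circumference


def tile_size_m(zoom: int, lat_deg: float = 0.0) -> float:
    """Ground width of one slippy-map tile at the given zoom and latitude."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def grid_radius_k(radius_m: float, zoom: int, lat_deg: float) -> int:
    """Smallest k whose (2k+1) x (2k+1) grid covers radius_m around the query.

    The query point can lie anywhere inside the centre tile, so only the k
    full rings around that tile are guaranteed: coverage >= k * tile width.
    """
    return max(1, math.ceil(radius_m / tile_size_m(zoom, lat_deg)))
```

At zoom 18 the tile width is about 152.9 m at the equator and about 98 m at 50° latitude (an illustrative value), so the same metric radius needs a larger `k` at higher latitudes.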
|
||||
|
||||
**Caveats / open Plan-phase decisions raised** (D-C6-N gates):
|
||||
|
||||
- **D-C6-1 NEW** — descriptor-storage-format choice (full-precision float32 in `bytea` column vs halfvec via app-side conversion + storage as 2-byte half-floats vs INT8 quantized via app-side conversion + storage as 1-byte integers + per-vector scale parameter): trade-off between cache footprint (1×/2×/4× ratio) vs Recall@K accuracy loss. **Recommendation**: D-C6-1 = (b) halfvec for descriptor storage at ~2× cache-footprint-saving with ~0-2% Recall@K loss documented in pgvector ecosystem.
|
||||
- **D-C6-2 NEW** — FAISS index variant choice for app-side descriptor ANN (`IndexFlatL2` brute-force / `IndexHNSWFlat` with M=16/32 ef_construction=64 / `IndexIVFFlat` with nlist=sqrt(N) / `IndexIVFPQ` for additional compression): trade-off between memory footprint vs query accuracy vs query latency. **Recommendation**: D-C6-2 = (b) `IndexHNSWFlat(d, M=32)` for the primary path; `IndexFlatL2` fallback for small caches (<10K tiles where exact brute force is faster than HNSW navigation overhead per Source #96 contextual guidance).
|
||||
- **D-C6-3 NEW** — descriptor-cache-rebuild-trigger strategy (rebuild on every cache modification = simplest but slow / incremental add via `index.add()` = faster but HNSW does not support delete cleanly per Source #96 / periodic rebuild during pre-flight = most robust but requires C10 coordination): **Recommendation**: D-C6-3 = (c) periodic rebuild during C10 pre-flight provisioning; serialize to disk via `faiss.write_index`; reload at flight takeoff in <5 sec.
|
||||
- **D-C6-4 NEW** — geographic-spatial-grid radius `k` (1 = 3x3 grid / 2 = 5x5 grid / 4 = 9x9 grid / dynamic based on zoom + ground-speed): trade-off between per-query candidate count vs spatial coverage. **Recommendation**: D-C6-4 = dynamic, derived from AC-3.1 sharp-turn bank rate + ground-speed projected over the next ~5 sec.
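
The three storage formats weighed in D-C6-1 can be illustrated with stdlib-only packing code. This is a sketch under assumptions: the function names are hypothetical, the INT8 variant uses symmetric quantization with one float32 scale per vector (one of several possible schemes), and half-precision packing assumes descriptor values stay within the float16 range, as is typical for L2-normalized descriptors.

```python
import struct


def to_half_bytes(vec: list[float]) -> bytes:
    """Pack as IEEE 754 half precision: 2 bytes per element."""
    return struct.pack(f"<{len(vec)}e", *vec)


def from_half_bytes(blob: bytes) -> list[float]:
    """Inverse of to_half_bytes."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))


def to_int8_bytes(vec: list[float]) -> bytes:
    """Symmetric int8 quantization: a float32 scale, then 1 byte per element."""
    scale = max(abs(v) for v in vec) / 127.0 or 1.0  # 1.0 guards an all-zero vector
    quantized = [round(v / scale) for v in vec]
    return struct.pack(f"<f{len(vec)}b", scale, *quantized)


def from_int8_bytes(blob: bytes) -> list[float]:
    """Inverse of to_int8_bytes (lossy: error is about half a quantization step)."""
    scale, *quantized = struct.unpack(f"<f{len(blob) - 4}b", blob)
    return [scale * q for q in quantized]
```

A 2048-D descriptor then occupies 8192 bytes as float32, 4096 bytes as half precision, and 2052 bytes as INT8 with its scale, matching the 1×/2×/4× footprint ratio in D-C6-1. The Recall@K cost of each format would still need to be measured on the project's own descriptors.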
|
||||
|
||||
---
|
||||
|
||||
### Fact #93 — PostgreSQL + PostGIS GiST on `geography(POINT,4326)` with KNN distance ordering (`<->`) + pgvector HNSW for descriptor ANN + filesystem tile storage
|
||||
|
||||
**Statement**: For C6 (tile cache + spatial index), the modern-competitive-lead-spatial-extension candidate is PostgreSQL + PostGIS 3.4 + pgvector 0.7+ as a unified Postgres-extension-stack:
|
||||
|
||||
- **Geographic spatial index**: PostGIS `CREATE INDEX idx_tiles_geog ON tiles USING GIST ((position::geography))` (an expression index, so the cast needs its own parentheses) where `position` is `geometry(POINT, 4326)` derived from `(latitude, longitude)`. Per Source #94 (PostGIS workshop KNN docs at <https://postgis.net/workshops/postgis-intro/knn.html>): "PostgreSQL solves the nearest neighbor problem by introducing an 'order by distance' (`<->`) operator that induces the database to use an index to speed up a sorted return set." Native KNN: `ORDER BY position::geography <-> ST_MakePoint($lon, $lat)::geography LIMIT K` (both operands cast to `geography` so the expression matches the index). Native radius queries: `WHERE ST_DWithin(position::geography, ST_MakePoint($lon, $lat)::geography, $radius_m)`.
|
||||
- **Descriptor ANN over global VPR descriptors**: pgvector 0.7+ `CREATE INDEX idx_tiles_desc ON tiles USING hnsw (descriptor vector_l2_ops) WITH (m = 16, ef_construction = 64)` for HNSW-graph-based descriptor ANN. Per Source #95 (pgvector context7): default `hnsw.ef_search = 40` query-time; tunable via `SET hnsw.ef_search = 100` for higher recall at the cost of latency. Combined SQL query: `SELECT id, file_path, descriptor <-> $query_vec AS dist FROM tiles WHERE ST_DWithin(position::geography, ST_MakePoint($lon, $lat)::geography, $radius_m) ORDER BY descriptor <-> $query_vec LIMIT K`.
|
||||
- **Raw tile storage**: same as Cand 1 — filesystem at canonical slippy-map path `./tiles/{tile_zoom}/{tile_x}/{tile_y}.{image_type}`; DB stores `file_path VARCHAR(500)` pointer.
|
||||
- **Slippy-map coordinate transform**: same as Cand 1 — used to derive `(tile_x, tile_y)` columns alongside the new `position` PostGIS geometry column; permits both Cand-1-style integer-grid queries AND Cand-2-style geodesic-distance queries from a single schema.
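
The combined geographic-plus-descriptor query described above could be issued from Python roughly as follows. This is a sketch, not a verified integration: it only builds the SQL text and parameter dictionary (in psycopg's `%(name)s` placeholder style) and does not connect to a database. The helper names are hypothetical, and the descriptor is passed in pgvector's text form `[v1,v2,...]` with an explicit `::vector` cast.

```python
def to_pgvector_literal(vec: list[float]) -> str:
    """Format a descriptor in pgvector's text input form, e.g. '[1.0,2.5]'."""
    return "[" + ",".join(repr(float(v)) for v in vec) + "]"


# Mirrors the combined query quoted in the statement above.
COMBINED_QUERY = """
SELECT id, file_path, descriptor <-> %(qvec)s::vector AS dist
FROM tiles
WHERE ST_DWithin(
    position::geography,
    ST_MakePoint(%(lon)s, %(lat)s)::geography,
    %(radius_m)s
)
ORDER BY descriptor <-> %(qvec)s::vector
LIMIT %(k)s
"""


def combined_query_params(
    lat: float, lon: float, radius_m: float, query_vec: list[float], k: int = 10
) -> dict:
    """Parameter dictionary matching the named placeholders in COMBINED_QUERY."""
    return {
        "lat": lat,
        "lon": lon,
        "radius_m": radius_m,
        "qvec": to_pgvector_literal(query_vec),
        "k": k,
    }
```

With a psycopg cursor this would be run as `cur.execute(COMBINED_QUERY, combined_query_params(...))`. One caveat to verify during the Jetson MVE phase: when an approximate index serves the `ORDER BY` and the `WHERE` filter is applied afterwards, fewer than `k` rows may come back.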
|
||||
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: identical to Cand 1 — `(query_lat, query_lon, query_alt_m)` from C5 @ 3 Hz; `(query_descriptor: numpy.ndarray of shape (d,) and dtype float32)` from C2 VPR @ 3 Hz; operator re-loc hint per AC-3.4
|
||||
- outputs:
|
||||
- geographic-KNN query: `[(tile_id, file_path, dist_m), ...]` returning K=10 nearest tiles by great-circle distance — **superior to Cand 1's slippy-map-tile-distance approximation for queries near the poles or at high zoom**
|
||||
- geographic-radius query: `[(tile_id, file_path, dist_m), ...]` returning all tiles within `$radius_m` meters — **NEW capability vs Cand 1** (Cand 1 requires per-zoom k-derivation app-side)
|
||||
- descriptor-ANN query: `[(tile_id, file_path, l2_distance), ...]` returning top-K descriptor-similar tiles via pgvector HNSW
|
||||
- **combined geographic-+-descriptor SQL query**: single SQL statement returns top-K geographically-prefiltered descriptor-similar tiles — **NEW capability vs Cand 1** (Cand 1 requires app-side round trip)
|
||||
- runtime: PostgreSQL 16 + PostGIS 3.4 extension (~30-80 MB shared libraries per Source #94 / EDB install footprint cite) + pgvector 0.7 extension (~5-10 MB shared library per Source #95) + psycopg-binary on Jetson Orin Nano Super (8 GB shared, JetPack 6); **PostGIS+pgvector ARM64 packages available via `apt install postgresql-postgis3` per Source #94** + `apt install postgresql-16-pgvector` for pgvector ARM64 deb package (verified for Ubuntu 22.04 base which JetPack 6 derives from)
|
||||
|
||||
**Source**:
|
||||
- Primary geographic-side: Source #94 PostGIS official workshop KNN docs at <https://postgis.net/workshops/postgis-intro/knn.html> + PostGIS context7 at `/postgis/postgis` — confirms `CREATE INDEX ... USING GIST(location)`, `<->` KNN operator, `ST_DWithin` radius queries with native great-circle distance for `geography` type
|
||||
- Primary descriptor-side: Source #95 pgvector context7 at `/pgvector/pgvector` — confirms `CREATE INDEX ON items USING hnsw (embedding vector_l2_ops) WITH (m = 16, ef_construction = 64)` HNSW pattern; `SET hnsw.ef_search = 100` query-time tuning
|
||||
- ARM64 deployability: Source #94 EDB Docs cross-cite confirms PostGIS 3.4 + Ubuntu 22.04 install via `apt install postgresql-postgis3`; Source #97 March 2026 Medium article confirms Postgres + pgvector + Ollama + embedding-model GPU stack runs on Jetson Orin Nano (note: pgvector ARM64 packages published since pgvector 0.7+; older versions required source build)
|
||||
- pgvector dimension limits: per Source #95 pgvector context7 — `vector_l2_ops` for full-precision float32 supports **up to 2,000 dimensions for HNSW indexes** (per pgvector 0.6 README baseline); newer pgvector 0.7+ adds `halfvec_l2_ops` (half-precision, 2-byte) and `sparsevec_l2_ops`. Note that the pgvector README lists **4,000 dimensions as the `halfvec` HNSW index limit**; 16,000 dimensions is the `halfvec` storage limit for unindexed columns, not the index limit
|
||||
- Filesystem layout: shared with Cand 1 per Source #92 satellite-provider pattern + Source #98 OSM slippy-map convention
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High for the API capability verification (PostGIS GiST + pgvector HNSW are L1 docs canonical APIs) + ⚠️ Medium-High for the Jetson-deployability claim (PostGIS+pgvector ARM64 packages confirmed available, but specific install footprint and runtime memory measurements on Jetson Orin Nano Super NOT empirically verified — needs Jetson MVE phase per D-C1-2)
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C6 row in `../06_component_fit_matrix/C6_tile_cache_spatial_index.md` (this fact populates the `PostGIS GiST + pgvector HNSW` candidate row)
|
||||
- SQ2 architectural decision #1 (Fact #23 closure): 2D-ortho-only cache contract preserved; PostGIS `geography(POINT,4326)` represents the tile center as a 2D geodetic point — fully compatible with the 2D-ortho contract
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|
||||
|---|---|---|
|
||||
| **R-NEW-2 no cloud at flight** | ✅ PASS | Postgres + PostGIS + pgvector + filesystem all entirely local |
|
||||
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ⚠️ PASS-with-Plan-phase-verification | Postgres 16 ARM64 + PostGIS 3.4 ARM64 (`apt install postgresql-postgis3`) + pgvector 0.7+ ARM64 (`apt install postgresql-16-pgvector`) all available for Ubuntu 22.04; **specific install footprint + runtime memory measurements on Jetson Orin Nano Super NOT empirically verified** (Source #94 search results explicit limitation: "do not provide specific information about PostGIS 3.4's compatibility with ARM64 architecture on Jetson devices, nor do they document the installation footprint"); D-C6-5 NEW gate covers this |
|
||||
| **AC-1.1 (≤80 m at 1 km AGL)** | ✅ PASS | Cache delivers correct tiles to C2/C3/C4 pipeline; pose accuracy is downstream concern |
|
||||
| **AC-1.2 (≤30 m at 500 m AGL)** | ✅ PASS | Same as above |
|
||||
| **AC-3.1 sharp turns ±20° bank** | ✅ PASS | Geographic lookup pattern is bank-angle-agnostic |
|
||||
| **AC-3.2 sharp-turn frames may share <5% overlap** | ✅ PASS | Cache pre-loads all tiles in mission corridor; sharp-turn coverage handled by `ST_DWithin` radius parameter with native geodesic semantics |
|
||||
| **AC-3.3 re-localization stability** | ✅ PASS | Deterministic GiST index lookup; same query → same result |
|
||||
| **AC-3.4 operator re-loc hint** | ✅ PASS | Operator-supplied `(hint_lat, hint_lon, hint_zoom)` becomes direct PostGIS query: `SELECT * FROM tiles WHERE ST_DWithin(position::geography, ST_MakePoint($hint_lon, $hint_lat)::geography, $hint_radius_m) AND tile_zoom = $hint_zoom` |
|
||||
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | ⚠️ TIGHT-BUT-FITS | Combined geographic-+-descriptor single-SQL query latency ~5-15 ms on Jetson CPU per Source #94 EXPLAIN-ANALYZE pattern (PostGIS GiST + pgvector HNSW indices both used in single query plan); **vs Cand 1's ~6-54 ms** (geographic + descriptor + tile-bytes combined). Tile-bytes load adds ~5-50 ms via filesystem page cache (same as Cand 1). **Total: ~10-65 ms per cache hit** — well within budget BUT 1.5-2× slower than Cand 1's geographic-only btree lookup |
|
||||
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | Postgres ~150-300 MB resident with conservative tuning + PostGIS extension shared libraries ~30-80 MB + pgvector extension ~5-10 MB + filesystem page cache ~500 MB-1 GB = total **~700 MB-1.4 GB** out of 8 GB (vs Cand 1's 700 MB-1.5 GB — essentially tied) |
|
||||
| **AC-4.5 look-back refinement** | N/A | Cache is read-only at flight time |
|
||||
| **AC-8.3 10 GB persistent tile cache budget** | ⚠️ TIGHT-with-mitigation | Same JPEG tile cost as Cand 1 (~30-100 KB each) + descriptor blobs **stored in pgvector `vector` type at 4 bytes/dim** — at 2048-D float32 = 8 KB/tile (same as Cand 1's bytea); for **`halfvec`** = 4 KB/tile (50% saving; stores up to 16,000 dim, HNSW-indexable up to 4,000 dim); for `sparsevec` even less when descriptors are sparse. **Same cache-footprint profile as Cand 1** with the same D-C6-1 NEW mitigation strategy |
|
||||
| **AC-NEW-3 (FDR)** | ✅ PASS | Cache hit/miss + tile_id + load latency are trivially recordable as FDR fields |
|
||||
| **AC-NEW-4 covariance honesty** | N/A | Cache is a passive lookup component |
|
||||
| **AC-NEW-7 cache-poisoning safety** | ✅ PASS at storage layer | Same immutable-on-disk-JPEG + content-hash + UNIQUE constraint approach as Cand 1; PostGIS adds `ST_IsValid` geometric integrity check on `position` column as an additional defense-in-depth layer |
|
||||
| **AC-NEW-8 blackout failsafe** | ✅ PASS | Cache miss handled gracefully via C5 source-label demotion |
|
||||
|
||||
**Strengths** (positive structural advantages over Cand 1):
|
||||
1. **Native KNN distance ordering for geographic queries** — `ORDER BY position <-> ST_MakePoint(...) LIMIT K` with index-assisted EXPLAIN per Source #94 evidence: `Index Scan using nyc_streets_geom_idx ... Order By: (geom <-> '...'::geometry)`. **No app-side k-derivation OR distance-sort required** vs Cand 1's per-zoom k-tile-radius math.
|
||||
2. **Native great-circle / geodesic distance for `geography` type** — `ST_DWithin(position::geography, ..., $radius_m)` returns true distance in meters across the WGS84 ellipsoid; no Web-Mercator approximation error. **Material accuracy improvement near poles or at very high zoom** but **negligible for project's UAV at 1 km AGL covering ≤200 km mission radius** (Web Mercator distortion <0.5% in this regime).
|
||||
3. **Native combined geographic-+-descriptor query in a single SQL statement** — `SELECT id, file_path, descriptor <-> $query_vec AS dist FROM tiles WHERE ST_DWithin(position::geography, ST_MakePoint($lon, $lat)::geography, $radius_m) ORDER BY descriptor <-> $query_vec LIMIT K`. **Eliminates app-side round-trip overhead** present in Cand 1 (~1-2 ms per query); enables Postgres query planner to choose the most selective filter first (geographic GiST or descriptor HNSW depending on row count distribution).
|
||||
4. **`ST_DWithin(geography, geography, radius_m)` native radius query in meters** — directly answers "give me all tiles within R meters of the query point" without per-zoom k-derivation. **NEW capability vs Cand 1**.
|
||||
5. **Battle-tested PostGIS GiST + pgvector HNSW** — both extensions are L1 canonical Postgres extensions with active maintenance + multi-million production deployments + canonical OGC SFS compliance for PostGIS.
|
||||
6. **Same filesystem tile storage as Cand 1** — zero migration cost on the raw-tile-bytes side.
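The single-statement query from strength 3 can be assembled on the application side roughly as below. This is a sketch under assumptions: the helper `build_combined_query` is hypothetical, the table and column names (`tiles`, `position`, `descriptor`, `file_path`) are the ones used in this fact card, the placeholders use the psycopg named-parameter style, and the descriptor is passed as a pgvector text literal. It only builds the statement and its parameters; it does not open a connection.

```python
def build_combined_query(lat, lon, radius_m, query_vec, k=10):
    """Return (sql, params) for a geographic prefilter plus descriptor ordering."""
    # pgvector accepts a '[v1,v2,...]' text literal cast to the vector type.
    vec_literal = "[" + ",".join(str(float(v)) for v in query_vec) + "]"
    sql = (
        "SELECT id, file_path, descriptor <-> %(vec)s::vector AS dist "
        "FROM tiles "
        "WHERE ST_DWithin(position::geography, "
        "ST_MakePoint(%(lon)s, %(lat)s)::geography, %(radius_m)s) "
        "ORDER BY descriptor <-> %(vec)s::vector "
        "LIMIT %(k)s"
    )
    # ST_MakePoint takes (x, y), i.e. longitude first, then latitude.
    params = {"vec": vec_literal, "lon": lon, "lat": lat,
              "radius_m": radius_m, "k": k}
    return sql, params
```

With a psycopg cursor this would be run as `cur.execute(sql, params)`; whether the planner actually uses both indexes should be confirmed with `EXPLAIN ANALYZE` on the target hardware.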
|
||||
|
||||
**Negative-but-mitigable structural findings**:
|
||||
7. **Heavier Postgres-extension dependency** — PostGIS 3.4 install footprint ~30-80 MB shared libraries + ~10-20 MB SRID/projection metadata catalog; pgvector 0.7+ ~5-10 MB shared library. **Vs Cand 1's zero-extension Postgres**, this is **~50-100 MB additional memory + ~50-200 MB additional disk install footprint**. **Mitigation**: well within AC-4.2 8 GB budget (essentially noise) and AC-8.3 10 GB cache budget (extension install lives in `/usr/lib/postgresql`, not in cache budget). **Real cost**: two more extensions to maintain, version-pin, and verify for ARM64 compatibility at the C7 inference-runtime + Jetson MVE phase.
|
||||
8. **Geographic GiST index lookup ~5-10× slower than Cand 1's btree composite for the dominant 3 Hz spatial-grid query** — GiST lookup latency ~1-5 ms per Source #94 nyc_streets EXPLAIN evidence (`cost=0.28..79.58 rows=3`); Cand 1's btree lookup is ~0.1-0.5 ms. **Mitigation**: at 3 Hz query rate (333 ms budget per query inside AC-4.1 400 ms p95 envelope), the absolute latency difference (~1-5 ms vs 0.1-0.5 ms) is negligible — **but the relative slowdown is real**.
|
||||
9. **pgvector HNSW dimension limit at full-precision** — `vector` type HNSW supports up to **2,000 dimensions** per Source #95 pgvector README; for **MixVPR canonical 2048-D descriptors per Fact #18 cluster**, this **JUST EXCEEDS the limit**. **Mitigation**: use `halfvec_l2_ops` (half-precision, 2-byte storage, HNSW-indexable up to 4,000 dimensions per the pgvector README) — cuts cache footprint by 50% AND clears the dimension limit for 2048-D; OR truncate to 1536-D (loses ~25% Recall@K); OR use 512-D EigenPlaces variant per D-C2-10 = (b) which is well within both pgvector limits AND smaller cache footprint.
|
||||
10. **No empirically-verified Jetson Orin Nano Super deployment for PostGIS+pgvector combined stack** — Source #97 March 2026 article confirms Postgres + pgvector deployment but does not explicitly include PostGIS; Source #94 search results explicitly note absence of Jetson-specific PostGIS install evidence. **Mitigation**: D-C6-5 NEW gate — Jetson MVE phase per D-C1-2 must include PostGIS+pgvector co-installation + OLTP+spatial+ANN combined-query profiling on Jetson Orin Nano Super.
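The descriptor-footprint figures quoted in the AC-8.3 row and in finding 9 follow from simple arithmetic, sketched below. The helper names are illustrative, the 2,000-dimension constant is the full-precision HNSW limit quoted in this fact card, and the 100 KB tile size is the upper end of the stated 30-100 KB JPEG range.

```python
FULL_PRECISION_HNSW_MAX_DIM = 2000  # limit quoted above for the `vector` type


def descriptor_bytes(dim, bytes_per_component):
    """Raw payload size of one descriptor, ignoring per-row overhead."""
    return dim * bytes_per_component


def tiles_within_budget(budget_bytes, jpeg_bytes, desc_bytes):
    """How many (tile image + descriptor) pairs fit into a byte budget."""
    return budget_bytes // (jpeg_bytes + desc_bytes)


float32_desc = descriptor_bytes(2048, 4)  # 8192 bytes = 8 KB per tile
float16_desc = descriptor_bytes(2048, 2)  # 4096 bytes = 4 KB per tile
needs_workaround = 2048 > FULL_PRECISION_HNSW_MAX_DIM  # True: 2048-D exceeds the limit

budget = 10 * 1024 ** 3  # AC-8.3: 10 GB persistent cache
worst_case_tiles_f32 = tiles_within_budget(budget, 100 * 1024, float32_desc)
worst_case_tiles_f16 = tiles_within_budget(budget, 100 * 1024, float16_desc)
```

At the 100 KB tile size the descriptor is a small share of each tile's footprint, so halving it changes the tile count by only a few percent; the saving matters more at the 30 KB end of the range.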
|
||||
|
||||
**Caveats / open Plan-phase decisions raised** (D-C6-N gates):
|
||||
|
||||
- **D-C6-5 NEW (Cand-2-only)** — Jetson PostGIS + pgvector co-installation Plan-phase verification choice (verify on Jetson MVE as part of D-C1-2 dedicated bring-up phase / fork PostGIS+pgvector ARM64 builds in-house if upstream packages incomplete / pivot to Cand 1 if PostGIS+pgvector co-installation reveals blocking incompatibility): trade-off between Plan-phase engineering investment vs documented evidence gap. **Recommendation**: D-C6-5 = (a) verify on Jetson MVE phase at D-C1-2 — already-required Jetson hardware bring-up cycle absorbs this work cheaply.
|
||||
- **D-C6-6 NEW (Cand-2-only)** — pgvector descriptor-storage-type choice (`vector` full-precision float32 with 2,000-dim max for HNSW per Source #95 / `halfvec` half-precision 2-byte with 4,000-dim HNSW max (16,000-dim storage max) + 50% cache savings + ~0-2% Recall@K loss / `sparsevec` for sparse descriptors / `bit` for binary descriptors via Hamming distance): trade-off between cache footprint vs accuracy vs descriptor compatibility with C2 VPR candidate output format. **Recommendation**: D-C6-6 = (b) `halfvec` for the primary path; gives a consistent storage format for all C2 VPR descriptor candidates (MixVPR 2048-D, SelaVPR 1024-D, NetVLAD 4096-D PCA-whitened, EigenPlaces 2048-D-or-smaller-via-D-C2-10, SALAD 8448-D/2112-D/544-D-via-D-C2-6), with the caveat that variants above 4,000 dimensions (NetVLAD 4096-D, SALAD 8448-D) need dimensionality reduction before they can be HNSW-indexed.
|
||||
- **D-C6-7 NEW (CROSS-COMPONENT — affects both Cand 1 and Cand 2)** — IF Cand 2 selected → cascade-changes-back-to-suite-satellite-provider strategy choice (cascade PostGIS+pgvector adoption back to satellite-provider for cross-suite consistency / keep satellite-provider on btree-only and gps-denied-onboard on PostGIS+pgvector — accept divergence / migrate satellite-provider to PostGIS+pgvector in a separate ticket post-MVP / leave satellite-provider unchanged + maintain compatibility shim in gps-denied-onboard's pre-flight cache-sync layer). **Recommendation**: per user's session-start clarification "if improvement is small, then there is no sense to change anything at all" — IF Cand 2's MATERIAL improvement justifies adoption, cascade via separate ticket; OTHERWISE stay with Cand 1 throughout the suite.
|
||||
|
||||
---
|
||||
|
||||
## C6 — Comparative-improvement-vs-Cand-1 analysis (closure of batch 1)
|
||||
|
||||
| Dimension | Cand 1 (mirror suite-pattern) | Cand 2 (PostGIS+pgvector) | Improvement magnitude (Cand 2 vs Cand 1) | Verdict per user's "significant-improvement-only" bar |
|
||||
|---|---|---|---|---|
|
||||
| **Geographic spatial-query API** | btree composite + app-side k-radius derivation + app-side distance sort | Native KNN `<->` + native `ST_DWithin` radius | **Material capability improvement** (Cand 2 supports radius queries natively) | ⚠️ Material — but **project's pinned use case is 3x3 grid lookup at fixed zoom** (per AC-3.x mission corridor); native radius queries are unused capability |
|
||||
| **Combined geographic-+-descriptor query** | App-side round trip (~1-2 ms overhead) | Single SQL statement (~0.5 ms overhead) | **Marginal latency improvement** (~1 ms saving per query × 3 Hz = 3 ms/sec saving in absolute time) | ⚪ Marginal |
|
||||
| **Geographic query latency** | ~0.1-0.5 ms btree lookup | ~1-5 ms GiST lookup | **NEGATIVE** — Cand 1 is 5-10× faster for the dominant query | 🔴 Cand 2 worse here |
|
||||
| **Descriptor ANN latency** | ~1-3 ms FAISS HNSW (in-process) | ~1-3 ms pgvector HNSW (in-DB) | **No material difference** | ⚪ Tied |
|
||||
| **Memory footprint** | Postgres + FAISS = ~700 MB-1.5 GB | Postgres + PostGIS + pgvector = ~700 MB-1.4 GB | **No material difference** | ⚪ Tied |
|
||||
| **Cache-budget impact (AC-8.3)** | bytea 8 KB/tile (float32-2048D) | vector 8 KB/tile or halfvec 4 KB/tile | **Tied if both use halfvec / float16** | ⚪ Tied |
|
||||
| **Engineering complexity** | ZERO new infrastructure (mirrors satellite-provider exactly) | TWO new Postgres extensions (PostGIS + pgvector) + ARM64 verification at Jetson MVE + descriptor-format conversion code | **NEGATIVE** — Cand 2 adds ~3-5 days engineering at Plan + Jetson MVE phases | 🔴 Cand 2 worse here |
|
||||
| **Project-pattern alignment** | EXACT mirror of suite satellite-provider | DIVERGENT from suite satellite-provider; requires D-C6-7 NEW gate cascade decision | **NEGATIVE** — Cand 2 forces a cross-suite consistency decision | 🔴 Cand 2 worse here |
|
||||
| **Operator re-loc hint (AC-3.4) handling** | Direct btree lookup at hint zoom + (x, y) | Direct ST_DWithin radius query at hint position + radius | **Tied — both handle it natively** | ⚪ Tied |
|
||||
| **License clean-throughput** | PostgreSQL + FAISS-MIT + psycopg-LGPL/MIT-Apache | PostgreSQL + PostGIS-GPL2 + pgvector-PostgreSQL-License + psycopg | ⚠️ Cand 2 introduces PostGIS-GPL-2.0-or-later which may conflict with D-C1-1 license-posture choice if (b) BSD/permissive-only-track is selected | 🔴 Cand 2 worse here (subject to D-C1-1) |
|
||||
|
||||
**Closure verdict (per user's "significant-improvement-only" bar)**:
|
||||
**Cand 1 (mirror suite-pattern) is the recommended primary path for C6**. Cand 2's improvements (native KNN, native radius queries, single-SQL combined query) are real BUT **the project's pinned 3 Hz spatial-grid query at fixed zoom does not exercise these capabilities** (per AC-3.x mission corridor + AC-1.x frame-center accuracy bars). Cand 2 is **5-10× slower for the dominant geographic query** AND **requires PostGIS+pgvector ARM64 Jetson MVE verification** AND **forces a cross-suite cascade decision (D-C6-7)** AND **may conflict with D-C1-1 license-posture choice (b)** due to PostGIS-GPL-2.0-or-later licensing. **The improvements are marginal-to-negative in the project's specific operating context — no material justification to deviate from the existing satellite-provider pattern.**
|
||||
|
||||
**Cand 2 promotion criteria (defer-to-Plan or Jetson-MVE)**: Cand 2 should be re-evaluated for promotion to primary IF AND ONLY IF (a) project use case expands to require radius-meters-based queries (e.g., dynamic mission corridor adjustment in flight) OR (b) Jetson MVE phase reveals Cand 1's app-side combined-query overhead is materially impacting AC-4.1 latency budget at the tail OR (c) D-C1-1 license-posture choice (a) GPL-3.0 track is selected AND the project elects to standardize on a single Postgres-extension stack for consistency.
|
||||
|
||||
---
|
||||
|
||||
## C6 — Working conclusions and decisions (compounded from Fact #92 + Fact #93 closures)
|
||||
|
||||
**Selected primary**: **Cand 1 (mirror suite-pattern)** — PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` + filesystem `./tiles/{zoom}/{x}/{y}.{image_type}` + bytea descriptor blobs + app-side FAISS HNSW loaded at takeoff. **Cand 2 (PostGIS+pgvector) is deferred as a secondary candidate, to be re-evaluated at Plan or Jetson MVE** per the comparative analysis above.
|
||||
|
||||
**Decisions raised (D-C6-N gates)** — see [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md):
|
||||
|
||||
- **D-C6-1** (Fact #92) — descriptor-storage-format choice: float32 / halfvec / INT8 — RECOMMENDED halfvec
|
||||
- **D-C6-2** (Fact #92) — FAISS index variant choice: IndexFlatL2 / IndexHNSWFlat / IndexIVFFlat / IndexIVFPQ — RECOMMENDED IndexHNSWFlat M=32
|
||||
- **D-C6-3** (Fact #92) — descriptor-cache-rebuild-trigger strategy: rebuild-on-modification / incremental-add / periodic-rebuild-during-C10-pre-flight — RECOMMENDED periodic-rebuild
|
||||
- **D-C6-4** (Fact #92) — geographic-spatial-grid radius `k`: fixed-1 / fixed-2 / fixed-4 / dynamic-by-zoom-and-ground-speed — RECOMMENDED dynamic
|
||||
- **D-C6-5** (Fact #93, Cand-2-only contingent) — Jetson PostGIS + pgvector co-installation Plan-phase verification choice — RECOMMENDED verify at Jetson MVE D-C1-2
|
||||
- **D-C6-6** (Fact #93, Cand-2-only contingent) — pgvector descriptor-storage-type choice — RECOMMENDED halfvec
|
||||
- **D-C6-7** (Fact #92 + Fact #93, CROSS-COMPONENT) — IF Cand 2 selected → cascade-changes-back-to-suite-satellite-provider strategy — RECOMMENDED cascade-via-separate-ticket OR stay-with-Cand-1 throughout
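For D-C6-4, one way the recommended dynamic `k` could be derived from zoom and ground speed is sketched below. Everything here is an assumption for illustration: the function names, the look-ahead term, and the rule of rounding up to whole tiles are not specified by the fact cards. Only the Web Mercator tile-width formula is standard.

```python
import math

EQUATOR_CIRCUMFERENCE_M = 40_075_016.686  # WGS84 equatorial circumference


def tile_width_m(lat_deg, zoom):
    """Approximate east-west ground width of one slippy-map tile, in meters."""
    return EQUATOR_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def grid_radius_k(lat_deg, zoom, search_radius_m,
                  ground_speed_mps=0.0, lookahead_s=0.0):
    """Tile radius k such that a (2k+1) x (2k+1) grid covers the search reach."""
    # Hypothetical rule: extend the search radius by the distance flown
    # during the look-ahead window, then round up to whole tiles.
    reach_m = search_radius_m + ground_speed_mps * lookahead_s
    return max(1, math.ceil(reach_m / tile_width_m(lat_deg, zoom)))
```

A fixed `k` corresponds to ignoring both the latitude term and the look-ahead term, which is why the fixed options trade simplicity against coverage at high speed or high zoom.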
|
||||
|
||||
C6 batch 1 closed at 2/N. Subsequent C6 candidates (e.g., MBTiles single-sqlite-file, LMDB+geohash, FAISS-only-no-Postgres) are deferrable — current 2-candidate breadth satisfies engine Component Option Breadth rule for the user's pinned-Postgres scope.
|
||||
@@ -0,0 +1,308 @@
|
||||
# Fact Cards — C7: On-Jetson inference runtime
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Bound to sub-questions in `../00_question_decomposition.md` line 75 (C7 = "INT8/FP16 inference of the chosen VPR + matcher models within latency + memory budget"). Sources for C7 cluster live in [`../01_source_registry/C7_inference_runtime.md`](../01_source_registry/C7_inference_runtime.md).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache](C6_tile_cache_spatial_index.md). Cross-component gates: [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## Scope summary
|
||||
|
||||
C7 is a **cross-cutting integration row** rather than a per-component candidate row: it pins the on-Jetson inference runtime that hosts the C1 (learned-VIO frontends if any) + C2 VPR backbone + C3 matcher models. C7 batch 1 closed at 3/N on 2026-05-08 with three rows per the user-pinned scope (locked via `/autodev` AskQuestion 2026-05-08):

- **Fact #94** = TensorRT native primary (TensorRT 10.3 bundled with JetPack 6.2; `IInt8EntropyCalibrator2` calibration; `BuilderFlag.FP16` + `BuilderFlag.INT8` mixed-precision; engines built directly on Jetson SM 87).
- **Fact #95** = ONNX Runtime + TensorRT EP interop alternate (community-maintained Jetson AI Lab wheel index `pypi.jetson-ai-lab.io/jp6/cu126`; `TensorrtExecutionProvider` with `trt_fp16_enable` / `trt_int8_enable` config; subgraph fallback to CUDA EP / CPU EP).
- **Fact #96** = pure PyTorch FP16 mandatory simple-baseline (`torch.amp.autocast(device_type='cuda', dtype=torch.float16)`; `model.half()` eager-mode; PyTorch 2.5/2.9 wheels via Jetson AI Lab).

Triton / DeepStream / CUDA-Python custom kernels are noted and rejected in one sentence (server/video-pipeline class or out-of-budget for embedded 8 h mission). User-pinned `c7_quantization=A`: INT8 primary + FP16 fallback per candidate; INT8-only candidates marked Experimental until calibration data exists.

**Critical caveat (raised by Source #103)**: feature-matching networks (LightGlue / DISK / XFeat) suffer material accuracy degradation under INT8/FP8 quantization vs FP16. INT8 is the right primary axis for **VPR backbones** (CNN class) but **FP16 is the safer primary axis for matchers** (transformer class). This is captured in D-C7-6 INT8-vs-FP16-per-model-family-precision-policy.
|
||||
|
||||
---
|
||||
|
||||
### Fact #94 — TensorRT native primary: JetPack-bundled TensorRT 10.3 + IInt8EntropyCalibrator2 + BuilderFlag.FP16+INT8 mixed-precision; engines built directly on Jetson Orin Nano Super SM 87
|
||||
|
||||
**Statement**: The TensorRT native primary candidate for C7 uses the JetPack 6.2 bundled TensorRT 10.3 SDK (CUDA 12.6 + cuDNN 9.3 per Source #104) with the canonical INT8/FP16 mixed-precision build flow:
|
||||
|
||||
- **INT8 calibrator hierarchy** (per Source #99 context7-verified): `nvinfer1::IInt8Calibrator` (abstract base) + `nvinfer1::IInt8EntropyCalibrator2` (current canonical recommended algorithm, returns `kENTROPY_CALIBRATION_2`) + `nvinfer1::IInt8MinMaxCalibrator` (alternate for activations with bimodal distributions). Each implements `getBatchSize()` + `getBatch(void* bindings[], const char* names[], int32_t nbBindings)` + `readCalibrationCache(size_t& length)` + `writeCalibrationCache(const void* ptr, size_t length)` + `getAlgorithm()`.
|
||||
- **Python builder INT8 enable pattern** (canonical TensorRT 10.x per Source #99):
|
||||
```python
# Create the calibrator first, then attach it to the builder config.
Int8_calibrator = EntropyCalibrator(["input_node_name"], batchstream)
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = Int8_calibrator
```
|
||||
- **Mixed-precision flag pattern**: `config.set_flag(trt.BuilderFlag.FP16)` + `config.set_flag(trt.BuilderFlag.INT8)` for combined FP16+INT8 mixed precision (with both flags set, TensorRT may pick either precision per layer and chooses by kernel timing, not by accuracy; quantization-sensitive layers therefore have to be pinned to FP16 through explicit per-layer precision constraints, while the remaining layers can run INT8).
|
||||
- **Calibration data requirement**: ~500-1,500 representative input samples (UAV nadir frames at flight altitude over season-matched satellite tiles) — gates the INT8 build path (no calibration data → INT8 NOT achievable, FP16-only build is the fallback; see D-C7-1 + D-C7-6).
|
||||
- **Engine build location**: per Source #105 constraint #2 + #3, engines MUST be built directly on the Jetson target (SM 87 = Ampere class). Laptop / dev-machine GPUs (e.g., RTX 4090 = SM 89) build engines that fail to load with `Target GPU SM 87 is not supported by this TensorRT release`. Build-time memory pressure on the 8 GB shared budget means the builder workspace should be capped at ~1-2 GB to avoid tactic-profile segfaults (Source #105 constraint #4); in TensorRT 10.x the cap is set with `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, nbytes)`, because the older `config.max_workspace_size` attribute has been removed.
|
||||
- **Install path**: per Source #105 constraint #1, `pip install tensorrt` is NOT supported on Jetson Tegra; the canonical install is the JetPack-bundled TensorRT (already present after `apt install nvidia-jetpack`), accessed via `/usr/lib/python3.10/dist-packages/tensorrt`. Upgrading TensorRT independently of JetPack is not officially supported.
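The build flow described in the bullets above can be sketched with the TensorRT Python API. This is a minimal sketch, not the project's build script: `build_engine`, its parameters, and the file paths are hypothetical; `tensorrt` is imported lazily because it is only available on the JetPack target; and the workspace cap uses `set_memory_pool_limit`, assuming the TensorRT 10.x Python API.

```python
def build_engine(onnx_path, engine_path, calibrator=None, workspace_bytes=1 << 30):
    """Build and serialize a TensorRT engine on the Jetson target.

    FP16 is always enabled; INT8 is enabled only when a calibrator is supplied,
    which mirrors the "no calibration data -> FP16-only build" fallback.
    """
    import tensorrt as trt  # JetPack-bundled; not installable via pip on Tegra

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    # Bound builder scratch memory to limit pressure on the 8 GB shared pool.
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)
    config.set_flag(trt.BuilderFlag.FP16)
    if calibrator is not None:
        config.set_flag(trt.BuilderFlag.INT8)
        config.int8_calibrator = calibrator

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("TensorRT engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)
    return engine_path
```

Calling it without a calibrator yields the FP16-only fallback engine; passing an `IInt8EntropyCalibrator2` subclass instance enables the mixed INT8 + FP16 build.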
|
||||
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: ONNX model graph (exported from PyTorch via `torch.onnx.export` on the dev machine) + a representative calibration dataset (NumPy `.npy` or Torch `.pt` tensors of shape `[N, C, H, W]` matching the model's expected input)
|
||||
- outputs: serialized TensorRT engine `.engine` file (hardware-specific to SM 87 Jetson Orin Nano Super) + per-frame inference latency in the 3-8 ms range for CNN-backbone VPR networks at ~224×224-320×320 input (per Source #102 YOLO26n empirical benchmarks); inference accepts CUDA tensors via `IExecutionContext.execute_v2` Python API or the `enqueueV3` async path
|
||||
- runtime: TensorRT 10.3 + CUDA 12.6 + cuDNN 9.3 on JetPack 6.2 + Jetson Orin Nano Super in Super Mode (per Source #104 — 70% AI TOPS increase + 50% memory bandwidth boost vs base mode)
|
||||
|
||||
**Source**:
|
||||
- Primary API: Source #99 NVIDIA TensorRT 10.x official documentation (context7 indexed at `/websites/nvidia_deeplearning_tensorrt`, 9371 code snippets) — confirms `IInt8EntropyCalibrator2`, `BuilderFlag.INT8`, `BuilderFlag.FP16`, calibrator interface methods.
|
||||
- Latency anchor: Source #102 Ultralytics YOLO26 benchmark suite on Jetson Orin Nano Super (April 2026) — TensorRT FP32 7.53 ms / FP16 4.57 ms / INT8 3.80 ms for YOLO26n (CNN object detector, ~3M parameters, 640×640 input).
|
||||
- Software stack pin: Source #104 NVIDIA JetPack 6.2 release notes — TensorRT 10.3 + CUDA 12.6 + cuDNN 9.3 + Super Mode for Orin Nano production modules.
|
||||
- Install constraints: Source #105 — `pip install tensorrt` not supported on Tegra; engines hardware-specific; build-on-target mandatory; memory-pressure during tactic profiling caps workspace size.
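The ratios derived from the Source #102 YOLO26n figures elsewhere in this card can be checked with a few lines of arithmetic. The variable names are illustrative; the input numbers are the ones quoted from Source #102.

```python
# Per-inference latency in milliseconds (Source #102, YOLO26n, TensorRT).
fp32_ms, fp16_ms, int8_ms = 7.53, 4.57, 3.80

fp16_speedup = fp32_ms / fp16_ms  # about 1.65x faster than FP32
int8_speedup = fp32_ms / int8_ms  # about 1.98x faster than FP32

# Detection accuracy (mAP) at FP16 vs INT8 from the same source.
map_fp16, map_int8 = 0.4800, 0.4490
relative_map_change_pct = (map_int8 - map_fp16) / map_fp16 * 100.0  # about -6.5 %
```

The INT8 latency gain over FP16 is roughly 17%, while the relative mAP loss is about 6.5%, which is the trade-off that D-C7-6 weighs per model family.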
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High for the API capability verification (TensorRT INT8/FP16 build APIs are L1 official docs + 9371 context7 snippets); ⚠️ Medium-High for the latency claim on this specific project's models (YOLO benchmarks anchor CNN-class throughput; VPR networks like MixVPR/EigenPlaces are CNN-class similar-architecture and likely follow the same trend, but matcher networks like LightGlue/DISK/XFeat are transformer-class and known to deviate per Source #103); ⚠️ Medium for the "INT8 achievable for matchers" axis — Source #103 evidence shows LightGlue FP8 caused "match counts dropped sometimes hard", and INT8 is structurally similar to FP8 in dynamic-range reduction.
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C7 row in `../06_component_fit_matrix/C7_inference_runtime.md` (this fact populates the `TensorRT native primary` candidate row).
|
||||
- SQ5 (failure modes) — feature-matching INT8 quantization-sensitivity is captured here as a NEW failure-mode line item.
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|
||||
|---|---|---|
|
||||
| **R-NEW-2 no cloud at flight** | ✅ PASS | TensorRT runtime is entirely local (CUDA-side execution on Jetson GPU); no network calls at inference time. |
|
||||
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ✅ PASS | TensorRT 10.3 ships bundled in JetPack 6.2 ARM64 install; JetPack-bundled wheel is the canonical install path per Source #105. |
|
||||
| **AC-1.1 (≤80 m at 1 km AGL)** | ✅ PASS | Inference accuracy is downstream of model selection (C2/C3); TensorRT runtime accuracy parity with PyTorch is documented at FP16 (typically <0.5% delta), while at INT8 with calibration data the delta ranges from <1% up to several percent even for CNN backbones (e.g. Source #102 YOLO26n FP16 mAP 0.4800 vs INT8 0.4490 = -6.5% relative), which is concerning for matchers but acceptable for VPR backbones at Recall@K granularity. |
|
||||
| **AC-1.2 (≤30 m at 500 m AGL)** | ✅ PASS | Same as above. |
|
||||
| **AC-3.1 sharp turns ±20° bank** | ✅ PASS | TensorRT inference is deterministic; sharp-turn input frames are processed at the same latency as level-flight frames. |
|
||||
| **AC-3.2 sharp-turn frames may share <5% overlap** | ✅ PASS | Matcher-side quantization-sensitivity (per Source #103) is the dominant concern, NOT runtime; D-C7-6 covers per-model-family precision policy (matchers FP16, VPR INT8). |
|
||||
| **AC-3.3 re-localization stability** | ✅ PASS | TensorRT engine is deterministic (no randomness within a single inference; `IExecutionContext.execute_v2` is bit-exact reproducible across runs). |
|
||||
| **AC-3.4 operator re-loc hint** | ✅ PASS | Operator-supplied hints affect cache lookup (C6) and pose initialization (C5), not C7 runtime. |
|
||||
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | ✅ PASS | Per Source #102 empirical YOLO26n on Jetson Orin Nano Super: TensorRT FP16 4.57 ms / INT8 3.80 ms per inference. For the project's pipeline at 3 Hz (~333 ms budget per query), running C2 VPR (CNN, ~5-15 ms FP16/INT8 estimated for MixVPR/EigenPlaces ResNet50 at 224×224-320×320) + C3 LightGlue matcher per pair (~15-40 ms FP16 estimated; K=10 pairs = 150-400 ms, which is TIGHT against the 333 ms budget) fits when matchers run FP16-only, VPR runs INT8, and K is reduced per D-C3-3. |
|
||||
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | TensorRT runtime memory: ~50-150 MB shared library + per-engine activation memory ~50-300 MB depending on model. Peak combined for VPR-engine + matcher-engine + executor context typically ~1-2 GB out of 8 GB shared budget. Engine build-time peak is ~3-5 GB (capped via the builder workspace limit, `set_memory_pool_limit` in TensorRT 10.x, per D-C7-8) — must be done at pre-flight, NOT at flight time. |
|
||||
| **AC-4.5 look-back refinement** | N/A | Inference runtime is forward-only; look-back is C5 estimator's responsibility. |
|
||||
| **AC-8.3 10 GB persistent tile cache budget** | N/A | TensorRT engine `.engine` files are typically 10-200 MB each; 3-5 engines (VPR + matcher + optional VIO frontend) consume ~100-500 MB on disk — separate from the 10 GB cache budget (engines live in `/usr/local/lib/onboard/engines/`, not in tile cache). |
|
||||
| **AC-NEW-3 (FDR)** | ✅ PASS | TensorRT inference latency + memory + per-layer profile recordable as FDR fields via `IExecutionContext.profiler` API. |
|
||||
| **AC-NEW-4 covariance honesty** | N/A | Runtime is passive; covariance is C4/C5 responsibility. |
|
||||
| **AC-NEW-7 cache-poisoning safety** | N/A | Runtime does not write to the tile cache. |
|
||||
| **AC-NEW-8 blackout failsafe** | ✅ PASS | TensorRT inference does NOT trigger failsafe directly; a runtime crash (rare; TensorRT 10.x is production-stable) is caught by the supervising process and triggers C5 demotion to `dead_reckoned` per AC-NEW-8 escalation thresholds. |
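The AC-4.1 row's per-frame arithmetic can be made explicit. The sketch below uses only the estimates quoted in that row (VPR 5-15 ms, matcher 15-40 ms per pair, 3 Hz frame rate); the function names are illustrative, and the model ignores cache lookup, pose estimation, and scheduling overhead.

```python
FRAME_BUDGET_MS = 1000.0 / 3.0  # about 333 ms per query at 3 Hz


def frame_time_ms(vpr_ms, matcher_ms_per_pair, k_pairs):
    """Sequential inference time for one VPR pass plus K matcher pairs."""
    return vpr_ms + matcher_ms_per_pair * k_pairs


def max_matcher_pairs(frame_budget_ms, vpr_ms, matcher_ms_per_pair):
    """Largest K whose sequential inference time still fits the frame budget."""
    remaining = frame_budget_ms - vpr_ms
    return max(0, int(remaining // matcher_ms_per_pair))


worst_case_k10 = frame_time_ms(15.0, 40.0, 10)           # 415 ms, over budget
worst_case_max_k = max_matcher_pairs(FRAME_BUDGET_MS, 15.0, 40.0)  # 7 pairs
best_case_max_k = max_matcher_pairs(FRAME_BUDGET_MS, 5.0, 15.0)    # 21 pairs
```

Under the pessimistic estimates K=10 does not fit a 333 ms frame, which is why the K-pairs reduction in D-C3-3 is part of the PASS verdict.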
|
||||
|
||||
**Strengths** (positive structural advantages):
|
||||
1. **Native NVIDIA stack — fastest possible inference path**. TensorRT directly maps ONNX graph operations to fused CUDA kernels with hardware-aware scheduling on Ampere SM 87. Per Source #102 benchmarks, TensorRT FP16 is **1.65× faster than TensorRT FP32** and **~2× faster than pure-PyTorch FP16** at the YOLO26n class workload — this gap widens with larger models and is roughly preserved across CNN architectures.
|
||||
2. **Mixed-precision per-layer control at INT8 build time** — layers that fail the INT8 accuracy tolerance can be held at FP16 via `config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)` + per-layer `setOutputType` + `setPrecision`. This mitigates the concern in Source #103 about feature-matching networks suffering severe INT8 degradation: the build script can keep the matcher's attention layers at FP16 while quantizing convolutional preprocessing. The pinning is explicit rather than automatic, so the sensitive layers have to be identified first (e.g. from per-layer profiling against an FP16 reference).
|
||||
3. **JetPack-bundled — zero install friction**. Per Source #105, TensorRT 10.3 ships pre-installed with JetPack 6.2; no external pip dependency, no version-mismatch failure modes, no community wheel index dependency.
|
||||
4. **Hardware-aware engine optimization** at build time (tactic search across kernel implementations selects the fastest for SM 87 specifically). This is unique to TensorRT — ONNX Runtime + TRT EP also produces TRT engines but with less control over the build flags.
|
||||
5. **Production-mature** — TensorRT 10.x is the canonical NVIDIA production inference SDK with multi-million deployment footprint (auto driving / robotics / industrial) and structured release notes per JetPack version.
|
||||
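6. **Profile-driven debugging** via the `IExecutionContext.profiler` API: per-layer latency is reported at runtime through an `IProfiler` callback, and per-layer precision can be read back from the built engine with `IEngineInspector`; together these drive the D-C7-6 calibration tuning loops.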
|
||||
|
||||
**Negative-but-mitigable structural findings**:
|
||||
7. **Engines are hardware-specific** — must be rebuilt on the Jetson target. Per Source #105 constraint #2 + #3, laptop-built engines fail load with `Target GPU SM 87 is not supported`. **Mitigation**: engine build is part of pre-flight cache provisioning (C10 row when opened), not a runtime concern. Engine builds typically take 30-300 sec per model and are persisted across flights via `IRuntime.deserializeCudaEngine` from disk.
|
||||
8. **INT8 calibration requires representative dataset** — typically 500-1,500 input samples covering the deployment distribution. **Mitigation**: D-C7-1 closed at C7 batch 1 with calibration-corpus distribution = real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles (per the 2026-05-08 C9 / SQ7 restructure decision in `../00_question_decomposition.md`). Specific fixture-file pin delegated to Test Spec (greenfield Step 5). Candidate corpora carried forward to Test Spec: AerialVL S03 + AerialExtreMatch + project's own Mavic + Derkachi flight footage.
|
||||
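9. **Build-time memory pressure on 8 GB shared budget**: the build can segfault during tactic profiling per Source #105 constraint #4. **Mitigation**: cap the builder workspace at 1-2 GB (in TensorRT 10 via `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)`, which replaces the removed `config.max_workspace_size` attribute); build during pre-flight when no other workloads are active; serialize the engine for runtime deserialization.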
|
||||
10. **No per-mode pip install path** — requires JetPack-bundled TensorRT. **Mitigation**: project's deployment is JetPack-based by hardware constraint; no alternative install path is needed.
|
||||
11. **Feature-matching INT8 quantization-sensitivity** (per Source #103: "match counts dropped sometimes hard" for FP8 LightGlue; INT8 is structurally similar). **Mitigation**: D-C7-6 INT8-vs-FP16-per-model-family-precision-policy — CNN-class models (VPR backbones MixVPR/EigenPlaces/SelaVPR-DINOv2) target INT8; transformer-class matchers (LightGlue / DISK / XFeat) target FP16; calibration data and per-layer precision overrides handled in build script.
|
||||
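The workspace cap from finding 9 can be sketched as follows, assuming the TensorRT 10 Python bindings bundled with JetPack 6.2. The function names, file paths, and the 1 GiB default are illustrative assumptions rather than project code, and the sketch builds a plain FP16 engine without INT8 calibration.

```python
GIB = 1 << 30


def workspace_limit_bytes(gib: float) -> int:
    """Convert a workspace cap in GiB to the byte count TensorRT expects."""
    if gib <= 0:
        raise ValueError("workspace cap must be positive")
    return int(gib * GIB)


def build_fp16_engine(onnx_path: str, engine_path: str, workspace_gib: float = 1.0) -> None:
    """Parse an ONNX file and serialize an FP16 engine with a capped workspace."""
    import tensorrt as trt  # imported lazily: only present on the Jetson target

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    # Workspace cap (hypothetical default of 1 GiB, matching the D-C7-8 safe default).
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE,
                                 workspace_limit_bytes(workspace_gib))
    config.set_flag(trt.BuilderFlag.FP16)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)
```

Keeping the cap as an explicit argument lets the pre-flight script retry at 2 GiB if the Jetson MVE shows that the 1 GiB engine is materially worse.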
|
||||
**Caveats / open Plan-phase decisions raised** (D-C7-N gates):
|
||||
|
||||
- **D-C7-1 CLOSED IN C7 batch 1 (2026-05-08, per the C9 / SQ7 restructure user choice A in `../00_question_decomposition.md`)** — calibration-dataset-strategy. **Closure**: strategy = real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles as the calibration corpus distribution (matches the Project Constraint Matrix's "Inputs available" pinning + provides realistic noise/illumination/season distribution that the deployed system will see). Specific fixture-file pin (AerialVL S03 vs project's Mavic + Derkachi flight clips vs other corpora) is fixture-class and DELEGATED to Test Spec (greenfield Step 5). Synthetic-tile augmentation via random homography is the documented low-data fallback, only invoked if real flight footage is insufficient for Recall@K-target calibration. ~500–1,500 representative samples per the C7 batch 1 INT8 build constraint. No Plan-phase Choose block remains.
|
||||
- **D-C7-2 NEW** — TensorRT mixed-precision flag matrix per model family (VPR INT8+FP16 fallback / matchers FP16-only / VIO learned-frontends if any FP16-only). **Recommendation**: D-C7-2 = ladder per family; finalize at Jetson MVE phase per D-C1-2 + D-C7-6.
|
||||
- **D-C7-7 NEW** — engine-build-on-Jetson-vs-prebuilt-engine-shipping strategy (build engines at pre-flight on the deployed Jetson / build engines on a known-good "reference Jetson" then ship the same `.engine` files to all production Jetsons / both — primary path build-on-target with reference-Jetson-built engines as a fallback if pre-flight build fails). **Recommendation**: D-C7-7 = primary build-on-deployed-Jetson during pre-flight (handles SM-version drift + future TensorRT minor version updates); fallback prebuilt engines for emergency provisioning.
|
||||
- **D-C7-8 NEW**: builder workspace cap to avoid tactic-profile segfault during build, set in TensorRT 10 through `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)` rather than the removed `config.max_workspace_size` (1 GB safe default / 2 GB for richer kernel-fusion search / 3 GB for fastest-possible engine but high segfault risk on 8 GB budget). **Recommendation**: D-C7-8 = 1 GB safe default; raise to 2 GB only if Plan-phase Jetson MVE shows engine quality is materially worse at 1 GB.
|
||||
- **D-C7-9 NEW** — TensorRT version pin within JetPack lifecycle (pin to JetPack 6.2's bundled TensorRT 10.3 / track JetPack 6.x minor releases / lock the exact JetPack point release for cross-deployment reproducibility). **Recommendation**: D-C7-9 = lock to JetPack 6.2 + TensorRT 10.3 for the project's first deployment; revisit at Plan-phase per JetPack release cadence.
|
||||
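As a rough illustration of the D-C7-1 sample-count constraint, the sketch below draws a reproducible subset of flight frames for INT8 calibration. The helper name `pick_calibration_frames` and the uniform random draw are assumptions; the real corpus selection (for example stratifying by season or terrain) is delegated to Test Spec.

```python
import random


def pick_calibration_frames(frame_paths, n_samples=1000, seed=0):
    """Return a reproducible random subset of frame paths for INT8 calibration."""
    if not 500 <= n_samples <= 1500:
        raise ValueError("D-C7-1 expects roughly 500-1,500 samples")
    if len(frame_paths) < n_samples:
        raise ValueError("not enough frames; consider the synthetic-tile fallback")
    rng = random.Random(seed)  # fixed seed keeps the calibration set stable
    return sorted(rng.sample(list(frame_paths), n_samples))
```

A fixed seed means two engine builds on different Jetsons see the same calibration set, which keeps INT8 scale factors comparable across devices.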
|
||||
---
|
||||
|
||||
### Fact #95 — ONNX Runtime + TensorRT EP interop alternate: onnxruntime-gpu via Jetson AI Lab JP6/CU126 wheel index + TensorrtExecutionProvider config + automatic CUDA EP / CPU EP subgraph fallback
|
||||
|
||||
**Statement**: The ONNX Runtime + TensorRT EP interop alternate candidate for C7 uses ONNX Runtime as the model-agnostic inference frontend with TensorRT as the kernel-execution backend, hosted on the Jetson via the community-maintained Jetson AI Lab wheel index:
|
||||
|
||||
- **Provider enumeration + config pattern** (canonical Python API per Source #100 context7-verified):
|
||||
```python
import onnxruntime as ort

# Confirm that the TensorRT and CUDA providers are present in this build.
print(ort.get_available_providers())

tensorrt_options = {
    'device_id': 0,
    'trt_max_workspace_size': 1073741824,  # 1 GB cap per D-C7-8
    'trt_fp16_enable': True,
    'trt_int8_enable': False,  # see D-C7-6
    'trt_engine_cache_enable': True,
    'trt_engine_cache_path': '/var/cache/onboard/trt-engines',
}
cuda_options = {'device_id': 0, 'arena_extend_strategy': 'kNextPowerOfTwo'}

# Providers are tried in priority order: TensorRT, then CUDA, then CPU.
session = ort.InferenceSession(
    "model.onnx",
    providers=[
        ('TensorrtExecutionProvider', tensorrt_options),
        ('CUDAExecutionProvider', cuda_options),
        'CPUExecutionProvider',
    ],
)
```
|
||||
- **Provider-cascade behavior**: ORT TRT EP attempts to optimize each subgraph via TensorRT (subgraph = a maximal contiguous region of the ONNX graph whose ops are TRT-supported); falls back to CUDA EP for unsupported ops; falls back to CPU EP if neither GPU EP applies. Subgraph fallback is automatic and per-op transparent — operators that TRT does not support (rare custom ops, specialized attention variants) silently route through CUDA EP without runtime error.
|
||||
- **Engine cache integration**: `trt_engine_cache_enable=True` + `trt_engine_cache_path` causes ORT TRT EP to serialize the per-subgraph TensorRT engines on first execution and reuse them on subsequent runs (~10-300 sec first-run cost amortized to <1 sec on subsequent loads). Same hardware-specificity constraint applies (engines tied to SM 87 — see Source #105 #2 + #3).
|
||||
- **Install path (CRITICAL)**: per Source #100, standard `pip install onnxruntime-gpu` does NOT work on Jetson Tegra. The canonical install paths are:
|
||||
- **JetPack 6 + CUDA 12.6 + Ubuntu 22.04 (project target)**: `pip3 install onnxruntime-gpu --index-url https://pypi.jetson-ai-lab.io/jp6/cu126`
|
||||
- **JetPack 6 + CUDA 12.9 + Ubuntu 24.04 (alternate)**: `pip3 install onnxruntime-gpu --index-url https://pypi.jetson-ai-lab.io/jp6/cu129`
|
||||
- **Known incompatibility**: onnxruntime-gpu v1.23.0 wheels for JetPack 6 were built against `numpy<2.0.0` (per Source #100 GitHub Issue #27562); importing under `numpy>=2.0.0` raises a compatibility error. Project requirements file MUST pin `numpy<2.0.0` until upstream rebuild.
|
||||
- **Provider availability gate**: standard `pip install onnxruntime` (no `-gpu` suffix) installs the CPU-only build that exposes ONLY `CPUExecutionProvider` and `AzureExecutionProvider` — does NOT include CUDA EP or TensorRT EP. Project provisioning script must verify `'TensorrtExecutionProvider' in ort.get_available_providers()` at startup.
|
||||
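The provider availability gate and the NumPy pin described in the bullets above could be combined into one provisioning check. This is a minimal sketch: `check_runtime` is a hypothetical helper, and treating `CUDAExecutionProvider` as required alongside `TensorrtExecutionProvider` is an assumption made for the fallback ladder.

```python
REQUIRED_PROVIDERS = ("TensorrtExecutionProvider", "CUDAExecutionProvider")


def check_runtime(available_providers, numpy_version):
    """Return a list of human-readable provisioning problems (empty if OK)."""
    problems = []
    for name in REQUIRED_PROVIDERS:
        if name not in available_providers:
            problems.append(f"missing execution provider: {name}")
    # The JetPack 6 wheels discussed above were built against numpy<2.0.0.
    major = int(numpy_version.split(".")[0])
    if major >= 2:
        problems.append(f"numpy {numpy_version} violates the numpy<2.0.0 pin")
    return problems
```

On the target this would be called as `check_runtime(ort.get_available_providers(), numpy.__version__)`, with provisioning aborted if the returned list is non-empty.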
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: ONNX model graph (any source — PyTorch via `torch.onnx.export`, TensorFlow via `tf2onnx`, vendor-shipped ONNX) + per-session config dict for TRT EP / CUDA EP / CPU EP fallback ladder
|
||||
- outputs: `ort.InferenceSession.run(output_names, input_feed)` — accepts NumPy arrays as input (auto-marshaled to GPU tensors at the EP boundary); per-session subgraph engine cache persisted to disk for fast warm-start
|
||||
- runtime: onnxruntime-gpu (community-maintained Jetson AI Lab build) + JetPack-bundled TensorRT 10.3 + CUDA 12.6 + Python 3.10 on Jetson Orin Nano Super in Super Mode
|
||||
|
||||
**Source**:
|
||||
- Primary API: Source #100 Microsoft ONNX Runtime official documentation (context7 indexed at `/microsoft/onnxruntime` v1.25.0, 1462 code snippets at Benchmark Score 82.23 — highest of the three C7 candidate context7 lookups).
|
||||
- Jetson install path: Source #100 dusty-nv/jetson-containers Issue #1283 + microsoft/onnxruntime Issue #20503 — confirms Jetson AI Lab wheel index as canonical install for JetPack 6.
|
||||
- NumPy incompatibility: Source #100 microsoft/onnxruntime Issue #27562 — onnxruntime-gpu v1.23.0 JetPack 6 wheels built with `numpy<2.0.0`.
|
||||
- Software stack pin: Source #104 — JetPack 6.2 ships TensorRT 10.3, which ORT TRT EP delegates to.
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High for the API capability verification (1462 context7 code snippets at Benchmark Score 82.23); ⚠️ Medium for the deployability claim — community-maintained wheels (Jetson AI Lab) carry slightly higher version-drift risk than JetPack-bundled TensorRT, plus the documented `numpy<2.0.0` pin limits forward-compatibility. Plan-phase Jetson MVE per D-C1-2 + D-C7-3 must validate the exact ORT version + numpy pin.
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C7 row in `../06_component_fit_matrix/C7_inference_runtime.md` (this fact populates the `ONNX Runtime + TensorRT EP` candidate row).
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|---|---|---|
| **R-NEW-2 no cloud at flight** | ✅ PASS | ONNX Runtime + TRT EP runtime is entirely local. |
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ⚠️ PASS-with-Plan-phase-verification | onnxruntime-gpu prebuilt aarch64 wheels NOT published by Microsoft (per Source #100 Issue #20503); canonical install requires Jetson AI Lab community wheel index `pypi.jetson-ai-lab.io/jp6/cu126`. Microsoft Issues acknowledge the gap; community wheels are widely used in the Jetson ecosystem but are NOT officially-supported by Microsoft. Plan-phase Jetson MVE per D-C7-3 must verify that the wheel index is reachable in the project's offline-deployment context (likely requires pre-flight wheel-cache-mirroring). |
| **AC-1.1 (≤80 m at 1 km AGL)** | ✅ PASS | Inference accuracy parity with TensorRT-native (ORT TRT EP delegates to TensorRT for supported subgraphs). |
| **AC-1.2 (≤30 m at 500 m AGL)** | ✅ PASS | Same as above. |
| **AC-3.1 sharp turns ±20° bank** | ✅ PASS | Same deterministic-inference profile as TensorRT-native. |
| **AC-3.2 sharp-turn frames may share <5% overlap** | ✅ PASS | Same as TensorRT-native — quantization-sensitivity is model-family-dependent, not runtime-dependent. |
| **AC-3.3 re-localization stability** | ✅ PASS | Engine-cache deserialization is deterministic; bit-exact reproducibility across runs once engines are warm. First-run subgraph compilation can take 10-300 sec per model (one-time cost; engines persisted to `trt_engine_cache_path`). |
| **AC-3.4 operator re-loc hint** | ✅ PASS | Operator hint affects C5/C6, not C7. |
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | ⚠️ TIGHT-BUT-FITS | After warm cache, per-inference latency is essentially TensorRT-native + a small ORT framework overhead (~1-3 ms per session.run() call for input marshaling and provider dispatch). Per Source #100 ORT provider-cascade behavior, op-level fallback to CUDA EP for unsupported subgraphs adds latency vs pure-TRT — but for canonical CNN VPR backbones (MixVPR/EigenPlaces) and matchers with TRT-supported attention (LightGlue with FlashAttentionV2 plugin per Source #103), full TRT-EP coverage is achievable. Cold-start cost (~10-300 sec for first-run engine build) is paid once per Jetson per model — handled at pre-flight per D-C7-7. |
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | ORT runtime memory: ~30-100 MB framework + ~50-150 MB CUDA EP + per-engine activation memory (delegated to TRT). Peak combined for VPR-engine + matcher-engine + ORT runtime typically ~1-2 GB out of 8 GB shared budget, slightly heavier than TensorRT-native (Fact #94) but within the same order of magnitude. |
| **AC-4.5 look-back refinement** | N/A | Forward-only inference. |
| **AC-8.3 10 GB persistent tile cache budget** | N/A | ORT engine cache (~100-500 MB total across 3-5 models) lives in `trt_engine_cache_path`, not in the tile cache budget. |
| **AC-NEW-3 (FDR)** | ✅ PASS | ORT exposes per-session profiling via `SessionOptions.enable_profiling=True` → `session.end_profiling()` returns a JSON profile file with per-op latency. |
| **AC-NEW-4 covariance honesty** | N/A | Runtime is passive. |
| **AC-NEW-7 cache-poisoning safety** | N/A | Runtime does not write to the tile cache. |
| **AC-NEW-8 blackout failsafe** | ✅ PASS | A runtime crash is caught by the supervising process and triggers C5 demotion. |
|
||||
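For the AC-NEW-3 row, the JSON file returned by `session.end_profiling()` can be reduced to per-provider totals, which also reveals how much work silently fell back from TensorRT EP to CUDA EP or CPU EP. The sketch assumes the Chrome-trace-style layout ORT writes (events with `cat`, `name`, `dur` in microseconds, and `args.provider`) and that kernel events carry a `_kernel_time` name suffix; both should be verified against a real profile file. `summarize_profile` is a hypothetical helper.

```python
import json
from collections import defaultdict


def summarize_profile(profile_json_text):
    """Sum kernel durations (ms) per execution provider from an ORT profile."""
    totals = defaultdict(float)
    for event in json.loads(profile_json_text):
        if event.get("cat") != "Node":
            continue  # skip session-level events such as model loading
        if not event.get("name", "").endswith("_kernel_time"):
            continue  # assumed suffix for kernel-execution events
        provider = event.get("args", {}).get("provider", "unknown")
        totals[provider] += event.get("dur", 0) / 1000.0  # microseconds to ms
    return dict(totals)
```

A non-zero `CPUExecutionProvider` total on a model that is expected to run fully under TensorRT is a useful FDR warning signal.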
|
||||
**Strengths** (positive structural advantages over TensorRT-native):
|
||||
1. **Model-format-agnostic**: ONNX is the de-facto interchange format; PyTorch / TensorFlow / JAX / scikit-learn / vendor models all export to ONNX with high fidelity. Avoids the export friction of pure TensorRT, which historically required framework-specific UFF/Caffe parsers and today still needs an explicit ONNX-to-engine build step.
|
||||
2. **Subgraph fallback to CUDA EP / CPU EP for unsupported ops** — robust to model-architecture additions that TensorRT does not yet support natively (rare custom attention variants, specialized aggregations). TensorRT-native fails to build the engine in these cases; ORT TRT EP gracefully degrades to CUDA EP.
|
||||
3. **Engine-cache integration** — per-subgraph engines are auto-built on first run and persisted; subsequent runs warm-start in <1 sec. Eliminates the explicit `trtexec` build step from the deployment workflow.
|
||||
4. **Cross-architecture portability of the source code** — the same Python inference script runs on the dev machine (CUDA EP only) and on the Jetson (TensorRT EP + CUDA EP); no Jetson-specific code paths required.
|
||||
5. **Active Microsoft maintenance**: context7 v1.25.0 confirmed at Benchmark Score 82.23 (highest of the three C7 candidate lookups); ORT ships regular releases (roughly quarterly in recent years) with NVIDIA-sponsored TRT EP improvements.
|
||||
|
||||
**Negative-but-mitigable structural findings** (over TensorRT-native):
|
||||
6. **Jetson install requires community wheel index** (per Source #100 Issue #20503) — adds an external dependency NOT officially supported by Microsoft. **Mitigation**: pre-flight provisioning mirrors the Jetson AI Lab wheel index to a project-controlled artifact registry (~50 MB per wheel set); offline deployment is then self-contained.
|
||||
7. **NumPy <2.0.0 pin** for onnxruntime-gpu v1.23.0 JetPack 6 wheels (per Source #100 Issue #27562) — restricts forward-compatibility with downstream packages that require NumPy 2.x. **Mitigation**: project requirements file pins `numpy<2.0.0`; track upstream rebuild via `microsoft/onnxruntime` release notes for the version bump that resolves this.
|
||||
8. **Slight runtime overhead vs TensorRT-native** (~1-3 ms per `session.run()` call for input marshaling and provider dispatch): measurable per call but small relative to the ~5-15 ms VPR + ~15-40 ms matcher cost per pair. **Mitigation**: assuming one `session.run()` call per frame, the project's 3 Hz frame rate gives ~3-9 ms of overhead per second; every additional call per frame (for example one per matcher pair) adds another ~1-3 ms, which stays well within the AC-4.1 400 ms p95 budget.
|
||||
9. **First-run subgraph build cost** (10-300 sec per model) — silent at runtime if `trt_engine_cache_path` doesn't exist. **Mitigation**: pre-flight provisioning script builds the cache by running a synthetic warm-up batch through each model; runtime startup then warm-loads in <1 sec.
|
||||
10. **Less direct control over TRT build flags** vs TensorRT-native — ORT TRT EP exposes a curated subset of flags via `tensorrt_options` (`trt_fp16_enable`, `trt_int8_enable`, `trt_max_workspace_size`, etc.); fine-grained per-layer precision policy (e.g., `setPrecision` overrides per node) requires the explicit TensorRT API. **Mitigation**: the curated subset covers the C7 user-pinned scope (`c7_quantization=A`); per-model-family precision policy is captured in D-C7-6 + handled via `trt_int8_enable` per-engine flag toggling.
|
||||
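Finding 9 notes that a missing engine cache fails silently by triggering a long first-run build. A pre-flight script could guard against that with a check along these lines. `engine_cache_is_warm` is a hypothetical helper, and the `*.engine` file suffix is an assumption about how ORT TRT EP names its cached engines that should be confirmed on the target.

```python
from pathlib import Path


def engine_cache_is_warm(cache_dir, min_engines=1):
    """True if cache_dir exists and holds at least min_engines '*.engine' files."""
    path = Path(cache_dir)
    if not path.is_dir():
        return False
    return sum(1 for _ in path.glob("*.engine")) >= min_engines
```

Calling this against `trt_engine_cache_path` before arming lets provisioning fail loudly instead of paying the 10-300 sec build cost in flight.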
|
||||
**Caveats / open Plan-phase decisions raised** (D-C7-N gates):
|
||||
|
||||
- **D-C7-3 NEW (Cand-2 specific)** — ORT-Jetson-wheel-index-pin choice (`pypi.jetson-ai-lab.io/jp6/cu126` for JetPack 6.2 / `pypi.jetson-ai-lab.io/jp6/cu129` for JetPack 6.x with newer CUDA / mirror the wheel index to a project-controlled artifact registry for offline-deployment robustness). **Recommendation**: D-C7-3 = mirror to project artifact registry (~50 MB per wheel set; pre-flight provisioning step) + cu126 variant for JetPack 6.2 alignment.
|
||||
- **D-C7-4 NEW (Cand-2 specific)** — numpy-version-pin choice (`numpy<2.0.0` per Source #100 Issue #27562 / wait for upstream onnxruntime-gpu rebuild against numpy>=2 / pin to a specific onnxruntime-gpu version known to work with numpy<2). **Recommendation**: D-C7-4 = `numpy<2.0.0` until upstream rebuild; track Issue #27562 status at Plan phase.
|
||||
|
||||
---
|
||||
|
||||
### Fact #96 — Pure PyTorch FP16 mandatory simple-baseline: torch.amp.autocast + model.half() + Jetson AI Lab PyTorch 2.x ARM64 wheel
|
||||
|
||||
**Statement**: The pure PyTorch FP16 mandatory simple-baseline candidate for C7 uses PyTorch's native AMP (Automatic Mixed Precision) machinery as the deployment baseline — no ONNX export, no TensorRT engine build, no engine cache. The role is **mandatory simple-baseline** per the engine's Component Option Breadth rule (always have a runnable fallback) and per the user-pinned `c7_breadth=B` scope (TensorRT primary + ONNX Runtime+TRT EP alternate + pure PyTorch FP16 baseline):
|
||||
|
||||
- **`torch.amp.autocast(device_type, dtype, enabled, cache_enabled)`** (canonical AMP context manager since PyTorch 1.10 per Source #101 context7-verified):
|
||||
```python
import torch

# `model` and `input` are assumed to already live on the CUDA device.
with torch.no_grad():
    with torch.autocast(device_type='cuda', dtype=torch.float16, enabled=True):
        output = model(input)
```
|
||||
Auto-selects per-op precision: matmul / conv / linear at FP16; layer-norm / softmax / accumulators stay FP32 for numerical stability.
|
||||
- **`model.half()`**: eager-mode FP16 weight conversion (weights and activations are FP16 end to end; simpler, but loses autocast's per-op precision selection, so numerically sensitive ops such as softmax and layer-norm also run in FP16):
|
||||
```python
model = model.half().cuda().eval()
output = model(input.half().cuda())
```
|
||||
Matches the canonical `model.half()` deployment pattern documented in PyTorch eager-mode FP16 inference recipes.
|
||||
- **`torch.compile(model, backend='inductor')`** — graph-mode optimization for further speedup; tradeoff is cold-start compile cost (~10-60 sec). Per Source #101, `inductor` is the default backend; `cudagraphs` for static-shape inference; `ipex` for Intel CPU. The Jetson Orin Nano Super CUDA path uses `inductor`.
|
||||
- **Install path (Jetson)**: per Source #101 NVIDIA Developer Forum threads, standard `pip install torch` does NOT include CUDA support on Jetson — must use NVIDIA-published or Jetson AI Lab community wheels:
|
||||
- **JetPack 6.2 + CUDA 12.6 + Ubuntu 22.04 + Python 3.10 canonical**: `torch-2.9.0-cp310-cp310-linux_aarch64.whl` from Jetson AI Lab (alternative stable: PyTorch 2.5 + torchvision 0.20).
|
||||
- **CUDA capability**: Jetson Orin Nano Super GPU = **SM 87** (Ampere class). PyTorch wheels must be built against CUDA 12.6 to match JetPack 6.2's CUDA toolchain.
|
||||
- **Known dependency issues**: missing `libcudss.so.0` and `libnvdla_runtime.so` on PyTorch 2.9 cu129 wheel under JetPack 6.2 (CUDA 12.6) — version-mismatch between wheel build target and installed JetPack CUDA. Mitigation: prefer the cu126 variant for JetPack 6.2.
|
||||
|
||||
**Mode pinning** (per-mode API verification rule):
|
||||
- inputs: in-process Python PyTorch model (`torch.nn.Module`) loaded from a checkpoint (`torch.load(path)`) at startup; input tensors as `torch.Tensor` on CUDA device
|
||||
- outputs: forward-pass result tensor in FP16 (autocast) or FP16-end-to-end (`model.half()`); per-frame inference latency in the **15-40 ms range for CNN VPR networks** (extrapolated from the Source #102 YOLOv8s TensorRT FP16 figure of ~9.7 ms on Jetson Orin Nano; PyTorch FP16 is typically ~1.5-2× slower than TensorRT FP16 because it lacks kernel fusion)
|
||||
- runtime: Jetson AI Lab PyTorch 2.5 / 2.9 ARM64 wheel + Python 3.10 + CUDA 12.6 on Jetson Orin Nano Super in Super Mode
|
||||
|
||||
**Source**:
|
||||
- Primary API: Source #101 PyTorch official documentation (context7 indexed at `/pytorch/pytorch` v2.5.1 / v2.8.0 / v2.9.1 / v2.11.0; 4866 code snippets at Benchmark Score 76.69) — confirms `torch.amp.autocast`, `torch.no_grad`, `torch.compile`, `model.half()`.
|
||||
- Jetson install path: Source #101 NVIDIA Developer Forum threads (multiple) — confirms Jetson AI Lab as canonical wheel source for JetPack 6.x; documents `libcudss.so.0` / `libnvdla_runtime.so` dependency issues on cu129 vs cu126 variants.
|
||||
- Latency anchor (relative): Source #102 — pure PyTorch FP16 typically ~1.5-2× slower than TensorRT FP16 at the same workload; extrapolation from YOLOv8s 9.7 ms FP16 TRT → ~15-20 ms pure PyTorch FP16 on Orin Nano Super.
|
||||
|
||||
**Phase**: Mode A Phase 2 — engine Step 3 + Step 7.5 (Component Applicability Gate)
|
||||
|
||||
**Confidence**: ✅ High for the API capability verification (4866 context7 snippets); ⚠️ Medium for the latency claim (extrapolated from YOLO benchmarks; PyTorch eager-mode latency is more variable across model architectures than TensorRT's). Plan-phase Jetson MVE per D-C1-2 produces the actual Pure-PyTorch-FP16 latency numbers per project model.
|
||||
|
||||
**Sub-Question Binding**:
|
||||
- SQ3+SQ4 → C7 row in `../06_component_fit_matrix/C7_inference_runtime.md` (this fact populates the `pure PyTorch FP16` mandatory simple-baseline candidate row).
|
||||
|
||||
**Implication / per-numbered-Restriction × per-numbered-AC sub-matrix**:
|
||||
|
||||
| Project Restriction / AC | Verdict | Evidence |
|---|---|---|
| **R-NEW-2 no cloud at flight** | ✅ PASS | PyTorch runtime is entirely local. |
| **R-NEW-4 Jetson Orin Nano Super JetPack 6 ARM64** | ⚠️ PASS-with-Plan-phase-verification | PyTorch ARM64 wheels not officially distributed by PyTorch Foundation for Jetson; canonical install via Jetson AI Lab community + NVIDIA Developer Forum recommendations. Same community-wheel-index dependency as ORT (Fact #95) but with broader community footprint (PyTorch on Jetson is a well-trodden path). |
| **AC-1.1 (≤80 m at 1 km AGL)** | ✅ PASS | FP16 inference accuracy parity with FP32 is documented at <0.5% delta for CNN backbones; matchers (transformer-class) are FP16-stable at production grade per Source #103 (FP8 caused degradation, but FP16 did not). |
| **AC-1.2 (≤30 m at 500 m AGL)** | ✅ PASS | Same as above. |
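| **AC-3.1 sharp turns ±20° bank** | ✅ PASS | Eager-mode PyTorch inference uses fixed weights, so per-frame results are repeatable; strictly bit-exact output on CUDA additionally requires deterministic kernels (for example `torch.use_deterministic_algorithms(True)`). |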
| **AC-3.2 sharp-turn frames may share <5% overlap** | ✅ PASS | Runtime-agnostic; quantization-sensitivity does not apply to FP16 baseline (only INT8). |
| **AC-3.3 re-localization stability** | ✅ PASS | No engine cache or compilation step; consistent per-frame latency from first frame onward. |
| **AC-3.4 operator re-loc hint** | ✅ PASS | Hint affects C5/C6, not C7. |
| **AC-4.1 latency budget (<400 ms p95 end-to-end)** | ⚠️ TIGHT, likely fails for full pipeline | Pure PyTorch FP16 is ~1.5-2× slower than TensorRT FP16 per Source #102 extrapolation. For VPR (~15-20 ms) + matcher (~30-80 ms per pair × K=10 = 300-800 ms), the matcher cost alone consumes most of the AC-4.1 400 ms p95 budget at the low end of the range and exceeds it at the high end. **Mitigation**: D-C3-3 K-pairs reduction (K=3-5) brings matcher cost to ~90-400 ms, which is tight but possibly within budget for K=3-4 at the cost of recall. **Pure PyTorch FP16 is the FALLBACK runtime, NOT the primary**; the primary is TensorRT (Fact #94). |
| **AC-4.2 memory budget (<8 GB shared on Jetson)** | ✅ PASS | PyTorch runtime: ~500 MB-1 GB framework (CUDA + cuDNN libraries shared with all CUDA processes) + per-model weight memory (~50-300 MB for VPR + ~20-100 MB for LightGlue at 1024 keypoints); peak combined ~1-2 GB out of 8 GB shared budget. |
| **AC-4.5 look-back refinement** | N/A | Forward-only inference. |
| **AC-8.3 10 GB persistent tile cache budget** | N/A | PyTorch checkpoints (~50-300 MB per model × 3-5 models = ~150 MB-1.5 GB total) live in `/var/cache/onboard/checkpoints/`, not in tile cache. |
| **AC-NEW-3 (FDR)** | ✅ PASS | PyTorch has `torch.profiler.profile` for per-op latency profiling; integrates naturally with the FDR data plane. |
| **AC-NEW-4 covariance honesty** | N/A | Runtime is passive. |
| **AC-NEW-7 cache-poisoning safety** | N/A | Runtime does not write to the tile cache. |
| **AC-NEW-8 blackout failsafe** | ✅ PASS | A runtime crash is caught by the supervising process and triggers C5 demotion. |
|
||||
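The AC-4.1 row's arithmetic can be checked directly. The sketch below uses only the per-stage estimates quoted in that row (VPR ~15-20 ms, matcher ~30-80 ms per pair) and counts inference cost alone, so the rest of the pipeline would still have to fit in whatever remains of the 400 ms budget. The function names are illustrative.

```python
BUDGET_MS = 400  # AC-4.1 p95 end-to-end budget
VPR_MS = (15, 20)  # pure PyTorch FP16 VPR estimate (best, worst)
MATCH_MS_PER_PAIR = (30, 80)  # pure PyTorch FP16 matcher estimate per pair


def inference_cost_ms(k_pairs):
    """Return (best, worst) VPR + matcher inference cost in ms for K pairs."""
    best = VPR_MS[0] + k_pairs * MATCH_MS_PER_PAIR[0]
    worst = VPR_MS[1] + k_pairs * MATCH_MS_PER_PAIR[1]
    return best, worst


def largest_k_within_budget(max_k=10):
    """Largest K whose worst-case inference cost stays within the budget."""
    fitting = [k for k in range(1, max_k + 1)
               if inference_cost_ms(k)[1] <= BUDGET_MS]
    return max(fitting) if fitting else 0
```

Under these estimates `inference_cost_ms(10)` spans 315-820 ms, while the worst case stays under budget only up to K=4, which matches the row's "K=3-4" reading.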
|
||||
**Strengths** (positive structural advantages — for the simple-baseline role):
|
||||
1. **Zero export friction** — model is loaded directly from PyTorch checkpoint; no ONNX export, no TensorRT engine build, no engine cache. Fastest path from "model trained" to "model running on Jetson".
|
||||
2. **Trivial debugging** — full PyTorch eager-mode visibility (set breakpoints, inspect intermediate tensors, swap modules at runtime). Critical for the **mandatory simple-baseline** role: when a TensorRT-built engine produces unexpected output, the pure-PyTorch baseline is the reference for accuracy parity verification.
|
||||
3. **Production-mature framework** — PyTorch is the de-facto research and deployment ML framework with daily-active maintenance. Jetson AI Lab wheels track upstream PyTorch releases at ~1-3 month lag.
|
||||
4. **No INT8 calibration required** — FP16 baseline path is calibration-free; runs as soon as the checkpoint is loaded. This is the **fallback path** when INT8 calibration data is unavailable (D-C7-1 not yet resolved).
|
||||
5. **`torch.compile` available** for additional optimization — Inductor backend can close 30-50% of the gap to TensorRT for certain models (per PyTorch Foundation benchmarks); first-call cost is ~10-60 sec vs zero for eager-mode.
|
||||
6. **Same source code on dev machine and Jetson** — fully cross-architecture portable; no separate build step.
|
||||
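The accuracy-parity role in strength 2 amounts to comparing a TensorRT (or ORT) output against the PyTorch reference for the same input. A minimal, dependency-free sketch is shown below; `parity_report` and its tolerances are assumptions, and real descriptors would first be flattened to plain float sequences.

```python
import math


def parity_report(reference, candidate, atol=1e-2, rtol=1e-2):
    """Compare two flat, non-empty output vectors element by element."""
    if len(reference) != len(candidate):
        raise ValueError("output shapes differ")
    pairs = list(zip(reference, candidate))
    max_abs = max(abs(r - c) for r, c in pairs)
    dot = sum(r * c for r, c in pairs)
    norm = (math.sqrt(sum(r * r for r in reference))
            * math.sqrt(sum(c * c for c in candidate)))
    cosine = dot / norm if norm else 1.0
    # Same acceptance rule as a typical allclose check: |r - c| <= atol + rtol * |r|.
    ok = all(abs(r - c) <= atol + rtol * abs(r) for r, c in pairs)
    return {"max_abs_err": max_abs, "cosine": cosine, "ok": ok}
```

Logging the cosine alongside the pass flag helps for VPR descriptors, where retrieval depends on direction more than on absolute magnitude.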
|
||||
**Negative-but-mitigable structural findings**:
|
||||
7. **~1.5-2× slower than TensorRT FP16** at the same workload (per Source #102 extrapolation). Material for the project's tight AC-4.1 budget — **DISQUALIFIES pure PyTorch FP16 from the primary path**, restricts it to the simple-baseline role + dev-machine reference role + emergency-fallback role if TensorRT engine build fails on the deployed Jetson.
|
||||
8. **Jetson AI Lab wheel dependency** — same community-wheel-index concern as Fact #95. **Mitigation**: pre-flight wheel mirror + project-controlled artifact registry.
|
||||
9. **No per-layer precision auto-selection at INT8** — INT8 path requires explicit quantization (e.g., `torch.quantization.quantize_dynamic` or PyTorch FX-graph-mode quantization); these do NOT use TensorRT INT8 calibrators. **Implication**: pure PyTorch INT8 is NOT a project-applicable path (out-of-scope for c7_quantization=A scope which targets TensorRT INT8 calibration); pure PyTorch is **FP16-only baseline** for this project.
|
||||
|
||||
**Caveats / open Plan-phase decisions raised** (D-C7-N gates):
|
||||
|
||||
- **D-C7-5 NEW (Cand-3 specific)** — PyTorch-Jetson-wheel-pin choice (PyTorch 2.5 + torchvision 0.20 stable / PyTorch 2.9 + torchvision latest / track Jetson AI Lab cadence). **Recommendation**: D-C7-5 = PyTorch 2.5 + torchvision 0.20 for the project's first deployment (most-stable combination per NVIDIA Developer Forum); revisit at Plan phase based on Jetson MVE results.
|
||||
|
||||
---
|
||||
|
||||
## C7 — Cross-cutting model-family precision policy (closure of batch 1)
|
||||
|
||||
**The user-pinned `c7_quantization=A` scope is INT8 primary + FP16 fallback per candidate; INT8-only candidates marked Experimental until calibration data exists.** Combining this with Source #103 evidence on feature-matching-network INT8 quantization-sensitivity, the closure recommendation is a **per-model-family precision policy** (D-C7-6):
|
||||
|
||||
| Model family | Project models | Recommended precision (TensorRT-native primary, Fact #94) | Recommended precision (ORT TRT EP alternate, Fact #95) | Recommended precision (PyTorch FP16 baseline, Fact #96) | Rationale |
|---|---|---|---|---|---|
| **VPR backbones (CNN class)** | MixVPR, EigenPlaces, NetVLAD | INT8 + FP16 mixed (FP16 fallback for sensitive layers) | `trt_int8_enable=True` + per-engine calibration cache | FP16 only (no INT8 path) | YOLO-class CNN benchmarks (Source #102) show INT8 is well tolerated at -6.5% mAP50-95; for VPR Recall@K this is expected to translate to an R@1 loss of under 2%, which is acceptable |
| **VPR backbones (ViT-class)** | SelaVPR (DINOv2-L), conditional AnyLoc/BoQ/DINOv2-VLAD | FP16 + Plan-phase D-C2-5 verification | `trt_fp16_enable=True` only; INT8 deferred to Jetson MVE | FP16 only | DINOv2 ViT export to TensorRT INT8 is a Plan-phase gate per D-C2-5; defer INT8 until Jetson MVE confirms acceptable Recall@K loss |
| **Matchers (transformer class)** | LightGlue (with SP / DISK / ALIKED), XFeat, XFeat+LighterGlue | FP16 only (NO INT8) | `trt_fp16_enable=True` only; INT8 explicitly disabled | FP16 only | Source #103 evidence: FP8 (similar dynamic-range reduction to INT8) on LightGlue causes "match counts dropped sometimes hard". Matchers stay FP16 throughout |
| **Learned VIO frontends** (if any selected at C1 closure) | DPVO, learned-front-end VINS | FP16 only initially; INT8 deferred to Jetson MVE per D-C7-2 | FP16 only initially | FP16 only | Insufficient INT8-on-VIO empirical evidence at research time; conservative FP16 default, revisit at Jetson MVE |
|
||||
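For the ORT TRT EP column, the D-C7-6 policy in the table above could be encoded as a lookup that produces the per-engine provider options. This is a sketch under assumptions: the family keys and `trt_ep_options` are hypothetical names, and enabling `trt_fp16_enable` alongside `trt_int8_enable` for CNN-class VPR backbones is an assumed way to keep an FP16 fallback available.

```python
BASE_TRT_OPTIONS = {
    "device_id": 0,
    "trt_max_workspace_size": 1 << 30,  # 1 GB cap per D-C7-8
    "trt_engine_cache_enable": True,
}

# Precision flags per model family, following the D-C7-6 policy table.
PRECISION_POLICY = {
    "vpr_cnn": {"trt_fp16_enable": True, "trt_int8_enable": True},
    "vpr_vit": {"trt_fp16_enable": True, "trt_int8_enable": False},
    "matcher": {"trt_fp16_enable": True, "trt_int8_enable": False},
    "vio_frontend": {"trt_fp16_enable": True, "trt_int8_enable": False},
}


def trt_ep_options(family):
    """Merge the base TRT EP options with the precision flags for one family."""
    if family not in PRECISION_POLICY:
        raise KeyError(f"unknown model family: {family}")
    return {**BASE_TRT_OPTIONS, **PRECISION_POLICY[family]}
```

The resulting dict is what would be passed as the options half of the `('TensorrtExecutionProvider', options)` tuple when creating each model's session.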
|
||||
**Closure verdict (per user's `c7_quantization=A` scope + Source #103 caveat)**:
|
||||
- **TensorRT-native (Fact #94) is RECOMMENDED PRIMARY** for VPR backbones (CNN-class INT8) AND matchers (FP16-only). Matches the user-pinned scope exactly: INT8 primary + FP16 fallback per candidate; matcher-class candidates marked Experimental for INT8 (D-C7-6 pinning FP16 as the matcher's locked precision).
|
||||
- **ONNX Runtime + TensorRT EP (Fact #95) is RECOMMENDED ALTERNATE** for the cross-architecture-portability axis; same precision policy as TensorRT-native. Switch to ORT if model-export friction with TensorRT-native arises.
|
||||
- **Pure PyTorch FP16 (Fact #96) is RECOMMENDED MANDATORY SIMPLE-BASELINE** — required for the engine's Component Option Breadth rule + dev-machine reference parity + emergency-fallback if TensorRT engine build fails on the deployed Jetson. **Pure PyTorch FP16 is NOT eligible for the primary path** due to ~1.5-2× latency penalty vs TensorRT FP16 (per Source #102 extrapolation) which exceeds the AC-4.1 400 ms p95 budget for the full pipeline.
|
||||
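The per-family policy above can be captured as a small lookup so that every engine build reads its precision flags from one place. This is a minimal sketch under stated assumptions: the family keys (`vpr_cnn`, `vpr_vit`, `matcher`, `learned_vio`) and the helper name `trt_ep_options` are hypothetical; only the `trt_fp16_enable` / `trt_int8_enable` option names come from the policy table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecisionPolicy:
    fp16: bool
    int8: bool


# Hypothetical family keys; the precision values mirror the policy table above.
PRECISION_POLICY = {
    "vpr_cnn": PrecisionPolicy(fp16=True, int8=True),       # INT8 primary + FP16 fallback
    "vpr_vit": PrecisionPolicy(fp16=True, int8=False),      # INT8 deferred to Jetson MVE
    "matcher": PrecisionPolicy(fp16=True, int8=False),      # FP16 only, INT8 disabled
    "learned_vio": PrecisionPolicy(fp16=True, int8=False),  # FP16 only initially
}


def trt_ep_options(family: str) -> dict:
    """Return the precision flags for one model family as a provider-options dict."""
    policy = PRECISION_POLICY[family]
    return {"trt_fp16_enable": policy.fp16, "trt_int8_enable": policy.int8}
```

A caller would merge this dict with its other provider options (calibration cache path, engine cache directory) before building the engine; those extra options are deliberately left out of the sketch.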
|
||||
---
|
||||
|
||||
## C7 — Working conclusions and decisions (compounded from Fact #94 + Fact #95 + Fact #96 closures)
|
||||
|
||||
**Selected primary**: **Fact #94 TensorRT native primary** — JetPack-bundled TensorRT 10.3 + IInt8EntropyCalibrator2 + BuilderFlag.FP16+INT8 mixed-precision; per-model-family precision policy per D-C7-6 (VPR INT8+FP16 fallback, matchers FP16-only).
|
||||
|
||||
**Selected alternate**: **Fact #95 ONNX Runtime + TensorRT EP interop alternate** — eligible if cross-architecture portability axis becomes important OR if TensorRT-native model export friction arises; same precision policy as primary.
|
||||
|
||||
**Selected mandatory simple-baseline**: **Fact #96 pure PyTorch FP16** — required for the engine's Component Option Breadth rule; dev-machine reference parity + emergency-fallback role only.
|
||||
|
||||
**Decisions raised (D-C7-N gates)** — see [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md):
|
||||
|
||||
- **D-C7-1** (Fact #94) — calibration-dataset-strategy — **CLOSED IN C7 batch 1 (2026-05-08, per C9 / SQ7 restructure)**: strategy = real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles; specific fixture-file pin delegated to Test Spec (Step 5); synthetic-tile augmentation as documented low-data fallback. No Plan-phase Choose block remains.
|
||||
- **D-C7-2** (Fact #94) — TensorRT mixed-precision flag matrix per model family — RECOMMENDED ladder per D-C7-6 policy
|
||||
- **D-C7-3** (Fact #95) — ORT-Jetson-wheel-index-pin choice — RECOMMENDED mirror to project artifact registry + cu126 variant
|
||||
- **D-C7-4** (Fact #95) — numpy-version-pin choice — RECOMMENDED `numpy<2.0.0` until upstream rebuild
|
||||
- **D-C7-5** (Fact #96) — PyTorch-Jetson-wheel-pin choice — RECOMMENDED PyTorch 2.5 + torchvision 0.20
|
||||
- **D-C7-6** (NEW from C7 batch 1 closure, CROSS-COMPONENT with C2 + C3 + C1) — INT8-vs-FP16-per-model-family-precision-policy — RECOMMENDED per the table in "Cross-cutting model-family precision policy" section above
|
||||
- **D-C7-7** (Fact #94) — engine-build-on-Jetson-vs-prebuilt-engine-shipping strategy — RECOMMENDED build-on-deployed-Jetson at pre-flight + prebuilt fallback
|
||||
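- **D-C7-8** (Fact #94) — builder workspace cap (`config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)` on TensorRT 10.x, which replaces the removed `config.max_workspace_size` attribute) — RECOMMENDED 1 GB safe default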
|
||||
- **D-C7-9** (Fact #94) — TensorRT version pin within JetPack lifecycle — RECOMMENDED lock to JetPack 6.2 + TensorRT 10.3
|
||||
|
||||
**C7 batch 1 closed at 3/N on 2026-05-08**. Subsequent C7 candidates (NVIDIA Triton, NVIDIA DeepStream, CUDA-Python custom kernels) are noted-and-rejected per the user-pinned `c7_overkill_options=A` scope: Triton + DeepStream are server / video-pipeline class with deployment footprints (~500 MB-2 GB) that exceed the project's embedded budget without delivering proportional benefit; CUDA-Python custom kernels would require ~2-4 weeks of CUDA engineering per model with marginal speedup over TensorRT's hardware-aware tactic search. Further candidate evaluation only if Plan-phase Jetson MVE reveals TensorRT-native + ORT TRT EP do not satisfy AC-4.1 latency budget — at which point CUDA-Python custom kernels for the matcher's inner loop become a NEW candidate (separate session).
|
||||
@@ -0,0 +1,277 @@
|
||||
# Fact Cards — C8: MAVLink / MSP2 FC adapter
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Sources logged in [`../01_source_registry/C8_fc_adapter.md`](../01_source_registry/C8_fc_adapter.md). Per-fact mode-pinning in **bold**; per-numbered-Restriction × per-numbered-AC sub-matrix below each Fact's `**Implication**` block where relevant. Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential).
|
||||
>
|
||||
> Index: [`../00_summary.md`](../00_summary.md). Prior cross-cuts: [SQ6 external positioning](SQ6_fc_external_positioning.md) — established the per-FC adapter design at SQ6 closure (Facts #1–#10), which C8 batch 1 candidate rows now operationalize. Sibling component categories: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache](C6_tile_cache_spatial_index.md), [C7 Inference runtime](C7_inference_runtime.md).
|
||||
|
||||
## Scope summary
|
||||
|
||||
C8 batch 1 evaluates THREE candidate adapter implementations after the c8_inav_recovery=B mid-batch correction (preserves locked SQ6 + AC-4.3 + restrictions.md verdict that MSP2_SENSOR_GPS is the iNav primary, with UBX impersonation as comparative-improvement-evaluable alternate; per-FC-breadth narrowest at one ArduPilot candidate + two iNav candidates):
|
||||
|
||||
| # | Candidate | FC | Transport | License | Status (per Fit Matrix) |
|---|---|---|---|---|---|
| 1 | **pymavlink → MAVLink GPS_INPUT (msg 232)** | ArduPilot Plane | MAVLink over UART/USB/UDP | LGPL-3.0 (linkable from Apache-2.0 app per LGPL §6) | Mandatory primary + RECOMMENDED PRIMARY (cooperative-path, SQ6 Fact #1 lead) |
| 2 | **MSP2_SENSOR_GPS (id 7939 / 0x1F03) via Python MSP2 (YAMSPy or INAV-Toolkit msp_v2_encode)** | iNav | MSP V2 over UART/USB | MIT (libraries) | Mandatory primary + RECOMMENDED PRIMARY (cooperative-path, SQ6 Fact #6 lead) |
| 3 | **UBX impersonation via pyubx2 NAV-PVT (forged u-blox frames through standard GPS pipeline)** | iNav | UBX over UART | BSD-3-Clause (pyubx2) | Documentary-evaluable alternate (comparative-improvement assessment vs Cand 2 per user's "significant-improvement-only" bar) |
|
||||
|
||||
---
|
||||
|
||||
## C8 — On-FC adapter
|
||||
|
||||
### Fact #97 — ArduPilot Plane FC adapter primary: pymavlink → MAVLink GPS_INPUT (msg 232) cooperative-path; `GPS1_TYPE = 14` MAVLink + `EK3_SRC1_POSXY = 3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` driver
|
||||
|
||||
- **Statement**: pymavlink (LGPL-3.0, canonical Python MAVLink stack maintained by ArduPilot per Source #106) is the single adapter library for the ArduPilot Plane side. Companion-side canonical send pattern (per pymavlink generated dialect + Source #107 ArduPilot dev docs):
|
||||
```python
from pymavlink import mavutil

# Companion-side MAVLink endpoint; source_component=240 identifies the sender
master = mavutil.mavlink_connection('udpout:127.0.0.1:14550', source_system=1, source_component=240)

# Argument order follows the pymavlink-generated GPS_INPUT encoder
master.mav.gps_input_send(
    time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type,
    lat_deg_e7, lon_deg_e7, alt_m, hdop, vdop, vn_mps, ve_mps, vd_mps,  # GPS_INPUT velocities are in m/s
    speed_accuracy_mps, horiz_accuracy_m, vert_accuracy_m, satellites_visible, yaw_cdeg,
)
```
|
||||
FC-side configuration (per Source #107): `GPS1_TYPE = 14` (MAVLink) is REQUIRED for AP_GPS to instantiate the AP_GPS_MAV driver; `EK3_SRC1_POSXY = 3` (GPS) selects the GPS_INPUT-fed virtual GPS as primary horizontal-position source for EKF3. AP's preferred non-GPS messages are `ODOMETRY` / `VISION_POSITION_ESTIMATE` at ≥4 Hz, but `GPS_INPUT` is the right transport for the project's "WGS84 coordinates as a real-GPS replacement" outcome contract (AC-4.3) AND for the project's `{satellite_anchored, visual_propagated, dead_reckoned}` source-label scheme (per SQ6 Fact #4: ODOMETRY-velocity-only is NOT supported in current AP, so `visual_propagated` cannot ride ODOMETRY — must be GPS_INPUT with widened `horiz_accuracy`).
|
||||
- **Mode pinning**: `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` per pymavlink generated dialect (verified via SQ6 Source #4 AP_GPS_MAV.cpp ingestion path Fact #1).
|
||||
- **Source**: Source #106 (pymavlink context7 + GitHub); Source #107 (ArduPilot Plane Non-GPS Position Estimation + GPS_INPUT MAVProxy module dev docs); cross-cite SQ6 Source #4 (AP_GPS_MAV.cpp master) + SQ6 Fact #1 + SQ6 Fact #2 + SQ6 Fact #3 + SQ6 Fact #4
|
||||
- **Phase**: Phase 2
|
||||
- **Confidence**: ✅
|
||||
- **Sub-Question Binding**: SQ3 + SQ4 (per-component candidate selection for C8); SQ6 (per-FC inbound transport — already closed)
|
||||
- **Related Dimension**: C8, C5 (covariance contract via `horiz_accuracy/vert_accuracy/speed_accuracy`), AC-NEW-2 (FC-side EKF source-set switch via `MAV_CMD_SET_EKF_SOURCE_SET`, SQ6 Fact #3)
|
||||
- **Implication**: **supports selection** — Cand 1 (pymavlink → GPS_INPUT) satisfies AC-4.3 ArduPilot side; covariance honesty (AC-NEW-4) is wired through three fields (`horiz_accuracy`, `vert_accuracy`, `speed_accuracy`); spoof-promotion (AC-NEW-2) is companion-driven via `MAV_CMD_SET_EKF_SOURCE_SET`; visual-blackout failsafe (AC-NEW-8) maps directly to `fix_type` 0/1/2 + `horiz_accuracy = 999.0` sentinel per AP convention; source-label semantics (AC-1.4) emit out-of-band via `STATUSTEXT` / `NAMED_VALUE_FLOAT` per locked AC-4.3 wording.
|
||||
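The GPS_INPUT field preparation described in Fact #97 can be sketched in plain Python, independent of pymavlink. The sketch below derives `time_week` / `time_week_ms` from a UTC timestamp and applies the blackout mapping (`fix_type` 0 plus the `horiz_accuracy = 999.0` sentinel). The function names, the `blackout` flag, the placeholder `hdop` / `vdop` / `satellites_visible` values, and the hard-coded 18 s GPS-UTC offset are assumptions made for illustration, not project decisions.

```python
from datetime import datetime, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_LEAP_SECONDS = 18          # assumption: offset in effect since 2017
MS_PER_WEEK = 7 * 24 * 3600 * 1000
BLACKOUT_HORIZ_ACCURACY_M = 999.0  # sentinel named in the Implication above


def gps_week_and_ms(utc: datetime) -> tuple:
    """Convert a timezone-aware UTC datetime to (GPS week, milliseconds into week)."""
    delta = utc - GPS_EPOCH
    total_ms = (delta.days * 86_400 + delta.seconds + GPS_UTC_LEAP_SECONDS) * 1000
    total_ms += delta.microseconds // 1000
    return divmod(total_ms, MS_PER_WEEK)


def gps_input_fields(utc, lat_deg, lon_deg, alt_m, vel_ned_mps,
                     speed_acc_mps, horiz_acc_m, vert_acc_m, blackout=False):
    """Build the 19 GPS_INPUT arguments in encoder order (hypothetical helper)."""
    week, week_ms = gps_week_and_ms(utc)
    vn, ve, vd = vel_ned_mps
    return (
        int(utc.timestamp() * 1e6),   # time_usec
        0,                            # gps_id
        0,                            # ignore_flags: nothing ignored
        week_ms, week,                # time_week_ms, time_week
        0 if blackout else 3,         # fix_type: no fix during blackout, else 3D
        round(lat_deg * 1e7), round(lon_deg * 1e7), alt_m,
        1.0, 1.0,                     # hdop, vdop: placeholder values
        vn, ve, vd,                   # NED velocity in m/s
        speed_acc_mps,
        BLACKOUT_HORIZ_ACCURACY_M if blackout else horiz_acc_m,
        vert_acc_m,
        10,                           # satellites_visible: placeholder value
        0,                            # yaw: 0 means not provided
    )
```

If the tuple order is kept aligned with the generated encoder, it could be passed as `master.mav.gps_input_send(*fields)`; the 2D-fix degraded mode is left out because the label-to-`fix_type` mapping is not pinned in this section.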
|
||||
#### Per-numbered-Restriction × per-numbered-AC sub-matrix (Cand 1: pymavlink → GPS_INPUT, ArduPilot Plane)
|
||||
|
||||
| Numbered AC / Restriction | Cand 1 (pymavlink → GPS_INPUT) verdict | Justification |
|---|---|---|
| AC-1.4 (95% covariance + source label) | **Pass** | `horiz_accuracy` = 95% covariance proxy; source label rides STATUSTEXT/NAMED_VALUE_FLOAT per AC-4.3 |
| AC-4.1 (≤400 ms p95 frame latency) | **Pass** | pymavlink Python encoding overhead is <1 ms per packet on Jetson Orin Nano Super CPU; UDP/UART transmit is sub-ms |
| AC-4.2 (<8 GB shared memory) | **Pass** | pymavlink runtime footprint is ~5-10 MB Python heap |
| AC-4.3 (FC output contract) | **Pass** | GPS_INPUT is exactly the locked AC-4.3 ArduPilot transport |
| AC-4.4 (frame-by-frame streaming) | **Pass** | pymavlink supports unbuffered immediate send |
| AC-4.5 (look-back refinement) | **N/A** | Adapter is downstream of estimator; no smoothing here |
| AC-6.1 (1-2 Hz GCS downsample) | **Pass** | Companion can throttle GPS_INPUT to FC at 1-3 Hz (AP samples at its own rate) |
| AC-NEW-1 (TTFF <30 s) | **Pass** | First valid GPS_INPUT frame is sent as soon as the estimator publishes an anchored fix |
| AC-NEW-2 (<3 s spoof promotion) | **Verify** | `MAV_CMD_SET_EKF_SOURCE_SET` round-trip latency under load — SITL validation per AC-NEW-2.Validation |
| AC-NEW-3 (FDR retains all emitted frames) | **Pass** | Companion-side raw MAVLink stream (tlog) capture is trivial via pymavlink |
| AC-NEW-4 (false-position safety budget) | **Pass IF covariance honest** | Project must publish honest `horiz_accuracy` (under-reporting defeats EKF3 quality chain per SQ6 Fact #2) |
| AC-NEW-7 (no covert GPS spoofing without consent) | **Pass** | GPS_INPUT is the documented external-positioning channel; not covert |
| AC-NEW-8 (visual-blackout failsafe) | **Pass** | Maps to `fix_type` 0/1/2 + `horiz_accuracy=999.0` sentinel per AP convention |
| Restriction "Supported FCs: ArduPilot Plane, iNav" | **Pass** for AP side | Cand 1 covers AP path only; iNav covered by Cand 2/3 |
| Restriction "Communication protocol per-FC: MAVLink for AP" | **Pass** | Exact match |
| LGPL-3.0 license posture (pymavlink) | **Pass** | LGPL §6 allows linking from an Apache-2.0 app; the project uses pymavlink unmodified, so the obligation to publish modifications does not arise; fully dual-use compatible |
|
||||
|
||||
### Fact #98 — iNav FC adapter alternate: UBX impersonation via pyubx2 NAV-PVT (forging u-blox frames through standard GPS pipeline) — viability gated by iNav `gpsMapFixType()` validation: must set `flags & 0x01 (gnssFixOK) = 1` AND `fixType ∈ {2, 3}`
|
||||
|
||||
- **Statement**: pyubx2 (BSD-3-Clause, canonical Python UBX/NMEA/RTCM3 parser per Source #108) supports `UBXMessage(ubxClass='NAV', ubxID='NAV-PVT', msgmode=GET, **kwargs)` constructor with full per-attribute control, plus `serialize()` for wire-format output (sync-bytes 0xB5 0x62 + class + ID + length + payload + 2-byte 8-bit Fletcher checksum). Companion-side canonical send pattern:
|
||||
```python
from pyubx2 import UBXMessage, GET  # parsebitfield is a UBXMessage keyword argument, not an importable name

# Bitfields are passed as individual flags because pyubx2 parses bitfields by default;
# fields left unset (e.g. the flags2 bits) default to 0.
# The integer scaling below assumes raw (unscaled) field values; check how the pinned
# pyubx2 release handles scale factors before relying on it.
msg = UBXMessage(
    'NAV', 'NAV-PVT', GET,
    iTOW=ms_of_week,
    year=2026, month=5, day=8, hour=12, min=30, sec=0,
    validDate=1, validTime=1, fullyResolved=1,  # bits 0-2 of the `valid` bitfield
    tAcc=10_000_000, nano=0,
    fixType=3,    # FIX_3D — required for iNav gpsMapFixType to return GPS_FIX_3D
    gnssFixOk=1,  # bit 0 of `flags` — required for fix_status & NAV_STATUS_FIX_VALID
    numSV=12,
    lon=int(lon_deg * 1e7), lat=int(lat_deg * 1e7),
    height=int(alt_m * 1000), hMSL=int(alt_m * 1000),
    hAcc=int(horiz_acc_m * 1000), vAcc=int(vert_acc_m * 1000),
    velN=int(vn_mps * 1000), velE=int(ve_mps * 1000), velD=int(vd_mps * 1000),
    gSpeed=int(speed_2d_mps * 1000),
    headMot=int(heading_deg * 1e5),
    sAcc=int(speed_acc_mps * 1000), headAcc=int(heading_acc_deg * 1e5),
    pDOP=int(pdop * 100),
    headVeh=0, magDec=0, magAcc=0,
)
serial_out.write(msg.serialize())
```
|
||||
iNav-side validation logic (per Source #110 `gps_ublox.c` direct read at line 654 + line 215-220):
|
||||
```c
// Line 654 (NAV-PVT path):
next_fix_type = gpsMapFixType(_buffer.pvt.fix_status & NAV_STATUS_FIX_VALID, _buffer.pvt.fix_type);

// Line 215-220 (validation gate):
static gpsFixType_e gpsMapFixType(bool fixValid, uint8_t ubloxFixType) {
    if (fixValid && ubloxFixType == FIX_2D) return GPS_FIX_2D;
    if (fixValid && ubloxFixType == FIX_3D) return GPS_FIX_3D;
    return GPS_NO_FIX;
}
```
|
||||
Two validation requirements together: (a) `_buffer.pvt.fix_status & NAV_STATUS_FIX_VALID` evaluates `flags & 0x01` (= `gnssFixOK` bit) — must be 1; (b) `_buffer.pvt.fix_type` must be `FIX_2D = 2` or `FIX_3D = 3`. iNav 9.0+ at u-blox version ≥ 15.0 configures NAV-PVT-only protocol (per Source #110 lines 1024-1028) — companion must advertise version ≥ 15.0 in its MON-VER response (CLASS=0x0A, ID=0x04) at startup to drive iNav into the simpler protocol surface.
|
||||
- **Mode pinning**: `pyubx2.UBXMessage('NAV', 'NAV-PVT', GET, **kwargs).serialize()` produces wire-format bytes for direct UART write to iNav's GPS port; companion is the sole GPS source (SQ6 Fact #7 — iNav has no dual-GPS arbitration).
|
||||
- **Source**: Source #108 (pyubx2 context7 + canonical README); Source #109 (u-blox NEO-M9N + M8 NAV-PVT canonical specifications); Source #110 (iNav `gps_ublox.c` master validation gates); cross-cite SQ6 Fact #10 (UBX-only over UART; NMEA dropped in 7.0; UBX ≥ 15.00 in 9.0+) + SQ6 Fact #7 (single-GPS architecture)
|
||||
- **Phase**: Phase 2
|
||||
- **Confidence**: ✅
|
||||
- **Sub-Question Binding**: SQ3 + SQ4 (per-component candidate selection for C8); SQ6 (UBX emulation alternate, Fact #10)
|
||||
- **Related Dimension**: C8, C5 (covariance contract via NAV-PVT `hAcc/vAcc/sAcc`), AC-NEW-2 (no FC-side switch needed — companion is sole GPS), AC-NEW-7 (UBX impersonation IS a forgery operation; safety implication)
|
||||
- **Implication**: **viable alternate, comparative-improvement gate against Cand 2** — UBX path bypasses MSP2 queueing/arbitration concerns (companion appears as a normal u-blox receiver to iNav's stock GPS pipeline) AND requires no `USE_GPS_PROTO_MSP` build flag. Trade-offs: (a) implementation cost — companion must implement a fuller protocol surface (NAV-PVT periodic + MON-VER response on startup + correct ACK/NAK behaviour for CFG-MSG/CFG-RATE polls) vs MSP2_SENSOR_GPS which is a single periodic injection message; (b) iNav-firmware-side validation contract is stricter (`gpsMapFixType()` + 100-200 lines of stateful u-blox protocol handling vs `mspGPSReceiveNewData()` direct passthrough); (c) AC-NEW-7 nuance — UBX impersonation is a clearer forgery posture (companion is pretending to be a u-blox receiver) than MSP2_SENSOR_GPS (companion is using a documented sensor-injection path); (d) per user's "significant-improvement-only" bar (carried from C6 closure precedent), the Plan-phase comparative verdict needs to weigh: does UBX add material value over MSP2_SENSOR_GPS to justify the implementation cost + AC-NEW-7 nuance?
|
||||
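The wire format and the iNav gate described in Fact #98 can be checked without any library. The following is a minimal sketch, assuming hypothetical helper names (`ubx_checksum`, `ubx_frame`, `inav_accepts_fix`); it frames a UBX message with the 8-bit Fletcher checksum and mirrors the two-condition `gpsMapFixType()` gate in Python so that companion-side tests can assert a frame would be accepted.

```python
UBX_SYNC = b"\xb5\x62"


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, ID, length and payload (returns CK_A, CK_B)."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class: int, msg_id: int, payload: bytes = b"") -> bytes:
    """Wrap a payload in the UBX envelope: sync bytes, class, ID, LE length, payload, checksum."""
    body = bytes((msg_class, msg_id)) + len(payload).to_bytes(2, "little") + payload
    return UBX_SYNC + body + ubx_checksum(body)


def inav_accepts_fix(flags: int, fix_type: int) -> bool:
    """Python mirror of the gate quoted above: gnssFixOK bit set AND fixType is 2D or 3D."""
    return bool(flags & 0x01) and fix_type in (2, 3)
```

For example, `ubx_frame(0x0A, 0x04)` yields the 8-byte empty-payload poll for class 0x0A / ID 0x04, and a 92-byte NAV-PVT payload produces a 100-byte frame. Comparing `ubx_frame` output against `pyubx2`'s `serialize()` for the same payload would be a cheap cross-check at design time.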
|
||||
#### Per-numbered-Restriction × per-numbered-AC sub-matrix (Cand 3: UBX impersonation via pyubx2 NAV-PVT, iNav)
|
||||
|
||||
| Numbered AC / Restriction | Cand 3 (UBX impersonation) verdict | Justification |
|---|---|---|
| AC-1.4 (95% covariance + source label) | **Pass** | NAV-PVT `hAcc`/`vAcc`/`sAcc` carry covariance proxies; source label rides separate MSP2_DEBUG_MSG / TextMessage off-band channel (UBX has no equivalent of MAVLink STATUSTEXT — must use a sibling iNav telemetry channel) |
| AC-4.1 (≤400 ms p95 frame latency) | **Pass** | pyubx2 serialization overhead is <1 ms per packet; UART transmit at 115200+ baud is sub-ms |
| AC-4.2 (<8 GB shared memory) | **Pass** | pyubx2 runtime footprint is ~5-10 MB Python heap |
| AC-4.3 (FC output contract) | **Pass** (UBX is iNav's documented GPS protocol) | NAV-PVT through standard GPS pipeline IS a documented external-positioning interface; AC-4.3 wording mentions MSP2_SENSOR_GPS as primary but does not exclude UBX-emulation alternate |
| AC-4.4 (frame-by-frame streaming) | **Pass** | NAV-PVT streamed at companion's chosen rate (5-10 Hz typical) |
| AC-4.5 (look-back refinement) | **N/A** | Adapter is downstream of estimator |
| AC-6.1 (1-2 Hz GCS downsample) | **N/A** for UBX path (GCS sees iNav's MAVLink outbound, not UBX inbound) | iNav still emits MAVLink telemetry to GCS regardless of UBX vs MSP2 inbound choice |
| AC-NEW-1 (TTFF <30 s) | **Pass** | First valid NAV-PVT frame is sent as soon as estimator publishes anchored fix |
| AC-NEW-2 (<3 s spoof promotion) | **Pass by architecture** | Companion is sole iNav GPS; no FC-side switch needed (per SQ6 Fact #7) |
| AC-NEW-3 (FDR retains all emitted frames) | **Pass** | Companion-side raw UBX stream capture is trivial |
| AC-NEW-4 (false-position safety budget) | **Verify** | Need to confirm iNav nav-stack actually USES NAV-PVT `hAcc/vAcc` for outlier handling (the SQ6 Fact #6 "iNav explicitly does NOT validate GPS for spoofing" caveat applies symmetrically to UBX path — companion-side honesty is mandatory because iNav-side rejection chain is minimal) |
| AC-NEW-7 (no covert spoofing without consent) | **Verify** | UBX impersonation IS a forgery posture; project must explicitly document this in the FDR audit trail (mitigates by being a documented project design, but the impersonation framing is unambiguous) |
| AC-NEW-8 (visual-blackout failsafe) | **Pass** | NAV-PVT `fixType` enum carries graceful degrade: 0=NoFix / 1=DeadReck / 2=2D / 3=3D / 4=GNSS+DR / 5=TimeOnly; companion can emit `fixType=0` for blackout-no-fix or `fixType=2` (2D) for degraded-covariance mode |
| Restriction "Supported FCs: ArduPilot Plane, iNav" | **Pass** for iNav side | Cand 3 covers iNav path only |
| Restriction "Communication protocol per-FC: MSP2 for iNav" | **Verify (alternate)** | Locked restrictions.md says MSP2; UBX is documented in SQ6 Fact #10 as fallback, not primary. Plan-phase decision (D-C8-N) chooses between MSP2 primary or UBX primary based on comparative verdict |
| BSD-3-Clause license posture (pyubx2) | **Pass** | Clean dual-use compatible |
|
||||
|
||||
### Fact #99 — iNav FC adapter primary: MSP2_SENSOR_GPS (id 7939 / 0x1F03) via Python MSP V2 implementation (YAMSPy or INAV-Toolkit `msp_v2_encode`) — `mspGPSReceiveNewData()` direct passthrough; covariance fields `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy`/`vert_accuracy`/`speed_accuracy`
|
||||
|
||||
- **Statement**: MSP2_SENSOR_GPS (id 7939 / 0x1F03 — verified in iNav `msp_protocol_v2_sensor.h` master per Source #113) is iNav's documented sensor-plugin GPS injection path. Per Source #111 master `docs/development/msp/README.md` lines 2999-3031: payload is 52 bytes (the sum of the field widths that follow) `instance/u8 + gpsWeek/u16 + msTOW/u32 + fixType/u8 + satellitesInView/u8 + hPosAccuracy/u16(mm) + vPosAccuracy/u16(mm) + hVelAccuracy/u16(cm/s) + hdop/u16 + longitude/i32(deg×1e7) + latitude/i32(deg×1e7) + mslAltitude/i32(cm) + nedVelNorth/i32(cm/s) + nedVelEast/i32(cm/s) + nedVelDown/i32(cm/s) + groundCourse/u16(deg×100) + trueYaw/u16(deg×100, 65535=unavailable) + year/u16 + month/u8 + day/u8 + hour/u8 + min/u8 + sec/u8`. iNav-side: `mspGPSReceiveNewData()` is called with no return value — direct passthrough to `gpsSol` (per Source #111 Notes block), NO additional validation gate beyond the data parse itself (contrast with UBX path's `gpsMapFixType()` validation). Required iNav build flag: `USE_GPS_PROTO_MSP` — **enabled by default in `target/common.h`** per SQ6 Source #13 (so stock firmware reaches this path).
|
||||
|
||||
Companion-side canonical send pattern using INAV-Toolkit primitives (per Source #112):
|
||||
```python
import struct

from inav_msp import msp_v2_encode  # CRC-8 DVB-S2 envelope encoder

MSP2_SENSOR_GPS = 0x1F03

payload = struct.pack(
    '<BHIBBHHHHiiiiiiHHHBBBBB',  # 52 bytes; lon/lat/alt and NED velocities are signed i32
    instance, gps_week, ms_tow, fix_type, sats_in_view,
    h_pos_acc_mm, v_pos_acc_mm, h_vel_acc_cmps, hdop_x100,
    lon_deg_e7, lat_deg_e7, msl_alt_cm,
    ned_n_cmps, ned_e_cmps, ned_d_cmps,
    ground_course_deg_x100, true_yaw_deg_x100,
    year, month, day, hour, min_, sec,
)
frame = msp_v2_encode(MSP2_SENSOR_GPS, payload)  # MSP V2 envelope: 0x24 'X' 0x3C flag cmd_le len_le payload crc8_dvb_s2
serial_out.write(frame)
```
|
||||
Two community-maintained Python options: **YAMSPy** (MIT, community-blessed in iNav Issue #4465 for external-device communication) OR **INAV-Toolkit `inav_msp.py`** (MIT, 951-line direct primary-source reference for `msp_v2_encode` / `msp_v2_decode` with CRC-8 DVB-S2). Either route avoids the UBX-protocol-handler complexity required for Cand 3.
|
||||
- **Mode pinning**: `msp_v2_encode(MSP2_SENSOR_GPS=0x1F03, payload)` + `serial_out.write(frame)` per iNav MSP V2 envelope spec; iNav-side `mspGPSReceiveNewData()` direct passthrough per Source #111 Notes.
|
||||
- **Source**: Source #111 (iNav MSP message reference master, MSP2_SENSOR_GPS canonical payload spec); Source #112 (YAMSPy + INAV-Toolkit Python implementations); Source #113 (iNav `msp_protocol_v2_sensor.h` MSP V2 sensor-message-range definition); cross-cite SQ6 Source #12 (master MSP message reference) + SQ6 Source #13 (`USE_GPS_PROTO_MSP` enabled by default) + SQ6 Fact #6 (MSP2_SENSOR_GPS covariance-rich path)
|
||||
- **Phase**: Phase 2
|
||||
- **Confidence**: ✅ for protocol spec; ⚠️ for community-library version-stability (YAMSPy / INAV-Toolkit may need extension or thin-custom-encoder replacement at design phase if MSP V2 sensor-message-range support lags upstream iNav)
|
||||
- **Sub-Question Binding**: SQ3 + SQ4 (per-component candidate selection for C8); SQ6 (per-FC inbound transport, Fact #6 — already SQ6 Selected lead)
|
||||
- **Related Dimension**: C8, C5 (covariance contract via `hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy`), AC-NEW-2 (no FC-side switch needed — companion is sole GPS), AC-NEW-4 (covariance honesty is companion-side responsibility per SQ6 Fact #8)
|
||||
- **Implication**: **supports selection** — Cand 2 (MSP2_SENSOR_GPS via Python MSP V2) is the SQ6 closure lead AND is the locked AC-4.3 transport for iNav; covariance honesty (AC-NEW-4) is wired through three fields aligning DIRECTLY with AP `GPS_INPUT.horiz_accuracy/vert_accuracy/speed_accuracy` — same companion-side covariance contract for both FCs. AC-NEW-7 framing: MSP2_SENSOR_GPS is a documented sensor-injection path, NOT an impersonation — clearer audit-trail posture than Cand 3 UBX. AC-NEW-8 graceful-degrade: `fixType` enum carries 6 levels (per `gpsFixType_e`) — companion can emit `GPS_NO_FIX` (0) for blackout-no-fix, `GPS_FIX_2D` (1) for degraded-covariance >100 m mode. Single-message contract = simpler than UBX's NAV-PVT + MON-VER + CFG-* protocol surface.
|
||||
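The MSP V2 envelope named in Fact #99 is small enough to sketch directly, which is useful as a reference when validating whichever library is chosen. The block below is a minimal sketch, not the YAMSPy or INAV-Toolkit implementation: `crc8_dvb_s2` and `msp_v2_frame` are hypothetical names, and the frame layout follows the envelope comment in the send pattern above (`$X<`, flag, little-endian function ID, little-endian length, payload, CRC over everything after the three header bytes).

```python
import struct

MSP2_SENSOR_GPS = 0x1F03
# Field layout from the Statement above; signed i32 for lon/lat/alt and NED velocities.
SENSOR_GPS_FORMAT = "<BHIBBHHHHiiiiiiHHHBBBBB"


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2: polynomial 0xD5, initial value 0, no reflection."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def msp_v2_frame(function_id: int, payload: bytes, flag: int = 0) -> bytes:
    """Build an MSP V2 request frame; the CRC covers flag, function ID, length and payload."""
    body = struct.pack("<BHH", flag, function_id, len(payload)) + payload
    return b"$X<" + body + bytes((crc8_dvb_s2(body),))
```

With a payload packed using `SENSOR_GPS_FORMAT`, the resulting frame is 61 bytes (3 header + 5 envelope + 52 payload + 1 CRC). Byte-for-byte comparison against the chosen library's encoder for the same payload would confirm that the library handles the sensor-message range correctly.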
|
||||
#### Per-numbered-Restriction × per-numbered-AC sub-matrix (Cand 2: MSP2_SENSOR_GPS via Python MSP V2, iNav)
|
||||
|
||||
| Numbered AC / Restriction | Cand 2 (MSP2_SENSOR_GPS) verdict | Justification |
|---|---|---|
| AC-1.4 (95% covariance + source label) | **Pass** | `hPosAccuracy` = 95% covariance proxy; source label rides separate MSP2 telemetry channel (e.g. MSP2_SENSOR_RANGEFINDER spare bytes or a custom MSP2_INAV_DEBUG variant) |
| AC-4.1 (≤400 ms p95 frame latency) | **Pass** | Python `struct.pack` + `msp_v2_encode` overhead is <1 ms per frame on Jetson |
| AC-4.2 (<8 GB shared memory) | **Pass** | YAMSPy or INAV-Toolkit runtime footprint is ~5-10 MB Python heap |
| AC-4.3 (FC output contract) | **Pass** | MSP2_SENSOR_GPS is exactly the locked AC-4.3 iNav transport |
| AC-4.4 (frame-by-frame streaming) | **Pass** | MSP2 supports periodic injection at companion's chosen rate (5-10 Hz typical) |
| AC-4.5 (look-back refinement) | **N/A** | Adapter is downstream of estimator |
| AC-6.1 (1-2 Hz GCS downsample) | **N/A** for MSP2 path (GCS sees iNav's MAVLink outbound, not MSP2 inbound) | iNav still emits MAVLink telemetry to GCS regardless of MSP2 vs UBX inbound choice |
| AC-NEW-1 (TTFF <30 s) | **Pass** | First valid MSP2_SENSOR_GPS frame is sent as soon as estimator publishes anchored fix |
| AC-NEW-2 (<3 s spoof promotion) | **Pass by architecture** | Companion is sole iNav GPS; no FC-side switch needed (per SQ6 Fact #7) |
| AC-NEW-3 (FDR retains all emitted frames) | **Pass** | Companion-side raw MSP V2 stream capture is trivial |
| AC-NEW-4 (false-position safety budget) | **Verify** | Need to confirm iNav nav-stack actually USES `hPosAccuracy/vPosAccuracy/hVelAccuracy` for outlier handling per SQ6 Fact #6 + SQ6 Fact #8 — design-phase task to read `src/main/io/gps_msp.c` `mspGPSReceiveNewData()` body |
| AC-NEW-7 (no covert spoofing without consent) | **Pass** | MSP2_SENSOR_GPS is the documented sensor-injection path; not covert/forgery |
| AC-NEW-8 (visual-blackout failsafe) | **Pass** | `fixType` enum (`gpsFixType_e`) carries graceful degrade levels; companion can emit `GPS_NO_FIX` (0) or `GPS_FIX_2D` (1) for the covariance>100 m / blackout thresholds |
| Restriction "Supported FCs: ArduPilot Plane, iNav" | **Pass** for iNav side | Cand 2 covers iNav path only |
| Restriction "Communication protocol per-FC: MSP2 for iNav" | **Pass** | Exact match — locked SQ6 + AC-4.3 + restrictions.md |
| MIT license posture (YAMSPy + INAV-Toolkit) | **Pass** | Clean dual-use compatible |
|
||||
|
||||
---
|
||||
|
||||
## C8 — Cand 3 vs Cand 2 comparative-improvement verdict (closure of batch 1, iNav side)
|
||||
|
||||
Per user's session-start "significant-improvement-only" bar (same calibration as C6 closure verdict that locked Cand 1 PostgreSQL+btree+FAISS as primary over Cand 2 PostGIS+pgvector secondary):
|
||||
|
||||
| Lever | Cand 2 (MSP2_SENSOR_GPS) | Cand 3 (UBX impersonation) | Material improvement of Cand 3 over Cand 2? |
|---|---|---|---|
| Wire format complexity | Single MSP2 envelope + 52-byte payload + CRC-8 DVB-S2 | NAV-PVT (92 bytes) + MON-VER startup + CFG-MSG/CFG-RATE ACK behaviour | **Cand 3 ADDS complexity (negative)** |
| Protocol-surface footprint | One message ID (0x1F03) | NAV-PVT + MON-VER (CLASS=0x0A,ID=0x04) + ACK/NAK protocol | **Cand 3 ADDS surface (negative)** |
| iNav-side validation gate | `mspGPSReceiveNewData()` direct passthrough | `gpsMapFixType()` requires `flags & 0x01 = 1` AND `fixType ∈ {2,3}` | **Cand 3 ADDS validation gate (mixed: stricter = more brittle to companion bugs, but also catches malformed frames earlier)** |
| Covariance-honesty contract | `hPosAccuracy/vPosAccuracy/hVelAccuracy` aligned with AP `GPS_INPUT.horiz_accuracy/vert_accuracy/speed_accuracy` | NAV-PVT `hAcc/vAcc/sAcc` aligned with same | **Tie** |
| AC-NEW-7 audit-trail posture | Documented sensor-injection path (clean) | Forgery posture (companion impersonates u-blox receiver) | **Cand 3 WORSE for AC-NEW-7** |
| Dependency on iNav build flags | `USE_GPS_PROTO_MSP` (enabled by default) | None (UBX path always available) | **Cand 3 marginally better — no dependency on a build flag** |
| Library maturity | YAMSPy / INAV-Toolkit (community, MIT, ~951-line reference impl); ⚠️ may need extension for MSP2 sensor-message-range | pyubx2 (canonical, BSD-3-Clause, daily-active, 139+239 context7 code snippets) | **Cand 3 has more mature library** |
| AC-NEW-2 architectural fit | Pass-by-architecture (companion is sole GPS) | Pass-by-architecture (same) | **Tie** |
| AC-NEW-8 graceful-degrade | `gpsFixType_e` 6-level enum | NAV-PVT `fixType` 6-level enum | **Tie** |
| Cross-FC consistency with AP path | MSP2 ≠ MAVLink — different protocol on the wire, but same logical companion-side covariance contract | UBX ≠ MAVLink — same | **Tie** |
|
||||
|
||||
**Verdict (closure)**: Cand 3 (UBX impersonation) does NOT clear the user's "significant-improvement-only" bar over Cand 2 (MSP2_SENSOR_GPS). UBX's sole real upside is library maturity (pyubx2 vs YAMSPy/INAV-Toolkit) — but YAMSPy + INAV-Toolkit are MIT-clean and the canonical msp_v2_encode primitive is well-documented (951 lines of primary-source reference in INAV-Toolkit). Cand 3's downsides (added protocol-surface complexity + AC-NEW-7 forgery posture + stricter validation gate) outweigh the upside.
|
||||
|
||||
**Recommendation**: Cand 2 (MSP2_SENSOR_GPS) is **RECOMMENDED PRIMARY** for the iNav side; Cand 3 (UBX impersonation) is **DEFERRED secondary** with explicit re-evaluation criteria — promote to primary IF (a) YAMSPy + INAV-Toolkit prove insufficient at Plan-phase MSP V2 sensor-message-range support and project chooses NOT to extend them, OR (b) Plan-phase iNav MVE reveals that `mspGPSReceiveNewData()` does NOT use the covariance fields per AC-NEW-4 verify-cell and the project needs the stricter `gpsMapFixType()` validation contract for runtime sanity-checking, OR (c) the project re-opens AC-NEW-7 and decides UBX impersonation is preferred for some yet-to-be-identified safety reason.
|
||||
|
||||
---
|
||||
|
||||
## C8 — Working conclusions and decisions (compounded from Fact #97 + Fact #98 + Fact #99 closures)
|
||||
|
||||
### Per-FC adapter design (re-confirmed from SQ6 closure, now operationalized)
|
||||
|
||||
| FC | Adapter library | Transport | Lead candidate fact | License posture |
|---|---|---|---|---|
| **ArduPilot Plane** | pymavlink | MAVLink GPS_INPUT (msg 232) over UART/USB/UDP | Fact #97 | LGPL-3.0 (linkable from Apache-2.0 app per LGPL §6) |
| **iNav (RECOMMENDED PRIMARY)** | YAMSPy or INAV-Toolkit msp_v2_encode | MSP2_SENSOR_GPS (id 7939 / 0x1F03) over UART/USB | Fact #99 | MIT |
| **iNav (DEFERRED secondary)** | pyubx2 | UBX NAV-PVT impersonation over UART | Fact #98 | BSD-3-Clause |
|
||||
|
||||
### Plan-phase Decision Gates raised by C8 batch 1
|
||||
|
||||
- **D-C8-1 (NEW from Fact #97 closure 2026-05-08, Cand-1-only)** — pymavlink connection-string transport choice
|
||||
- Options: (a) `udpout:127.0.0.1:14550` for in-process companion + autopilot UDP; (b) device `/dev/ttyTHS1` with `baud=921600` for direct UART to AP TELEM port (no companion-router middlebox; pymavlink treats any device string containing `:` as a network endpoint, so the serial path carries no scheme prefix); (c) `tcp:127.0.0.1:5760` for SITL replay; (d) **all three configurable via env var, defaulting to UART (b) for production deployment, with UDP (a) and TCP (c) selected for SITL and test runs RECOMMENDED**.
|
||||
- Owner: Plan-phase architect.
|
||||
- Rationale: pymavlink supports all three transports identically; choice depends on deployment topology. Default to UART for production reduces moving parts.
|
||||
|
||||
- **D-C8-2 (NEW from Fact #97 closure 2026-05-08, Cand-1-only CROSS-COMPONENT with AC-NEW-2)** — `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch ownership
|
||||
- Options: (a) companion always claims source-set 1 and FC keeps real-GPS at source-set 2 (companion reactive only); (b) **companion publishes to source-set 2 and switches FC to set 2 when companion publishes its first valid fix; switches back to set 1 when companion is unavailable RECOMMENDED ~mirrors NGPS/Auterion pattern**; (c) operator manually flips source-set via RC aux switch (option 90).
|
||||
- Owner: Plan-phase architect + AC-NEW-2 owner.
|
||||
- Rationale: per SQ6 Fact #3, "no GCSs are currently known to implement" companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` — but it works at firmware level. The project gets to define the canonical pattern.
|
||||
|
||||
- **D-C8-3 (NEW from Fact #97 closure 2026-05-08, Cand-1-only)** — pymavlink LGPL-3.0 license-posture verification
|
||||
- Options: (a) **bundle pymavlink unmodified + publish requirements.txt with version pin RECOMMENDED ~standard LGPL-3.0 §4 compliance**; (b) statically link via Cython compilation (LGPL-3.0 §4(d) obligation: provide a relinkable form); (c) wrap pymavlink behind a thin C++/Rust process boundary to keep the companion app fully Apache-2.0 (over-engineered; not justified by project posture).
|
||||
- Owner: Plan-phase architect + license owner.
|
||||
- Rationale: aligns with D-C1-1 license-posture-track decision; pymavlink LGPL-3.0 vs project Apache-2.0 dual-use track is straightforward.
|
||||
|
||||
- **D-C8-4 (NEW from Fact #99 closure 2026-05-08, Cand-2-only)** — Python MSP V2 implementation choice
|
||||
- Options: (a) **YAMSPy (community-blessed for iNav external-device comms per Issue #4465); MIT; latest commit pre-2025-Q4 RECOMMENDED ~widest community usage**; (b) INAV-Toolkit `msp_v2_encode` primitive lifted into the project (951-line MIT module, direct primary-source reference); (c) thin custom encoder using `struct.pack` + CRC-8 DVB-S2 helper (50-line bespoke); (d) project-side fork of one of the above.
|
||||
- Owner: Plan-phase architect.
|
||||
- Rationale: all options are MIT and produce identical wire bytes; choice depends on maintainability vs minimum-dependency-surface preference.
|
||||
|
||||
- **D-C8-5 (NEW from Fact #99 closure 2026-05-08, Cand-2-only)** — MSP2_SENSOR_GPS injection rate
|
||||
- Options: (a) **5 Hz periodic RECOMMENDED ~matches GPS_INPUT 5 Hz cadence on AP side, single-rate cross-FC consistency**; (b) 10 Hz to match iNav nav-cycle frequency; (c) variable rate matching estimator publication rate (3 Hz nominal, up to 10 Hz when matcher confidence is high).
|
||||
- Owner: Plan-phase architect.
|
||||
- Rationale: estimator publishes at 3 Hz nominal (per pinned dual-rate camera pipeline Fact #40); 5 Hz adapter-side rate has spare headroom for IMU-propagation between estimator updates.
|
||||
|
||||
- **D-C8-6 (NEW from Fact #98 closure 2026-05-08, Cand-3-only contingent)** — IF Cand 3 selected → UBX-version-advertisement strategy
|
||||
- Options: (a) **advertise hwVersion ≥ M9 + swVersion ≥ 15.00 via MON-VER (CLASS=0x0A, ID=0x04) at startup + every reset; force iNav into NAV-PVT-only protocol surface RECOMMENDED ~simplest configuration path**; (b) advertise hwVersion = M8 + swVersion = 14.x to drive iNav into legacy NAV-POSLLH+NAV-SOL+NAV-VELNED+NAV-TIMEUTC quad mode (more messages but historical iNav-friendly path); (c) implement adaptive advertisement based on iNav firmware-version probe.
|
||||
- Owner: Plan-phase architect.
|
||||
- Rationale: per Source #110 lines 1024-1060, iNav configures the simpler NAV-PVT-only path for u-blox version ≥ 15.0 — companion impersonator should advertise this version to minimize protocol surface.
|
||||
|
||||
- **D-C8-7 (NEW from Fact #98 closure 2026-05-08, Cand-3-only contingent)** — IF Cand 3 selected → AC-NEW-7 audit-trail posture
|
||||
- Options: (a) **explicit FDR audit entry on every UBX impersonation session start, naming companion as the UBX source + providing operator-consent provenance check at boot RECOMMENDED**; (b) silent operation with user-manual disclosure only; (c) require runtime parameter `gps-denied-onboard.enable_ubx_impersonation = true` to be set explicitly by the user via QGC (active opt-in).
|
||||
- Owner: Plan-phase architect + AC-NEW-7 owner.
|
||||
- Rationale: UBX impersonation is unambiguously a forgery posture (companion pretends to be u-blox receiver); AC-NEW-7 (no covert GPS spoofing without consent) requires an audit trail.
|
||||
|
||||
- **D-C8-8 (NEW from Fact #97 + Fact #99 closures 2026-05-08, CROSS-COMPONENT — affects both Cand 1 and Cand 2; CROSS-COMPONENT with C5 covariance contract)** — covariance-honesty cross-FC enforcement
|
||||
- Options: (a) project always publishes the SAME covariance value to both FCs (single shared contract, simpler test surface); (b) **per-FC covariance unit conversion: AP `GPS_INPUT.horiz_accuracy` (m) vs iNav `MSP2_SENSOR_GPS.hPosAccuracy` (mm) — companion publishes the same source covariance, formatted per-FC RECOMMENDED**; (c) per-FC covariance smoothing (different filter parameters per FC) — over-engineered and adds covariance-monotonicity-violation risk under C5 D-C5-2 long-cruise observability.
|
||||
- Owner: Plan-phase architect + AC-NEW-4 owner.
|
||||
- Rationale: AC-NEW-4 covariance-honesty obligation is the same for both FCs; only the unit and field-name change.
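Option (c) of D-C8-4 above (a thin custom encoder built on `struct.pack` plus a CRC-8 DVB-S2 helper) can be sketched as follows. This is a minimal illustration, not the YAMSPy or INAV-Toolkit implementation: the function names `crc8_dvb_s2` and `encode_msp_v2_frame` are hypothetical, and the payload is treated as opaque bytes because the `MSP2_SENSOR_GPS` field layout must be taken from iNav's `msp_protocol_v2_sensor.h` rather than from this sketch.

```python
import struct

MSP2_SENSOR_GPS = 0x1F03  # message id 7939, as listed in the per-FC adapter table


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2: polynomial 0xD5, MSB-first, init 0, no final XOR."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_msp_v2_frame(function_id: int, payload: bytes = b"", flag: int = 0) -> bytes:
    """Build one MSP V2 frame addressed to the FC.

    Layout: '$' 'X' '<' | flag (u8) | function (u16 LE) | size (u16 LE) | payload | crc8.
    The checksum covers everything after the 3-byte preamble.
    """
    header = struct.pack("<BHH", flag, function_id, len(payload))
    body = header + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


if __name__ == "__main__":
    # Placeholder payload only; a real sender packs the sensor struct here.
    frame = encode_msp_v2_frame(MSP2_SENSOR_GPS, b"\x00\x00")
    print(frame.hex())
```

Keeping the encoder this small makes the wire format easy to unit-test against whichever library option is finally chosen, since all options are expected to produce identical bytes.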
|
||||
|
||||
### Cross-row dependencies
|
||||
|
||||
- **C5 covariance contract integration**: Both Cand 1 (AP) and Cand 2 (iNav) require honest covariance from the C5 estimator. The C5 GTSAM `Marginals.marginalCovariance` path (Fact #89) produces a 6×6 pose covariance matrix; the C8 adapter extracts the 2×2 horizontal sub-matrix (zero-based rows/columns 3-4 = x, y in GTSAM's `Pose3` ordering, where rotation occupies 0-2) and converts it to a scalar `horiz_accuracy` (m) for AP or `hPosAccuracy` (mm) for iNav using the 95% confidence-ellipse semi-major axis `sqrt(5.991 * λ_max)`, where 5.991 is the 95% quantile of the chi-square distribution with 2 degrees of freedom and λ_max is the largest eigenvalue of the 2×2 horizontal covariance.
|
||||
- **AC-NEW-2 spoof-promotion latency cross-FC validation**: SITL test on each FC under spoof-injection — AP path validates `MAV_CMD_SET_EKF_SOURCE_SET` round-trip; iNav path validates companion-internal reaction time (companion is sole GPS, FC does not participate in source switching). Both should hit 95th percentile <3 s.
|
||||
- **AC-NEW-8 visual-blackout cross-FC behaviour**: AP `fix_type` enum (0/1/2/3/4/5/6) and iNav `gpsFixType_e` enum (`GPS_NO_FIX/GPS_FIX_2D/GPS_FIX_3D/...`) carry the same 0/1/2/3 ordering, simplifying cross-FC graceful-degrade implementation.
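The covariance-to-scalar conversion described in the C5 bullet above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's adapter: the function names are hypothetical, the x/y indices 3 and 4 follow the zero-based `Pose3` ordering quoted in the text, the semi-major axis is taken as `sqrt(chi2 * λ_max)` with the 2-DOF 95% chi-square quantile 5.991, and `inflation` is a hypothetical knob for any extra conservative multiplier the Plan phase may settle on.

```python
import math

CHI2_95_2DOF = 5.991  # 95% quantile of the chi-square distribution, 2 degrees of freedom


def horizontal_accuracy_m(cov6x6, x_idx=3, y_idx=4, inflation=1.0):
    """Semi-major axis (m) of the 95% horizontal confidence ellipse.

    cov6x6 is a 6x6 pose covariance given as nested sequences, in m^2 for the
    translation block. x_idx / y_idx are assumed zero-based translation indices.
    """
    a = cov6x6[x_idx][x_idx]
    d = cov6x6[y_idx][y_idx]
    # Average the off-diagonal terms so small asymmetries do not matter.
    b = 0.5 * (cov6x6[x_idx][y_idx] + cov6x6[y_idx][x_idx])
    # Closed-form largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]].
    lam_max = 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)
    return math.sqrt(inflation * CHI2_95_2DOF * max(lam_max, 0.0))


def per_fc_accuracy(acc_m):
    """Same source value, formatted per FC: metres for AP, millimetres for iNav."""
    return {
        "GPS_INPUT.horiz_accuracy": acc_m,
        "MSP2_SENSOR_GPS.hPosAccuracy": int(round(acc_m * 1000.0)),
    }


if __name__ == "__main__":
    cov = [[0.0] * 6 for _ in range(6)]
    cov[3][3], cov[4][4] = 4.0, 1.0  # hypothetical variances in m^2
    print(per_fc_accuracy(horizontal_accuracy_m(cov)))
```

Deriving both per-FC fields from one scalar keeps the D-C8-8 option (b) property that only the unit and field name differ between the two adapters. Any clamping to the wire field's integer width is left out here and would need to follow the actual message definitions.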
|
||||
|
||||
---
|
||||
|
||||
### Boundary check: C8 batch 1 saturation status
|
||||
|
||||
C8 batch 1 (3 of N candidate adapters with explicit per-FC pinning) closed at the documentary level on 2026-05-08:
|
||||
- **Cand 1 (Fact #97, ArduPilot pymavlink → GPS_INPUT)** — RECOMMENDED PRIMARY. Documentary verification ✅ via context7 + ArduPilot dev docs + cross-cite SQ6 Source #4 ingestion path; mode-pinned send pattern verified; per-AC sub-matrix complete.
|
||||
- **Cand 2 (Fact #99, iNav MSP2_SENSOR_GPS via Python MSP V2)** — RECOMMENDED PRIMARY for iNav side. Documentary verification ✅ via iNav master MSP message reference + msp_protocol_v2_sensor.h source + community library landscape (YAMSPy + INAV-Toolkit); mode-pinned send pattern verified; per-AC sub-matrix complete.
|
||||
- **Cand 3 (Fact #98, iNav UBX impersonation via pyubx2 NAV-PVT)** — DEFERRED secondary for iNav side after comparative-improvement verdict. Documentary verification ✅ via pyubx2 context7 + u-blox NAV-PVT canonical specs + iNav `gps_ublox.c` direct source read of validation gate; mode-pinned send pattern verified; per-AC sub-matrix complete.
|
||||
|
||||
Saturation rationale: SQ6 closure already covered the per-FC inbound architecture; C8 batch 1 was the operationalization step for the three viable candidates (one for AP; two for iNav, since the user requested parallel evaluation). Every additional candidate that surfaced during research was considered and rejected per c8_overkill_options=A (MAVProxy/mavp2p/ardupilot-router are router-class, not adapter-class; full MAVSDK C++/Rust SDKs are out-of-budget vs Python pymavlink; no third iNav transport beyond MSP2 + UBX exists in iNav 9.0 master).
|
||||
|
||||
C8 batch 1 closure is gated only on the eight Plan-phase decisions D-C8-1..8 and the cross-row C5 covariance contract integration. Plan-phase Choose blocks are recorded in [`../06_component_fit_matrix/99_cross_component_gates.md`](../06_component_fit_matrix/99_cross_component_gates.md).
|
||||
@@ -0,0 +1,155 @@
|
||||
# Fact Cards — SQ1: Existing / competitor GPS-denied UAV navigation systems
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `../01_source_registry/SQ1_existing_systems.md` (see `../01_source_registry/00_summary.md` for index). Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential). Bound to sub-questions in `../00_question_decomposition.md`.
|
||||
>
|
||||
> Index: [`../00_summary.md`](../00_summary.md). Sibling categories: SQ6 ([FC external positioning](SQ6_fc_external_positioning.md)), SQ2 ([canonical pipeline](SQ2_canonical_pipeline.md)), C1 ([VIO](C1_vio.md)), C2 ([VPR](C2_vpr.md)), C3 ([matchers](C3_matchers.md)).
|
||||
|
||||
**Facts in this file**: #11–#20 (peer/adjacent systems: OSCAR, Auterion Artemis, Vantor Raptor, NGPS, SPRIN-D winner, RTAB-Map/ORB-SLAM3 pruning, DSMAC/TERCOM lineage, hierarchical matching SOTA, AerialExtreMatch benchmark, DARPA FLA + USAF SBIR) + SQ1 working conclusions.
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Existing / competitor GPS-denied UAV navigation systems
|
||||
|
||||
### Fact #11 — Twist Robotics OSCAR is a deployed Ukrainian peer system in the same architectural class as this project
|
||||
- **Statement**: Twist Robotics (Ukraine) has a fielded camera + map-matching navigation module called OSCAR (Optical System of Coordinates with Automatic Relocalisation). The vendor states the system "captures the terrain, identifies landmarks, compares them with a map, determines coordinates, and transmits them to the autopilot as a reliable GPS signal" — the same five-stage architecture this project is building. Vendor-stated specs: ≤20 m accuracy without cumulative error, day/night/fog operation, and operational deployment of "more than 500,000 km across 25,000 combat missions over 24 months". Hardware includes active cooling, indicating a non-trivial onboard compute (likely Jetson-class). **No public independent benchmark of the 20 m number.**
|
||||
- **Source**: Source #25, Source #26
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + AC owners (existence-of-peer evidence, not implementation guide)
|
||||
- **Confidence**: ✅ for "deployed at scale on Ukrainian combat platforms"; ⚠️ for "20 m accuracy" (vendor self-report); ❓ for "fully resistant to spoofing and jamming" (claim not independently verified)
|
||||
- **Related Dimension**: SQ1, SQ8 (anti-spoofing claim audit), SQ9 (synthesis — ours must beat or at least match this in the operational regime)
|
||||
- **Fit Impact**: **establishes feasibility floor** — a Ukrainian peer is operating a similar architecture against the same threat environment our system targets. Project framing must explicitly differentiate (e.g., 1 km AGL vs unspecified OSCAR altitude; 8 h endurance vs unspecified OSCAR endurance; AC-NEW-4 honest covariance contract vs OSCAR's unspecified covariance reporting).
|
||||
|
||||
### Fact #12 — Auterion Artemis is a production-shipping fixed-wing one-way attack drone with Ukraine-validated GPS-denied navigation, defining the production benchmark for this class
|
||||
- **Statement**: Auterion completed the US Defense Innovation Unit Artemis program in October 2025, delivering a Shahed-class deep-strike drone with up to 1,000-mile range and up to 40 kg warhead, running on **Auterion Skynode N mission computer + Auterion Visual Navigation system + built-in terminal guidance**. Government evaluators signed off after operational flight tests in Ukraine including ground launch, GPS and GPS-denied navigation, long-range transit, and terminal engagement. Manufacturing is being established in US, UA, and DE; Auterion is offering the system to the US Department of War and allied nations.
|
||||
- **Source**: Source #31; Source #32 confirms Skynode S sibling architecture (NPU-equipped companion).
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects (production-pattern reference)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (closest commercial production peer), SQ9 (architecture template)
|
||||
- **Fit Impact**: **establishes production reference architecture** — companion-class autopilot + visual navigation + terminal guidance is shipping at production scale to a US defense customer. Implication: building a per-FC adapter (project decision in SQ6) is consistent with what production stacks already do; integrating against the Artemis architecture is realistic; competing on price + Ukraine-specific operational tuning + AC-NEW-4 honest-covariance contract is a viable differentiation.
|
||||
|
||||
### Fact #13 — Vantor Raptor is a production COTS visual-GPS-replacement software suite, demonstrating that "branded sat-tile basemap + on-drone vision software" is a viable commercial pattern
|
||||
- **Statement**: Vantor Raptor product family (Guide / Sync / Ace) provides vision-based GPS replacement using the drone's existing camera plus Vantor's "100 million-plus sq km of highly accurate 3D terrain data" (Vivid Terrain, vendor-stated 3 m accuracy). Vendor-demonstrated absolute accuracy: **<7 m in all dimensions** for aerial position (Guide), **<3 m** for ground coordinate extraction (Sync, Ace). Works at night and at low altitudes. Platform-agnostic, deployable on commodity hardware, integrates with existing onboard cameras. Inertial Labs has published a VINS-integrated Raptor Guide white paper. Recent partnerships: Niantic Spatial (Dec 2025) for unified air-to-ground positioning in GPS-denied areas; Maxar partnership with AIDC (Sep 2025) for Taiwan UAV resilience against GPS interference.
|
||||
- **Source**: Source #30
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Architecture / business decision-makers (build-vs-buy framing)
|
||||
- **Confidence**: ✅ for product existence + claimed accuracy bounds (vendor primary); ⚠️ for whether Vantor's commercial accuracy figures hold under the project's specific Ukrainian-steppe + active-conflict-tile-staleness conditions
|
||||
- **Related Dimension**: SQ1 (commercial), C2/C3 (commercial alternatives to building ourselves), SQ8 (basemap as a service vs offline cache)
|
||||
- **Fit Impact**: **build-vs-buy lens** — Raptor Guide's <7 m claim is *better* than the project's AC-1.1 budget (≤80 m / 95% under AC-1.1.1), so it's not a disqualifier on accuracy. Reasons we still build vs buy: (a) Vantor is a US vendor; export / dual-use licensing into the Ukrainian battlefield is uncertain; (b) restrictions specify offline cache from the project's own Azaion Suite Satellite Service (AC-2.x), not Vantor's Vivid Terrain — replacing the basemap is non-negotiable; (c) covariance honesty contract (AC-NEW-4) and source-label contract (AC-1.4) are project-specific and may not be exposed by Vantor's API. **Outcome**: keep Raptor as a competitive comparator in `solution_draft01`, NOT as a candidate component to integrate.
|
||||
|
||||
### Fact #14 — snktshrma/ngps_flight (NGPS — ArduPilot GSoC 2024) is the closest open-source pipeline match to this project's exact C1+C2+C3+C5+C8 stack
|
||||
- **Statement**: NGPS = ROS 2 + ArduPilot pipeline composed of three packages: **`ap_ngps_ros2`** (visual geo-localization at 1–2 Hz by matching live camera frames to georeferenced satellite imagery using **LightGlue + SuperPoint**, deep-learning-based feature matching), **`ap_ukf`** (Unscented Kalman Filter fusing NGPS absolute positions with VIO estimates), **`ap_vips`** (VIO providing relative pose). Output is fused odometry to ArduPilot's EKF (per related ArduPilot issue #23471, this is via `VISION_POSITION_ESTIMATE` requiring EKF source-set 2/3 with `EK3_SRC*_POSXY=Vision`). Project is published under ArduPilot's GSoC 2024 program. Sibling `ap_nongps` is an earlier OpenCV-based prototype.
|
||||
- **Source**: Source #33
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer
|
||||
- **Confidence**: ✅ for project existence, component breakdown, and matcher choice (LightGlue+SuperPoint); ⚠️ for runtime behaviour under our exact constraints (Jetson Orin Nano, 1 km AGL, 17 m/s, 3 fps); ❓ for production hardening / covariance honesty / spoof-defence (none documented)
|
||||
- **Related Dimension**: SQ1 (closest open-source peer), SQ2 (canonical pipeline confirmation), SQ3+SQ4 (architectural template for component candidate matrix), SQ6 (alternate AP transport debate)
|
||||
- **Fit Impact**: **architectural template** — confirms the project's split (C1 VIO ↔ C2/C3 visual absolute ↔ C5 fusion ↔ C8 FC adapter) is canonical, not novel. Two concrete deltas:
|
||||
1. **Transport choice on AP**: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT` because it carries `horiz_accuracy` directly, supports source-set switching via `MAV_CMD_SET_EKF_SOURCE_SET`, and avoids EKF-source-set reconfiguration. The trade-off (NGPS's path vs SQ6's pick) must be re-examined at design time before final AP-transport selection.
|
||||
2. **Estimator choice**: NGPS uses UKF; SQ3/SQ4 will compare UKF vs ESKF vs MSCKF vs factor-graph (GTSAM) on the same matrix.
|
||||
|
||||
### Fact #15 — RGB satellite-image matching as a *low-altitude* (<25 m AGL) localization technique is unreliable per the SPRIN-D Challenge; our 1 km AGL operates in the regime where the same authors note it "works reasonably well"
|
||||
- **Statement**: The CTU Prague team's SPRIN-D winning paper directly states: *"Some teams used RGB satellite image-based matching, but this has proved to be highly unreliable at such low altitudes."* (referring to <25 m AGL). The paper's related-work review separately notes that *"high-altitude matching... works reasonably well, but at low altitudes (25 m) the viewpoint differs drastically, making roofs, facades, and vegetation inconsistent with satellite imagery."* The project operates at ≤1 km AGL — which is the *high-altitude* regime in the paper's terminology — making RGB sat-matching the appropriate technique class. The paper's CPU-only winning method (LiDAR heightmap-gradients + clustered particle filter) is **not** transferable to our hardware: our project has no LiDAR.
|
||||
- **Source**: Source #28
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer + Domain expert
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1, SQ5 (failure modes), SQ2 (canonical pipeline)
|
||||
- **Fit Impact**: **disambiguates a potentially-disqualifying lesson** — the CTU paper's "RGB sat-matching is unreliable" finding does NOT disqualify our approach because the failure was caused by low-altitude viewpoint mismatch, which our 1 km AGL regime does not have. This must be cited explicitly in `solution_draft01` to pre-empt the natural objection from anyone who reads the paper. Separately, the CTU paper's specific lessons are still binding: VIO degrades catastrophically without IMU vibration isolation; magnetometer is unreliable near steel/concrete; "ability to recover from periods of high uncertainty and re-localize" matters more than instantaneous RMSE — this last lesson is a direct architectural input for AC-NEW-2 / AC-NEW-8.
|
||||
|
||||
### Fact #16 — RTAB-Map and ORB-SLAM3 both fail beyond 1 km / above 2 m/s flight in the SPRIN-D environment; our cruise profile (≤17 m/s, kilometers between satellite anchors) explicitly excludes both as primary candidates
|
||||
- **Statement**: The SPRIN-D paper states: *"We tested state-of-the-art visual SLAM systems such as RTAB-Map and ORB-SLAM3 in a high-fidelity simulator, and found that both performance degraded significantly in a long-range scenario (beyond 1 km), as their memory and compute demands grow with the size of the environment. Moreover, RTAB-Map was unable to maintain quality odometry in faster flight speeds (beyond 2 m/s), while ORB-SLAM3 suffered from tracking loss in textureless areas."*
|
||||
- **Source**: Source #28
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer (component selection for C1)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1, SQ3+SQ4 component C1 (VO/VIO), SQ5 (failure modes)
|
||||
- **Fit Impact**: **prunes the C1 candidate landscape** — RTAB-Map and ORB-SLAM3 should not be pursued as C1 leads. Plausible C1 leads remain: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline (KLT + RANSAC homography). NGPS (Fact #14) uses `ap_vips` = OpenVINS-class VIO — confirming an aligned community choice. Final C1 selection happens in SQ3+SQ4.
|
||||
|
||||
### Fact #17 — DSMAC + TERCOM lineage: pre-cached scene matching for downward-looking navigation is a 40+ year deployed technique class with documented sub-10 m terminal accuracy
|
||||
- **Statement**: DSMAC (Digital Scene Matching Area Correlator) is an autonomous missile-guidance system based on area correlation of sensed downward-camera ground scenes against pre-stored reference imagery (often satellite reconnaissance). It achieves 3–10 m terminal accuracy by correlating buildings, road intersections, and distinctive terrain landmarks. Tomahawk: TERCOM (radar altimeter + DEM) for mid-flight + DSMAC for terminal guidance reduces CEP from ~30 m to "only meters". Documented combat record: 1991 Gulf War, >80% of 280 launched Tomahawks hit target. Recent miniaturisation: Destinus Ruta (300 km strike-class) is integrating UAV Navigation's (Spanish, Grupo Oesía) DSMAC-class system, validated in Ukrainian combat conditions including GNSS-denied / jamming / spoofing.
|
||||
- **Source**: Source #36, Source #27
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Domain expert + Decision-maker
|
||||
- **Confidence**: ✅ for the lineage and Tomahawk performance numbers (DTIC + open-source); ⚠️ for the Ruta-specific "DSMAC operating principle" inference (Defense Express analyst inference, not vendor disclosure)
|
||||
- **Related Dimension**: SQ1 (lineage), SQ8 (baseline accuracy expectations for AC-1.1.1 80 m / AC-NEW-4 false-position budget)
|
||||
- **Fit Impact**: **establishes baseline accuracy expectations** — the technique class has documented sub-10 m accuracy in the cruise-missile-terminal regime. Our budget (AC-1.1.1: <80 m at 1 km AGL with ≥0.5 m/px tiles) is loose by comparison, indicating that the AC budget is *not* aggressive against the technique-class baseline — it is aggressive against the Jetson Orin Nano + 8-h-continuous + 25 W envelope. **Implication for AC-NEW-4**: claiming P(error >500 m) <0.1% per flight is consistent with the DSMAC-lineage class; an honestly-reported failure rate at this level is realistic, not unprecedented.
|
||||
|
||||
### Fact #18 — Hierarchical Image Matching (arXiv 2506.09748, June 2025) is a current academic SOTA pipeline for our exact problem, but uses DINOv2 — a heavyweight foundation model that must be benchmarked under our 25 W / 8 GB Jetson envelope before any selection
|
||||
- **Statement**: 2025 academic SOTA pipeline structure: (1) image retrieval module (off-the-shelf, optimal-transport feature aggregation); (2) Semantic-Aware and Structure-Constrained Matching Module (SASCM) using **DINOv2** features + 4D correlation tensor + SoftMNN + 4D conv; (3) lightweight fine-grained matching module for pixel-level. Constructs UAV absolute visual localization without VIO/relative-localization dependence (retrieval-and-matching only). Evaluation on AerialVL + their own CS-UAV dataset claims superior accuracy under cross-source and cross-temporal variation.
|
||||
- **Source**: Source #29
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Implementer / Engineer + Domain expert
|
||||
- **Confidence**: ✅ for pipeline structure and method; ⚠️ for "superior" claim (single-paper benchmark; AerialExtreMatch evaluates 16 methods with broader rigor — Source #34 is the better cross-method ranker); ❓ for Jetson-Orin-Nano runtime (no published number)
|
||||
- **Related Dimension**: SQ1 (academic SOTA), C2 (VPR), C3 (cross-domain registration), SQ5 (foundation-model-on-Jetson failure mode)
|
||||
- **Fit Impact**: **academic-SOTA snapshot, candidate template** — the retrieval → semantic-aware coarse → fine-grained pipeline is a candidate template for our C2+C3, but DINOv2 introduces a Jetson-deployment risk that must be quantified before commitment. Candidate-level decision: include DINOv2-based pipelines (AnyLoc, BoQ, this paper's SASCM) in the C2/C3 candidate matrix with mandatory MVE on Jetson Orin Nano under our exact frame size and 3 fps cadence. Reject DINOv2 if total inference latency cannot be brought under (400 ms - other-stages budget) at INT8 / fp16. Per Source #28 lesson, classical matchers (LightGlue+SuperPoint as in NGPS) should also be in the matrix as the "simple baseline / known-Jetson-runnable" option.
|
||||
|
||||
### Fact #19 — AerialExtreMatch (2025) is the academic benchmark our C2+C3 candidate matrix must publish numbers against, with 32 difficulty-stratified cells exposing exactly the cross-source / cross-pitch / cross-scale failure modes our project will face
|
||||
- **Statement**: AerialExtreMatch publishes (a) 1.5 M synthetic train pairs (RGB+depth, diverse UAV/satellite viewpoints); (b) ~30,000 evaluation pairs in **32 difficulty levels** stratified by overlap (4 bins: <20%, 20–40%, 40–60%, >60%), pitch difference (4 bins: 50–55°, 55–60°, 60–65°, 65–70°), and scale variation (2 bins: 1–2×, >2×); (c) a real-world UAV-localization split captured with DJI M300 RTK + H20T against UAV-derived orthomosaic/DSM AND lower-quality satellite maps. The benchmark evaluates 16 representative detector-based and detector-free image matching methods.
|
||||
- **Source**: Source #34
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Domain expert + Implementer
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (academic landscape), SQ7 (datasets), C2 (VPR), C3 (cross-domain registration)
|
||||
- **Fit Impact**: **defines the C2/C3 evaluation matrix** — every C2/C3 candidate going into `solution_draft01` must report numbers on AerialExtreMatch's 32 difficulty cells, with at least the high-pitch (65–70°) and high-scale (>2×) cells representing our worst-case (UAV vs satellite tile geometry mismatch + ortho-rectification residual). The dataset's real-world UAV-localization split with both UAV-orthomosaic AND satellite-map references mirrors our project's offline-cache-tile semantics directly.
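The 32-cell stratification in Fact #19 is the product of the three binned axes (4 overlap × 4 pitch × 2 scale). A small sketch for enumerating the cells of a results table is shown below; the bin label strings and function names are hypothetical (the benchmark's own identifiers may differ), and reading "worst-case" as the cells that combine the highest pitch bin with the highest scale bin is one possible interpretation of the Fit Impact note.

```python
from itertools import product

# Bin labels are illustrative; ranges are the ones quoted in the Statement above.
OVERLAP_BINS = ("<20%", "20-40%", "40-60%", ">60%")
PITCH_BINS = ("50-55deg", "55-60deg", "60-65deg", "65-70deg")
SCALE_BINS = ("1-2x", ">2x")


def difficulty_cells():
    """Every (overlap, pitch, scale) combination: 4 * 4 * 2 = 32 cells."""
    return list(product(OVERLAP_BINS, PITCH_BINS, SCALE_BINS))


def worst_case_cells(cells):
    """Cells combining the highest pitch-difference bin with the >2x scale bin."""
    return [c for c in cells if c[1] == "65-70deg" and c[2] == ">2x"]


if __name__ == "__main__":
    cells = difficulty_cells()
    print(len(cells), "cells;", len(worst_case_cells(cells)), "worst-case cells")
```

A candidate's per-cell numbers can then be keyed by these tuples, which makes it easy to check that no cell is silently missing from a C2/C3 report.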
|
||||
|
||||
### Fact #20 — DARPA FLA + USAF SBIR establish the US-defense-program tailwind, but do not directly validate the project's specific regime (fixed-wing, ~1 km AGL, sat-tile basemap, 8-h endurance)
|
||||
- **Statement**: DARPA Fast Lightweight Autonomy (FLA) program ran 2015–2018 (Phase 1 Florida 2017; Phase 2 Georgia 2018; complete). Focused on small quadcopter autonomy at ≤20 m/s through cluttered indoor/outdoor environments using onboard cameras + LIDAR + sonar + IMU, no GPS / datalink / pilot. A 2025 retrospective (arXiv 2504.08122) reviews FLA testing methodology and Phase 1 results. A 2025 USAF SBIR Phase II solicitation (Sweetspot ID `7946c818-409f-5b31-8f06-554466071d83`) is requesting visual position and navigation capability for sUAS in GPS-denied environments — confirming the regulatory + funding environment is currently active for this category in 2025.
|
||||
- **Source**: Source #35
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Decision-maker + Domain expert
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ1 (defense-program lineage)
|
||||
- **Fit Impact**: **context only, no direct candidate gain** — FLA pre-dates the project's specific regime by 8 years, focused on a different platform (multirotor) and altitude (low-altitude obstacle avoidance, not 1 km AGL nadir-camera satellite-anchor). Useful only to establish lineage and context. The USAF SBIR datapoint is more directly relevant: confirms that an active US-defense-funded need exists for sUAS visual position + navigation in GPS-denied environments — i.e., the project's market exists outside Ukraine.
|
||||
|
||||
---
|
||||
|
||||
## SQ1 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Existing-systems landscape (6 named-and-evidenced peer / adjacent systems)
|
||||
|
||||
| System | Class | Operational regime | Closest match dimension | Closest mismatch dimension | Status as evidence |
|
||||
|---|---|---|---|---|---|
|
||||
| **Twist Robotics OSCAR** (UA) | Deployed Ukrainian peer | Combat-deployed, fixed-wing-class, GPS-denied vision-nav | **Same architecture, same threat environment** | Altitude / endurance / FC / accuracy contract not publicly specified | Closest peer for "feasibility floor" |
|
||||
| **Auterion Artemis** | Production COTS one-way attack drone | Shahed-class, 1000-mile range, 40 kg warhead, Ukraine-validated GPS-denied nav | Same architectural pattern (Skynode + Visual Navigation + terminal guidance) | One-way attack vs reusable; no covariance/source-label contract published | Closest production reference architecture |
|
||||
| **Vantor Raptor (Guide / Sync / Ace)** | Production COTS software suite | Vision-based GPS replacement on existing drone camera + Vivid Terrain 3D basemap | Visual-position software pattern | Vendor-managed sat-tile basemap is not the project's Azaion Suite Satellite Service; no AC-NEW-4 / AC-1.4 contract | Closest commercial peer for "build-vs-buy" framing |
|
||||
| **snktshrma/ngps_flight (NGPS, ArduPilot GSoC 2024)** | Open-source research prototype | LightGlue+SuperPoint+UKF+`VISION_POSITION_ESTIMATE` to AP | **Same component split, same FC family** | GSoC prototype, not production; no spoof defence; no covariance honesty | **Closest open-source pipeline match — explicit architectural template** |
|
||||
| **CTU Prague SPRIN-D winner** | Academic / competition | Multirotor, ≤25 m AGL, LiDAR + heightmap gradient + particle filter on CPU | "Recover-from-uncertainty > low-instantaneous-RMSE" lesson; VIO discipline | LiDAR-required, low-altitude regime, no sat-tile basemap | Architectural-pattern reference + cautionary tale |
|
||||
| **Destinus Ruta + UAV Navigation** | Production miniaturised cruise missile | 300 km strike, DSMAC-class, Ukraine-combat-validated | Pre-cached basemap + visual matching + autopilot ingestion | One-way attack, terminal guidance, no covariance contract | Shows DSMAC-class miniaturised into UAV tier |
|
||||
|
||||
### Per-perspective coverage
|
||||
|
||||
| Perspective | Facts supporting | Saturation status |
|---|---|---|
| **Implementer / Engineer** | Fact #14 (NGPS), Fact #16 (SLAM failure modes), Fact #18 (DINOv2 risk) | Saturated for SQ1 — deeper component-level deep-dives go to SQ3/SQ4 |
| **Practitioner / Field (Ukraine)** | Fact #11 (OSCAR), Source #37 (~70% UAV losses to EW), Source #27 (Ruta + UAV Navigation Ukraine combat validation) | Saturated for SQ1 |
| **Domain expert / Academic** | Fact #18 (Hierarchical Matching SOTA), Fact #19 (AerialExtreMatch benchmark), Fact #15 (SPRIN-D regime distinction) | Saturated for SQ1 — academic SOTA benchmarking handed off to SQ3/SQ4 + SQ7 |
| **Contrarian / Devil's advocate** | Fact #15 (low-altitude RGB matching unreliable lesson), Fact #16 (RTAB-Map / ORB-SLAM3 disqualified), Fact #18 (DINOv2-on-Jetson risk) | Saturated for SQ1 |
| **Decision-maker / Business** | Fact #12 (production-ready Auterion), Fact #13 (commercial Vantor build-vs-buy framing), Fact #20 (USAF SBIR market context) | Saturated for SQ1 |
|
||||
|
||||
### Architectural conclusions for `solution_draft01`
|
||||
|
||||
1. **Build-vs-buy stance**: build. Vantor Raptor and Auterion Visual Navigation are commercially superior on hardening + integration but neither exposes the covariance honesty contract (AC-NEW-4) nor uses the project-specified Azaion Suite Satellite Service tile cache (AC-2.x); both are dual-use export risks for the Ukrainian battlefield. NGPS (Fact #14) is the open-source architectural template to learn from but is a GSoC research prototype lacking production hardening, spoof defence, and the covariance-honesty contract. Architectural conclusion: build with NGPS as the template, with project-specific contracts (AC-NEW-4, AC-1.4, AC-NEW-7) and per-FC adapter (SQ6 conclusion) layered on top.
|
||||
2. **Differentiation from OSCAR (Twist Robotics)** must be made explicit in `solution_draft01`: (a) honest covariance contract per AC-NEW-4; (b) explicit `{satellite_anchored, visual_propagated, dead_reckoned}` source-label contract per AC-1.4; (c) AC-NEW-7 cache-poisoning safety budget on tile write-back; (d) ArduPilot Plane + iNav both supported per project's revised AC-4.3.
|
||||
3. **Pipeline canonicalness**: the C1+C2+C3+C4+C5+C8 split is canonical (NGPS + the 2025 hierarchical-matching paper + SPRIN-D winner all use the same shape; only the specific algorithm choices differ). SQ2 will sanity-check this against one more pipeline-survey paper, but this is essentially a low-risk question now.
|
||||
4. **Component-pruning** carried into SQ3/SQ4:
|
||||
- C1: **prune RTAB-Map and ORB-SLAM3** as primary candidates per Fact #16. Carry: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure VO baseline.
|
||||
- C2/C3: **mandatorily benchmark** any DINOv2-based candidate (AnyLoc, BoQ, SASCM-style) against AerialExtreMatch at our pitch / scale / overlap regime AND against Jetson Orin Nano latency budget (per Fact #18). Maintain LightGlue+SuperPoint as the "simple-baseline / known-Jetson-runnable" option per NGPS precedent.
|
||||
- C8 transport: NGPS uses `VISION_POSITION_ESTIMATE`. SQ6 picked `GPS_INPUT`. Re-examine the trade-off in design phase, but SQ6's selection stands for the research draft.
|
||||
5. **Lessons from SPRIN-D winner that must propagate to `solution_draft01`**:
|
||||
- "Ability to recover from periods of high uncertainty and re-localize" > "low instantaneous RMSE" — directly informs AC-NEW-2 / AC-NEW-8.
|
||||
- VIO requires mechanically-decoupled IMU; this is a hardware-integration constraint, not a software issue.
|
||||
- Magnetometer is unreliable near steel/concrete; sensor fusion of heading sources is essential.
|
||||
- "No single sensor can be fully relied upon" — directly supports our IMU+camera+sat-tile multi-source posture.
|
||||
|
||||
### Open follow-ups (deferred to later sub-questions)
|
||||
|
||||
- **(SQ8)** Independent verification of OSCAR's "fully resistant to spoofing/jamming" claim — if available. Otherwise, Twist Robotics's claim remains a vendor-only signal.
|
||||
- **(SQ8)** Vantor Raptor and Auterion Visual Navigation's covariance reporting behaviour — for benchmarking AC-NEW-4 compliance.
|
||||
- **(SQ3+SQ4 / C2)** AnyLoc / BoQ / DINOv2-VLAD / MixVPR / EigenPlaces / NetVLAD on AerialExtreMatch for cross-source aerial — already in C2 search plan; SQ1 just confirmed they're the right candidate set.
|
||||
- **(SQ3+SQ4 / C3)** LightGlue / LoFTR / RoMa / DKM / MASt3R + classical SIFT+RANSAC + XFeat on AerialExtreMatch — already in C3 search plan; SQ1 confirms shape.
|
||||
- **(SQ7)** AerialExtreMatch + AerialVL + CS-UAV + RealUAV/SAVL + UAV-VisLoc as the dataset shortlist for our cross-validation — confirmed by SQ1 hits.
|
||||
|
||||
### Boundary check: SQ1 is saturated
|
||||
|
||||
Saturation signals observed: all 5 perspectives saturated, ≥3 high-confidence facts or sources per perspective, last 3 search rounds (Anduril Iris detail probe, ArduPilot prior-art probe, DSMAC lineage probe) yielded only one new substantive datapoint (NGPS) and confirmed already-known patterns. No unresolved contradictions. Per `references/source-tiering.md` "Search saturation rule" → SQ1 is closed.
|
||||
@@ -0,0 +1,123 @@
|
||||
# Fact Cards — SQ2: Canonical GPS-denied pipeline & SOTA components
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `../01_source_registry/SQ2_canonical_pipeline.md` (see `../01_source_registry/00_summary.md` for index). Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential). Bound to sub-questions in `../00_question_decomposition.md`.
|
||||
>
|
||||
> Index: [`../00_summary.md`](../00_summary.md). Sibling categories: SQ6 ([FC external positioning](SQ6_fc_external_positioning.md)), SQ1 ([existing systems](SQ1_existing_systems.md)), C1 ([VIO](C1_vio.md)), C2 ([VPR](C2_vpr.md)), C3 ([matchers](C3_matchers.md)).
|
||||
|
||||
**Facts in this file**: #21–#27 (canonical pipeline definition, EKF fusion patterns, cross-domain matchers, hierarchical retrieval, end-to-end visual localization rejection, hardware MVE doctrine) + SQ2 working conclusions.
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Canonical pipeline decomposition (sanity-check)
|
||||
|
||||
### Fact #21 — The canonical pipeline for offline-cache visual geo-localization is two-stage: global VPR retrieval, then local alignment (image matching → pose)
|
||||
- **Statement**: Source #38 (Skoltech aerial-VPR survey) defines the field's canonical pipeline verbatim: "Visual geolocalization can be implemented through various methods, typically relying on a pre-built database of images with known locations. This approach generally involves two stages: global localization (or Visual Place Recognition, VPR) and local alignment. Global localization involves identifying the nearest frame from the database (Image Retrieval), while local alignment determines the precise position using the selected frame." Source #42 (NUDT 2026 absolute-VL survey) names the same shape "**retrieval → matching → pose-estimation hierarchical framework**" and explicitly contrasts it against three rejected alternatives: (a) relative-only VIO/SLAM (cumulative error), (b) end-to-end direct localization (poor generalization), (c) map-free localization (scene-dependent). Source #39 (U.Maine cross-view survey) traces the same lineage from 2003 pixel-wise template-matching → 2013 hand-engineered features → 2017 CNN/triplet-loss → 2018+ Siamese/GAN → 2022+ Transformer → 2023 DINOv2-class. Source #41 (AnyVisLoc benchmark) implements this hierarchy as: image retrieval (rough) → image matching (2D-2D) → DSM-lift to 3D → PnP+RANSAC, with **Top-N re-rank by inlier count** as a critical fourth stage between matching and pose.
|
||||
- **Source**: Source #38, Source #39, Source #41, Source #42
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Architects of `solution_draft01`
|
||||
- **Confidence**: ✅ (four independent surveys/benchmarks converge)
|
||||
- **Related Dimension**: SQ2, C2 (VPR), C3 (cross-domain matching), C4 (pose estimation)
|
||||
- **Fit Impact**: **confirms** the project's C1–C10 decomposition is canonical for the **C2 → C3 → C4** chain. The component split is not novel; the project's contribution is the **integration discipline** (covariance honesty AC-NEW-4, source-label contract AC-1.4, offline-cache safety AC-NEW-7) layered on top. **Augment** the existing decomposition with an explicit "Top-N re-rank by inlier count" stage between C3 and C4 (currently implicit).
|
||||
|
||||
### Fact #22 — AdHoP (Adaptive Homography Preconditioning) is a method-agnostic post-matching refinement loop that improves matching by ~30% on average and cuts translation error by up to 63% for previously-underperforming methods, at the cost of a second matching pass
|
||||
- **Statement**: Source #40 (OrthoLoC benchmark, Sep 2025): from initial 2D-2D query↔orthophoto correspondences, estimate a homography H via DLT+RANSAC, warp the orthophoto with H to better match the query's perspective (reducing residual perspective gap), re-match in this warped frame, then map the new correspondences back to the original orthophoto via H⁻¹, lift to 3D using DSM, and run PnP+RANSAC + Levenberg-Marquardt refinement. Accept the AdHoP-refined pose only if reprojection error decreases vs. the non-refined pose. **Quantitative effects** (16,425 images, 47 locations, 1m-1° threshold): GIM+DKM 75.4% recall (best); AdHoP-refined methods see ~30% average matching improvement, ~20% translation/rotation error reduction; for previously-underperforming methods AdHoP yields up to 95% matching improvement (XFeat*) or 63% translation reduction (DKM); for RoMa, AdHoP lifts 1m-1° recall by +23 points (54.6% → 77.6%-class). **Cross-domain regime** (war-zone-equivalent: scene change between query and reference): translation error increases ~3× when only the visual modality differs, ~7× when both visual and structural (DSM) gaps exist (0.16 m → 1.12 m for GIM+DKM+AdHoP). **Method-agnostic** — works on top of any 2D-2D matcher.
|
||||
- **Source**: Source #40
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C3/C4 implementers
|
||||
- **Confidence**: ✅ for headline numbers (single-paper, but published dataset + open code + reproducible per repo)
|
||||
- **Related Dimension**: SQ2 (new sub-stage), C3 (matcher), C4 (pose), SQ5 (cross-domain failure mode)
|
||||
- **Fit Impact**: **adds a new sub-stage** between C3 and C4. Decision for `solution_draft01`: include AdHoP-class refinement as an **optional** stage gated on Jetson Orin Nano latency budget — if (single-pass match latency × 2) + homography estimation + reprojection check fits under (400 ms - other-stages), include it; otherwise reserve as offline-replay-time refinement. Cross-domain 3× translation-error penalty is a **direct AC-NEW-4 calibration input** — companion-side covariance must inflate proportionally when scene-change detection (deferred to SQ8) flags a stale tile.
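
The latency gate and the accept-only-if-better rule described above can be sketched as follows. This is a minimal illustration, not a design: the function names and the example stage timings are hypothetical, and only the 400 ms frame budget and the "second matching pass" cost structure come from the text.

```python
def adhop_fits_budget(match_ms, homography_ms, reproj_check_ms,
                      other_stages_ms, frame_budget_ms=400.0):
    """Return True if an AdHoP pass fits in the remaining frame budget.

    AdHoP re-runs the matcher once, so its cost is modelled as two
    matching passes plus homography estimation plus the reprojection check.
    """
    adhop_cost_ms = 2.0 * match_ms + homography_ms + reproj_check_ms
    return adhop_cost_ms <= frame_budget_ms - other_stages_ms


def accept_refined_pose(baseline_reproj_err_px, refined_reproj_err_px):
    """Keep the AdHoP-refined pose only if reprojection error decreased."""
    return refined_reproj_err_px < baseline_reproj_err_px


if __name__ == "__main__":
    # Hypothetical timings: 105 ms per match pass, 150 ms for all other stages.
    print(adhop_fits_budget(105.0, 10.0, 5.0, other_stages_ms=150.0))
    print(accept_refined_pose(2.4, 1.7))
```

When the gate returns False, the text's fallback applies: AdHoP is reserved for offline-replay-time refinement.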
|
||||
|
||||
### Fact #23 — 6-DoF aerial-to-satellite localization requires DSM (Digital Surface Model) elevation data; without DSM, the system collapses to 3-DoF (position + 1 rotation) or must compute attitude purely from IMU/VIO
|
||||
- **Statement**: Source #40 OrthoLoC explicitly: "Our pipeline matches the query image with the DOP, lifts the matched 2D points in DOP to 3D using the DSM, and then estimates the camera pose using PnP and RANSAC." Without the DSM lift, the matcher produces 2D↔2D correspondences that constrain a homography (which encodes 3-DoF for a planar scene + planar camera) but **not** the full 6-DoF camera pose. Source #41 AnyVisLoc independently confirms by measuring: aerial-photogrammetry map (with paired DSM at 0.94 m/px) achieves 74.1% A@5m; satellite map (with ALOS 30 m DSM) achieves only 18.5% A@5m — a 4× accuracy collapse driven by DSM coarseness. The project's offline cache from the Azaion Suite Satellite Service is currently specified as **2D ortho tiles only** (no DSM commitment in restrictions.md or AC). **Three architectural responses** are available: (a) **3-DoF acceptance** — fix attitude from IMU/VIO, treat the matcher output as a homography-only constraint, ignore DSM; sacrifices the up-to-2× higher accuracy reported when DSM is present, but stays within current cache contract; (b) **Request DSM tiles from the Suite Sat Service** — adds C2 cache schema work + a Suite Sat Service contract change; preserves 6-DoF accuracy; (c) **IMU/VIO-only attitude + 2D-2D matching translation** — same as (a) but explicitly contracts the IMU/VIO module to provide attitude with σ ≤ 5° (per Fact #24); operationally identical to (a), differs only in how the contract is written.
|
||||
- **Source**: Source #40, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + Suite Sat Service stakeholder + AC owner
|
||||
- **Confidence**: ✅ for the architectural claim; ✅ for the 4× accuracy collapse number
|
||||
- **Related Dimension**: SQ2 (decomposition), C2 (cache schema), C3 (matcher output contract), C4 (pose), C5 (estimator), C6 (IMU/VIO contract), AC-1.1 / AC-1.1.1 (accuracy budget)
|
||||
- **Fit Impact**: **architectural decision required, surfaced for user.** The current restrictions.md (no DSM commitment) implicitly forces option (a) or (c). The accuracy budget AC-1.1.1 (≤80 m at 1 km AGL) is loose enough that 3-DoF + IMU-attitude almost certainly satisfies it on a per-frame basis (per Fact #21 and DSMAC-class lineage in Fact #17), but **requires explicit acknowledgement** in the architecture before commitment. **Proposed default** for `solution_draft01`: option (c) — fix attitude from IMU/VIO with documented σ ≤ 5° contract on yaw, σ ≤ 5° on pitch (per Fact #24), translation from 2D-2D matching + camera pose. Flag option (b) as a "Suite Sat Service follow-up" if 6-DoF accuracy ever becomes a hard requirement.
|
||||
|
||||
### Fact #24 — IMU-derived yaw and pitch priors with σ ≤ 5° are required for the matching+PnP stack to hit benchmark accuracy; yaw σ = 10° costs ~2% A@5m, σ = 30° ~4%, σ = 50° ~14%, and σ = 60° 25.7%
|
||||
- **Statement**: Source #41 AnyVisLoc systematically perturbs yaw and pitch priors and measures localization accuracy collapse. Yaw: σ = 5° → no impact; σ = 10° → −1.9% A@5m; σ = 30° → −4.1%; σ = 50° → −13.7%; σ = 60° → −25.7%. Pitch: σ < 5° → no impact; σ ≥ 7° → 1–5% drops. The benchmark is conducted at low altitude (30–300 m AGL) with 20–90° pitch range; lessons transfer to our 1 km AGL nadir-camera regime in the **direction** but the magnitudes may be lower at 1 km AGL because nadir geometry is less yaw-sensitive than oblique. Conservatively adopting the benchmark numbers gives a hard contract: **IMU/VIO must deliver yaw with σ ≤ 5° and pitch with σ ≤ 5° to the matcher** (1σ, not 95%, since the benchmark is single-σ). Pitch is naturally tighter on a nadir-fixed camera (mechanically constrained); yaw is the binding constraint and is the typical IMU/magnetometer failure mode (per SPRIN-D lesson Fact #15).
|
||||
- **Source**: Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C1 (VIO) implementer + C5 (estimator) implementer
|
||||
- **Confidence**: ✅ for the AnyVisLoc numbers; ⚠️ for direct transfer to 1 km AGL nadir regime (magnitudes likely smaller at our altitude/pitch — direction is conservative)
|
||||
- **Related Dimension**: SQ2 (sensor-prior contract), C1 (VIO output contract), C5 (estimator), C6 (IMU)
|
||||
- **Fit Impact**: **architectural contract** for `solution_draft01`: the C1 module's published contract to the C2/C3 stack is yaw σ ≤ 5° AND pitch σ ≤ 5°. Magnetometer-only yaw is **insufficient** by the SPRIN-D lesson (Fact #15) — VIO must contribute. **Adds a constraint** that flows back to the C6 IMU integration: IMU mechanical isolation per SPRIN-D Fact #15 is required; magnetometer + GPS-yaw startup alignment at the airbase (before take-off, while real GPS is healthy) is part of the boot sequence.
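
To make the yaw-prior sensitivity concrete, the sketch below tabulates the AnyVisLoc yaw points quoted in the Statement and checks the σ ≤ 5° contract. The piecewise-linear interpolation between benchmark points and the clamping above 60° are assumptions added for illustration only; the source reports only the five discrete points, and the function names are hypothetical.

```python
# (yaw sigma in degrees, A@5m drop in percentage points), as quoted from Source #41.
YAW_DROP_POINTS = [(5.0, 0.0), (10.0, 1.9), (30.0, 4.1), (50.0, 13.7), (60.0, 25.7)]


def expected_a5m_drop(yaw_sigma_deg):
    """Estimate the A@5m drop for a given yaw sigma.

    ASSUMPTION: linear interpolation between the reported points; values
    above 60 degrees are clamped to the last point, which is a lower bound.
    """
    if yaw_sigma_deg <= YAW_DROP_POINTS[0][0]:
        return 0.0
    for (s0, d0), (s1, d1) in zip(YAW_DROP_POINTS, YAW_DROP_POINTS[1:]):
        if yaw_sigma_deg <= s1:
            t = (yaw_sigma_deg - s0) / (s1 - s0)
            return d0 + t * (d1 - d0)
    return YAW_DROP_POINTS[-1][1]


def meets_prior_contract(yaw_sigma_deg, pitch_sigma_deg, limit_deg=5.0):
    """C1 -> C2/C3 contract from the text: both 1-sigma priors within 5 degrees."""
    return yaw_sigma_deg <= limit_deg and pitch_sigma_deg <= limit_deg


if __name__ == "__main__":
    for sigma in (5.0, 10.0, 20.0, 60.0):
        print(sigma, round(expected_a5m_drop(sigma), 2))
    print(meets_prior_contract(4.0, 2.0), meets_prior_contract(8.0, 2.0))
```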
|
||||
|
||||
### Fact #25 — Top-N re-ranking by inlier count is the dominant accuracy/cost trade-off; pure-matching-without-retrieval is catastrophic (A@5m collapses from 62.2% to 34.3% with the same matcher)
|
||||
- **Statement**: Source #41 AnyVisLoc and Source #38 Skoltech survey both quantify the value of retrieval as a search-space reducer for matching. Source #41 explicitly: "Top-N re-rank by inlier count is the best accuracy/cost trade-off" → 62.2% A@5m at 0.8 s/frame on RTX 3090. **Without retrieval** (pure exhaustive matching against the cache): 34.3% A@5m — i.e., almost **half** the accuracy at infeasible compute. Source #38 measures sparse-VPR re-ranking specifically: AnyLoc descriptor + SuperGlue re-rank on top-100 candidates = 15–25 s/frame on RTX 3090 (catastrophic for our 400 ms budget); LightGlue re-rank ≈ 1 s/frame (still over budget); SelaVPR re-rank < 0.1 s/frame (in-budget on RTX 3090, must be re-tested on Jetson Orin Nano). **Re-ranking budget** = (frame budget) − (descriptor extraction) − (initial top-N retrieval) − (matcher pose estimation) − (AdHoP if included).
|
||||
- **Source**: Source #38, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System architects + C2 implementer
|
||||
- **Confidence**: ✅ (two-source convergence on the qualitative claim; quantitative numbers are RTX-3090-specific and must be Jetson-MVE'd)
|
||||
- **Related Dimension**: SQ2 (pipeline structure), C2 (VPR), C3 (matcher), SQ3+SQ4 (Jetson MVE)
|
||||
- **Fit Impact**: **mandates** Top-N re-rank by inlier count as a stage in `solution_draft01`. The choice of N (typically 5–20 in the literature) is deferred to the SQ3+SQ4 candidate matrix and is not decided in SQ2.
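
A minimal sketch of the re-ranking stage and its budget formula from the Statement above. The `count_inliers` callable stands in for a real matcher plus RANSAC verification and is hypothetical, as are the tile identifiers and the example timings.

```python
def rerank_budget_ms(frame_ms, descriptor_ms, retrieval_ms, pose_ms, adhop_ms=0.0):
    """Re-ranking budget = frame budget minus every other stage."""
    return frame_ms - descriptor_ms - retrieval_ms - pose_ms - adhop_ms


def rerank_by_inliers(candidates, count_inliers, top_n):
    """Re-order the top-N retrieval hits by geometric inlier count.

    `candidates` is the retrieval output, best descriptor match first.
    Only the first `top_n` entries are matched, which is what bounds the cost.
    """
    shortlist = candidates[:top_n]
    return sorted(shortlist, key=count_inliers, reverse=True)


if __name__ == "__main__":
    # Hypothetical inlier counts per cached tile id.
    inliers = {"tile_a": 12, "tile_b": 80, "tile_c": 45, "tile_d": 500}
    ranked = rerank_by_inliers(["tile_a", "tile_b", "tile_c", "tile_d"],
                               inliers.get, top_n=3)
    print(ranked)
    print(rerank_budget_ms(400.0, 60.0, 20.0, 120.0))
```

Note that `tile_d` is never matched in this example because it falls outside the top-N shortlist; that is the accuracy/cost trade-off that the N sweep in SQ3+SQ4 is meant to quantify.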
|
||||
|
||||
### Fact #26 — High-accuracy SOTA models (AnyLoc + SuperGlue + RoMa-class) are NOT viable on Jetson Orin Nano under the 400 ms p95 budget; lightweight VPR (MixVPR / SALAD / SelaVPR-class) + lightweight matchers (LightGlue / XFeat-class) are the only candidates that survive a basic latency pre-screen
|
||||
- **Statement**: Two independent runtime measurements on RTX 3090 (≥10× faster than Jetson Orin Nano in dense matrix ops): Source #38 — AnyLoc descriptor calculation 0.37–0.84 s/frame (huge ViT-G DINOv2); SuperGlue re-rank 15–25 s/frame on top-100; LightGlue re-rank ~1 s/frame; SelaVPR re-rank < 0.1 s/frame. Source #41 — RoMa dense matcher 659 ms/frame; SP+LightGlue+GIM sparse 105 ms/frame; ratio = 6.3×. **Memory**: AnyLoc descriptors = 2.3–13.9 GB for 4–7k tiles (out of 8 GB Jetson Orin Nano envelope before model weights); SelaVPR descriptors < 0.2 GB. Pre-screen conclusion: AnyLoc / SuperGlue / RoMa-class are **disqualified** on the Jetson Orin Nano at 3 fps unless heavy quantization (INT8) reduces them ≥10×, which is not yet established for our latency target on this hardware. Surviving candidates from the literature: **VPR**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD-class; **matchers**: LightGlue, XFeat, XFeat*, SP+LightGlue. **Disqualification is preliminary** — final go/no-go happens at SQ3+SQ4 with on-Jetson MVE per `references/mode-A-mve-rules.md`.
|
||||
- **Source**: Source #38, Source #41
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: C2 + C3 implementer; SQ3+SQ4 candidate-matrix author
|
||||
- **Confidence**: ✅ for RTX-3090 numbers; ⚠️ for direct Jetson translation (Jetson Orin Nano AI score is well-published; ratio is conservative)
|
||||
- **Related Dimension**: SQ2 (Jetson budget feasibility), SQ3+SQ4 (candidate pre-screen), SQ5 (foundation-model-on-edge failure mode), C2, C3, C7 (Jetson runtime)
|
||||
- **Fit Impact**: **prunes the SQ3+SQ4 candidate matrix BEFORE expensive Jetson MVE.** Candidates entering SQ3+SQ4 with mandatory Jetson MVE: (C2 VPR) MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD; (C3 matcher) LightGlue, XFeat, XFeat*, SP+LightGlue. Candidates that need Jetson INT8 quant before they earn an MVE slot: AnyLoc, BoQ, DINOv2-VLAD (must demonstrate INT8 build path with vendor-validated accuracy preservation). Candidates pruned outright: RoMa dense, SuperGlue, MASt3R (latency).
|
||||
|
||||
### Fact #27 — A 20% covisibility floor between query frame and reference tile is required for localization to succeed; below it, ALL methods fail regardless of matcher quality
|
||||
- **Statement**: Source #40 OrthoLoC: "When the covisibility between the UAV image and the orthographic geodata is too small (less than ~20%), the localization fails for all methods regardless of matcher quality." This is a geometric floor, not a method-specific limit. The implication for the project: any tile-cache design that allows a query to fall outside 20% covisibility with the **best available** cached tile must also include a **runtime covisibility-check + graceful degrade** to `visual_propagated` mode (per AC-1.4 source label). This is a runtime condition, not a one-time setup parameter.
|
||||
- **Source**: Source #40
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: C2 (cache scheduler) + C5 (estimator) + AC-1.4 owner
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: SQ2 (boundary condition), C2 (tile cache), C5 (estimator state machine), AC-1.4
|
||||
- **Fit Impact**: **adds a runtime invariant** to `solution_draft01`: tile selection must guarantee ≥20% covisibility OR explicitly emit the `visual_propagated` source label per AC-1.4 with covariance widened per AC-NEW-4. This becomes a hard constraint on the C2 cache schema (must support tile-extent metadata) and a runtime check before invoking C3 matcher.
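
The runtime invariant can be sketched as a pre-matcher check. This sketch assumes axis-aligned ground footprints in a shared metric frame and defines covisibility as intersection area over query-footprint area; both are simplifying assumptions (Source #40 may define covisibility differently), and the `Rect` type and function names are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned ground footprint in metres (assumed shared local frame)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def area(r):
    return max(0.0, r.max_x - r.min_x) * max(0.0, r.max_y - r.min_y)


def covisibility(query, tile):
    """Fraction of the query footprint covered by the tile (assumed definition)."""
    overlap = Rect(max(query.min_x, tile.min_x), max(query.min_y, tile.min_y),
                   min(query.max_x, tile.max_x), min(query.max_y, tile.max_y))
    q_area = area(query)
    return area(overlap) / q_area if q_area > 0.0 else 0.0


def source_label(query, cached_tiles, floor=0.20):
    """Run the C3 matcher only if the best tile clears the covisibility floor."""
    best = max((covisibility(query, t) for t in cached_tiles), default=0.0)
    return "satellite_anchored" if best >= floor else "visual_propagated"


if __name__ == "__main__":
    query = Rect(0.0, 0.0, 100.0, 100.0)
    print(source_label(query, [Rect(50.0, 0.0, 150.0, 100.0)]))
    print(source_label(query, [Rect(90.0, 0.0, 190.0, 100.0)]))
```

A return value of `visual_propagated` is the graceful-degrade path; the covariance widening required by AC-NEW-4 would be applied by the estimator and is not modelled here.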
|
||||
|
||||
---
|
||||
|
||||
## SQ2 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Pipeline-component coverage table (existing C1–C10 vs. survey-listed components)
|
||||
|
||||
| Survey/benchmark canonical stage | Project component (current) | Coverage status | Required action |
|---|---|---|---|
| Image retrieval (global VPR) | **C2 — Visual Place Recognition** | ✅ covered | No change |
| Re-ranking (top-N inlier-based) | (currently implicit, inside C2 or C3) | ⚠️ implicit | **Promote to explicit sub-stage** (`C2.5` or `C3.0`) in `solution_draft01` |
| Local image matching (2D-2D, sparse or dense) | **C3 — Cross-domain registration** | ✅ covered | Add Top-N re-rank-by-inlier-count requirement |
| AdHoP-style perspective preconditioning | (not represented) | ❌ missing | **Add as optional sub-stage** between C3 and C4, gated on Jetson latency budget |
| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | ❌ architectural decision required | **Decision required from user** — see below |
| Pose estimation (PnP + RANSAC + LM) | **C4 — Pose estimation** | ✅ covered | No change |
| State estimator / fusion (UKF / ESKF / MSCKF / factor graph) | **C5 — Estimator / fusion** | ✅ covered | Augmented with covariance-honesty contract from AC-NEW-4 |
| IMU + VIO contract | **C1 — VO/VIO** + **C6 — IMU integration** | ✅ covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract from Fact #24 |
| Tile cache + scheduler | **C2 — VPR tile cache** + **C6 — Tile cache + spatial index** + **C10 — Pre-flight cache freshness pipeline** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27). (Cache hygiene moved from former-C9 to C10 per 2026-05-08 C9 / SQ7 restructure.) |
| Anti-spoof / source-switch | **C7 — Spoof detection** + **C8 — FC adapter** | ✅ covered | Already addressed in SQ6 |
| Health monitoring / safety | **C10 — Safety / health monitoring** | ✅ covered | Already addressed |
|
||||
|
||||
### Architectural decisions surfaced (require user resolution before SQ3+SQ4 starts)
|
||||
|
||||
1. **DSM dependency on the Suite Sat Service tile cache** (per Fact #23). Three options:
|
||||
- **(a) 3-DoF acceptance** — accept that without DSM, only position is recovered from matching; attitude is fixed by IMU/VIO with no satellite-tile cross-check. Lowest project scope. Requires AC budget verification (likely passes AC-1.1.1).
|
||||
- **(b) Request DSM tiles** — Suite Sat Service contract change. Highest accuracy. Adds ~1 cycle to delivery. Recommended if 6-DoF accuracy ever becomes a hard AC.
|
||||
- **(c) IMU/VIO-attitude + 2D-2D matching translation** — operationally identical to (a) but contracts the IMU/VIO module explicitly with σ ≤ 5° yaw / pitch (Fact #24).
|
||||
- **Recommended default**: **(c)** — explicit IMU/VIO contract; fall back to (b) if AC tightens.
|
||||
|
||||
2. **AdHoP refinement loop** (per Fact #22). Three options:
|
||||
- **(a) Always-on** — included in every frame; Jetson budget must accommodate 2× matching latency.
|
||||
- **(b) Conditional** — only when initial reprojection error exceeds a threshold; gated on per-frame budget.
|
||||
- **(c) Off (initial release)** — relegate to offline-replay refinement.
|
||||
- **Recommended default**: **(b) Conditional** — fits within latency variance budget while capturing the cross-domain accuracy gain.
|
||||
|
||||
3. **Top-N re-rank promotion to explicit pipeline sub-stage** (per Fact #25). Recommendation: promote to a named sub-stage in `solution_draft01` with N as an SQ3+SQ4 hyperparameter sweep target.
|
||||
|
||||
### Component-pruning carried into SQ3+SQ4
|
||||
|
||||
- **C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
|
||||
- **C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path**: AnyLoc, BoQ, DINOv2-VLAD.
|
||||
- **C2 candidates pruned**: SuperGlue-as-reranker (latency).
|
||||
- **C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template).
|
||||
- **C3 candidates pruned**: RoMa, MASt3R, DKM (dense matcher latency on Jetson).
|
||||
- **C3 candidates as "AerialExtreMatch reference points" only, NOT for production**: GIM+DKM, GIM+LightGlue (per Source #40, used as accuracy benchmark only).
|
||||
|
||||
### Boundary check: SQ2 is saturated
|
||||
|
||||
Saturation signals observed: (a) five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey) converge on the **same** "retrieval → matching → pose-estimation hierarchical framework" as canonical; (b) two independent runtime sources (Skoltech survey on RTX 3090; AnyVisLoc on RTX 3090 with explicit dense-vs-sparse breakdown) agree on the relative cost ordering of model classes; (c) the AdHoP value rests on Source #40 alone, but with reproducible code and dataset (single-source-but-strong evidence); (d) the covisibility floor (Source #40) and the sensor-prior thresholds (Source #41) are each single-source but benchmark-measured. Two outstanding decisions are flagged for the user; neither blocks SQ2's saturation status, but both block the SQ3+SQ4 start. Per `references/source-tiering.md` "Search saturation rule" → SQ2 is closed pending user decisions on DSM dependency + AdHoP gating.
|
||||
@@ -0,0 +1,148 @@
|
||||
# Fact Cards — SQ6: ArduPilot Plane vs iNav external positioning
|
||||
|
||||
> Mode A Phase 2 — engine Step 3 (Fact Extraction & Evidence Cards). Extracted from sources logged in `../01_source_registry/SQ6_external_positioning.md` (see `../01_source_registry/00_summary.md` for index). Confidence labels: ✅ High (L1 / verified source code), ⚠️ Medium (L1/L2 with caveat), ❓ Low (L3/L4 inferential). Bound to sub-questions in `../00_question_decomposition.md`.
|
||||
>
|
||||
> Index: [`../00_summary.md`](../00_summary.md). Sibling categories: SQ1 ([existing systems](SQ1_existing_systems.md)), SQ2 ([canonical pipeline](SQ2_canonical_pipeline.md)), C1 ([VIO](C1_vio.md)), C2 ([VPR](C2_vpr.md)), C3 ([matchers](C3_matchers.md)).
|
||||
|
||||
**Facts in this file**: #1–#10 (ArduPilot/iNav inbound positioning interfaces, covariance honesty, spoof-promotion, dead-reckoning, UBX emulation) + SQ6 working conclusions.
|
||||
|
||||
---
|
||||
|
||||
## SQ6 — ArduPilot Plane vs iNav external positioning
|
||||
|
||||
### Fact #1 — ArduPilot Plane EKF3 ingests `GPS_INPUT` (MAVLink ID 232) as a first-class GPS source
|
||||
- **Statement**: ArduPilot's `AP_GPS_MAV` driver (master) decodes `MAVLINK_MSG_ID_GPS_INPUT` and stores the resulting state into the GPS slot identified by `gps_id`. Decoded fields: lat/lon (degE7), alt (m → cm internally), hdop/vdop, velocity (vn/ve/vd, m/s), speed/horizontal/vertical accuracy (m / m/s), yaw (cdeg, `0` sentinel = "not provided"). Honors `ignore_flags` for ALT/HDOP/VDOP/VEL_HORIZ/VEL_VERT/SPEED_ACCURACY/HORIZONTAL_ACCURACY/VERTICAL_ACCURACY. Requires `fix_type ≥ 3` and `time_week > 0` for jitter-corrected timestamping.
|
||||
- **Source**: Source #4 (AP_GPS_MAV.cpp master), Source #1 (Plane Non-GPS Navigation docs)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: ArduPilot Plane operators / developers
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8 (FC adapter), C5 (estimator covariance contract)
|
||||
- **Fit Impact**: **supports selection** — ArduPilot side of AC-4.3 is satisfied by `GPS_INPUT` as the primary external-positioning message; covariance fields (`horiz_accuracy`, `vert_accuracy`, `speed_accuracy`) are wired through.
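
The unit conventions and the yaw sentinel listed in the Statement are easy to get wrong on the companion side, so a small encoding sketch follows. The helper names are hypothetical. The "send 36000 for due north" rule is taken from the MAVLink `GPS_INPUT` field description as understood here and should be re-checked against the dialect in use.

```python
YAW_NOT_PROVIDED = 0  # GPS_INPUT sentinel: yaw == 0 means "no yaw supplied"


def encode_yaw_cdeg(yaw_deg):
    """Encode a heading in centidegrees for the GPS_INPUT yaw field.

    None maps to the 0 sentinel; a true heading of 0 degrees (north) is
    sent as 36000 so it is not mistaken for "not provided".
    """
    if yaw_deg is None:
        return YAW_NOT_PROVIDED
    cdeg = round((yaw_deg % 360.0) * 100) % 36000
    return 36000 if cdeg == 0 else cdeg


def encode_lat_lon_e7(lat_deg, lon_deg):
    """Latitude/longitude as integer degrees * 1e7 (degE7)."""
    return round(lat_deg * 1e7), round(lon_deg * 1e7)


def usable_for_timestamping(fix_type, time_week):
    """Precondition quoted in the text for jitter-corrected timestamping."""
    return fix_type >= 3 and time_week > 0


if __name__ == "__main__":
    print(encode_yaw_cdeg(None), encode_yaw_cdeg(0.0), encode_yaw_cdeg(90.0))
    print(encode_lat_lon_e7(50.4501, 30.5234))
    print(usable_for_timestamping(3, 2400))
```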
|
||||
|
||||
### Fact #2 — ArduPilot's covariance honesty (AC-NEW-4) is enforced via the `horiz_accuracy` field of `GPS_INPUT`
|
||||
- **Statement**: When `GPS_INPUT_IGNORE_FLAG_HORIZONTAL_ACCURACY` is unset, AP_GPS stores `packet.horiz_accuracy` into `state.horizontal_accuracy` and sets `state.have_horizontal_accuracy = true`. EKF3's quality chain consumes this via (a) ground-stationary 3 m drift check (`_gpsCheckScaler`-modulated), (b) innovation gating (`EK3_POS_I_GATE`/`EK3_VEL_I_GATE`), (c) soft de-weighting via `EK3_GLITCH_RAD` (PR #24135). Under-reporting `horiz_accuracy` defeats these gates, which is exactly the AC-NEW-4 risk the project flagged.
|
||||
- **Source**: Source #4, Source #23 (PR #24135), Source #24 (AP_NavEKF3 master)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers writing the C5 estimator → C8 adapter
|
||||
- **Confidence**: ✅ (source code + L1 docs); ⚠️ for the precise innovation-gate mechanics (deferred to design-phase SITL tuning)
|
||||
- **Related Dimension**: C5 covariance, AC-NEW-4
|
||||
- **Fit Impact**: **architectural constraint** — the C5 estimator MUST publish honest `horiz_accuracy` (not optimistic) for AP's EKF3 quality chain to function. Aligns directly with AC-1.4 / AC-NEW-4.
|
||||
|
||||
### Fact #3 — ArduPilot supports runtime EKF source-set switching from companion via `MAV_CMD_SET_EKF_SOURCE_SET`
|
||||
- **Statement**: EKF3 supports up to three source sets (`EK3_SRC1..3_*`). A companion can request a switch by sending `MAV_CMD_SET_EKF_SOURCE_SET`. Alternative paths: RC aux-switch option 90 ("EKF Pos Source"), Lua scripts (e.g., `ahrs-source.lua`). **Caveat from L1 docs**: "no GCSs are currently known to implement this" — companion-driven switching works at the firmware level but is not exposed in stock GCS UIs.
|
||||
- **Source**: Source #2, Source #3
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers handling AC-NEW-2 spoof-promotion path on ArduPilot
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8 + AC-NEW-2
|
||||
- **Fit Impact**: **supports selection** — AP allows the project to model two source sets (set 1 = real GPS, set 2 = onboard `GPS_INPUT`) and switch automatically. Keeps companion lightweight; switching does not require the companion to suppress real-GPS itself.
|
||||
|
||||
### Fact #4 — ArduPilot ODOMETRY-velocity-only fusion is currently NOT supported (open enhancement)
|
||||
- **Statement**: Issue #23485 confirms current limitation: feeding `ODOMETRY` without position causes EKF position-estimate timeout / failsafe. Implication: the project's `visual_propagated` mode (VO drift between satellite anchors, no global position) **cannot be expressed as ODOMETRY-velocity-only on current AP** — must be sent as a full `GPS_INPUT` with covariance widened to reflect drift uncertainty.
|
||||
- **Source**: Source #8
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers
|
||||
- **Confidence**: ✅ (open enhancement, open as of accessed date)
|
||||
- **Related Dimension**: C5 + C8 + AC-1.3 (`visual_propagated` label) + AC-1.4 (covariance ellipse)
|
||||
- **Fit Impact**: **architectural constraint** — `visual_propagated` and `dead_reckoned` labels both ride `GPS_INPUT` with growing `horiz_accuracy`, NOT a separate `ODOMETRY` channel. Single-message contract = simpler. AC-NEW-8 thresholds (`horiz_accuracy = 999.0` for "no fix") map directly.
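
One way to picture the single-message contract is a function that maps the source label to the `horiz_accuracy` value sent in `GPS_INPUT`. The linear drift-growth model, the parameter names, and the example numbers below are illustrative assumptions; only the three labels and the 999.0 "no fix" value come from the text.

```python
NO_FIX_HORIZ_ACCURACY_M = 999.0  # "no fix" value named in the text (AC-NEW-8)

PROPAGATED_LABELS = ("visual_propagated", "dead_reckoned")


def horiz_accuracy_m(label, anchor_sigma_m, seconds_since_anchor,
                     drift_rate_m_per_s):
    """Horizontal accuracy to report for a given source label.

    ASSUMPTION: uncertainty grows linearly with time since the last
    satellite anchor; a real estimator would report its own covariance.
    """
    if label == "satellite_anchored":
        sigma = anchor_sigma_m
    elif label in PROPAGATED_LABELS:
        sigma = anchor_sigma_m + drift_rate_m_per_s * seconds_since_anchor
    else:
        raise ValueError(f"unknown source label: {label!r}")
    # Never report worse than the agreed "no fix" ceiling.
    return min(sigma, NO_FIX_HORIZ_ACCURACY_M)


if __name__ == "__main__":
    print(horiz_accuracy_m("satellite_anchored", 5.0, 30.0, 0.5))
    print(horiz_accuracy_m("visual_propagated", 5.0, 30.0, 0.5))
    print(horiz_accuracy_m("dead_reckoned", 5.0, 3600.0, 2.0))
```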
|
||||
|
||||
### Fact #5 — iNav firmware (master, post-9.0) has NO inbound MAVLink handler for any external-positioning message
|
||||
- **Statement**: Authoritative inbound switch in `src/main/telemetry/mavlink.c::processMAVLinkIncomingTelemetry` (master) handles only: HEARTBEAT, PARAM_REQUEST_LIST (stub reply), MISSION_CLEAR_ALL, MISSION_COUNT, MISSION_ITEM, MISSION_REQUEST_LIST, MISSION_REQUEST, COMMAND_INT (only `MAV_CMD_DO_REPOSITION`), RC_CHANNELS_OVERRIDE, ADSB_VEHICLE, RADIO_STATUS. **No `GPS_INPUT`, `VISION_POSITION_ESTIMATE`, `ODOMETRY`, `GLOBAL_POSITION_INT`, or `GPS_RAW_INT` are accepted as inputs.** Wiki page (Source #10) confirms: "Limited command support: Commands that are not implemented are ignored."
|
||||
- **Source**: Source #9 (master code), Source #10 (wiki, edited 2025-12-11)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers + AC-4.3 author
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, AC-4.3
|
||||
- **Fit Impact**: **DISQUALIFIES the literal AC-4.3 wording** ("the standard external-positioning message type(s) accepted by ArduPilot AND iNav"). No single MAVLink external-positioning message is accepted by both FCs. Project must adopt a per-FC adapter design and AC-4.3 must be revised to acknowledge two transports.
|
||||
|
||||
### Fact #6 — iNav accepts external GPS injection via two MSP paths; `MSP2_SENSOR_GPS` is the covariance-rich path
|
||||
- **Statement**: `MSP_SET_RAW_GPS (201)` (legacy MSP1, 14 bytes): fixType, numSat, lat, lon, alt (m, internal cm), speed (cm/s). **No covariance, no per-axis velocity, no yaw.** `MSP2_SENSOR_GPS (7939, MSPv2 sensor plugin)`: instance, gpsWeek, msTOW, fixType, satellitesInView, hPosAccuracy (mm), vPosAccuracy (mm), hVelAccuracy (cm/s), hdop, lat, lon, mslAltitude (cm), nedVelNorth/East/Down (cm/s), groundCourse (cdeg×100), trueYaw (cdeg×100), date+time. Routes through `mspGPSReceiveNewData()` via `GPS_PROVIDER_MSP`. Requires build flag `USE_GPS_PROTO_MSP` — **enabled by default in iNav's `target/common.h`**, so stock firmware reaches this path.
|
||||
- **Source**: Source #12 (MSP message reference, master), Source #13 (target/common.h master + gps.c provider table)
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers (C8 adapter, MSP transport)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, C5 covariance contract
|
||||
- **Fit Impact**: **supports selection** of `MSP2_SENSOR_GPS` for the iNav adapter. Covariance fields (`hPosAccuracy`, `vPosAccuracy`, `hVelAccuracy`) align semantically with `GPS_INPUT.horiz_accuracy` / `vert_accuracy` / `speed_accuracy`, but unit conversions differ (mm vs m). The C8 adapter must therefore be FC-aware, not protocol-monomorphic.
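A small sketch of why the adapter has to be FC-aware: the same accuracy values in SI units map to different field names and units per FC. Function names are hypothetical, and the 16-bit unsigned width assumed for the iNav fields should be confirmed against the iNav MSP message reference before relying on the saturation behaviour shown here.

```python
def _u16(value):
    """Round and clamp to an unsigned 16-bit range (assumed iNav field width)."""
    return max(0, min(65535, int(round(value))))


def accuracy_for_ardupilot(horiz_m, vert_m, speed_mps):
    """GPS_INPUT carries accuracy in metres and metres per second."""
    return {
        "horiz_accuracy": float(horiz_m),
        "vert_accuracy": float(vert_m),
        "speed_accuracy": float(speed_mps),
    }


def accuracy_for_inav(horiz_m, vert_m, speed_mps):
    """MSP2_SENSOR_GPS carries accuracy in millimetres and centimetres per second."""
    return {
        "hPosAccuracy": _u16(horiz_m * 1000.0),   # m -> mm
        "vPosAccuracy": _u16(vert_m * 1000.0),    # m -> mm
        "hVelAccuracy": _u16(speed_mps * 100.0),  # m/s -> cm/s
    }
```

If the 16-bit assumption holds, millimetre fields saturate at about 65.5 m, which is worth checking against the AC-NEW-8 thresholds before the iNav mapping is finalized.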
|
||||
|
||||
### Fact #7 — iNav does NOT support dual-GPS arbitration; companion must be the SOLE GPS source
|
||||
- **Statement**: Issue #10141 is an open feature request for dual-GPS support. Current iNav (master incl. 9.0.x) has single-GPS architecture with one UART selected as the GPS port. There is no primary/secondary failover and no per-instance arbitration in the nav stack.
|
||||
- **Source**: Source #14
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers (architecture)
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: C8, C5, AC-NEW-2 (spoof promotion)
|
||||
- **Fit Impact**: **architectural constraint** — on iNav, real GPS receivers must NOT be wired directly to the FC. Real GPS goes to the companion; the companion fuses (or rejects) it and emits the single iNav-facing feed via MSP2_SENSOR_GPS (or via a UBX-emulation UART). AC-NEW-2 latency on iNav = companion's internal reaction time only; iNav does not participate in source switching at all.
|
||||
|
||||
### Fact #8 — iNav explicitly does NOT validate GPS for spoofing; anti-spoofing is fully the companion's responsibility
|
||||
- **Statement**: iNav's `docs/GPS_fix_estimation.md` states verbatim: "Not a solution for GPS spoofing (GPS output is not validated in INAV)." Combined with Fact #7, the architectural conclusion on iNav: companion = anti-spoofing oracle + nav-camera estimator + IMU-propagation source, all collapsed into the single MSP2_SENSOR_GPS feed.
|
||||
- **Source**: Source #15
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: System designers; AC-NEW-2 / AC-3.5 / AC-NEW-8 owners
|
||||
- **Confidence**: ✅
|
||||
- **Related Dimension**: AC-NEW-2, AC-3.5, AC-NEW-8
|
||||
- **Fit Impact**: **supports selection** of "companion as iNav's only GPS"; **disqualifies** any architecture that relies on iNav-side spoof detection for AC-NEW-2 reaction.
|
||||
|
||||
### Fact #9 — iNav dead-reckoning has documented stability bugs under intermittent feeds; AC-NEW-8 must avoid letting iNav enter dead-reckoning
|
||||
- **Statement**: Issue #10588 documents porpoising and motor-burst behaviour during intermittent GPS outages on iNav fixed-wing dead-reckoning. The community recommendation captured in the issue: "GPS should be rejected if providing erroneous coordinates rather than no fix." `inav_allow_dead_reckoning` (default OFF) and `inav_allow_gps_fix_estimation` (default OFF) are both fixed-state booleans, so entering dead-reckoning mid-flight is a discrete transition, not a smooth degrade.
- **Source**: Source #15, Source #16 (Settings.md), Source #17 (#10588)
- **Phase**: Phase 2
- **Target Audience**: System designers; AC-NEW-8 owner
- **Confidence**: ✅ for setting names; ⚠️ for severity of stability bug (single open issue)
- **Related Dimension**: AC-NEW-8, AC-3.5, C8
- **Fit Impact**: **architectural constraint**: on iNav, the AC-NEW-8 path must keep emitting `MSP2_SENSOR_GPS` with growing `hPosAccuracy` rather than letting the feed drop and iNav switch to dead-reckoning. "No fix" on iNav must be expressed via the `fixType` field of `MSP2_SENSOR_GPS`, not by silence. Beyond `fixType`, the horizontal/vertical accuracy fields are the only degradation signal available; iNav has no equivalent of the AP `horiz_accuracy = 999.0` "no fix" sentinel, so which `fixType` enum values iNav treats as no-fix must be verified.
|
||||
|
||||
### Fact #10 — iNav supports UBX-only over UART (NMEA dropped in 7.0); UBX emulation is a viable third transport
|
||||
- **Statement**: iNav 7.0 removed NMEA. As of 9.0+, iNav supports the u-blox UBX protocol at version ≥ 15.00. Recommended physical receivers: u-blox M8/M9/M10. The companion can implement a UBX-emulation writer on the iNav GPS UART (NAV-PVT mandatory; NAV-DOP optional). UBX carries `hAcc`/`vAcc`/`headAcc`/velocity components, so covariance honesty is preserved.
- **Source**: Source #11 (iNav GPS-and-Compass-setup wiki)
- **Phase**: Phase 2
- **Target Audience**: System designers (transport-choice)
- **Confidence**: ✅ for UBX-only; ⚠️ for "minimum NAV-* set": the canonical u-blox protocol spec (Source filed in agent-tools as `fd8513f8-...txt`) plus iNav's `gps_ublox.c` determine the precise message set; **this requires a follow-up search before final selection**.
- **Related Dimension**: C8 transport choice
- **Fit Impact**: **alternate candidate, NOT YET SELECTED**: the UBX path bypasses MSP queueing/arbitration concerns and treats the companion as a normal GPS to iNav. Trade-off: implementation cost (UBX writer + correct ACK behaviour) vs. MSP path (already-designed wire format, but iNav-specific).
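To give a sense of the UBX-writer cost, here is a minimal framing sketch using the standard UBX envelope (two sync bytes, class, id, little-endian length, payload, 8-bit Fletcher checksum). Function names are hypothetical, and the payload is treated as opaque bytes: the NAV-PVT field layout and the minimum NAV-* subset remain the follow-up items noted above.

```python
import struct

UBX_SYNC = b"\xb5\x62"
NAV_CLASS = 0x01
NAV_PVT_ID = 0x07


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = 0
    ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class: int, msg_id: int, payload: bytes) -> bytes:
    """Wrap an already-encoded payload in a UBX frame."""
    body = struct.pack("<BBH", msg_class, msg_id, len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)
```

A writer built on this would still need to answer the configuration and poll traffic iNav sends to a receiver, which is where the "correct ACK behaviour" cost sits.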
|
||||
|
||||
---
|
||||
|
||||
## SQ6 — Conclusions (working summary, will be re-checked at Step 7.5)
|
||||
|
||||
### Per-FC adapter design is unavoidable (single-message AC-4.3 wording is unsatisfiable)
|
||||
|
||||
| FC | Inbound external-positioning transport | Message | Covariance fields | Per-axis velocity | Yaw | Source-switching from companion |
|---|---|---|---|---|---|---|
| **ArduPilot Plane** | MAVLink (TELEM/USB/UDP serial) | `GPS_INPUT` (id 232), primary | `horiz_accuracy` (m), `vert_accuracy` (m), `speed_accuracy` (m/s) | `vn`, `ve`, `vd` (m/s) | `yaw` cdeg, 0 = not provided | `MAV_CMD_SET_EKF_SOURCE_SET` (FW supports; stock GCS UIs do not; companion-driven OK) |
| **iNav** | MSP2 (UART/USB) | `MSP2_SENSOR_GPS` (id 7939), primary candidate | `hPosAccuracy` mm, `vPosAccuracy` mm, `hVelAccuracy` cm/s | `nedVelNorth/East/Down` cm/s | `trueYaw` cdeg×100 | **N/A**: iNav has single-GPS arch; companion = sole GPS source |
| iNav alt 1 | MSP1 | `MSP_SET_RAW_GPS` (id 201), **rejected for production** | none | none | none | N/A |
| iNav alt 2 | UART | UBX emulation (NAV-PVT etc.), **alternate candidate, requires NAV-* subset verification** | UBX `hAcc`/`vAcc`/`headAcc` mm/cm/scale | NED in NAV-PVT | yes | N/A |
|
||||
|
||||
**Selection (preliminary, pending Step 7.5 component-fit gate):**
|
||||
- **AP path**: `GPS_INPUT` — Selected (lead).
|
||||
- **iNav path**: `MSP2_SENSOR_GPS` — Selected (lead). UBX-emulation kept as fallback if MSP2_SENSOR_GPS proves rate-limited or quality-flag-lossy.
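For the iNav lead candidate, the transport-level work is an MSPv2 frame around the sensor payload. The sketch below shows the framing only, as commonly documented for MSPv2: `$X<` preamble, flag byte, little-endian function id and payload size, payload, and a CRC-8/DVB-S2 trailer. Function names are hypothetical, the payload is left as opaque bytes because its exact field order should be taken from the iNav MSP message reference, and the `<` direction byte for sensor messages is an assumption to confirm.

```python
import struct

MSP2_SENSOR_GPS = 7939  # message id as listed in the table above


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2 (polynomial 0xD5, initial value 0, no reflection)."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def mspv2_frame(function_id: int, payload: bytes, flag: int = 0) -> bytes:
    """Build an MSPv2 frame addressed to the flight controller."""
    body = struct.pack("<BHH", flag, function_id, len(payload)) + payload
    return b"$X<" + body + bytes((crc8_dvb_s2(body),))
```

Whether this path stays selected depends on the rate-limit and quality-flag checks named above; the framing itself is small.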
|
||||
|
||||
### AC / Restriction binding (per-mode, Per-Mode API Capability Verification rule)
|
||||
|
||||
| Numbered AC / Restriction | AP `GPS_INPUT` | iNav `MSP2_SENSOR_GPS` | iNav `MSP_SET_RAW_GPS` |
|---|---|---|---|
| AC-1.4 (95% cov + source label `{satellite_anchored, visual_propagated, dead_reckoned}`) | **Pass** (`horiz_accuracy` carries 95% covariance proxy; source label is companion-side metadata, not in MAVLink; emit via STATUSTEXT/NAMED_VALUE_FLOAT) | **Pass** (`hPosAccuracy` = covariance proxy; same off-band source-label channel) | **Fail** (no covariance field → cannot publish 95% ellipse) |
| AC-NEW-4 (false-position safety budget; covariance honesty) | **Pass** (de-weighted via `EK3_GLITCH_RADIUS` if covariance is honest) | **Verify** (need to confirm iNav nav-stack actually uses `hPosAccuracy` for outlier handling; pre-Step-7.5 follow-up) | **Fail** |
| AC-NEW-2 (<3 s p95 spoof promotion) | **Verify** via SITL (`MAV_CMD_SET_EKF_SOURCE_SET` round-trip latency under load) | **Pass** by architecture (companion is sole GPS, no FC-side switch needed) | Pass-by-arch but Fails AC-1.4 |
| AC-NEW-8 (visual-blackout + spoofed GPS failsafe; covariance growth + degraded fix levels) | **Pass** (`fix_type` 0/1/2 + `horiz_accuracy=999.0` documented sentinel maps to AC-NEW-8 thresholds) | **Verify** (iNav's `fixType` enum mapping for "no fix"; pre-Step-7.5 follow-up) | **Fail** (no graceful degrade signal) |
| AC-3.5 (label switch within ≤1 frame OR ≤400 ms; reject spoofed GPS as input) | **Pass** by architecture (EKF source switch + STATUSTEXT) | **Pass** by architecture (companion suppresses spoofed-GPS contribution upstream) | Pass-by-arch but Fails AC-1.4 |
| AC-4.3 (FC accepts the chosen messages) | **Pass** | **Pass** (default build, `USE_GPS_PROTO_MSP` on) | **Pass** but Fails AC-1.4; discard |
| Restriction "Supported FCs: ArduPilot, iNav (both via standard MAVLink)" | **Pass** | **Fail** of "via standard MAVLink": the restriction's literal wording is incorrect because iNav has no inbound MAVLink external-positioning. The restriction must be revised to "ArduPilot via MAVLink GPS_INPUT; iNav via MSP2_SENSOR_GPS". | n/a |
|
||||
|
||||
### Required AC / Restrictions edits flagged for user review
|
||||
|
||||
1. **AC-4.3**: current text says "the standard external-positioning message type(s) accepted by ArduPilot and iNav". Reality: no single message type is accepted by both. **Proposed revision** (outcome-shaped, IEEE-830-style): "WGS84 coordinates are delivered to each supported FC via that FC's documented external-positioning interface: MAVLink `GPS_INPUT` for ArduPilot Plane, MSP2 `MSP2_SENSOR_GPS` for iNav. Honest covariance is carried in the field each FC uses for outlier rejection (under-reported covariance is a defect; see AC-NEW-4). Source-label semantics per AC-1.4 are emitted out-of-band (FC-appropriate STATUSTEXT / NAMED_VALUE_FLOAT / equivalent)."
2. **Restriction "Communication protocol (pinned): MAVLink for both FC and GCS"**: incorrect for iNav. **Proposed revision**: "Communication protocol: MAVLink for ArduPilot Plane and for QGroundControl GCS; MSP2 for iNav (UART or USB transport). MAVLink remains the GCS-facing protocol for both FCs." (iNav still emits MAVLink telemetry outbound to QGC; this is preserved.)
3. **AC-NEW-2**: keep the numerical budget (<3 s p95) but split per-FC validation: ArduPilot validation = SITL round-trip of `MAV_CMD_SET_EKF_SOURCE_SET` from companion under spoof injection; iNav validation = companion-internal reaction time (companion-only metric; iNav doesn't participate).
4. **AC-NEW-8**: the language "fix-quality 2D fix or worse when covariance > 100 m" maps to `GPS_INPUT.fix_type` for AP. iNav's `fixType` enum mapping (per `gpsFixType_e` in iNav's enums-reference) must be confirmed at design time before this AC is testable on iNav.
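Item 3 keeps a "<3 s p95" budget for both FCs while changing what is measured. A minimal sketch of the check itself, assuming latency samples in seconds have already been collected (from SITL round-trips on ArduPilot, or from companion-internal timing on iNav); the function names and the nearest-rank percentile definition are illustrative choices, not project decisions.

```python
def p95(samples):
    """95th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def meets_spoof_promotion_budget(samples, budget_s=3.0):
    """True when the p95 reaction latency is strictly under the budget."""
    return p95(samples) < budget_s
```

The same function serves both validation paths; only the source of the samples differs per FC.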
|
||||
|
||||
### Open follow-up probes (deferred to SQ8 + design phase, NOT blocking SQ6 closure)
|
||||
|
||||
- **(SQ8)** Confirm the precise MAVLink message + field set ArduPilot exposes for spoofing/jamming integrity reports (PR #2110 merged, but `GPS_RAW_INT` in current published common.xml shows no spoofing bits; likely lives in a sibling message such as `GPS_INTEGRITY`). This is the FC→companion direction needed for AC-NEW-2's input side and AC-3.5's spoofing detection.
- **(SQ8)** UBX-emulation minimum NAV-* subset for iNav 9.0 (UBX ≥ 15.00). Authoritative inputs: u-blox protocol spec (cached) + iNav `gps_ublox.c` (cached). Output a "minimum companion-side UBX writer" definition.
- **(design)** SITL parameter sets for both FCs for AC-NEW-2 / AC-NEW-8 validation. Out of research scope.
- **(design)** Verify iNav nav-stack consumption of `MSP2_SENSOR_GPS.hPosAccuracy` for outlier handling (read `src/main/io/gps_msp.c` / `mspGPSReceiveNewData` in design phase, not research phase).
|
||||
|
||||
### Boundary check: this SQ6 is saturated for the architectural decision
|
||||
|
||||
Saturation signals observed: ArduPilot side covered by L1 docs + L1 source code; iNav side covered by L1 source code (master) + L1 wiki (edited 2025-12-11) + L1 release notes (8.0/9.0). Three independent rounds of search yielded the same architectural conclusion (no inbound external-positioning MAVLink on iNav). Last queries returned no novel facts. Per `references/source-tiering.md` "Search saturation rule" → SQ6 is closed pending the SQ8 follow-up probes above; user decision required on the AC/restriction edits before further architectural work.
|
||||
@@ -0,0 +1,124 @@
|
||||
# Comparison Framework
|
||||
|
||||
> Mode A Phase 2, engine Step 4 (Build Comparison/Analysis Framework). Aggregates the per-component candidate matrices in `06_component_fit_matrix/` (per-component sub-matrices = 7.5.2; cross-component gates = `99_cross_component_gates.md`) into a single dimension-axis lens.
>
> **Research Output Class**: Technical-component selection (per `00_question_decomposition.md`). Decision Support framework type (per `references/comparison-frameworks.md`).
>
> Backing artifacts:
> - Source registry: [`01_source_registry/00_summary.md`](01_source_registry/00_summary.md) (#1–#121)
> - Fact cards: [`02_fact_cards/00_summary.md`](02_fact_cards/00_summary.md) (#1–#101)
> - Component fit matrix: [`06_component_fit_matrix/00_summary.md`](06_component_fit_matrix/00_summary.md)
> - Question decomposition + scope: [`00_question_decomposition.md`](00_question_decomposition.md)
|
||||
|
||||
---
|
||||
|
||||
## Selected Framework Type
|
||||
|
||||
**Decision Support** (per `references/comparison-frameworks.md`). The output names specific libraries/SDKs/algorithms that an implementation team will build with on a pinned hardware target (Jetson Orin Nano Super) within a pinned Project Constraint Matrix. Concept-Comparison framework is insufficient because every candidate is being measured against shared numerical AC budgets (latency p95, memory cap, error CDF), not just typed against each other.
|
||||
|
||||
## Selected Dimensions
|
||||
|
||||
The eight Decision Support dimensions from `references/comparison-frameworks.md`, plus four project-mandatory dimensions added because the Project Constraint Matrix demands them:
|
||||
|
||||
| # | Dimension | Why it's in scope |
|---|-----------|-------------------|
| 1 | **Solution overview** | What this candidate is and what role it plays in the full pipeline. |
| 2 | **Implementation cost** | Engineering days/weeks to integrate the candidate against the pinned mode/config. Includes ONNX export work, retraining cost, on-Jetson port effort. |
| 3 | **Maintenance cost** | Upstream activity (last commit, issue response time), API stability across versions, dependency-pin risk on Jetson AI Lab community wheels. |
| 4 | **Risk assessment** | License posture (per D-C1-1 track), maintenance staleness, cross-domain transfer assumption risk, hardware-specific compile risk. |
| 5 | **Expected benefit** | Documentary lift over the mandatory simple-baseline (AUC@5°, Recall@K, latency reduction, accuracy bound). |
| 6 | **Applicable scenarios** | UAV-vs-satellite-tile cross-view registration at ~1 km AGL with the Project Constraint Matrix's pinned mission profile. |
| 7 | **Team capability requirements** | Specific skills required (TensorRT INT8 calibration, GTSAM factor-graph design, MAVLink/MSP2 protocol authoring). |
| 8 | **Migration difficulty** | Cost to swap this candidate for an alternate after Plan-phase lock-in. |
| **PROJECT-9** | **License-track posture** | D-C1-1 split (BSD/permissive vs GPL-3.0 vs both) drives candidate eligibility per component. Not generic "license": the project tracks two parallel candidate axes per row. |
| **PROJECT-10** | **AC-NEW-4 covariance-honesty fit** | Project requires explicit 6×6 posterior covariance recovery; only some C4/C5 candidates satisfy this NATIVELY. |
| **PROJECT-11** | **AC-4.1 + AC-4.2 fit on Jetson Orin Nano Super SM 87** | Pinned hardware target; FP16/INT8 precision viability per model family + 8 GB shared CPU+GPU + 25 W TDP. |
| **PROJECT-12** | **AC-NEW-7 cache-poisoning safety fit** | Specific to C6+C10 path: descriptor cache + tile cache must not silently load corrupted/tampered files. FAISS "no internal integrity check" is the canonical disqualifier. |
|
||||
|
||||
These twelve dimensions are populated component-by-component in §Initial Population below. Each cell cites at least one Fact # or Source # from the backing artifacts.
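PROJECT-10 hinges on recovering a 6×6 posterior covariance and reporting an honest horizontal accuracy from it. The sketch below shows one way such a figure could be computed: take the 2×2 horizontal position block, find its largest eigenvalue in closed form, and scale by the 95% chi-square quantile for two degrees of freedom (5.991). The function name is hypothetical, and the row/column indices of the horizontal position states are passed in explicitly because the state ordering depends on the estimator and is not fixed by this document.

```python
import math

CHI2_95_2DOF = 5.991  # 95% quantile of chi-square with 2 degrees of freedom


def horiz_accuracy_95(cov, ix, iy):
    """Semi-major axis (same length unit as the states) of the 95% ellipse.

    `cov` is a square covariance given as nested sequences; `ix` and `iy`
    are the indices of the two horizontal position states (caller-supplied).
    """
    a = cov[ix][ix]
    b = cov[ix][iy]
    d = cov[iy][iy]
    # Largest eigenvalue of the symmetric 2x2 block [[a, b], [b, d]].
    lam_max = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
    return math.sqrt(CHI2_95_2DOF * lam_max)
```

Reporting the semi-major axis is the conservative choice: a single scalar accuracy field cannot carry the ellipse orientation, so the largest extent is the value that never under-reports.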
|
||||
|
||||
---
|
||||
|
||||
## Initial Population
|
||||
|
||||
The matrix is organized component-axis-down × dimension-axis-across. **Each cell summarizes the per-component candidate verdict from the corresponding `06_component_fit_matrix/Cx_*.md` row file**; consult those row files for full per-candidate detail.
|
||||
|
||||
### Component-axis ordering
|
||||
|
||||
| C# | Component | Status (research close) | Selected primary | Selected secondary / fallback / experimental |
|
||||
|----|-----------|-------------------------|------------------|----------------------------------------------|
|
||||
| C1 | Visual / Visual-Inertial Odometry | Doc-closed (Sources #43–#56; Facts in `C1_vio.md`) | OKVIS2 (BSD-3-Clause; modern-competitive-lead) | VINS-Mono (BSD; mandatory simple-baseline); KLT+RANSAC (project-internal homemade fallback) |
|
||||
| C2 | Visual Place Recognition | Doc-closed mandatory pre-screen 5/5 (Sources #57–#68; Facts in `C2_vpr.md`) | MixVPR (MIT; mandatory simple-baseline) on BSD/permissive track; SALAD (GPL-3.0; modern-competitive-lead) on GPL-3.0 track | EigenPlaces (MIT; viewpoint-robust BSD/permissive sibling); SelaVPR (MIT; two-stage DINOv2-L sibling); NetVLAD (MIT canonical; classical-baseline) |
|
||||
| C3 | Cross-domain matchers | Doc-closed (Sources #69–#81; Facts in `C3_matchers.md`) | DISK+LightGlue (Apache-2.0 throughout; recommended-primary-mitigation for canonical-SP-license-disqualifier) | XFeat / XFeat\* / XFeat+LighterGlue (Apache-2.0; alternate-modern-competitive-lead); ALIKED+LightGlue (Apache-2.0; modern-competitive-lead-secondary) |
|
||||
| C4 | Pose estimation (PnP+RANSAC+LM) | Closed at 3/N (Sources #82–#87; Facts in `C4_pose_estimation.md`) | OpenCV `cv::solvePnPRansac` (Apache-2.0; mandatory simple-baseline) wrapped by GTSAM `Marginals` for D-C4-2 covariance recovery (BSD-3-Clause) | OpenGV (BSD-3-Clause-equivalent NOASSERTION pending license-clearance D-C4-3; modern-competitive-lead-richer-minimal-solver) |
|
||||
| C5 | State estimator / sensor fusion | Closed at 2/N batch-1 (Sources #88–#91; Facts in `C5_state_estimator.md`) | Manual ESKF (Solà 2017; project-side implementation under project Apache-2.0; mandatory simple-baseline) | GTSAM iSAM2 + CombinedImuFactor + smart factors + Marginals + IncrementalFixedLagSmoother (BSD-3-Clause; modern-competitive-lead-factor-graph; **shares GTSAM substrate with C4 D-C4-2 = (b)** per D-C5-5 = (c) recommendation) |
|
||||
| C6 | Tile cache + spatial index | Closed at 2/N batch-1 (Sources #92–#98; Facts in `C6_tile_cache_spatial_index.md`) | Mirror-of-suite-`satellite-provider` pattern (PostgreSQL btree + bytea + FAISS HNSW + filesystem; PostgreSQL License + MIT) | PostGIS+pgvector (GPL-2.0-or-later via PostGIS; deferred-secondary, comparative-improvement verdict does NOT clear user's significant-improvement bar) |
|
||||
| C7 | On-Jetson inference runtime | Closed at 3/N batch-1 (Sources #99–#105; Facts in `C7_inference_runtime.md`) | TensorRT native (Apache-2.0 in TRT 10.x; bundled with JetPack 6.2; lowest-latency primary path) | ONNX Runtime + TensorRT EP (MIT; cross-architecture portability for replay/SITL); pure PyTorch FP16 (BSD-3; mandatory simple-baseline + reference-correctness oracle) |
|
||||
| C8 | MAVLink / MSP2 FC adapter | Closed at 3/N batch-1 (Sources #106–#113; Facts in `C8_fc_adapter.md`) | pymavlink → MAVLink `GPS_INPUT` (LGPL-3.0; recommended-primary for ArduPilot Plane); MSP2_SENSOR_GPS via Python MSP V2 (YAMSPy + INAV-Toolkit MIT; recommended-primary for iNav) | UBX impersonation via pyubx2 NAV-PVT (BSD-3-Clause; deferred-secondary for iNav; comparative-improvement verdict does NOT clear user's significant-improvement bar over MSP2_SENSOR_GPS) |
|
||||
| C9 | Datasets / SITL / replay | **DROPPED 2026-05-08 per SQ7/C9 restructure** (deferred to Test Spec greenfield Step 5) | n/a | n/a |
|
||||
| C10 | Pre-flight cache provisioning + sector classification + freshness pipeline | Closed at 2/N batch-1 under CROSS-COUPLING MINIMAL scope (Sources #114–#121; Facts in `C10_preflight_provisioning.md`) | D-C6-3 confirmation: direct `faiss.write_index`/`faiss.read_index` Python API + `python-atomicwrites` + content-hash gate at takeoff load + `IO_FLAG_MMAP_IFC` mmap (FAISS MIT, atomicwrites MIT); D-C7-7 confirmation: hybrid Polygraphy CLI primary + `trtexec` for cache-reuse rebuilds + direct `IBuilderConfig` Python API escape hatch (Apache-2.0 throughout) | Operator CLI/desktop tooling, sector classification heuristics, freshness pipeline workflow — **deferred to Plan-phase as `operator tooling design` out-of-research-scope** |
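The C10 row combines a crash-safe write with a content-hash gate at takeoff load. A standard-library sketch of that pattern follows, with hypothetical function names; it uses `os.replace` on a temporary file in place of `python-atomicwrites`, and it hashes raw file bytes rather than calling the FAISS API, so it illustrates the gate and not the project's actual loader.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write(path, data):
    """Write to a temp file in the target directory, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return sha256_of(path)


def verify_before_load(path, expected_sha256):
    """Refuse to hand a cache file to the loader if its hash does not match."""
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise ValueError(f"content hash mismatch for {path}")
    return path
```

The expected digest would be recorded at provisioning time and stored separately from the cache file, so that a tampered file cannot carry its own matching hash.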
|
||||
|
||||
### Dimension matrix (compact form)
|
||||
|
||||
The full per-candidate cell content lives in `06_component_fit_matrix/Cx_*.md`. The cells below carry only the cross-component verdict for each dimension.
|
||||
|
||||
| Dimension | C1 (VIO) | C2 (VPR) | C3 (Matchers) | C4 (Pose) | C5 (State estimator) | C6 (Tile cache) | C7 (Inference runtime) | C8 (FC adapter) | C10 (Pre-flight) |
|
||||
|---|---|---|---|---|---|---|---|---|---|
|
||||
| **1. Solution overview** | Frame-to-frame visual+IMU odometry; produces relative poses + IMU bias estimates (Fact #43) | Tile-level global descriptors for retrieval against satellite cache (Facts in `C2_vpr.md`) | UAV-frame ↔ satellite-tile dense cross-domain feature matching for absolute anchor (Facts in `C3_matchers.md`) | 3D-2D RANSAC PnP + LM refinement → 6-DoF anchor pose (Facts #52–#54) | Fuse C1 (VIO), C3 (PnP-anchor), IMU; produce 6-DoF posterior + AC-NEW-4 covariance (Facts #88–#89) | Cache satellite tiles + descriptors + spatial index for AC-3.3 re-loc retrieval (Facts #92–#93) | Run C2/C3/C1 ONNX models on Jetson at AC-4.1 budget (Facts #94–#96) | Deliver final pose to FC over per-FC external-positioning interface (Facts #97–#99) | Build/refresh descriptor cache + TensorRT engines pre-flight (Facts #100–#101) |
|
||||
| **2. Implementation cost** | OKVIS2 ~1-2 weeks integration; KLT+RANSAC ~3-5 days fallback (`C1_vio.md`) | MixVPR ~3-5 days as-is; ~1-2 weeks if D-C2-1 retrain on aerial corpus; SALAD ~similar (`C2_vpr.md`) | DISK+LightGlue ~1 week ONNX export per D-C3-2; +1-2 weeks if D-C2-1 retrain on aerial corpus (`C3_matchers.md`) | OpenCV `cv::solvePnPRansac` ~1-3 days as wrapper; GTSAM `Marginals` recovery ~3-5 days for D-C4-2 = (b) (`C4_pose_estimation.md`) | Manual ESKF from Solà 2017 ~1-2 weeks; GTSAM iSAM2 ~2-3 weeks for full factor-graph (`C5_state_estimator.md`) | Cand 1 (mirror-suite-pattern) ~3-5 days as-is; Cand 2 (PostGIS+pgvector) ~1-2 weeks + PostGIS+pgvector co-installation (`C6_tile_cache_spatial_index.md`) | TensorRT engine builds ~1 week first-model + ~1 day per subsequent model via Polygraphy/trtexec recipe per D-C7-2 + D-C10-5 (`C7_inference_runtime.md`) | pymavlink+GPS_INPUT ~3-5 days; MSP2_SENSOR_GPS via YAMSPy ~3-5 days (`C8_fc_adapter.md`) | Pre-flight orchestration wrapper ~1 week (FAISS write+content-hash + Polygraphy/trtexec invocation) per D-C10-1..D-C10-8 (`C10_preflight_provisioning.md`) |
|
||||
| **3. Maintenance cost** | OKVIS2 maintained 2024-2026; VINS-Mono stable since 2018 (`C1_vio.md`) | MixVPR active 2026; SALAD active 2024-2025 (`C2_vpr.md`) | LightGlue active 2025-2026; XFeat active 2024-2025 (`C3_matchers.md`) | OpenCV LTS 4.x; GTSAM daily-active (last pushed 2026-05-08) (`C4_pose_estimation.md`) | Solà 2017 reference paper stable; GTSAM daily-active (`C5_state_estimator.md`) | PostgreSQL + FAISS stable; pgvector active (`C6_tile_cache_spatial_index.md`) | TensorRT 10.3 stable in JetPack 6.2; Polygraphy + trtexec bundled (`C7_inference_runtime.md`) | pymavlink + YAMSPy active (`C8_fc_adapter.md`) | All dependencies inherited from C6+C7 maintenance posture (`C10_preflight_provisioning.md`) |
|
||||
| **4. Risk assessment** | OKVIS2 GPL-3.0 contingent (D-C1-1 = (a) eligible) (`C1_vio.md`) | SALAD GPL-3.0 contingent; D-C2-1 retrain a real cost; D-C2-5 ViT-export risk (`C2_vpr.md`) | Magic Leap noncommercial license on canonical SP weights = HARD DISQUALIFIER (D-C3-1 forced mitigation) (`C3_matchers.md`) | OpenGV NOASSERTION + ~3 yr stale (D-C4-3 + D-C4-4 mitigations) (`C4_pose_estimation.md`) | Reference ESKF code license uncertainty (D-C5-1 mitigation = re-implement from canonical Solà 2017 paper) (`C5_state_estimator.md`) | PostGIS GPL-2.0-or-later contingent on D-C1-1 = (a) track for Cand 2 (`C6_tile_cache_spatial_index.md`) | TensorRT 10.x Apache-2.0 throughout; Jetson AI Lab community wheels (`C7_inference_runtime.md`) | pymavlink LGPL-3.0 (D-C8-3 mitigation = bundle unmodified) (`C8_fc_adapter.md`) | FAISS "no internal integrity check" (D-C10-3 mitigation = SHA-256 content-hash gate at takeoff) (`C10_preflight_provisioning.md`) |
|
||||
| **5. Expected benefit** | OKVIS2 modern-competitive lift over VINS-Mono on cross-domain tracking (`C1_vio.md`) | SALAD-full +5-7 R@1 over MixVPR-2048 on MSLS Challenge (`C2_vpr.md`) | DISK+LightGlue +7.99 absolute AUC@5° over canonical SP+LightGlue per LightGlue paper Table 6 (`C3_matchers.md`) | GTSAM `Marginals` provides NATIVE 6×6 posterior covariance per Source #87 — unique among C4 candidates (`C4_pose_estimation.md`) | GTSAM iSAM2 NATIVE AC-4.5 look-back refinement unique among C5 candidates (`C5_state_estimator.md`) | Cand 1 verdict: improvements of Cand 2 are "marginal-to-negative" in pinned 3 Hz spatial-grid query context — no material lift (`C6_tile_cache_spatial_index.md`) | TensorRT INT8 ~2-3× speedup over FP16 per Source #102 YOLO26n benchmark (`C7_inference_runtime.md`) | All 3 FC paths satisfy AC-4.3 by design (`C8_fc_adapter.md`) | Polygraphy `--data-loader-script` cleaner than hand-written `IInt8EntropyCalibrator2` (Source #117 + #118) (`C10_preflight_provisioning.md`) |
|
||||
| **6. Applicable scenarios** | All C1 candidates apply to nadir-down ~1 km AGL flight (`C1_vio.md`) | All C2 candidates trained on street-view; D-C2-1 retrain required for aerial domain (`C2_vpr.md`) | All C3 candidates retrain-friendly to aerial domain (`C3_matchers.md`) | OpenCV simple-baseline + GTSAM modern competitive lead apply throughout (`C4_pose_estimation.md`) | Manual ESKF for fixed-wing cruise; GTSAM iSAM2 for sliding-window refinement (`C5_state_estimator.md`) | Cand 1 mirrors verified-existing `satellite-provider` pattern (Source #92 filesystem read) (`C6_tile_cache_spatial_index.md`) | TensorRT + Polygraphy + trtexec all run on Jetson Orin Nano Super SM 87 per Source #105 (`C7_inference_runtime.md`) | pymavlink GPS_INPUT covers ArduPilot Plane (verified Source #4 + #106 + #107); MSP2_SENSOR_GPS covers iNav (verified Source #111 + #112 + #113) (`C8_fc_adapter.md`) | All Source #114-#121 evidence on Jetson Orin Nano Super SM 87 (`C10_preflight_provisioning.md`) |
|
||||
| **7. Team capability requirements** | C++ + ROS comfort for OKVIS2; basic OpenCV for KLT (`C1_vio.md`) | PyTorch + ONNX export literacy for VPR (`C2_vpr.md`) | Same + LightGlue API + DISK ONNX export (`C3_matchers.md`) | OpenCV calib3d + GTSAM Python API + factor graph design (`C4_pose_estimation.md`) | NumPy/SciPy for ESKF; GTSAM C++/Python factor-graph design + iSAM2 internals (`C5_state_estimator.md`) | PostgreSQL DBA + FAISS Python API; Cand 2 adds PostGIS + pgvector + Jetson aarch64 build (`C6_tile_cache_spatial_index.md`) | TensorRT INT8 calibration + ONNX export + Jetson AI Lab wheel management (`C7_inference_runtime.md`) | MAVLink protocol literacy + iNav MSP V2 protocol literacy (`C8_fc_adapter.md`) | Bash/Python orchestration + crash-safe atomic file writes + FAISS + TensorRT (`C10_preflight_provisioning.md`) |
|
||||
| **8. Migration difficulty** | OKVIS2 → VINS-Mono ~1 week swap (similar interface) (`C1_vio.md`) | MixVPR → SALAD ~1 week swap (`C2_vpr.md`) | DISK+LightGlue → ALIKED+LightGlue ~1 week swap (`C3_matchers.md`) | OpenCV→OpenGV ~2 weeks if D-C4-3+D-C4-4 close (`C4_pose_estimation.md`) | Manual ESKF → GTSAM iSAM2 ~2-3 weeks (different state representation) (`C5_state_estimator.md`) | Cand 1 → Cand 2 ~1-2 weeks if Cand 2 elevated (`C6_tile_cache_spatial_index.md`) | TRT-native → ONNX Runtime+TRT EP ~3-5 days for portability path (`C7_inference_runtime.md`) | Cand 2 (MSP2) → Cand 3 (UBX) ~1-2 weeks (different message family) (`C8_fc_adapter.md`) | Tools are interchangeable per D-C10-5 = (d) hybrid (`C10_preflight_provisioning.md`) |
|
||||
| **PROJECT-9. License-track posture** | BSD/permissive: VINS-Mono / OKVIS2 / Kimera-VIO / DPVO / KLT+RANSAC. GPL-3.0: VINS-Fusion / OpenVINS (`C1_vio.md`) | BSD/permissive: MixVPR + SelaVPR + NetVLAD + EigenPlaces (4-mode COMPLETE). GPL-3.0: SALAD + (conditional AnyLoc/BoQ/DINOv2-VLAD) (`C2_vpr.md`) | BSD/permissive: DISK+LightGlue + ALIKED+LightGlue + XFeat + XFeat+LighterGlue (4-mode COMPLETE). HARD DISQUALIFIER: canonical SP+LightGlue (Magic Leap noncommercial) (`C3_matchers.md`) | All 3 candidates BSD/permissive (Apache-2.0 / BSD-3-Clause / NOASSERTION pending) (`C4_pose_estimation.md`) | All BSD/permissive (Solà 2017 paper public-domain canonical equations + project-side Apache-2.0 implementation; GTSAM BSD-3-Clause) (`C5_state_estimator.md`) | Cand 1 BSD/permissive (PostgreSQL License + MIT). Cand 2 GPL-2.0-or-later via PostGIS — gated on D-C1-1 = (a) (`C6_tile_cache_spatial_index.md`) | All BSD/permissive (TensorRT 10.x Apache-2.0; ORT MIT; PyTorch BSD-3-Clause) (`C7_inference_runtime.md`) | Cand 1 LGPL-3.0 (D-C8-3 mitigation); Cand 2 + Cand 3 MIT/BSD-3 (`C8_fc_adapter.md`) | All BSD/permissive (FAISS MIT; atomicwrites MIT; Polygraphy + TensorRT 10.x Apache-2.0) (`C10_preflight_provisioning.md`) |
| **PROJECT-10. AC-NEW-4 covariance-honesty fit** | n/a (C1 produces relative poses; covariance is C5's job) | n/a (C2 produces descriptor distances; covariance is C5's job) | n/a (C3 produces feature matches; covariance is C5's job) | **OpenCV: NO native 6×6 covariance — D-C4-2 mitigation REQUIRED** (post-hoc Jacobian or wrap in GTSAM Marginals); **GTSAM: NATIVE via `Marginals.marginalCovariance` — only candidate that satisfies AC-NEW-4 NATIVELY** | Manual ESKF: NATIVE via analytic Jacobian (Solà §6); GTSAM iSAM2: NATIVE via `Marginals.marginalCovariance` (`C5_state_estimator.md`) | n/a (C6 stores tiles + descriptors; covariance is C5's job) | n/a (C7 runs models; covariance is C5's job) | **C8 enforces AC-NEW-4 via D-C8-8 per-FC unit conversion** (extracts 2×2 horizontal sub-matrix from C5 GTSAM `Marginals` 6×6, computes 95% confidence ellipse semi-major axis, emits as `horiz_accuracy` for AP / `hPosAccuracy` for iNav) | n/a (C10 is pre-flight; runtime covariance is C5's job) |
| **PROJECT-11. AC-4.1 + AC-4.2 fit on Jetson Orin Nano Super SM 87** | OKVIS2 ~30-50 ms per frame on Jetson Orin Nano Super extrapolation; KLT ~5-10 ms (`C1_vio.md`) | MixVPR ~10-20 ms FP16 + ~5-10 ms INT8 per query on Jetson per Source #102 extrapolation (`C2_vpr.md`) | DISK+LightGlue ~30-60 ms per pair FP16 on Jetson per Source #103 extrapolation; tight at K=10 pairs (`C3_matchers.md`) | OpenCV ~5-15 ms per RANSAC iteration; GTSAM `Marginals` ~30-90 ms per pose recovery (Plan-phase Jetson MVE) (`C4_pose_estimation.md`) | Manual ESKF ~5-15 ms per update; GTSAM iSAM2 ~5-100 ms per update depending on D-C5-5 factor density (`C5_state_estimator.md`) | Cand 1 ~6-54 ms per cache hit (Postgres btree + FAISS HNSW); Cand 2 5-10× slower geographic lookup per Source #93 (`C6_tile_cache_spatial_index.md`) | INT8+FP16 mixed per D-C7-6 per-family policy meets AC-4.1 across pipeline; ~700 MB-1.5 GB total memory within AC-4.2 (`C7_inference_runtime.md`) | pymavlink + MSP2 send-side ~1-5 ms per message; rate 5 Hz per D-C8-5 (`C8_fc_adapter.md`) | Pre-flight only; not in AC-4.1 budget. Takeoff load <5 s per D-C10-4 mmap path (`C10_preflight_provisioning.md`) |
| **PROJECT-12. AC-NEW-7 cache-poisoning safety fit** | n/a | n/a | n/a | n/a | n/a | Cand 1: filesystem tile storage + content-hash mandate per restrictions.md (`C6_tile_cache_spatial_index.md`); Cand 2: pgvector descriptor verification deferred to Plan-phase | n/a (TensorRT engines per-build, manifest-tracked per D-C10-7) | n/a | **D-C10-3 content-hash verification gate at takeoff load = direct AC-NEW-7 satisfaction**; D-C10-2 atomic-write mitigates the truncated-file class separately (`C10_preflight_provisioning.md`) |
|
||||
|
||||
---
|
||||
|
||||
## Cross-component coupling table (read alongside the dimension matrix)
|
||||
|
||||
The dimension matrix above hides the inter-component design coupling. The cross-component gates file [`06_component_fit_matrix/99_cross_component_gates.md`](06_component_fit_matrix/99_cross_component_gates.md) lists every D-Cx-y gate; the most architecturally significant couplings are:
|
||||
|
||||
| Coupling | Components | Recommended path | Why it matters |
|---|---|---|---|
| **Shared GTSAM substrate** (D-C5-5 = (c)) | C4 D-C4-2 = (b) wraps `solvePnPRansac` in GTSAM `Marginals`; C5 GTSAM iSAM2 fuses C4 anchor as `PriorFactorPose3` with native 6×6 covariance | RECOMMENDED — couples C4+C5 covariance recovery via shared GTSAM substrate; satisfies AC-NEW-4 NATIVELY at both layers; eliminates impedance-mismatch at the C4↔C5 boundary | Strongest cross-component lever in the C4+C5 design space; reduces dependency footprint by sharing GTSAM library between two layers; reduces engineering cost (D-C4-2 + D-C5 share calibration of factor weights) |
| **Per-model-family precision policy** (D-C7-6 = (b)) | C2 (CNN VPR backbones), C3 (matchers), C1 (learned VIO frontends), C7 (TensorRT) | RECOMMENDED — VPR backbones INT8+FP16 mixed; matchers FP16-only NO INT8; ViT-class VPR FP16-only initially; learned VIO FP16-only initially | Source #103 LightGlue FP8 quantization-sensitivity finding drives the matchers→FP16-only carve-out; ignoring this risks AC-1.1/1.2 frame-center accuracy violations |
| **C6 ↔ C10 descriptor-cache rebuild orchestration** (D-C6-3 closure + D-C10-1..D-C10-4) | C6 (cache file structure), C10 (rebuild trigger + atomic-write + content-hash gate) | RECOMMENDED — manifest-hash-driven rebuild + `python-atomicwrites` + SHA-256 content-hash gate at takeoff + mmap load with `madvise(MADV_WILLNEED)` | C10 owns the rebuild pipeline; C6 owns the cache file format; AC-NEW-7 cache-poisoning safety satisfied at the D-C10-3 gate |
| **C7 ↔ C10 TensorRT engine-build orchestration** (D-C7-7 closure + D-C10-5..D-C10-8) | C7 (precision policy + JetPack pin), C10 (orchestration tool matrix + filename schema + fallback venue) | RECOMMENDED — hybrid Polygraphy + trtexec + direct API matrix per D-C10-5 = (d); self-describing filename schema per D-C10-7; reference Jetson at HQ + deployed-Jetson-copy-to-archive per D-C10-8 | TensorRT engines are SM-version-tied per Source #105; D-C7-7 = (c) primary build-on-target with reference-Jetson fallback engines closes the operational risk |
| **C5 ↔ C8 covariance contract** (D-C8-8 = (b)) | C5 GTSAM `Marginals` 6×6 posterior, C8 per-FC `horiz_accuracy`/`hPosAccuracy` extraction | RECOMMENDED: extract 2×2 horizontal sub-matrix from C5 `Marginals.marginalCovariance`, compute 95% confidence ellipse semi-major axis `sqrt(5.991 * λ_max)` (5.991 is the 95% quantile of χ² with 2 degrees of freedom), emit per-FC | Strongest C5+C8 cross-component coupling; AC-NEW-4 covariance-honesty obligation is the same for both FCs; only the unit + field-name change |
| **C1 ↔ C2 ↔ C5 frame-rate pipeline** (Fact #40 dual-rate camera pipeline) | C1 (VIO at ~10 Hz), C2 (VPR at ~3 Hz), C5 (estimator-output at ~3 Hz nominal up to ~10 Hz when matcher confidence high) | RECOMMENDED — single-rate vs dual-rate is a Plan-phase decision; affects C1 candidate ranking + C2/C3 candidate scoring | Fact #40 was raised by the SQ2 closure as cross-cutting; resolution lives at Plan-phase |
|
||||
|
||||
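The C6 ↔ C10 row above combines three mechanisms: an atomic write, a SHA-256 content-hash gate, and an mmap load with a read-ahead hint. A minimal sketch follows, using only the Python standard library. The function names are hypothetical, `os.replace` stands in for `python-atomicwrites`, and the manifest lookup that would supply the expected hash is assumed to exist elsewhere.

```python
import hashlib
import mmap
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def sha256_of(path: str) -> str:
    """Stream the file through SHA-256 in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_verified(path: str, expected_sha256: str) -> mmap.mmap:
    """Refuse to map a cache file whose hash differs from the expected one."""
    if sha256_of(path) != expected_sha256:
        raise ValueError(f"content-hash mismatch for {path}")
    with open(path, "rb") as handle:
        mapped = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
    # MADV_WILLNEED is only exposed on some platforms, so guard the hint.
    if hasattr(mapped, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
        mapped.madvise(mmap.MADV_WILLNEED)
    return mapped
```

Hashing before mapping keeps the gate independent of the loader, so a truncated or tampered file is rejected before any consumer can read from it.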
---
|
||||
|
||||
## Decisions accumulated across the matrix (D-Cx-y by owner)
|
||||
|
||||
The full per-decision text is in [`06_component_fit_matrix/99_cross_component_gates.md`](06_component_fit_matrix/99_cross_component_gates.md). Aggregate count by owner:
|
||||
|
||||
| Owner | Count | Notes |
|---|---|---|
| User + Plan-phase architect | 4 | D-C1-1 license posture, D-C2-1 VPR retrain, D-C3-1 matcher mitigation, D-C2-11 MegaLoc successor evaluation |
| User + license-posture decision-maker | 1 | D-C2-8 NetVLAD PyTorch-port-strategy + license verification |
| Plan-phase architect | 27 | D-C2-2..D-C2-7 + D-C2-9..D-C2-10 + D-C3-3..D-C3-6 + D-C4-1..D-C4-4 + D-C5-1..D-C5-5 + D-C6-1..D-C6-7 + D-C7-1..D-C7-9 + D-C8-1..D-C8-8 + D-C10-1..D-C10-8 |
| Project bring-up team / C7 inference-runtime owner | 4 | D-C1-2 Jetson MVE, D-C2-4 + D-C2-5 ViT export, D-C7-3..D-C7-5 Jetson AI Lab wheel pinning, D-C3-2 LightGlue runtime |
| User + AC-NEW-7 owner | 1 | D-C10-3 content-hash verification gate (CROSS-COMPONENT) |
| User + AC-NEW-4 owner | 1 | D-C8-8 covariance-honesty cross-FC enforcement (CROSS-COMPONENT) |
|
||||
|
||||
The 27 Plan-phase-architect-owned decisions are the surface area the Plan skill (greenfield Step 3) must traverse. None requires user input as a hard prerequisite to start Plan, but D-C1-1 (license posture) is recommended to be confirmed by the user upfront because it gates which candidates per row are eligible.
|
||||
|
||||
---
|
||||
|
||||
## What this framework does NOT cover (deliberately deferred)
|
||||
|
||||
| Out-of-scope here | Where it goes | Reason |
|---|---|---|
| Fixture-file pin for D-C7-1 calibration corpus (e.g., AerialVL S03 vs Mavic + Derkachi flight clips) | Test Spec (greenfield Step 5) | Fixture-class; doesn't change architectural choice |
| Sector classification heuristics (active-conflict vs stable rear) | Plan-phase architect + operations team | Operational; AC-8.2 freshness threshold is operational not architectural |
| Operator CLI/desktop tooling for C10 pre-flight provisioning | Plan-phase architect + UX | Tool shape is UX/integration, doesn't bind architectural contract |
| Tile freshness pipeline workflow (when to re-pull from Suite Sat Service) | Plan-phase architect + operations team | Operational; cross-coupling with runtime architecture is mediated entirely via C6 + C10 cache files |
| Test datasets / SITL replay environments (was C9) | Test Spec (greenfield Step 5) | Per 2026-05-08 SQ7/C9 restructure |
| Engine-step SQ5 (failure modes / deployment lessons) | Plan-phase architect — interleaved | Per investigation-order pin in `00_question_decomposition.md` |
| Engine-step SQ8 (safety considerations AC-NEW-4 / AC-NEW-7) | Plan-phase architect | Carries the AP_GPS spoofing-signal probe deferred from SQ6 |
|
||||
@@ -0,0 +1,320 @@
|
||||
# Reasoning Chain
|
||||
|
||||
> Mode A Phase 2 — engine Step 6 (Fact-to-Conclusion Reasoning Chain). Walks each dimension from `03_comparison_framework.md` as `fact → mechanism comparison → conclusion`. Conclusions come from mechanism comparison, not "gut feelings" (per `references/quality-checklists.md`).
>
> Backing artifacts: source registry [`01_source_registry/00_summary.md`](01_source_registry/00_summary.md) (#1–#121); fact cards [`02_fact_cards/00_summary.md`](02_fact_cards/00_summary.md) (#1–#101); component fit matrix [`06_component_fit_matrix/00_summary.md`](06_component_fit_matrix/00_summary.md); cross-component gates [`06_component_fit_matrix/99_cross_component_gates.md`](06_component_fit_matrix/99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## Dimension 1: Solution overview — pipeline shape
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
The canonical GPS-denied UAV navigation pipeline converges on **`retrieval → matching → pose-estimation → fusion`** with VIO/IMU as auxiliary, per multiple SQ2 surveys (Skoltech aerial VPR, U.Maine cross-view, OrthoLoC 2.5D geodata, AnyVisLoc low-altitude multi-view, NUDT 2026 sciopen survey — Sources #38–#42; Facts in `02_fact_cards/SQ2_canonical_pipeline.md`).
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
End-to-end visual-localization (single-network direct lat/lon regression) was rejected per Source #38 evidence of poor cross-domain generalization and no native covariance output. Twist Robotics OSCAR (Ukrainian peer, deployed; Source #25), Auterion Artemis (Skynode N + Visual Navigation, Ukraine-tested; Source #31), and snktshrma/ngps_flight (NGPS GSoC 2024: LightGlue+SuperPoint+UKF+VISION_POSITION_ESTIMATE; Source #33) all converge on the same hierarchical retrieval + matching + pose + Kalman-filter-fusion pipeline (a UKF in the NGPS case).
|
||||
|
||||
### Conclusion
|
||||
|
||||
The project's pipeline shape is **C2 (VPR retrieval) → C3 (cross-domain matcher) → C4 (PnP+RANSAC+LM pose) → C5 (state estimator fusion with C1 VIO + IMU) → C8 (per-FC adapter to deliver pose to flight controller)**, with C6 (tile cache + spatial index) feeding C2 retrieval and C7 (on-Jetson inference runtime) hosting all learned models, and C10 (pre-flight cache provisioning) building C6's descriptor cache + C7's TensorRT engines before takeoff. **No deviation from the canonical pipeline is justified by evidence**.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — five independent SQ2 surveys agree; three independent SQ1 deployed/peer systems agree; rejection of end-to-end alternatives is L1-evidence-backed.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 2: Implementation cost & dependency footprint
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
Per-component implementation-cost cells in `03_comparison_framework.md` row 2 sum to a roughly **8-12 week** integration window for the recommended primary candidates (single engineer FTE, no parallelization), broken down: C1 OKVIS2 ~1-2 wk; C2 MixVPR base ~3-5 days + D-C2-1 retrain ~1-2 wk; C3 DISK+LightGlue ~1 wk + retrain ~1-2 wk; C4 OpenCV+GTSAM ~3-5 days; C5 Manual ESKF ~1-2 wk + GTSAM iSAM2 ~2-3 wk; C6 mirror-suite-pattern ~3-5 days; C7 TensorRT engine builds ~1 wk first model + ~1 day each subsequent; C8 pymavlink+MSP2 ~1 wk total; C10 orchestration wrapper ~1 wk. Plus the dedicated Jetson MVE bring-up phase (D-C1-2) ~1-2 weeks before any candidate can be locked.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
Selecting GPL-3.0-track candidates (D-C1-1 = (a)) costs roughly the same engineering time but adds license-track gating (forces SALAD over MixVPR for C2; forces VINS-Fusion or OpenVINS over OKVIS2/Kimera-VIO for C1). Selecting Cand 2 (PostGIS+pgvector) over Cand 1 (mirror-suite-pattern) for C6 adds ~1-2 weeks PostGIS+pgvector co-installation Jetson MVE work + cross-suite cascade decision (D-C6-7). UBX impersonation (Cand 3 for C8) adds ~1-2 weeks vs MSP2_SENSOR_GPS without measurable benefit.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Recommended implementation order**: C8 (FC adapter) + C7 (Jetson runtime) + C6 (cache) before C2/C3 (because the latter depend on the former for pre-flight build + runtime hosting), then C2/C3 (with parallel D-C2-1 retrain), then C1+C4+C5 in parallel (each consumes an independent input class), then the C10 orchestration wrapper as the integration capstone. This ordering gives a parallelizable critical path of ~6-8 weeks for two engineers, with the ~2-week Jetson MVE bring-up overlapping it.
|
||||
|
||||
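The implementation order in this conclusion can be read as a staged dependency graph. The sketch below derives the parallelizable stages with the standard-library `graphlib` module. The edge set is an assumption transcribed from the ordering in the text (it encodes sequencing, not necessarily hard technical dependencies), and `build_stages` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Assumed edges: each key is scheduled after every component in its set.
ORDERING = {
    "C2": {"C6", "C7", "C8"},
    "C3": {"C6", "C7", "C8"},
    "C1": {"C2", "C3"},
    "C4": {"C2", "C3"},
    "C5": {"C2", "C3"},
    "C10": {"C1", "C4", "C5"},
}


def build_stages(ordering):
    """Group components into stages whose members can proceed in parallel."""
    sorter = TopologicalSorter(ordering)
    sorter.prepare()  # raises CycleError if the ordering is contradictory
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for number, stage in enumerate(build_stages(ORDERING), start=1):
    print(f"stage {number}: {' + '.join(stage)}")
```

Keeping the ordering as data makes it cheap to re-derive the stages if a Plan-phase decision adds or removes a constraint.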
### Confidence
|
||||
|
||||
⚠️ Medium — engineering estimates are L3 inferential (no L1 measured-time evidence), but per-component closure verdicts in `06_component_fit_matrix/Cx_*.md` cite the specific work items and their L1 supporting docs.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 3: Maintenance cost & dependency stability
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
The recommended primary stack consists of dependencies whose maintenance posture is verified at access time (`02_fact_cards/Cx_*.md` per-fact "Date accessed" lines):
- OpenCV LTS 4.x — stable since 2018 (Source #82+#83); Apache-2.0
- GTSAM — daily-active, last-pushed 2026-05-08 = TODAY at access time (Source #86+#87+#90+#91); BSD-3-Clause
- TensorRT 10.3 — bundled with JetPack 6.2; Apache-2.0 in TRT 10.x (Source #99+#104+#105)
- LightGlue — active 2025-2026 (Source #69+#70+#71); Apache-2.0
- DISK — Apache-2.0 (Source #76+#77); paper 2020 + active maintenance
- pymavlink — LGPL-3.0 (Source #106); ArduPilot-canonical
- FAISS — MIT (Source #114); Facebook Research, daily-active
|
||||
|
||||
The most-stale dependency in the candidate set is OpenGV (Source #84), last pushed 2023-06-07. It is gated behind the D-C4-3 (license clearance) + D-C4-4 (maintenance staleness mitigation) closures and is recommended only as the modern-competitive lead for C4 (richer minimal-solver set), not as the primary path.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
The Jetson AI Lab community wheels (D-C7-3 + D-C7-4 + D-C7-5) are the highest dependency-pin risk in the stack — they're community-maintained and have a release cadence independent of upstream PyTorch/onnxruntime-gpu. Mitigation: D-C7-3 = (c) mirror to project-controlled artifact registry; D-C7-9 lock JetPack 6.2 + TRT 10.3 for first deployment.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Maintenance posture is BSD/permissive-clean across the recommended primary stack**, with two contained risks: (a) Jetson AI Lab community-wheel cadence (mitigated via D-C7-3 mirror), (b) OpenGV staleness (mitigated via D-C4-4 fork-and-patch). No recommended primary dependency is on a deprecated, abandoned, or reverse-license-shifted project.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — every dependency's maintenance signal verified via L1 source (GitHub last-pushed timestamp + license file + canonical doc index).
|
||||
|
||||
---
|
||||
|
||||
## Dimension 4: Risk assessment — license, hardware, cross-domain transfer
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
Three categories of risk, per the cross-component gates file `99_cross_component_gates.md`:
|
||||
|
||||
**License risk**: D-C1-1 user-decision split (BSD/permissive vs GPL-3.0 vs both) drives candidate eligibility per row. Hard disqualifiers: canonical SP+LightGlue (Magic Leap noncommercial — D-C3-1 forced mitigation); MASt3R (CC-BY-NC). Contingent: SALAD (GPL-3.0; D-C2-N gating); PostGIS (GPL-2.0-or-later; D-C6-7 gating); pymavlink (LGPL-3.0; D-C8-3 mitigation = bundle unmodified per LGPL §6).
|
||||
|
||||
**Hardware-target risk**: TensorRT engines are tied to (SM 87, JetPack 6.2, TRT 10.3, precision mode) per Source #105 — cannot be transferred between Jetson SKUs or across JetPack point releases. Mitigation: D-C7-7 = (c) primary build-on-target + reference-Jetson fallback; D-C10-7 self-describing filename schema; D-C10-8 reference Jetson at HQ + deployed-Jetson-copy-to-archive.
|
||||
|
||||
**Cross-domain transfer risk**: All C2 VPR candidates are street-view-pretrained (per Facts in `02_fact_cards/C2_vpr.md`); D-C2-1 retrain on aerial corpus is required to close the cross-domain gap. AnyVisLoc + AerialExtreMatch + OrthoLoC 2.5D surveys (Sources #34, #40, #41) all confirm street-view → aerial cross-domain transfer is the dominant accuracy-loss source.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
Rejected risk-mitigation alternatives:
- "Ship without retrain" — would violate AC-1.1/1.2 frame-center accuracy on cross-domain UAV-vs-satellite-tile inference.
- "Build TensorRT engines on x86 dev machine and copy to Jetson" — IMPOSSIBLE per Source #105 hardware-tied constraint.
- "Skip license posture and ship under permissive default" — would force project to either (a) accept GPL-3.0 contagion if user-chosen GPL-3.0 candidates are linked, or (b) silently exclude GPL-3.0 candidates without user awareness.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Risk is decomposable into three independent gates**: license-track (D-C1-1) for source eligibility, hardware-tied-engine (D-C7-7 + D-C10-5..D-C10-8) for runtime artifact provenance, cross-domain transfer (D-C2-1) for accuracy. Each gate has a closed mitigation pathway. No risk is open-ended.
|
||||
|
||||
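The hardware-tied-engine gate lends itself to a self-describing filename that can be checked before an engine is deserialized. The sketch below is illustrative only: the actual D-C10-7 schema is not specified in this section, so the field order, the `__` separator, and the names `EngineTag` and `is_loadable` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineTag:
    """Build identity of one TensorRT engine file (hypothetical schema)."""
    model: str       # must not contain the "__" separator
    sm: int          # GPU compute capability, e.g. 87
    jetpack: str     # e.g. "6.2"
    tensorrt: str    # e.g. "10.3"
    precision: str   # e.g. "fp16" or "int8"

    def filename(self) -> str:
        return (f"{self.model}__sm{self.sm}__jp{self.jetpack}"
                f"__trt{self.tensorrt}__{self.precision}.engine")

    @classmethod
    def parse(cls, name: str) -> "EngineTag":
        stem = name.removesuffix(".engine")  # Python 3.9+
        model, sm, jetpack, tensorrt, precision = stem.split("__")
        return cls(model, int(sm.removeprefix("sm")),
                   jetpack.removeprefix("jp"),
                   tensorrt.removeprefix("trt"), precision)


def is_loadable(name: str, sm: int, jetpack: str, tensorrt: str) -> bool:
    """True only when the engine was built for exactly this host stack."""
    tag = EngineTag.parse(name)
    return (tag.sm, tag.jetpack, tag.tensorrt) == (sm, jetpack, tensorrt)
```

A mismatch on any of the three host fields would then route to the build-on-target or reference-Jetson fallback path instead of attempting a load.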
### Confidence
|
||||
|
||||
✅ High — every cited risk has an L1-evidence-backed mitigation path documented in the cross-component gates file.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 5: Expected benefit — quantified lift over mandatory baseline
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
Per-component documentary lift over each component's mandatory simple-baseline:
- **C1**: OKVIS2 vs VINS-Mono — modern-competitive lift on cross-domain tracking robustness (Fact #44 / Fact #47); not quantified at the per-pixel error level.
- **C2**: SALAD-full vs MixVPR-2048 = +5-7 R@1 on MSLS Challenge (Fact in `C2_vpr.md`); EigenPlaces vs MixVPR = -0.6 R@1 at 512-D variant (paper Tab 3, Fact #20).
- **C3**: DISK+LightGlue vs canonical SP+LightGlue = +7.99 absolute AUC@5° on IMC 2020 stereo (LightGlue paper Appendix A Table 6, Fact in `C3_matchers.md`); ALIKED vs SP = +1-3 absolute AUC@5° per ALIKED paper Table VII.
- **C4**: GTSAM `Marginals` vs OpenCV `solvePnPRansac` post-hoc Jacobian = NATIVE 6×6 covariance recovery vs ~3-5 day engineering cost for hand-rolled Jacobian + Schur complement (Fact #54 vs Fact #52).
- **C5**: GTSAM iSAM2 vs Manual ESKF = NATIVE AC-4.5 look-back refinement + smoother bias estimation across sliding window (Fact #89 vs Fact #88); pure ESKF has no look-back.
- **C6**: Cand 2 (PostGIS+pgvector) vs Cand 1 (mirror-suite-pattern) = native KNN + radius queries; but **5-10× slower geographic lookup** at the project's pinned 3 Hz spatial-grid query rate per Source #93 + Source #97 evidence — improvements MARGINAL-TO-NEGATIVE.
- **C7**: TensorRT INT8 vs FP16 = 2-3× speedup per Source #102 YOLO26n benchmark on Jetson Orin Nano Super.
- **C8**: All three FC candidates satisfy AC-4.3 by design; UBX impersonation (Cand 3) provides no measurable AC-4.3 lift over MSP2_SENSOR_GPS (Cand 2); mid-batch comparative-improvement verdict locked Cand 2 as primary for iNav.
- **C10**: Polygraphy `--data-loader-script` cleaner than hand-written `IInt8EntropyCalibrator2` (Source #117 + #118 vs Source #121); calibration-cache reuse keeps subsequent rebuilds <30 sec vs 10-30 minute first-build cost.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
The **only** component where a modern-competitive-lead candidate is preferred over the mandatory simple-baseline AS THE PRIMARY PATH is C3 (DISK+LightGlue over canonical SP+LightGlue), and that choice is forced by the license disqualifier on canonical SP weights, not by lift alone. Every other component keeps the mandatory simple-baseline as the primary path (or as a co-primary alongside the modern-competitive-lead per the GTSAM-shared-substrate hybrid for C4+C5).
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Expected benefit is asymmetric across components**: C3 has a forced-modern-lead path; C4+C5 have a recommended hybrid (simple-baseline at the algorithmic core + modern-competitive-lead for covariance recovery via shared GTSAM); all other components keep the simple-baseline as primary. This shape minimizes the radius of any single component swap and preserves AC-NEW-4 covariance honesty NATIVELY at the C4+C5 layer.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — per-component lift cells cite specific paper tables / benchmark numbers / API capability evidence.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 6: Applicable scenarios — pinned mission-profile fit
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
The Project Constraint Matrix (`00_problem/restrictions.md` + `00_problem/acceptance_criteria.md`) pins the deployment context: fixed-wing UAVs, eastern/southern Ukraine, 8 h flights at ~60 km/h cruise, ≤1 km AGL, sector ≤150 km² + transit corridor ~50 km², predominantly sunny daytime with seasonal/visibility class coverage required, Jetson Orin Nano Super (8 GB shared, 25 W TDP), ArduPilot Plane + iNav as the supported FCs (PX4 explicitly out of scope).
|
||||
|
||||
Per-component applicability:
- **C1, C2, C3, C4, C5**: All recommended primary candidates apply to nadir-down ~1 km AGL flight; D-C2-1 retrain on aerial corpus closes the street-view-pretrained gap.
- **C6**: Cand 1 mirrors verified-existing parent-suite `satellite-provider` pattern (Source #92 filesystem read at `/Users/obezdienie001/dev/azaion/suite/satellite-provider/`).
- **C7**: TensorRT + Polygraphy + trtexec all run on Jetson Orin Nano Super SM 87 per Source #105.
- **C8**: pymavlink GPS_INPUT covers ArduPilot Plane (verified Source #4 + #106 + #107); MSP2_SENSOR_GPS covers iNav (verified Source #111 + #112 + #113); both within AC-4.3 contract.
- **C10**: All Source #114-#121 evidence on Jetson Orin Nano Super SM 87 + JetPack 6.2 + CUDA 12.6 + TRT 10.3 + cuDNN 9.3 stack.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
Auterion Artemis (Source #31) deploys the same canonical pipeline shape on similar Jetson-class hardware (Skynode N) in Ukrainian theater with reportedly 1000-mile range — but on a closed-source proprietary stack. NGPS (Source #33) deploys SP+LightGlue+UKF+VISION_POSITION_ESTIMATE on ArduPilot — confirms ArduPilot Plane + visual-localization companion pattern is operationally validated. The novelty in this project relative to existing systems is (a) iNav support (no other open-source GPS-denied companion targets iNav), (b) AC-NEW-7 cache-poisoning safety budget (no existing system enforces multi-flight Service-side ingest voting on tile geo-alignment).
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Every recommended primary candidate is applicable to the pinned mission profile with no open scope mismatches**. The two project-novel elements (iNav adapter + cache-poisoning safety) are covered by C8 (MSP2_SENSOR_GPS path) and C10+C6 (D-C10-3 content-hash gate) respectively, both with selected candidates and mitigations in place.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — every applicability claim cites either a verified-existing pattern in the parent suite OR an L1 documentary source for the deployed hardware.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 7: Team capability requirements
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
The recommended primary stack requires the following skill set: PyTorch + ONNX export literacy (C1/C2/C3), TensorRT INT8 calibration via Polygraphy CLI (C7+C10), GTSAM Python API + factor-graph design (C4 D-C4-2 = (b) + C5 D-C5-5 = (c)), MAVLink + iNav MSP V2 protocol literacy (C8), PostgreSQL + FAISS Python API (C6), bash/Python orchestration with crash-safe atomic file writes (C10). C++ + ROS comfort needed for OKVIS2 (C1 modern-competitive-lead) but OPTIONAL — KLT+RANSAC fallback (Fact #53) is pure-OpenCV Python.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
Alternate recommendations would shift skill demand: choosing OpenGV (Source #84) for C4 would add ~3-5 days engineering for OpenGV-internal Jacobian propagation through bearing-vector residuals (harder than OpenCV's pixel-Jacobian per Fact #53 closure); choosing UBX impersonation (Cand 3) for C8 would add UBX protocol literacy + AC-NEW-7 audit-trail design (D-C8-7); choosing Cand 2 (PostGIS+pgvector) for C6 would add PostGIS-on-aarch64 build literacy.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**The recommended primary stack maps to a 2-engineer team with a senior + junior-to-mid Python/C++ split**: the senior engineer drives the GTSAM-shared-substrate hybrid (C4+C5) + the FC adapter integration (C8); the junior-to-mid engineer drives the rest (C1 fallback + C2/C3 + C6 + C7 + C10 + Test Spec deliverables). No specialty outside the standard CV/ML/robotics-engineering Python + C++ stack (e.g., Cython, Rust, native-CUDA-kernel authoring, GPU-driver internals, FPGA programming) is required.
|
||||
|
||||
### Confidence
|
||||
|
||||
⚠️ Medium — team-capability mapping is L3 inferential; per-component skill demand is L1-evidence-backed.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 8: Migration difficulty — swap cost across components
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
Per-component swap-cost cells in `03_comparison_framework.md` row 8 are bounded at ~2-3 weeks max for the most expensive swap (Manual ESKF → GTSAM iSAM2 = different state representation per `C5_state_estimator.md`). Most swaps are ~1 week or less (DISK+LightGlue → ALIKED+LightGlue; OKVIS2 → VINS-Mono; MixVPR → SALAD; TRT-native → ONNX Runtime+TRT EP at ~3-5 days); Cand 2 MSP2 → Cand 3 UBX impersonation is ~1-2 weeks. The C10 hybrid orchestration tools (D-C10-5 = (d)) are interchangeable per the hybrid policy.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
A swap's knock-on cost to neighboring components is smaller than its within-component cost because the C2/C3 boundary is well-defined (descriptor → matcher API) and the C4/C5 boundary is well-defined (anchor pose + 6×6 covariance → estimator factor). The exception is the C4+C5 GTSAM-shared-substrate hybrid (D-C5-5 = (c)): swapping out GTSAM at C5 would force reverting D-C4-2 = (b) and re-engineering C4's covariance recovery via post-hoc Jacobian (D-C4-2 = (a)) at ~1 week additional cost.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**Migration difficulty is bounded and per-component**. The largest swap radius is the GTSAM-shared-substrate hybrid (~3-4 weeks combined cost to revert both D-C4-2 + D-C5-5), but reverting it is a Plan-phase decision that doesn't surface at runtime. No single-component lock-in exceeds ~3 weeks of engineering, which is well within typical Plan-cycle revision budgets.
|
||||
|
||||
### Confidence
|
||||
|
||||
⚠️ Medium — engineering swap estimates are L3 inferential; per-component swap pathways are L1-evidence-backed.
|
||||
|
||||
---
|
||||
|
||||
## Dimension PROJECT-9: License-track posture (D-C1-1 split)
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
D-C1-1 user-decision splits the candidate landscape into BSD/permissive vs GPL-3.0 tracks. The BSD/permissive track is COMPLETE for C2 (4 mandatory candidates: MixVPR + SelaVPR + NetVLAD + EigenPlaces), C3 (4 candidates: DISK+LightGlue + ALIKED+LightGlue + XFeat + XFeat+LighterGlue, after canonical SP+LightGlue HARD DISQUALIFIER from Magic Leap noncommercial license), C4 (3 candidates: OpenCV + OpenGV pending D-C4-3 + GTSAM), C5 (Manual ESKF + GTSAM iSAM2), C6 (Cand 1 mirror-suite-pattern), C7 (TensorRT 10.x Apache-2.0 + ORT MIT + PyTorch BSD-3-Clause), C8 (MSP2 + UBX MIT/BSD-3; pymavlink LGPL-3.0 = bundle-unmodified compliant with LGPL §6 per D-C8-3), C10 (FAISS MIT + atomicwrites MIT + Polygraphy + TensorRT Apache-2.0).
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
The GPL-3.0 track is partial: VINS-Fusion + OpenVINS for C1 (Fact in `C1_vio.md`); SALAD + (conditional AnyLoc/BoQ/DINOv2-VLAD) for C2; PostGIS contingent for C6; pymavlink LGPL-3.0 throughout for C8 (covers both tracks via bundle-unmodified pattern).
|
||||
|
||||
Hard disqualifiers (independent of D-C1-1 = (a) or (b)): canonical SP+LightGlue (Magic Leap noncommercial); MASt3R (CC-BY-NC).
|
||||
|
||||
### Conclusion
|
||||
|
||||
**The BSD/permissive track is COMPLETE**: every component has at least one BSD/permissive primary candidate available. The user can choose D-C1-1 = (b) (BSD/permissive only) and the project is unblocked. Choosing D-C1-1 = (a) (GPL-3.0 only) would unlock additional candidates in C1 (VINS-Fusion + OpenVINS) and C2 (SALAD + conditional pre-screen extensions) but would force a license posture decision on every downstream consumer of the project. The recommended default is D-C1-1 = (c) (both tracks open) which preserves the modular swap pathway documented in Dimension 8.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — license verification per candidate is L1-evidence-backed via repo LICENSE files + SPDX identifiers + GitHub API license metadata.
|
||||
|
||||
---
|
||||
|
||||
## Dimension PROJECT-10: AC-NEW-4 covariance-honesty fit
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
AC-NEW-4 requires `P(error >500 m) <0.1 %` and `P(error >1 km) <0.01 %` per flight, with covariance carried in the MAVLink message as the FC's only defense (per `00_problem/acceptance_criteria.md` line 81-83). Achieving this requires honest 6×6 posterior covariance from C5, propagated through C8's per-FC field conversion.
|
||||
|
||||
Native 6×6 covariance support per candidate:
- **C4 OpenCV `cv::solvePnPRansac`**: NO (returns `retval, rvec, tvec, inliers` only per Source #83 signature) — D-C4-2 mitigation REQUIRED (post-hoc Jacobian OR wrap in GTSAM Marginals).
- **C4 OpenGV `absolute_pose::optimize_nonlinear`**: NO (no covariance output API per Source #85) — D-C4-2 = (d) mitigation if OpenGV elevated to primary.
- **C4 GTSAM `Marginals(graph, result).marginalCovariance(pose_key)`**: YES, NATIVE per Source #87 (multiple snippets) — **only C4 candidate that satisfies AC-NEW-4 NATIVELY**.
- **C5 Manual ESKF**: NATIVE 6×6 via analytic Jacobian per Solà §6 canonical recipe (Fact #88).
- **C5 GTSAM iSAM2**: NATIVE 6×6 via `Marginals.marginalCovariance` (Fact #89) — same NATIVE AC-NEW-4 satisfaction pathway as C4 GTSAM.
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
The C4+C5 GTSAM-shared-substrate hybrid (D-C5-5 = (c)) couples both layers via GTSAM's `Marginals.marginalCovariance` API: C4 wraps `solvePnPRansac` result in GTSAM `BetweenFactor<Pose3>` prior + per-inlier `GenericProjectionFactorCal3_S2` factors → `LevenbergMarquardtOptimizer` → `Marginals` (D-C4-2 = (b) per Fact #54), then C5 ingests that anchor + covariance as a `PriorFactorPose3` in iSAM2 (Fact #89). C8 D-C8-8 = (b) extracts the 2×2 horizontal sub-matrix from C5 `Marginals` 6×6, computes the 95% confidence ellipse semi-major axis `sqrt(5.991 * λ_max)` (5.991 is the 95% quantile of χ² with 2 degrees of freedom, and λ_max is the larger eigenvalue of the 2×2 sub-matrix), and emits per-FC.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**The GTSAM-shared-substrate hybrid is the architecturally cleanest path to AC-NEW-4 satisfaction**: covariance is recovered NATIVELY at C4, propagated NATIVELY through C5, and converted-then-emitted at C8 with no impedance mismatch. The Manual ESKF path (C5 simple-baseline) also satisfies AC-NEW-4 NATIVELY but requires C4's covariance to be recovered via D-C4-2 = (a) post-hoc Jacobian (~1 day engineering) since the ESKF can't ingest a non-covariance-bearing anchor. This is acceptable but loses the cross-layer NATIVE coupling.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — every covariance-API verification is L1-evidence-backed via official SDK docs + canonical paper equations.
|
||||
|
||||
---
|
||||
|
||||
## Dimension PROJECT-11: AC-4.1 + AC-4.2 fit on Jetson Orin Nano Super SM 87
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
AC-4.1 requires end-to-end latency <400 ms p95; AC-4.2 requires <8 GB shared memory. Per-component latency budgets on Jetson Orin Nano Super (extrapolated from L1 benchmarks on similar hardware where Jetson-direct evidence is unavailable):
|
||||
- **C1**: OKVIS2 ~30-50 ms per frame; KLT ~5-10 ms.
- **C2**: MixVPR ~10-20 ms FP16 + ~5-10 ms INT8 per query.
- **C3**: DISK+LightGlue ~30-60 ms per pair FP16; **TIGHT at K=10 retrieval pairs per UAV frame** (300-600 ms standard / 150-300 ms adaptive); D-C3-3 mitigation via reduced K (3-5) OR adaptive depth (1.86× speedup on easy pairs per LightGlue paper §5.4).
- **C4**: OpenCV ~5-15 ms per `cv::solvePnPRansac` call; GTSAM `Marginals` ~30-90 ms per pose recovery (Plan-phase Jetson MVE confirmation).
- **C5**: Manual ESKF ~5-15 ms per update; GTSAM iSAM2 ~2-100 ms per update depending on D-C5-5 factor density (RECOMMENDED D-C5-5 = (c) at ~2-5 ms per update is the fastest path).
- **C6**: Cand 1 ~6-54 ms per cache hit (Postgres btree + FAISS HNSW within AC-4.1).
- **C7**: TensorRT INT8+FP16 mixed per D-C7-6 per-family policy meets AC-4.1 across pipeline.
- **C8**: pymavlink + MSP2 send-side ~1-5 ms per message; rate 5 Hz per D-C8-5.
- **C10**: Pre-flight only; not in AC-4.1 budget. Takeoff load <5 s per D-C10-4 mmap path.
|
||||
|
||||
Memory: C7 ~700 MB-1.5 GB total across all loaded engines; C5 GTSAM iSAM2 ~50-200 MB factor graph; C6 ~430 MB FAISS HNSW at 2048-D halfvec × 100K tiles (per Source #115 formula). Total estimated ~1.5-2.5 GB peak runtime within AC-4.2 8 GB budget.
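The ~430 MB index figure can be sanity-checked with a rough size estimate. The sketch below is not the Source #115 formula itself; it assumes 2 bytes per dimension (half precision), `2 * M` level-0 neighbor ids of 4 bytes each per vector, and `M = 32`, and it ignores upper-level links and file headers. The function name and parameters are hypothetical.

```python
def hnsw_index_bytes(n_vectors, dim, bytes_per_dim=2, m=32, id_bytes=4):
    """Rough HNSW index size: stored vectors plus level-0 neighbor links.

    ASSUMPTIONS: half-precision storage, 2*m neighbor ids per vector at the
    base level, upper-level links and headers ignored.
    """
    vector_bytes = dim * bytes_per_dim   # 2048 * 2 = 4096 bytes per descriptor
    link_bytes = 2 * m * id_bytes        # 2 * 32 * 4 = 256 bytes per descriptor
    return n_vectors * (vector_bytes + link_bytes)


size_mb = hnsw_index_bytes(100_000, 2048) / 1e6  # decimal megabytes
```

Under these assumptions the estimate lands at about 435 MB, consistent with the ~430 MB quoted above.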
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
The dominant latency consumer is **C3 matchers at K=10 retrieval pairs per UAV frame** (300-600 ms standard for DISK+LightGlue). D-C3-3 mitigation paths are documented and parameterizable. Source #102 YOLO26n benchmark on Jetson Orin Nano Super confirms TensorRT INT8 delivers ~2-3× speedup over FP16 for CNN-class models — giving budget headroom for C2 + C7 + per-frame VPR retrieval.
|
||||
|
||||
### Conclusion
|
||||
|
||||
**AC-4.1 satisfaction is feasible at K=3-5 retrieval pairs per frame with adaptive-depth LightGlue** (~150-300 ms for matchers, leaving ~100-250 ms headroom for C1+C4+C5+C8). AC-4.2 satisfaction has comfortable headroom (~5.5-6.5 GB free under the recommended primary stack, given the ~1.5-2.5 GB peak estimate). **Strongest mitigation lever**: D-C3-3 K-pair budget choice; secondary lever: D-C7-6 per-family precision policy.
|
||||
|
||||
### Confidence
|
||||
|
||||
⚠️ Medium-High — most latency cells are L2 extrapolation from RTX-3080/3090 benchmarks scaled to Jetson; final confirmation requires Plan-phase Jetson MVE per D-C1-2.
|
||||
|
||||
---
|
||||
|
||||
## Dimension PROJECT-12: AC-NEW-7 cache-poisoning safety fit
|
||||
|
||||
### Fact Confirmation
|
||||
|
||||
AC-NEW-7 requires `P(geo-misalign >30 m) <1 %` and `P(>100 m) <0.1 %` per flight across all onboard tiles written. The end-to-end safety contract spans (a) onboard tile-write side (AC-8.4 mid-flight tile generation; per-tile quality metadata), (b) Suite Sat Service-side multi-flight ingest voting layer (out of onboard scope), and (c) **descriptor-cache + TensorRT engine integrity at takeoff load**.
|
||||
|
||||
The (c) part is what C6+C10 own. FAISS Source #114 explicit security warning: "No attempt is made to check the correctness of loaded data. A faulty or malicious file could lead to out-of-memory errors or code execution." — direct AC-NEW-7 risk if untreated. D-C10-3 mitigation: SHA-256 content-hash verification gate at takeoff load, reject + STATUSTEXT to FC + refuse takeoff on mismatch. D-C10-2 mitigation for the truncated-file class (separate from tampering): `python-atomicwrites` package (write-to-temp + fsync + atomic rename + parent-dir fsync per Source #116).
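The write-to-temp + fsync + atomic rename + parent-dir fsync pattern cited for D-C10-2 can be sketched with the standard library alone. This is an illustration of the pattern, not the `python-atomicwrites` API; the function name is hypothetical and the directory fsync assumes a POSIX filesystem.

```python
import os
import tempfile


def atomic_write_bytes(path, data):
    """Write data so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # file contents reach stable storage
        os.replace(tmp_path, path)   # atomic rename over the destination
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # never leave a partial temp file behind
        raise
    # fsync the parent directory so the rename itself survives power loss.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

For a writer that only accepts a filename, such as `faiss.write_index`, the same idea would apply by writing to a temporary path in the target directory and renaming it afterwards.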
|
||||
|
||||
### Reference Comparison
|
||||
|
||||
Skipping content-hash verification (D-C10-3 = (a)) would leave the cache-poisoning failure mode open in order to save a single hash pass at takeoff (under 1 s for a ~430 MB index read at ~500 MB/s). Skipping atomic-write (D-C10-2 = (a)) would leave the truncated-file failure mode open: a power loss or process kill mid-`faiss.write_index` can leave a corrupt FAISS file that still loads and produces silently wrong descriptor matches at takeoff (direct AC-NEW-7 violation + AC-3.3 re-localization stability violation).
|
||||
|
||||
### Conclusion
|
||||
|
||||
**AC-NEW-7 cache-poisoning safety on the descriptor-cache + TensorRT engine path is satisfied by the D-C10-2 atomic-write + D-C10-3 content-hash + D-C10-7 self-describing filename triad**. The Suite Sat Service-side multi-flight ingest voting (the dependent half of the contract per AC-NEW-7 external-dependency note) is out of onboard scope but is acknowledged in `00_problem/acceptance_criteria.md` line 98.
|
||||
|
||||
### Confidence
|
||||
|
||||
✅ High — D-C10-2 + D-C10-3 + D-C10-7 mitigations cite L1 evidence (Source #114 FAISS warning + Source #116 atomic-write pattern + Source #105 hardware-tied-engine constraint).
|
||||
|
||||
---
|
||||
|
||||
## Cross-cutting reasoning summary
|
||||
|
||||
| Reasoning lever | Conclusion | Confidence |
|---|---|---|
| Pipeline shape | Canonical retrieval+matching+pose+fusion; no end-to-end alternative | ✅ High |
| Implementation cost | ~6-8 weeks parallelizable critical path + ~2 wk Jetson MVE overlap | ⚠️ Medium |
| Maintenance posture | BSD/permissive-clean primary stack; OpenGV staleness contained | ✅ High |
| Risk decomposition | License + hardware-tied + cross-domain; all three have closed mitigations | ✅ High |
| Expected benefit asymmetry | C3 forced-modern-lead; C4+C5 hybrid; rest keep simple-baseline primary | ✅ High |
| Mission-profile fit | Every primary candidate applies; iNav + AC-NEW-7 are project-novel and covered | ✅ High |
| Team capability | 2-engineer Python+C++ split; no specialty stack required | ⚠️ Medium |
| Migration difficulty | ≤3 weeks per swap; GTSAM-shared-substrate is the largest radius | ⚠️ Medium |
| License-track posture | BSD/permissive track COMPLETE; recommend D-C1-1 = (c) both tracks open | ✅ High |
| AC-NEW-4 covariance honesty | GTSAM-shared-substrate hybrid satisfies NATIVELY across C4+C5+C8 | ✅ High |
| AC-4.1 + AC-4.2 fit | Feasible at K=3-5 LightGlue pairs + adaptive depth + D-C7-6 per-family precision | ⚠️ Medium-High (Plan-phase Jetson MVE confirms) |
| AC-NEW-7 cache-poisoning safety | D-C10-2 + D-C10-3 + D-C10-7 triad satisfies onboard side; Suite Service side out of scope | ✅ High |
|
||||
@@ -0,0 +1,149 @@
|
||||
# Validation Log
|
||||
|
||||
> Mode A Phase 2 — engine Step 7 (Use-Case Validation / Sanity Check). Validates the recommended primary stack from `04_reasoning_chain.md` against a typical UAV mission scenario, surfaces counterexamples where they exist, runs the engine's review checklist, and lists conclusions that need revision.
|
||||
>
|
||||
> Backing artifacts: source registry [`01_source_registry/00_summary.md`](01_source_registry/00_summary.md) (#1–#121); fact cards [`02_fact_cards/00_summary.md`](02_fact_cards/00_summary.md) (#1–#101); component fit matrix [`06_component_fit_matrix/00_summary.md`](06_component_fit_matrix/00_summary.md); cross-component gates [`06_component_fit_matrix/99_cross_component_gates.md`](06_component_fit_matrix/99_cross_component_gates.md); comparison framework [`03_comparison_framework.md`](03_comparison_framework.md); reasoning chain [`04_reasoning_chain.md`](04_reasoning_chain.md).
|
||||
|
||||
---
|
||||
|
||||
## Validation Scenarios
|
||||
|
||||
The recommended primary stack must hold up across the full envelope of normal-flight + edge-case scenarios called out in the Project Constraint Matrix. Five representative scenarios are walked through below: one nominal cruise, two edge cases, and two adversarial cases.
|
||||
|
||||
### Scenario 1 — Nominal cruise (steady-state visual anchoring)
|
||||
|
||||
A fixed-wing UAV at 1 km AGL cruises at 60 km/h over rolling-steppe agricultural terrain east of Dnipro. GPS is jammed. Nav camera produces 3 frames/s (~333 ms cadence). FC delivers 100-200 Hz IMU + attitude over MAVLink. C2 (MixVPR per recommended primary on the BSD/permissive track) retrieves K=3-5 candidate satellite tiles per frame; C3 (DISK+LightGlue + adaptive depth per D-C3-3 mitigation) registers UAV frame against best candidate; C4 (OpenCV `cv::solvePnPRansac` wrapped in GTSAM `Marginals` per D-C4-2 = (b)) emits 6-DoF pose + 6×6 covariance; C5 (GTSAM iSAM2 per D-C5-5 = (c)) fuses with C1 (OKVIS2 frame-to-frame VIO) + IMU; C8 (pymavlink → MAVLink `GPS_INPUT` for ArduPilot Plane / MSP2_SENSOR_GPS for iNav) emits WGS84 + per-FC `horiz_accuracy`/`hPosAccuracy` at 5 Hz per D-C8-5.
|
||||
|
||||
### Scenario 2 — Sharp turn with <5% inter-frame overlap (AC-3.2)
|
||||
|
||||
UAV banks ±20° to enter a search pattern. Two consecutive frames share <5% overlap. C1 frame-to-frame VIO loses tracking; C5 propagates by dead reckoning via IMU + last-good-anchor. C2/C3 next-frame retrieval recovers a valid satellite anchor within 1-2 frames per AC-3.2 ("recovery via satellite-reference re-localization"). At the 3 frames/s cadence this is ~0.3-0.7 s, which is within the AC-3.4 budget (≥3 consecutive frames AND ≥2 s without a position before requesting operator re-loc).
|
||||
|
||||
### Scenario 3 — Stale tile in active-conflict sector (AC-NEW-6)
|
||||
|
||||
Cache contains a tile from 8 months ago for a sector flagged as active-conflict. AC-8.2 freshness threshold is <6 mo for active-conflict. C6 manifest carries `capture_date` per restrictions.md mandate. The retrieval path must reject (or downgrade label to non-`satellite_anchored`) per AC-NEW-6.
|
||||
|
||||
### Scenario 4 — Cache file corruption (AC-NEW-7 cache-poisoning safety)
|
||||
|
||||
Pre-flight: a malicious actor swaps `/var/lib/onboard/cache/faiss/v_2048_M32.index` with a tampered file containing crafted descriptors that point to the wrong tiles for given UAV-frame queries. Takeoff load via `faiss.read_index` would silently load this file: per the Source #114 warning, FAISS makes no attempt to check the correctness of loaded data and expects validated input.
|
||||
|
||||
### Scenario 5 — GPS spoofing + visual blackout (AC-3.5 + AC-NEW-2 + AC-NEW-8)
|
||||
|
||||
UAV enters a cloud bank (visual blackout) while FC simultaneously reports GPS signal-quality anomaly indicating spoofing. C1 + C2 + C3 + C4 all fail (no usable visual input); C5 must propagate from last trusted state via IMU only, label every estimate `{dead_reckoned}`, degrade MAVLink fix-quality to "2D fix or worse" when 95% covariance semi-major axis >100 m, escalate to "no fix" when >500 m or blackout >30 s. C8 must NOT promote spoofed real-GPS back into the estimator unless FC GPS health stable + non-spoofed for ≥10 s AND a visual/satellite consistency check has succeeded. AC-NEW-2 spoofing-promotion latency <3 s p95 from spoof onset to companion estimate becoming primary FC source.
|
||||
|
||||
---
|
||||
|
||||
## Expected behavior under recommended primary stack
|
||||
|
||||
### Scenario 1 — Nominal cruise
|
||||
|
||||
If using **MixVPR + DISK+LightGlue + OpenCV+GTSAM-Marginals + GTSAM iSAM2 + pymavlink/MSP2** at the recommended primary stack:
|
||||
- C2 MixVPR query ~10-20 ms FP16 + ~5-10 ms INT8 per frame; K=3-5 retrieval list returned.
- C3 DISK+LightGlue FP16 (per D-C7-6 matchers→FP16-only per-family precision policy) ~30-60 ms per pair × K=3-5 pairs = 90-300 ms (within AC-4.1 400 ms p95 if K=3 + adaptive depth applied per D-C3-3).
- C4 `cv::solvePnPRansac` ~5-15 ms inlier filter + GTSAM `Marginals` recovery ~30-90 ms (Plan-phase Jetson MVE confirms).
- C5 GTSAM iSAM2 with D-C5-5 = (c) PriorFactorPose3-only + IncrementalFixedLagSmoother K=10-20 keyframes per D-C5-3 ~2-5 ms per update.
- C8 pymavlink GPS_INPUT or MSP2_SENSOR_GPS encode + send ~1-5 ms.
- Total end-to-end: ~140-420 ms p95. Within AC-4.1 budget at K=3 + adaptive depth.
- Memory: ~1.5-2.5 GB peak. Well within AC-4.2 8 GB budget.
- AC-NEW-4 satisfied NATIVELY via GTSAM `Marginals.marginalCovariance` per D-C8-8 per-FC unit conversion.
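The effect of K on the latency budget can be checked with a naive sum of the per-stage ranges in the list above. This is only a sketch: it assumes the stages run sequentially, counts C2 at its FP16 figure only, applies no adaptive-depth speedup, and adds range bounds rather than estimating a true p95. The dictionary keys and function name are hypothetical.

```python
AC_4_1_BUDGET_MS = 400

# (low, high) per-stage estimates in ms, copied from the list above.
STAGES_MS = {
    "C2 MixVPR FP16 query": (10, 20),
    "C4 solvePnPRansac": (5, 15),
    "C4 GTSAM Marginals": (30, 90),
    "C5 iSAM2 update": (2, 5),
    "C8 encode + send": (1, 5),
}
C3_PER_PAIR_MS = (30, 60)  # DISK+LightGlue FP16, per retrieval pair


def end_to_end_range_ms(k_pairs):
    """Sum the fixed stages and add K matcher pairs (sequential, no overlap)."""
    lo = sum(s[0] for s in STAGES_MS.values()) + k_pairs * C3_PER_PAIR_MS[0]
    hi = sum(s[1] for s in STAGES_MS.values()) + k_pairs * C3_PER_PAIR_MS[1]
    return lo, hi


for k in (3, 5, 10):
    lo, hi = end_to_end_range_ms(k)
    print(f"K={k}: {lo}-{hi} ms, worst case within budget: {hi <= AC_4_1_BUDGET_MS}")
```

Under these assumptions only K=3 keeps the worst case under 400 ms without adaptive depth, which matches the direction of the D-C3-3 recommendation.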
|
||||
|
||||
### Scenario 2 — Sharp turn
|
||||
|
||||
C1 VIO loses frame-to-frame tracking on the <5% overlap consecutive frames per AC-3.2 ("Sharp-turn frames may fail frame-to-frame registration"). C5 ESKF/iSAM2 propagates from last-good-anchor via IMU per D-C5-2 long-cruise-observability strategy (covariance growth alert if covariance > threshold); IMU bias-stationarity prior (D-C5-2 = (a) accept + monitor) keeps drift bounded. The next 1-2 frames trigger C2+C3 satellite-anchor re-localization per the AC-3.2 recovery clause. This stays within the AC-3.4 budget as long as a position is recovered before both 3 consecutive frames AND 2 s have elapsed without one. Per AC-3.3 the system handles ≥3 disconnected segments per flight via satellite-reference re-localization as a core capability.
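The AC-3.4 trigger (≥3 consecutive frames AND ≥2 s without a position) can be expressed as a small per-frame monitor. This is a minimal sketch; the class name, method name, and timestamp convention are assumptions for illustration.

```python
class RelocMonitor:
    """Track frames without a position and decide when to ask the operator."""

    def __init__(self, start_time_s=0.0, min_frames=3, min_seconds=2.0):
        self.min_frames = min_frames
        self.min_seconds = min_seconds
        self.failed_frames = 0
        self.last_position_s = start_time_s

    def update(self, timestamp_s, has_position):
        """Feed one processed frame; return True when operator re-loc is due."""
        if has_position:
            self.failed_frames = 0
            self.last_position_s = timestamp_s
            return False
        self.failed_frames += 1
        gap_s = timestamp_s - self.last_position_s
        # Both conditions must hold, mirroring the AND in AC-3.4.
        return self.failed_frames >= self.min_frames and gap_s >= self.min_seconds


# Sharp turn at 3 frames/s: two failed frames, then the anchor is recovered.
monitor = RelocMonitor()
sharp_turn = [monitor.update(1 / 3, False),
              monitor.update(2 / 3, False),
              monitor.update(3 / 3, True)]
```

In the sharp-turn case the monitor never fires, because recovery arrives well before either threshold is reached.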
|
||||
|
||||
### Scenario 3 — Stale tile
|
||||
|
||||
C6 cache entry carries `capture_date` per restrictions.md tile manifest schema mandate. Retrieval path must check `capture_date` against AC-8.2 threshold (<6 mo active-conflict, <12 mo stable rear). If stale, downgrade label to non-`satellite_anchored` per AC-NEW-6 ("verify stale-tile match never produces `satellite_anchored`"). Sector classification (active-conflict vs stable rear) is deferred to Plan-phase per the C10 scope restructure 2026-05-08.
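The freshness check can be sketched as a date comparison against a per-sector threshold. The sector class names and the month-to-day conversion (6 months ≈ 183 days, 12 months ≈ 365 days) are assumptions for illustration; the source defines the thresholds only in months and leaves the classification source open.

```python
from datetime import date

# ASSUMPTION: months converted to days; class names are hypothetical.
MAX_AGE_DAYS = {"active_conflict": 183, "stable_rear": 365}


def tile_is_fresh(capture_date, today, sector_class):
    """Return True if the tile is young enough to back a satellite anchor."""
    age_days = (today - capture_date).days
    return age_days < MAX_AGE_DAYS[sector_class]


# Scenario 3: a tile captured about 8 months before the flight.
stale_in_conflict = tile_is_fresh(date(2025, 9, 8), date(2026, 5, 8), "active_conflict")
fresh_in_rear = tile_is_fresh(date(2025, 9, 8), date(2026, 5, 8), "stable_rear")
```

A caller would only allow the `satellite_anchored` label when this check returns True; which label a stale match falls back to is not decided here.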
|
||||
|
||||
### Scenario 4 — Cache file corruption
|
||||
|
||||
D-C10-3 content-hash verification gate at takeoff load: compute `SHA-256(faiss_index_file)`, compare it against the manifest-recorded hash, and on mismatch reject the load, emit `STATUSTEXT` to FC, and refuse takeoff. The cost is one hash pass at takeoff. By the Source #115 size formula the index is ~430 MB at 2048-D halfvec × 100K tiles, so reading it from a SATA SSD at ~500 MB/s takes ~0.9 s; the ~50 ms figure would correspond to an index of only ~25 MB at that read rate. Either way the check fits the <5 s takeoff-load budget per D-C10-4. Direct AC-NEW-7 satisfaction at the descriptor-cache load layer.
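The hash gate itself is a few lines of standard-library code. The sketch below assumes the manifest stores a lowercase or uppercase hex SHA-256 digest and that the manifest is trusted; the function names are hypothetical, and the STATUSTEXT and refuse-takeoff actions are left to the caller.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks so a large index is never fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_load(path, expected_hex):
    """Return True only if the file hash matches the manifest-recorded hash."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())
```

The loader would call `faiss.read_index` only after `verify_before_load` returns True, and treat False as the reject path described above.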
|
||||
|
||||
### Scenario 5 — GPS spoofing + visual blackout
|
||||
|
||||
C1+C2+C3+C4 all fail; C5 propagates dead-reckoned via IMU only. Per AC-3.5: switch label to `{dead_reckoned}` within ≤1 processed frame OR ≤400 ms; reject spoofed GPS as estimator input. Per AC-NEW-8: continue emitting external-position MAVLink frames from IMU-only propagation for ≤30 s after the last trusted anchor, label every estimate `{dead_reckoned}`, degrade MAVLink fix-quality to "2D fix or worse" when 95% covariance semi-major axis >100 m, escalate to "no fix" + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT when >500 m OR blackout >30 s. C8 D-C8-2 = (b) companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` switch ownership pattern: companion publishes to source-set 2 + auto-switches FC + switches back to set 1 when companion is unavailable. AC-NEW-2 spoofing-promotion latency <3 s p95 satisfied via the companion-driven switch (no GCS round-trip required).
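The AC-NEW-8 escalation thresholds map naturally onto the `fix_type` field of MAVLink `GPS_INPUT` (0-1 no fix, 2 2D fix, 3 3D fix). The sketch below is one possible encoding; treating 3D fix as the nominal value and the function name are assumptions for illustration.

```python
NO_FIX, FIX_2D, FIX_3D = 0, 2, 3  # MAVLink GPS_INPUT fix_type values


def gps_fix_type(semi_major_axis_m, blackout_s):
    """Degrade the reported fix quality as dead-reckoning uncertainty grows."""
    if semi_major_axis_m > 500.0 or blackout_s > 30.0:
        return NO_FIX   # the caller also emits VISUAL_BLACKOUT_FAILSAFE here
    if semi_major_axis_m > 100.0:
        return FIX_2D   # "2D fix or worse"
    return FIX_3D       # ASSUMPTION: nominal reporting level
```

For example, a 150 m semi-major axis after 12 s of blackout reports a 2D fix, while any blackout longer than 30 s reports no fix regardless of covariance.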
|
||||
|
||||
---
|
||||
|
||||
## Actual validation results
|
||||
|
||||
| Scenario | Recommended primary stack behavior | Outcome |
|---|---|---|
| 1 — Nominal cruise | Total end-to-end 140-420 ms p95; memory 1.5-2.5 GB peak; AC-NEW-4 NATIVELY satisfied | ✅ **PASS** with K=3 + adaptive depth applied (Plan-phase Jetson MVE confirms exact tail) |
| 2 — Sharp turn AC-3.2 | C5 dead-reckon + C2/C3 re-localize within 1-2 frames; AC-3.3 ≥3 disconnected segments handled | ✅ **PASS** per design |
| 3 — Stale tile AC-NEW-6 | C6 manifest `capture_date` check; downgrade label to non-`satellite_anchored` if stale | ✅ **PASS** at architectural level; sector-classification heuristic deferred to Plan-phase |
| 4 — Cache poisoning AC-NEW-7 | D-C10-3 SHA-256 content-hash gate at takeoff; D-C10-2 atomic-write covers truncation | ✅ **PASS** for descriptor-cache + TensorRT engine path; Suite Sat Service multi-flight ingest voting OUT OF onboard scope (per AC-NEW-7 external-dependency note) |
| 5 — GPS spoofing + visual blackout | C5 dead-reckon, C8 companion-driven source-set switch, AC-NEW-8 escalation thresholds enforced | ✅ **PASS** per AC-3.5 + AC-NEW-2 + AC-NEW-8 + D-C8-2 + D-C8-8 |
|
||||
|
||||
---
|
||||
|
||||
## Counterexamples
|
||||
|
||||
### Counterexample CE-1 — K=10 retrieval pairs in Scenario 1 violates AC-4.1
|
||||
|
||||
If C3 K=10 retrieval pairs per frame (canonical default per LightGlue paper §5.4 evaluation methodology) is naively applied without D-C3-3 mitigation, the matcher stage alone costs DISK+LightGlue ~30-60 ms × 10 = 300-600 ms standard (150-300 ms adaptive). The standard path **exceeds the AC-4.1 400 ms p95 budget without K reduction** at its upper end before any other component runs, and the adaptive path leaves only ~100-250 ms for everything else. Mitigation pathway documented in D-C3-3 Choose block: reduce K from 10 to 3-5 / reduce keypoints from 1024 to 512 / accept TIGHT margin and validate at Jetson MVE / parallelize across multiple Jetson GPU streams / elevate ONNX Runtime + TensorRT EP + adaptive depth.
|
||||
|
||||
**Address**: this counterexample is already known and gated as D-C3-3; recommendation is K=3 + adaptive depth which satisfies the AC-4.1 budget at the cost of ~5-10% Recall@K loss vs K=10.
|
||||
|
||||
### Counterexample CE-2 — D-C5-5 = (a) per-correspondence factor density violates AC-4.1
|
||||
|
||||
If C5 GTSAM iSAM2 is configured with D-C5-5 = (a), the highest-fidelity per-correspondence `GenericProjectionFactorCal3DS2` option (1000+ factors per keyframe at K=10 image pairs × 100 inliers per pair), per-update latency is ~50-150 ms on Jetson Orin Nano Super CPU. Added to C3 ~150-300 ms + C4 ~30-90 ms + C2 ~15-30 ms + C8 ~1-5 ms, the total is ~246-575 ms, and the upper part of that range exceeds the AC-4.1 400 ms p95 budget.
|
||||
|
||||
**Address**: this counterexample is already known and gated as D-C5-5; recommendation is D-C5-5 = (c) `PriorFactorPose3` only with C4 GTSAM Marginals satellite-anchor 6×6 covariance — couples C4 Fact #54 D-C4-2 = (b) with C5 Fact #89 architectural integration via shared GTSAM substrate. ~2-5 ms per update on Jetson Orin Nano Super CPU. CLEANEST cross-component coupling.
|
||||
|
||||
### Counterexample CE-3 — Pure ESKF (Manual ESKF without GTSAM iSAM2) loses AC-4.5 look-back
|
||||
|
||||
If C5 = Manual ESKF only (no GTSAM iSAM2 secondary), AC-4.5 ("System may refine prior estimates and emit corrections") cannot be satisfied — the recursive forward-time-only Kalman update has no look-back facility per Solà §6 reference recipe. AC-4.5 is a "may" not a "must" but in the project's spoofing-aware AC-NEW-8 dead-reckoning failsafe context, the look-back capability is operationally valuable for retroactively correcting blackout-period estimates once a trusted anchor is recovered.
|
||||
|
||||
**Address**: this counterexample is partially mitigated by recommending the **hybrid** Manual ESKF + GTSAM iSAM2 path per the C5 batch 1 closure (Fact #88 + Fact #89 dual-candidate verdict). Manual ESKF is the mandatory simple-baseline (always-running fallback if GTSAM iSAM2 fails to converge); GTSAM iSAM2 is the primary path with NATIVE AC-4.5 look-back. Final lock at Plan-phase per D-C5-3 + D-C5-5.
|
||||
|
||||
### Counterexample CE-4 — Cand 3 UBX impersonation for iNav (AC-NEW-7 forgery posture)
|
||||
|
||||
If C8 iNav path = Cand 3 UBX impersonation via pyubx2 NAV-PVT (instead of the recommended primary Cand 2 MSP2_SENSOR_GPS), the project takes on an unambiguous forgery posture — companion impersonates a u-blox receiver. AC-NEW-7 ("no covert GPS spoofing without consent") requires an explicit FDR audit trail per D-C8-7 = (a). User chose Cand 2 (MSP2_SENSOR_GPS) as primary for iNav to avoid this posture entirely; Cand 3 remains a documented secondary path with the audit-trail mitigation in case of hard incompatibility.
|
||||
|
||||
**Address**: not a counterexample to the recommended primary stack; documents why the user-locked Cand 2 = primary verdict was the right architectural choice.
|
||||
|
||||
### Counterexample CE-5 — Sector classification heuristic NOT YET pinned
|
||||
|
||||
AC-8.2 freshness threshold (<6 mo active-conflict, <12 mo stable rear) requires a sector classification source. The `00_question_decomposition.md` C10 scope restructure 2026-05-08 deferred the sector classification heuristic to Plan-phase. **At research close, the project does not have a pinned source for "is this sector active-conflict or stable rear?"**. Operator-marked geofence vs Suite Service metadata vs other source is open.
|
||||
|
||||
**Address**: deferred to Plan-phase per user choice C `c10_scope=C` cross-coupling minimal. Surfaces as Plan-phase BLOCKING gate. Not a research-layer gap.
|
||||
|
||||
---
|
||||
|
||||
## Review Checklist
|
||||
|
||||
- [x] Draft conclusions consistent with Step 3 fact cards (cross-references across `02_fact_cards/Cx_*.md` files; every Fact # cited in `04_reasoning_chain.md` exists in the corresponding fact-card file).
- [x] No important dimensions missed — twelve dimensions (eight Decision Support + four project-mandatory) cover the AC + restrictions surface comprehensively per the Decomposition Completeness Probe checklist in `references/comparison-frameworks.md`.
- [x] No over-extrapolation — every L3 inferential cell is labeled ⚠️ Medium or ⚠️ Medium-High and tied to a Plan-phase Jetson MVE confirmation gate.
- [x] Conclusions are actionable/verifiable — every recommendation maps to a specific D-Cx-y decision in `99_cross_component_gates.md` with named owner + resolution path.
- [x] Every selected component/tool/pattern matches the Project Constraint Matrix — verified per row in `06_component_fit_matrix/Cx_*.md` Restrictions × Candidate-Modes sub-matrix sections.
- [x] Mismatches marked as disqualifiers instead of hidden as generic "limitations" — canonical SP+LightGlue (Magic Leap noncommercial) is the canonical example, called out explicitly as HARD DISQUALIFIER in D-C3-1.
|
||||
|
||||
### Issue found
|
||||
|
||||
- **One issue, partially resolved**: AC-8.2 sector-classification source is not pinned at research close (CE-5). Deferred to Plan-phase per `00_question_decomposition.md` C10 scope restructure user choice. Acknowledged as a Plan-phase BLOCKING gate, not a research-layer gap.
|
||||
|
||||
---
|
||||
|
||||
## Conclusions Requiring Revision
|
||||
|
||||
None at this stage. All five validation scenarios PASS under the recommended primary stack with documented mitigation paths for the three counterexamples (CE-1 K=10 → D-C3-3; CE-2 D-C5-5 = (a) → D-C5-5 = (c); CE-3 pure ESKF → ESKF+iSAM2 hybrid). CE-4 (UBX impersonation) is not a counterexample to the recommended stack but a documentation of why the user-locked Cand 2 verdict was correct. CE-5 (sector classification) is a Plan-phase deferred gate, not a research-layer revision.
|
||||
|
||||
---
|
||||
|
||||
## Sanity check on Step 7.5 Component Applicability Gate
|
||||
|
||||
Per `04_engine-analysis.md` Step 7.5.3: a candidate may not be `Selected` while any sub-matrix cell is ❌ or ❓.
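The Step 7.5.3 rule lends itself to a mechanical check over the sub-matrix cells. The sketch below assumes each candidate is recorded as a status string plus a list of cell markers; that data layout, the function names, and the sample rows are hypothetical. Statuses other than an exact `Selected` (for example `Selected with runtime gate`) are deliberately not checked here.

```python
BLOCKING = {"❌", "❓"}


def may_be_selected(cells):
    """A candidate may be Selected only if no sub-matrix cell is blocking."""
    return not any(cell in BLOCKING for cell in cells)


def gate_violations(matrix):
    """Return candidates marked Selected that still carry a blocking cell.

    matrix maps candidate name -> (status, list of cell markers).
    """
    return sorted(
        name
        for name, (status, cells) in matrix.items()
        if status == "Selected" and not may_be_selected(cells)
    )


# Hypothetical rows for illustration only.
sample = {
    "candidate_a": ("Selected", ["✅", "✅", "⚠️"]),
    "candidate_b": ("Selected", ["✅", "❓"]),
    "candidate_c": ("Experimental only", ["✅", "❌"]),
}
violations = gate_violations(sample)
```

An empty result corresponds to the gate passing; in the sample only `candidate_b` is flagged, because `candidate_c` is not marked Selected.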
|
||||
|
||||
**Component Fit Matrix scan** ([`06_component_fit_matrix/`](06_component_fit_matrix/)):
|
||||
- C1: lead candidates Selected with documented MVE evidence; no open ❌ or ❓ on sub-matrix.
- C2: 5/5 mandatory pre-screen Selected with MVE evidence; conditional pre-screen extensions (AnyLoc/BoQ/DINOv2-VLAD) gated as `Experimental only` per D-C2-5 ViT export prerequisite — correctly NOT marked Selected.
- C3: lead candidates Selected with MVE evidence; canonical SP+LightGlue marked `Rejected` per D-C3-1 hard disqualifier.
- C4: 3 candidates with verdicts; OpenGV `Selected with runtime gate` is valid per the Step 7.5.3 carve-out for runtime-quality gates (D-C4-3 + D-C4-4 are research-layer gates that are closed at the documentary level; license-clearance-counsel-review remains as a Plan-phase routine task, not a runtime-quality gate).
- C5: 2 candidates Selected per closure verdict.
- C6: Cand 1 Selected; Cand 2 Deferred secondary per comparative-improvement verdict.
- C7: 3 candidates Selected per per-family roles.
- C8: 3 candidates Selected per per-FC + per-fallback roles.
- C10: 2 sub-areas Selected per cross-coupling-minimal scope.
|
||||
|
||||
**Result**: zero ❌, zero ❓ across all Selected candidates. **Step 7.5 Component Applicability Gate PASSES**. Solution draft (Step 8) may proceed without further blocking gates.
|
||||
@@ -0,0 +1,63 @@
|
||||
# Component Fit Matrix — Index & Summary
|
||||
|
||||
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate, structured per-component candidate-selection table). One row per component area (C1–C10 from `../00_question_decomposition.md`); each row enumerates candidates with status, license, key fit dimensions, and a cite of the per-numbered-Restriction × per-numbered-AC sub-matrix in [`../02_fact_cards/`](../02_fact_cards/) that supports the status. Rows are filled progressively as SQ3+SQ4 closes per component.
|
||||
|
||||
This folder replaces the previous monolithic `06_component_fit_matrix.md` (284 lines, dominated by very wide tables that no longer fit in a single editor view). Each component lives in its own file. Open the file matching the component you need — every status verdict and Plan-phase decision is preserved verbatim.
|
||||
|
||||
---
|
||||
|
||||
## Status vocabulary (per engine rule)
|
||||
|
||||
| Status | Meaning |
|---|---|
| **Selected** | Documentary verification ✅ + Jetson Orin Nano Super hardware MVE ✅; promoted as the implementation choice for the project |
| **Documentary lead** | Documentary verification ✅ (mode pinned, MVE block, sub-matrix); Jetson MVE pending; eligible for Selected promotion in the dedicated bring-up phase |
| **Experimental only** | Documentary verification surfaced a partial mismatch or contradiction; cannot be Selected without the deferred Jetson MVE explicitly resolving the contradiction (per Per-Mode API Capability Verification rule) |
| **Conditional** | Candidate fits only as a sub-component of a hybrid design; cannot be a drop-in lead (e.g., VO-only candidate that requires an external IMU wrapper) |
| **Mandatory simple-baseline** | Candidate is required by the engine's Component Option Breadth rule as a runnable fallback / regression baseline; not a lead |
| **Rejected — disqualified** | Documentary evidence explicitly contradicts a hard project disqualifier (e.g., AC-4.2 memory budget, license blocks dual-use); excluded from further consideration |
| **N/A** | Candidate is not applicable to this component area (cataloged for completeness only) |
|
||||
|
||||
---
|
||||
|
||||
## Component index
|
||||
|
||||
| File | Component | Closure status | Top documentary leads | Hard disqualifiers |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| [`C1_vio.md`](C1_vio.md) | **C1** — Visual / Visual-Inertial Odometry | Closed at documentary level (2026-05-08) | OKVIS2/OKVIS2-X (BSD/permissive track lead), OpenVINS (GPL-3.0 track lead), VINS-Mono (GPL-3.0 alternate, sub-20-Hz caveat), Pure VO + ESKF (mandatory simple-baseline) | DROID-SLAM (>11 GB VRAM exceeds AC-4.2), RTAB-Map + ORB-SLAM3 (rejected by SPRIN-D evidence at >1 km / >2 m/s) |
|
||||
| [`C2_vpr.md`](C2_vpr.md) | **C2** — Visual Place Recognition | Mandatory pre-screen CLOSED at 5/5 (2026-05-08); conditional AnyLoc/BoQ/DINOv2-VLAD GATED on INT8 survey | EigenPlaces (MIT, viewpoint-robust, simplest CNN), MixVPR (MIT, ResNet50 + MLP-Mixer), SelaVPR (MIT, DINOv2-L two-stage, best cross-season Tokyo24/7), SALAD (GPL-3.0, DINOv2-B + optimal-transport), NetVLAD (mandatory simple-baseline) | SuperGlue-as-reranker (matcher-class, not VPR-class) |
|
||||
| [`C3_matchers.md`](C3_matchers.md) | **C3** — Cross-domain registration (Matchers) | Closed at 5/N (2026-05-08); mandatory simple-baseline COMPLETE; modern-competitive-lead axis MATERIALLY EXPANDED | DISK+LightGlue (D-C3-1 RECOMMENDED-PRIMARY: clean Apache-2.0 throughout, +7.99 AUC@5° over SP), XFeat (D-C3-1 ALTERNATE: clean Apache-2.0, strongest embedded signal, cheapest retrain), ALIKED+LightGlue (D-C3-1 SECONDARY), SP+LightGlue (documentary baseline), SuperGlue+SuperPoint (mandatory simple-baseline) | SuperPoint Magic Leap noncommercial-research SLA blocks dual-use deployment (canonical SP+LightGlue + SuperGlue+SuperPoint); SuperGlue training code never released; MASt3R/RoMa/DKM/LoFTR dense matchers fail AC-4.1 latency |
|
||||
| [`C4_pose_estimation.md`](C4_pose_estimation.md) | **C4** — Pose estimation (PnP + RANSAC + LM) | IN PROGRESS at 3/N (mandatory simple-baseline + 2 modern-competitive-leads COMPLETE 2026-05-08); D-C4-1 (3-DoF vs 4-DoF vs 6-DoF lift) carried forward from Fact #20 + REINFORCED by Fact #52; D-C4-2 (covariance-recovery-strategy) NEW from Fact #52 + UPDATED by Fact #54 (GTSAM Marginals NATIVE); D-C4-3 (license-clearance verification) + D-C4-4 (maintenance-staleness mitigation) NEW from Fact #53 (OpenGV-only) | OpenCV `cv::solvePnPRansac` (mandatory simple-baseline, clean Apache-2.0 throughout, JetPack 6 canonical distribution = zero-effort Jetson deployment); **GTSAM `Marginals.marginalCovariance`** (modern-competitive-lead-covariance-honest, clean BSD-3-Clause throughout, **NATIVE 6×6 pose covariance — only C4 candidate to satisfy AC-NEW-4 NATIVELY**, daily-active maintenance, 1121 context7 code snippets); OpenGV `absolute_pose::AbsolutePoseSacProblem(KNEIP)` (modern-competitive-lead-richer-minimal-solver, BSD-3-Clause-equivalent CONTINGENT on D-C4-3, ~3-year stale CONTINGENT on D-C4-4, NO planar-scene solver) | (none yet) |
|
||||
| [`C5_state_estimator.md`](C5_state_estimator.md) | **C5** — State estimator / sensor fusion | **CLOSED at 2/N (batch 1 closed 2026-05-08)** — mandatory simple-baseline + 1 modern-competitive-lead-factor-graph COMPLETE | Manual ESKF (Solà 2017 canonical aerial/quaternion reference, public-domain academic preprint + project's Apache-2.0 implementation, mandatory simple-baseline, native 6×6 covariance via analytic Jacobian propagation); **GTSAM iSAM2 + CombinedImuFactor (Forster et al. RSS 2015) + smart factors + Marginals.marginalCovariance + IncrementalFixedLagSmoother** (modern-competitive-lead-factor-graph, clean BSD-3-Clause throughout, **architecturally couples with C4 Fact #54 GTSAM Marginals via shared substrate**, **NATIVE AC-4.5 look-back refinement**, daily-active maintenance) | (none yet) |
|
||||
| [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) | **C6** — Tile cache + spatial index | **CLOSED at 2/N (batch 1 closed 2026-05-08)** — mandatory simple-baseline + 1 modern-competitive-lead-spatial-extension COMPLETE; **Cand 1 RECOMMENDED PRIMARY** | **Cand 1 (RECOMMENDED PRIMARY)**: Manual mirror of existing parent-suite `satellite-provider` pattern — PostgreSQL btree composite on slippy-map `(tile_zoom, tile_x, tile_y, version)` + bytea descriptor blobs + app-side FAISS HNSW loaded at takeoff + filesystem tile storage at `./tiles/{zoom}/{x}/{y}.{image_type}` (clean PostgreSQL License + MIT + LGPL/MIT-Apache; trivial dependency footprint; project-pattern alignment; empirically-confirmed Postgres-on-Jetson viability per Source #97 March 2026); **Cand 2 (DEFERRED secondary)**: PostgreSQL + PostGIS GiST on geography(POINT,4326) + pgvector HNSW for descriptor ANN + filesystem tile storage (modern-competitive-lead-spatial-extension; native KNN + radius + combined-SQL capabilities BUT 5-10× slower geographic lookup vs Cand 1 + heavier dependency + GPL-2.0-or-later license complexity + DIVERGENT from suite pattern + improvements marginal-to-negative in project's specific 3 Hz spatial-grid query operating context) | PostGIS GPL-2.0-or-later may CONTINGENT REJECT Cand 2 under D-C1-1 = (b) BSD/permissive-only-track |
| [`C7_inference_runtime.md`](C7_inference_runtime.md) | **C7** — On-Jetson inference runtime | **CLOSED at 3/N (batch 1 closed 2026-05-08)** — top-2 documentary leads + mandatory simple-baseline COMPLETE; **Cand 1 RECOMMENDED PRIMARY** | **Cand 1 (RECOMMENDED PRIMARY)**: TensorRT native — JetPack 6.2 bundled TensorRT 10.3 + `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision + engines built directly on Jetson Orin Nano Super SM 87 (clean Apache-2.0 in TensorRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 YOLO26 evidence); **Cand 2 (interop alternate)**: ONNX Runtime + TensorRT EP — `onnxruntime-gpu` via Jetson AI Lab JP6/CU126 wheel index + `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback (clean MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts; modern-competitive-lead-cross-architecture-portability); **Cand 3 (mandatory simple-baseline)**: pure PyTorch FP16 — `torch.amp.autocast` + `model.half()` + Jetson AI Lab PyTorch 2.5 ARM64 wheel (clean BSD-3-Clause throughout; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines) | INT8-only candidates marked Experimental until D-C7-1 calibration dataset materializes; matchers (LightGlue, XFeat, XFeat+LighterGlue) are FP16-only — NO INT8 — per D-C7-6 cross-component model-family precision policy due to Source #103 quantization-sensitivity finding |
| [`C8_fc_adapter.md`](C8_fc_adapter.md) | **C8** — MAVLink / MSP2 FC adapter | **CLOSED at 3/N (batch 1 closed 2026-05-08)** — top-1 per FC for ArduPilot + parallel-evaluation per FC for iNav after mid-batch contradiction recovery COMPLETE; **Cand 1 RECOMMENDED PRIMARY for AP, Cand 2 RECOMMENDED PRIMARY for iNav** | **Cand 1 (RECOMMENDED PRIMARY for ArduPilot)**: pymavlink → MAVLink `GPS_INPUT` (msg 232) cooperative-path; `master.mav.gps_input_send(...)` periodic injection at 5 Hz over MAVLink (UART/USB/UDP); FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set drives EKF3 ingestion via `AP_GPS_MAV` (LGPL-3.0 pymavlink linkable from Apache-2.0 app per LGPL §6; canonical ArduPilot stack); **Cand 2 (RECOMMENDED PRIMARY for iNav)**: `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) via Python MSP V2 implementation YAMSPy or INAV-Toolkit `msp_v2_encode`; `mspGPSReceiveNewData()` direct passthrough; covariance fields `hPosAccuracy/vPosAccuracy/hVelAccuracy` align directly with AP `GPS_INPUT.horiz_accuracy/vert_accuracy/speed_accuracy` (MIT throughout; clean dual-use compatible; locked SQ6 + AC-4.3 transport); **Cand 3 (DEFERRED secondary for iNav)**: UBX impersonation via pyubx2 NAV-PVT — forging u-blox NAV-PVT frames through standard GPS pipeline; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` (gnssFixOK) AND `fixType ∈ {2,3}`; pyubx2 BSD-3-Clause; **does NOT clear user's "significant-improvement-only" bar over Cand 2** (richer protocol surface + AC-NEW-7 forgery posture + stricter validation gate + AP-path field-name divergence outweigh pyubx2 library-maturity advantage). 
**Mid-batch correction**: I caught a contradiction between my own initial AskQuestion phrasing ("UBX impersonation as ONLY iNav path") and locked SQ6 + AC-4.3 + restrictions.md verdicts (MSP2_SENSOR_GPS as iNav primary); user re-locked scope via `c8_inav_recovery=B` to evaluate both as parallel candidates | (none yet — pymavlink LGPL-3.0 license posture handled via D-C8-3 = (a) bundle-unmodified-with-version-pin per LGPL §6 standard compliance) |
| [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) | **C10** — Pre-flight cache provisioning (CROSS-COUPLING MINIMAL scope per 2026-05-08 user choice C; operator CLI/desktop tooling, sector classification, freshness schema deferred to Plan-phase) | **CLOSED at 2/N (batch 1 closed 2026-05-08)** — D-C6-3 + D-C7-7 cross-component gates closed; no further C10 batches required at research layer | **D-C6-3 confirmation**: direct `faiss.write_index` / `faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate at takeoff + manifest-hash-driven rebuild trigger + `IO_FLAG_MMAP_IFC` mmap load (FAISS MIT, atomicwrites MIT throughout); **D-C7-7 confirmation**: hybrid Polygraphy CLI primary for INT8-calibrating builds + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API for unusual models (LightGlue dynamic shapes) — Polygraphy + TensorRT 10.x Apache-2.0 throughout, calibration corpus per D-C7-1 closure | (none — both candidates Apache-2.0/MIT clean; FAISS "no internal integrity check" warning mitigated by content-hash gate; `trtexec --int8` random-data caveat mitigated by project-side wrapper enforcing `--calib=<existing_cache>` non-empty precondition) |
| [`99_cross_component_gates.md`](99_cross_component_gates.md) | **Cross-component process gates** | Open — Plan-phase Choose blocks raised by C1+C2+C3+C4+C5+C6+C7+C8+C10 closures | D-C1-1 license posture, D-C1-2 Jetson MVE, D-C2-1..11 (VPR retrain/cache/dim), D-C3-1..6 (matcher mitigation/runtime/K-pairs/ALIKED-mode/DISK-weights/XFeat-mode), D-C4-1..4, **D-C5-1..5 (Manual ESKF + GTSAM iSAM2)**, **D-C6-1..7**, **D-C7-1..9**, **D-C8-1..8**, **D-C10-1 (descriptor-cache rebuild trigger — manifest-hash-driven recommended, NEW from Fact #100)**, **D-C10-2 (descriptor-cache atomic-write strategy — `python-atomicwrites` recommended, NEW from Fact #100)**, **D-C10-3 (content-hash verification gate at takeoff load — reject + STATUSTEXT + refuse takeoff recommended, NEW from Fact #100, CROSS-COMPONENT with AC-NEW-7)**, **D-C10-4 (descriptor-cache load path — mmap with `madvise(MADV_WILLNEED)` pre-fault recommended, NEW from Fact #100)**, **D-C10-5 (TensorRT engine-build orchestration tool — hybrid Polygraphy + trtexec + direct API recommended, NEW from Fact #101, CROSS-COMPONENT with C7)**, **D-C10-6 (TensorRT calibration-cache reuse strategy — rebuild-on-calib-corpus-SHA-256-change recommended, NEW from Fact #101, CROSS-COMPONENT with D-C7-1)**, **D-C10-7 (TensorRT engine on-disk filename schema — self-describing `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` recommended, NEW from Fact #101)**, **D-C10-8 (TensorRT prebuilt-fallback engine generation venue — reference Jetson at HQ + deployed-Jetson-copy-to-archive recommended, NEW from Fact #101)**, Fact #40 dual-rate camera pipeline | n/a |
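The C6 row keys tiles by slippy-map `(tile_zoom, tile_x, tile_y)` and stores them at `./tiles/{zoom}/{x}/{y}.{image_type}`. As a minimal sketch of how such a key could be derived, assuming the standard Web Mercator XYZ tiling convention (the row does not state which convention the parent suite uses), the helpers below map a latitude/longitude to tile indices and to the on-disk path. The function names `latlon_to_tile` and `tile_path` and the default `image_type="jpg"` are hypothetical, chosen only for illustration.

```python
import math


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Return (tile_x, tile_y) under the standard XYZ / Web Mercator scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_path(zoom: int, x: int, y: int, image_type: str = "jpg",
              root: str = "./tiles") -> str:
    """Build the filesystem path following the pattern quoted in the C6 row."""
    return f"{root}/{zoom}/{x}/{y}.{image_type}"


if __name__ == "__main__":
    z = 18
    tx, ty = latlon_to_tile(48.5, 35.0, z)  # arbitrary illustrative coordinates
    print(tile_path(z, tx, ty))
```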
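The C8 row names `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) as the iNav transport. The sketch below illustrates only the generic MSP V2 framing around such a message: the `$X<` header, a flag byte, little-endian function id and payload size, and a CRC-8/DVB-S2 checksum over everything after the header. It is not the YAMSPy or INAV-Toolkit API; `encode_msp_v2` and `crc8_dvb_s2` are hypothetical names, and the payload field layout is deliberately left to the caller because it must be taken from the iNav sources rather than from this index.

```python
import struct

MSP2_SENSOR_GPS = 0x1F03  # id 7939, as quoted in the C8 row


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2 (polynomial 0xD5, init 0x00), processed MSB first."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0xD5) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_msp_v2(function_id: int, payload: bytes = b"", flag: int = 0) -> bytes:
    """Wrap an already-packed payload in an MSP V2 frame addressed to the FC."""
    # flag (u8), function id (u16 LE), payload size (u16 LE), then the payload.
    body = struct.pack("<BHH", flag, function_id, len(payload)) + payload
    # The checksum covers the flag through the end of the payload, not "$X<".
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


if __name__ == "__main__":
    # Placeholder payload: the real field layout is an assumption left open here.
    frame = encode_msp_v2(MSP2_SENSOR_GPS, payload=bytes(4))
    print(frame.hex(" "))
```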
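The C10 row pairs an atomic write of the descriptor cache with a content-hash verification gate at takeoff. The following standard-library sketch shows the general shape of that pattern under stated assumptions: it uses a temp-file-plus-`os.replace` write as a stand-in for `python-atomicwrites`, SHA-256 as the content hash (the row does not name the hash function), and hypothetical helper names `atomic_write_bytes` and `verify_before_load`. How a failed check is surfaced (for example STATUSTEXT and refusing takeoff, per D-C10-3) is left to the caller.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large caches are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write_bytes(path, data: bytes) -> str:
    """Write data to a temp file in the target directory, then rename over path.

    Returns the SHA-256 hex digest of the data so it can go into a manifest.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return hashlib.sha256(data).hexdigest()


def verify_before_load(path, expected_hex: str) -> bool:
    """Return True only if the on-disk content matches the manifest hash."""
    return sha256_of(path) == expected_hex
```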
---

## Reading order

For first-time readers:

1. **Start here**: read this index plus the status vocabulary above.
2. Read the closed component rows in order: [`C1_vio.md`](C1_vio.md) → [`C2_vpr.md`](C2_vpr.md) → [`C3_matchers.md`](C3_matchers.md). These three are the dense rows; each carries its own per-license-track preliminary ranking and per-row Plan-phase deliverables.
3. Skim [`C4_pose_estimation.md`](C4_pose_estimation.md) for the pinned input/output contract and the D-C4-1 carry-forward.
4. Skim [`C5_state_estimator.md`](C5_state_estimator.md) for the pinned input/output contract and the D-C5-5 = (c) recommendation (hybrid path with GTSAM as the shared C4+C5 substrate).
5. Skim [`C6_tile_cache_spatial_index.md`](C6_tile_cache_spatial_index.md) for the pinned input/output contract, the Cand 1 (mirror-suite-pattern) RECOMMENDED PRIMARY rationale, and the Cand 2 (PostGIS+pgvector) DEFERRED-secondary criteria.
6. Skim [`C7_inference_runtime.md`](C7_inference_runtime.md) for the pinned input/output contract, the TensorRT-native RECOMMENDED PRIMARY rationale, and the per-model-family precision policy (D-C7-6).
7. Skim [`C8_fc_adapter.md`](C8_fc_adapter.md) for the pinned per-FC input/output contract, the RECOMMENDED PRIMARY rationale for pymavlink `GPS_INPUT` (AP) and `MSP2_SENSOR_GPS` (iNav), and the UBX-impersonation DEFERRED-secondary criteria (Cand 3 vs Cand 2 comparative-improvement verdict).
8. Skim [`C10_preflight_provisioning.md`](C10_preflight_provisioning.md) for the C10 cross-coupling-minimal scope: the D-C6-3 descriptor-cache rebuild and D-C7-7 TensorRT engine-build confirmation pipelines. Operator tooling design is deferred to Plan.
9. Cross-reference [`99_cross_component_gates.md`](99_cross_component_gates.md) when reviewing Plan-phase decisions; it consolidates every D-Cx-y gate raised across rows with the owner and resolution path.

For session-by-session updates: append to the matching row file. The summary table here only needs an update when a row's closure state, top documentary leads, or hard disqualifiers change.

---

## Editing rules

1. Each row file owns its candidate table, per-license-track ranking, and Plan-phase deliverables. Do not duplicate that content here; just refresh the one-line "Top documentary leads" / "Hard disqualifiers" cells when a row's verdict moves.
2. Keep the "Sub-matrix cite" column in row files pointing at `../02_fact_cards/Cx_*.md` (not the deprecated `02_fact_cards.md`).
3. New cross-cutting Plan-phase decisions (D-Cx-y) go into [`99_cross_component_gates.md`](99_cross_component_gates.md) under the matching component owner.
4. When a C-row's candidate list changes, also touch the matching `../02_fact_cards/Cx_*.md` so the fact bindings stay aligned.
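Editing rule 2 can be checked mechanically. The sketch below is one possible way to do it, not part of the project's tooling: it scans row files for references to the deprecated single-file `02_fact_cards.md` while leaving the folder form `02_fact_cards/Cx_*.md` alone. The helper names `find_deprecated_refs` and `check_row_files`, and the `C*_*.md` glob used to select row files, are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches the deprecated single-file name; the folder form continues with "/"
# after "02_fact_cards" and therefore does not match.
DEPRECATED_REF = re.compile(r"02_fact_cards\.md")


def find_deprecated_refs(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line citing the old file."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if DEPRECATED_REF.search(line)
    ]


def check_row_files(matrix_dir) -> dict[str, list[tuple[int, str]]]:
    """Scan row files (assumed to match C*_*.md) and collect offending lines."""
    hits = {}
    for path in sorted(Path(matrix_dir).glob("C*_*.md")):
        found = find_deprecated_refs(path.read_text(encoding="utf-8"))
        if found:
            hits[path.name] = found
    return hits


if __name__ == "__main__":
    for name, lines in check_row_files(".").items():
        for number, line in lines:
            print(f"{name}:{number}: {line}")
```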
@@ -0,0 +1,74 @@
# Component Fit Matrix — Cross-component process gates
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate). Plan-phase Choose blocks raised by C1, C2, C3, C4, C5, C6, C7, C8, and C10 closures. Each gate names its owner and the resolution path. Backing fact cards live in [`../02_fact_cards/`](../02_fact_cards/) by component.
>
> Index: [`00_summary.md`](00_summary.md). Per-component rows: [C1](C1_vio.md), [C2](C2_vpr.md), [C3](C3_matchers.md), [C4](C4_pose_estimation.md), [C5](C5_state_estimator.md), [C6](C6_tile_cache_spatial_index.md), [C7](C7_inference_runtime.md), [C8](C8_fc_adapter.md), [C10](C10_preflight_provisioning.md). C9 dropped per 2026-05-08 restructure — see `../00_question_decomposition.md`.
---

## Cross-component process gates open (raised this session and prior)

| Gate | Owner | Resolution path |
|---|---|---|
| **D-C1-1 GPL-3.0 license posture** | User | Plan-phase Choose block (A/B/C) before any C1 candidate is locked Selected |
| **D-C1-2 Jetson Orin Nano Super hardware MVE phase** | Project bring-up team | Dedicated bring-up phase between research and Plan; produces single Jetson-MVE artifact that promotes Documentary leads to Selected (covers C1 AND C2 candidates per D-C2-4) |
| **Fact #40: single-rate vs dual-rate nav-camera pipeline** | Plan-phase architect | Plan-time decision; affects C1 candidate ranking; affects C2/C3 candidate scoring |
| **D-C2-1 VPR canonical-weights vs aerial-retrain vs aerial-community-checkpoint** (raised by MixVPR closure 2026-05-08; reaffirmed by SALAD + SelaVPR closures) | User + Plan-phase architect | Plan-phase Choose block (A/B/C) before any C2 candidate is locked Selected; applies to **every** ground-level-pretrained C2 candidate (MixVPR + SALAD + SelaVPR all street-view-trained); SelaVPR README recommends MSLS-finetuned variant for "diverse scenes" cross-domain transfer as default |
| **D-C2-2 descriptor-cache carve-out vs raw-tile-cache budget** (raised by MixVPR closure 2026-05-08; harshened by SALAD; **materially-changed-shape by SelaVPR**) | Plan-phase architect | Plan-time decision; AC-8.3 explicitly requires this. **Per-candidate global-descriptor cache**: SelaVPR 1024-D ~3.2% (smallest); MixVPR 2048-D ~6.5%; SALAD-slim 544-D ~1.7%; SALAD-full 8448-D ~27%. **NEW SelaVPR local-feature-cache pressure**: ~150 GB if naive cache → forces D-C2-7 mitigation choice. Conditional candidates (AnyLoc/BoQ/DINOv2-VLAD) at higher dimensionality push descriptor cache to ~10 GB alone, forcing carve-out |
| **D-C2-3 input-resolution shape (224×224 vs 320×320 vs 322×322 vs higher)** (raised by MixVPR closure 2026-05-08; harshened by SelaVPR's 224×224) | Plan-phase architect | Plan-phase decision after all C2 candidates have per-Mode entries; trade-off span: SelaVPR's 224×224 (most aggressive downscale) → MixVPR's 320×320 / SALAD's 322×322 (medium) → AnyLoc/BoQ at 322+ ViT (highest, next sessions) |
| **D-C2-4 deferred Jetson Orin Nano Super hardware MVE phase coverage for C2** (raised by MixVPR closure 2026-05-08; scope-broadened by SALAD; **broadened further by SelaVPR**) | Project bring-up team | Same artifact as D-C1-2 must produce per-C2-candidate latency + memory + AerialExtreMatch Recall@K numbers + DINOv2 ViT-B AND ViT-L → TensorRT fp16/INT8 export quality + SelaVPR two-stage re-ranking latency profile + on-demand local-feature extraction performance |
| **D-C2-5 DINOv2 ViT-export to TensorRT fp16/INT8 path on Jetson Orin Nano Super** (raised by SALAD closure 2026-05-08; **harshened by SelaVPR closure**) | Project bring-up team + C7 inference-runtime owner | Jetson MVE phase must validate DINOv2-B AND DINOv2-L export paths before any ViT-based C2 candidate (SALAD, SelaVPR, AnyLoc, BoQ, DINOv2-VLAD) advances from Documentary lead to Selected. SelaVPR's ViT-L is 3.5× larger than SALAD's ViT-B; counter-mitigation by SelaVPR's frozen-backbone canonical export pathway (FB AI Public Files distribution) |
| **D-C2-6 SALAD descriptor-size choice (8448-D / 2112-D / 544-D)** (raised by SALAD closure 2026-05-08; SALAD-only, does not apply to SelaVPR) | Plan-phase architect | Plan-time decision; full variant best R@1 but consumes ~27% of AC-8.3 cache budget; slim 544-D fits within 1.7% but loses ~5 R@1 points on MSLS Challenge. Interacts with D-C2-2 carve-out decision |
| **D-C2-7 SelaVPR re-ranking strategy choice (full re-rank with on-demand local-feature extraction / cache top-K local features per likely query path / disable re-ranking entirely and use SelaVPR-global-only mode)** (NEW from SelaVPR closure 2026-05-08; SelaVPR-only, first two-stage C2 candidate) | Plan-phase architect | Plan-time decision conditional on SelaVPR being elevated to Selected. Full re-rank at rerank_num=100 fails AC-4.1 latency budget on Jetson extrapolation; rerank_num=20 fits but tight; on-demand local-feature extraction + global-only-cache (~320 MB) is most cache-efficient; precompute-top-K-local-features (~3 GB at K=20 with selective coverage) is moderate; disable-rerank gives single-stage parity (MSLS-challenge R@1=69.6 vs full's 73.5, still ahead of MixVPR's 64.0). **Three-way interaction with D-C2-2 + AC-8.3 + AC-4.1** |
| **D-C2-8 NetVLAD PyTorch-port-strategy choice (Nanne/pytorch-NetVlad with license-uncertainty / re-port from canonical Relja/netvlad with MIT preservation / OpenVPRLab-NetVLAD-on-ResNet50 as separately-cataloged sibling mode)** (NEW from NetVLAD closure 2026-05-08; NetVLAD-only, first canonical-MATLAB-stack C2 candidate) | Plan-phase architect + license-posture decision-maker | Plan-time decision; canonical implementation is MATLAB + MatConvNet (not deployable on JetPack 6) — PyTorch port required. Nanne port is fastest path but README does NOT cite a LICENSE file → Plan-phase verification gate is a hard prerequisite before adoption; re-port from canonical Relja/netvlad MATLAB to PyTorch directly preserves MIT licensing alignment with MixVPR + SelaVPR on the BSD/permissive track but requires ~1 week of engineering + cluster-init prerequisite + retraining or weight-transfer; OpenVPRLab-NetVLAD-on-ResNet50 is apples-to-apples vs MixVPR but is a *different mode* per Per-Mode API rule (different backbone, different pretrained checkpoint provenance) and would be cataloged as a separate sibling candidate. **Recommendation: re-port from canonical** to preserve MIT licensing alignment |
| **D-C2-9 NetVLAD descriptor-dimension choice (canonical 4096-D PCA-whitened / 512-D `cropToDim` for tighter cache / 256-D `cropToDim` for tightest cache)** (NEW from NetVLAD closure 2026-05-08; NetVLAD-only; analogous to D-C2-6 SALAD descriptor-size choice but for NetVLAD's PCA-whitened output) | Plan-phase architect | Plan-time decision; canonical 4096-D consumes ~1.3 GB / 13% of 10 GB AC-8.3 cache budget — **largest single-stage descriptor cache** of any C2 candidate evaluated so far; 512-D `cropToDim` reduces to ~160 MB / 1.6% at additional Recall@K loss; 256-D `cropToDim` reduces to ~80 MB / 0.8% at further loss. Only valid for `+whitening` networks. Interacts with D-C2-2 carve-out decision. Given NetVLAD's mandatory-baseline role (NOT a competitive lead), the 256-D / 512-D `cropToDim` variants may be more appropriate to free cache budget for the modern lead's larger descriptor — but Plan must decide explicitly |
| **D-C2-10 EigenPlaces descriptor-dimension choice (canonical 2048-D / 512-D / 256-D / 128-D — eleven backbone+dim sibling modes PyTorch-Hub-distributed)** (NEW from EigenPlaces closure 2026-05-08; EigenPlaces-only; analogous to D-C2-6 SALAD and D-C2-9 NetVLAD descriptor-dimension choices) | Plan-phase architect | Plan-time decision; canonical ResNet-50 + 2048-D consumes ~650 MB / 6.5% of AC-8.3 cache budget (identical to MixVPR-2048 for direct apples-to-apples comparison); 512-D variant reduces to ~160 MB / 1.6% at modest Recall@1 loss (paper Tab 3: Pitts30k 91.9 at 512 vs 92.5 at 2048 = -0.6, Tokyo24/7 89.8 at 512 vs 93.0 at 2048 = -3.2 — extreme cross-domain hurts most); 256-D reduces to ~80 MB / 0.8% at moderate Recall@K loss; 128-D reduces to ~40 MB / 0.4% at substantial Recall@K loss on cross-domain (paper §4.3 explicit observation). Eleven canonical pretrained checkpoints PyTorch-Hub-distributed give the project the widest range of cache-footprint sibling modes of any C2 candidate evaluated. Interacts with D-C2-2 carve-out decision |
| **D-C2-11 (CONDITIONAL) MegaLoc successor evaluation as separately-cataloged sibling candidate** (NEW from EigenPlaces closure 2026-05-08; raised by canonical EigenPlaces README explicit pointer "EigenPlaces is quite old. Looking for SOTA Visual Place Recognition (VPR)? Check out MegaLoc") | User + Plan-phase architect | Plan-phase decision: (a) treat MegaLoc as a separately-cataloged sibling candidate at Plan time (would require its own per-mode API capability verification + sub-matrix), (b) defer MegaLoc evaluation to a post-research session if EigenPlaces fails Jetson MVE, (c) skip MegaLoc and rely on the closed mandatory pre-screen (5/5: MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces). **Recommendation**: defer to post-research session — EigenPlaces closes the mandatory pre-screen at the documentary-required floor, and MegaLoc's Plan-phase relevance depends on which D-C1-1 license-track is chosen and how Jetson MVE results land |
| **D-C1-1 license-posture interaction with C2** (already raised by C1; sharpened by SALAD-GPL-3.0; materially-positive update from SelaVPR-MIT 2026-05-08; further-positive update from NetVLAD-MIT canonical 2026-05-08 with Nanne-port license-uncertainty Plan-phase verification gate; **fully-positive update from EigenPlaces-MIT 2026-05-08 closing the BSD/permissive C2 axis**) | User + Plan-phase architect | **BSD/permissive C2 axis (mandatory pre-screen COMPLETE 2026-05-08)** (under D-C1-1 = (b) or default (c)): MixVPR (CNN-ResNet50 + MLP-Mixer, MIT) + **SelaVPR (DINOv2 ViT-L/14 two-stage, MIT)** + **NetVLAD (CNN-VGG16 + soft-assignment-VLAD, MIT canonical / license-uncertain Nanne PyTorch port — D-C2-8 verification gate, mandatory simple-baseline)** + **EigenPlaces (CNN-ResNet50 + GeM + FC viewpoint-robust training paradigm, MIT)**. **GPL-3.0 C2 axis** (under D-C1-1 = (a) or default (c)): SALAD (DINOv2 ViT-B + optimal-transport, GPL-3.0) + (conditional next-sessions: AnyLoc/BoQ/DINOv2-VLAD pending license verification + INT8 quantization survey prerequisite). EigenPlaces's MIT-canonical placement materially completes the BSD/permissive C2 axis with **four materially-different design points** spanning 2016 (NetVLAD baseline VLAD) → 2023 (MixVPR ResNet50+MLP-Mixer) → 2023 (EigenPlaces ResNet50+GeM+viewpoint-robust-training) → 2024 (SelaVPR DINOv2-L+two-stage). The BSD/permissive C2 axis now has the **most diverse design-point coverage of any license track in any component row in the project** |
| **D-C3-1 (NEW from SP+LightGlue closure 2026-05-08) — SuperPoint-replacement-strategy choice (DISK+LightGlue with Apache-2.0 + paper Table 6 superiority [RECOMMENDED] / ALIKED+LightGlue with BSD-3-Clause+Apache-2.0 / SuperPoint-reproduction-with-permissive-license / accept-Magic-Leap-noncommercial-with-swap-commitment / SIFT+LightGlue classical-baseline-fallback)** | User + Plan-phase architect + license-posture decision-maker | Mandatory Plan-phase decision; canonical SuperPoint pretrained weights LICENSE (Source #72 Magic Leap noncommercial-research-only Software License Agreement) is a **HARD DISQUALIFIER** on the canonical SP+LightGlue mode in the project's dual-use deployment context (eastern/southern Ukraine fixed-wing UAV with AC-NEW-2 spoofing-promotion path is dual-use military by every reasonable interpretation, and the project's question_decomposition.md hard disqualifier list includes "anything whose license blocks military / dual-use deployment"). **Recommendation: D-C3-1 = (a) DISK+LightGlue** — Apache-2.0 throughout AND paper Appendix A Table 6 documentary technical superiority over canonical SP+LightGlue (+7.99 absolute AUC@5° on IMC 2020 stereo). Interacts with D-C1-1 (license-posture overall) + D-C2-1 (aerial-domain training, since DISK+LightGlue retrain is the cleanest license-compliant + retrain-friendly pathway) |
| **D-C3-2 (NEW from SP+LightGlue closure 2026-05-08) — LightGlue-inference-runtime choice (PyTorch-fp16 / Torch-TensorRT / ONNX Runtime + TensorRT EP via Source #73 / pure TensorRT via trtexec + Polygraphy via Source #73 / FP8 ModelOpt-on-Jetson if Ampere FP8 emulation works)** | Project bring-up team + C7 inference-runtime owner | Plan-phase decision conditional on D-C3-1 lock + Jetson MVE results; Source #73 (`fabio-sim/LightGlue-ONNX`) is the canonical reference for ONNX / TensorRT / OpenVINO / FP16 / FP8 export pathway with January 2026 active maintenance. **CRITICAL Jetson Orin Nano Super FP8 emulation gate**: Source #73 documents FP8 ModelOpt workflow on Hopper/Ada/Blackwell — Jetson Orin Nano Super is Ampere architecture (NOT FP8-native); FP8 ModelOpt path applies only with INT8 emulation fallback (verification at Jetson MVE phase). Likely rolls into the C7 cross-cutting integration row |
| **D-C3-3 (NEW from SP+LightGlue closure 2026-05-08) — K-pairs-per-frame budget choice (reduce K from 10 to 3-5 / reduce keypoints from 1024 to 512 / accept TIGHT 300-600 ms standard ÷ 150-300 ms adaptive margin and validate at Jetson MVE / parallelize matcher across multiple Jetson GPU streams / elevate ONNX Runtime + TensorRT EP + adaptive depth)** | Plan-phase architect | Plan-phase decision; canonical RTX-3080 throughput 150 FPS @ 1024 keypoints with adaptivity → Jetson Orin Nano Super extrapolation ~30-60 ms per pair → at K=10 top-K retrieval pairs per UAV frame = 300-600 ms standard / 150-300 ms adaptive against AC-4.1 400 ms budget — TIGHT before C1+C2+C5+C8 costs added. **Three-way interaction with AC-4.1 latency budget + AC-3.3 re-localization recall + AC-1.1/1.2 frame-center pose accuracy**. Adaptive-depth path (paper §5.4 1.86× speedup on easy pairs) is the most-favorable structural trade-off if many of the K pairs are high-overlap UAV-vs-cached-tile pairs. **MORE-TIGHT D-C3-3 gate for ALIKED+LightGlue** vs DISK+LightGlue or SP+LightGlue due to PyTorch-fp16-only restriction (ALIKED-export-absence in LightGlue-ONNX) — likely requires K reduction from 10 to 3-5 OR ALIKED-T(16) 64-D sibling mode for AC-4.1 satisfaction |
| **D-C3-4 (NEW from ALIKED+LightGlue closure 2026-05-08) — ALIKED-sibling-mode choice (ALIKED-T(16) 64-D Jetson-friendliest @ 1.37 GFLOPs / ALIKED-N(16) 128-D canonical baseline @ 4.05 GFLOPs / ALIKED-N(16rot) 128-D rotation-augmented @ same arch as N(16) / ALIKED-N(32) 128-D higher-SDDH-sample-count Aachen-best @ 4.62 GFLOPs)** | Plan-phase architect | Plan-phase decision conditional on D-C3-1 = (b) ALIKED+LightGlue secondary mitigation being selected; for project's UAV multi-heading 1 km AGL flights + Jetson PyTorch-fp16-only deployment (forced by ALIKED-export-absence in LightGlue-ONNX), **recommendation is ALIKED-N(16rot)** (rotation-augmentation aligns with multi-heading aerial flights; 4.05 GFLOPs leaves K=10 pairs/frame headroom; same 128-D descriptor as canonical N(16)). ALIKED-T(16) is the latency-fallback if AC-4.1 budget pressure forces 64-D descriptor reduction; ALIKED-N(32) is the accuracy-prioritization choice if Aachen-Day-Night documentary lift is the primary axis (paper Table VII at 2048 keypoints / 0.25m,2° tier = 77.6 best-in-paper). Interacts with D-C3-3 K-pairs-per-frame budget choice (T(16)'s 1.37 GFLOPs allows higher K than N(16)/N(32)'s 4.05/4.62 GFLOPs) |
| **D-C3-5 (NEW from DISK+LightGlue closure 2026-05-08) — DISK-pretrained-weights-choice (`save-depth.pth` canonical default RECOMMENDED / `save-epipolar.pth` supplementary-material alternate -0.5 to -1 absolute AUC trade-off / project-domain retrain on aerial nadir corpus via canonical `colmap/colmap2dataset.py` workflow at ~2 weeks on 32 GB V100 cost)** | Plan-phase architect | Plan-phase decision conditional on D-C3-1 = (a) DISK+LightGlue RECOMMENDED-PRIMARY-MITIGATION being selected; for project's pinned UAV-vs-satellite-tile registration use case **`save-depth.pth` is the recommended canonical default** (strongest documentary IMW2020 stereo + multiview AUC numbers per canonical paper Table 1 + cross-paper Aachen Day-Night transitive lift via ALIKED paper Table VII). `save-epipolar.pth` is the fallback if depth-map ground-truth is unavailable for aerial-domain retrain (paper §4 epipolar reward variant trades 0.5-1 absolute AUC for not requiring depth maps). Interacts with D-C2-1 retrain decision (DISK retrain via `colmap/colmap2dataset.py` is well-documented but materially expensive ~2 weeks on 32 GB V100 / ~2 weeks at smaller batch on 12 GB low-memory variant — vs ALIKED's ~24 hours on RTX 3090) |
| **D-C3-6 (NEW from XFeat closure 2026-05-08) — XFeat-mode-choice (XFeat sparse with MNN matching for SIMPLEST deployment / XFeat\* semi-dense with MNN+MLP-offset-refinement for HIGHEST inlier count per pair / XFeat+LighterGlue paired-matcher for MODERN learned-matcher accuracy)** | Plan-phase architect | Plan-phase decision conditional on XFeat being selected (D-C3-1 ALTERNATE-MODERN-COMPETITIVE-LEAD role alongside DISK+LightGlue's RECOMMENDED-PRIMARY-MITIGATION); for project's pinned UAV-vs-satellite-tile registration use case + AC-NEW-7 cache-poisoning safety budget + AC-3.3 re-localization stability, **(b) XFeat\* semi-dense is the strongest documentary structural choice** (4× more inliers per pair via lightweight MLP refinement provides best RANSAC stability at lowest engineering complexity — no LightGlue dependency, no productized-export dependency). (a) XFeat sparse is SIMPLEST deployment (D-C3-2 fully sidesteps cvg/LightGlue dependency; documentary AUC@5° materially below LightGlue-siblings); (c) XFeat+LighterGlue narrows MegaDepth-1500 gap to -2.5 to -2.8 absolute below SP+LightGlue at the cost of D-C3-2 reuse (community-contribution-needed for productized LighterGlue export pathway). Interacts with D-C2-1 retrain decision (XFeat is the cheapest retrain candidate among all C3 candidates evaluated at 36 hours single RTX 4090 + 6.5 GB VRAM total per Source #81 §3.3) and D-C3-2 (only XFeat+LighterGlue mode reuses D-C3-2 cvg/LightGlue runtime path; sparse + semi-dense modes sidestep entirely) |
| **D-C4-1 (CARRIED FORWARD from Fact #20 C2 closure; REINFORCED by OpenCV `cv::solvePnPRansac` closure 2026-05-08 Fact #52) — 2D-3D-lift architectural decision (3-DoF acceptance with attitude-from-IMU/VIO prior + 2D ortho-only cache / 4-DoF acceptance with flat-earth + altitude-from-IMU+barometer prior + planar-scene homography → 4-DoF pose extraction / 6-DoF via aerial-photogrammetry-DSM-acquisition + paired DSM at 0.94 m/px / 6-DoF via ALOS 30m DSM with 4× accuracy collapse per Source #41)** | User + Plan-phase architect | Plan-phase decision; **for the project's pinned 2D-ortho-only cache + IMU-attitude-prior context, recommendation is (b) 4-DoF flat-earth + IMU+barometer altitude + VIO/IMU attitude → planar-scene homography → 4-DoF pose extraction** — pairs naturally with `flags=SOLVEPNP_IPPE` (Source #83 explicit "Object points must be coplanar" minimal-solver designed for D-C4-1 = 4-DoF flat-earth case); ALOS-30m-DSM secondary mitigation if 4-DoF accuracy proves insufficient at AC-1.1/1.2 50m/20m bars at the tighter tail. **CRITICAL REINFORCEMENT from Fact #52**: solvePnPRansac requires 3D-2D correspondences (Source #83 explicit `objectPoints` Nx3 + `imagePoints` Nx2 signature) — D-C4-1 lift is a HARD prerequisite for ANY C4 candidate (OpenCV / OpenGV / GTSAM-PnP / Theia / Ceres-only), not unique to OpenCV |
| **D-C4-2 (NEW from OpenCV `cv::solvePnPRansac` closure 2026-05-08 Fact #52; UPDATED by GTSAM closure 2026-05-08 Fact #54) — covariance-recovery-strategy choice (post-hoc Jacobian-based via `cv::projectPoints` Jacobian + Schur complement / wrap solvePnPRansac result in GTSAM `Marginals` posterior / project-defined heuristic covariance scaling — likely AC-NEW-4 REJECT / migrate to OpenGV `absolute_pose::optimize_nonlinear` with custom Jacobian propagation through bearing-vector residuals)** | Plan-phase architect | Plan-phase decision; `cv::solvePnPRansac` returns `retval, rvec, tvec, inliers` only (Source #83 function signature); OpenGV's `optimize_nonlinear` has no covariance output API (Source #85) — **NO direct 6×6 covariance output from either OpenCV or OpenGV**. **GTSAM IS THE EXCEPTION**: `Marginals(graph, result).marginalCovariance(pose_key)` emits 6×6 posterior covariance NATIVELY (Source #87 multiple snippets). AC-NEW-4 covariance-honesty contract requires explicit recovery strategy. 
**Recommendation by primary path**: D-C4-2 = (b) **wrap solvePnPRansac result in GTSAM `Marginals` posterior** via `BetweenFactor<Pose3>` prior + per-inlier `GenericProjectionFactorCal3_S2` factors → `LevenbergMarquardtOptimizer.optimize()` → `Marginals.marginalCovariance` (canonical Plan-phase pathway documented in Fact #54; **STRONGLY RECOMMENDED for the OpenCV-as-RANSAC + GTSAM-as-covariance-recovery hybrid path** — couples Fact #52 mandatory-simple-baseline + Fact #54 modern-competitive-lead-covariance-honest); D-C4-2 = (a) post-hoc Jacobian-based via `cv::projectPoints` Jacobian + Schur complement on inlier residuals (~1 day engineering; pure OpenCV API) for the OpenCV-only-no-GTSAM path if Plan-phase Jetson MVE shows GTSAM's ~30-90 ms latency + ~50-200 MB memory footprint exceeds AC-4.1 / AC-4.2 budgets; D-C4-2 = (c) is likely AC-NEW-4 REJECT; D-C4-2 = (d) couples with D-C4-1 + D-C4-3 + D-C4-4 selection of OpenGV-as-primary at ~3-5 days engineering for OpenGV-internal Jacobian propagation through bearing-vector residuals (harder than OpenCV's pixel Jacobian per Fact #53 closure). **Three-way interaction with AC-NEW-4 covariance-honesty + D-C4-1 lift architectural decision + C5 fusion contract** (Fact #20 + #21 closures) |
| **D-C4-3 (NEW from OpenGV closure 2026-05-08 Fact #53) — license-clearance verification choice (counsel-review of License.txt to confirm BSD-3-Clause-equivalent / request author + ShanghaiTech Mobile Perception Lab to relicense to OSI canonical / treat NOASSERTION as effective disqualifier and pivot to OpenCV-as-primary / elevate D-C4-3 to D-C1-1 and treat OpenGV as eligible only on GPL-3.0 or keep-both-tracks-open)** | License-posture decision-maker + Plan-phase architect | Plan-phase decision conditional on OpenGV being elevated to Selected; Source #84 GitHub API license metadata reports `license.spdx_id: "NOASSERTION"` for canonical `laurentkneip/opengv` repo; Source #84 direct WebFetch of License.txt confirms BSD-3-Clause-equivalent boilerplate (3 numbered redistribution conditions + non-endorsement clause + "Copyright 2013 Laurent Kneip, ANU. All rights reserved." attribution) but the file does NOT use OSI canonical BSD-3-Clause template text. **Recommendation**: D-C4-3 = (a) counsel-review (~1-2 hours legal review) for the OpenGV-as-secondary path; D-C4-3 = (c) pivot to OpenCV-as-primary if Plan-phase Jetson MVE shows OpenCV's mandatory-simple-baseline coverage is sufficient without OpenGV's richer-minimal-solver-coverage. Interacts with D-C4-4 maintenance-staleness mitigation (if D-C4-3 fails, D-C4-4 also pivots to OpenCV-as-primary or Ceres-only fallback) |
|
||||
| **D-C4-4 (NEW from OpenGV closure 2026-05-08 Fact #53) — maintenance-staleness-mitigation strategy choice (accept-as-is + freeze upstream / fork into project-controlled branch + apply Eigen-3.4+ + JetPack-6 + ARM Cortex-A78AE patches in-house / migrate to Ceres-only manual implementation as fallback / downgrade OpenGV to experimental status and pivot to OpenCV-as-primary)** | Plan-phase architect + project bring-up team | Plan-phase decision conditional on OpenGV being elevated to Selected; Source #84 last pushed 2023-06-07T18:14:14Z = ~2 years 11 months stale at access time 2026-05-08; Doxygen portal generated 2018-01-08 = 8.3 years old documentation; ShanghaiTech Mobile Perception Lab's claimed maintenance contradicted by commit history. **Recommendation**: D-C4-4 = (b) fork-and-patch (~1-2 weeks engineering) for the OpenGV-as-secondary path; D-C4-4 = (d) pivot to OpenCV-as-primary if Plan-phase Jetson MVE shows OpenCV's coverage is sufficient; D-C4-4 = (c) Ceres-only fallback (~2-4 weeks) only if (b) patches not feasible. Interacts with D-C4-3 license-clearance verification |
|
||||
| **D-C5-1 (NEW from Manual ESKF Solà 2017 closure 2026-05-08 Fact #88) — reference-implementation-license-verification choice (counsel-review of repo for LICENSE file in subdirectory ~1 hour engineering RECOMMENDED first step / treat as GPL-equivalent and write project implementation from Solà 2017 paper directly without code reuse ~1-2 weeks engineering vs ~3-5 days with reference template / contact author for LICENSE clarification ~1-3 weeks turnaround if author responsive)** | License-posture decision-maker + Plan-phase architect | Plan-phase decision conditional on project electing to reuse `ludvigls/ESKF` (Python ESKF for fixed-wing UAVs DIRECTLY MATCHING project hardware family) OR `cggos/imu_x_fusion` (C++/ROS multi-source loosely-coupled fusion) OR `koledickarlo/ESKF-ESP32` (microcontroller-class with explicit Solà 2017 citation) at the source-code level; Source #89 README front-pages do NOT declare LICENSE for these three repos. `EliaTarasov/ESKF` is PX4-derived (PX4 is dual BSD/Apache-2.0, ecl is BSD-3-Clause) so license-clearance is easier. `joansola/slamtb` is MATLAB-only and not deployable on JetPack 6 (algorithmic reference only). **Recommendation**: D-C5-1 = (b) write directly from canonical Solà 2017 paper for cleanest license-compliance story; reference implementations serve as documentary templates (read for understanding, not copy-paste). Final lock at Plan phase after counsel-review per D-C5-1 = (a). Interacts with D-C1-1 license-posture overall |
|
||||
| **D-C5-2 (NEW from Manual ESKF Solà 2017 closure 2026-05-08 Fact #88) — long-cruise-observability-strategy choice (accept observability degradation in long-cruise segments + monitor via covariance growth + alert operator if covariance > threshold RECOMMENDED / require operator to perform synthetic S-turns periodically every ~30 min to maintain bias observability / tighten bias-stationarity prior — lower IMU bias random-walk noise — at the cost of accepting more bias drift between updates)** | Plan-phase architect | Plan-phase decision; standard EKF/ESKF fusion of IMU + visual measurements requires sufficient excitation (non-pure-rotation, non-zero acceleration) for IMU bias observability per Solà §5.1 reference + classical observability literature. For a fixed-wing UAV in cruise (level flight at ~60 km/h with minimal acceleration), bias drift is the dominant error source; periodic accelerations (turns, climbs, level-to-bank transitions) re-excite observability. **Recommendation**: D-C5-2 = (a) accept + monitor. Mitigation = project's pinned mission profile per restrictions.md provides natural re-excitation via sharp turns up to ±20° bank per AC-3.1 + sharp-turn frames may share <5% overlap per AC-3.2. Covariance growth alert is consistent with AC-NEW-8 blackout failsafe escalation thresholds. **GTSAM iSAM2 Fact #89 partially mitigates** via incremental smoothing's look-back refinement of bias estimates over the entire sliding window (vs Manual ESKF's recursive forward-time-only bias estimation). Applies primarily to Manual ESKF Fact #88; partially-mitigated for GTSAM iSAM2 Fact #89 |
|
||||
| **D-C5-3 (NEW from GTSAM iSAM2 closure 2026-05-08 Fact #89) — sliding-window-primitive-choice (`gtsam_unstable.IncrementalFixedLagSmoother` with K=10-20 keyframes covering ~3-7 s of recent history RECOMMENDED ~30 minutes engineering / custom marginalization via `ISAM2.marginalizeLeaves(keys_to_marginalize)` ~2-3 days engineering / accept unbounded ISAM2 graph growth simplest ~0 minutes engineering but tested at Jetson MVE phase — likely fails AC-4.2 budget at K_total = 86400 keyframes × ~1 KB per keyframe state = ~86 MB raw + factor-graph overhead)** | Plan-phase architect | Plan-phase decision conditional on D-C5-row final lock including GTSAM iSAM2; `IncrementalFixedLagSmoother` is in `gtsam_unstable` namespace per Source #91 (canonical fixed-lag smoother class but requires opt-in to gtsam_unstable APIs; not in stable `gtsam` namespace). **Recommendation**: D-C5-3 = (a) IncrementalFixedLagSmoother with K=10-20 keyframes covering ~3-7 s of recent history. Interacts with D-C5-5 factor-density-choice (lower K reduces per-update factor count proportionally) |
|
||||
| **D-C5-4 (NEW from GTSAM iSAM2 closure 2026-05-08 Fact #89) — IMU-gap-handling-strategy choice (accept canonical pattern + monitor + adaptive integration covariance inflation RECOMMENDED ~1 day engineering / restart PIM on detected gaps with conservative initial covariance more aggressive ~3-5 days engineering / buffer IMU samples in a queue with explicit gap-fill via interpolation most aggressive ~1 week engineering)** | Plan-phase architect | Plan-phase decision conditional on D-C5-row final lock including GTSAM iSAM2; `CombinedImuFactor` requires CONTIGUOUS IMU samples between keyframes per Source #90 canonical pattern; if IMU samples are dropped mid-flight (network jitter, MAVLink frame loss), `pim.preintMeasCov()` 9×9 covariance becomes optimistic vs reality. **Recommendation**: D-C5-4 = (a) accept + monitor + adaptive inflation (track `last_imu_timestamp` and inflate `params.setIntegrationCovariance` adaptively if gap > expected). Project's pinned MAVLink IMU pipeline at ~100-200 Hz Pixhawk-class is delivered over UART or USB serial — dropped samples are rare. Interacts with C8 MAVLink/MSP2 FC adapter row (when opened) for IMU-pipeline-jitter characterization |
|
||||
| **D-C5-5 (NEW from GTSAM iSAM2 closure 2026-05-08 Fact #89) — factor-density-choice (per-correspondence `GenericProjectionFactorCal3DS2` highest fidelity 1000+ factors per keyframe at K=10 image pairs × 100 inliers per pair ~50-150 ms per update on Jetson Orin Nano Super CPU tight AC-4.1 satisfaction / smart-projection-pose-factor canonical landmark-marginalization-at-construction-time 1 factor per landmark per keyframe ~10× speedup at minimal accuracy loss ~5-15 ms per update on Jetson Orin Nano Super CPU / `PriorFactorPose3` only with C4 GTSAM Marginals satellite-anchor 6×6 covariance — couples C4 Fact #54 D-C4-2 = (b) with C5 Fact #89 architectural integration via shared GTSAM substrate ~1 factor per keyframe ~2-5 ms per update on Jetson Orin Nano Super CPU CLEANEST cross-component coupling RECOMMENDED for the GTSAM-as-shared-C4+C5-substrate hybrid path)** | Plan-phase architect | Plan-phase decision conditional on D-C5-row final lock including GTSAM iSAM2; iSAM2 per-update latency depends critically on factor density per keyframe. **Recommendation**: D-C5-5 = (c) for the GTSAM-as-shared-C4+C5-substrate hybrid path (project's recommended C5 architecture per Fact #89 closure); D-C5-5 = (b) for the C5-as-secondary-with-smoothing path if Plan-phase Jetson MVE shows (c) accuracy is insufficient at AC-1.1/1.2 tail. **Three-way interaction with D-C4-2 covariance-recovery-strategy + AC-4.1 latency budget + AC-1.1/1.2 frame-center pose accuracy**. Strongest cross-component lever in the C4+C5 design space — D-C5-5 = (c) operationalizes the GTSAM-shared-substrate architectural advantage identified in C4 Fact #54 + C5 Fact #89 |
|
||||
| **D-C6-1 (NEW from Cand 1 closure 2026-05-08 Fact #92; mirrored by Cand 2 D-C6-6) — descriptor-storage-format choice (full-precision float32 in `bytea` column ~8 KB/tile-at-2048-D / **halfvec via app-side conversion + storage as 2-byte half-floats ~4 KB/tile-at-2048-D ~50% cache savings ~0-2% Recall@K loss RECOMMENDED** / INT8 quantized + per-vector scale parameter ~2 KB/tile-at-2048-D ~75% cache savings + ~1 day engineering for quantization-aware loader)** | Plan-phase architect | Plan-phase decision; trade-off between AC-8.3 cache footprint vs Recall@K accuracy loss vs engineering complexity. **Recommendation**: D-C6-1 = (b) halfvec for descriptor storage at ~2× cache-footprint-saving with ~0-2% Recall@K loss. Interacts with D-C2-9 NetVLAD descriptor-dimension choice + D-C2-10 EigenPlaces descriptor-dimension choice + D-C2-6 SALAD descriptor-size choice + AC-8.3 10 GB cache budget |
|
||||
| **D-C6-2 (NEW from Cand 1 closure 2026-05-08 Fact #92, Cand-1-only) — FAISS index variant choice for app-side descriptor ANN (`IndexFlatL2` brute-force exact-distance for small caches <10K tiles ~1-3 ms per query / **`IndexHNSWFlat(d, M=32)` graph-based approximate for primary path 100K-1M tiles ~1-3 ms per query w/ efSearch=64 RECOMMENDED** / `IndexIVFFlat` inverted-file approximate w/ training requirement / `IndexIVFPQ` for additional product-quantizer compression at ~10% Recall@K loss)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 (mirror-suite-pattern) being selected as primary; trade-off between memory footprint vs query accuracy vs query latency. **Recommendation**: D-C6-2 = (b) IndexHNSWFlat M=32 for primary path; IndexFlatL2 fallback for small caches per Source #96 contextual guidance |
|
||||
| **D-C6-3 (NEW from Cand 1 closure 2026-05-08 Fact #92, Cand-1-only, CROSS-COMPONENT with C10) — descriptor-cache-rebuild-trigger strategy (rebuild on every cache modification ~simplest but slow ~5-30 sec per rebuild blocks readiness / incremental add via `index.add()` ~faster but HNSW does not support delete cleanly per Source #96 / **periodic rebuild during pre-flight provisioning ~most robust requires C10 coordination + serialize via `faiss.write_index` + reload at takeoff in <5 sec RECOMMENDED**)** | Plan-phase architect + C10 owner | Plan-phase decision conditional on Cand 1 being selected; jointly owned with C10 pre-flight cache provisioning row (when opened). **Recommendation**: D-C6-3 = (c) periodic rebuild during C10 pre-flight provisioning. Strongest C6+C10 cross-component coupling |
|
||||
| **D-C6-4 (NEW from Cand 1 closure 2026-05-08 Fact #92, Cand-1-only) — geographic-spatial-grid radius `k` choice (fixed-1 = 3x3 grid simplest / fixed-2 = 5x5 grid covers AC-3.x sharp turns more robustly / fixed-4 = 9x9 grid for very high-bank or low-zoom / **dynamic derived from zoom + ground-speed projected over next 5 sec RECOMMENDED**)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 being selected; trade-off between per-query candidate count vs spatial coverage vs latency. **Recommendation**: D-C6-4 = dynamic |
|
||||
| **D-C6-5 (NEW from Cand 2 closure 2026-05-08 Fact #93, Cand-2-only contingent) — Jetson PostGIS + pgvector co-installation Plan-phase verification choice (**verify on Jetson MVE phase as part of D-C1-2 dedicated bring-up phase RECOMMENDED — already-required Jetson hardware bring-up cycle absorbs this work cheaply** / fork PostGIS+pgvector ARM64 builds in-house if upstream packages incomplete ~1-3 days engineering / pivot to Cand 1 if PostGIS+pgvector co-installation reveals blocking incompatibility)** | Project bring-up team + C7 inference-runtime owner | Plan-phase decision conditional on Cand 2 being elevated to primary; Source #94 search results explicit limitation: "do not provide specific information about PostGIS 3.4's compatibility with ARM64 architecture on Jetson devices, nor do they document the installation footprint"; Source #97 March 2026 article confirms Postgres+pgvector on Jetson but does not explicitly cover PostGIS. **Recommendation**: D-C6-5 = (a) verify on Jetson MVE. Interacts with D-C1-2 Jetson MVE phase + D-C7 (when opened) |
|
||||
| **D-C6-6 (NEW from Cand 2 closure 2026-05-08 Fact #93, Cand-2-only contingent; mirrors D-C6-1 for the pgvector-side) — pgvector descriptor-storage-type choice (`vector` full-precision float32 with 2,000-dim max for HNSW per Source #95 — JUST EXCEEDED by MixVPR 2048-D / **`halfvec` half-precision 2-byte with 4,000-dim max for HNSW indexing (16,000-dim storage max) + 50% cache savings + ~0-2% Recall@K loss RECOMMENDED — covers all C2 VPR descriptor candidates consistently** / `sparsevec` for sparse descriptors / `bit` for binary descriptors via Hamming distance)** | Plan-phase architect | Plan-phase decision conditional on Cand 2 being elevated to primary; trade-off between cache footprint vs accuracy vs descriptor compatibility with C2 VPR candidate output format. **Recommendation**: D-C6-6 = (b) halfvec. Interacts with D-C2-9 + D-C2-10 + D-C2-6 descriptor-dimension choices |
|
||||
| **D-C6-7 (NEW from C6 batch 1 closure 2026-05-08 Fact #92 + Fact #93, CROSS-COMPONENT — affects both Cand 1 and Cand 2; forced by Cand 2 selection) — IF Cand 2 selected → cascade-changes-back-to-suite-satellite-provider strategy choice (cascade PostGIS+pgvector adoption back to satellite-provider for cross-suite consistency ~1-3 days engineering at suite + onboard / keep satellite-provider on btree-only and gps-denied-onboard on PostGIS+pgvector ~accept divergence + maintenance burden / migrate satellite-provider to PostGIS+pgvector in a separate ticket post-MVP / **leave satellite-provider unchanged + maintain Cand 1 throughout — no cascade needed RECOMMENDED if Cand 1 selected as primary which is the closure verdict**)** | User + Plan-phase architect + suite satellite-provider owner | Plan-phase decision conditional on Cand 2 being elevated to primary at C6; per user's session-start clarification "if improvement is small, then there is no sense to change anything at all" — IF Cand 2's MATERIAL improvement justifies adoption (currently NO per closure verdict in Fact #92 + Fact #93 comparative analysis), cascade via separate ticket; OTHERWISE stay with Cand 1 throughout the suite. **Cross-component cascade decision affecting parent-suite `satellite-provider` component** |
|
||||
| **D-C7-1 (CLOSED IN C7 batch 1 2026-05-08, per C9 / SQ7 restructure user choice A) — calibration-dataset-strategy** | Plan-phase architect (CLOSED at research time — no Plan-phase decision remains) | **Closed at C7 batch 1**: strategy = **real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles** as the calibration corpus distribution (matches the Project Constraint Matrix's "Inputs available" pinning + provides realistic noise/illumination/season distribution that the deployed system will see). Specific fixture-file pin (AerialVL S03 vs project's Mavic + Derkachi flight clips vs other corpora) is fixture-class and **DELEGATED to Test Spec (greenfield Step 5)**. Synthetic-tile augmentation via random homography is the documented low-data fallback, only invoked if real flight footage is insufficient for Recall@K-target calibration. ~500–1,500 representative samples per the C7 batch 1 closure constraint. **No Plan-phase Choose block remains** — the architectural decision is locked at C7 batch 1 closure. **Cross-component coupling with C9 dropped** per restructure; coupling moves to C7 ↔ Test Spec for fixture-file pinning. |
|
||||
| **D-C7-2 (NEW from Cand 1 TensorRT-native closure 2026-05-08 Fact #94, Cand-1-only) — TensorRT mixed-precision flag matrix per model family (single FP16-only flag for entire pipeline / **INT8+FP16 for VPR backbones + FP16-only for matchers + FP16-only for VIO frontends [hybrid per-family per D-C7-6] RECOMMENDED** / per-layer precision overrides via `setPrecision`)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 (TensorRT-native) being selected as primary. **Recommendation**: D-C7-2 = (b) ladder per D-C7-6 per-model-family precision policy. Interacts with D-C7-6 cross-component model-family precision policy (AC-NEW-4 covariance honesty + AC-1.1/1.2 frame-center accuracy preserved at FP16 for matchers per Source #103 evidence) |
|
||||
| **D-C7-3 (NEW from Cand 2 ONNX Runtime+TRT EP closure 2026-05-08 Fact #95, Cand-2-only) — ORT-Jetson-wheel-index-pin choice (`pypi.jetson-ai-lab.io/jp6/cu126` for JetPack 6.2 / `pypi.jetson-ai-lab.io/jp6/cu129` for JetPack 6.x with newer CUDA / **mirror the wheel index to a project-controlled artifact registry for offline-deployment robustness RECOMMENDED ~50 MB per wheel set; pre-flight provisioning step + cu126 variant for JetPack 6.2 alignment**)** | Plan-phase architect + C10 owner | Plan-phase decision conditional on Cand 2 (ONNX Runtime + TRT EP) being elevated to primary; standard `pip install onnxruntime-gpu` does NOT work on Jetson Tegra per Source #100 Issue #20503 — Microsoft does not publish prebuilt aarch64 wheels with CUDA/TensorRT EPs. **Recommendation**: D-C7-3 = (c) mirror to project artifact registry + cu126 variant. Interacts with R-NEW-2 no-cloud-at-flight (offline-deployment requires wheel mirror) + C10 pre-flight cache provisioning |
|
||||
| **D-C7-4 (NEW from Cand 2 ONNX Runtime+TRT EP closure 2026-05-08 Fact #95, Cand-2-only) — numpy-version-pin choice (**`numpy<2.0.0` per Source #100 Issue #27562 RECOMMENDED until upstream rebuild** / wait for upstream onnxruntime-gpu rebuild against numpy>=2 / pin to a specific onnxruntime-gpu version known to work with numpy<2)** | Plan-phase architect | Plan-phase decision conditional on Cand 2 being elevated to primary; onnxruntime-gpu v1.23.0 wheels for JetPack 6 were built against `numpy<2.0.0`; importing under `numpy>=2.0.0` raises a compatibility error per Source #100 Issue #27562. **Recommendation**: D-C7-4 = (a) `numpy<2.0.0` until upstream rebuild; track Issue #27562 status at Plan phase |
|
||||
| **D-C7-5 (NEW from Cand 3 pure-PyTorch-FP16 closure 2026-05-08 Fact #96, Cand-3-only) — PyTorch-Jetson-wheel-pin choice (**PyTorch 2.5 + torchvision 0.20 stable RECOMMENDED ~most-stable combination per NVIDIA Developer Forum** / PyTorch 2.9 + torchvision latest / track Jetson AI Lab cadence)** | Plan-phase architect | Plan-phase decision conditional on Cand 3 (pure PyTorch FP16) being selected as mandatory simple-baseline. Standard `pip install torch` does NOT include CUDA support on Jetson per Source #101 NVIDIA Developer Forum threads; must use Jetson AI Lab community wheels. Known dependency issues with `libcudss.so.0` and `libnvdla_runtime.so` on PyTorch 2.9 cu129 wheel under JetPack 6.2 (CUDA 12.6) — version-mismatch sensitive. **Recommendation**: D-C7-5 = (a) PyTorch 2.5 + torchvision 0.20 for the project's first deployment; revisit at Plan phase based on Jetson MVE results |
|
||||
| **D-C7-6 (NEW from C7 batch 1 closure 2026-05-08 Fact #94 + Fact #95 + Fact #96, CROSS-COMPONENT — affects C2 + C3 + C1 + C7) — INT8-vs-FP16-per-model-family-precision-policy (single INT8 across all model families with sensitivity-fallback / **per-family precision policy: VPR INT8+FP16 fallback, matchers FP16-only, VIO frontends FP16-only RECOMMENDED — operationalizes Source #103 matcher-INT8-quantization-sensitivity finding + Source #102 VPR-CNN-INT8-tolerability finding** / FP16 across all model families until calibration data validates per-family INT8)** | User + Plan-phase architect | **Strongest cross-component lever in the C2+C3+C7 design space.** Plan-phase decision; Source #103 evidence shows LightGlue FP8 caused "match counts dropped sometimes hard" (FP8 is structurally similar to INT8 in dynamic-range reduction) — feature-matching networks are quantization-sensitive in a way that detection / VPR networks are not. Source #102 confirms YOLO26n CNN at INT8 has -6.5% mAP50-95 vs FP16 — acceptable for VPR Recall@K granularity. **Recommendation**: D-C7-6 = (b) per-family policy: VPR backbones (CNN-class MixVPR/EigenPlaces/NetVLAD) → INT8+FP16 mixed; ViT-class VPR backbones (SelaVPR DINOv2-L, conditional AnyLoc/BoQ/DINOv2-VLAD) → FP16-only initially with INT8 deferred to Jetson MVE per D-C2-5; matchers (LightGlue with SP/DISK/ALIKED, XFeat, XFeat+LighterGlue) → **FP16-only — NO INT8**; learned VIO frontends (if any selected at C1) → FP16-only initially, INT8 deferred to Jetson MVE per D-C7-2. **Three-way interaction with AC-1.1/1.2 frame-center accuracy + AC-4.1 latency budget + AC-NEW-3 (FDR for INT8 calibration cache provenance)** |
|
||||
| **D-C7-7 (NEW from Cand 1 TensorRT-native closure 2026-05-08 Fact #94, Cand-1-only CROSS-COMPONENT with C10) — engine-build-on-Jetson-vs-prebuilt-engine-shipping strategy (build engines at pre-flight on the deployed Jetson / build engines on a known-good "reference Jetson" then ship the same `.engine` files to all production Jetsons / **both — primary path build-on-target with reference-Jetson-built engines as a fallback if pre-flight build fails RECOMMENDED ~handles SM-version drift + future TensorRT minor version updates**)** | Plan-phase architect + C10 owner | Plan-phase decision conditional on Cand 1 (TensorRT-native) being selected as primary; per Source #105 constraints #2 + #3, TensorRT engines are hardware-specific (SM 87 for Orin Nano Super) and CANNOT be transferred between devices. **Recommendation**: D-C7-7 = (c) primary build-on-deployed-Jetson during pre-flight; fallback prebuilt engines for emergency provisioning. **Strongest C7+C10 cross-component coupling — C10 owns the engine-build pipeline + calibration-dataset assembly per D-C7-1** |
|
||||
| **D-C7-8 (NEW from Cand 1 TensorRT-native closure 2026-05-08 Fact #94, Cand-1-only) — builder workspace cap to avoid tactic-profile segfault during build (legacy `config.max_workspace_size`; on the TensorRT 10.x line pinned in D-C7-9 that property is removed and the equivalent is `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, <bytes>)`) (**1 GB safe default RECOMMENDED** / 2 GB for richer kernel-fusion search / 3 GB for fastest-possible engine but high segfault risk on 8 GB shared budget)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 being selected as primary; per Source #105 constraint #4, TensorRT engine builds on Jetson under memory pressure can segfault during tactic profiling (8 GB shared CPU+GPU is tight; rich layer-fusion search consumes peak RAM during `tactic.profile` phase). **Recommendation**: D-C7-8 = (a) 1 GB safe default; raise to 2 GB only if Plan-phase Jetson MVE shows engine quality is materially worse at 1 GB |
|
||||
| **D-C7-9 (NEW from Cand 1 TensorRT-native closure 2026-05-08 Fact #94, Cand-1-only) — TensorRT version pin within JetPack lifecycle (**lock to JetPack 6.2 + TensorRT 10.3 for the project's first deployment RECOMMENDED** / track JetPack 6.x minor releases / lock the exact JetPack point release for cross-deployment reproducibility)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 being selected as primary; JetPack 6.2 ships TensorRT 10.3 + CUDA 12.6 + cuDNN 9.3 (Source #104). Upgrading TensorRT independently of JetPack is not officially supported per Source #105. **Recommendation**: D-C7-9 = (a) lock to JetPack 6.2 + TensorRT 10.3 for the project's first deployment; revisit at Plan phase per JetPack release cadence |
|
||||
| **D-C8-1 (NEW from Cand 1 pymavlink-GPS_INPUT closure 2026-05-08 Fact #97, Cand-1-only) — pymavlink connection-string transport choice (`udpout:127.0.0.1:14550` for in-process companion+autopilot UDP / `serial:/dev/ttyTHS1:921600` for direct UART to AP TELEM port / `tcp:127.0.0.1:5760` for SITL replay / **all three configurable via env var, default UART for production deployment, UDP for SITL replay, TCP for unit tests RECOMMENDED**)** | Plan-phase architect | Plan-phase decision conditional on Cand 1 (pymavlink → GPS_INPUT) being selected as primary AP path; pymavlink supports all three transports identically. **Recommendation**: D-C8-1 = (d) all three configurable + default UART production. Reduces moving parts in production while preserving testability paths |
|
||||
| **D-C8-2 (NEW from Cand 1 pymavlink-GPS_INPUT closure 2026-05-08 Fact #97, Cand-1-only CROSS-COMPONENT with AC-NEW-2) — `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch ownership pattern (companion always claims source-set 1 + FC keeps real-GPS at source-set 2 + companion is reactive only / **companion publishes to source-set 2 + auto-switches FC to set 2 on first valid fix + switches back to set 1 when companion is unavailable RECOMMENDED ~mirrors NGPS/Auterion pattern** / operator manually flips source-set via RC aux switch option 90)** | Plan-phase architect + AC-NEW-2 owner | Plan-phase decision conditional on Cand 1 being selected as primary AP path; per SQ6 Fact #3 "no GCSs are currently known to implement" companion-driven `MAV_CMD_SET_EKF_SOURCE_SET` — but it works at firmware level. **Recommendation**: D-C8-2 = (b) companion publishes to source-set 2 + auto-switches FC; project gets to define the canonical pattern; mirrors NGPS/Auterion deployment pattern from SQ1 lookup |
|
||||
| **D-C8-3 (NEW from Cand 1 pymavlink-GPS_INPUT closure 2026-05-08 Fact #97, Cand-1-only) — pymavlink LGPL-3.0 license-posture verification (**bundle pymavlink unmodified + publish requirements.txt with version pin RECOMMENDED ~standard LGPL-3.0 §4 (Combined Works) compliance** / statically link via Cython compilation [LGPL-3.0 §4 obligation: provide relinkable form] / wrap pymavlink behind a thin C++/Rust process boundary to keep companion-app fully Apache-2.0 [over-engineered])** | Plan-phase architect + license owner | Plan-phase decision conditional on Cand 1 being selected as primary; LGPL-3.0 §4 allows linking from Apache-2.0 app without "infecting" application license. **Recommendation**: D-C8-3 = (a) bundle unmodified + requirements.txt; aligns with D-C1-1 license-posture-track decision; pymavlink LGPL-3.0 vs project Apache-2.0 dual-use track is straightforward |
|
||||
| **D-C8-4 (NEW from Cand 2 MSP2_SENSOR_GPS closure 2026-05-08 Fact #99, Cand-2-only) — Python MSP V2 implementation choice (**YAMSPy [community-blessed for iNav external-device comms per Issue #4465; MIT; widest community usage] RECOMMENDED PRIMARY** / INAV-Toolkit `msp_v2_encode` primitive lifted into the project [951-line MIT module, direct primary-source reference] SECONDARY / thin custom encoder using `struct.pack` + CRC-8 DVB-S2 helper [50-line bespoke fallback] FALLBACK / project-side fork of one of the above)** | Plan-phase architect | Plan-phase decision conditional on Cand 2 (MSP2_SENSOR_GPS) being selected as primary iNav path; all options are MIT and produce identical wire bytes. **Recommendation**: D-C8-4 = (a) YAMSPy primary + (c) thin custom encoder fallback if YAMSPy lacks MSP2_SENSOR_GPS support. Choice depends on maintainability vs minimum-dependency-surface preference |
|
||||
| **D-C8-5 (NEW from Cand 2 MSP2_SENSOR_GPS closure 2026-05-08 Fact #99, Cand-2-only) — MSP2_SENSOR_GPS injection rate (**5 Hz periodic RECOMMENDED ~matches GPS_INPUT 5 Hz cadence on AP side, single-rate cross-FC consistency** / 10 Hz to match iNav nav-cycle frequency / variable rate matching estimator publication rate [3 Hz nominal, up to 10 Hz when matcher confidence is high])** | Plan-phase architect | Plan-phase decision conditional on Cand 2 being selected as primary; estimator publishes at 3 Hz nominal (per pinned dual-rate camera pipeline Fact #40). **Recommendation**: D-C8-5 = (a) 5 Hz periodic; spare headroom for IMU-propagation between estimator updates; cross-FC consistency with AP path |
|
||||
| **D-C8-6 (NEW from Cand 3 UBX-impersonation closure 2026-05-08 Fact #98, Cand-3-only contingent) — IF Cand 3 selected → UBX-version-advertisement strategy (**advertise hwVersion ≥ M9 + swVersion ≥ 15.00 via MON-VER (CLASS=0x0A, ID=0x04) at startup + every reset; force iNav into NAV-PVT-only protocol surface RECOMMENDED ~simplest** / advertise hwVersion = M8 + swVersion = 14.x to drive iNav into legacy NAV-POSLLH+NAV-SOL+NAV-VELNED+NAV-TIMEUTC quad mode [more messages but historical iNav-friendly path] / implement adaptive advertisement based on iNav firmware-version probe)** | Plan-phase architect | Plan-phase decision conditional on Cand 3 (UBX impersonation) being elevated to primary at iNav side; per Source #110 lines 1024-1060, iNav configures the simpler NAV-PVT-only path for u-blox version ≥ 15.0. **Recommendation**: D-C8-6 = (a) advertise version ≥ 15.0 to minimize protocol surface |
|
||||
| **D-C8-7 (NEW from Cand 3 UBX-impersonation closure 2026-05-08 Fact #98, Cand-3-only contingent CROSS-COMPONENT with AC-NEW-7) — IF Cand 3 selected → AC-NEW-7 audit-trail posture (**explicit FDR audit entry on every UBX impersonation session start, naming companion as the UBX source + providing operator-consent provenance check at boot RECOMMENDED** / silent operation with user-manual disclosure only / require runtime parameter `gps-denied-onboard.enable_ubx_impersonation = true` to be set explicitly by the user via QGC [active opt-in])** | Plan-phase architect + AC-NEW-7 owner | Plan-phase decision conditional on Cand 3 being elevated to primary at iNav side; UBX impersonation is unambiguously a forgery posture (companion impersonates u-blox receiver). **Recommendation**: D-C8-7 = (a) explicit FDR audit entry on every impersonation session start; AC-NEW-7 (no covert GPS spoofing without consent) requires an audit trail |
|
||||
| **D-C8-8 (NEW from Cand 1 + Cand 2 closure 2026-05-08 Fact #97 + Fact #99, CROSS-COMPONENT — affects both Cand 1 and Cand 2; CROSS-COMPONENT with C5 covariance contract) — covariance-honesty cross-FC enforcement strategy (project always publishes the SAME covariance value to both FCs [single shared contract, simpler test surface] / **per-FC covariance unit conversion: AP `GPS_INPUT.horiz_accuracy` (m) vs iNav `MSP2_SENSOR_GPS.hPosAccuracy` (mm) — companion publishes the same source covariance, formatted per-FC RECOMMENDED** / per-FC covariance smoothing [different filter parameters per FC; over-engineered + monotonicity-violation risk under C5 D-C5-2])** | Plan-phase architect + AC-NEW-4 owner | Plan-phase decision; AC-NEW-4 covariance-honesty obligation is the same for both FCs; only the unit + field-name change. **Recommendation**: D-C8-8 = (b) per-FC covariance unit conversion; same source covariance, formatted per-FC. **Strongest C5+C8 cross-component coupling** — extracts 2×2 horizontal sub-matrix from C5 GTSAM `Marginals.marginalCovariance` 6×6 matrix, computes 95% confidence ellipse semi-major axis `sqrt(5.991 * λ_max)` (5.991 = χ² 0.95 quantile at 2 degrees of freedom; λ_max = largest eigenvalue of the 2×2 sub-matrix), emits as `horiz_accuracy` (m) for AP / `hPosAccuracy` (mm) for iNav |
|
||||
| **D-C10-1 (NEW from Sub-area 1 closure 2026-05-08 Fact #100, C10-only) — descriptor-cache rebuild trigger choice (rebuild on every pre-flight invocation simplest but slow / **manifest-hash-driven (rebuild iff `SHA-256(descriptor_blobs[*] + IndexHNSWFlat params)` differs from last-recorded manifest hash) RECOMMENDED + `--force-rebuild` operator override** / time-based (rebuild every N days irrespective of content drift, AC-8.2-aligned))** | Plan-phase architect + C10 owner | Plan-phase decision; trade-off between rebuild latency (5-30 sec at 100K tiles) blocking pre-flight readiness vs unnecessary work when descriptor blobs haven't changed. **Recommendation**: D-C10-1 = (b) manifest-hash-driven + `--force-rebuild` override. Operationalizes the "incremental add unsafe with HNSW deletes" Source #96 finding by treating any descriptor-blob churn as a full rebuild trigger. Operator override allows AC-NEW-3 FDR-required rebuild for cache-poisoning recovery without operator hash-debugging |
|
||||
| **D-C10-2 (NEW from Sub-area 1 closure 2026-05-08 Fact #100, C10-only) — descriptor-cache atomic-write strategy (write directly to target path simplest but unsafe — partial-write leaves a corrupt FAISS file that `read_index` will load successfully per Source #114 "no internal integrity check" warning / **`python-atomicwrites` package — write-to-temp + `fsync` + atomic rename + parent-dir fsync per Source #116 RECOMMENDED ~3-line addition** / hand-rolled `os.rename` via `tempfile.NamedTemporaryFile(dir=parent_dir)` + manual `fsync` ~10-line equivalent)** | Plan-phase architect + AC-NEW-7 owner | Plan-phase decision; without atomic-write, a power loss or process kill mid-`faiss.write_index` leaves a truncated/partial file that loads successfully and produces silently-wrong descriptor matches at takeoff — **direct violation of AC-NEW-7 cache-poisoning safety + AC-3.3 re-localization stability**. **Recommendation**: D-C10-2 = (b) `python-atomicwrites`. Cross-platform; pure-Python; auditable; established pattern per Source #116. Interacts with D-C10-3 content-hash verification (atomic-write prevents the truncated-file class of corruption; content-hash gate catches malicious tampering separately) |
|
||||
| **D-C10-3 (NEW from Sub-area 1 closure 2026-05-08 Fact #100, C10-only CROSS-COMPONENT with AC-NEW-7) — content-hash verification gate at takeoff load (skip verification — accept FAISS file as-is per "trusted local filesystem" assumption / **compute `SHA-256(faiss_index_file)` at takeoff load + compare against manifest-recorded hash + reject load + emit STATUSTEXT to FC + refuse takeoff if mismatch RECOMMENDED — directly satisfies AC-NEW-7 cache-poisoning safety obligation** / verify only on first takeoff after rebuild + cache the verification result)** | Plan-phase architect + AC-NEW-7 owner | Plan-phase decision; FAISS Source #114 explicit security warning: "No attempt is made to check the correctness of loaded data. A faulty or malicious file could lead to out-of-memory errors or code execution. Users are responsible for verifying that files loaded with `read_index` have not been altered since being written by `write_index`." **Recommendation**: D-C10-3 = (b) reject-and-refuse-takeoff. The hash check is ~50 ms one-time cost vs the unbounded cost of silent descriptor-cache poisoning leading to incorrect VPR retrieval feeding the rest of the pipeline. **Strongest C10 ↔ AC-NEW-7 coupling**. Couples with D-C10-2 (atomic-write prevents truncation; content-hash catches tampering). Final lock at Plan phase after AC-NEW-7 owner reviews STATUSTEXT format + FC FDR audit-entry shape |
|
||||
| **D-C10-4 (NEW from Sub-area 1 closure 2026-05-08 Fact #100, C10-only) — descriptor-cache load path (full read into RAM via `faiss.read_index(path)` simplest + warmest cache after first query / **mmap via `faiss.read_index(path, faiss.IO_FLAG_MMAP_IFC)` + `madvise(MADV_WILLNEED)` pre-fault to smooth p99 latency RECOMMENDED — eliminates ~430 MB read at takeoff, supports large indices that exceed shared 8 GB RAM budget per AC-4.2** / both — Plan-phase Jetson MVE benchmark to pick the lower-p99-latency path)** | Plan-phase architect + Jetson MVE bring-up team | Plan-phase decision conditional on Jetson MVE bench results; mmap eliminates the takeoff load read entirely (FAISS supports mmap on `IndexHNSWFlat` per Source #114 `IO_FLAG_MMAP_IFC` flag); but post-load search performance is "slightly slower initially due to memory layout and cache effects" per Source #115 Issue #622, requiring a warmup-search-pass at takeoff. **Recommendation**: D-C10-4 = (b) mmap with `madvise(MADV_WILLNEED)` pre-fault — fastest path for the project's <5 s takeoff load budget; or (c) bench both at Jetson MVE and pick the lower-p99-latency path empirically. Interacts with AC-4.2 8 GB shared CPU+GPU memory budget (mmap reduces peak RAM during load) |
|
||||
| **D-C10-5 (NEW from Sub-area 2 closure 2026-05-08 Fact #101, C10-only CROSS-COMPONENT with C7) — TensorRT engine-build orchestration tool choice (`trtexec` only — single binary, simplest deployment, but `--int8` without `--calib` falls back to random calibration data per Source #119 — collapses INT8 accuracy / Polygraphy CLI only — handles INT8 calibration via `--data-loader-script` per Source #117 + canonical NVIDIA-blessed wrapper / direct `IBuilderConfig` Python API only — most flexible but most engineering cost per Source #121 + duplicates Polygraphy's calibration-cache management / **hybrid: Polygraphy CLI primary for INT8-calibrating builds + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API as escape hatch for unusual models like LightGlue dynamic-shape inputs RECOMMENDED ~best of all three for the project's mixed model family**)** | Plan-phase architect + C7 inference-runtime owner + C10 owner | Plan-phase decision conditional on Cand 1 (TensorRT-native) per D-C7-7 = (c) being selected as primary; trade-off between operational simplicity vs feature coverage vs maintenance footprint. **Recommendation**: D-C10-5 = (d) hybrid; pin canonical recipes per model family (VPR backbone INT8+FP16 via Polygraphy; matchers FP16-only via Polygraphy; LightGlue dynamic-shapes via direct API; cache-reuse rebuilds via trtexec). **Strongest C7+C10 cross-component coupling** — operationalizes D-C7-7 closure |
|
||||
| **D-C10-6 (NEW from Sub-area 2 closure 2026-05-08 Fact #101, C10-only CROSS-COMPONENT with D-C7-1) — TensorRT calibration-cache reuse strategy (rebuild calibration on every engine build slowest but always uses freshest corpus / **rebuild calibration only when `SHA-256(calibration_corpus)` changes from last-recorded manifest hash + reuse cached scales otherwise per Source #117 cache-reuse pattern RECOMMENDED + `--force-trt-rebuild` operator override** / never recalibrate after first successful build — risks per-model accuracy drift if the underlying model graph changes via fine-tune)** | Plan-phase architect + C7 owner | Plan-phase decision conditional on Cand 1 being selected as primary; calibration cache binary-blob is keyed by `SHA-256(calib_corpus)` + onnx-graph hash + TRT version per Source #117 + Source #118 design. Without reuse, every engine build re-runs the ~10-30 minute calibration on the 500-1500-image corpus per D-C7-1 closure. **Recommendation**: D-C10-6 = (b) rebuild on `SHA-256(calib_corpus)` change + `--force-trt-rebuild` override. Subsequent rebuilds <30 sec via cache reuse per Source #117. **Strongest D-C7-1 ↔ C10 coupling** — operationalizes the calibration-corpus closure into the build pipeline |
|
||||
| **D-C10-7 (NEW from Sub-area 2 closure 2026-05-08 Fact #101, C10-only) — TensorRT engine on-disk filename schema (single `<model>.engine` per model — simplest but breaks under TRT/JetPack version drift / **self-describing `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` filename + sidecar `manifest.json` per Source #105 hardware-tied-engine constraint RECOMMENDED ~enables side-by-side multi-version coexistence + reference-Jetson-built fallback engines per D-C7-7 = (c)** / single-bucket directory with manifest-only routing)** | Plan-phase architect + C10 owner | Plan-phase decision conditional on Cand 1 being selected as primary; per Source #105, TRT engines are tied to (SM version, JetPack version, TRT version, precision mode) — moving an engine across any of these dimensions silently fails or quietly degrades. **Recommendation**: D-C10-7 = (b) self-describing filename. Filename schema example: `mixvpr_sm87_jp62_trt103_int8fp16.engine`, `lightglue_disk_sm87_jp62_trt103_fp16.engine`. Sidecar manifest.json captures full provenance for AC-NEW-3 FDR. Couples with D-C7-9 JetPack version pin |
|
||||
| **D-C10-8 (NEW from Sub-area 2 closure 2026-05-08 Fact #101, C10-only) — TensorRT prebuilt-fallback engine generation venue (build only on the deployed Jetson — minimal infra but blocks deployment until first build succeeds / build only on a reference Jetson at HQ — fastest deployment but loses per-target reproducibility per D-C7-7 = (c) primary path / **reference Jetson at HQ as canonical fallback corpus + deployed-Jetson-copy-to-archive on first successful local build RECOMMENDED — opportunistic redundancy + per-target validation + canonical fallback in case of pre-flight build failure**)** | Plan-phase architect + project bring-up team + C10 owner | Plan-phase decision conditional on Cand 1 being selected as primary; per D-C7-7 = (c), primary path is build-on-deployed-Jetson; fallback is reference-Jetson-built engines. **Recommendation**: D-C10-8 = (c) reference Jetson at HQ + deployed-Jetson-copy-to-archive on first successful local build. Reference Jetson must match deployed Jetson on (SM 87, JetPack 6.2, TensorRT 10.3, CUDA 12.6, cuDNN 9.3) per Source #105 + D-C7-9 lock. Provides AC-NEW-1 (8 h endurance, no infield infra) tolerance for the case where a freshly-deployed Jetson cannot complete a per-mission rebuild before takeoff |
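The manifest-hash-driven triggers recommended in D-C10-1 and D-C10-6 can be sketched as below. This is a minimal illustration, not the project's implementation: the function names, the manifest key `inputs_sha256`, and the way blob contents and index parameters are folded into one digest are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def inputs_digest(blob_paths, index_params):
    """SHA-256 over all descriptor blobs plus the index parameters.

    Paths are sorted so the digest does not depend on listing order.
    """
    h = hashlib.sha256()
    for path in sorted(Path(p) for p in blob_paths):
        h.update(path.name.encode("utf-8"))
        h.update(path.read_bytes())
    # Canonical JSON so that key order cannot change the digest.
    h.update(json.dumps(index_params, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


def needs_rebuild(manifest_path, blob_paths, index_params, force=False):
    """Return True when the cached index must be rebuilt (hypothetical helper)."""
    if force:  # stands in for the --force-rebuild operator override
        return True
    manifest_path = Path(manifest_path)
    if not manifest_path.exists():
        return True  # no previous build recorded
    recorded = json.loads(manifest_path.read_text()).get("inputs_sha256")
    return recorded != inputs_digest(blob_paths, index_params)
```

The same shape would apply to the calibration-cache trigger in D-C10-6, with the calibration corpus taking the place of the descriptor blobs.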
|
||||
@@ -0,0 +1,72 @@
|
||||
# Component Fit Matrix — C10: Pre-flight cache provisioning (cross-coupling minimal scope)
|
||||
|
||||
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate). C10 was promoted to its own row file on 2026-05-08 after user-locked scope narrowing (`c10_scope=C` cross-coupling minimal — see [`../00_question_decomposition.md` → "C10 Scope Restructure"](../00_question_decomposition.md)). Operator CLI/desktop tooling, sector classification heuristics, and tile age-stamping/freshness schema are **deferred to Plan-phase as `operator tooling design` out-of-research-scope**. C10 batch 1 covers only the two cross-coupling confirmation sub-areas: D-C6-3 (descriptor-cache rebuild trigger pipeline) and D-C7-7 (TensorRT engine-build pipeline).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5 State estimator](C5_state_estimator.md), [C6 Tile cache + spatial index](C6_tile_cache_spatial_index.md), [C7 On-Jetson inference runtime](C7_inference_runtime.md), [C8 MAVLink / MSP2 FC adapter](C8_fc_adapter.md). Cross-component gates: [`99_cross_component_gates.md`](99_cross_component_gates.md). C9 dropped per 2026-05-08 restructure.
|
||||
|
||||
---
|
||||
|
||||
## C10 — Pre-flight cache provisioning + sector classification (CROSS-COUPLING MINIMAL scope)
|
||||
|
||||
**Status**: CLOSED at 2/2 (batch 1 = 2 sub-areas; opened and closed 2026-05-08).
|
||||
|
||||
**Pinned input/output contract (per the locked C10 scope)**:
|
||||
- inputs:
|
||||
- `descriptor_blobs[*]` per tile = the per-tile global VPR descriptor (per D-C2-9 / D-C2-10 / D-C2-6 final lock: dimension d ∈ {256, 512, 1024, 2048, 4096} float32 or halfvec) — produced offline at C10 pre-flight by running C2 VPR backbone over each cached tile image.
|
||||
- `onnx_models[*]` per inference target = the ONNX-exported model graphs for C2 VPR backbone + C3 matcher + (optional) C1 learned VIO frontend, exported on the dev machine via `torch.onnx.export`.
|
||||
- `calibration_corpus` = real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles (per D-C7-1 closure, fixture-file pin delegated to Test Spec) — ~500-1,500 representative samples; binary tensor format `[N, C, H, W]`.
|
||||
- `target_jetson_uri` = SSH/serial address of the deployed Jetson Orin Nano Super target (or `localhost` when build runs on the deployed Jetson directly).
|
||||
- outputs:
|
||||
- **`/var/lib/onboard/cache/faiss/v_<descriptor_dim>_M<HNSW_M>.index`** = FAISS HNSW index file written via `faiss.write_index(index, path)`; loaded at takeoff via `faiss.read_index(path)`; sized at ~`(n_tiles × d × 2 B halfvec) + (n_tiles × M × 4 B graph links)`. Per-takeoff load latency target <5 s.
|
||||
- **`/var/lib/onboard/cache/trt/<model>_sm87_jp62_trt103_<precision>.engine`** = serialized TensorRT engine file produced by `trtexec` or `IBuilder.build_serialized_network()`; loaded at takeoff via `IRuntime.deserialize_cuda_engine()`; tied to SM 87 (Jetson Orin Nano Super Ampere) per Source #105.
|
||||
- **Build/rebuild manifest** = single JSON file recording `(model_name, precision_mode, calib_data_sha256, build_start_iso8601, build_duration_sec, engine_sha256, target_sm, jetpack_version, trt_version)` per engine; `(descriptor_dim, n_tiles, faiss_M, ef_construction, build_duration_sec, faiss_sha256)` per FAISS index. Fed into AC-NEW-3 FDR.
|
||||
- runtime context:
|
||||
- **Pre-flight only**, NOT runtime. Build/rebuild cost amortized across all takeoffs that use the same artifacts. Per-mission rebuild only if `calibration_corpus` or `descriptor_blobs[*]` changed (manifest-hash-driven).
|
||||
- Build runs ON the deployed Jetson Orin Nano Super (per D-C7-7 = primary build-on-target). Reference-Jetson-prebuilt engine fallback supported (per D-C7-7 = fallback path) when pre-flight build fails or is skipped.
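As a worked check of the index sizing formula in the outputs list above, the following sketch evaluates it for 100K tiles, 2048-D descriptors, and M=32 (the figures used elsewhere in this file). The function name and default byte widths are assumptions for illustration; the formula is the document's approximation and ignores any extra on-disk overhead of the real FAISS file.

```python
def estimate_index_bytes(n_tiles, dim, hnsw_m, bytes_per_component=2, bytes_per_link=4):
    """Approximate index size: descriptor storage plus HNSW graph links."""
    vectors = n_tiles * dim * bytes_per_component  # n_tiles x d x 2 B halfvec
    links = n_tiles * hnsw_m * bytes_per_link      # n_tiles x M x 4 B graph links
    return vectors + links


size = estimate_index_bytes(n_tiles=100_000, dim=2048, hnsw_m=32)
print(f"{size / 1e6:.1f} MB")  # 422.4 MB
```

The result is in the same range as the ~430 MB cache file size quoted for this configuration.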
|
||||
|
||||
---
|
||||
|
||||
## Candidate matrix (batch 1 CLOSED at 2/N on 2026-05-08)
|
||||
|
||||
| Sub-area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|
||||
|---|---|---|---|---|---|---|---|---|
|
||||
| **Sub-area 1: D-C6-3 confirmation** | Direct `faiss.write_index` / `faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate at takeoff + manifest-hash-driven rebuild trigger + `IO_FLAG_MMAP_IFC` mmap load | `faiss.IndexHNSWFlat(d=descriptor_dim, M=32)` build per pre-flight when `manifest_hash` changed; `faiss.write_index(index, temp_path)` + atomic-rename + content-hash; takeoff load via `faiss.read_index(target_path, faiss.IO_FLAG_MMAP_IFC)` after content-hash verification | Established production (FAISS MIT + python-atomicwrites MIT) + project-side orchestration wrapper | C6 ↔ C10 cross-component gate closure (D-C6-3 confirmation) | MVE: see [`../02_fact_cards/C10_preflight_provisioning.md` Fact #100](../02_fact_cards/C10_preflight_provisioning.md); docs: Source #114 (FAISS API), Source #115 (size formula), Source #116 (atomic write pattern) | None — content-hash gate mitigates the documented FAISS "no internal integrity check" warning per Source #114 | **Selected** | Closes D-C6-3 with idempotent + crash-safe + AC-NEW-7-compliant pipeline; license-clean; minimal abstraction surface; ~430 MB cache file at 2048-D halfvec × 100K tiles fits AC-8.3 + AC-4.2 + AC-NEW-1 budgets comfortably |
|
||||
| **Sub-area 2: D-C7-7 confirmation** | Hybrid orchestration: Polygraphy CLI primary for INT8-calibrating builds + `trtexec` for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API for unusual models (LightGlue dynamic shapes) | `polygraphy convert <model>.onnx --int8 --fp16 --data-loader-script ./calib_data_loader.py --calibration-cache <calib_cache> --workspace=1000000000 -o <engine>_sm87_jp62_trt103_<prec>.engine` (primary); `trtexec --onnx=... --saveEngine=... --fp16 --int8 --calib=... --shapes=...` (cache-reuse fallback); direct `IBuilderConfig` + `IInt8EntropyCalibrator2` Python API (escape hatch) | Established production NVIDIA-blessed orchestration (Polygraphy Apache-2.0; trtexec bundled with TensorRT 10.x Apache-2.0; direct API bundled with TensorRT 10.x) | C7 ↔ C10 cross-component gate closure (D-C7-7 confirmation) | MVE: see [`../02_fact_cards/C10_preflight_provisioning.md` Fact #101](../02_fact_cards/C10_preflight_provisioning.md); docs: Source #117 (Polygraphy CLI), Source #118 (Polygraphy Calibrator class), Source #119 (trtexec CLI), Source #120 (calib corpus size guidance), Source #121 (direct API cross-cite from C7 Source #105) | None — `trtexec --int8` without `--calib` random-data-fallback caveat is mitigated by project-side wrapper that enforces `--calib=<existing_cache>` non-empty as precondition | **Selected** | Closes D-C7-7 with hybrid tool matrix matching D-C10-5 = (d); operationalizes D-C7-1 closure (real UAV nadir flight footage corpus) via Polygraphy `--data-loader-script`; calibration-cache reuse keeps subsequent rebuilds <30 sec; license-clean Apache-2.0 throughout; engine cache files ~100-500 MB on disk separate from AC-8.3 tile cache budget |
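The write-to-temp, `fsync`, atomic-rename, parent-directory-`fsync` sequence and the content-hash gate selected for Sub-area 1 can be sketched with the standard library alone. This is the hand-rolled variant rather than `python-atomicwrites`, and the helper names are hypothetical. The payload bytes stand in for a serialized index, and the parent-directory `fsync` assumes a POSIX filesystem.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def atomic_write_bytes(target_path, data):
    """Write data so readers see either the old file or the complete new one."""
    parent = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, target_path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the parent directory so the rename itself survives power loss.
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return sha256_file(target_path)  # digest to record in the manifest


def verify_before_load(path, expected_sha256):
    """Content-hash gate: True only if the file matches the recorded hash."""
    return os.path.exists(path) and sha256_file(path) == expected_sha256
```

In this sketch the caller would refuse to load the index whenever `verify_before_load` returns `False`, which is where the reject-and-refuse-takeoff behavior of D-C10-3 would attach.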
|
||||
|
||||
---
|
||||
|
||||
## Working conclusions and decisions
|
||||
|
||||
**Selected primary**:
|
||||
- **D-C6-3 confirmation** (Sub-area 1): direct `faiss.write_index` / `faiss.read_index` Python API + `python-atomicwrites` + content-hash verification gate + manifest-hash-driven rebuild trigger + optional `IO_FLAG_MMAP_IFC` mmap load. **Closes the C6 ↔ C10 cross-component gate.**
|
||||
- **D-C7-7 confirmation** (Sub-area 2): hybrid Polygraphy + `trtexec` + direct `IBuilderConfig` Python API matrix per D-C10-5 = (d). Calibration corpus per D-C7-1 closure (real UAV nadir flight footage at ~1 km AGL over season-matched satellite tiles; specific fixture-file pin delegated to Test Spec). **Closes the C7 ↔ C10 cross-component gate.**
|
||||
|
||||
**Decisions raised (D-C10-N gates)** — see [`99_cross_component_gates.md`](99_cross_component_gates.md):
|
||||
|
||||
- **D-C10-1** (Fact #100) — descriptor-cache rebuild trigger choice — RECOMMENDED manifest-hash-driven + `--force-rebuild` override
|
||||
- **D-C10-2** (Fact #100) — descriptor-cache atomic-write strategy — RECOMMENDED `python-atomicwrites`; fallback hand-rolled
|
||||
- **D-C10-3** (Fact #100, CROSS-COMPONENT with AC-NEW-7) — content-hash verification gate at takeoff load — RECOMMENDED reject + STATUSTEXT + refuse takeoff
|
||||
- **D-C10-4** (Fact #100) — descriptor-cache load path — RECOMMENDED mmap with `madvise(MADV_WILLNEED)` pre-fault (or both for Plan-phase Jetson MVE)
|
||||
- **D-C10-5** (Fact #101, CROSS-COMPONENT with C7) — TensorRT engine-build orchestration tool choice — RECOMMENDED hybrid (Polygraphy + trtexec + direct API by use case)
|
||||
- **D-C10-6** (Fact #101, CROSS-COMPONENT with D-C7-1) — TensorRT calibration-cache reuse strategy — RECOMMENDED rebuild-on-calib-corpus-SHA-256-change + `--force-trt-rebuild` override
|
||||
- **D-C10-7** (Fact #101) — TensorRT engine on-disk filename schema — RECOMMENDED self-describing `<model>_sm<SM>_jp<JP>_trt<TRT>_<precision>.engine` filename + sidecar `manifest.json`
|
||||
- **D-C10-8** (Fact #101) — TensorRT prebuilt-fallback engine generation venue — RECOMMENDED reference Jetson at HQ + deployed-Jetson-copy-to-archive on first successful local build (opportunistic redundancy)
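The self-describing filename schema from D-C10-7 lends itself to a small build/parse helper. The sketch below is illustrative only: `EngineTag`, `parse_engine_filename`, and `is_compatible` are hypothetical names, and the regular expression assumes version fields are written as plain digits, as in the examples `sm87`, `jp62`, and `trt103`.

```python
import re
from dataclasses import dataclass

_ENGINE_RE = re.compile(
    r"^(?P<model>.+)_sm(?P<sm>\d+)_jp(?P<jp>\d+)_trt(?P<trt>\d+)"
    r"_(?P<precision>[a-z0-9]+)\.engine$"
)


@dataclass(frozen=True)
class EngineTag:
    model: str
    sm: str
    jp: str
    trt: str
    precision: str

    def filename(self):
        """Render the tag back into the on-disk filename."""
        return (
            f"{self.model}_sm{self.sm}_jp{self.jp}"
            f"_trt{self.trt}_{self.precision}.engine"
        )


def parse_engine_filename(name):
    """Split a self-describing engine filename into its fields."""
    match = _ENGINE_RE.match(name)
    if match is None:
        raise ValueError(f"not a self-describing engine filename: {name!r}")
    return EngineTag(**match.groupdict())


def is_compatible(tag, sm, jp, trt):
    """True when the engine was built for exactly this SM/JetPack/TRT triple."""
    return (tag.sm, tag.jp, tag.trt) == (sm, jp, trt)
```

A loader could call `is_compatible` before deserializing an engine, so that a file built for a different JetPack or TensorRT version is skipped rather than loaded.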
|
||||
|
||||
**C10 batch 1 closed at 2/N on 2026-05-08.** **No further C10 batches required at the research layer** — D-C6-3 and D-C7-7 cross-component gates are now closed; remaining C10 questions (operator CLI/desktop tooling, sector classification heuristics, freshness pipeline workflow) are deferred to Plan-phase per the 2026-05-08 `c10_scope=C` user choice.
|
||||
|
||||
---
|
||||
|
||||
## Out-of-research-scope items (deferred to Plan-phase)
|
||||
|
||||
The following items were originally part of C10's "Required outputs" per `../00_question_decomposition.md` line 78 but were narrowed out of research scope by user choice C on 2026-05-08:
|
||||
|
||||
| Deferred item | Plan-phase owner | Why it doesn't need research |
|
||||
|---|---|---|
|
||||
| Operator-side CLI/desktop tool design | Plan-phase architect + UX | Tool shape is a UX/integration decision; doesn't bind any architectural contract |
|
||||
| Sector classification (active-conflict vs stable rear) heuristics + interface | Plan-phase architect + operations team | AC-8.2 freshness threshold (6 mo vs 12 mo) is operational; heuristic source TBD (operator-marked geofence vs Suite Service metadata) |
|
||||
| Tile age-stamping schema beyond restrictions.md mandate | Plan-phase architect | Restrictions.md already mandates per-tile capture date in manifest; additional sector-class tag is a Plan-phase decision |
|
||||
| Freshness pipeline workflow | Plan-phase architect + operations team | When to re-pull from Suite Sat Service (every flight, weekly, on operator demand, on sector-class change) is operational |
|
||||
|
||||
These items will be revisited at Plan-phase. Their cross-coupling with the runtime architecture is mediated entirely by the descriptor-cache file (D-C6-3) and the TensorRT engine cache file (D-C7-7) — both pinned by C10 batch 1 confirmations.
|
||||
|
||||
---
|
||||
@@ -0,0 +1,47 @@
|
||||
# Component Fit Matrix — C1: Visual / Visual-Inertial Odometry
|
||||
|
||||
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate, structured per-component candidate-selection table). Status vocabulary in [`00_summary.md`](00_summary.md). Detailed fact cards backing every status verdict live in [`../02_fact_cards/C1_vio.md`](../02_fact_cards/C1_vio.md).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5–C10 pending](C5-C10_pending.md). Cross-component gates: [`99_cross_component_gates.md`](99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## C1 — Visual / Visual-Inertial Odometry [closed at documentary level, 2026-05-08]
|
||||
|
||||
**Pinned mode**: monocular + IMU on Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble); inputs `{1× ADTi 20MP nav frame stream + FC IMU via MAVLink/SCALED_IMU2}`; outputs `{6-DoF pose at IMU rate with metric scale + 6×6 covariance + source label visual_propagated when no satellite anchor}`.
|
||||
|
||||
**Locked-in research-time defaults** (per Fact #41, after user-skipped clarification on D-C1-1 and D-C1-2):
|
||||
- D-C1-1 = (c) **keep both license tracks open** through Plan; final license decision deferred to post-Jetson-MVE.
|
||||
- D-C1-2 = (b) **defer Jetson Orin Nano Super hardware MVE to a dedicated bring-up phase** between research and Plan; research closes with documentary ranking + per-candidate `Verify` gates.
|
||||
|
||||
| # | Candidate | License | Per-mode verification | Status | Lead reason / disqualifier | Sub-matrix cite |
|
||||
|---|---|---|---|---|---|---|
|
||||
| 1 | **OKVIS2 / OKVIS2-X** | BSD-3 (no copyleft) | ✅ Fact #39 + Source #56 | **Documentary lead — BSD/permissive track** | Strongest documentary mode-fit; structural sub-20-Hz tolerance via keyframe-based architecture (Fact #40); OKVIS2-X (T-RO 2025) GNSS fusion architecturally aligned with AC-NEW-2 spoof-promotion path | `../02_fact_cards/C1_vio.md` → "OKVIS2 / OKVIS2-X — per-numbered binding" |
|
||||
| 2 | **OpenVINS** | GPL-3.0 (copyleft) | ✅ Fact #37 + Source #54 | **Documentary lead — GPL-3.0 track** | Best Jetson Orin Nano Dev Kit + JetPack 6 + ROS 2 Humble build evidence (rpng/open_vins issue #421 + fdcl-gwu setup guide); MSCKF formulation more memory-efficient than full sliding-window optimization; documented Xavier NX 270 ms latency baseline at 640×480 | `../02_fact_cards/C1_vio.md` → "OpenVINS — per-numbered binding" |
|
||||
| 3 | **VINS-Mono** | GPL-3.0 (copyleft) | ✅ Fact #38 + Source #55 (with caveat) | **Experimental only — GPL-3.0 track alternate** | Single-mode by construction (mono+IMU); proven on original Jetson Nano (2021 KAIST + 2024 RPi CM4); ⚠️ documentary minimum image rate 20 Hz vs project 3 fps (Fact #40) → must be Jetson-MVE-validated at sub-20-Hz OR Plan must commit to dual-rate camera pipeline (Fact #40) before promotion | `../02_fact_cards/C1_vio.md` → "VINS-Mono — per-numbered binding" |
|
||||
| 4 | **Pure VO + external ESKF (C5)** | OpenCV-Apache-2.0 + project-internal | ✅ Source #53 + Fact #35 | **Mandatory simple-baseline** | Per Component Option Breadth rule — runnable fallback if all VIO leads fail Jetson MVE; trivial latency + memory footprint; FAILS C1's IMU-fusion + covariance bindings inherently (those are owned by the external C5 wrapper) | `../02_fact_cards/C1_vio.md` → "Pure VO baseline — per-numbered binding" |
|
||||
| 5 | **VINS-Fusion** | GPL-3.0 | (see Fact #29) | **Documentary lead — GPL-3.0 track redundant** | Same authors as VINS-Mono with multi-sensor superset; mono+IMU mode shares VINS-Mono's algorithmic core; fails to run on Jetson TX2 (KAIST 2021); within HKUST family, VINS-Mono is the cleaner C1 candidate for the project's pinned mode | (covered transitively by VINS-Mono row above; VINS-Fusion-specific Jetson TX2 failure is Fact #29) |
|
||||
| 6 | **Kimera-VIO** | BSD-2 | (see Fact #32) | **Conditional secondary fallback** | Permissive license is attractive but resource overhead (3D mesh + semantic mesher) is poor fit under co-resident process pressure; failed Xavier NX 8 GB shared in KAIST 2021 multi-process benchmark | (no per-numbered sub-matrix this session; deferred — only lifts to lead if both BSD lead OKVIS2 and the GPL-3.0 leads fail Jetson MVE) |
|
||||
| 7 | **DPVO / DPV-SLAM** | MIT | (see Fact #34) | **Conditional — VO not VIO** | Mono VO only (no native IMU fusion); requires external IMU wrapper to satisfy the C1 mandate; DPVO-QAT++ (Nov 2025) shows 1.02 GB peak memory on RTX 4060; Jetson Orin Nano untested; operational complexity of teacher-student QAT pipeline is high vs classical candidates | (no per-numbered sub-matrix this session; lifted from C1 as VO-only candidate per Fact #34) |
|
||||
| 8 | **DROID-SLAM** | (project repo) | (see Fact #33) | **Rejected — disqualified by AC-4.2** | ≥11 GB GPU VRAM inference budget exceeds the project's 8 GB shared LPDDR5; mono VO/SLAM (no IMU fusion); arbitrary scale (no metric recovery without external alignment) | (no sub-matrix; rejected on AC-4.2 alone) |
|
||||
| 9 | **RTAB-Map** | BSD | (see Fact #16) | **Rejected — disqualified by SPRIN-D evidence** | Failed beyond 1 km / above 2 m/s flight in SPRIN-D environment; project cruise (≤17 m/s, kilometers between satellite anchors) explicitly excludes | (no sub-matrix; rejected on Fact #16) |
|
||||
| 10 | **ORB-SLAM3** | GPL-3.0 | (see Fact #16) | **Rejected — disqualified by SPRIN-D evidence** | Same as RTAB-Map | (no sub-matrix; rejected on Fact #16) |
|
||||
|
||||
### C1 — Per-license-track preliminary ranking (final ranking pending Jetson MVE)
|
||||
|
||||
**BSD/permissive track** (track lead under D-C1-1 = (b) or default (c)):
|
||||
1. **OKVIS2 / OKVIS2-X** — Documentary lead; structural sub-20-Hz advantage; OKVIS2-X GNSS-fusion architectural alignment with AC-NEW-2.
|
||||
2. (alternates) Kimera-VIO (Conditional); Pure VO + external ESKF (Mandatory simple-baseline).
|
||||
|
||||
**GPL-3.0 track** (track lead under D-C1-1 = (a) or default (c)):
|
||||
1. **OpenVINS** — Documentary lead; best Jetson Orin Nano build evidence; MSCKF memory advantage.
|
||||
2. **VINS-Mono** — Experimental only until Jetson MVE validates sub-20-Hz operation OR Plan commits to dual-rate pipeline (Fact #40).
|
||||
3. (alternate) VINS-Fusion — within HKUST family, VINS-Mono is the cleaner pick.
|
||||
|
||||
### C1 — Plan-phase deliverables raised by closure
|
||||
|
||||
1. **D-C1-1 license posture A/B/C** — must be presented to user as a structured Choose block at Plan time, with the documentary evidence above as input.
|
||||
2. **D-C1-2 Jetson Orin Nano Super hardware MVE** — must be executed as a dedicated bring-up phase between research and Plan; produces a single MVE artifact that promotes the surviving Documentary leads to Selected.
|
||||
3. **Single-rate vs dual-rate nav-camera pipeline (Fact #40)** — must be decided at Plan time; affects which C1 candidates remain on documentary lead vs Experimental status; affects C2/C3 candidate scoring in their respective rows.
|
||||
|
||||
---
|
||||
@@ -0,0 +1,87 @@
|
||||
# Component Fit Matrix — C2: Visual Place Recognition (VPR)
|
||||
|
||||
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate, structured per-component candidate-selection table). Status vocabulary in [`00_summary.md`](00_summary.md). Detailed fact cards backing every status verdict live in [`../02_fact_cards/C2_vpr.md`](../02_fact_cards/C2_vpr.md).
|
||||
>
|
||||
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C3 Matchers](C3_matchers.md), [C4 Pose](C4_pose_estimation.md), [C5–C10 pending](C5-C10_pending.md). Cross-component gates: [`99_cross_component_gates.md`](99_cross_component_gates.md).
|
||||
|
||||
---
|
||||
|
||||
## C2 — Visual Place Recognition (VPR) [mandatory pre-screen CLOSED at documentary level 2026-05-08; conditional candidates AnyLoc/BoQ/DINOv2-VLAD remain GATED on prerequisite INT8 quantization survey]
|
||||
|
||||
**Pinned mode** (per-frame retrieval contract, identical for every C2 candidate; per-candidate mode variations are: input image size, backbone, descriptor dimensionality, training-domain provenance, inference runtime):
|
||||
|
||||
- inputs: `{1× ADTi 20MP nav frame stream → center-cropped + bilinearly downscaled to candidate's native input size + ImageNet-normalised}` on `Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble; PyTorch fp16 baseline; final inference runtime selection deferred to C7)`
|
||||
- outputs: `{global descriptor per frame; cosine top-K (K=10 per Fact #25) retrieval against pre-cached descriptor table over the project's ~400 km² operational area's tiles at AC-8.1 resolution floor (≥0.5 m/px)}` feeding C3 (cross-domain matcher)
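The cosine top-K retrieval named in the output contract above can be illustrated with a brute-force search over a small in-memory table. This is a pure-Python sketch for clarity only; the function names and the dict-based table are assumptions, and it is not a substitute for the FAISS-backed retrieval over the pre-cached descriptor table.

```python
import heapq
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length so dot product equals cosine."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("zero-norm descriptor")
    return [x / norm for x in vec]


def cosine_top_k(query, table, k=10):
    """Return [(tile_id, similarity)] for the k most similar descriptors.

    `table` maps tile_id -> descriptor (hypothetical in-memory stand-in
    for the pre-cached descriptor table).
    """
    q = l2_normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, l2_normalize(desc))), tile_id)
        for tile_id, desc in table.items()
    )
    best = heapq.nlargest(k, scored)  # highest similarity first
    return [(tile_id, score) for score, tile_id in best]
```

With K=10 as pinned above, the returned tile IDs would be the candidates handed to the C3 matcher.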
|
||||
|
||||
**Locked-in research-time defaults** (carried forward from C1 — Fact #41):
|
||||
- D-C1-1 = (c) **keep both license tracks open** through Plan; final license decision deferred to post-Jetson-MVE.
|
||||
- D-C1-2 = (b) **defer Jetson Orin Nano Super hardware MVE to a dedicated bring-up phase** between research and Plan; research closes with documentary ranking + per-candidate `Verify` gates.
|
||||
- **C2-specific**: most C2 candidates are MIT/Apache permissive — license-track concern is less material than C1's GPL-3.0 vs BSD split; this row tracks license but does not split by track.
|
||||
|
||||
| # | Candidate | License | Per-mode verification | Status | Lead reason / disqualifier | Sub-matrix cite |
|
||||
|---|---|---|---|---|---|---|
|
||||
| 1 | **MixVPR** (ResNet50+MixVPR @ 320×320 → 2048-D) | MIT (BSD/permissive track) | ✅ Fact #42 + Source #57 + #58 | **Documentary lead with aerial-domain-training caveat** | OpenVPRLab canonical reference implementation; runnable per-mode example with project's pinned config; FAISS retrieval harness; descriptor cache ~650 MB fp16 within 10 GB AC-8.3 budget; 1.21 ms A100 latency baseline extrapolates well within AC-4.1 budget. **Caveat**: canonical weights are GSV-Cities (street-view) trained — Plan-phase decision required between (a) project-domain retrain on AerialVL, (b) aerial-trained community checkpoint, (c) elevate alternate C2 candidate | `../02_fact_cards/C2_vpr.md` → "MixVPR — per-numbered binding" |
|
||||
| 2 | **SALAD** (DINOv2 ViT-B/14 + SALAD aggregator @ 322×322 → 8448-D full / 2112-D / 544-D slim) | **GPL-3.0** (canonical, GPL-3.0 track) | ✅ Fact #43 + Source #59 + #60 + #61 | **Documentary lead with aerial-domain-training caveat + GPL-3.0-license-track caveat + DINOv2-ViT-export risk caveat** | Canonical CVPR 2024 implementation (`serizba/salad`); Torch-Hub one-liner `torch.hub.load("serizba/salad", "dinov2_salad")` for full variant; eval CLI ships three pretrained checkpoints (full 8448-D, slim 2112-D, slim 544-D); 2.41 ms RTX 3090 latency baseline extrapolates ~20–30 ms on Jetson Orin Nano Super at fp16 with TensorRT; **+11 R@1 absolute over MixVPR on MSLS Challenge** (75.0 vs 64.0 per paper Table 1) and **+17.6 R@1 on NordLand** (76.0 vs 58.4) — strongest cross-season generalization signal among the documented C2 candidates. Single-stage design (no re-ranking), built-in dustbin discards uninformative regions, optimal-transport assignment is bidirectional (feature-to-cluster + cluster-to-feature). **Three caveats vs MixVPR**: (i) GPL-3.0 license places SALAD on copyleft track — interacts with D-C1-1 license posture; under BSD/permissive lock at Plan, SALAD is excluded; (ii) DINOv2 ViT-B export to TensorRT fp16/INT8 on Jetson is paper-acknowledged "slower than ResNet" + industry-known harder than CNN export — D-C2-5 deferred Jetson MVE risk; (iii) full 8448-D descriptor cache consumes ~2.7 GB / ~27% of 10 GB AC-8.3 cache budget vs MixVPR's ~650 MB / 6.5% — D-C2-6 descriptor-size-choice trade-off; slim 544-D variant restores feasibility (~0.17 GB / 1.7%) at cost of ~5 R@1 points on MSLS Challenge | `../02_fact_cards/C2_vpr.md` → "SALAD — per-numbered binding" |
|
||||
| 3 | **SelaVPR** (DINOv2 ViT-L/14 frozen + Global+Local Adaptation adapters @ 224×224 → 1024-D global + 61×61×128 dense local; two-stage retrieval+rerank) | **MIT** (BSD/permissive track) | ✅ Fact #44 + Source #62 + #63 + #61 | **Documentary lead with aerial-domain-training caveat + DINOv2-ViT-L-export risk caveat (HARSHER than SALAD-ViT-B) + two-stage-latency-and-local-feature-cache-strategy risk caveat** | Canonical ICLR 2024 implementation (`Lu-Feng/SelaVPR`); training+eval CLIs (`python3 train.py --foundation_model_path=/path/to/dinov2_vitl14_pretrain.pth`, `python3 eval.py --rerank_num={20,100}`); two pretrained checkpoints (MSLS-finetuned for diverse scenes / Pitts30k-further-finetuned for urban) + optional `--registers` variant. **First DINOv2-based C2 candidate on BSD/permissive track — materially expands BSD/permissive C2 axis options vs MixVPR-only state**. RTX-3090 baseline 0.027 s extraction + 0.085 s matching (rerank_num=100) = 0.112 s total per query (paper Table 3). Extrapolation to Jetson Orin Nano Super: ~200–270 ms extraction + ~150 ms matching at rerank_num=20 = ~350 ms (tight against AC-4.1 400 ms budget; **rerank_num=100 FAILS budget**). Global descriptor 1024-D = ~320 MB cache (smallest of all C2 candidates, 3.2% of 10 GB AC-8.3 budget); **dense 61×61×128 local-feature cache ~150 GB across operational area = INFEASIBLE without D-C2-7 mitigation strategy** (cache global only + on-demand local-feature re-extraction, OR precompute top-K, OR disable rerank). **Three caveats vs MixVPR**: (i) DINOv2-ViT-L (300M params) backbone is 3.5× larger than SALAD-ViT-B's 86M and 12× larger than MixVPR-ResNet50's 25M — D-C2-5 export risk **harshest in C2 row so far**; counter-mitigation by frozen-backbone canonical TensorRT export pathway (FB AI Public Files); (ii) two-stage retrieval+rerank is structurally novel — D-C2-7 strategy choice; (iii) input size 224×224 is more aggressive downscale from 5472×3648 than MixVPR's 320×320 / SALAD's 322×322. **Documentary advantage**: Tokyo24/7 R@1=94.0 (best in paper Table 2 across all compared methods, +9 absolute over MixVPR's 85.1, +5.4 over prior SOTA R²Former 88.6); Nordland-test R@1=85.2 (vs SALAD's 76.0 and MixVPR's 58.4) — strongest cross-illumination + cross-season generalization signal among C2 candidates so far on ground-level (aerial unverified — D-C2-1) | `../02_fact_cards/C2_vpr.md` → "SelaVPR — per-numbered binding" |
| 4 | **NetVLAD** (VGG-16 cropped at conv5_3 + NetVLAD pooling `vlad_preL2_intra` K=64 + PCA-whitening @ 224×224 → 4096-D global descriptor; canonical Pittsburgh-30k-pretrained variant) | **MIT** canonical (`Relja/netvlad`); **license-uncertain** Nanne PyTorch port (BSD/permissive track on canonical) | ✅ Fact #45 + Source #64 + #65 + #66 | **Mandatory simple-baseline** with MIT license + license-uncertain-Nanne-port caveat + established-baseline-accuracy-deficit-as-feature + runtime-stack-port-risk caveat + 4096-D-descriptor-cache caveat + aerial-domain-training caveat | Canonical learned-VLAD reference baseline for the entire VPR field (CVPR 2016, > 4000 citations); cited as the baseline in every subsequent VPR paper (MixVPR Table 1+4, SALAD Table 1, SelaVPR Table 2+3, AnyLoc, BoQ). Role per engine Component Option Breadth rule: **mandatory simple-VLAD baseline that establishes the long-established reference point against which modern C2 leads must show measurable advantage to justify added complexity**. Documented Recall@K deficit vs modern leads is expected and IS the role's purpose: Pitts30k-test R@1=84.1 (paper) / 85.2 (PyTorch reproduction) — **5-11 absolute below** MixVPR/SALAD/SelaVPR; Tokyo24/7 R@1=73.3 — **11.8-20.7 absolute below**; Nordland-test R@1≈33 — **25-52 absolute below**. **POSITIVE structural advantage**: VGG-16 backbone + single-stage retrieval = **LOWEST D-C2-4 + D-C2-5 risk among C2 candidates** (VGG-16 has the most-export-friendly TensorRT pathway; no DINOv2 ViT export-risk applies; no two-stage re-ranking latency variance; no local-feature cache pressure). **NEW caveats vs prior C2 candidates**: (i) runtime-stack port-risk (canonical MATLAB + MatConvNet not deployable on JetPack 6 → PyTorch port required); (ii) Nanne port license-uncertainty (README does NOT cite LICENSE file → Plan-phase verification gate); (iii) 4096-D PCA-whitened descriptor consumes ~13% of AC-8.3 cache budget — largest descriptor cache on the BSD/permissive track so far; only SALAD-full's 8448-D at ~27% is larger (256-D / 512-D `cropToDim` variants documented for tighter budgets at cost of further Recall@K loss). | `../02_fact_cards/C2_vpr.md` → "NetVLAD — per-numbered binding" |
| 5 | **EigenPlaces** (ResNet50 + GeM + FC @ 224×224 → 2048-D global descriptor; canonical PyTorch-Hub best-Recall@K variant) | **MIT** (BSD/permissive track) | ✅ Fact #46 + Source #67 + #68 | **Documentary lead with aerial-domain-training caveat + structurally-simplest-modern-competitive-CNN advantage + 60%-less-VRAM-retrain advantage + viewpoint-robust-training-paradigm advantage + extreme-cross-season-third-place caveat** | Canonical ICCV 2023 implementation (`gmberton/EigenPlaces`); PyTorch Hub one-liner `torch.hub.load("gmberton/eigenplaces", "get_trained_model", backbone="ResNet50", fc_output_dim=2048)`; eleven canonical pretrained checkpoints PyTorch-Hub-distributed (more than any other C2 candidate evaluated); companion `gmberton/VPR-methods-evaluation` fair-comparison harness. **Three POSITIVE structural advantages vs all prior C2 candidates**: (i) **STRUCTURALLY-SIMPLEST MODERN COMPETITIVE CNN ARCHITECTURE** in C2 row (ResNet-50 + GeM + FC — fewer moving parts than MixVPR's MLP-Mixer / SALAD's optimal-transport+DINOv2-B / SelaVPR's two-stage DINOv2-L+adapters / NetVLAD's soft-assignment+PCA-whitening) → **lowest D-C2-4 + D-C2-5 risk among modern competitive C2 leads**, ~15-30 ms total per frame on Jetson Orin Nano Super extrapolation, ~58 MB total weights at fp16 (on par with MixVPR's ~50 MB and far below the ViT- and VGG-based candidates); (ii) **60%-LESS-VRAM-RETRAIN advantage** vs MixVPR (paper §4.4: <7 GB VRAM training vs MixVPR's 18 GB at canonical batch) → **most retrain-friendly C2 candidate for D-C2-1 aerial-domain retrain decision**; (iii) **VIEWPOINT-ROBUST TRAINING PARADIGM** (paper §3 lateral+frontal CosFace dual loss with SVD-based class construction — explicitly designed for viewpoint shifts) → **most semantically-aligned training prior for aerial nadir VPR** where UAV multi-heading flights generate exactly the multi-viewpoint training signal EigenPlaces is designed to exploit. **Documented Recall@1 vs other C2 candidates (best-config-of-each)**: Pitts30k 92.5 (vs MixVPR 91.5, SALAD 95.1 as reported in a different paper, SelaVPR 92.8, NetVLAD 84.1); Tokyo24/7 **93.0** (best in EigenPlaces paper Tab 3 across all compared methods, second only to SelaVPR's 94.0 in C2 row); AmsterTime **48.9** (BEST in C2 row for extreme decade-scale cross-time domain shift — relevant to Ukraine-active-conflict scene-change scenarios); SF-XL test v1 **84.1** (BEST in row, +44 over NetVLAD); Nordland 71.2 (third in C2 row — SelaVPR wins by +14 absolute; viewpoint-robustness comes at the cost of being weaker than DINOv2-based on extreme cross-season); SVOX-Night 58.9 (fourth — MixVPR-4096 wins by +5.5). Cache footprint at 2048-D = ~650 MB / 6.5% (identical to MixVPR-2048; smaller sibling modes 128/256/512-D documented as PyTorch-Hub-distributed). **Closes the BSD/permissive C2 axis with a 4th materially-different design point** alongside MixVPR + SelaVPR + NetVLAD. README explicitly recommends MegaLoc as a SOTA successor — for the project's mandatory-pre-screen role this is acceptable; Plan-phase may want to also evaluate MegaLoc as a separately-cataloged sibling/successor candidate | `../02_fact_cards/C2_vpr.md` → "EigenPlaces — per-numbered binding" |
| 6 | **AnyLoc** (DINOv2 ViT-G+VLAD) | (TBD) | NOT STARTED — conditional on INT8 quantization | **Conditional** | DINOv2 ViT-G is too large for Jetson Orin Nano Super at fp16; INT8 quantization path is the only route to inclusion (per Fact #26 pre-screen rule) | (conditional next session) |
| 7 | **BoQ** (DINOv2 ViT-B+BoQ) | MIT | NOT STARTED — conditional on INT8 quantization | **Conditional** | Same author as MixVPR (amaralibey); also bundled in OpenVPRLab; transformer-based aggregation with learnable queries; Jetson cost of DINOv2 ViT-B + BoQ requires INT8 path | (conditional next session) |
| 8 | **DINOv2-VLAD** (DINOv2 direct + VLAD pooling) | (TBD) | NOT STARTED — conditional on INT8 quantization | **Conditional** | Heaviest of the conditional candidates; only worthwhile if INT8 path proven for any other DINOv2-based candidate first | (conditional next session) |
| 9 | **SuperGlue-as-reranker** | (N/A) | (pruned outright per pre-screen) | **Pruned outright** | Matcher-class, not VPR-class; no global-descriptor stage | (no entry; pruned at SQ3+SQ4 pre-screen) |
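
The NetVLAD baseline-deficit ranges quoted in row 4 follow directly from the R@1 figures in this table. A minimal sketch of that arithmetic, assuming only the Tokyo24/7 and Nordland numbers stated in the rows above; the dictionary and function names are illustrative and do not come from any fact card:

```python
# R@1 values as quoted in the C2 rows above (ground-level benchmarks).
# The NetVLAD Nordland figure is approximate ("R@1≈33" in row 4).
NETVLAD_R1 = {"Tokyo24/7": 73.3, "Nordland": 33.0}
MODERN_R1 = {
    "Tokyo24/7": {"MixVPR": 85.1, "SelaVPR": 94.0},
    "Nordland": {"MixVPR": 58.4, "SALAD": 76.0, "SelaVPR": 85.2},
}


def deficit_range(dataset):
    """Return (min, max) absolute R@1 gap between modern leads and NetVLAD."""
    baseline = NETVLAD_R1[dataset]
    gaps = [round(r1 - baseline, 1) for r1 in MODERN_R1[dataset].values()]
    return min(gaps), max(gaps)


for name in NETVLAD_R1:
    low, high = deficit_range(name)
    print(f"{name}: NetVLAD trails the modern leads by {low} to {high} R@1 points")
```

The same helper could be extended with Pitts30k once a single MixVPR figure is settled on, since the rows above quote both ~90 and 91.5 for that benchmark.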
### C2 — Plan-phase deliverables raised by MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces closures (mandatory pre-screen complete; conditional candidates may compound)
1. **D-C2-1 VPR canonical-weights vs aerial-retrain vs aerial-community-checkpoint** (raised by MixVPR; reaffirmed by SALAD + SelaVPR with identical caveat) — must be presented to user as a structured Choose block at Plan time; applies to **every** ground-level-pretrained C2 candidate, so the decision is project-wide. Options: (a) project-domain retrain on AerialVL / AerialExtreMatch, (b) source aerial-trained community checkpoint at Plan time, (c) elevate a candidate with already-aerial-trained weights as the C2 lead.
2. **D-C2-2 descriptor-cache carve-out vs raw-tile-cache budget** (raised by MixVPR; harshened by SALAD; **materially-changed-shape by SelaVPR**) — AC-8.3 explicitly requires a Plan-phase decision on whether the C2 descriptor table is part of the 10 GB cache budget or carved out separately. Per-variant lower bounds (global-descriptor stage only): **SelaVPR 1024-D ~320 MB fp16 / 3.2%** (smallest canonical-variant cache so far); MixVPR 2048-D ~650 MB fp16 / 6.5%; SALAD-slim 544-D ~0.17 GB / 1.7%; SALAD-slim 2112-D ~0.68 GB / 6.8%; **SALAD-full 8448-D ~2.7 GB / 27%**; conditional candidates (AnyLoc 49152-D, BoQ 16384-D, DINOv2-VLAD) push descriptor cache to ~10 GB alone, forcing the carve-out decision. **NEW SelaVPR-specific local-feature-cache pressure**: 61×61×128 dense local features × 160k tiles × 2 bytes (fp16) = ~150 GB, which is fundamentally infeasible without a D-C2-7 mitigation strategy.
3. **D-C2-3 input-resolution shape (224×224 vs 320×320 vs 322×322 vs higher)** (raised by MixVPR/SALAD at 320–322; **harshened by SelaVPR's 224×224**) — SelaVPR's 224×224 is more aggressive downscale from 5472×3648 than MixVPR's 320×320 / SALAD's 322×322; SelaVPR is at the small-input extreme of the C2 candidate space, MixVPR + SALAD are at the medium-input baseline, AnyLoc + BoQ may be at the higher-resolution end (next sessions). Plan-phase decision after all C2 candidates have per-Mode entries.
4. **D-C2-4 deferred Jetson Orin Nano Super hardware MVE phase coverage** (raised by MixVPR; broadened by SALAD; **broadened further by SelaVPR**) — same artifact as D-C1-2; must now also produce DINOv2 ViT-B AND ViT-L → TensorRT fp16/INT8 export-quality numbers + per-C2-candidate latency + memory + AerialExtreMatch Recall@K numbers + SelaVPR two-stage re-ranking latency profile + on-demand local-feature extraction performance; promotes Documentary leads to Selected.
5. **D-C2-5 DINOv2 ViT-export to TensorRT fp16/INT8 path on Jetson Orin Nano Super** (raised by SALAD; **harshened by SelaVPR**) — applies to every ViT-based C2 candidate (SALAD-ViT-B, SelaVPR-ViT-L, AnyLoc-ViT-G, BoQ-ViT-B, DINOv2-VLAD). SelaVPR's ViT-L is 3.5× larger than SALAD's ViT-B; export risk profile materially elevated. Counter-mitigation by frozen-backbone canonical TensorRT export pathway (SelaVPR's frozen DINOv2-L weights have a well-documented optimization path via FB AI Public Files distribution, vs SALAD's fine-tuned-backbone). Jetson MVE prerequisite for any ViT-based C2 candidate to advance from Documentary lead to Selected. Likely rolls into D-C1-2 + the C7 inference-runtime row.
6. **D-C2-6 SALAD descriptor-size choice (8448-D / 2112-D / 544-D)** (raised by SALAD only — does not apply to SelaVPR which has fixed 1024-D global descriptor) — Plan-phase trade-off; full variant gives best R@1 (MSLS Challenge 75.0) but consumes ~27% of cache budget; slim 2112-D variant (R@1 73.7) consumes ~6.8%; slim 544-D variant (R@1 70.8) fits within 1.7% of cache budget. Interacts with D-C2-2.
7. **D-C2-7 (NEW from SelaVPR closure 2026-05-08) — SelaVPR re-ranking strategy choice (full re-rank with on-demand local-feature extraction / cache top-K local features per likely query path / disable re-ranking entirely and use SelaVPR-global-only mode)** — only applies to SelaVPR (first two-stage C2 candidate evaluated). Plan-phase decision; full re-rank at rerank_num=100 fails the AC-4.1 latency budget on Jetson extrapolation; rerank_num=20 fits but is tight; on-demand local-feature extraction + global-only-cache (~320 MB) is the most cache-efficient mitigation; precompute-top-K-local-features (~3 GB at K=20 with selective coverage) is the moderate option; disable-rerank gives up the two-stage advantage and drops MSLS-challenge R@1 from 73.5 to 69.6 (still ahead of MixVPR's 64.0). **Interacts with D-C2-2 (cache carve-out), AC-8.3 (10 GB budget), and AC-4.1 (400 ms latency budget)**.
8. **D-C2-8 (NEW from NetVLAD closure 2026-05-08) — NetVLAD PyTorch-port-strategy choice (Nanne/pytorch-NetVlad with license-uncertainty / re-port from canonical Relja/netvlad with MIT preservation / OpenVPRLab-NetVLAD-on-ResNet50 as separately-cataloged sibling mode)** — only applies to NetVLAD (canonical implementation is MATLAB + MatConvNet, not deployable on JetPack 6). Plan-phase decision; Nanne port is fastest path but README does NOT cite a LICENSE file — Plan-phase verification gate is required before Nanne adoption; re-port from canonical Relja/netvlad MATLAB to PyTorch directly preserves MIT licensing but requires ~1 week of engineering + cluster-init prerequisite; OpenVPRLab-NetVLAD-on-ResNet50 (per Source #57) is apples-to-apples vs MixVPR but is a *different mode* per Per-Mode API rule (different backbone, different pretrained checkpoint provenance) and would be cataloged as a separate sibling candidate. Recommendation: re-port from canonical to preserve MIT licensing alignment with MixVPR + SelaVPR on the BSD/permissive track.
9. **D-C2-9 (NEW from NetVLAD closure 2026-05-08) — NetVLAD descriptor-dimension choice (canonical 4096-D PCA-whitened / 512-D `cropToDim` for tighter cache / 256-D `cropToDim` for tightest cache)** — only applies to NetVLAD; analogous to D-C2-6 SALAD descriptor-size choice but for NetVLAD's PCA-whitened output. Plan-phase decision; canonical 4096-D consumes ~1.3 GB / 13% of 10 GB AC-8.3 cache budget — largest descriptor cache on the BSD/permissive track so far (only SALAD-full's 8448-D at ~27% is larger); 512-D `cropToDim` reduces to ~160 MB / 1.6% at additional Recall@K loss; 256-D `cropToDim` reduces to ~80 MB / 0.8% at further loss. `cropToDim` is only valid for `+whitening` networks. Interacts with D-C2-2 carve-out decision.
10. **D-C2-10 (NEW from EigenPlaces closure 2026-05-08) — EigenPlaces descriptor-dimension choice (canonical 2048-D / 512-D / 256-D / 128-D — eleven backbone+dim sibling modes documented PyTorch-Hub-distributed)** — only applies to EigenPlaces; analogous to D-C2-6 SALAD and D-C2-9 NetVLAD descriptor-dimension choices. Plan-phase decision; canonical ResNet-50 + 2048-D consumes ~650 MB / 6.5% of AC-8.3 cache budget (identical to MixVPR-2048 for direct apples-to-apples comparison); 512-D variant reduces to ~160 MB / 1.6% at modest Recall@1 loss (paper Tab 3: Pitts30k 91.9 at 512 vs 92.5 at 2048 = -0.6, Tokyo24/7 89.8 at 512 vs 93.0 at 2048 = -3.2 — extreme cross-domain hurts most); 256-D reduces to ~80 MB / 0.8% at moderate Recall@K loss; 128-D reduces to ~40 MB / 0.4% at substantial Recall@K loss on cross-domain (paper §4.3 explicit observation that lower-D variants struggle on AmsterTime/Tokyo24/7/SVOX-Night). Interacts with D-C2-2 carve-out decision.
11. **D-C2-11 (NEW from EigenPlaces closure 2026-05-08, conditional) — MegaLoc successor evaluation as separately-cataloged sibling candidate** — EigenPlaces canonical README explicitly recommends MegaLoc as a SOTA successor ("EigenPlaces is quite old. Looking for SOTA Visual Place Recognition (VPR)? Check out MegaLoc"). Plan-phase decision: (a) treat MegaLoc as a separately-cataloged sibling candidate at Plan time (would require its own per-mode API capability verification + sub-matrix), (b) defer MegaLoc evaluation to a post-research session if EigenPlaces fails Jetson MVE, (c) skip MegaLoc and rely on the closed mandatory pre-screen (5/5: MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces). Recommendation: defer to post-research session — EigenPlaces closes the mandatory pre-screen at the documentary-required floor, and MegaLoc's Plan-phase relevance depends on which D-C1-1 license-track is chosen and how Jetson MVE results land.
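
The cache figures quoted in D-C2-2, D-C2-6, D-C2-9 and D-C2-10 all come from the same multiplication. A minimal sketch of it, assuming the 160k tile count and fp16 storage stated in D-C2-2 and treating the AC-8.3 budget as 10 decimal GB (that unit choice is an assumption; the helper names are illustrative):

```python
TILE_COUNT = 160_000        # tile count used in D-C2-2
BYTES_PER_VALUE = 2         # fp16
CACHE_BUDGET_BYTES = 10e9   # AC-8.3 budget, assumed to mean decimal GB


def global_cache_bytes(dim, tiles=TILE_COUNT):
    """Bytes needed to store one global descriptor of `dim` values per tile."""
    return dim * BYTES_PER_VALUE * tiles


def dense_local_cache_bytes(height, width, channels, tiles=TILE_COUNT):
    """Bytes needed to store a dense local-feature map per tile."""
    return height * width * channels * BYTES_PER_VALUE * tiles


def budget_share(num_bytes):
    """Percentage of the AC-8.3 cache budget consumed by `num_bytes`."""
    return 100.0 * num_bytes / CACHE_BUDGET_BYTES


GLOBAL_DIMS = {
    "SALAD-slim 544-D": 544,
    "SelaVPR 1024-D": 1024,
    "MixVPR / EigenPlaces 2048-D": 2048,
    "SALAD-slim 2112-D": 2112,
    "NetVLAD 4096-D": 4096,
    "SALAD-full 8448-D": 8448,
}

for name, dim in GLOBAL_DIMS.items():
    size = global_cache_bytes(dim)
    print(f"{name:<28} {size / 1e9:5.2f} GB  {budget_share(size):5.1f}% of budget")

local = dense_local_cache_bytes(61, 61, 128)
print(f"SelaVPR dense local features: {local / 1e9:.0f} GB")
```

The results agree with the figures in the list to within rounding (for example, 1024-D comes out at about 0.33 GB where the list says ~320 MB), which suggests the list mixes binary and decimal megabytes in places.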
### C2 — Per-candidate ranking (mandatory pre-screen complete at 5 of 5; final ranking deferred to Jetson MVE phase)
Status: **5 of 5 mandatory pre-screen candidates** have per-Mode entries (MixVPR + SALAD + SelaVPR + NetVLAD + EigenPlaces). Final ranking deferred to D-C1-2 + D-C2-4 dedicated Jetson Orin Nano Super hardware MVE phase between research and Plan. **Conditional pre-screen candidates (AnyLoc / BoQ / DINOv2-VLAD) remain GATED on a prerequisite INT8 quantization survey** before they can be added to per-mode rows (per Fact #26 pre-screen rule).

Per-license-track preliminary picture (final for the mandatory pre-screen; will be re-ranked at the Jetson MVE phase if conditional candidates are added):

**BSD/permissive track** (track lead under D-C1-1 = (b) or default (c)):
1. **SelaVPR** — Documentary lead with three caveats (MIT, 1024-D global + 61×61×128 dense local, 224×224 input, DINOv2 ViT-L/14 frozen backbone, two-stage retrieval+rerank, `Lu-Feng/SelaVPR` canonical implementation). **Strongest documentary cross-illumination + cross-season recall numbers** among C2 candidates so far on ground-level: Tokyo24/7 R@1=94.0 (best in paper Table 2 across all compared methods including SOTA R²Former 88.6) and Nordland-test R@1=85.2 (vs SALAD's 76.0 and MixVPR's 58.4). Carries D-C2-5 (DINOv2-ViT-L export risk, harshest in C2 row) + D-C2-7 (re-ranking strategy choice, NEW) + D-C2-3 (smallest-input downscale-from-5472×3648).
2. **MixVPR** — Documentary lead with aerial-domain-training caveat (MIT, 2048-D descriptor, 320×320 input, ResNet50 backbone, OpenVPRLab canonical implementation). Cleanest BSD/permissive-track candidate: simplest backbone + simplest export path + smallest model footprint + medium descriptor cache.
3. **EigenPlaces** — Documentary lead with five distinguishing characteristics (MIT, 2048-D / 512-D / 256-D / 128-D PyTorch-Hub-distributed sibling modes [eleven canonical pretrained checkpoints total — most of any C2 candidate], 224×224 input, ResNet-50 + GeM + FC structurally-simplest-modern-competitive-CNN backbone, single-stage retrieval, `gmberton/EigenPlaces` canonical implementation). **Distinguishing positive structural advantages vs MixVPR + SelaVPR + NetVLAD on this track**: (i) STRUCTURALLY-SIMPLEST modern competitive CNN (lowest D-C2-4 + D-C2-5 runtime risk among modern competitive C2 leads; small model footprint of ~58 MB at fp16, comparable to MixVPR's ~50 MB); (ii) 60%-LESS-VRAM-RETRAIN advantage vs MixVPR (most retrain-friendly C2 candidate for D-C2-1 aerial-domain decision); (iii) VIEWPOINT-ROBUST training paradigm (most semantically-aligned training prior for aerial nadir VPR — UAV multi-heading flights generate the multi-viewpoint training signal). **Documented Recall@K**: BEST in C2 row on AmsterTime (48.9, extreme decade-scale cross-time) and SF-XL-v1 (84.1); within 0.3 of SelaVPR on Pitts30k (92.5 vs 92.8); second-only-to-SelaVPR on Tokyo24/7 (93.0 vs 94.0 — with much lower deployment risk than SelaVPR); third-place on extreme cross-season Nordland (71.2 vs SelaVPR 85.2 = -14); fourth-place on SVOX-Night (58.9 vs MixVPR-4096 64.4 = -5.5). Carries NEW D-C2-10 (descriptor-dimension choice) + conditional D-C2-11 (MegaLoc successor evaluation deferral).
4. **NetVLAD** — Mandatory simple-baseline (MIT canonical / license-uncertain Nanne PyTorch port, 4096-D PCA-whitened / 512-D / 256-D `cropToDim` variants, 224×224 input, VGG-16 cropped-at-conv5_3 backbone, single-stage retrieval, `Relja/netvlad` canonical MATLAB + `Nanne/pytorch-NetVlad` PyTorch reproduction + canonical paper Arandjelović 2016). **Long-established VPR reference baseline** — cited as the baseline in every modern VPR paper. **Documented Recall@K deficit vs modern leads (roughly 5 to 52 absolute R@1 across Pitts30k/Tokyo24/7/Nordland) IS the role's purpose** per engine Component Option Breadth rule. **POSITIVE structural advantages**: LOWEST D-C2-4 + D-C2-5 risk overall (VGG-16 → TensorRT is the most-export-friendly path of any C2 candidate; no DINOv2 ViT export-risk; no two-stage re-ranking variance; no local-feature cache pressure). Carries NEW D-C2-8 (PyTorch port-strategy choice) + NEW D-C2-9 (descriptor-dimension choice).

**Comparison: MixVPR vs SelaVPR vs EigenPlaces vs NetVLAD on BSD/permissive track** — four materially-different design points on the same license track (modern competitive lead [MLP-Mixer] vs modern competitive lead [DINOv2-L two-stage] vs modern competitive lead [viewpoint-robust ResNet-50] vs long-established baseline [VLAD]):

| Dimension | MixVPR | SelaVPR | EigenPlaces | NetVLAD (mandatory baseline) |
|---|---|---|---|---|
| Backbone | ResNet50 (~25M params) | DINOv2 ViT-L/14 (~300M params, FROZEN) | ResNet50 (~25M params) | VGG-16 cropped at conv5_3 (~50-60M params) |
| Aggregator/Adapter | MLP-Mixer aggregation | Lightweight serial+parallel adapters per ViT block + LocalAdapt up-conv | **GeM (Generalized Mean Pooling, parameter-free) + single FC layer** | NetVLAD soft-assignment-VLAD pooling K=64 + PCA-whitening |
| Input size | 320×320 | 224×224 (more aggressive downscale) | 224×224 (same as SelaVPR + NetVLAD) | 224×224 (same as SelaVPR + EigenPlaces) |
| Global descriptor | 2048-D | 1024-D | **2048-D** canonical / 512-D / 256-D / 128-D / VGG16+512-D PyTorch-Hub-distributed sibling modes (eleven canonical pretrained checkpoints) | **4096-D PCA-whitened** (canonical) / 512-D / 256-D `cropToDim` variants |
| Retrieval architecture | Single-stage | Two-stage (top-K via global + rerank via local MNN cross-match) | **Single-stage** | Single-stage |
| Global descriptor cache (~400 km² @ 0.5 m/px) | ~650 MB fp16 / 6.5% | ~320 MB fp16 / 3.2% | **~650 MB fp16 / 6.5% (canonical 2048-D — identical to MixVPR-2048)** / ~160 MB / 1.6% (512-D) / ~80 MB / 0.8% (256-D) / ~40 MB / 0.4% (128-D) | **~1.3 GB fp16 / 13%** (canonical 4096-D) / ~160 MB / 1.6% (512-D) / ~80 MB / 0.8% (256-D) |
| Local-feature cache | (none — single-stage) | 61×61×128 dense local features = ~150 GB if naive cache; needs D-C2-7 mitigation | (none — single-stage) | (none — single-stage) |
| Model footprint at fp16 | ~50 MB (ResNet50 + MLP-Mixer) | ~600 MB (DINOv2-L + adapters + LocalAdapt) | **~58 MB (ResNet50 + GeM + FC; on par with MixVPR and far below the ViT- and VGG-based candidates)** | ~400 MB (VGG-16 + NetVLAD + PCA-whitening) |
| Desktop-GPU latency baseline (RTX 3090 unless noted) | 1.21 ms (paper Table 4, measured on A100) | 27 ms extraction + 85 ms matching @ rerank_num=100 = 112 ms total (paper Table 3) | ~5 ms (ResNet-50 fp16 + GeM + FC contemporary benchmark extrapolation; paper §4.4 says "extraction time negligible at scale") | ~10-20 ms VGG-16 forward pass + ~1-2 ms NetVLAD aggregation + ~1-2 ms PCA-whitening MatMul |
| Jetson Orin Nano Super extrapolated latency | ~10–30 ms | ~200–270 ms extraction + ~150 ms matching @ rerank_num=20 = ~350 ms total (FAILS AC-4.1 at rerank_num=100) | **~15-30 ms total (LOWEST among modern competitive C2 leads; D-C2-5 ViT-export-risk does not apply)** | ~40-60 ms total (LOWEST runtime risk overall in C2 row; D-C2-5 ViT-export-risk does not apply but VGG-16 is older + larger than ResNet-50) |
| Training GPU memory cost (canonical batch) | ~18 GB (paper §4.4) | ~24 GB (DINOv2-L finetune; estimated) | **<7 GB (paper §4.4 — 60% LESS than MixVPR)** | (not directly reported; canonical training is on cluster) |
| MSLS-Val R@1 | 87.2 (@4096) / 83.6 (@512) | 90.8 | **89.1 (@2048)** | (not in MSLS-Val paper Table; documented as baseline floor) |
| Pitts30k-test R@1 (canonical) | ~90 | 92.8 | **92.5 (@2048)** | **84.1 (paper) / 85.2 (PyTorch reproduction)** — baseline floor |
| Tokyo24/7 R@1 (cross-illumination day/night) | 85.1 | **94.0** (best in SelaVPR paper Table 2) | **93.0 (@2048; second-place in C2 row, with much lower deployment risk than SelaVPR)** | **73.3** — baseline floor (-11.8 to -20.7 absolute vs modern leads) |
| AmsterTime R@1 (extreme decade-scale cross-time) | 40.2 (@4096) | (not reported in SelaVPR paper) | **48.9 (BEST in C2 row — relevant to Ukraine-active-conflict scene-change scenarios)** | 16.3 (VGG16+4096) — baseline floor |
| Nordland-test R@1 (extreme cross-season) | 58.4 (canonical paper) / 76.2 (EigenPlaces paper Tab 4 @4096) | **85.2** | 71.2 (@2048; third-place, viewpoint-robustness comes at the cost of being weaker than DINOv2 on extreme cross-season) | **~33** (per MixVPR paper Table 1 baseline) — baseline floor |
| SVOX-Night R@1 (extreme illumination) | 64.4 (@4096) | (not reported in SelaVPR paper) | 58.9 (@2048; fourth-place; MixVPR-4096 wins by +5.5) | 8.0 (VGG16+4096) — baseline floor |
| Aerial-domain training | None (D-C2-1 applies) | None (D-C2-1 applies) | None (D-C2-1 applies) **but EigenPlaces is the MOST retrain-friendly candidate** at <7 GB GPU VRAM — D-C2-1 = (a) project-domain retrain on AerialVL is materially cheaper to execute on EigenPlaces than on any other candidate | None (D-C2-1 applies) — but NetVLAD's mandatory-baseline role does NOT require aerial-domain training to be useful |
| Training paradigm semantic alignment with aerial nadir VPR | Standard metric learning (multi-similarity loss on GSV-Cities) — generic | DINOv2 frozen + adapter fine-tuning (parameter-efficient transfer learning) — generic | **Lateral+Frontal CosFace dual loss with SVD-based viewpoint-shift class construction** — most semantically-aligned training prior for UAV multi-heading flights generating multi-viewpoint training signal | Weakly supervised triplet ranking with hard negative mining on Google Street View Time Machine — generic |
| Role per engine Component Option Breadth | Modern competitive lead (compact-descriptor leader) | Modern competitive lead (best documented cross-illumination/cross-season ground-level recall via two-stage) | Modern competitive lead (**viewpoint-robust + structurally-simplest + most retrain-friendly + best AmsterTime cross-time**) | **Mandatory simple-VLAD reference baseline** that establishes the long-established floor against which modern leads must measurably exceed |

**GPL-3.0 track** (track lead under D-C1-1 = (a) or default (c)):
1. **SALAD** — Documentary lead with three caveats (GPL-3.0, 8448-D / 2112-D / 544-D descriptor variants, 322×322 input, DINOv2 ViT-B/14 backbone with last 4 blocks fine-tuned, `serizba/salad` canonical implementation). Strongest single-stage MSLS Challenge R@1 (75.0 full vs SelaVPR's 73.5 and MixVPR's 64.0). Carries DINOv2-ViT-B export risk (D-C2-5, harsher than MixVPR's CNN, lighter than SelaVPR's ViT-L) and descriptor-cache budget pressure (D-C2-6).
2. (no other GPL-3.0 C2 leads pending in the mandatory pre-screen — SelaVPR landed on the BSD/permissive track, contradicting the prior session's assumption that it would be the "most likely additional GPL-3.0 candidate").
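
The SelaVPR latency extrapolation quoted in the BSD/permissive comparison table and in D-C2-7 can be checked against the AC-4.1 budget with a few lines. This is a sketch under the stated extrapolation only (200–270 ms extraction, ~150 ms matching at rerank_num=20, 400 ms budget); the function names are illustrative and no Jetson measurement backs these numbers yet:

```python
AC_4_1_BUDGET_MS = 400.0  # latency budget named in AC-4.1


def total_range(extraction_ms, matching_ms):
    """Add a single matching estimate to a (low, high) extraction range."""
    low, high = extraction_ms
    return low + matching_ms, high + matching_ms


def fits_budget(latency_ms, budget_ms=AC_4_1_BUDGET_MS):
    """True when the latency is within the budget."""
    return latency_ms <= budget_ms


# Extrapolated SelaVPR figures at rerank_num=20 (documentary, not measured).
low, high = total_range((200.0, 270.0), 150.0)
print(f"low end:  {low:.0f} ms, fits budget: {fits_budget(low)}")
print(f"high end: {high:.0f} ms, fits budget: {fits_budget(high)}")
```

Under these assumptions only the low end of the extraction range fits the budget, which is why the rerank_num=20 configuration is described as tight and why the Jetson MVE numbers in D-C2-4 matter for this candidate.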
File diff suppressed because one or more lines are too long
@@ -0,0 +1,47 @@
# Component Fit Matrix — C4: Pose estimation (PnP + RANSAC + LM)
> Mode A Phase 2 — engine Step 7.5 (Component Applicability Gate, structured per-component candidate-selection table). Status vocabulary in [`00_summary.md`](00_summary.md). Backing fact cards: [`../02_fact_cards/SQ2_canonical_pipeline.md`](../02_fact_cards/SQ2_canonical_pipeline.md) (canonical PnP+RANSAC+LM pipeline) and [`../02_fact_cards/C3_matchers.md`](../02_fact_cards/C3_matchers.md) (C3 → C4 input contract).
>
> Index: [`00_summary.md`](00_summary.md). Sibling components: [C1 VIO](C1_vio.md), [C2 VPR](C2_vpr.md), [C3 Matchers](C3_matchers.md), [C5–C10 pending](C5-C10_pending.md). Cross-component gates: [`99_cross_component_gates.md`](99_cross_component_gates.md).
---
## C4 — Pose estimation (PnP + RANSAC + LM)
**Status**: IN PROGRESS — **3 of N candidates** complete at documentary level (mandatory-simple-baseline role STRUCTURALLY COMPLETE — OpenCV `cv::solvePnPRansac` per Fact #52; modern-competitive-lead-richer-minimal-solver role — OpenGV per Fact #53; modern-competitive-lead-covariance-honest role — GTSAM per Fact #54). Further candidates (Theia / Ceres-only) will be cataloged in later sessions if needed.

**Definition correction (2026-05-08, locked by user)**: the original `00_question_decomposition.md` line 72 taxonomy table named C4 "Single-frame orthorectification" but the dominant convention through C1+C2+C3 closures (lines 160 + 194 of `00_question_decomposition.md`, ALL C3-row pinned outputs `feeding C4 PnP+RANSAC pose estimator`, Fact #20 + #21 + #45 through #51 audience lines) treats C4 as "Pose estimation (PnP + RANSAC + LM)" — taking C3's 2D-2D correspondences and producing 6-DoF camera pose w.r.t. tile, feeding C5 estimator. **User-locked C4 definition is Pose estimation**. The orphaned "Single-frame orthorectification" responsibility (AC-8.4 mid-flight tile generation, write-side path) is reassigned-pending-final-row-decision to **C6 (tile cache + spatial index)** as a write-side cache concern (since AC-8.4 is fundamentally `pose-from-C5 + nav-frame-from-C1 → ortho-rectified-tile → C6 cache write`, and C6 already owns tile cache write per Fact #21 binding `AC-8.4 ... bound by ... C6 (tile cache write)`). Final placement of orthorectification will be confirmed when the C6 row is opened. **No new component slot is created**; the original C4–C10 numbering is preserved; only C4's responsibility changes.

**Pinned mode** (per-frame pose-from-correspondences contract for the project's C4 row):
- inputs: `{up to 1024 2D-2D correspondences with confidence scores per (UAV-frame, satellite-tile) image pair from C3 (DISK+LightGlue D-C3-1 RECOMMENDED-PRIMARY / ALIKED+LightGlue D-C3-1 SECONDARY / XFeat D-C3-1 ALTERNATE / SP+LightGlue documentary-baseline / SuperGlue+SuperPoint mandatory-simple-baseline) + per-tile geo metadata (WGS84 corner coordinates + ortho resolution per AC-8.1) from C6 tile cache read + 3D lift via DSM (if available; AC + restrictions specify 2D ortho only — Fact #23 closure deferred 2D-3D-lift architectural decision means 2D-only operation is the project default)}` on `Jetson Orin Nano Super (8 GB shared, JetPack 6, ROS 2 Humble; final inference runtime selection deferred to C7 + D-C3-2 reuse)`; per-frame compute = K=10 image pairs × 1 PnP+RANSAC+LM call per Fact #25 + AC-3.3 re-localization
- outputs: `{6-DoF camera pose w.r.t. tile (R, t) + per-correspondence inlier mask + reprojection error + 6×6 covariance + inlier ratio + RANSAC iteration count + source label satellite_anchor for the fused C5 estimate}` per Fact #20 + #21 + AC-NEW-4 (covariance honesty)
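
The pinned output contract above can be written down as a small typed record. This is a hypothetical sketch, not the project's actual interface: the class name, field names and the `validate` helper are assumptions, and only the listed fields, the 1024-correspondence cap and the `satellite_anchor` source label come from the contract itself. The identity-matrix check reflects the AC-NEW-4 requirement that the covariance must not be a placeholder.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass
class PoseEstimate:
    rotation: Matrix              # 3x3 R, camera w.r.t. tile
    translation: List[float]      # 3-vector t
    inlier_mask: List[bool]       # one flag per input correspondence
    reprojection_error_px: float
    covariance: Matrix            # 6x6 pose covariance
    ransac_iterations: int
    source_label: str = "satellite_anchor"

    @property
    def inlier_ratio(self) -> float:
        if not self.inlier_mask:
            return 0.0
        return sum(self.inlier_mask) / len(self.inlier_mask)


def is_identity(matrix: Matrix, tol: float = 1e-12) -> bool:
    """True when a square matrix equals the identity within `tol`."""
    return all(
        abs(value - (1.0 if i == j else 0.0)) <= tol
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
    )


def validate(pose: PoseEstimate, max_correspondences: int = 1024) -> List[str]:
    """Return a list of contract violations (empty when the record is acceptable)."""
    problems = []
    if len(pose.inlier_mask) > max_correspondences:
        problems.append("more correspondences than the pinned input cap")
    if len(pose.covariance) != 6 or any(len(row) != 6 for row in pose.covariance):
        problems.append("covariance must be 6x6")
    elif is_identity(pose.covariance):
        problems.append("covariance looks like an identity placeholder")
    return problems
```

A record of this shape would let the C5 side reject stub covariances before fusion, whichever C4 candidate ends up producing them.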

**Locked-in research-time defaults** (carried forward from C1 + C2 + C3 — Fact #41):
- D-C1-1 = (c) **keep both license tracks open** through Plan; final license decision deferred to post-Jetson-MVE.
- D-C1-2 = (b) **defer Jetson Orin Nano Super hardware MVE to a dedicated bring-up phase** between research and Plan; research closes with documentary ranking + per-candidate `Verify` gates.

**Interactions with prior C-row closures**:
- C3 D-C3-1 extractor choice determines the C4 RANSAC inlier-count distribution (DISK 2424.8 multiview NL vs ALIKED-N(16) 1975.4 vs XFeat\* 1885 vs SP 1500 — varies the C4 PnP+RANSAC failure rate; high-inlier extractors give more-stable RANSAC).
- C3 outputs include per-correspondence confidence — C4 PnP+RANSAC must consume the τ=0.1 cosine-confidence-threshold-filtered subset, not the full match list, to avoid bias from low-confidence outliers (consistent with cvg/LightGlue paper Table 3 Aachen Day-Night pipeline shape via Source #71 cross-cite).
- C5 covariance-honesty contract (AC-NEW-4) requires C4 to produce an HONEST 6×6 covariance, not a placeholder identity-matrix stub. Fact #20 closure (DSM lift architectural decision) interacts: 2D-only pose-from-homography produces a 3-DoF homography that lifts to 6-DoF only with attitude-from-IMU/VIO prior — D-C4-1 NEW (3-DoF-acceptance / DSM-coarseness-acceptance / aerial-photogrammetry-DSM-acquisition-cost) was raised at Fact #20 closure and remains the canonical Plan-phase decision for C4.
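
The second and third interactions above imply a small preprocessing step between C3 and C4: drop low-confidence matches, then lift the tile-side pixels to 3D so that a PnP solver has 3D-2D pairs to work with. A minimal sketch of that step under the project-default 2D-only (flat-earth, Z = 0) assumption; the function name, the tile-local metric frame with its origin at the tile's top-left corner, the `>=` comparison against τ, and the 0.5 m/px default are all assumptions made for illustration:

```python
def filter_and_lift(matches, tau=0.1, metres_per_px=0.5):
    """Keep matches with confidence >= tau and lift tile pixels to planar 3D points.

    `matches` is an iterable of ((u_uav, v_uav), (col_tile, row_tile), confidence).
    Returns (object_points, image_points): tile-local (X, Y, 0.0) coordinates in
    metres and the corresponding UAV-frame pixel coordinates.
    """
    object_points = []
    image_points = []
    for uav_px, tile_px, confidence in matches:
        if confidence < tau:
            continue  # below the confidence threshold: not passed to RANSAC
        col, row = tile_px
        # Flat-earth lift: every tile pixel sits on the Z = 0 plane.
        object_points.append((col * metres_per_px, row * metres_per_px, 0.0))
        image_points.append(uav_px)
    return object_points, image_points


example = [
    ((320.0, 240.0), (100, 200), 0.92),
    ((10.0, 15.0), (4, 8), 0.05),      # dropped: confidence below tau
    ((600.0, 400.0), (300, 50), 0.10),
]
obj, img = filter_and_lift(example)
print(len(obj), "of", len(example), "matches kept")
```

Because all lifted points are coplanar, the output matches the planar-scene case that D-C4-1 discusses; a DSM-based lift would replace the constant 0.0 with a per-pixel height lookup.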

| # | Candidate | License | Per-mode verification | Status | Lead reason / disqualifier | Sub-matrix cite |
|---|---|---|---|---|---|---|
| 1 | **OpenCV `cv::solvePnPRansac` + paired `cv::solvePnPRefineLM`** (canonical `opencv/opencv` calib3d module; mandatory simple-baseline) | Apache-2.0 (clean throughout) — Source #82 GitHub API metadata `license.spdx_id: "Apache-2.0"`; 87385 stars + last pushed 2026-05-08 = TODAY at access time | ✅ Mode enumeration via WebFetch fallback (context7 MCP-validation-error) — 9 `SolvePnPMethod` enum values + 2 function signatures (classical + USAC) + paired `solvePnPRefineLM`/`solvePnPRefineVVS`/`solvePnPGeneric` + 7 USAC RANSAC variants documented in Source #83; ✅ runnable example in canonical PnP tutorial; ✅ canonical defaults `iterationsCount=100, reprojectionError=8.0, confidence=0.99, flags=SOLVEPNP_ITERATIVE`; ⚠️ **3D-2D INPUT CONTRACT, NOT 2D-2D** — requires D-C4-1 lift (inherent to all PnP candidates); ⚠️ **NO DIRECT 6×6 COVARIANCE** — D-C4-2 NEW gate raised; ⚠️ **2 of 9 enum values BROKEN** (SOLVEPNP_DLS + SOLVEPNP_UPNP fall back to EPNP per explicit docstring); ⚠️ `solvePnPRefineLM` rotation update NOT on SO(3) (alternate `solvePnPRefineVVS` is SO(3)-correct) | **Mandatory-simple-baseline reference** (engine Component Option Breadth rule role — structurally analogous to NetVLAD's role in C2 row + SuperGlue+SuperPoint's role in C3 row); deployment-ready under every D-C1-1 license-posture choice; final ranking deferred to Jetson MVE phase per D-C1-2 | **TWO CONVERGING POSITIVE structural advantages**: (i) clean Apache-2.0 throughout (tied with cvg/LightGlue + DISK + XFeat for cleanest license-compliance story); (ii) dominant industry-standard reference (87385 stars + daily-active maintenance + JetPack 6 canonical distribution = zero-effort Jetson deployment). **TWO NEGATIVE-BUT-MITIGABLE structural findings** (inherent to PnP class, apply identically to all C4 candidates): (iii) 3D-2D input contract → D-C4-1 2D→3D lift required; (iv) no direct 6×6 covariance → D-C4-2 NEW covariance-recovery-strategy required. **TWO MINOR CAVEATS**: (v) 2 BROKEN enum values eliminated; valid set is `EPNP / AP3P / IPPE / SQPNP` plus 2 special-case (`P3P` for exactly-3, `IPPE_SQUARE` for 4-fixed-pattern markers); (vi) `solvePnPRefineLM` not on SO(3) — alternate `solvePnPRefineVVS` is preferred for high-accuracy. Recommended pairing for D-C4-1 = 4-DoF flat-earth lift: **`flags=SOLVEPNP_IPPE`** (planar-scene minimal-solver designed for coplanar object points) with **`SOLVEPNP_SQPNP`** as the modern globally-optimal fallback for the 6-DoF DSM-lift case | Fact #52 in [`../02_fact_cards/C4_pose_estimation.md`](../02_fact_cards/C4_pose_estimation.md) (per-mode entry + per-numbered-Restriction × per-numbered-AC sub-matrix block) |
| 2 | **OpenGV `absolute_pose::AbsolutePoseSacProblem(KNEIP)` + paired `absolute_pose::optimize_nonlinear`** (canonical `laurentkneip/opengv` library; modern-competitive-lead-richer-minimal-solver) | BSD-3-Clause-equivalent CONTINGENT on D-C4-3 NEW Plan-phase license-clearance verification — Source #84 License.txt direct WebFetch verified BSD-3-Clause boilerplate (3 numbered redistribution conditions + non-endorsement clause), but GitHub SPDX detector reports `license.spdx_id: "NOASSERTION"` due to non-canonical-OSI-template formatting; 1109 stars + 358 forks + last pushed **2023-06-07 = ~2y 11mo stale** at access time (D-C4-4 NEW maintenance-staleness mitigation gate) | ✅ Mode enumeration via WebFetch fallback (context7 NOT INDEXED — only OpenCV variants returned for OpenGV query) — 4 absolute-pose minimal solvers (`p2p / p3p_kneip / p3p_gao / upnp`) + 2 non-minimal solvers (`epnp / upnp`) + 2 generalized-camera solvers (`gp3p / gpnp`) + 1 LM optimizer (`optimize_nonlinear`) + 4 RANSAC algorithm enums (`KNEIP / GAO / EPNP / GP3P`) documented in Source #85; ✅ runnable example with `sac::Ransac` + `AbsolutePoseSacProblem` integration; ⚠️ **BEARING-VECTOR INPUT CONTRACT, NOT 2D PIXEL** — requires project-side adapter or pre-computed inverse-intrinsic projection from C3's pixel correspondences; ⚠️ **3D-ANGLE RANSAC THRESHOLD** — conversion required from project's pixel-reprojection-error budget; ⚠️ **NO DIRECT 6×6 COVARIANCE OUTPUT** from `optimize_nonlinear` (D-C4-2 applies identically; harder to mitigate than OpenCV since OpenGV's residuals are bearing-vector not pixel); ⚠️ **NO PLANAR-SCENE DEDICATED SOLVER** equivalent to OpenCV's `flags=SOLVEPNP_IPPE` — DOCUMENTARY NEGATIVE for D-C4-1 = 4-DoF flat-earth case | **Modern-competitive-lead-richer-minimal-solver-coverage** (engine Component Option Breadth rule role); deployment-ready CONTINGENT on D-C4-3 license-clearance + D-C4-4 maintenance-staleness mitigation; final ranking deferred to Jetson MVE phase per D-C1-2 | **TWO CONVERGING POSITIVE structural advantages**: (i) **richer minimal-solver coverage than OpenCV** (4 algorithm-selectable RANSAC enums + 2 P3P variants [Kneip 2011 + Gao 2003] + 1 UPnP global-optimal alternate + 1 generalized-camera GP3P; vs OpenCV's effectively-4-valid SolvePnPMethod after 2 BROKEN entries removed; OpenGV provides Kneip's original 2011 P3P that OpenCV does NOT distribute — only Ke & Roumeliotis 2017 AP3P); (ii) **generalized-camera + non-central absolute pose support** — `absolute_pose::gp3p` + `absolute_pose::gpnp` for multi-camera rigs; not directly applicable to project's pinned 1× ADTi 20MP nav frame but architecturally cleaner if project later adds side-looking camera. **FIVE NEGATIVE-BUT-MITIGABLE structural findings**: (iii) bearing-vector input contract → adapter engineering required; (iv) 3D-angle RANSAC threshold → conversion required; (v) no direct 6×6 covariance → D-C4-2 applies identically (recommendation = option (b) wrap in GTSAM Marginals since OpenGV-internal Jacobian recovery is ~3-5 days vs ~1 day for OpenCV); (vi) **~3 years maintenance staleness** → D-C4-4 NEW gate; (vii) **NOASSERTION SPDX-detector status** → D-C4-3 NEW Plan-phase license-clearance verification gate. **ONE MAJOR DOCUMENTARY NEGATIVE finding vs OpenCV**: (viii) NO planar-scene dedicated minimal solver vs OpenCV's `flags=SOLVEPNP_IPPE` — for project's locked-in D-C4-1 = 4-DoF flat-earth lift recommendation, OpenGV requires using Kneip's P3P or EPNP without the planar-scene specialization advantage. Net trade-off favors OpenCV-as-primary for the project's D-C4-1 = 4-DoF flat-earth case; OpenGV-as-secondary-evaluation if Plan-phase Jetson MVE shows the need for non-central or generalized-camera support | Fact #53 in [`../02_fact_cards/C4_pose_estimation.md`](../02_fact_cards/C4_pose_estimation.md) (per-mode entry; per-numbered-Restriction × per-numbered-AC sub-matrix deferred to next session per scope-discipline) |
| 3 | **GTSAM `LevenbergMarquardtOptimizer` + `GenericProjectionFactorCal3_S2` + `Marginals.marginalCovariance`** (canonical `borglab/gtsam` library by Frank Dellaert et al. + Georgia Tech Borg Lab; modern-competitive-lead-covariance-honest) | BSD-3-Clause (clean throughout) — Source #86 LICENSE.BSD direct WebFetch verified `Copyright (c) 2010, Georgia Tech Research Corporation` with 3 numbered redistribution conditions + non-endorsement clause; bundled deps clean (BSD-3 + Apache-2.0 + MPL2 file-level — all dual-use compatible); 3424 stars + 927 forks + last pushed **2026-05-08T13:00:22Z = TODAY** at access time (fresher than OpenCV by 6 hours = daily-active maintenance) | ✅ Mode enumeration via context7 INDEXED PASS at `/borglab/gtsam` version 4.3a1 with **1121 code snippets** (best context7 indexing of any C4 candidate evaluated) — `GenericProjectionFactorCal3_S2` + `LevenbergMarquardtOptimizer` + `Marginals.marginalCovariance` + `NonlinearFactorGraph` + `Cal3_S2` / `Cal3DS2` + `Pose3` + `noiseModel.Diagonal.Sigmas` / `noiseModel.Isotropic.Sigma` + `noiseModel.Robust.Create` + `mEstimator.Huber.Create` + `GncOptimizer` documented in Source #87; ✅ runnable example via `python/gtsam/examples/CameraResectioning.ipynb` canonical PnP pattern; ✅ **NATIVE 6×6 POSE COVARIANCE via `Marginals(graph, result).marginalCovariance(pose_key)`** — only C4 candidate to date that satisfies AC-NEW-4 covariance-honesty NATIVELY; ⚠️ **NO NATIVE RANSAC** (canonical pattern is external-RANSAC-via-OpenCV-for-inliers → GTSAM-factor-graph-from-inliers OR in-graph robust noise model OR `GncOptimizer`); ⚠️ **~50-200 MB library footprint** (heaviest C4 candidate to date but well within AC-4.2 budget); ⚠️ **TIGHT AC-4.1 latency margin** (~30-90 ms per call extrapolated to Jetson Orin Nano Super = 300-900 ms total at K=10 pairs/frame vs 400 ms budget); ⚠️ **NO JetPack 6 canonical distribution** (~1-2 days cross-compilation engineering) | **Modern-competitive-lead-covariance-honest** (engine Component Option Breadth rule role; **directly addresses AC-NEW-4-binding-constraint axis** that drives C4 row's primary architectural concern); deployment-ready under every D-C1-1 license-posture choice; final ranking deferred to Jetson MVE phase per D-C1-2 | **THREE CONVERGING POSITIVE structural advantages**: (i) **NATIVE 6×6 POSE COVARIANCE via `Marginals.marginalCovariance`** — the **only C4 candidate to date that satisfies AC-NEW-4 covariance-honesty NATIVELY without D-C4-2 mitigation work**; **directly addresses the AC-NEW-4-binding-constraint axis**; (ii) clean BSD-3-Clause throughout (tied with cvg/LightGlue + DISK + XFeat + OpenCV for cleanest license-compliance story); bundled deps clean (BSD-3 + Apache-2.0 + MPL2 file-level); (iii) daily-active maintenance + best context7 indexing of any C4 candidate (1121 code snippets at version 4.3a1). **ONE ADDITIONAL POSITIVE structural advantage**: (iv) **ARCHITECTURAL EXTENSION TO C5 VIA iSAM2** — factor-graph paradigm scales naturally from C4 single-frame PnP to C5 multi-frame state estimation via `iSAM2` + `BetweenFactor<Pose3>` + `PriorFactorPose3`; would simplify C5 implementation if both C4 and C5 are GTSAM-based, providing a forward-looking architectural integration advantage that no other C4 candidate provides. **ONE NEGATIVE-BUT-MITIGABLE structural finding**: (v) NO native RANSAC → canonical pattern is external-RANSAC-via-OpenCV (couples C4 = GTSAM-as-primary with OpenCV-RANSAC-as-inlier-detector); alternative is in-graph M-estimator robust noise model OR `GncOptimizer` (Yang et al. RAL 2020). **THREE CAVEATS**: (vi) ~50-200 MB library footprint; (vii) no JetPack 6 canonical distribution (~1-2 days cross-compilation engineering); (viii) tight AC-4.1 latency margin requiring Plan-phase Jetson MVE phase verification — mitigation strategies include reduce K from 10 to 3-5 (couples with D-C3-3) OR GTSAM-as-secondary-only for satellite-anchor frames OR batch GTSAM optimization across multiple frames via iSAM2 incremental update. **Recommended C4 architecture for the project**: **OpenCV solvePnPRansac as mandatory simple-baseline reference floor + per-frame inlier detection + initial pose estimate + GTSAM factor-graph posterior recovery for AC-NEW-4 covariance-honest output** (couples Fact #52 + Fact #54 closures via D-C4-2 = (b)) | Fact #54 in [`../02_fact_cards/C4_pose_estimation.md`](../02_fact_cards/C4_pose_estimation.md) (per-mode entry; per-numbered-Restriction × per-numbered-AC sub-matrix deferred to next session per scope-discipline) |
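
The bearing-vector input contract and the angular RANSAC threshold noted for OpenGV in row 2 can be illustrated with a small adapter. This is a minimal sketch, assuming an undistorted pinhole camera; the function names and the intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical placeholders, not project values. The `1 - cos(angle)` threshold form follows the convention shown in OpenGV's usage examples as far as we can tell and should be checked against Source #85 before use.

```python
import math


def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Back-project a pixel through pinhole intrinsics to a unit bearing vector."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def pixel_error_to_angular_threshold(err_px, focal_px):
    """Convert a pixel reprojection tolerance into the 1 - cos(angle) form."""
    return 1.0 - math.cos(math.atan(err_px / focal_px))


# Hypothetical intrinsics for illustration only (lens distortion is ignored).
bearing = pixel_to_bearing(2736.0, 1824.0, 4000.0, 4000.0, 2736.0, 1824.0)
# An 8 px tolerance (the OpenCV default quoted in row 1) at a 4000 px focal length.
threshold = pixel_error_to_angular_threshold(8.0, 4000.0)
```

A real adapter would undistort the C3 pixel correspondences first; that step is omitted here.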
|
||||
|
||||
### C4 — Plan-phase deliverables raised by prior closures (will compound as candidates close)
|
||||
|
||||
1. **D-C4-1 (CARRIED FORWARD from Fact #20 closure 2026-05-XX) — 2D-3D-lift architectural decision** (3-DoF acceptance with attitude-from-IMU/VIO prior + 2D ortho-only cache / 4-DoF acceptance with flat-earth + altitude-from-IMU+barometer prior / 6-DoF via aerial-photogrammetry-DSM-acquisition + paired DSM at 0.94 m/px / 6-DoF via ALOS 30m DSM with 4× accuracy collapse per Source #41) — **carried forward from C2 row deferred resolution** (Fact #20 surfaced this decision but the C2 row closure left it for the C4 row to consolidate). Plan-phase decision; **for the project's pinned 2D-ortho-only cache + IMU-attitude-prior context, recommendation is 4-DoF with flat-earth assumption (altitude from IMU+barometer + attitude from VIO/IMU + planar-scene homography → 4-DoF pose extraction)** — this is the "flat-steppe Donetsk/Kharkiv operational area" assumption made plausible by Source #38 Skoltech survey + restrictions on 2D-ortho-only cache. ALOS-30m-DSM fallback is the secondary mitigation if 4-DoF accuracy proves insufficient at AC-1.1/1.2 50m/20m bars at the tighter tail.
2. **D-C4-2 NEW (raised by OpenCV `cv::solvePnPRansac` closure 2026-05-08, Fact #52; UPDATED by GTSAM closure 2026-05-08 Fact #54) — covariance-recovery-strategy** — `cv::solvePnPRansac` returns `retval, rvec, tvec, inliers` only; OpenGV's `optimize_nonlinear` has no covariance output API; **NO direct 6×6 covariance output from either OpenCV or OpenGV** per Source #83 + Source #85 function signatures. **GTSAM IS THE EXCEPTION** — `Marginals(graph, result).marginalCovariance(pose_key)` emits 6×6 posterior covariance NATIVELY (Source #87 multiple snippets). AC-NEW-4 covariance-honesty contract requires Plan-phase choice between: **(a)** post-hoc Jacobian-based covariance recovery via `cv::projectPoints` Jacobian + Schur complement on inlier residuals (~1 day engineering; pure OpenCV API; covariance approximation of equivalent quality to ROS `tf2`'s standard recipe; **recommended for OpenCV-as-primary mandatory-simple-baseline path**); **(b)** **wrap solvePnPRansac result in GTSAM `Marginals` posterior** via `BetweenFactor<Pose3>` prior + per-inlier `GenericProjectionFactorCal3_S2` factors → `LevenbergMarquardtOptimizer.optimize()` → `Marginals.marginalCovariance` (canonical Plan-phase pathway documented in Fact #54; **STRONGLY RECOMMENDED for the GTSAM-as-covariance-recovery hybrid path** — couples Fact #52 OpenCV solvePnPRansac mandatory-simple-baseline + Fact #54 GTSAM modern-competitive-lead-covariance-honest); **(c)** project-defined heuristic covariance scaling from inlier residual statistics (lowest engineering, lowest correctness — **likely AC-NEW-4 REJECT** since it's effectively an identity-matrix-placeholder family); **(d)** migrate to OpenGV's `absolute_pose::optimize_nonlinear` with custom Jacobian propagation through bearing-vector residuals (~3-5 days engineering vs ~1 day for OpenCV; couples D-C4-2 with D-C4-1 selection of OpenGV-as-primary; STRONGER NEGATIVE than expected per Fact #53 closure — OpenGV's bearing-vector Jacobian is harder to recover than OpenCV's pixel Jacobian). **Recommendation**: D-C4-2 = (b) for the OpenCV-as-RANSAC + GTSAM-as-covariance-recovery hybrid path (project's recommended C4 architecture per Fact #54 closure) — provides AC-NEW-4 covariance honesty NATIVELY via GTSAM's `Marginals` posterior while keeping OpenCV's mandatory-simple-baseline RANSAC inlier detection at zero-effort Jetson deployment. D-C4-2 = (a) Jacobian-based recovery for the OpenCV-only-no-GTSAM path if Plan-phase Jetson MVE shows GTSAM's ~30-90 ms latency + ~50-200 MB memory footprint exceeds AC-4.1 / AC-4.2 budgets. Final lock at Plan phase after Jetson MVE.
3. **D-C4-3 NEW (raised by OpenGV closure 2026-05-08, Fact #53) — license-clearance verification** — Source #84 GitHub API license metadata reports `license.spdx_id: "NOASSERTION"` for canonical `laurentkneip/opengv` repo; Source #84 direct WebFetch of License.txt confirms BSD-3-Clause-equivalent boilerplate (3 numbered redistribution conditions + non-endorsement clause + "Copyright 2013 Laurent Kneip, ANU. All rights reserved." attribution) but the file does NOT use OSI canonical BSD-3-Clause template text, causing GitHub SPDX detector to fail to identify the license. Plan-phase decision-maker MUST choose between: **(a)** counsel-review of License.txt to confirm BSD-3-Clause-equivalent dual-use compatibility (~1-2 hours legal review; recommended for OpenGV adoption), **(b)** request author Laurent Kneip + ShanghaiTech Mobile Perception Lab to relicense canonical License.txt to OSI canonical BSD-3-Clause boilerplate (~1-3 weeks turnaround if responsive, may not be responsive given ~3-year staleness), **(c)** treat NOASSERTION as effective disqualifier and pivot to OpenCV-as-primary instead of OpenGV-as-primary (lowest risk, but loses OpenGV's richer-minimal-solver-coverage advantage), **(d)** elevate D-C4-3 to D-C1-1 license-posture decision and treat OpenGV as eligible only on D-C1-1 = (a) GPL-3.0 track or (c) keep-both-tracks-open (since BSD-3-Clause-equivalent without canonical template formatting is more ambiguous than GPL-3.0). **Recommendation**: D-C4-3 = (a) counsel-review for the OpenGV-as-secondary path; D-C4-3 = (c) pivot to OpenCV-as-primary if Plan-phase Jetson MVE shows OpenCV's mandatory-simple-baseline coverage is sufficient without OpenGV's richer-minimal-solver-coverage. Applies only if D-C4-row final lock includes OpenGV.
4. **D-C4-4 NEW (raised by OpenGV closure 2026-05-08, Fact #53) — maintenance-staleness-mitigation strategy** — Source #84 GitHub API `pushed_at` field shows `laurentkneip/opengv` last commit at 2023-06-07T18:14:14Z = ~2 years 11 months stale at access time 2026-05-08; Doxygen documentation portal generation timestamp 2018-01-08 21:43:04 = 8.3 years old documentation. ShanghaiTech Mobile Perception Lab's claimed maintenance is contradicted by commit history. Plan-phase decision-maker MUST choose between: **(a)** accept-as-is + freeze upstream at git commit ea7c66f5e (lowest engineering; assumes Eigen 3.3.x continues to compile on JetPack 6 ARM Cortex-A78AE without patches; risk: future Eigen 3.4+ migration breaks build), **(b)** fork into project-controlled branch + apply Eigen-3.4+ + JetPack-6 + ARM Cortex-A78AE patches in-house (~1-2 weeks engineering; medium risk; allows future upstream-patch contribution), **(c)** migrate to Ceres-only manual implementation as fallback if OpenGV-specific patches not feasible at Jetson MVE phase (highest engineering at ~2-4 weeks; lowest dependency-lock risk), **(d)** downgrade OpenGV to "experimental" status and pivot to OpenCV-as-primary if D-C4-3 license-clearance fails OR Jetson MVE shows OpenCV's coverage is sufficient. **Recommendation**: D-C4-4 = (b) fork-and-patch for the OpenGV-as-secondary path; D-C4-4 = (d) pivot to OpenCV-as-primary if Plan-phase Jetson MVE shows OpenCV's coverage is sufficient. Applies only if D-C4-row final lock includes OpenGV.
5. (additional D-C4-N gates will be added as candidates close)
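
The 4-DoF flat-earth lift recommended under D-C4-1 amounts to placing every matched satellite-tile pixel on a `z = 0` plane before handing it to a PnP solver. The following is a minimal sketch of that 2D→3D lift, assuming a north-up orthophoto tile whose origin is its top-left corner and a local metric east/north/up frame. The function name, parameter names and the sample numbers are hypothetical; only the 0.94 m/px figure comes from the text above.

```python
def tile_pixel_to_enu(col, row, tile_origin_east_m, tile_origin_north_m, gsd_m_per_px):
    """Lift a pixel of a north-up orthophoto tile to a 3D point on the z = 0 plane."""
    east = tile_origin_east_m + col * gsd_m_per_px
    north = tile_origin_north_m - row * gsd_m_per_px  # image rows grow southward
    return (east, north, 0.0)  # flat-earth assumption: terrain height is ignored


# Hypothetical matched pixels on one tile, lifted to coplanar object points.
matched_pixels = [(0, 0), (100, 200), (511, 511)]
object_points = [tile_pixel_to_enu(c, r, 1000.0, 2000.0, 0.94) for c, r in matched_pixels]
```

Because every lifted point is coplanar, this input shape is the case the planar-scene solver mentioned in row 1 is designed for.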
|
||||
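
Option (a) of D-C4-2 can be sketched as a first-order covariance `sigma^2 * (J^T J)^-1` evaluated at the converged pose. The sketch below is illustrative only: it uses only the standard library, it swaps the analytic `cv::projectPoints` Jacobian for a central-difference numerical Jacobian, it assumes isotropic pixel noise `sigma_px`, and it omits the Schur-complement step. All function names, intrinsics and poses are hypothetical.

```python
import math


def rodrigues(rvec):
    """Rotation vector (axis * angle) -> 3x3 rotation matrix."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rvec)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    v = 1.0 - cos_t
    return [
        [cos_t + kx * kx * v, kx * ky * v - kz * sin_t, kx * kz * v + ky * sin_t],
        [ky * kx * v + kz * sin_t, cos_t + ky * ky * v, ky * kz * v - kx * sin_t],
        [kz * kx * v - ky * sin_t, kz * ky * v + kx * sin_t, cos_t + kz * kz * v],
    ]


def project(pose, point, focal_px, cx, cy):
    """Pinhole projection of a world point; pose = (rx, ry, rz, tx, ty, tz)."""
    rot = rodrigues(pose[:3])
    cam = [sum(rot[i][j] * point[j] for j in range(3)) + pose[3 + i] for i in range(3)]
    return (focal_px * cam[0] / cam[2] + cx, focal_px * cam[1] / cam[2] + cy)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular normal matrix: pose is not observable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pose_covariance(pose, inlier_points, focal_px, cx, cy, sigma_px, eps=1e-6):
    """First-order 6x6 covariance sigma^2 * (J^T J)^-1 over (rvec, tvec)."""
    jac = []  # 2N x 6 Jacobian of pixel coordinates, by central differences
    for point in inlier_points:
        row_u, row_v = [], []
        for k in range(6):
            plus, minus = list(pose), list(pose)
            plus[k] += eps
            minus[k] -= eps
            u_hi, v_hi = project(plus, point, focal_px, cx, cy)
            u_lo, v_lo = project(minus, point, focal_px, cx, cy)
            row_u.append((u_hi - u_lo) / (2.0 * eps))
            row_v.append((v_hi - v_lo) / (2.0 * eps))
        jac.extend([row_u, row_v])
    normal = [[sum(r[i] * r[j] for r in jac) for j in range(6)] for i in range(6)]
    inverse = invert(normal)
    return [[sigma_px ** 2 * inverse[i][j] for j in range(6)] for i in range(6)]


# Hypothetical inliers on a flat-earth plane, seen from roughly 500 m above.
points = [(x, y, 0.0) for x in (-200.0, 0.0, 200.0) for y in (-200.0, 0.0, 200.0)]
pose = [0.02, -0.01, 0.3, 5.0, -3.0, 500.0]
cov = pose_covariance(pose, points, 4000.0, 2736.0, 1824.0, sigma_px=1.0)
```

The result is expressed in the solver's own `(rvec, tvec)` parameterisation; mapping it to the frame and units that AC-NEW-4 expects is a separate step not shown here.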
|
||||
---
|
||||
File diff suppressed because one or more lines are too long
@@ -0,0 +1,329 @@
|
||||
# Solution Draft
|
||||
|
||||
> Mode A Phase 2 — engine Step 8 (Deliverable Formatting). Integrates all intermediate research artifacts into a single actionable architecture proposal.
>
> **Research Output Class**: Technical-component selection (per [`../00_research/00_question_decomposition.md`](../00_research/00_question_decomposition.md)).
>
> Backing artifacts (read these alongside this draft for full evidence):
> - Question decomposition + scope: [`../00_research/00_question_decomposition.md`](../00_research/00_question_decomposition.md)
> - Source registry: [`../00_research/01_source_registry/00_summary.md`](../00_research/01_source_registry/00_summary.md) (#1–#121)
> - Fact cards: [`../00_research/02_fact_cards/00_summary.md`](../00_research/02_fact_cards/00_summary.md) (#1–#101)
> - Comparison framework: [`../00_research/03_comparison_framework.md`](../00_research/03_comparison_framework.md)
> - Reasoning chain: [`../00_research/04_reasoning_chain.md`](../00_research/04_reasoning_chain.md)
> - Validation log: [`../00_research/05_validation_log.md`](../00_research/05_validation_log.md)
> - Component fit matrix: [`../00_research/06_component_fit_matrix/00_summary.md`](../00_research/06_component_fit_matrix/00_summary.md)
> - Cross-component gates: [`../00_research/06_component_fit_matrix/99_cross_component_gates.md`](../00_research/06_component_fit_matrix/99_cross_component_gates.md)
> - Project Constraint Matrix: [`../00_problem/problem.md`](../00_problem/problem.md), [`../00_problem/restrictions.md`](../00_problem/restrictions.md), [`../00_problem/acceptance_criteria.md`](../00_problem/acceptance_criteria.md)
>
> **Note on AC assessment** — Mode A Phase 1 (`00_ac_assessment.md` BLOCKING gate per the research SKILL.md) was not executed as a standalone artifact in this run. Per-AC binding evidence is instead distributed across the per-component fact cards and the Restrictions × Candidate-Modes sub-matrix sections in `06_component_fit_matrix/Cx_*.md`. This is acknowledged as a process deviation and is recoverable by extracting an `00_ac_assessment.md` summary file from the existing per-AC binding evidence on demand. No AC has been silently dropped or unverified.
|
||||
|
||||
---
|
||||
|
||||
## Product Solution Description
|
||||
|
||||
A Jetson-Orin-Nano-Super-hosted companion-PC system that produces a GPS-equivalent WGS84 position estimate (with honest 6×6 covariance) for a fixed-wing UAV operating in a GPS-denied or GPS-spoofed environment, by fusing pre-flight-cached satellite tile imagery (from the parent-suite Azaion Satellite Service) with live nav-camera frames and FC-supplied IMU + attitude.
|
||||
|
||||
The system implements the canonical hierarchical GPS-denied pipeline `retrieval → matching → pose → fusion` (per SQ2 surveys converging on this pattern, Sources #38–#42), runs on the pinned Jetson Orin Nano Super hardware (Source #105 hardware-tied constraints honored), and delivers the final pose to the FC via per-FC external-positioning interfaces — MAVLink `GPS_INPUT` for ArduPilot Plane (verified Source #4 + #106 + #107), MSP2 `MSP2_SENSOR_GPS` for iNav (verified Source #111 + #112 + #113). PX4 is explicitly out of scope per `restrictions.md`.
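
The per-FC unit conversion on the ArduPilot side can be sketched as follows. MAVLink `GPS_INPUT` carries latitude and longitude as integers in degrees × 1e7 and altitude and accuracies in metres. The helper below is a minimal illustration, not the project's C8 implementation: the function name is hypothetical, and deriving `horiz_accuracy` from the largest eigenvalue of the 2×2 horizontal covariance is an assumption about how the 6×6 posterior might be collapsed.

```python
import math


def to_gps_input_fields(lat_deg, lon_deg, alt_m, cov_east_north):
    """Scale a WGS84 fix into GPS_INPUT-style units (lat/lon as degE7 integers)."""
    (var_e, cov_en), (_, var_n) = cov_east_north
    # Largest eigenvalue of the 2x2 horizontal covariance gives a conservative
    # 1-sigma radius in metres (assumption: this is how accuracy is reported).
    mean = 0.5 * (var_e + var_n)
    spread = math.sqrt((0.5 * (var_e - var_n)) ** 2 + cov_en ** 2)
    return {
        "lat": int(round(lat_deg * 1e7)),
        "lon": int(round(lon_deg * 1e7)),
        "alt": float(alt_m),
        "horiz_accuracy": math.sqrt(mean + spread),
    }


# Hypothetical fix with 2 m east and 3 m north standard deviations.
fields = to_gps_input_fields(48.0, 37.5, 250.0, ((4.0, 0.0), (0.0, 9.0)))
```

The iNav `MSP2_SENSOR_GPS` path needs its own scaling table; it is not covered by this sketch.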
|
||||
|
||||
### Component-interaction diagram (pre-flight + runtime)
|
||||
|
||||
```
PRE-FLIGHT (operator-managed, on-Jetson) ─────────────────────────────────────────
  parent-suite Satellite Service ─→ tile cache (PostgreSQL btree + filesystem)
                                 ─→ C2 VPR backbone (TensorRT engine, INT8+FP16)
                                    └─→ per-tile descriptors → FAISS HNSW index
                                                               (.index file written
                                                                via faiss.write_index +
                                                                atomicwrites + SHA-256
                                                                content-hash gate)
  ONNX models (C2/C3/C1) ─→ Polygraphy / trtexec / IBuilderConfig hybrid
                            orchestration → TensorRT engines
                            (.engine files, SM 87 / JetPack 6.2 / TRT 10.3)

TAKEOFF LOAD (≤5 s) ──────────────────────────────────────────────────────────────
  FAISS read_index(IO_FLAG_MMAP_IFC) + content-hash verify → ready
  IRuntime.deserializeCudaEngine per-engine → ready

RUNTIME (3 Hz nav-camera, 100-200 Hz IMU; AC-4.1 <400 ms p95) ─────────────────────
  nav-camera frame ─→ C1 OKVIS2 VIO (relative pose, IMU bias)
                   ─→ C2 MixVPR query → top-K=3 satellite tile retrieval (~25 ms)
                   ─→ C3 DISK+LightGlue × K pairs (~90-180 ms FP16)
                   ─→ C4 OpenCV solvePnPRansac (~5-15 ms)
                      └─→ wrap in GTSAM Marginals
                          (~30-90 ms; 6×6 covariance)
  FC IMU + attitude ─→ C5 GTSAM iSAM2 + CombinedImuFactor + PriorFactorPose3
                       (~2-5 ms per update at D-C5-5=(c) factor density)
                       └─→ posterior 6×6 covariance via Marginals
                    ─→ C8 per-FC unit conversion
                       ├─→ pymavlink GPS_INPUT (AP)
                       └─→ MSP2_SENSOR_GPS (iNav)
                           (5 Hz periodic)
  total runtime: ~140-420 ms p95 at K=3 + adaptive LightGlue depth
```
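
The "atomic write + SHA-256 content-hash gate" step in the diagram can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's loader: it uses `os.replace` on a same-directory temp file in place of the `atomicwrites` package, the function names are hypothetical, and the payload stands in for a serialized FAISS index or TensorRT engine.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_atomically(path, payload):
    """Write to a temp file in the target directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return sha256_of_file(path)  # hash to record next to the artifact


def verify_before_load(path, expected_sha256):
    """Return True only if the on-disk artifact matches the recorded hash."""
    return sha256_of_file(path) == expected_sha256
```

At takeoff load, a caller would run `verify_before_load` first and refuse to map the index or deserialize the engine when it returns `False`.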
|
||||
|
||||
---
|
||||
|
||||
## Existing/Competitor Solutions Analysis
|
||||
|
||||
| System | Class | Stack signature | Relation to this project |
|---|---|---|---|
| **Twist Robotics OSCAR** (Source #25) | Deployed peer (Ukraine theater) | Visual navigation companion; closed-source | Closest peer system; deployed in theater the project will operate in. Confirms operational viability of the canonical pipeline shape. |
| **Auterion Artemis / Skynode N** (Sources #31+#32) | Commercial deployed (Ukraine-tested) | Skynode N + Visual Navigation; 1000-mile deep-strike demonstrated; closed-source proprietary stack | Demonstrates Jetson-class hardware can host GPS-denied companion at deployed-mission scale. Validates the pinned hardware target. |
| **NGPS (snktshrma/ngps_flight)** (Source #33) | Open-source (ArduPilot GSoC 2024) | LightGlue + SuperPoint + UKF + VISION_POSITION_ESTIMATE | Closest open-source pipeline-match. Confirms ArduPilot Plane + visual-localization companion is operationally validated. **License gap**: relies on Magic Leap-noncommercial canonical SP weights — same hard disqualifier this project hits in D-C3-1, mitigated by D-C3-1 = (a) DISK+LightGlue swap. |
| **Vantor Raptor** (Source #30) | Commercial deployed | GPS-denied UAV navigation + coordinate extraction | Validates dual-purpose pose + object-localization output. Aligns with project AC-7.x object-localization requirements. |
| **DARPA FLA (T&E review)** (Source #35) | Defense program lineage | GPS-denied autonomy with onboard compute | Provides T&E reference for AC-NEW-4 false-position safety budget validation methodology. |
| **DSMAC / TERCOM lineage** (Source #36) | Defense legacy | Digital Scene Matching Area Correlator + Terrain Contour Matching | Historical proof point that the project's "match against pre-cached imagery" core idea predates modern CV by decades; modern equivalents (this project) trade hand-engineered correlators for learned VPR + matchers. |
|
||||
|
||||
**Key delta vs existing systems**: this project (a) supports both ArduPilot Plane AND iNav (no other open-source GPS-denied companion targets iNav per SQ6 saturation), (b) enforces an explicit AC-NEW-7 cache-poisoning safety budget across the descriptor cache + tile cache + Suite Sat Service pipeline, (c) ships an honest 6×6 posterior covariance per AC-NEW-4 via a GTSAM-shared-substrate hybrid (D-C4-2 + D-C5-5 + D-C8-8 cross-component coupling).
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
The solution is decomposed into nine components (C1–C8 + C10; C9 was dropped in the SQ7/C9 restructure 2026-05-08 and deferred to Test Spec greenfield Step 5). Per-component candidate tables follow. **All "Selected" candidates have an MVE link in the Restrictions × Candidate-Modes sub-matrix sections** of [`../00_research/06_component_fit_matrix/Cx_*.md`](../00_research/06_component_fit_matrix/) per Step 7.5.3 decision rules.
|
||||
|
||||
### Component: C1 — Visual / Visual-Inertial Odometry
|
||||
|
||||
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **OKVIS2** (modern-competitive-lead) | C++ + ROS; smartroboticslab/okvis2 | Tightly-coupled VIO with optional stereo+IMU; for this project mono+IMU mode; outputs per-frame relative pose + IMU bias estimates | Best modern accuracy on cross-domain tracking; permissive (BSD) | C++ + ROS dependency; ~30-50 ms per frame (extrapolated to Jetson Orin Nano Super) | C++17, ROS Noetic optional, IMU at 100-200 Hz | BSD-3-Clause clean | ~1-2 wk integration | MVE: see [`../00_research/02_fact_cards/C1_vio.md`](../00_research/02_fact_cards/C1_vio.md); docs: Sources #47+#48+#56 | **Selected (modern-competitive-lead)** — preferred runtime path |
| **VINS-Mono** (mandatory simple-baseline) | C++ + ROS; HKUST-Aerial-Robotics/VINS-Mono | Mono+IMU tightly-coupled VIO | Stable since 2018; simplest baseline | Older accuracy; some Jetson port effort | C++17, ROS Noetic optional, IMU at 100-200 Hz | BSD permissive clean | ~3-5 days fallback | MVE: see fact card; docs: Sources #43+#55 | **Selected (mandatory simple-baseline)** — fallback if OKVIS2 fails Jetson MVE |
| **KLT+RANSAC** (homemade fallback) | OpenCV pure-Python | KLT optical flow + 5-point/homography RANSAC essential-matrix → pose decomposition | Pure OpenCV; no C++ dependency; pure-VO baseline | No IMU fusion (delegated to C5); ~5-10 ms per frame on Jetson | OpenCV 4.x; IMU bypassed | Apache-2.0 | ~3-5 days fallback | MVE: see fact card; docs: Source #53 | **Selected (project-internal homemade fallback)** — used when OKVIS2/VINS-Mono unavailable |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: AC-1.3 cumulative drift (visual-only / IMU-fused branches); AC-2.1a frame-to-frame registration; AC-3.1 outlier tolerance; AC-3.2 sharp-turn behavior; AC-4.1 + AC-4.2 latency + memory.
- Evidence: `02_fact_cards/C1_vio.md`; Sources #43+#47+#48+#53+#55+#56.
- Disqualifiers: VINS-Fusion + OpenVINS GPL-3.0 contingent on D-C1-1 = (a) track.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C1_vio.md`](../00_research/06_component_fit_matrix/C1_vio.md) per-candidate sections.
- API capability gates: ✅ MVE saved.
|
||||
|
||||
### Component: C2 — Visual Place Recognition
|
||||
|
||||
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **MixVPR** (mandatory simple-baseline, BSD/permissive track) | PyTorch; amaralibey/MixVPR | ResNet50 backbone + MLP-Mixer aggregator; output dimension 2048-D float32 (or 512-D / 256-D `cropToDim` per D-C2-9 / D-C2-10 / D-C6-1 = halfvec); input 320×320 | MIT throughout; modest descriptor budget (~6.5% of AC-8.3 cache); active maintenance | Street-view-pretrained — D-C2-1 retrain on aerial corpus required | PyTorch 2.x; ONNX export verified | MIT clean | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see [`../00_research/02_fact_cards/C2_vpr.md`](../00_research/02_fact_cards/C2_vpr.md); docs: Sources #57+#58+#61 | **Selected (mandatory simple-baseline + recommended primary on BSD/permissive track)** |
| **SALAD** (modern-competitive-lead, GPL-3.0 track) | PyTorch; serizba/salad | DINOv2 ViT-B + optimal-transport aggregator; output 8448-D / 2112-D / 544-D per D-C2-6; input 322×322 | +5-7 R@1 over MixVPR-2048 on MSLS Challenge | GPL-3.0; D-C2-5 ViT export risk; descriptor budget at full size 27% of AC-8.3 | PyTorch 2.x; DINOv2 ViT-B export | GPL-3.0 contingent | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see fact card; docs: Sources #59+#60 | **Selected (modern-competitive-lead) on GPL-3.0 track** — eligible only if D-C1-1 = (a) or (c) |
| **EigenPlaces** (BSD/permissive sibling) | PyTorch; gmberton/EigenPlaces | ResNet-50 + GeM + FC viewpoint-robust training; 2048-D / 512-D / 256-D / 128-D per D-C2-10 | MIT throughout; viewpoint-robust training paradigm; eleven sibling modes | Older approach (2023); modest accuracy lift over MixVPR | PyTorch 2.x | MIT clean | ~3-5 days | MVE: see fact card; docs: Sources #67+#68 | **Selected (BSD/permissive sibling)** — alternate primary on BSD/permissive track |
| **SelaVPR** (BSD/permissive two-stage sibling) | PyTorch; Lu-Feng/SelaVPR | DINOv2 ViT-L two-stage (global + local); 1024-D global + on-demand local features | MIT; lift from two-stage; 1024-D smallest single-stage cache | DINOv2 ViT-L is 3.5× larger than ViT-B; D-C2-5 + D-C2-7 re-rank gates | PyTorch 2.x; DINOv2 ViT-L export | MIT clean | ~3-5 days base + ~1-2 wk D-C2-1 retrain | MVE: see fact card; docs: Sources #62+#63 | **Selected (modern-competitive-lead BSD/permissive two-stage)** — eligible if D-C2-7 re-rank strategy chosen |
| **NetVLAD** (mandatory baseline, BSD/permissive track) | PyTorch port; Relja/netvlad canonical | VGG16 + soft-assignment-VLAD; 4096-D / 512-D / 256-D PCA-whitened per D-C2-9 | MIT canonical; classical-baseline; widely-cited | Largest single-stage descriptor cache at canonical 4096-D; D-C2-8 PyTorch-port-strategy gate | PyTorch port required from canonical MATLAB | MIT canonical (Nanne port has license-uncertainty per D-C2-8) | ~1 wk re-port from canonical OR ~3 days Nanne port + license-clearance | MVE: see fact card; docs: Sources #64+#65+#66 | **Selected (mandatory simple-baseline)** — classical reference; D-C2-8 = (b) re-port from canonical recommended |
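
The descriptor-budget figures in the table can be cross-checked with simple arithmetic. This is a minimal sketch, assuming the same tile count and float32 storage for every candidate, so that cache share scales linearly with descriptor dimension. The helper names are hypothetical; the 6.5% baseline is the MixVPR figure quoted in the table.

```python
def descriptor_bytes(dimensions, bytes_per_value=4):
    """Size of one per-tile descriptor (float32 by default)."""
    return dimensions * bytes_per_value


def relative_cache_share(dimensions, baseline_dimensions, baseline_share_percent):
    """Scale a known cache share linearly with descriptor dimension."""
    return baseline_share_percent * dimensions / baseline_dimensions


mixvpr_bytes = descriptor_bytes(2048)                 # 2048-D float32
mixvpr_half_bytes = descriptor_bytes(512, 2)          # 512-D stored as 16-bit halfvec
salad_share = relative_cache_share(8448, 2048, 6.5)   # SALAD at full 8448-D
```

Under these assumptions the full-size SALAD descriptor lands at about 26.8% of the AC-8.3 cache, in line with the 27% quoted in its row.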
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: AC-2.1b satellite-anchor registration; AC-2.2 cross-domain MRE; AC-8.3 cache budget; AC-8.6 retrieval robustness; AC-4.1 latency.
- Evidence: `02_fact_cards/C2_vpr.md`; Sources #57–#68.
- Disqualifiers: SALAD GPL-3.0 contingent on D-C1-1 = (a); conditional candidates (AnyLoc/BoQ/DINOv2-VLAD) pending D-C2-5 INT8 quantization survey prerequisite.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C2_vpr.md`](../00_research/06_component_fit_matrix/C2_vpr.md).
- API capability gates: ✅ MVE saved for all 5 mandatory pre-screen candidates.
|
||||
|
||||
### Component: C3 — Cross-domain matchers
|
||||
|
||||
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **DISK+LightGlue** (recommended-primary-mitigation) | PyTorch + LightGlue ONNX; cvlab-epfl/disk + cvg/LightGlue | DISK `save-depth.pth` canonical default + LightGlue+DISK paired matcher; FP16 only (D-C7-6 matchers→FP16-only per-family policy); K=3-5 retrieval pairs per UAV frame per D-C3-3 | +7.99 absolute AUC@5° over canonical SP+LightGlue (LightGlue paper Tab 6); Apache-2.0 throughout; clean license | ~30-60 ms per pair on Jetson FP16 — TIGHT at K=10; D-C3-3 mitigation via K=3-5 + adaptive depth | PyTorch 2.x; LightGlue ONNX export verified | Apache-2.0 throughout | ~1 wk ONNX export + ~1-2 wk D-C2-1 retrain | MVE: see [`../00_research/02_fact_cards/C3_matchers.md`](../00_research/02_fact_cards/C3_matchers.md); docs: Sources #69+#70+#71+#76+#77 | **Selected (recommended-primary-mitigation)** — replaces canonical SP+LightGlue (Magic Leap noncommercial HARD DISQUALIFIER per D-C3-1) |
| **ALIKED+LightGlue** (modern-competitive-lead-secondary) | PyTorch (ALIKED export-absence in LightGlue-ONNX → PyTorch fp16-only); Shiaoming/ALIKED + cvg/LightGlue | ALIKED-N(16rot) 128-D rotation-augmented (D-C3-4) + LightGlue+ALIKED paired matcher; PyTorch fp16 | Rotation-augmented for multi-heading aerial flights; Apache-2.0 | PyTorch fp16-only forces D-C3-3 K reduction more than DISK | PyTorch 2.x | Apache-2.0 + BSD-3-Clause | ~1 wk swap from DISK | MVE: see fact card; docs: Sources #74+#75 | **Selected (modern-competitive-lead-secondary)** |
| **XFeat / XFeat\* / XFeat+LighterGlue** (alternate-modern-competitive-lead) | PyTorch; verlab/accelerated_features | Recommended XFeat\* semi-dense with MNN+MLP-offset-refinement per D-C3-6 = (b); cheapest C3 retrain (~36h on RTX 4090) | 4× more inliers per pair via lightweight MLP refinement; cheapest retrain | Documentary AUC@5° below LightGlue siblings | PyTorch 2.x | Apache-2.0 throughout | ~3-5 days base + ~36h retrain | MVE: see fact card; docs: Sources #80+#81 | **Selected (alternate-modern-competitive-lead)** — lowest engineering complexity path |
| **SuperGlue + SuperPoint canonical** (deprecated by LightGlue authors) | magicleap/SuperGluePretrainedNetwork | n/a — canonical SP weights | Reference baseline | **Magic Leap noncommercial license = HARD DISQUALIFIER** in dual-use deployment context | n/a | n/a | n/a | docs: Sources #78+#79 | **Rejected** (D-C3-1 hard disqualifier) |
**Exact-fit evidence**:
- Project constraints checked: AC-2.1b satellite-anchor registration; AC-2.2 cross-domain MRE <2.5 px; AC-3.3 disconnected-segment recovery; AC-4.1 latency budget.
- Evidence: `02_fact_cards/C3_matchers.md`; Sources #69–#81.
- Disqualifiers: Magic Leap noncommercial on canonical SP weights (HARD DISQUALIFIER); MASt3R CC-BY-NC; RoMa / DKM / LoFTR not selected at this batch.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C3_matchers.md`](../00_research/06_component_fit_matrix/C3_matchers.md).
- API capability gates: ✅ MVE saved for selected candidates; canonical SP rejected before API verification.
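The K trade-off behind D-C3-3 is plain arithmetic on the per-pair latency quoted in the table. A rough sketch, assuming the K retrieval pairs are matched sequentially and that the matcher alone is compared against the 400 ms p95 end-to-end figure of AC-4.1 (all other pipeline stages are ignored here, so the real headroom is smaller):

```python
# Per-pair DISK+LightGlue latency range quoted in the C3 table (Jetson, FP16).
PAIR_MS = (30.0, 60.0)
# AC-4.1 end-to-end p95 budget; comparing the matcher alone against it is an
# assumption of this sketch, not a project rule.
BUDGET_MS = 400.0


def matcher_latency_ms(k: int) -> tuple[float, float]:
    """Best-case and worst-case matcher time for k sequential retrieval pairs."""
    return (k * PAIR_MS[0], k * PAIR_MS[1])


for k in (3, 5, 10):
    best, worst = matcher_latency_ms(k)
    verdict = "fits" if worst < BUDGET_MS else "can exceed the budget"
    print(f"K={k:2d}: {best:.0f}-{worst:.0f} ms ({verdict})")
```

Under these assumptions K=3 and K=5 stay below the budget even at the slow end of the range, while K=10 only fits at the fast end, which is consistent with the table calling K=10 tight.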
### Component: C4 — Pose estimation (PnP+RANSAC+LM)
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **OpenCV `cv::solvePnPRansac`** (mandatory simple-baseline) wrapped in **GTSAM `Marginals`** (D-C4-2 = (b) covariance recovery) | OpenCV 4.x calib3d + GTSAM Python | `solvePnPRansac(objectPoints, imagePoints, K, dist, ..., flags=SOLVEPNP_IPPE)` (planar-scene IPPE per D-C4-1 = (b) 4-DoF flat-earth); wrap result in GTSAM `BetweenFactor<Pose3>` prior + per-inlier `GenericProjectionFactorCal3_S2` factors → `LevenbergMarquardtOptimizer` → `Marginals.marginalCovariance(pose_key)` 6×6 | OpenCV simplest-baseline + 7 USAC RANSAC variants; GTSAM provides NATIVE 6×6 covariance recovery; couples C4 + C5 via shared GTSAM substrate per D-C5-5 = (c) | GTSAM `Marginals` ~30-90 ms per pose recovery (Plan-phase Jetson MVE confirms tail); OpenCV alone has no covariance API per Source #83 signature | OpenCV 4.x; GTSAM Python | Apache-2.0 + BSD-3-Clause | ~3-5 days OpenCV + ~3-5 days GTSAM wrapper | MVE: see [`../00_research/02_fact_cards/C4_pose_estimation.md`](../00_research/02_fact_cards/C4_pose_estimation.md); docs: Sources #82+#83+#86+#87 | **Selected (mandatory simple-baseline + recommended-primary covariance recovery via GTSAM)** |
| **OpenGV** (modern-competitive-lead-richer-minimal-solver) | C++ + Python bindings; laurentkneip/opengv | `absolute_pose::optimize_nonlinear` per D-C4-2 = (d); algorithm-selectable RANSAC enums (KNEIP/GAO/EPNP/GP3P) | Richer minimal-solver coverage than OpenCV; 2 P3P variants; UPnP global-optimal; GP3P generalized-camera | NOASSERTION SPDX (D-C4-3 license-clearance gate); ~3 yr stale (D-C4-4 maintenance gate); no native planar-scene solver vs OpenCV's IPPE | C++17; Eigen-3.4+ | BSD-3-Clause-equivalent NOASSERTION pending counsel review | ~1-2 wk fork-and-patch (D-C4-4 = (b)) | MVE: see fact card; docs: Sources #84+#85 | **Selected with runtime gate** — secondary path conditional on D-C4-3 + D-C4-4 closures |
**Exact-fit evidence**:
- Project constraints checked: AC-1.1/1.2 frame-center accuracy; AC-2.2 reprojection error <2.5 px cross-domain; AC-NEW-4 covariance honesty (P(error >500 m) <0.1 %); AC-4.1 latency.
- Evidence: `02_fact_cards/C4_pose_estimation.md`; Sources #82–#87.
- Disqualifiers: none in selected candidates; OpenGV NOASSERTION gated as `Selected with runtime gate` per Step 7.5.3 carve-out for license-clearance.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C4_pose_estimation.md`](../00_research/06_component_fit_matrix/C4_pose_estimation.md).
- API capability gates: ✅ MVE saved.
### Component: C5 — State estimator / sensor fusion
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **Manual ESKF (Solà 2017)** (mandatory simple-baseline) | NumPy/SciPy project-side implementation | Quaternion-correct ESKF on SO(3); analytic Jacobian per Solà §6; ~5-15 ms per update on Jetson CPU | Trivial dependency footprint (~kilobytes of code); fastest C5 candidate; native 6×6 covariance via analytic Jacobian propagation | No look-back refinement (forward-time-only Kalman update); D-C5-2 long-cruise observability strategy required | NumPy 1.x + SciPy + Solà 2017 paper as canonical reference | Public-domain canonical equations + project Apache-2.0 implementation | ~1-2 wk D-C5-1 = (b) re-implement from paper directly | MVE: see [`../00_research/02_fact_cards/C5_state_estimator.md`](../00_research/02_fact_cards/C5_state_estimator.md); docs: Sources #88+#89 | **Selected (mandatory simple-baseline)** — always-running fallback |
| **GTSAM iSAM2 + CombinedImuFactor + smart factors + Marginals + IncrementalFixedLagSmoother** (modern-competitive-lead-factor-graph) | GTSAM Python; borglab/gtsam | iSAM2 incremental smoothing + `CombinedImuFactor` 6-key per-keyframe-pair factor with bias evolution + `BetweenFactorPose3` + `GenericProjectionFactorCal3DS2` per D-C5-5 = (c) `PriorFactorPose3` only + `gtsam_unstable.IncrementalFixedLagSmoother` K=10-20 keyframes per D-C5-3 | NATIVE 6×6 posterior covariance via `Marginals`; NATIVE AC-4.5 look-back refinement; couples C4 + C5 via shared GTSAM substrate per D-C5-5 = (c) | GTSAM ~50-200 MB footprint; per-update latency ~5-100 ms depending on factor density (D-C5-5 = (c) gives ~2-5 ms) | GTSAM Python; daily-active maintenance | BSD-3-Clause clean | ~2-3 wk full factor-graph design | MVE: see fact card; docs: Sources #90+#91 | **Selected (modern-competitive-lead-factor-graph + recommended primary path)** — couples NATIVELY with C4 GTSAM Marginals via D-C5-5 = (c) |
**Exact-fit evidence**:
- Project constraints checked: AC-1.3 cumulative drift; AC-1.4 95% covariance ellipse + source label; AC-3.5 visual blackout + spoofed GPS dead-reckon; AC-4.1 + AC-4.5 latency + look-back refinement; AC-NEW-4 covariance honesty; AC-NEW-8 visual blackout failsafe.
- Evidence: `02_fact_cards/C5_state_estimator.md`; Sources #88–#91.
- Disqualifiers: D-C5-1 reference-implementation-license-verification gates `ludvigls/ESKF` and `cggos/imu_x_fusion` (mitigation = D-C5-1 = (b) re-implement from canonical Solà 2017 paper).
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C5_state_estimator.md`](../00_research/06_component_fit_matrix/C5_state_estimator.md).
- API capability gates: ✅ MVE saved.
### Component: C6 — Tile cache + spatial index
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **Mirror-of-suite-`satellite-provider` pattern** (recommended primary) | PostgreSQL btree composite + bytea + FAISS HNSW + filesystem | btree composite index on `(tile_zoom, tile_x, tile_y, version)` for slippy-map spatial-grid range queries; `bytea` descriptor blobs (halfvec per D-C6-1); `IndexHNSWFlat(d, M=32)` per D-C6-2 loaded at takeoff via `faiss.read_index(path, IO_FLAG_MMAP_IFC)`; tile storage at `./tiles/{zoom}/{x}/{y}.{image_type}` slippy-map convention | Mirrors verified-existing parent-suite pattern (Source #92 filesystem read); ~6-54 ms per cache hit within AC-4.1; ~700 MB-1.5 GB total memory within AC-4.2; trivial dependency footprint (no Postgres extensions) | Halfvec descriptor storage requires app-side conversion; sector classification heuristic deferred to Plan-phase | PostgreSQL 16 + Dapper or psycopg2 + FAISS Python | PostgreSQL License + MIT clean | ~3-5 days mirror integration | MVE: see [`../00_research/02_fact_cards/C6_tile_cache_spatial_index.md`](../00_research/02_fact_cards/C6_tile_cache_spatial_index.md); docs: Sources #92+#96+#97+#98 | **Selected (recommended primary)** — leverages existing verified-suite pattern |
| **PostGIS + pgvector** (deferred secondary) | PostgreSQL + PostGIS 3.4 + pgvector 0.7+ | GiST on `geography(POINT,4326)` with KNN distance ordering (`<->`); pgvector HNSW for descriptor ANN; same filesystem tile storage | Native KNN + radius queries; combined-SQL capabilities | 5-10× slower geographic lookup at 3 Hz query rate per Sources #93 + #97; PostGIS GPL-2.0-or-later (CONTINGENT REJECT under D-C1-1 = (b)); +50-100 MB Jetson memory + 50-200 MB disk install; D-C6-5 Jetson PostGIS+pgvector co-installation Plan-phase verification gate | PostgreSQL 16 + PostGIS 3.4 + pgvector 0.7+ | GPL-2.0-or-later via PostGIS contingent | ~1-2 wk + Plan-phase Jetson MVE | MVE: see fact card; docs: Sources #94+#95 | **Deferred secondary** — comparative-improvement verdict does NOT clear user's "significant-improvement-only" bar |
**Exact-fit evidence**:
- Project constraints checked: AC-3.3 disconnected-segment retrieval; AC-4.1 latency; AC-4.2 memory; AC-8.1 cache-interface resolution; AC-8.2 freshness; AC-8.3 cache budget; AC-8.6 satellite-anchor relocalization robustness.
- Evidence: `02_fact_cards/C6_tile_cache_spatial_index.md`; Sources #92–#98.
- Disqualifiers: PostGIS GPL-2.0-or-later contingent on D-C1-1 = (a) license track.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C6_tile_cache_spatial_index.md`](../00_research/06_component_fit_matrix/C6_tile_cache_spatial_index.md).
- API capability gates: ✅ MVE saved for selected primary; deferred secondary has API capability evidence saved but is not active.
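The `(tile_zoom, tile_x, tile_y)` key and the `./tiles/{zoom}/{x}/{y}.{image_type}` layout both rest on the standard Web-Mercator slippy-map indexing. A minimal sketch of that mapping, assuming WGS-84 input in degrees; the function names and the `jpg` default are hypothetical and not taken from the `satellite-provider` code:

```python
import math
from pathlib import Path


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Standard slippy-map (Web-Mercator) tile indices for a WGS-84 point."""
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = math.floor((lon_deg + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 and latitudes near the Mercator limit stay in the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(root: Path, zoom: int, x: int, y: int, image_type: str = "jpg") -> Path:
    """File path following the ./tiles/{zoom}/{x}/{y}.{image_type} convention."""
    return root / str(zoom) / str(x) / f"{y}.{image_type}"


if __name__ == "__main__":
    x, y = latlon_to_tile(50.0, 36.0, 18)
    print(tile_path(Path("./tiles"), 18, x, y))
```

Because neighbouring tiles differ by one in `tile_x` or `tile_y`, a range scan on the btree composite index covers a rectangular window of tiles without any geometry extension.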
### Component: C7 — On-Jetson inference runtime
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **TensorRT native** (recommended primary) | TensorRT 10.3 bundled with JetPack 6.2 | `IInt8EntropyCalibrator2` + `BuilderFlag.FP16+INT8` mixed-precision per D-C7-2 = (b) per-family ladder per D-C7-6; engines built directly on Jetson Orin Nano Super SM 87 per D-C7-7 = (c); `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)` (1 GiB; the older `max_workspace_size` attribute is not available in TRT 10.x) per D-C7-8 | Apache-2.0 in TRT 10.x; ships with JetPack so zero-effort install; lowest-latency primary path; 2-3× speedup at INT8 vs FP16 per Source #102 | Engines hardware-tied to SM 87 (Source #105) — must be built per-target via D-C10-5..D-C10-8 orchestration | JetPack 6.2 + CUDA 12.6 + cuDNN 9.3 + TRT 10.3 | Apache-2.0 throughout | ~1 wk first-model + ~1 day each subsequent | MVE: see [`../00_research/02_fact_cards/C7_inference_runtime.md`](../00_research/02_fact_cards/C7_inference_runtime.md); docs: Sources #99+#104+#105 | **Selected (recommended primary)** — lowest-latency runtime path |
| **ONNX Runtime + TensorRT EP** (modern-competitive-lead-cross-architecture-portability) | onnxruntime-gpu via Jetson AI Lab JP6/CU126 wheel index | `TensorrtExecutionProvider` config + automatic CUDA EP / CPU EP subgraph fallback | MIT throughout; cross-architecture portability for replay/SITL on x86 dev hosts | `pip install onnxruntime-gpu` does not work on Jetson (D-C7-3 mitigation = mirror Jetson AI Lab wheel index); `numpy<2.0.0` pin per D-C7-4 | Jetson AI Lab community wheels | MIT throughout | ~1 wk Jetson AI Lab wheel mgmt | MVE: see fact card; docs: Sources #100+#103 | **Selected (modern-competitive-lead-cross-architecture-portability)** — secondary for replay/SITL only |
| **Pure PyTorch FP16** (mandatory simple-baseline + reference-correctness oracle) | torch.amp.autocast + model.half() + Jetson AI Lab PyTorch 2.5 ARM64 wheel | FP16 across all models; no quantization | BSD-3-Clause; zero-conversion regression baseline; reference-correctness oracle for accuracy validation of TRT-built engines | Standard `pip install torch` lacks CUDA on Jetson — needs Jetson AI Lab wheel via D-C7-5 = (a) PyTorch 2.5 + torchvision 0.20 | Jetson AI Lab PyTorch 2.5 wheel | BSD-3-Clause throughout | ~3-5 days base | MVE: see fact card; docs: Source #101 | **Selected (mandatory simple-baseline)** — accuracy-validation oracle only, not runtime path |
**Exact-fit evidence**:
- Project constraints checked: AC-4.1 latency; AC-4.2 memory; AC-NEW-3 INT8 calibration cache provenance for FDR; AC-NEW-5 thermal envelope (Jetson runs at 25 W TDP).
- Evidence: `02_fact_cards/C7_inference_runtime.md`; Sources #99–#105.
- Disqualifiers: Triton/DeepStream/CUDA-Python custom kernels considered-and-rejected (server/video-pipeline class, out-of-budget for embedded 8 h mission).
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C7_inference_runtime.md`](../00_research/06_component_fit_matrix/C7_inference_runtime.md).
- API capability gates: ✅ MVE saved for all 3 candidates per per-family roles.
### Component: C8 — MAVLink / MSP2 FC adapter
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **pymavlink → MAVLink `GPS_INPUT`** (recommended-primary for ArduPilot Plane) | ardupilot/pymavlink | `master.mav.gps_input_send(time_usec, gps_id, ignore_flags, time_week_ms, time_week, fix_type, lat, lon, alt, hdop, vdop, vn, ve, vd, speed_accuracy, horiz_accuracy, vert_accuracy, satellites_visible, yaw)` 5 Hz periodic per D-C8-5 over UART/USB/UDP per D-C8-1; FC-side `GPS1_TYPE=14` MAVLink + `EK3_SRC1_POSXY=3` GPS source-set; per-FC unit conversion `horiz_accuracy` (m) per D-C8-8 = (b) | Cooperative-path; FC-side ingestion via `AP_GPS_MAV` (verified Source #4); LGPL-3.0 linkable from Apache-2.0 app per LGPL §6 (D-C8-3 mitigation) | LGPL-3.0 license-posture verification (D-C8-3 mitigation = bundle unmodified) | pymavlink + ArduPilot Plane firmware (any) | LGPL-3.0 linkable | ~3-5 days | MVE: see [`../00_research/02_fact_cards/C8_fc_adapter.md`](../00_research/02_fact_cards/C8_fc_adapter.md); docs: Sources #106+#107 | **Selected (recommended-primary)** for ArduPilot Plane |
| **MSP2_SENSOR_GPS via Python MSP V2** (recommended-primary for iNav) | YAMSPy + INAV-Toolkit `msp_v2_encode` | `MSP2_SENSOR_GPS` (id 7939 / 0x1F03) 36-byte payload at 5 Hz periodic per D-C8-5; `mspGPSReceiveNewData()` direct passthrough on iNav side; per-FC unit conversion `hPosAccuracy` (mm) per D-C8-8 = (b) | YAMSPy + INAV-Toolkit MIT throughout; covariance fields aligned (`hPosAccuracy`/`vPosAccuracy`/`hVelAccuracy`); `USE_GPS_PROTO_MSP` enabled by default in iNav target/common.h | D-C8-4 implementation choice gate (YAMSPy primary + thin custom encoder fallback) | YAMSPy or INAV-Toolkit; iNav firmware 8.0+ | MIT throughout | ~3-5 days | MVE: see fact card; docs: Sources #111+#112+#113 | **Selected (recommended-primary)** for iNav |
| **UBX impersonation via pyubx2 NAV-PVT** (deferred secondary for iNav) | semuconsulting/pyubx2 | NAV-PVT periodic + NAV-VER startup + CFG-MSG/CFG-RATE ACK; iNav-side `gpsMapFixType()` validation gate requires `flags & 0x01 = 1` AND `fixType ∈ {2,3}` per Source #110 | BSD-3-Clause clean; richer protocol surface | Forgery posture; D-C8-7 AC-NEW-7 audit-trail verification gate | pyubx2; iNav firmware (any) | BSD-3-Clause | ~1-2 wk + audit-trail design | MVE: see fact card; docs: Sources #108+#109+#110 | **Deferred secondary** — comparative-improvement verdict does NOT clear user's "significant-improvement-only" bar over MSP2 |
**Exact-fit evidence**:
- Project constraints checked: AC-4.3 per-FC external-positioning interface; AC-NEW-2 spoofing-promotion latency; AC-NEW-4 covariance honesty (per-FC unit conversion); AC-NEW-7 forgery posture for UBX path.
- Evidence: `02_fact_cards/C8_fc_adapter.md`; Sources #106–#113.
- Disqualifiers: PX4 explicitly out of scope per `restrictions.md`; pymavlink LGPL-3.0 mitigated via bundle-unmodified pattern (D-C8-3).
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C8_fc_adapter.md`](../00_research/06_component_fit_matrix/C8_fc_adapter.md).
- API capability gates: ✅ MVE saved for all 3 candidates.
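For the "thin custom encoder fallback" named under D-C8-4, the MSP V2 framing itself is small. The sketch below shows only the envelope (header, flag, function id, size, CRC-8/DVB-S2) around an opaque payload. It is an illustration, not the YAMSPy or INAV-Toolkit implementation, and the field layout of the `MSP2_SENSOR_GPS` payload is deliberately not reproduced here: it has to come from the iNav message definition.

```python
import struct

MSP2_SENSOR_GPS = 0x1F03  # function id 7939, as quoted in the C8 table


def crc8_dvb_s2(data: bytes, crc: int = 0) -> int:
    """CRC-8/DVB-S2 (polynomial 0xD5), the checksum used by MSP V2 frames."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ 0xD5) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def msp_v2_frame(function_id: int, payload: bytes, flag: int = 0) -> bytes:
    """Wrap a payload in an MSP V2 frame addressed to the flight controller."""
    # flag (u8), function id (u16 LE) and payload size (u16 LE) precede the
    # payload; the checksum covers everything after the "$X<" header.
    body = struct.pack("<BHH", flag, function_id, len(payload)) + payload
    return b"$X<" + body + bytes([crc8_dvb_s2(body)])


if __name__ == "__main__":
    # Placeholder payload: 36 zero bytes, matching the size quoted in the table.
    frame = msp_v2_frame(MSP2_SENSOR_GPS, bytes(36))
    print(len(frame), frame[:8].hex())
```

Keeping the envelope separate from the payload packing means the same function can be reused if the payload layout changes between iNav releases.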
### Component: C10 — Pre-flight cache provisioning + sector classification + freshness pipeline
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|------------|-------------|--------------|----------|------|-------------------------|-----|
| **D-C6-3 confirmation: descriptor-cache rebuild trigger pipeline** | FAISS Python API + python-atomicwrites + SHA-256 content-hash | Manifest-hash-driven rebuild trigger per D-C10-1 = (b); `python-atomicwrites` write-temp + `fsync` + atomic rename + parent-dir fsync per D-C10-2 = (b); SHA-256 content-hash gate at takeoff + reject + STATUSTEXT + refuse takeoff if mismatch per D-C10-3 = (b); mmap with `madvise(MADV_WILLNEED)` pre-fault per D-C10-4 = (b) | FAISS MIT + atomicwrites MIT throughout; idempotent + crash-safe + AC-NEW-7-compliant; minimal abstraction surface | FAISS warns "no internal integrity check, expects validated input" — MITIGATED by content-hash gate at takeoff | FAISS Python + python-atomicwrites + SHA-256 stdlib | MIT throughout | ~1 wk orchestration wrapper | MVE: see [`../00_research/02_fact_cards/C10_preflight_provisioning.md`](../00_research/02_fact_cards/C10_preflight_provisioning.md); docs: Sources #114+#115+#116 | **Selected** — closes D-C6-3 cross-component gate |
| **D-C7-7 confirmation: TensorRT engine-build pipeline** | Polygraphy CLI + trtexec + direct `IBuilderConfig` Python API | Hybrid orchestration per D-C10-5 = (d): Polygraphy CLI primary for INT8-calibrating builds (`polygraphy convert --int8 --fp16 --data-loader-script ./calib_data_loader.py --calibration-cache <path> -o <engine>`) + trtexec for cache-reuse fast rebuilds + direct `IBuilderConfig` Python API escape hatch for unusual models (LightGlue dynamic shapes); calibration cache reuse keyed by `SHA-256(calib_corpus)` per D-C10-6; self-describing filename `<model>_sm87_jp62_trt103_<precision>.engine` per D-C10-7; reference Jetson at HQ + deployed-Jetson-copy-to-archive on first successful local build per D-C10-8 | Polygraphy + TRT 10.x Apache-2.0 throughout; calibration-cache reuse keeps subsequent rebuilds <30 sec; production-mature NVIDIA-blessed orchestration | `trtexec --int8` without `--calib` random-data-fallback caveat — MITIGATED by project-side wrapper enforcing `--calib=<existing_cache>` non-empty as precondition | TensorRT 10.3 + Polygraphy + JetPack 6.2 | Apache-2.0 throughout | ~1 wk first-model + ~1 day each subsequent | MVE: see fact card; docs: Sources #117+#118+#119+#120+#121 | **Selected** — closes D-C7-7 cross-component gate |
**Exact-fit evidence**:
- Project constraints checked: AC-NEW-7 cache-poisoning safety budget (descriptor cache + TensorRT engine path); AC-8.3 cache budget; AC-NEW-1 cold-start TTFF (takeoff load <5 s); restrictions.md rebuild-while-not-flying constraint.
- Evidence: `02_fact_cards/C10_preflight_provisioning.md`; Sources #114–#121.
- Disqualifiers: none — both candidates Apache-2.0/MIT clean.
- Restrictions × Candidate-Modes sub-matrix: see [`../00_research/06_component_fit_matrix/C10_preflight_provisioning.md`](../00_research/06_component_fit_matrix/C10_preflight_provisioning.md).
- API capability gates: ✅ MVE saved for both sub-areas.
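The write path (D-C10-2) and the takeoff gate (D-C10-3) can be illustrated with the standard library alone. This is a sketch of the pattern, not the project wrapper: the table pins `python-atomicwrites` for the write step, whereas the version below spells out the same write-temp, `fsync`, atomic rename, parent-directory `fsync` sequence by hand, and the function names are hypothetical. The directory `fsync` assumes a POSIX filesystem.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str, chunk: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, streamed so a large index is never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def atomic_write(path: str, data: bytes) -> str:
    """Write-temp + fsync + atomic rename + parent-dir fsync; returns SHA-256."""
    parent = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    dir_fd = os.open(parent, os.O_RDONLY)  # POSIX: persist the directory entry
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return hashlib.sha256(data).hexdigest()


def takeoff_gate(path: str, expected_sha256: str) -> bool:
    """True only if the cache file still matches the hash recorded at write time."""
    return os.path.exists(path) and sha256_file(path) == expected_sha256
```

In the real gate a `False` result would additionally raise the STATUSTEXT and refuse takeoff; here the caller decides what to do with the boolean.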
### Out-of-research-scope items (deferred to Plan-phase)
Per the C10 scope restructure of 2026-05-08 (`c10_scope=C` cross-coupling minimal), the following are deferred to Plan-phase as `operator tooling design`:
- Operator-side CLI/desktop tool design (Plan-phase architect + UX)
- Sector classification (active-conflict vs stable rear) heuristics + interface (Plan-phase architect + operations team)
- Tile age-stamping schema beyond restrictions.md mandate (Plan-phase architect)
- Freshness pipeline workflow (Plan-phase architect + operations team)
Their cross-coupling with the runtime architecture is mediated entirely by the descriptor-cache file (D-C6-3 closure) and the TensorRT engine cache file (D-C7-7 closure) — both pinned by C10 batch 1 confirmations.
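For the engine cache file, the D-C10-7 self-describing filename `<model>_sm87_jp62_trt103_<precision>.engine` can be built and parsed mechanically. A minimal sketch, assuming lowercase alphanumeric precision tokens such as `fp16` or `int8` (the exact token spelling, the `EngineTag` type and both function names are assumptions of this example):

```python
import re
from typing import NamedTuple


class EngineTag(NamedTuple):
    model: str      # model name, may itself contain underscores
    sm: str         # CUDA compute capability, e.g. "87"
    jetpack: str    # JetPack version digits, e.g. "62"
    trt: str        # TensorRT version digits, e.g. "103"
    precision: str  # precision token, e.g. "fp16"


_ENGINE_RE = re.compile(
    r"^(?P<model>.+)_sm(?P<sm>\d+)_jp(?P<jetpack>\d+)_trt(?P<trt>\d+)"
    r"_(?P<precision>[a-z0-9]+)\.engine$"
)


def engine_filename(tag: EngineTag) -> str:
    """Build the self-describing engine filename for a build tuple."""
    return (
        f"{tag.model}_sm{tag.sm}_jp{tag.jetpack}_trt{tag.trt}"
        f"_{tag.precision}.engine"
    )


def parse_engine_filename(name: str) -> EngineTag:
    """Recover the build tuple; raises ValueError on a non-conforming name."""
    match = _ENGINE_RE.match(name)
    if match is None:
        raise ValueError(f"not a self-describing engine filename: {name!r}")
    return EngineTag(**match.groupdict())


if __name__ == "__main__":
    tag = EngineTag("disk_lightglue", "87", "62", "103", "fp16")
    name = engine_filename(tag)
    print(name, parse_engine_filename(name) == tag)
```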
---
## Testing Strategy
> **Note**: full test specifications are produced by the Test Spec skill (greenfield Step 5). What follows is the research-level test envelope, named so the Test Spec skill can elaborate against it.
### Integration / Functional Tests
- **IT-1 — Pipeline smoke**: feed `_docs/00_problem/input_data/flight_derkachi/` (cropped nadir flight footage + synchronized `SCALED_IMU2` + `GLOBAL_POSITION_INT`) into the full C1+C2+C3+C4+C5+C8 pipeline; assert that the emitted `GPS_INPUT` (ArduPilot SITL) and `MSP2_SENSOR_GPS` (iNav SITL) frames stay within AC-1.1/1.2 frame-center-accuracy bounds vs the tlog GPS path.
- **IT-2 — Cold-boot TTFF**: cold-boot the companion 50× with a simulated FC pose; measure boot → first valid emitted external-position MAVLink frame; pass = 95th percentile <30 s per AC-NEW-1.
- **IT-3 — Spoofing-promotion latency**: SITL on each supported FC (ArduPilot Plane + iNav, production param sets); inject false GPS; measure spoof onset → companion estimate becoming primary FC source via D-C8-2 = (b) `MAV_CMD_SET_EKF_SOURCE_SET` companion-driven switch; pass = 95th percentile <3 s on both per AC-NEW-2.
- **IT-4 — Sharp-turn recovery**: synthetic UAV trajectory with ±20° bank turns + <5% inter-frame overlap; assert C2/C3 satellite-anchor recovery within 1-2 frames per AC-3.2 + AC-3.3.
- **IT-5 — Visual blackout + GPS spoofing degraded mode**: SITL/replay on each FC; inject 5 s / 15 s / 35 s blackouts while spoofing GPS; assert mode transition ≤400 ms, spoofed GPS ignored, covariance grows monotonically, MAVLink fields degrade at AC-NEW-8 thresholds (>100 m → "2D fix or worse"; >500 m or >30 s → "no fix" + `VISUAL_BLACKOUT_FAILSAFE` STATUSTEXT), recovery only via trusted anchor or 10-s GPS-health + visual-consistency gate.
- **IT-6 — Stale tile rejection (AC-NEW-6)**: inject synthetic-age tiles into C6 cache; verify rejection or downgrade-to-non-`satellite_anchored` per AC-8.2 freshness threshold.
- **IT-7 — Cache-poisoning verification (AC-NEW-7)**: tamper with `/var/lib/onboard/cache/faiss/v_2048_M32.index` post-write but pre-takeoff; verify D-C10-3 SHA-256 content-hash gate triggers reject + STATUSTEXT + refuse takeoff.
- **IT-8 — Pre-flight cache rebuild idempotence**: invoke C10 pre-flight provisioning twice consecutively without input changes; verify D-C10-1 manifest-hash-driven trigger correctly skips rebuild on second invocation; verify atomic-write integrity holds across simulated power-loss mid-rebuild.
- **IT-9 — TensorRT engine cache reuse**: invoke C10 pre-flight provisioning with same model + same calibration corpus twice; verify D-C10-6 calibration-cache reuse triggers <30 sec rebuild on second invocation; verify D-C10-7 self-describing filename schema correctly identifies SM/JP/TRT/precision tuple.
- **IT-10 — AC-NEW-4 covariance-honesty cross-FC**: verify that the D-C8-8 = (b) per-FC unit conversion correctly extracts the 2×2 horizontal sub-matrix from C5 GTSAM `Marginals.marginalCovariance`, computes the 95% confidence ellipse semi-major axis `sqrt(5.991 * λ_max)` (5.991 is the 95% quantile of χ² with 2 degrees of freedom), and emits it as `horiz_accuracy` (m) for ArduPilot AND `hPosAccuracy` (mm) for iNav with mathematically equivalent values.
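The IT-10 conversion can be sketched with plain arithmetic. This example uses the textbook form of the 95% ellipse, semi-major axis = sqrt(χ² · λ_max) with χ² = 5.991 for 2 degrees of freedom; treat that scale factor as an assumption to reconcile with the C8 fact card. It takes the three entries of the 2×2 horizontal covariance (in m²) as input, so picking those entries out of the 6×6 GTSAM marginal is outside its scope. The 16-bit clamp on `hPosAccuracy` is likewise an assumption about the field width and should be checked against the iNav message definition.

```python
import math

CHI2_2DOF_95 = 5.991  # 95% quantile of chi-square with 2 degrees of freedom


def horiz_accuracy_m(cov_xx: float, cov_xy: float, cov_yy: float) -> float:
    """95% ellipse semi-major axis (m) of a 2x2 horizontal covariance (m^2)."""
    # Largest eigenvalue of the symmetric matrix [[cov_xx, cov_xy], [cov_xy, cov_yy]].
    mean = 0.5 * (cov_xx + cov_yy)
    lam_max = mean + math.hypot(0.5 * (cov_xx - cov_yy), cov_xy)
    return math.sqrt(CHI2_2DOF_95 * lam_max)


def per_fc_fields(cov_xx: float, cov_xy: float, cov_yy: float) -> dict:
    """Same physical quantity in each flight controller's unit."""
    acc_m = horiz_accuracy_m(cov_xx, cov_xy, cov_yy)
    return {
        "horiz_accuracy": acc_m,                            # metres (GPS_INPUT)
        "hPosAccuracy": min(round(acc_m * 1000.0), 0xFFFF),  # millimetres, clamped
    }


if __name__ == "__main__":
    print(per_fc_fields(4.0, 0.0, 1.0))
```

A test built on this shape would assert that the two emitted values differ only by the factor 1000 and by integer rounding.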
### Non-Functional Tests
- **NFT-1 — End-to-end latency p95 (AC-4.1)**: 8 h synthetic load (3 Hz nav frames replayed); measure end-to-end latency distribution; pass = 95th percentile <400 ms; up to ~10% frames may drop under sustained load per AC-4.1.
- **NFT-2 — Memory cap (AC-4.2)**: same 8 h load; assert peak shared CPU+GPU memory <8 GB per AC-4.2.
- **NFT-3 — Thermal envelope (AC-NEW-5)**: hot-soak 25 W @ +50 °C for 8 h; assert no Jetson thermal throttling. Cold-soak −20 °C cold-start within AC-NEW-1 30 s p95 budget.
- **NFT-4 — False-position safety budget (AC-NEW-4)**: Monte Carlo over public aerial-localization dataset (e.g., AerialVL S03) + own recorded flights; report error CDF; pass = `P(>500 m) <0.1 %` AND `P(>1 km) <0.01 %` across ≥100 flights.
- **NFT-5 — Cache-poisoning safety budget (AC-NEW-7)**: multi-flight Monte Carlo replay over public datasets + own flights with synthetic over-confidence injection (deflate covariance ×1.5–3); assert `P(geo-misalign >30 m) <1 %` AND `P(>100 m) <0.1 %` across ≥100 flights. Independently exercise the Suite Sat Service-side voting contract (out of onboard scope but acknowledged as cross-component).
- **NFT-6 — FDR storage cap (AC-NEW-3)**: 8 h synthetic load; assert FDR ≤64 GB; verify no payload class silently dropped without a logged rollover.
- **NFT-7 — License posture verification**: SBOM dump of the deployed companion; verify D-C1-1 license-track is honored (no GPL-3.0 candidate loaded if D-C1-1 = (b); pymavlink LGPL-3.0 bundled-unmodified per D-C8-3); verify Magic Leap noncommercial canonical SP weights are NOT loaded; verify all selected candidates' LICENSE files are bundled in `LICENSE/`.
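The NFT-4 pass criterion reduces to two empirical tail fractions over the pooled position errors. A minimal sketch, assuming errors are pooled per frame across all flights (the pooling rule and the function names are assumptions; the thresholds are the ones stated in NFT-4). Note that an empirical 0.01% tail only becomes meaningful once the pooled sample is well above 10,000 errors.

```python
from typing import Sequence


def tail_fraction(errors_m: Sequence[float], threshold_m: float) -> float:
    """Empirical P(error > threshold) over a pooled error sample."""
    if not errors_m:
        raise ValueError("empty error sample")
    return sum(e > threshold_m for e in errors_m) / len(errors_m)


def passes_false_position_budget(errors_m: Sequence[float]) -> bool:
    """NFT-4 thresholds: P(>500 m) < 0.1 % AND P(>1 km) < 0.01 %."""
    return (
        tail_fraction(errors_m, 500.0) < 0.001
        and tail_fraction(errors_m, 1000.0) < 0.0001
    )


if __name__ == "__main__":
    sample = [12.0] * 9999 + [600.0]  # one 600 m outlier in 10,000 frames
    print(tail_fraction(sample, 500.0), passes_false_position_budget(sample))
```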
---
## References
> Full per-source descriptions in `_docs/00_research/01_source_registry/` (organized by category file).
### SQ6 — ArduPilot Plane vs iNav external positioning
Sources #1–#24. See [`SQ6_external_positioning.md`](../00_research/01_source_registry/SQ6_external_positioning.md).
### SQ1 — Existing GPS-denied UAV systems
Sources #25–#37. See [`SQ1_existing_systems.md`](../00_research/01_source_registry/SQ1_existing_systems.md).
### SQ2 — Canonical pipeline decomposition
Sources #38–#42. See [`SQ2_canonical_pipeline.md`](../00_research/01_source_registry/SQ2_canonical_pipeline.md).
### C1 — VIO candidates
Sources #43–#56. See [`C1_vio.md`](../00_research/01_source_registry/C1_vio.md).
### C2 — VPR candidates
Sources #57–#68. See [`C2_vpr.md`](../00_research/01_source_registry/C2_vpr.md).
### C3 — Matcher candidates
Sources #69–#81. See [`C3_matchers.md`](../00_research/01_source_registry/C3_matchers.md).
### C4 — Pose estimation candidates
Sources #82–#87. See [`C4_pose_estimation.md`](../00_research/01_source_registry/C4_pose_estimation.md).
### C5 — State estimator / sensor fusion candidates
Sources #88–#91. See [`C5_state_estimator.md`](../00_research/01_source_registry/C5_state_estimator.md).
### C6 — Tile cache + spatial index candidates
Sources #92–#98. See [`C6_tile_cache_spatial_index.md`](../00_research/01_source_registry/C6_tile_cache_spatial_index.md).
### C7 — On-Jetson inference runtime candidates
Sources #99–#105. See [`C7_inference_runtime.md`](../00_research/01_source_registry/C7_inference_runtime.md).
### C8 — MAVLink / MSP2 FC adapter candidates
Sources #106–#113. See [`C8_fc_adapter.md`](../00_research/01_source_registry/C8_fc_adapter.md).
### C10 — Pre-flight cache provisioning candidates
Sources #114–#121. See [`C10_preflight_provisioning.md`](../00_research/01_source_registry/C10_preflight_provisioning.md).
---
## Open decisions for Plan-phase (D-Cx-y registry)
The 27 Plan-phase-architect-owned decisions and 8 cross-component-owner decisions raised across all components are catalogued in [`../00_research/06_component_fit_matrix/99_cross_component_gates.md`](../00_research/06_component_fit_matrix/99_cross_component_gates.md). The most architecturally significant **user-decision** gates are:
- **D-C1-1 license-track posture** (User + Plan-phase architect). Recommendation: D-C1-1 = (c) both tracks open; preserves modular swap pathway documented in Comparison Framework Dimension 8.
- **D-C2-1 VPR canonical-weights vs aerial-retrain vs aerial-community-checkpoint** (User + Plan-phase architect). Recommendation: aerial-retrain on real UAV nadir flight footage corpus per D-C7-1 closure (cost ~1-2 weeks per retrained candidate).
- **D-C3-1 SuperPoint-replacement-strategy** (User + Plan-phase architect + license-posture decision-maker). Recommendation: D-C3-1 = (a) DISK+LightGlue (Apache-2.0 throughout, with a +7.99 absolute AUC@5° lift over canonical SP+LightGlue per LightGlue paper Tab 6).
- **D-C2-11 (CONDITIONAL) MegaLoc successor evaluation** (User + Plan-phase architect). Recommendation: defer to post-research session — EigenPlaces closes the mandatory pre-screen at the documentary-required floor; MegaLoc's Plan-phase relevance depends on D-C1-1 + Jetson MVE results.
---
## Related Artifacts
- Tech stack evaluation (`tech_stack.md`): NOT PRODUCED in this Mode A run. Recommendation set is embedded in the per-component candidate tables above. Full extraction into `tech_stack.md` is a low-cost task if the user requests it before Plan-phase.
- Security analysis (`security_analysis.md`): NOT PRODUCED in this Mode A run. AC-NEW-7 cache-poisoning safety + AC-NEW-2 spoofing-promotion + AC-NEW-8 visual blackout failsafe + AC-NEW-4 covariance honesty are addressed component-by-component above and cross-referenced in [`../00_research/05_validation_log.md`](../00_research/05_validation_log.md). Full extraction into `security_analysis.md` is a low-cost task if the user requests it before Plan-phase.
- AC assessment (`_docs/00_research/00_ac_assessment.md`): NOT PRODUCED as standalone artifact in this Mode A run; per-AC binding evidence distributed across per-component fact cards + Restrictions × Candidate-Modes sub-matrix sections.
@@ -6,8 +6,8 @@ step: 2
name: Research
status: in_progress
sub_step:
phase: 12
name: c1-context7-and-restrictions-ac-submatrix
detail: "C1 candidate enumeration done (Sources #43–#53 in 01_source_registry.md, Facts #28–#36 in 02_fact_cards.md). Surviving lead candidates (priority order): (1) OpenVINS — GPL-3.0, best Jetson Orin Nano evidence; (2) OKVIS2 / OKVIS2-X — BSD-3, most actively maintained, GNSS-fusion alignment for AC-NEW-2; (3) VINS-Mono — GPL-3.0, proven on Jetson Nano; (4) Pure VO baseline — mandatory simple-baseline reference. Disqualified: DROID-SLAM (AC-4.2 memory budget), RTAB-Map / ORB-SLAM3 (Fact #16). Conditional: DPVO (VO not VIO; needs external IMU wrapper), Kimera-VIO (resource overhead). Two open decisions surfaced: D-C1-1 GPL-3.0 license posture for onboard binary (BLOCKING for GPL-3 trio) and D-C1-2 Jetson Orin Nano MVE schedule. NEXT SESSION'S WORK (BLOCKING per Per-Mode API Capability Verification rule): (a) context7 lookup × 3 mandatory queries per lead candidate (OpenVINS, OKVIS2/OKVIS2-X, VINS-Mono) covering mode enumeration + project's exact mode runnable example + disqualifier probe; (b) MVE block per candidate in 02_fact_cards.md; (c) per-numbered-Restriction × per-numbered-AC sub-matrix per candidate; (d) write 06_component_fit_matrix.md draft for C1 row; (e) ASK USER on Decision D-C1-1 before promoting any GPL-3 candidate to Selected. AFTER C1 IS CLOSED: proceed to C2 (VPR) candidate enumeration."
phase: 52
name: research-mode-a-engine-steps-4-6-7-8-complete-awaiting-research-decision-gate
detail: "Mode A engine artifacts all written today 2026-05-08: 03_comparison_framework.md (Step 4 — 12-dimension Decision Support framework with cross-component coupling table + decisions-by-owner aggregate), 04_reasoning_chain.md (Step 6 — 12-dimension fact→comparison→conclusion chain with cross-cutting reasoning summary), 05_validation_log.md (Step 7 — 5-scenario validation with 5 counterexamples + Step 7.5 Component Applicability Gate sanity-check PASS), 01_solution/solution_draft01.md (Step 8 — full solution_draft_mode_a.md template populated with C1..C8 + C10 candidate tables + IT-1..IT-10 Integration tests + NFT-1..NFT-7 Non-Functional tests + 27 Plan-phase architect-owned decisions + 8 cross-component-owner decisions inventoried). Awaiting user response on Research Decision gate (A: another round Mode B assessment / B: proceed to Plan greenfield Step 3). NO additional research necessary at the documentary level — every component has Selected primary candidate(s) with MVE evidence + zero ❌ + zero ❓ across Restrictions × Candidate-Modes sub-matrices. Recommendation: B (proceed to Plan) — research-layer work is complete, Plan-phase will close the 35 D-Cx-y decisions and produce architecture.md."
retry_count: 0
cycle: 1