mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 21:11:12 +00:00
48dd81ee0f
Updated the meta-rule document to emphasize strict adherence to skill instructions, prohibiting unnecessary investigations or external checks. Revised acceptance criteria and restrictions to correct communication protocol details for ArduPilot and iNav, ensuring clarity on external-positioning interfaces. Adjusted autodev state to reflect ongoing research phase and updated sub-step details for improved tracking.
197 lines
25 KiB
Markdown
197 lines
25 KiB
Markdown
# Question Decomposition
|
||
|
||
> Mode A Phase 2 (Initial Research — Problem & Solution Draft).
|
||
> Phase 1 (AC & Restrictions Assessment) was skipped per user decision after a cleanup pass that stripped implementation details from `acceptance_criteria.md` and `restrictions.md` (commit `12cc5a4`); AC/restrictions are treated as fixed inputs.
|
||
|
||
## Original Question
|
||
|
||
Design the GPS-denied onboard navigation system for a fixed-wing UAV operating in eastern/southern Ukraine, satisfying every AC in `_docs/00_problem/acceptance_criteria.md` under the constraints in `_docs/00_problem/restrictions.md`. Recommend a concrete component-by-component architecture and tech stack.
|
||
|
||
## Research Output Class
|
||
|
||
**Technical-component selection.** All technical-component gates apply (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookups for every lead library/SDK candidate).
|
||
|
||
## Question Type
|
||
|
||
**Decision Support** (per Mode A Phase 2 default). Sub-flavour: multi-component decision support — weighing trade-offs across ~10 interlocking component areas under hard real-time + memory + safety budgets.
|
||
|
||
## Project Context Summary (from `_docs/00_problem/`)
|
||
|
||
- **What is being built**: an onboard companion-PC system that replaces real GPS for a fixed-wing UAV when GPS is denied/spoofed, by combining nav-camera frames + FC IMU + a pre-cached satellite tile basemap, and emits standard MAVLink external-positioning messages to ArduPilot or iNav at frame rate.
|
||
- **Operating area**: eastern/southern Ukraine, active-conflict zones (war-zone scene change is a first-class concern, not an edge case).
|
||
- **Mission profile**: 8-hour fixed-wing flights, ~60 km/h cruise, ≤1 km AGL, ~400 km² operational area.
|
||
- **Pinned external deps**: ADTi 20MP 20L V1 nav camera (APS-C); Jetson Orin Nano Super 8 GB / 25 W; MAVLink protocol; ArduPilot + iNav as supported FCs; QGroundControl as GCS; Azaion Suite Satellite Service (offline cache interface ≥0.5 m/px).
|
||
- **Hard runtime envelope**: <400 ms p95 end-to-end latency (camera → MAVLink), <8 GB shared CPU+GPU RAM, 25 W TDP at +50 °C ambient for 8 h continuous, no in-flight network, 10 GB persistent tile cache + 64 GB per-flight FDR.
|
||
- **Hard safety envelope**: P(error >500 m) <0.1 % per flight, P(error >1 km) <0.01 % per flight; honest covariance reporting; explicit `dead_reckoned` failsafe under simultaneous GPS spoof + visual blackout; cache-poisoning probability bounds for tiles written back to the Service.
|
||
|
||
## Project Constraint Matrix
|
||
|
||
| Dimension | Binding constraint |
|
||
|---|---|
|
||
| **Inputs available** | Nav camera frames @ 3 fps (5472×3648, ~12 cm/px GSD @ 1 km AGL); FC IMU (high rate via MAVLink); FC attitude/airspeed/altitude; pre-cached satellite tiles ≥0.5 m/px (offline); operator re-loc hint via GCS (rare). |
|
||
| **Outputs required** | WGS84 position to FC via MAVLink external-positioning message(s) accepted by ArduPilot AND iNav; per-frame estimate carrying honest 95 % covariance, source label `{satellite_anchored, visual_propagated, dead_reckoned}`, and `last_satellite_anchor_age_ms`; mid-flight ortho-tiles written to local cache with quality metadata; 1–2 Hz GCS summary; FDR records per AC-NEW-3. |
|
||
| **Hardware fixed** | Jetson Orin Nano Super (67 TOPS sparse INT8, 8 GB shared LPDDR5, 25 W TDP, JetPack/CUDA/TensorRT). |
|
||
| **Lifecycle** | Real-time embedded; offline (no in-flight network); 8 h continuous; persistent tile cache across flights; FDR rollover. |
|
||
| **Non-functional** | <400 ms p95 latency; <8 GB shared RAM; ≤25 W power at +50 °C ambient; AC-1.1/1.2 accuracy; AC-2.1/2.2 registration & MRE; AC-3.x resilience; AC-NEW-1 cold-start <30 s; AC-NEW-2 spoof promotion <3 s; AC-NEW-4 false-position safety; AC-NEW-7 cache-poisoning safety; AC-NEW-8 blackout failsafe. |
|
||
| **Hard disqualifiers** | Anything requiring >8 GB RAM peak (CPU+GPU shared); anything not runnable under JetPack on Orin Nano Super; anything requiring in-flight cloud calls; anything that cannot honestly report covariance; anything that does not have a runnable example for monocular nadir UAV input over season-matched satellite tiles; anything whose license blocks military / dual-use deployment. |
|
||
|
||
## Research Subject Boundary
|
||
|
||
| Dimension | Boundary |
|
||
|---|---|
|
||
| **Population** | Fixed-wing UAVs, downward-fixed monocular nav camera, Jetson-class edge HW, ArduPilot or iNav autopilot. Excludes: multirotors, gimbal-stabilised nav cams, server/cloud GPS-denied stacks, PX4 (out of scope), commercial sat-imagery direct integration (Service handles upstream). |
|
||
| **Geography** | Eastern/southern Ukraine — agricultural steppe, active-conflict scene change. Validation must include this geography or representative analogues (low-texture cropland, snow, war-zone destruction). |
|
||
| **Timeframe** | Production deployment 2026; tools / libraries / models considered must be currently maintained (commits/releases in last 18 months OR explicit long-term-stable status). Critical-novelty domain — see Step 0.5 timeliness assessment. |
|
||
| **Operating context** | Real-time embedded; offline in-flight; 8 h continuous duty; 25 W power envelope; 8 GB shared CPU+GPU memory; thermal envelope to +50 °C ambient. |
|
||
| **Required interfaces** | Inputs: ADTi 20MP nav cam, FC IMU (MAVLink), satellite tile cache. Outputs: MAVLink external-positioning to ArduPilot AND iNav; QGroundControl summary; FDR; tile write-back to Suite Service on landing. |
|
||
| **Non-functional envelope** | Per AC-1 to AC-8 plus AC-NEW-1 to AC-NEW-8. Hardest binding constraints: 400 ms p95 end-to-end; 8 GB shared RAM; AC-NEW-4 false-position probability bounds; AC-NEW-7 cache-poisoning probability bounds. |
|
||
|
||
## Sub-Questions
|
||
|
||
| ID | Sub-question |
|
||
|---|---|
|
||
| SQ1 | What existing/competitor GPS-denied UAV navigation systems exist (academic + open-source + commercial + military), and which of them have been validated on fixed-wing UAVs in active-conflict environments? What works, what fails? |
|
||
| SQ2 | What is the canonical decomposition of "monocular nadir UAV ↔ pre-cached satellite basemap localization" into pipeline components? Is the decomposition below complete, or are there industry-standard components missing? |
|
||
| SQ3 | For each component (VO/VIO, VPR, cross-domain registration, single-frame orthorectification, sensor-fusion estimator, tile cache + spatial index, on-Jetson inference runtime, MAVLink FC adapter, dataset/SITL validation infrastructure): what option families exist (simple baseline / production / open-source / commercial / SOTA / adjacent-domain / no-build), and what are the leading candidates as of 2026? |
|
||
| SQ4 | For each lead candidate per component: what are the documented runtime/memory/latency/license/maintenance constraints, and how do they bind against the Project Constraint Matrix? Per-mode API capability verification with `context7` for every library/SDK lead. |
|
||
| SQ5 | What are the documented failure modes and real-world deployment lessons for each component family? In particular: VPR collapse under cropland repetition, DINOv2/foundation-model cost on Jetson at int8, RANSAC degeneracy at sharp turns / low texture, EKF over-confidence on cross-domain matches, ortho geometric error from unknown bank/pitch. |
|
||
| SQ6 | How do **ArduPilot Plane** (current stable) and **iNav** (current stable) each accept external positioning input via MAVLink? What message types does each support? Where do their interfaces diverge, and what is the documented status of each interface (stable / experimental / known bugs)? |
|
||
| SQ7 | What public datasets, benchmarks, and SITL/replay environments exist for cross-validating monocular nadir UAV navigation against satellite basemaps in season-matched + change-affected conditions? AerialVL, AerialExtreMatch, others? |
|
||
| SQ8 | What are the security and safety considerations specific to the AC-NEW-4 (false-position) and AC-NEW-7 (cache-poisoning) safety budgets, including spoofing-detection signals from FC, ortho-tile geo-alignment quality estimation, and write-back cache-poisoning controls? |
|
||
| SQ9 | What does the system look like end-to-end — wiring, scheduling, threading model, inference scheduling on shared CPU+GPU memory, cold-start sequencing, FDR rotation, and pre-flight cache provisioning workflow? (synthesis question, answered in Step 8) |
|
||
|
||
## Component Areas (search plan)
|
||
|
||
For each component below, the search plan covers all option families per `Component Option Search Plan` rules (`research/steps/03_engine-investigation.md` → "Component Option Breadth").
|
||
|
||
| # | Component area | Required outputs | Key option families to enumerate |
|
||
|---|----------------|------------------|----------------------------------|
|
||
| C1 | **Visual / Visual-Inertial Odometry** (frame-to-frame motion when satellite anchor is unavailable) | Relative 6-DoF pose between consecutive frames or short windows; output frequency ≥3 Hz; metric scale (with IMU) | Classical (VINS-Mono / VINS-Fusion / OpenVINS), Kimera, ORB-SLAM3, OKVIS2, MSCKF-class, learning-based (DROID-SLAM, DPVO), pure VO baseline (KLT + RANSAC homography), no-build (skip and rely on pure satellite re-anchor every frame) |
|
||
| C2 | **Visual Place Recognition (VPR)** — UAV nadir frame → top-K satellite chunks | Compact global descriptor per UAV frame and per cache chunk; cosine-rank top-K candidates | NetVLAD class, MixVPR, EigenPlaces, BoQ, AnyLoc (DINOv2 + VLAD), CricaVPR, foundation-model direct retrieval (DINOv2/DINOv3/SAM 2 / SuperGlobal) |
|
||
| C3 | **Cross-domain registration** (UAV nadir ↔ ortho satellite tile, after VPR top-K) | Sub-pixel alignment + 6-DoF camera pose w.r.t. tile, with inlier ratio + covariance | Local-feature matching (SuperPoint+SuperGlue / LightGlue / DISK+LightGlue / ALIKED+LightGlue / XFeat), dense matchers (LoFTR / RoMa / DKM / MASt3R), classical (SIFT+RANSAC), specialized cross-domain (CMRNet+, CroCoMatch class), templating (mutual-information / ECC), no-build (skip cross-domain; rely on direct frame-to-tile homography from VPR retrieval) |
|
||
| C4 | **Single-frame orthorectification** (nav frame → basemap-aligned tile, given current pose) | Ortho-rectified tile chunk with geo metadata + quality score | Single-frame perspective warp with flat-earth assumption; OpenCV homography; bundled-DEM-aware (rare for flat steppe — likely overkill); GDAL warp utilities; custom GPU shader on Jetson |
|
||
| C5 | **State estimator / sensor fusion** (VO + IMU + sat anchors → fused estimate with covariance) | WGS84 position + 3D velocity + attitude + 6×6 covariance, frame-rate output, honest covariance, source label | EKF (manual), ESKF (manual or via library), MSCKF, factor-graph (GTSAM, iSAM2), particle filter, learned (out-of-scope for safety budget). Supporting: Mahalanobis outlier gates |
|
||
| C6 | **Tile cache + spatial index** (storage + retrieval of basemap tiles + descriptors, with manifests, freshness, dedup, and write-back) | mmap-friendly storage; ANN over global descriptors; spatial query for geographic prior; manifest schema per AC | Storage: GeoTIFF + COG, MBTiles, custom flat layout. ANN: FAISS (IVF/PQ/HNSW), hnswlib, ScaNN, brute-force (small index). Spatial: R-tree / KD-tree / GeoPandas / SQLite+SpatiaLite. Manifest: SQLite, JSON-per-tile, Parquet sidecar |
|
||
| C7 | **On-Jetson inference runtime** | INT8/FP16 inference of the chosen VPR + matcher models within latency + memory budget | TensorRT (native), Torch-TensorRT, ONNX Runtime + TRT EP, NVIDIA Triton (probably overkill), pure PyTorch fp16, NVIDIA DeepStream (for video), CUDA-Python custom kernels |
|
||
| C8 | **MAVLink FC adapter** (per-FC external-positioning emission + spoofing-signal subscription, for ArduPilot AND iNav) | MAVLink frames consumed by ArduPilot Plane and iNav as external position; spoofing signals consumed from each FC | Libraries: `pymavlink` (per-message), MAVSDK (high-level), ArduPilot/iNav SITL for verification. Per-FC choice of message: `GPS_INPUT` vs `ODOMETRY` vs `VISION_POSITION_ESTIMATE` vs `GLOBAL_POSITION_INT` (documented capability per FC must be verified, not assumed) |
|
||
| C9 | **Datasets + SITL / replay** | Reproducible validation against AC-1/2/3/4/NEW-4/NEW-7/NEW-8 budgets; fixtures for AerialVL S03, AerialExtreMatch, own Mavic flights, Derkachi flight footage | AerialVL (VISTA / NTU), AerialExtreMatch, VPR-Bench, MahalNotchVPR / Mid-Air UAV; SITL: ArduPilot Plane SITL, iNav SITL/HITL, Gazebo, Webots; replay: PX4-Avionics-Replay-style or custom |
|
||
| C10 | **Pre-flight cache provisioning + sector classification + freshness pipeline** | Tooling (operator-side) to pull tiles from Suite Sat Service for an operational area, classify active-conflict vs stable rear, age-stamp, populate descriptor index | Likely a custom CLI/desktop tool — research existing UAV mission-prep tools (QGC plan files, MAVProxy, ArduPilot Mission Planner equivalents on the operator side) |
|
||
|
||
## Perspectives Chosen (≥3 mandatory)
|
||
|
||
1. **Implementer / Engineer** — Will the chosen stack actually compile, link, and run on the pinned Jetson within the latency + memory budget? Pitfalls of MAVLink GPS injection on each FC. Sub-pixel registration on UAV-nadir × ortho satellite. Inference-scheduler contention on shared CPU+GPU memory.
|
||
2. **Practitioner / Field** — What do UAV teams actually report from GPS-denied missions in real war-zone deployments? (Ukraine context if findable; otherwise analogous high-stakes deployments.) Real-world VPR collapse on agricultural cropland / snow / season change. Real-world FDR usefulness for post-mission forensics.
|
||
3. **Domain expert / Academic** — Recent (2024–2026) VPR + cross-domain matching benchmarks and their relative ranks under cross-season / cross-domain / cross-altitude conditions. Foundation-model-based VPR (AnyLoc, BoQ, MASt3R) — academic claims vs reproducibility. Recent factor-graph vs ESKF comparisons.
|
||
4. **Contrarian / Devil's advocate** — Why might foundation-model VPR fail on the Jetson budget? Where does cross-domain matching degrade silently? When does ortho-tile write-back amplify bad poses? When does honest covariance turn into "system never trusts itself" (over-cautious failure)?
|
||
|
||
## Search Query Variants per Sub-Question
|
||
|
||
(Detailed query lists are appended below per sub-question; these will be executed in Step 2 and saved to `01_source_registry.md`. The shape is shown here so the search plan is auditable; the full execution log will populate downstream files.)
|
||
|
||
**SQ1** (existing systems / competitors): "GPS-denied UAV navigation 2025", "visual GPS denied fixed wing UAV", "satellite map matching UAV localization 2024 2025", "Ukraine UAV GPS spoofing countermeasures", "ARL ANT Project visual navigation", "vision-based GPS replacement UAV production", "UAV GPS spoofing real-world deployment 2025".
|
||
|
||
**SQ2** (canonical pipeline): "visual aerial localization pipeline survey", "UAV satellite map matching architecture", "monocular UAV global localization pipeline 2024 2025".
|
||
|
||
**SQ3 / SQ4** (per-component candidates + binding): per-component query templates (5+ variants each) — see Step 2 plan in `01_source_registry.md` once initialised. Each lead library/SDK candidate triggers the mandatory `context7` per-mode capability verification per `research/steps/03_engine-investigation.md`.
|
||
|
||
**SQ5** (failure modes): "VPR cropland failure", "DINOv2 Jetson Orin Nano latency", "SuperGlue LightGlue Jetson Orin", "ESKF cross-domain over-confidence", "RANSAC homography low-texture failure UAV", "ortho photo geometric error airframe tilt".
|
||
|
||
**SQ6** (ArduPilot vs iNav external positioning): "ArduPilot Plane GPS_INPUT external", "ArduPilot ODOMETRY EKF3 source switching", "iNav external positioning MAVLink GPS_INPUT", "iNav MAVLink GPS substitute", "iNav GPS denied flight 2025", "ArduPilot vs iNav external nav comparison".
|
||
|
||
**SQ7** (datasets): "AerialVL dataset", "AerialExtreMatch", "VPR-Bench cross-season aerial", "Mid-Air UAV dataset", "Mavic Mavik UAV public flight dataset", "satellite-aerial cross-view localization benchmark".
|
||
|
||
**SQ8** (safety): "MAVLink GPS_RAW_INT spoofing detection", "EKF lane switch ArduPilot", "covariance under-reporting risk EKF", "geo-misalign detection ortho tile".
|
||
|
||
## Completeness Audit
|
||
|
||
Probes (per `references/comparison-frameworks.md` → Decomposition Completeness Probes — applied here without re-reading the full file; will reconcile during Step 2):
|
||
|
||
| Probe | Coverage |
|
||
|---|---|
|
||
| Functional decomposition complete? | C1–C10 cover all data flows from camera in to MAVLink out + back. ✓ |
|
||
| Non-functional dimensions covered? | Latency, memory, accuracy, safety, freshness, security all in Project Constraint Matrix. ✓ |
|
||
| Failure-mode dimension covered? | SQ5 explicitly. ✓ |
|
||
| Cost / TCO dimension? | Hardware is pinned (Jetson Orin Nano Super); Service-side cost is out of scope; SW cost = mostly open-source candidates. Will revisit during Phase 3 (tech stack consolidation) if commercial options emerge. ✓ |
|
||
| Maintenance / community-health dimension? | SQ4 binds it per candidate. ✓ |
|
||
| Adjacent-domain dimension? | Robot SLAM, AGV warehouse navigation, aerial photogrammetry will be searched as analogues. ✓ |
|
||
| Validation / dataset coverage? | SQ7 + C9. ✓ |
|
||
| Integration / boundary coverage? | SQ6 (FC adapters) + C8 + C10 (pre-flight provisioning). ✓ |
|
||
| Operational/human-factors? | Pre-flight cache provisioning (C10) and operator re-loc hint (AC-3.4) covered. Mission-planning UX is out of scope. ✓ |
|
||
| Security / threat model? | SQ8. Will deepen in Phase 4 (Security Deep Dive) if invoked. ✓ |
|
||
|
||
No major gap detected at decomposition time. If domain-discovery searches in Step 2 surface a missed dimension, a "gap-fill" entry will be appended here.
|
||
|
||
## Notes on Output-Class Mode-Verification
|
||
|
||
Because this is **Technical-component selection**, every lead library/SDK candidate triggers:
|
||
- Pinned mode/configuration sentence in `02_fact_cards.md`.
|
||
- `context7` lookup with the three mandatory queries (mode enumeration; project's exact mode runnable example; disqualifier probe).
|
||
- MVE block per candidate.
|
||
- Per-numbered-Restriction and per-numbered-AC binding (`Pass` / `Fail` / `Verify` / `N/A`).
|
||
- Two modes of one library = two distinct candidates.
|
||
|
||
## Step 0.5 — Novelty Sensitivity Assessment
|
||
|
||
**Classification: Critical sensitivity.**
|
||
|
||
Justification:
|
||
- Foundation-model VPR is moving fast: DINOv2 (Apr 2023), AnyLoc (Aug 2023), BoQ (CVPR 2024), MASt3R (May 2024), MASt3R-SfM / new VPR-leader candidates 2025; rankings on cross-season aerial benchmarks have shifted multiple times since 2023.
|
||
- ArduPilot Plane / iNav external-positioning interfaces have moved: ArduPilot EKF3 source-switching parameters and known double-fusion bugs between `GPS_INPUT` and `ODOMETRY` were a moving target through 2024–2025; iNav GPS-denied support has matured separately.
|
||
- TensorRT / JetPack stacks on Jetson Orin Nano Super have version-dependent INT8 quantisation behaviour and runtime tooling differences worth verifying against current releases.
|
||
- Public aerial-localization datasets (AerialVL, AerialExtreMatch, etc.) have had multiple revisions and added splits.
|
||
|
||
Source-time-window rules for this run:
|
||
- **Lead-candidate selection / SOTA claims**: prioritise sources from **last 6 months**; allow up to **18 months** if no newer source covers the same claim and the older source is the official authority.
|
||
- **Established baselines / classical algorithms** (KLT, RANSAC, EKF, ORB, SIFT, GTSAM): no time window — canonical references are fine.
|
||
- **Library/SDK API behaviour**: must be verified against the **currently shipped version** at the time of search (`context7` mandatory per lead candidate; release notes / changelog cross-checked).
|
||
- **Cross-validation**: every Critical-sensitivity claim that drives a candidate selection must have **≥2 independent sources** or one official + one runnable MVE; single-source SOTA claims must be downgraded to `Experimental only` at Step 7.5 unless cross-validated.
|
||
|
||
## SQ2 Closure — Pipeline-component coverage table (Mode A Phase 2, Step 3 result)
|
||
|
||
The C1–C10 decomposition was sanity-checked against five independent surveys/benchmarks (Skoltech aerial-VPR survey, U.Maine cross-view survey, OrthoLoC benchmark, AnyVisLoc benchmark, NUDT 2026 absolute-VL survey — all logged in `01_source_registry.md` as Sources #38–#42). The canonical hierarchical framework `retrieval → matching → pose-estimation` is unanimously confirmed; project's split is **canonical, not novel**. Two augmentations are required.
|
||
|
||
| Survey/benchmark canonical stage | Project component | Coverage status | Required action |
|
||
|---|---|---|---|
|
||
| Image retrieval (global VPR) | **C2 — VPR** | ✅ covered | None |
|
||
| Re-ranking (top-N inlier-based) | (implicit, inside C2/C3) | ⚠️ implicit | Promote to explicit sub-stage in `solution_draft01` |
|
||
| Local image matching (2D-2D, sparse or dense) | **C3 — Cross-domain registration** | ✅ covered | Add Top-N inlier re-rank requirement |
|
||
| AdHoP-style perspective preconditioning | (not represented) | ❌ missing | Add as optional sub-stage between C3 and C4, gated on Jetson latency budget |
|
||
| 2D-3D lift via DSM | (not represented; current cache is 2D ortho only) | ❌ architectural decision required | **Decision required from user** — see "Open architectural decisions" below |
|
||
| Pose estimation (PnP + RANSAC + LM) | **C4 — Pose estimation** | ✅ covered | None |
|
||
| State estimator / fusion | **C5 — Estimator / fusion** | ✅ covered | Augmented with covariance-honesty contract (already from AC-NEW-4) |
|
||
| IMU + VIO contract | **C1 (VIO)** + **C6 (Tile cache)** ⁂ | ✅ covered | Add yaw σ ≤ 5°, pitch σ ≤ 5° hard contract (Fact #24) |
|
||
| Tile cache + scheduler | **C6 (Tile cache + spatial index)** | ✅ covered | Add 20% covisibility runtime invariant (Fact #27) |
|
||
| On-Jetson runtime | **C7 — On-Jetson inference runtime** | ✅ covered | Pre-screen prunes non-viable candidates (Fact #26) |
|
||
| Anti-spoof / FC adapter | **C8 — MAVLink FC adapter** | ✅ covered | Already addressed by SQ6 |
|
||
| Datasets / SITL / replay | **C9 — Datasets + SITL / replay** | ✅ covered | None |
|
||
| Pre-flight cache provisioning | **C10 — Pre-flight cache + sector classification** | ✅ covered | None |
|
||
|
||
⁂ The "IMU integration" concern lives in C1 (VIO) and partially flows from FC IMU; there is no separately numbered IMU component in the original C1–C10 split. SQ2 confirms this was correct — IMU is best owned by C1 (VIO) which already produces the yaw/pitch attitude. The σ ≤ 5° contract belongs on C1's output interface.
|
||
|
||
### SQ2 — Architectural decisions (resolved by user, 2026-05-07)
|
||
|
||
| # | Decision | Choice | Implication for SQ3+SQ4 |
|
||
|---|---|---|---|
|
||
| 1 | DSM dependency on Suite Sat Service tile cache (Fact #23) | **(a) 3-DoF acceptance** — fix attitude from IMU/VIO, ignore DSM; current 2D-ortho cache contract preserved. | C6 (Tile cache) candidate matrix excludes DSM-dependent storage formats. C3 (matcher) candidates evaluated on 2D-2D output (homography) only. Yaw/pitch σ ≤ 5° (Fact #24) is **noted as an empirical requirement on C1's output but NOT bound as a hard interface contract** — emerges as an output of C1 candidate selection in SQ3+SQ4. AC-1.1.1 (≤80 m at 1 km AGL) likely satisfied per DSMAC-class lineage in Fact #17; if AC ever tightens, revisit option (b). |
|
||
| 2 | AdHoP refinement loop (Fact #22) | **(b) Conditional** — only invoked when initial reprojection error exceeds a threshold. | C3 (matcher) latency budget = base (single-pass) + AdHoP-conditional overhead (worst-case 2× when triggered). Per-frame Jetson MVE must measure both modes. The reprojection-error threshold becomes a SQ3+SQ4 hyperparameter. |
|
||
| 3 | Top-N re-rank promotion (Fact #25) | **(a) Promote** to an explicit named sub-stage between C2 and C3. | SQ3+SQ4 will hyperparameter-sweep N ∈ {5, 10, 15, 20}; C2 candidates evaluated jointly with re-rank cost. Top-N re-rank by inlier-count is now a hard pipeline component, not implicit. |
|
||
|
||
### SQ2 — Component-pruning carried into SQ3+SQ4 (Jetson-pre-screen result)
|
||
|
||
Per Fact #26 (RTX-3090-measured runtime → conservative Jetson-Orin-Nano translation):
|
||
|
||
- **C2 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: MixVPR, SALAD, SelaVPR, EigenPlaces, NetVLAD.
|
||
- **C2 candidates entering SQ3+SQ4 conditional on INT8 quantization path**: AnyLoc, BoQ, DINOv2-VLAD.
|
||
- **C2 candidates pruned outright**: SuperGlue-as-reranker (latency).
|
||
- **C3 candidates entering SQ3+SQ4 with mandatory Jetson MVE**: LightGlue, XFeat, XFeat*, SP+LightGlue (NGPS template confirmed).
|
||
- **C3 candidates pruned outright**: RoMa, MASt3R, DKM (dense-matcher latency on Jetson).
|
||
- **C3 candidates as "AerialExtreMatch reference points" only**: GIM+DKM, GIM+LightGlue (per Source #40 — accuracy benchmark, not for production deployment).
|
||
|
||
## Next Step
|
||
|
||
SQ1 ✓ → SQ2 ✓ (with three architectural decisions resolved) → **SQ3+SQ4 per component (C1→C10)** → SQ5 interleaved → SQ7 → SQ8 → SQ9 synthesis at engine Step 8.
|
||
|
||
Pipeline shape entering SQ3+SQ4: `C1 (VIO) → C2 (VPR) → Top-N re-rank by inlier count → C3 (matcher) → AdHoP-conditional refinement → C4 (PnP+RANSAC+LM) → C5 (estimator) → C8 (FC adapter)` with C6 (cache, 2D ortho) + C7 (Jetson runtime) + C9 (datasets) + C10 (provisioning) cross-cutting.
|
||
|
||
First C1 (VIO) candidate batch: VINS-Mono / VINS-Fusion / OpenVINS / OKVIS2 / DROID-SLAM / DPVO / pure-VO baseline (RTAB-Map and ORB-SLAM3 already pruned by Fact #16). Per-mode `context7` capability verification mandatory for every lead library/SDK candidate.
|