mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 07:01:14 +00:00
fresh start v2
@@ -1,85 +0,0 @@
# Acceptance Criteria Assessment

## Scope

- **Mode**: Mode A Phase 1 (acceptance criteria and restrictions assessment).
- **Problem boundary**: fixed-wing UAV, downward fixed navigation camera, high-rate IMU over MAVLink, GPS may be denied or spoofed, offline satellite tile cache, eastern/southern Ukraine operating region, onboard Jetson Orin Nano Super.
- **Novelty sensitivity**: High for onboard AI/VPR, Jetson/TensorRT, and ArduPilot EKF integration; Medium for photogrammetry/GSD and slippy-map math; Low for high-level VO/INS principles.
- **Input data observed**: 60 nadir-like agricultural/steppe frames with sparse landmarks, ground-truth frame-center coordinates, two Google Maps reference screenshots, no real IMU trace.

## Acceptance Criteria

| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
|-----------|------------|-------------------|----------------------|--------|
| AC-1.1 / AC-1.2 frame-center accuracy | 80% <50 m, 50% <20 m | Recent UAV GNSS-denied visual localization work reports roughly 15-20 m mean localization error in favorable datasets, while fixed-wing satellite-aided VO demonstrates drift reduction over >17 km rather than a universal meter-level guarantee. The thresholds are plausible if satellite anchors succeed regularly, but they require dataset-specific validation. | Keep, but budget a full evaluation harness and Monte Carlo replay early. | Keep |
| Expected-results threshold for 20 m accuracy | AC says 50%; `expected_results/results_report.md` says 60% | This is an internal contradiction, not a research issue. Tests will enforce the stricter 60% if left as-is. | Low effort to correct, high risk if unnoticed because it changes pass/fail. | Needs decision |
| AC-1.3 VO drift between anchors | <100 m VO-only, <50 m IMU-fused | Visual-inertial odometry drift below ~1% of distance is realistic in some outdoor tests; at 60 km/h and 3 Hz, anchor cadence and turn handling dominate. The criterion is feasible if satellite relocalization prevents long VO-only intervals. | Keep, but require explicit anchor-age limits in tests. | Keep |
| AC-1.4 confidence score | 95% covariance ellipse + source label | MAVLink `GPS_INPUT` supports horizontal/vertical accuracy fields; producing calibrated covariance is feasible but must be statistically validated against false-position budgets. | Medium effort; requires calibration and reliability testing, not just API plumbing. | Keep |
| AC-2.1 registration rate | >95% normal segments | Plausible only under the definition already scoped: nadir, daylight, season-matched, >=40% overlap. Sparse agricultural frames and active-conflict scene changes make this too optimistic outside that scope. | Keep the scoped definition; add separate degraded-condition metrics later. | Keep |
| AC-2.2 reprojection error | <1.0 px VO, <2.5 px UAV-satellite | VO MRE <1 px is plausible after calibration. Cross-domain UAV-satellite <2.5 px is aggressive because appearance, season, scale, and orthorectification errors dominate. | Keep as a stretch or lab-gated target; do not make it the only production acceptance signal. | Modify |
| AC-3.1 outliers / AC-3.2 sharp turns / AC-3.3 disconnected segments | 350 m outliers; <5% overlap turns; >=3 disconnected segments | These are operationally important and realistic as failure modes. They cannot be solved by VO alone; they require VPR/global retrieval + EKF gating. | High implementation risk; keep because they drive the right architecture. | Keep |
| AC-3.4 relocalization request | >=3 failed frames and >=2 s | At 3 fps, this means the frame-count trigger can fire around 1 s but the time trigger waits 2 s. The combined trigger is coherent. | Low implementation cost; important for operator workflow. | Keep |
| AC-4.1 latency | <400 ms p95 capture-to-FC output, <=10% frame drop | Jetson Orin Nano Super has official 67 TOPS sparse / 8 GB / 25 W mode. DINOv2-base TensorRT examples can run at high FPS, but DINOv2-S GitHub issue data shows ~22-23 ms GPU compute and limited INT8 gain; SuperPoint/LightGlue data on Orin Nano is thin. End-to-end 400 ms is plausible only with conditional VPR and heavy offline preprocessing. | High risk; needs prototype benchmark before implementation decomposition. | Keep with benchmark gate |
| AC-4.2 memory | <8 GB shared LPDDR5 | Feasible only if model set is small, descriptors are tiled/indexed, TensorRT engines are prebuilt, and raw frames are not retained. | Medium risk; memory profiling must be an acceptance test. | Keep |
| AC-4.3 MAVLink output | v1 GPS_INPUT only; ODOMETRY disabled | ArduPilot/MAVProxy documents `GPS1_TYPE=14` for MAVLink GPS input. MAVLink says `GPS_INPUT` is raw sensor input, not global position estimate, so covariance/accuracy fields matter. ArduPilot issue #30076 shows external-nav/GPS fusion risk was real and is now marked closed; v1 GPS_INPUT-only remains conservative, but ODOMETRY should be re-evaluated against the exact ArduPilot release before v1.1. | Keep v1 GPS_INPUT-only; add version-pinned SITL gate before enabling ODOMETRY. | Keep |
| AC-5 startup/failsafe | TTFF <30 s; fail if no estimate >3 s | Cold-start <30 s is plausible only if TensorRT engines and tile indexes are loaded without first-run compilation. >3 s failover matches the 3 fps / sharp-turn recovery intent. | Medium effort; requires boot profiling and watchdog tests. | Keep |
| AC-6 QGC telemetry | 1-2 Hz downsampled summary, commands via MAVLink | Feasible and appropriately scoped; keeps high-rate data local. | Low/medium depending on command dialect choice. | Keep |
| AC-7 object localization | Frame-center-equivalent in level flight; publish bank/pitch bound outside level flight | This is realistic because AI-camera bank/pitch is unavailable. The bound `altitude * sin(angle)` correctly exposes the limitation instead of hiding it. | Keep; API must always return accuracy bound. | Keep |
| AC-8.1 imagery resolution | >=0.5 m/px, ideal 0.3 m/px | Commercial 30 cm imagery is realistic through providers such as Airbus Pléiades Neo and Vantor/Maxar-class constellations, but upstream availability/freshness in active-conflict areas remains a service risk. | Keep; dependency belongs to Suite Satellite Service SLA. | Keep |
| AC-8.2 / AC-NEW-6 freshness | <6 months active sectors, <12 months stable sectors | Operationally justified. It will reduce false positives but increases mission planning/sourcing burden. | Medium/high cost delegated to Satellite Service and cache metadata. | Keep |
| AC-8.3 preprocessing | Offline cache with descriptors | Correct architectural choice for onboard latency and no in-flight network dependency. | Medium offline compute/storage cost. | Keep |
| AC-8.4 / AC-8.5 tile write-back and no raw frame storage | Persist generated tiles, not raw frames | Good storage control and privacy/security posture. Requires tile quality gates to prevent poisoning. | Medium/high because write-back quality scoring is non-trivial. | Keep |
| AC-8.6 VPR chunks | 600-800 m chunks, 40-50% overlap, conditional VPR | Fits the latency evidence: online VPR should be event-triggered, not per-frame. Multi-scale chunks are appropriate for active-conflict scene changes. | Keep; index size and load time need measurement. | Keep |
| AC-NEW-3 FDR | <=64 GB / flight | Feasible because raw frames are excluded. Must include rollover logging so no payload class disappears silently. | Low/medium. | Keep |
| AC-NEW-4 false-position budget | P(error >500 m) <0.1%, P(error >1 km) <0.01% per flight | This is the right safety metric, but it cannot be proven from unit tests alone. It requires calibrated covariance, outlier rejection, and Monte Carlo over representative datasets. | High validation cost; keep as safety gate. | Keep |
| AC-NEW-5 environmental envelope | -20 C to +50 C, 25 W for 8 h, no throttling | DO-160G-style environmental testing is a recognized airborne-equipment pattern; applying it to a small UAV is conservative. The thermal target is realistic only with designed cooling, not a bare dev kit. | Hardware integration and chamber testing required. | Keep |
| AC-NEW-7 cache-poisoning budget | Tile misalignment >30 m <1%, >100 m <0.1% | Correct risk to control, but the thresholds depend on covariance calibration and multi-flight voting. Single-flight promotion in active sectors should remain exceptional and heavily gated. | High validation cost; keep with explicit service-side voting tests. | Keep |
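
The AC-1.1 / AC-1.2 thresholds, and the 50% vs 60% disagreement noted in the table, can be made explicit in an evaluation harness by keeping the pass fractions as parameters. The following is a minimal sketch, not the project's harness: the names `haversine_m` and `accuracy_report` are hypothetical, the input is assumed to be paired `(lat, lon)` tuples in degrees, and distance uses a spherical-Earth approximation, which is adequate at the 20-50 m scale.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def accuracy_report(estimates, truths, min_frac_50m=0.80, min_frac_20m=0.50):
    """Fraction of frame centers within 50 m and 20 m of ground truth.

    min_frac_20m defaults to the AC-1.2 value (50%); pass 0.60 to evaluate
    the stricter figure quoted from expected_results/results_report.md.
    """
    errors = [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimates, truths)]
    n = len(errors)
    frac_50 = sum(err < 50.0 for err in errors) / n
    frac_20 = sum(err < 20.0 for err in errors) / n
    return {
        "frac_lt_50m": frac_50,
        "frac_lt_20m": frac_20,
        "pass_ac_1_1": frac_50 >= min_frac_50m,
        "pass_ac_1_2": frac_20 >= min_frac_20m,
    }
```

Running the same replay with `min_frac_20m=0.50` and `min_frac_20m=0.60` shows directly whether the threshold mismatch changes the pass/fail outcome for a given dataset.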

## Restrictions Assessment

| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
|-------------|------------|-------------------|----------------------|--------|
| UAV type and camera pose | Fixed-wing, fixed downward navigation camera | This fits visual odometry and orthorectification assumptions better than gimbaled imagery. Lack of gimbal stabilization means bank/pitch compensation must come from FC attitude. | Keep; camera/FC time sync is critical. | Keep |
| Operational area | Eastern/southern Ukraine, steppe/agricultural terrain | Sample images confirm low-texture fields, tree lines, roads, sparse structures, and seasonal/appearance risk. This is a hard test for cross-view matching. | Keep; research and tests must avoid urban-only datasets. | Keep |
| Camera spec inconsistency | Problem text says ~6200x4100; restrictions say 20 MP 5472x3648; input data says 26 MP 6252x4168 at 400 m | The repo distinguishes target camera from example data, but the solution needs one authoritative v1 navigation camera/lens for GSD, intrinsics, and latency. | Needs confirmation or the plan will produce conflicting calibration tasks. | Needs decision |
| GSD target | 10-20 cm/px at <=1 km AGL | Formula `GSD = sensor width * altitude / (focal length * image width)` supports tuning GSD by lens choice. With 23.5 mm sensor width, 6252 px, 25 mm lens: ~6 cm/px at 400 m and ~15 cm/px at 1 km. | Keep; lens selection must be locked before calibration tasks. | Keep |
| Tile zoom statement | "slippy-XYZ z=20 (~30 cm/px, 512x512)" | OSM zoom table gives z20 = 0.149 m/px at equator for 256 px tiles, about 0.10 m/px at latitude 48; 512 px tile conventions can shift effective zoom. The stated "z=20 ~30 cm/px" is inconsistent unless this uses a custom provider convention. | Needs correction before storage and cache-index tasks. | Needs decision |
| Satellite provider boundary | Onboard consumes Azaion Suite Satellite Service cache only | Correct. Commercial provider contracts, tasking, licensing, and freshness are outside this build but must become Satellite Service SLAs. | Keep. | Keep |
| Hardware | Jetson Orin Nano Super, 8 GB, 25 W | Official specs support the constraint, but thermal and memory limits are tight for multi-model CV. | Keep; prototype benchmark is mandatory. | Keep |
| IMU source | High-rate IMU from FC via MAVLink | Feasible. Test gap remains because current input data has no real IMU trace. Synthetic/SITL IMU is acceptable for early tests but not final validation. | Add real or representative flight IMU capture plan. | Modify |
| Autopilot | ArduPilot only, QGroundControl only | Good scope control. ArduPilot GPS_INPUT path is documented; EKF source/fusion behavior must be version-pinned. | Keep. | Keep |
| Storage | ~10 GB persistent tile cache plus 64 GB FDR | Feasible if tile zoom/resolution is corrected and raw frames remain excluded. | Keep after zoom correction. | Keep |
| No raw photo storage | Tiles only; failure thumbnails <=0.1 Hz | Strong restriction for storage and security; compatible with FDR cap. | Keep. | Keep |
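
The GSD row above can be checked with a few lines of arithmetic. This sketch only restates the formula from the table with the same example inputs (23.5 mm sensor width, 6252 px image width, 25 mm lens); the function name `gsd_m_per_px` is hypothetical, and the inputs describe the sample data, not a confirmed v1 camera/lens.

```python
def gsd_m_per_px(sensor_width_mm, focal_length_mm, altitude_m, image_width_px):
    """Ground sample distance for a nadir camera over flat terrain.

    GSD = sensor width * altitude / (focal length * image width).
    The millimetre units cancel, so the result is in metres per pixel.
    """
    return (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)


for altitude_m in (400, 1000):
    gsd = gsd_m_per_px(23.5, 25.0, altitude_m, 6252)
    print(f"{altitude_m} m AGL: {gsd * 100:.1f} cm/px")
# 400 m AGL: 6.0 cm/px
# 1000 m AGL: 15.0 cm/px
```

Because GSD scales linearly with altitude and inversely with focal length, locking the lens fixes the altitude band in which the 10-20 cm/px target holds.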

## Key Findings

1. Most existing ACs are directionally sound and should remain, especially the safety budgets, covariance reporting, freshness gates, GPS_INPUT-only v1 scope, and no-raw-frame storage policy.
2. Three items should be resolved before planning proceeds: AC-1.2's 50% vs expected-results 60% contradiction, the authoritative v1 navigation camera/lens, and the tile zoom/resolution convention.
3. The 400 ms p95 latency target is plausible only if VPR is conditional and reference descriptors are precomputed; it should be treated as a benchmark gate before implementation tasks are finalized.
4. Cross-domain reprojection error <2.5 px is an aggressive lab target. Production acceptance should rely on position error, covariance calibration, registration success, and false-position rejection, not pixel error alone.
5. The public evidence supports satellite-aided visual localization as feasible, but the project's operating region is harder than many benchmark datasets because farmland/steppe imagery has sparse stable features and active-conflict areas change quickly.
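
Finding 3 treats the AC-4.1 latency target as a benchmark gate. One way such a gate could be expressed is sketched below, assuming the prototype logs one capture-to-FC latency per processed frame plus the number of frames captured. The function names and the nearest-rank percentile definition are assumptions; the source does not say which percentile method the project uses.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. pct is an integer in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_gate(latencies_ms, frames_captured, p95_limit_ms=400.0, max_drop_frac=0.10):
    """Evaluate the AC-4.1 style gate: p95 latency below the limit and
    frame-drop fraction at or below the allowed maximum."""
    processed = len(latencies_ms)
    drop_frac = 1.0 - processed / frames_captured
    p95 = percentile_nearest_rank(latencies_ms, 95)
    return {
        "p95_ms": p95,
        "drop_frac": drop_frac,
        "passed": p95 < p95_limit_ms and drop_frac <= max_drop_frac,
    }
```

Reporting the drop fraction alongside p95 matters because a pipeline can meet the latency limit simply by discarding slow frames.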

## Recommended Adjustments Before Phase 2

| Decision | Recommendation | Rationale |
|----------|----------------|-----------|
| AC-1.2 threshold mismatch | Align `expected_results/results_report.md` with AC-1.2 at 50%, unless the user intentionally wants a 60% stretch target. | Avoid hidden test/spec disagreement. |
| Navigation camera | Confirm ADTi 20MP 20L V1 + selected lens as v1 target; keep 26 MP images as sample/test data only. | Calibration, GSD, FOV, latency, and orthorectification depend on intrinsics. |
| Tile zoom | Replace "z=20 (~30 cm/px, 512x512)" with an explicit cache convention: provider pixel size, tile matrix, tile dimension, CRS, and latitude-adjusted resolution. | Prevent wrong storage estimates and matcher scale assumptions. |
| IMU validation data | Add a requirement to obtain real FC MAVLink IMU logs or approved SITL-generated IMU traces before final acceptance. | Current sample data cannot validate fusion, covariance, or latency-to-FC behavior. |
| Cross-domain MRE | Mark <2.5 px as a lab/stretch diagnostic; keep position/covariance as production gate. | Pixel error does not fully capture orthorectification, map freshness, and geodetic error. |
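
The latitude-adjusted resolution asked for in the tile-zoom row follows from the standard Web Mercator slippy-map relation. The sketch below assumes a standard XYZ tile pyramid; the function name `tile_resolution_m_per_px` is hypothetical, and a provider with a custom tile matrix would need its own formula.

```python
import math

# WGS84 equatorial circumference in metres, as used by Web Mercator
EQUATOR_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0


def tile_resolution_m_per_px(zoom, latitude_deg, tile_size_px=256):
    """Ground resolution of a standard slippy-map tile at a given latitude."""
    world_width_px = tile_size_px * 2 ** zoom
    return EQUATOR_CIRCUMFERENCE_M * math.cos(math.radians(latitude_deg)) / world_width_px


for zoom in (18, 19, 20):
    print(zoom,
          round(tile_resolution_m_per_px(zoom, 0.0), 3),
          round(tile_resolution_m_per_px(zoom, 48.0), 3))
```

With 256 px tiles this gives about 0.149 m/px at the equator and about 0.10 m/px at latitude 48 for z=20, matching the OSM figures quoted in the Restrictions Assessment. Passing `tile_size_px=512` at the same zoom halves the result, which is one way a 512 px convention shifts the effective zoom.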

## Sources

- ArduPilot MAVProxy GPSInput documentation, accessed 2026-04-29: `https://ardupilot.org/mavproxy/docs/modules/GPSInput.html`
- MAVLink common message spec, `GPS_INPUT`, accessed 2026-04-29: `https://mavlink.io/en/messages/common.html#GPS_INPUT`
- ArduPilot issue #30076, ExternalNav + GPS fusion behavior, accessed 2026-04-29: `https://github.com/ArduPilot/ardupilot/issues/30076`
- NVIDIA Jetson Orin Nano Super technical blog, published 2024-12-17, accessed 2026-04-29: `https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost`
- NVIDIA JetPack 6.2 Super Mode benchmarks, accessed 2026-04-29: `https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/`
- NVIDIA TensorRT issue #4348, DINOv2-S Jetson Orin TensorRT measurements, accessed 2026-04-29: `https://github.com/NVIDIA/TensorRT/issues/4348`
- MDPI Applied Sciences 2024, "Visual Odometry in GPS-Denied Zones for Fixed-Wing Unmanned Aerial Vehicle with Reduced Accumulative Error Based on Satellite Imagery", accessed 2026-04-29: `https://www.mdpi.com/2076-3417/14/16/7420`
- MDPI Sensors 2024, "A Cross-View Geo-Localization Algorithm Using UAV Image and Satellite Image", accessed 2026-04-29: `https://www.mdpi.com/1424-8220/24/12/3719`
- Airbus Pléiades Neo official imagery page, accessed 2026-04-29: `https://www.airbus.com/en/pleiades-neo-satellite-imagery`
- Vantor/Maxar WorldView Legion page, accessed 2026-04-29: `https://www.maxar.com/worldview-legion`
- OpenStreetMap Wiki zoom levels, accessed 2026-04-29: `https://wiki.openstreetmap.org/wiki/Zoom_levels`
- FAA AC 21-16G / RTCA DO-160G reference, accessed 2026-04-29: `https://www.faa.gov/documentLibrary/media/Advisory_Circular/AC_21-16G.pdf`
@@ -1,97 +0,0 @@
# Question Decomposition

## Original Question

Design a GPS-denied onboard system for a fixed-wing UAV that estimates the WGS84 coordinate of each navigation-camera frame center, estimates object coordinates from a separate AI camera, and emits replacement GPS data to ArduPilot using only onboard sensors and a preloaded satellite imagery cache.

## Active Mode

- **Mode**: Mode B (solution assessment of `_docs/01_solution/solution_draft01.md`).
- **Question type**: Problem Diagnosis with Decision Support.
- **Rationale**: The first draft has a plausible architecture, but several component choices need stricter exact-fit checks against licensing, memory, cache-size, validation-data, and ArduPilot integration constraints before planning.

## Problem Context Summary

- Fixed-wing UAV, roughly 60 km/h cruise, 8-hour missions, up to 400 km² persistent satellite cache.
- Navigation camera is fixed and nadir-facing; v1 target is ADTi 20MP 20L V1 with a lens selected for roughly 10-20 cm/px at <=1 km AGL.
- AI camera is separate, gimbaled/zoomed, and only gimbal angle + zoom are available to the GPS-denied system.
- IMU and attitude data arrive from the flight controller over MAVLink.
- Onboard platform is Jetson Orin Nano Super, 8 GB shared memory, 25 W.
- Satellite imagery is preloaded before flight by Azaion Suite Satellite Service; no in-flight network dependency.
- Output to flight controller is v1 `GPS_INPUT` only, with `ODOMETRY` deferred until release/version-specific SITL validation.

## Research Subject Boundary

| Dimension | Boundary |
|-----------|----------|
| Population | Fixed-wing UAVs with downward navigation camera, not multicopter-only or indoor robot-only systems. |
| Geography | Eastern/southern Ukraine style steppe/agricultural terrain, sparse landmarks, active-conflict freshness risk. |
| Timeframe | Current practical open-source/commercial CV stack as of 2024-2026. |
| Level | Onboard production architecture and validation plan, not only offline academic benchmark. |
| Operating context | Real-time edge inference, GPS denied/spoofed, no in-flight satellite fetch, 8-hour thermal duty cycle. |
| Required interfaces | Nav-camera frames, FC IMU/attitude/altitude over MAVLink, offline tile cache, local API, MAVLink `GPS_INPUT`. |
| Non-functional envelope | <400 ms p95 output latency, <8 GB memory, 10 GB cache cap, 64 GB FDR cap, calibrated false-position budget. |

## Project Constraint Matrix Summary

| Constraint Area | Binding Constraint | Hard Disqualifiers |
|-----------------|-------------------|--------------------|
| Camera/VO | Fixed nadir monocular navigation camera, high-res frames, flight attitude from FC. | VO requiring stereo/depth as a mandatory input for v1. |
| Absolute localization | Must anchor to offline satellite cache and handle sparse terrain. | Pure VO/SLAM with no global relocalization. |
| Runtime | Jetson Orin Nano Super, 8 GB shared memory, 25 W. | Per-frame heavy VPR or models that exceed memory/thermal budgets. |
| Safety | Must report covariance and avoid confident false positions. | Matchers that output positions without calibrated uncertainty and outlier rejection. |
| Autopilot | ArduPilot v1, `GPS_INPUT` primary. | PX4-only or ODOMETRY-only integration for v1. |
| Storage | No raw frame persistence; tiles and FDR only. | Architectures depending on raw photo archive during normal flight. |
| Satellite cache | 0.5 m/px min, 0.3 m/px ideal; freshness metadata required. | Untimestamped/stale reference tiles treated as equally trustworthy. |

## Decomposed Sub-Questions

1. Which assumptions in `solution_draft01.md` are weak, under-evidenced, or contradicted by project constraints?
2. Does the selected local matcher have a product-safe licensing path and a realistic Jetson performance path?
3. Does the AnyLoc/DINOv2-VLAD VPR proposal fit the 8 GB memory and 10 GB cache budgets after descriptor/index sizing?
4. Is the 400 km² satellite cache budget plausible when 0.3-0.5 m/px imagery, overviews, manifests, descriptors, and generated tiles are included?
5. Are the public validation datasets sufficient, or does final acceptance require project-specific IMU/camera timing traces?
6. Does v1 `GPS_INPUT`-only ArduPilot integration remain safer than dual `GPS_INPUT` + `ODOMETRY` emission?
7. Are any selected components still only experimental and therefore blocked by the exact-fit gate?
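
Sub-question 4 lends itself to a back-of-envelope estimate before any measurement. The sketch below computes the uncompressed raster size for a given area and resolution, then applies a compression ratio and an overhead fraction. Both of those are placeholder assumptions to be replaced by measured values; the function names and the 3 bytes per pixel (8-bit RGB) are also assumptions, and descriptors and generated tiles are not modeled.

```python
def raw_raster_bytes(area_km2, resolution_m_per_px, bytes_per_px=3):
    """Uncompressed size of a single-resolution raster covering the area."""
    pixels = area_km2 * 1_000_000 / resolution_m_per_px ** 2
    return pixels * bytes_per_px


def estimated_cache_gb(area_km2, resolution_m_per_px, compression_ratio, overhead_frac):
    """Rough cache size in GB (10^9 bytes).

    compression_ratio and overhead_frac (overviews, manifests) are
    placeholders; they must come from measurements on real tiles.
    """
    raw = raw_raster_bytes(area_km2, resolution_m_per_px)
    return raw / compression_ratio * (1.0 + overhead_frac) / 1e9


for res in (0.5, 0.3):
    raw_gb = raw_raster_bytes(400, res) / 1e9
    print(f"{res} m/px: {raw_gb:.1f} GB raw")
# 0.5 m/px: 4.8 GB raw
# 0.3 m/px: 13.3 GB raw
```

The raw figures show that uncompressed 0.3 m/px imagery over 400 km² already exceeds a 10 GB cap, so whether the budget holds depends on the compression actually achieved and on how much space descriptors and write-back tiles consume.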

## Chosen Perspectives

- **Implementer / Engineer**: exact sensor fit, timing, memory, calibration, and integration risks.
- **Domain expert / Academic**: public evidence for UAV-to-satellite localization and VO drift behavior.
- **Contrarian / Safety**: false positive matches, stale tiles, cache poisoning, and EKF double-fusion.
- **Operator / Field**: no in-flight network, QGC visibility, reboot recovery, and failure requests.

## Search Query Variants

| Sub-Question | Query Variants |
|--------------|----------------|
| VO/VIO | `fixed wing UAV visual odometry GPS denied satellite imagery`, `monocular IMU visual inertial odometry UAV`, `cuVSLAM monocular IMU requirements`, `ORB-SLAM3 monocular inertial license`, `VINS-Fusion monocular IMU fixed wing UAV` |
| VPR | `DINOv2 visual place recognition UAV satellite localization`, `AnyLoc DINOv2 visual place recognition license`, `SatLoc dataset GNSS denied UAV`, `NaviLoc trajectory visual localization UAV`, `DINOv2 Jetson Orin latency` |
| Local matching | `SuperPoint LightGlue TensorRT Jetson Orin`, `cross-view geo-localization UAV satellite imagery 2024`, `UAV satellite image local feature matching RANSAC homography`, `SuperPoint license LightGlue license` |
| Fusion | `error state Kalman filter visual inertial GPS denied UAV`, `MAVLink GPS_INPUT horizontal accuracy covariance`, `ArduPilot external navigation GPS double fusion`, `ArduPilot EKF3 GPS_INPUT ODOMETRY` |
| Cache | `Cloud Optimized GeoTIFF offline raster cache`, `MBTiles offline raster tile cache`, `slippy map zoom meters per pixel`, `satellite imagery 30 cm resolution official`, `Maxar Airbus 30 cm CE90` |
| API/FDR | `FastAPI OpenAPI automatic documentation`, `pymavlink GPS_INPUT send message`, `MAVProxy GPS_INPUT GPS1_TYPE 14`, `flight data recorder UAV MAVLink logs` |
| Mode B weak points | `SuperPoint pretrained weights license commercial use`, `LightGlue license extractor weights`, `AnyLoc DINOv2 VLAD descriptor dimension memory`, `COG JPEG compression satellite imagery storage estimate`, `public UAV dataset GPS IMU satellite imagery AerialVL`, `ArduPilot GPS_INPUT external navigation ODOMETRY EKF3 2026` |
| Mode B round 2 weak points | `ALIKED feature extractor license LightGlue`, `DISK local feature extractor license LightGlue`, `DeDoDe local features license TensorRT`, `OpenCV SIFT patent expired commercial use`, `real-time frame queue latency latest frame processing visual odometry` |

## Completeness Audit

| Probe | Result |
|-------|--------|
| Cost / resources | Covered through Jetson, storage, model, and satellite-service constraints. |
| Physical/legal/environmental constraints | Covered through camera, thermal, DO-160-style environment, and imagery provider boundary. |
| Dependencies and assumptions | Covered through Suite Satellite Service, ArduPilot version, TensorRT/JetPack, and model licensing. |
| Operating environment | Covered: daytime, steppe/agricultural terrain, active-conflict freshness, sparse landmarks. |
| Failure modes | Covered: stale tiles, sharp turns, VO loss, false positives, cache poisoning, reboot, EKF fusion bugs. |
| Practitioner lessons | Covered through ArduPilot issue evidence and Jetson/TensorRT deployment constraints. |
| Change over time | Covered through high novelty sensitivity for VPR models, JetPack/TensorRT, and ArduPilot EKF behavior. |
| Mode B exact-fit gaps | Covered through new checks on SuperPoint licensing, LightGlue/extractor separation, AnyLoc descriptor size, COG storage measurement, and dataset realism. |
| Planning specificity | Covered through named local-matcher candidates and a scheduler/drop-policy component for AC-4.1 / AC-4.4. |
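
The scheduler/drop-policy component named in the last row is not specified in this document. As an illustration only, one common policy for bounding latency is "process the latest frame, drop the rest". The class below is a hypothetical, single-threaded sketch of that policy with drop accounting; it is not the project's design, and a real pipeline would need locking or a thread-safe queue.

```python
from collections import deque


class LatestFrameQueue:
    """Keeps only the newest frames and counts the ones it discards."""

    def __init__(self, maxlen=1):
        self._frames = deque(maxlen=maxlen)
        self.received = 0
        self.dropped = 0

    def push(self, frame):
        self.received += 1
        if len(self._frames) == self._frames.maxlen:
            # The deque is full, so appending evicts the oldest frame.
            self.dropped += 1
        self._frames.append(frame)

    def pop_latest(self):
        """Return the newest frame (or None) and discard any older ones."""
        if not self._frames:
            return None
        frame = self._frames.pop()
        self.dropped += len(self._frames)
        self._frames.clear()
        return frame

    @property
    def drop_fraction(self):
        return self.dropped / self.received if self.received else 0.0
```

Tracking `drop_fraction` in the queue itself gives the frame-drop figure that the AC-4.1 limit (<=10%) is checked against.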

## Timeliness Sensitivity Assessment

- **Research topic**: GPS-denied UAV visual localization using VO/IMU, VPR, cross-view matching, Jetson inference, and ArduPilot MAVLink output.
- **Sensitivity level**: High.
- **Rationale**: Hardware, TensorRT, JetPack, VPR foundation models, and ArduPilot EKF behavior change across releases; algorithmic principles are more stable.
- **Source time window**: Prefer 2024-2026 for implementation/tooling; older foundational VO/SLAM papers are acceptable only as background.
- **Priority official sources**: ArduPilot/MAVLink docs, NVIDIA Jetson/TensorRT/Isaac ROS docs, official provider imagery pages, official library repos.
@@ -1,349 +0,0 @@
# Source Registry

## Source #1

- **Title**: ArduPilot MAVProxy GPS Input documentation
- **Link**: https://ardupilot.org/mavproxy/docs/modules/GPSInput.html
- **Tier**: L1
- **Publication Date**: n/a
- **Timeliness Status**: Currently valid as of access date
- **Version Info**: ArduPilot/MAVProxy docs accessed 2026-04-29
- **Target Audience**: ArduPilot developers/operators
- **Research Boundary Match**: Full match
- **Summary**: GPSInput forwards JSON GPS data to the flight controller and requires `GPS1_TYPE=14` for MAVLink GPS input.
- **Related Sub-question**: Flight-controller output path

## Source #2

- **Title**: MAVLink common message spec: GPS_INPUT
- **Link**: https://mavlink.io/en/messages/common.html#GPS_INPUT
- **Tier**: L1
- **Publication Date**: n/a
- **Timeliness Status**: Currently valid as of access date
- **Version Info**: MAVLink common.xml accessed 2026-04-29
- **Target Audience**: MAVLink integrators
- **Research Boundary Match**: Full match
- **Summary**: `GPS_INPUT` is a raw GPS sensor input message, not the global position estimate; all non-ignored fields must be provided.
- **Related Sub-question**: MAVLink output and covariance
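
Because every non-ignored `GPS_INPUT` field must be populated, it helps to make the unit conversions and ignore flags explicit. The sketch below builds only the position and accuracy fields and is an illustration under stated assumptions: the helper names are hypothetical, mapping the semi-major axis of the 95% ellipse onto `horiz_accuracy` is a design assumption (the confidence level the flight controller expects is not settled here), and the remaining fields (`time_usec`, `gps_id`, `time_week`, `time_week_ms`, `fix_type`, `satellites_visible`) are left to the caller. The flag values follow the MAVLink `GPS_INPUT_IGNORE_FLAGS` enum.

```python
import math

CHI2_95_2DOF = 5.991  # 95% quantile of a chi-square distribution with 2 DOF

# Bits from the MAVLink GPS_INPUT_IGNORE_FLAGS enum
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32


def ellipse_semi_major_95_m(cov_ne):
    """Semi-major axis (m) of the 95% ellipse of a 2x2 north/east covariance in m^2."""
    (a, b), (_, d) = cov_ne
    lam_max = (a + d) / 2.0 + math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return math.sqrt(CHI2_95_2DOF * lam_max)


def gps_input_position_fields(lat_deg, lon_deg, alt_msl_m, cov_ne, vert_accuracy_m):
    """Position/accuracy subset of GPS_INPUT with DOP and velocity fields ignored."""
    return {
        "ignore_flags": (IGNORE_HDOP | IGNORE_VDOP | IGNORE_VEL_HORIZ
                         | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY),
        "lat": round(lat_deg * 1e7),   # degE7
        "lon": round(lon_deg * 1e7),   # degE7
        "alt": float(alt_msl_m),       # metres above MSL
        "horiz_accuracy": ellipse_semi_major_95_m(cov_ne),  # assumption: 95% bound
        "vert_accuracy": float(vert_accuracy_m),
    }
```

Keeping the covariance-to-accuracy mapping in one function makes it easy to swap in a different convention once the calibration tests for AC-1.4 decide what the reported accuracy should mean.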

## Source #3

- **Title**: ArduPilot issue #30076: Fixing ExternalNav + GPS
- **Link**: https://github.com/ArduPilot/ardupilot/issues/30076
- **Tier**: L4
- **Publication Date**: 2025-05-15
- **Timeliness Status**: Currently relevant, but version-specific
- **Version Info**: ArduPilot 4.6 beta context; issue marked closed by 2025-12-03
- **Target Audience**: ArduPilot EKF developers
- **Research Boundary Match**: Full match
- **Summary**: Documents a real EKF3 issue where GPS and external navigation could be fused unexpectedly, motivating version-pinned SITL before enabling `ODOMETRY` alongside GPS.
- **Related Sub-question**: Safe ArduPilot integration

## Source #4

- **Title**: NVIDIA Jetson Orin Nano Developer Kit Gets a Super Boost
- **Link**: https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost
- **Tier**: L2
- **Publication Date**: 2024-12-17
- **Timeliness Status**: Currently valid as of access date
- **Version Info**: JetPack 6.1 Super mode
- **Target Audience**: Jetson edge AI developers
- **Research Boundary Match**: Full match
- **Summary**: Confirms Jetson Orin Nano Super 67 sparse TOPS, 8 GB LPDDR5, 102 GB/s memory bandwidth, and 25 W mode.
- **Related Sub-question**: Onboard runtime feasibility

## Source #5

- **Title**: NVIDIA JetPack 6.2 Super Mode benchmarks
- **Link**: https://developer.nvidia.com/blog/nvidia-jetpack-6-2-brings-super-mode-to-nvidia-jetson-orin-nano-and-jetson-orin-nx-modules/
- **Tier**: L2
- **Publication Date**: 2025-01
- **Timeliness Status**: Currently valid as of access date
- **Version Info**: JetPack 6.2
- **Target Audience**: Jetson AI developers
- **Research Boundary Match**: Partial overlap
- **Summary**: Provides ViT benchmark rates for Orin Nano 8 GB in Super Mode, including DINOv2-base-patch14.
- **Related Sub-question**: VPR runtime feasibility

## Source #6

- **Title**: NVIDIA Isaac ROS cuVSLAM documentation
- **Link**: https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html
- **Tier**: L1
- **Publication Date**: n/a
- **Timeliness Status**: Currently valid as of access date
- **Version Info**: Isaac ROS latest docs accessed 2026-04-29
- **Target Audience**: Robotics developers
- **Research Boundary Match**: Partial overlap
- **Summary**: cuVSLAM is GPU-accelerated stereo-visual-inertial SLAM and can use IMU fallback, but its primary fit is stereo, not v1 monocular nadir camera.
- **Related Sub-question**: VO/VIO candidate fit

## Source #7

- **Title**: ORB-SLAM3 repository/license
- **Link**: https://github.com/UZ-SLAMLab/ORB_SLAM3
- **Tier**: L1
- **Publication Date**: n/a
- **Timeliness Status**: Requires license review
- **Version Info**: GPLv3; release v1.0, active repo updates observed in search
- **Target Audience**: SLAM researchers/developers
- **Research Boundary Match**: Partial overlap
- **Summary**: Supports monocular-inertial SLAM but GPLv3 is a commercial/product integration risk.
- **Related Sub-question**: VO/VIO candidate fit

## Source #8

- **Title**: VINS-Fusion repository/license
- **Link**: https://github.com/HKUST-Aerial-Robotics/VINS-Fusion
- **Tier**: L1
- **Publication Date**: n/a
- **Timeliness Status**: Requires license review
- **Version Info**: GPL-3.0
- **Target Audience**: Robotics/VIO developers
- **Research Boundary Match**: Partial overlap
- **Summary**: Supports monocular + IMU visual-inertial estimation, but GPL-3.0 and ROS-centric integration make it unsuitable as a direct embedded production dependency without approval.
- **Related Sub-question**: VO/VIO candidate fit

## Source #9

- **Title**: LightGlue / SuperPoint TensorRT ecosystem
- **Link**: https://github.com/cvg/LightGlue and https://github.com/yuefanhao/SuperPoint-LightGlue-TensorRT
- **Tier**: L1/L4
- **Publication Date**: n/a
- **Timeliness Status**: Needs license and Jetson benchmark verification
- **Version Info**: LightGlue Apache-2.0; SuperPoint weights have separate MagicLeap license
- **Target Audience**: CV developers
- **Research Boundary Match**: Full match for local matching, partial for licensing
- **Summary**: LightGlue/SuperPoint is a strong local matching candidate, but SuperPoint weight licensing and Orin Nano performance must be validated before product selection.
- **Related Sub-question**: Local cross-view matcher
|
||||
|
||||
## Source #10
|
||||
- **Title**: AnyLoc visual place recognition
|
||||
- **Link**: https://github.com/AnyLoc/AnyLoc and https://arxiv.org/pdf/2308.00688
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2023
|
||||
- **Timeliness Status**: Currently useful, but benchmark against project data required
|
||||
- **Version Info**: BSD-3-Clause, DINOv2-based VPR
|
||||
- **Target Audience**: Robotics/VPR researchers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: DINOv2 + VLAD VPR generalizes across domains and is suitable for offline descriptor generation plus event-triggered online retrieval.
|
||||
- **Related Sub-question**: VPR strategy
|
||||
|
||||
## Source #11
|
||||
- **Title**: FAISS installation and GPU package support
|
||||
- **Link**: https://github.com/facebookresearch/faiss/blob/master/INSTALL.md
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Needs direct build validation on Jetson
|
||||
- **Version Info**: Search result notes GPU packages are x86-64 only; ARM64 GPU requires source build
|
||||
- **Target Audience**: Vector search developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: FAISS CPU is viable on ARM64; FAISS GPU on Jetson may require source build and should not be assumed for v1.
|
||||
- **Related Sub-question**: VPR index runtime
|
||||
|
||||
## Source #12
|
||||
- **Title**: Fixed-wing satellite-aided visual odometry paper
|
||||
- **Link**: https://www.mdpi.com/2076-3417/14/16/7420
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2024
|
||||
- **Timeliness Status**: Currently relevant
|
||||
- **Version Info**: Applied Sciences 14(16), 7420
|
||||
- **Target Audience**: UAV navigation researchers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: Demonstrates fixed-wing GPS-denied visual odometry aided by satellite imagery on flights longer than 17 km at altitudes above 1000 m, reducing accumulated error.
|
||||
- **Related Sub-question**: Accuracy and architecture feasibility
|
||||
|
||||
## Source #13
|
||||
- **Title**: Cross-view UAV/satellite geolocalization paper
|
||||
- **Link**: https://www.mdpi.com/1424-8220/24/12/3719
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2024
|
||||
- **Timeliness Status**: Currently relevant
|
||||
- **Version Info**: Sensors 24(12), 3719
|
||||
- **Target Audience**: UAV visual localization researchers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: Shows current UAV-to-satellite localization research addressing cross-source appearance and similar-scene interference.
|
||||
- **Related Sub-question**: Cross-view matching risks
|
||||
|
||||
## Source #14
|
||||
- **Title**: Airbus Pléiades Neo imagery
|
||||
- **Link**: https://www.airbus.com/en/pleiades-neo-satellite-imagery
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Currently valid as of access date
|
||||
- **Version Info**: Official Airbus page
|
||||
- **Target Audience**: Satellite imagery customers
|
||||
- **Research Boundary Match**: Full match for imagery resolution
|
||||
- **Summary**: Confirms 30 cm native imagery and 3.5 m CE90 location accuracy.
|
||||
- **Related Sub-question**: Satellite cache SLA
|
||||
|
||||
## Source #15
|
||||
- **Title**: Vantor/Maxar WorldView Legion page
|
||||
- **Link**: https://www.maxar.com/worldview-legion
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Currently valid as of access date
|
||||
- **Version Info**: Official Vantor/Maxar page
|
||||
- **Target Audience**: Satellite imagery customers
|
||||
- **Research Boundary Match**: Full match for imagery resolution
|
||||
- **Summary**: Confirms 30 cm-class imagery capacity and native <5 m CE90 accuracy.
|
||||
- **Related Sub-question**: Satellite cache SLA
|
||||
|
||||
## Source #16
|
||||
- **Title**: OpenStreetMap zoom levels
|
||||
- **Link**: https://wiki.openstreetmap.org/wiki/Zoom_levels
|
||||
- **Tier**: L2
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Currently valid for WebMercator math
|
||||
- **Version Info**: Accessed 2026-04-29
|
||||
- **Target Audience**: Map tile developers
|
||||
- **Research Boundary Match**: Full match for tile math
|
||||
- **Summary**: Documents meters-per-pixel by zoom and latitude correction, showing zoom alone is not a physical resolution contract.
|
||||
- **Related Sub-question**: Tile cache convention
|
||||
|
||||
## Source #17
|
||||
- **Title**: FastAPI first steps and OpenAPI docs
|
||||
- **Link**: https://fastapi.tiangolo.com/tutorial/first-steps/
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Currently valid as of access date
|
||||
- **Version Info**: FastAPI docs accessed 2026-04-29
|
||||
- **Target Audience**: Python API developers
|
||||
- **Research Boundary Match**: Full match
|
||||
- **Summary**: FastAPI automatically exposes Swagger UI, ReDoc, and `/openapi.json`.
|
||||
- **Related Sub-question**: Local API and OpenAPI
|
||||
|
||||
## Source #18
|
||||
- **Title**: GDAL Cloud Optimized GeoTIFF driver
|
||||
- **Link**: https://gdal.org/en/stable/drivers/raster/cog.html
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Currently valid as of access date
|
||||
- **Version Info**: GDAL stable docs
|
||||
- **Target Audience**: Geospatial developers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: COG is a standard GeoTIFF profile optimized for tiled/ranged access and geospatial processing, useful for preprocessing and service exchange.
|
||||
- **Related Sub-question**: Cache format
|
||||
|
||||
## Source #19
|
||||
- **Title**: NVIDIA TensorRT issue #4348 — DINOv2-S Jetson Orin measurements
|
||||
- **Link**: https://github.com/NVIDIA/TensorRT/issues/4348
|
||||
- **Tier**: L4
|
||||
- **Publication Date**: 2025-02-05
|
||||
- **Timeliness Status**: Currently useful as deployment-risk evidence
|
||||
- **Version Info**: TensorRT 10.4 on JetPack 6.1, Jetson Orin
|
||||
- **Target Audience**: TensorRT/Jetson developers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: Reports DINOv2-S TensorRT runs around 22-23 ms GPU compute on Jetson Orin with limited INT8 speedup, showing project-specific benchmarking is required.
|
||||
- **Related Sub-question**: VPR runtime feasibility
|
||||
|
||||
## Source #20
|
||||
- **Title**: Magic Leap SuperPoint pretrained network license
|
||||
- **Link**: https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: Repository license accessed 2026-04-29
|
||||
- **Target Audience**: Product dependency/legal reviewers
|
||||
- **Research Boundary Match**: Full match for local-feature dependency selection
|
||||
- **Summary**: The official SuperPoint pretrained package is licensed for academic or non-profit noncommercial research use, so the weights cannot be treated as a product dependency without a separate commercial grant.
|
||||
- **Related Sub-question**: Local matcher licensing
|
||||
|
||||
## Source #21
|
||||
- **Title**: LightGlue license and extractor-license discussion
|
||||
- **Link**: https://github.com/cvg/LightGlue/blob/main/LICENSE and https://github.com/cvg/LightGlue/issues/38
|
||||
- **Tier**: L1/L4
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: LightGlue repository accessed 2026-04-29
|
||||
- **Target Audience**: CV developers and legal reviewers
|
||||
- **Research Boundary Match**: Full match for local matcher selection
|
||||
- **Summary**: LightGlue itself is Apache-2.0, but extractor weights such as SuperPoint carry their own license; the matching layer and extractor license must be reviewed separately.
|
||||
- **Related Sub-question**: Local matcher licensing and deployment
|
||||
|
||||
## Source #22
|
||||
- **Title**: AnyLoc DINOv2 VLAD descriptor release
|
||||
- **Link**: https://github.com/AnyLoc/DINO
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: AnyLoc/DINO README and release metadata accessed 2026-04-29
|
||||
- **Target Audience**: VPR developers
|
||||
- **Research Boundary Match**: Full match for VPR descriptor sizing
|
||||
- **Summary**: AnyLoc DINOv2 VLAD examples produce 49,152-dimensional global descriptors and provide pretrained cluster centers, making descriptor compression/index sizing a first-order Jetson/cache concern.
|
||||
- **Related Sub-question**: VPR memory and cache footprint
|
||||
|
||||
## Source #23
|
||||
- **Title**: NVIDIA Isaac ROS cuVSLAM documentation
|
||||
- **Link**: https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: Isaac ROS latest docs accessed 2026-04-29
|
||||
- **Target Audience**: Robotics developers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: cuVSLAM is described as stereo-visual-inertial SLAM/odometry, supports multiple stereo cameras, and uses IMU-only propagation only for short degraded intervals around one second.
|
||||
- **Related Sub-question**: VO/VIO candidate fit
|
||||
|
||||
## Source #24
|
||||
- **Title**: GDAL Cloud Optimized GeoTIFF driver and OGC COG standard
|
||||
- **Link**: https://gdal.org/en/stable/drivers/raster/cog.html and http://www.opengis.net/doc/is/COG/1.0
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: GDAL stable docs and OGC COG 1.0 accessed 2026-04-29
|
||||
- **Target Audience**: Geospatial/cache engineers
|
||||
- **Research Boundary Match**: Full match for cache exchange format
|
||||
- **Summary**: COG supports tiled imagery, overviews, and compression options including JPEG/DEFLATE/ZSTD/WEBP; actual bytes per pixel must be measured on representative provider imagery rather than assumed from zoom level.
|
||||
- **Related Sub-question**: Satellite cache storage feasibility
|
||||
|
||||
## Source #25
|
||||
- **Title**: AerialVL public aerial visual localization dataset
|
||||
- **Link**: https://github.com/hmf21/AerialVL and https://udspace.udel.edu/items/338c0b7c-993b-476c-a095-6820c6f1c031
|
||||
- **Tier**: L1/L2
|
||||
- **Publication Date**: 2024
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: Dataset/repository accessed 2026-04-29
|
||||
- **Target Audience**: UAV localization researchers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: Provides long aerial visual localization trajectories, RGB imagery, GNSS, reference satellite patches, and VPR/sequence localization/VO evaluation tasks; terrain and sensor details still differ from the fixed-wing deployment target.
|
||||
- **Related Sub-question**: Validation data availability
|
||||
|
||||
## Source #26
|
||||
- **Title**: UAV-VisLoc dataset
|
||||
- **Link**: https://arxiv.org/html/2405.11936v1
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2024
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: arXiv version accessed 2026-04-29
|
||||
- **Target Audience**: UAV geo-localization researchers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: Provides UAV images and satellite maps with metadata including coordinates, altitude, heading, and capture dates; useful for cross-view retrieval evaluation but not a complete substitute for the project's FC IMU traces.
|
||||
- **Related Sub-question**: Validation data availability
|
||||
|
||||
## Source #27
|
||||
- **Title**: LightGlue feature extractor support for ALIKED, DISK, and SIFT
|
||||
- **Link**: https://github.com/cvg/LightGlue
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: LightGlue repository accessed 2026-04-29
|
||||
- **Target Audience**: CV developers
|
||||
- **Research Boundary Match**: Full match for local matcher productization
|
||||
- **Summary**: LightGlue supports multiple extractor families including ALIKED, DISK, SuperPoint, and SIFT; ALIKED/LightGlue ONNX deployment examples exist, and the LightGlue repository is Apache-2.0.
|
||||
- **Related Sub-question**: License-cleared local feature candidates
|
||||
|
||||
## Source #28
|
||||
- **Title**: DeDoDe local feature matcher and deployment ports
|
||||
- **Link**: https://github.com/Parskatt/DeDoDe
|
||||
- **Tier**: L1
|
||||
- **Publication Date**: 2024
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: Repository and release metadata accessed 2026-04-29
|
||||
- **Target Audience**: CV developers
|
||||
- **Research Boundary Match**: Partial overlap
|
||||
- **Summary**: DeDoDe is MIT-licensed, provides detector/descriptor models, and has ONNX/TensorRT ports; larger DINOv2-based descriptors still require model-size and runtime validation.
|
||||
- **Related Sub-question**: License-cleared local feature candidates
|
||||
|
||||
## Source #29
|
||||
- **Title**: OpenCV SIFT patent expiration and main-module availability
|
||||
- **Link**: https://github.com/opencv/opencv/blob/4.x/modules/features2d/src/sift.dispatch.cpp
|
||||
- **Tier**: L1/L3
|
||||
- **Publication Date**: n/a
|
||||
- **Timeliness Status**: Current as of access date
|
||||
- **Version Info**: OpenCV 4.x source and patent-expiration references accessed 2026-04-29
|
||||
- **Target Audience**: CV developers/legal reviewers
|
||||
- **Research Boundary Match**: Full match for classical feature baseline
|
||||
- **Summary**: SIFT is available in OpenCV's main features2d module after patent expiration, making SIFT a practical commercial-safe classical baseline.
|
||||
- **Related Sub-question**: License-cleared local feature candidates
|
||||
@@ -1,244 +0,0 @@
|
||||
# Fact Cards
|
||||
|
||||
## Fact #1
|
||||
- **Statement**: ArduPilot MAVProxy GPSInput requires `GPS1_TYPE=14` to accept MAVLink GPS input.
|
||||
- **Source**: Source #1
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: ArduPilot v1 integration
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Flight-controller output
|
||||
- **Fit Impact**: Supports `GPS_INPUT` v1 selection
|
||||
|
||||
## Fact #2
|
||||
- **Statement**: MAVLink `GPS_INPUT` is a raw GPS sensor input message, not the global position estimate of the system.
|
||||
- **Source**: Source #2
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: MAVLink integrators
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Output semantics and covariance
|
||||
- **Fit Impact**: Requires careful accuracy/covariance fields and FC EKF configuration
|
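
To make the "raw sensor input" semantics concrete, the following is a minimal sketch of how an estimator output might be mapped onto a subset of `GPS_INPUT` fields. The field names, units (`lat`/`lon` in degrees × 1e7, accuracies in meters), and ignore-flag bit values follow the MAVLink common message set; the function name `gps_input_fields`, the choice to ignore the velocity fields, and the rule that combines north/east sigmas into `horiz_accuracy` are illustrative assumptions, not project decisions. Time-of-week fields, DOP values, and the actual pymavlink send call are deliberately left out.

```python
import math

# Bit values of GPS_INPUT_IGNORE_FLAGS in the MAVLink common message set.
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32


def gps_input_fields(time_usec, lat_deg, lon_deg, alt_msl_m,
                     sigma_north_m, sigma_east_m, sigma_down_m):
    """Build a partial GPS_INPUT field dict from a position estimate.

    Assumption: horizontal accuracy is reported as the root-sum-square of
    the north and east 1-sigma values. The project may prefer another rule.
    """
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        # Assumption: this sketch supplies no velocity, so it is flagged as ignored.
        "ignore_flags": IGNORE_VEL_HORIZ | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY,
        "fix_type": 3,                        # 3 = 3D fix
        "lat": int(round(lat_deg * 1e7)),     # degE7
        "lon": int(round(lon_deg * 1e7)),     # degE7
        "alt": float(alt_msl_m),              # meters above mean sea level
        "horiz_accuracy": math.hypot(sigma_north_m, sigma_east_m),  # meters
        "vert_accuracy": float(sigma_down_m),                       # meters
    }


fields = gps_input_fields(0, 48.0, 35.0, 500.0, 3.0, 4.0, 2.0)
print(fields["lat"], fields["horiz_accuracy"])
```

The point of the sketch is that the accuracy fields come from the estimator's own covariance instead of fixed constants, so the flight controller's EKF can weight the input honestly.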
||||
|
||||
## Fact #3
|
||||
- **Statement**: ArduPilot issue #30076 documented EKF3 instability when external navigation and GPS were fused unexpectedly; the issue is version-specific and marked closed, but it proves the risk class is real.
|
||||
- **Source**: Source #3
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: ArduPilot v1/v1.1 release planning
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Autopilot source fusion
|
||||
- **Fit Impact**: Supports GPS_INPUT-only v1 and SITL gate before ODOMETRY
|
||||
|
||||
## Fact #4
|
||||
- **Statement**: Jetson Orin Nano Super officially provides 67 sparse TOPS, 8 GB LPDDR5, 102 GB/s memory bandwidth, and a 25 W mode.
|
||||
- **Source**: Source #4
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Onboard runtime sizing
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Hardware envelope
|
||||
- **Fit Impact**: Supports feasibility, but memory/thermal profiling remains mandatory
|
||||
|
||||
## Fact #5
|
||||
- **Statement**: NVIDIA reports TensorRT FP16 ViT benchmark rates on Orin Nano 8 GB Super Mode, including DINOv2-base-patch14 around 126 FPS in the published table.
|
||||
- **Source**: Source #5
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: VPR runtime planning
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: VPR model feasibility
|
||||
- **Fit Impact**: Supports conditional DINOv2 VPR, not per-frame full pipeline guarantee
|
||||
|
||||
## Fact #6
|
||||
- **Statement**: A TensorRT issue report measured DINOv2-S on Jetson Orin at roughly 22-23 ms GPU compute with limited INT8 speedup.
|
||||
- **Source**: Source #19
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Jetson VPR deployment
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Model optimization risk
|
||||
- **Fit Impact**: Requires project benchmark; INT8 speedup must not be assumed
|
||||
|
||||
## Fact #7
|
||||
- **Statement**: NVIDIA cuVSLAM is a GPU-accelerated stereo-visual-inertial SLAM and odometry library; it can use IMU fallback but is designed around stereo camera input.
|
||||
- **Source**: Source #6
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: VO/VIO component selection
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Required camera inputs
|
||||
- **Fit Impact**: Reject as lead v1 VO for a single fixed monocular nav camera
|
||||
|
||||
## Fact #8
|
||||
- **Statement**: ORB-SLAM3 and VINS-Fusion support monocular-inertial modes, but both are GPL-3.0 licensed.
|
||||
- **Source**: Source #7, Source #8
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Product dependency selection
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Licensing and integration
|
||||
- **Fit Impact**: Experimental/reference only unless legal approval is obtained
|
||||
|
||||
## Fact #9
|
||||
- **Statement**: A 2024 fixed-wing UAV study used satellite imagery to reduce accumulated visual odometry error on flights longer than 17 km at altitudes above 1000 m.
|
||||
- **Source**: Source #12
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Architecture feasibility
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Satellite-aided VO
|
||||
- **Fit Impact**: Supports hybrid VO + satellite anchor architecture
|
||||
|
||||
## Fact #10
|
||||
- **Statement**: Recent UAV-to-satellite cross-view localization research targets source-domain appearance differences and similar-scene interference, confirming these are core risks.
|
||||
- **Source**: Source #13
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Cross-view matching design
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: False-match risk
|
||||
- **Fit Impact**: Requires top-K VPR, local geometric verification, covariance gating, and stale-tile controls
|
||||
|
||||
## Fact #11
|
||||
- **Statement**: Airbus Pléiades Neo advertises native 30 cm imagery and 3.5 m CE90 location accuracy.
|
||||
- **Source**: Source #14
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Satellite Service SLA
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Reference imagery resolution
|
||||
- **Fit Impact**: Supports 0.3 m/px ideal cache target
|
||||
|
||||
## Fact #12
|
||||
- **Statement**: Vantor/Maxar states its constellation provides 30 cm-class imagery and native <5 m CE90 accuracy.
|
||||
- **Source**: Source #15
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Satellite Service SLA
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Reference imagery resolution
|
||||
- **Fit Impact**: Supports 0.3 m/px ideal cache target
|
||||
|
||||
## Fact #13
|
||||
- **Statement**: OpenStreetMap zoom-level math gives meters-per-pixel at the equator and requires multiplying by cosine(latitude); zoom alone does not define physical pixel size.
|
||||
- **Source**: Source #16
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Cache engineering
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Tile resolution convention
|
||||
- **Fit Impact**: Supports explicit pixel-size cache contract
|
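
The latitude correction in Fact #13 can be written out as a short calculation. This is a sketch of the standard Web Mercator ground-resolution formula for 256-pixel tiles; the function name `meters_per_pixel` and the example latitude of 48° N (chosen as roughly representative of the operating region) are assumptions for illustration.

```python
import math

WGS84_EQUATORIAL_RADIUS_M = 6378137.0


def meters_per_pixel(zoom, lat_deg, tile_px=256):
    """Ground resolution of a Web Mercator tile pixel at a given latitude."""
    # Resolution at the equator for zoom 0, about 156543.03 m/px for 256 px tiles.
    equator_z0 = 2 * math.pi * WGS84_EQUATORIAL_RADIUS_M / tile_px
    return equator_z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


for lat in (0.0, 48.0):
    print(f"zoom 19 at {lat:4.1f} deg: {meters_per_pixel(19, lat):.3f} m/px")
```

At zoom 19 the equatorial value is about 0.30 m/px, while at 48° N it drops to about 0.20 m/px, which is why the cache contract should state pixel size in meters instead of a zoom level.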
||||
|
||||
## Fact #14
|
||||
- **Statement**: FastAPI automatically exposes interactive API docs and an OpenAPI schema.
|
||||
- **Source**: Source #17
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Local API design
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: OpenAPI documentation
|
||||
- **Fit Impact**: Supports Python/FastAPI for local health/session/object API
|
||||
|
||||
## Fact #15
|
||||
- **Statement**: Prebuilt FAISS GPU packages should not be assumed to be available on Jetson ARM64; CPU FAISS or source-built GPU FAISS must be validated.
|
||||
- **Source**: Source #11
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: VPR index deployment
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Index runtime
|
||||
- **Fit Impact**: Select CPU FAISS/HNSW-flat as v1 baseline; GPU FAISS is optimization only
|
||||
|
||||
## Fact #16
|
||||
- **Statement**: COG is a standard GeoTIFF profile useful for geospatial processing, while MBTiles-style SQLite tile packages are better aligned with local offline tile lookup.
|
||||
- **Source**: Source #18 plus offline cache search
|
||||
- **Phase**: Phase 2
|
||||
- **Target Audience**: Cache storage
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Cache format
|
||||
- **Fit Impact**: Select COG/GeoTIFF for Satellite Service exchange and SQLite/MBTiles-like package for onboard lookup/index sidecars
|
||||
|
||||
## Fact #17
|
||||
- **Statement**: Official Magic Leap SuperPoint pretrained weights are restricted to academic or non-profit noncommercial research use.
|
||||
- **Source**: Source #20
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: Product dependency selection
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Local matcher licensing
|
||||
- **Fit Impact**: Reject official SuperPoint weights as a v1 product dependency unless a commercial license is obtained
|
||||
|
||||
## Fact #18
|
||||
- **Statement**: LightGlue's Apache-2.0 license does not automatically license upstream feature extractors; extractor weights must be reviewed separately.
|
||||
- **Source**: Source #21
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: Product dependency selection
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Local matcher licensing
|
||||
- **Fit Impact**: Select LightGlue only behind a license-cleared extractor interface
|
||||
|
||||
## Fact #19
|
||||
- **Statement**: AnyLoc DINOv2 VLAD examples produce 49,152-dimensional descriptors, so a large multi-scale VPR gallery can become a memory/storage problem if descriptors are stored uncompressed.
|
||||
- **Source**: Source #22
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: VPR/cache developers
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: VPR descriptor footprint
|
||||
- **Fit Impact**: Requires PCA/quantization or smaller descriptors before the approach can satisfy the 8 GB memory and 10 GB cache budgets
|
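
The footprint concern in Fact #19 is simple arithmetic, sketched below. The 49,152 dimension comes from Source #22; the float32 storage, the gallery size of 20,000 descriptors, and the 512-dimensional float16 reduced form are hypothetical values chosen only to show the order of magnitude, not measured project numbers.

```python
def gallery_bytes(num_descriptors, dim, bytes_per_value):
    """Raw storage for a dense descriptor gallery, excluding index overhead."""
    return num_descriptors * dim * bytes_per_value


RAW_DIM = 49_152               # AnyLoc DINOv2 VLAD example dimension (Source #22)
HYPOTHETICAL_GALLERY = 20_000  # assumption: chunks x scales x variants

per_descriptor = gallery_bytes(1, RAW_DIM, 4)                    # float32
raw_total = gallery_bytes(HYPOTHETICAL_GALLERY, RAW_DIM, 4)      # float32
reduced_total = gallery_bytes(HYPOTHETICAL_GALLERY, 512, 2)      # assumed PCA-512, float16

print(f"per descriptor: {per_descriptor / 1024:.0f} KiB")
print(f"raw gallery:    {raw_total / 2**30:.2f} GiB")
print(f"reduced:        {reduced_total / 2**20:.2f} MiB")
```

Under these assumptions a single uncompressed descriptor is 192 KiB and the raw gallery is several GiB, while the reduced form is tens of MiB. Whether a reduction of that size preserves retrieval accuracy is exactly what the compression gate has to measure.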
||||
|
||||
## Fact #20
|
||||
- **Statement**: NVIDIA describes cuVSLAM as stereo-visual-inertial SLAM/odometry and documents IMU-only degraded tracking as suitable only for short intervals around one second.
|
||||
- **Source**: Source #23
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: VO/VIO component selection
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Required camera inputs and fallback duration
|
||||
- **Fit Impact**: Keep cuVSLAM rejected as a v1 lead dependency for the fixed monocular navigation camera, but keep it as a Jetson benchmark/reference if hardware changes
|
||||
|
||||
## Fact #21
|
||||
- **Statement**: COG supports tiled storage, overviews, and multiple compression profiles, but the docs do not define a universal bytes-per-pixel budget for 0.3 m satellite imagery.
|
||||
- **Source**: Source #24
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: Cache engineers
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Satellite cache storage
|
||||
- **Fit Impact**: Treat the 10 GB cache as a measured acceptance gate, not as proven by zoom-level math
|
||||
|
||||
## Fact #22
|
||||
- **Statement**: AerialVL provides public aerial localization trajectories with RGB imagery, GNSS, and satellite reference patches, but it is only a partial match for the fixed-wing/ArduPilot/IMU deployment target.
|
||||
- **Source**: Source #25
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: Validation planning
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Dataset realism
|
||||
- **Fit Impact**: Use public datasets for early VPR and cross-view benchmarks, but require SITL or real FC IMU traces for final fusion validation
|
||||
|
||||
## Fact #23
|
||||
- **Statement**: UAV-VisLoc provides UAV images, satellite maps, and metadata such as coordinates, altitude, heading, and capture date, but does not replace the need for project-specific IMU/camera timing traces.
|
||||
- **Source**: Source #26
|
||||
- **Phase**: Mode B Assessment
|
||||
- **Target Audience**: Validation planning
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Dataset realism
|
||||
- **Fit Impact**: Add dataset adapters for retrieval/localization tests while keeping final acceptance tied to project replay and ArduPilot SITL
|
||||
|
||||
## Fact #24
|
||||
- **Statement**: LightGlue supports ALIKED, DISK, SIFT, and other extractors, so the local matcher can name concrete license-cleared candidates instead of an abstract "license-cleared extractor."
|
||||
- **Source**: Source #27
|
||||
- **Phase**: Mode B Round 2
|
||||
- **Target Audience**: Local matcher productization
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Local feature selection
|
||||
- **Fit Impact**: Select ALIKED + LightGlue and OpenCV SIFT/AKAZE as concrete v1 candidates; keep SuperPoint rejected unless licensed
|
||||
|
||||
## Fact #25
|
||||
- **Statement**: DeDoDe is MIT-licensed and has ONNX/TensorRT deployment ports, making it a plausible learned-feature fallback, but its model size and DINOv2-related variants still require Jetson validation.
|
||||
- **Source**: Source #28
|
||||
- **Phase**: Mode B Round 2
|
||||
- **Target Audience**: Local matcher productization
|
||||
- **Confidence**: Medium
|
||||
- **Related Dimension**: Local feature selection
|
||||
- **Fit Impact**: Mark DeDoDe as experimental fallback until runtime and cross-domain accuracy are measured
|
||||
|
||||
## Fact #26
|
||||
- **Statement**: SIFT is available in OpenCV's main features2d module after patent expiration, supporting its use as a commercial-safe classical baseline.
|
||||
- **Source**: Source #29
|
||||
- **Phase**: Mode B Round 2
|
||||
- **Target Audience**: Local matcher productization
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Classical matching baseline
|
||||
- **Fit Impact**: Select OpenCV SIFT/AKAZE as the legal baseline for local geometric verification and regression tests
|
||||
|
||||
## Fact #27
|
||||
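- **Statement**: With 3 Hz camera input (one frame roughly every 333 ms) and a <400 ms p95 output latency requirement, a FIFO frame queue can violate latency even when every component is individually fast enough, because any frame that waits behind another adds queueing delay on top of its own processing time.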
|
||||
- **Source**: Derived from AC-4.1 and AC-4.4 timing constraints
|
||||
- **Phase**: Mode B Round 2
|
||||
- **Target Audience**: Real-time pipeline design
|
||||
- **Confidence**: High
|
||||
- **Related Dimension**: Runtime scheduling
|
||||
- **Fit Impact**: Add a bounded latest-frame scheduler and explicit drop/backpressure policy to the architecture
|
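
The queueing effect in Fact #27 can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions: a single worker, frames arriving every 333 ms (approximating 3 Hz), and a constant 350 ms processing time chosen to sit just above the frame period. The function name `simulate` and all numbers are illustrative, not measured pipeline values.

```python
def simulate(n_frames, period_ms, proc_ms, latest_only):
    """Return (latencies_ms, dropped) for one worker consuming camera frames.

    latest_only=False models a FIFO queue; latest_only=True models a
    latest-frame slot that discards older frames still waiting.
    """
    latencies = []
    dropped = 0
    free_at = 0      # time at which the worker becomes idle
    next_frame = 0   # oldest frame index not yet processed or dropped
    while next_frame < n_frames:
        arrival = next_frame * period_ms
        start = max(free_at, arrival)
        if latest_only:
            # Jump to the newest frame that has already arrived at `start`.
            newest = min(start // period_ms, n_frames - 1)
            dropped += newest - next_frame
            next_frame = newest
            arrival = next_frame * period_ms
        finish = start + proc_ms
        latencies.append(finish - arrival)  # capture-to-output latency
        free_at = finish
        next_frame += 1
    return latencies, dropped


fifo, fifo_dropped = simulate(60, 333, 350, latest_only=False)
latest, latest_dropped = simulate(60, 333, 350, latest_only=True)
print("FIFO   max latency:", max(fifo), "ms, dropped:", fifo_dropped)
print("latest max latency:", max(latest), "ms, dropped:", latest_dropped)
```

In this model the FIFO latency grows by 17 ms with every frame and never recovers, while the latest-frame policy keeps latency below one processing time plus one frame period at the cost of occasional drops. Bounding the backlog alone does not guarantee the 400 ms target; the processing time itself still has to fit the budget.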
||||
@@ -1,44 +0,0 @@
|
||||
# Comparison Framework
|
||||
|
||||
## Selected Framework Type
|
||||
|
||||
Mode B weak-point assessment with exact-fit validation.
|
||||
|
||||
## Selected Dimensions
|
||||
|
||||
1. Functional fit against the Project Constraint Matrix.
|
||||
2. Licensing and productization risk.
|
||||
3. Jetson runtime and memory fit.
|
||||
4. Cache/storage fit.
|
||||
5. Validation evidence quality.
|
||||
6. Security/safety impact.
|
||||
7. Selection status.
|
||||
|
||||
## Initial Population
|
||||
|
||||
| Component Area | Candidate / Decision | Functional Fit | Runtime / Storage Fit | Evidence | Status |
|
||||
|----------------|----------------------|----------------|-----------------------|----------|--------|
|
||||
| Runtime scheduling | Bounded latest-frame scheduler with explicit drop/backpressure policy | Fits AC-4.1 frame-drop allowance and AC-4.4 no-batching requirement | Prevents a stale FIFO backlog when a frame takes close to the 400 ms p95 budget | Fact #27 | Selected |
|
||||
| Relative motion | Custom planar VO/IMU | Fits fixed nadir monocular camera, flat terrain, FC attitude/altitude, and covariance output | Must run on downsampled/ROI imagery and skip under load | Facts #4, #9, #20 | Selected |
|
||||
| Relative motion | NVIDIA cuVSLAM | Official docs emphasize stereo-visual-inertial SLAM; not exact-fit for one fixed monocular nav camera | Good Jetson stack, but wrong input assumptions for v1 | Facts #7, #20 | Rejected for v1 |
|
||||
| Coarse VPR | AnyLoc/DINOv2-VLAD over chunks | Good retrieval shape for cross-domain top-K, conditional invocation only | 49,152-d descriptors require PCA/quantization/index-size proof | Facts #5, #6, #19 | Selected with compression gate |
|
||||
| Local refinement | Official Magic Leap SuperPoint weights | Good technical candidate for local features | Product license blocks commercial use without separate grant | Fact #17 | Rejected for product v1 |
|
||||
| Local refinement | ALIKED + LightGlue | Names a concrete license-cleared learned-feature candidate | Needs Jetson benchmark, ONNX/TensorRT proof, and cross-domain accuracy proof | Facts #18, #24 | Selected candidate |
|
||||
| Local refinement | OpenCV SIFT/AKAZE + classical matching | Commercial-safe baseline for geometric verification and regression tests | May be weaker on cross-domain sparse fields | Facts #10, #26 | Selected baseline |
|
||||
| Local refinement | DeDoDe | MIT-licensed learned-feature fallback with ONNX/TensorRT ports | Model size/runtime and DINOv2-related variants need validation | Fact #25 | Experimental only |
|
||||
| State estimation | ESKF in local NED/ENU | Owns covariance, source labels, outlier rejection, and false-position budget | CPU feasible if bounded; needs Monte Carlo calibration | Facts #1, #2, #9, #10 | Selected |
|
||||
| Flight-controller output | pymavlink `GPS_INPUT` only for v1 | Matches ArduPilot replacement-GPS framing | Low runtime load; fields must be honest raw-GPS-sensor values | Facts #1, #2, #3 | Selected |
|
||||
| Flight-controller output | Dual `GPS_INPUT` + `ODOMETRY` | Richer covariance/yaw semantics, but overlaps source fusion | Version-specific EKF risk remains | Fact #3 | Deferred to v1.1 |
|
||||
| Cache | 0.3-0.5 m/px COG/GeoTIFF exchange + SQLite package | Correct service/onboard split | 10 GB budget requires measured compression + descriptor/index proof | Facts #13, #16, #21 | Selected with storage gate |
|
||||
| Validation | Public datasets only | Useful for VPR/cross-view early proof | Does not cover project FC IMU timing and ArduPilot injection | Facts #22, #23 | Insufficient alone |
|
||||
| API | FastAPI local service | Fits OpenAPI/local control/object localization | Keep out of hot path; default docs at `/docs`, `/redoc`, OpenAPI path configurable | Fact #14, Context7 FastAPI docs | Selected |
|
||||
|
||||
## Rejected / Deferred Candidates
|
||||
|
||||
| Candidate | Reason |
|
||||
|-----------|--------|
|
||||
| Official SuperPoint pretrained weights as direct product dependency | Noncommercial research license blocks product use without separate commercial permission. |
|
||||
| Uncompressed AnyLoc VLAD descriptors in the runtime cache | 49,152-dimensional descriptors can consume hundreds of MB to GB once multi-scale chunks and variants are included. |
|
||||
| Per-frame DINOv2 VPR | Wastes latency/thermal budget and conflicts with AC-8.6 conditional VPR. |
|
||||
| `GPS_INPUT` + `ODOMETRY` dual emission in v1 | Still too risky without a version-pinned ArduPilot SITL run proving no source double-fusion. |
|
||||
| Public datasets as final validation substitute | They do not replace FC IMU, camera timing, thermal, and MAVLink injection evidence from the actual deployment stack. |
|
||||
@@ -1,111 +0,0 @@
|
||||
# Reasoning Chain
|
||||
|
||||
## Dimension 1: Local Matcher Product Fit
|
||||
|
||||
### Fact Confirmation
|
||||
SuperPoint-style features remain technically attractive for local geometric verification, but the official Magic Leap pretrained weights are noncommercial research-only (Fact #17). LightGlue itself is Apache-2.0, but it does not license upstream extractors (Fact #18). LightGlue supports ALIKED, DISK, SIFT, and other extractors (Fact #24), DeDoDe is MIT-licensed with deployment ports (Fact #25), and OpenCV SIFT is now a commercial-safe classical baseline (Fact #26).
|
||||
|
||||
### Reference Comparison
|
||||
`solution_draft02.md` fixed the SuperPoint licensing issue but left "license-cleared extractor" too abstract for planning. The architecture can keep the local-verification stage, but planning needs named candidates so benchmark and licensing tasks can be decomposed.
|
||||
|
||||
### Conclusion
|
||||
Reject official SuperPoint pretrained weights for product v1 unless a commercial license is obtained. Select ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback pending Jetson/model-size validation.
|
||||
|
||||
### Confidence
|
||||
High for licensing; Medium for final extractor accuracy until benchmarked.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 1.5: Real-Time Scheduling
|
||||
|
||||
### Fact Confirmation
|
||||
The camera produces frames at 3 Hz, AC-4.1 allows <400 ms p95 end-to-end latency with up to ~10% dropped frames, and AC-4.4 forbids batching or delaying output (Fact #27).
|
||||
|
||||
### Reference Comparison
|
||||
A FIFO queue can accumulate stale frames whenever a heavy VPR or local-matching event exceeds the 333 ms camera interval. That would make the system accurate on old images while violating the flight-controller output latency budget.
|
||||
|
||||
### Conclusion
|
||||
Add a bounded latest-frame scheduler: camera queue size 1, explicit drop accounting, IMU propagation continues between image fixes, VPR/local matching run under deadlines, and every emitted `GPS_INPUT` references the freshest state timestamp.
|
||||
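The conclusion above can be sketched as a single-slot mailbox. This is a minimal illustration, not the project's implementation: the names `Frame`, `LatestFrameSlot`, and `within_deadline` are hypothetical, and the 0.400 s default simply mirrors the AC-4.1 latency budget.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    seq: int          # camera frame counter
    t_capture: float  # capture time on a monotonic clock, seconds


class LatestFrameSlot:
    """Queue of size 1: a newer frame overwrites an unconsumed older one."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._frame: Optional[Frame] = None
        self.dropped = 0  # explicit drop accounting

    def put(self, frame: Frame) -> None:
        with self._cond:
            if self._frame is not None:
                self.dropped += 1  # the stale frame is replaced, never queued
            self._frame = frame
            self._cond.notify()

    def get(self, timeout: Optional[float] = None) -> Optional[Frame]:
        """Return the freshest frame, or None if none arrives within timeout."""
        with self._cond:
            self._cond.wait_for(lambda: self._frame is not None, timeout)
            frame, self._frame = self._frame, None
            return frame


def within_deadline(frame: Frame, now: float, budget_s: float = 0.400) -> bool:
    """True while the frame can still meet the latency budget (assumed 400 ms)."""
    return (now - frame.t_capture) < budget_s
```

A camera thread would call `put()` at 3 Hz while the processing loop calls `get()`. When a heavy VPR or matching event overruns the 333 ms interval, the next `get()` returns only the newest frame and `dropped` records how many were skipped, which is the number the ~10% drop allowance would be checked against.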
|
||||
### Confidence
|
||||
High.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 2: VPR Descriptor and Cache Footprint
|
||||
|
||||
### Fact Confirmation
|
||||
AnyLoc DINOv2 VLAD examples produce 49,152-dimensional descriptors (Fact #19). The operational area can be up to 400 km² with multi-scale, overlapping chunks.
|
||||
|
||||
### Reference Comparison
|
||||
Event-triggered VPR is still the right architecture, but uncompressed VLAD descriptors can quietly consume a large fraction of RAM/cache. For example, 4,000-10,000 chunks at 49,152 float32 values each is roughly 0.8-2.0 GB before multi-scale variants, indexes, metadata, and model/runtime memory.
|
||||
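The footprint arithmetic above can be reproduced with a short calculation. The raw case uses only the figures quoted in this section; the float16 and 512-dimensional PCA variants are hypothetical compression options included purely for comparison, not measured or selected values.

```python
RAW_DIMS = 49_152  # AnyLoc DINOv2 VLAD descriptor length (Fact #19)


def descriptor_gb(num_chunks: int, dims: int, bytes_per_value: int) -> float:
    """Descriptor payload in decimal GB, excluding index and metadata overhead."""
    return num_chunks * dims * bytes_per_value / 1e9


for chunks in (4_000, 10_000):
    raw = descriptor_gb(chunks, RAW_DIMS, 4)     # float32, uncompressed
    half = descriptor_gb(chunks, RAW_DIMS, 2)    # float16 (assumed option)
    pca = descriptor_gb(chunks, 512, 2)          # PCA to 512 dims + float16 (assumed option)
    print(f"{chunks:>6} chunks: raw {raw:.2f} GB, fp16 {half:.2f} GB, pca512 {pca:.3f} GB")
```

The raw float32 rows land at roughly 0.79 GB and 1.97 GB, matching the 0.8-2.0 GB estimate. Whether any compressed variant keeps retrieval accuracy acceptable is exactly what the compression gate has to measure.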
|
||||
### Conclusion
|
||||
Keep AnyLoc/DINOv2-style VPR as the lead retrieval family only with a mandatory descriptor-compression gate: PCA/float16/product quantization or a smaller descriptor must be chosen before implementation freeze. CPU FAISS/HNSW remains the v1 baseline until Jetson GPU indexing is proven.
|
||||
|
||||
### Confidence
|
||||
High for the footprint risk; Medium for the best compression/index choice.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 3: Satellite Cache Storage
|
||||
|
||||
### Fact Confirmation
|
||||
COG supports tiled imagery, overviews, and multiple compression profiles, but its documentation does not provide a universal bytes-per-pixel budget for the target imagery (Fact #21). Zoom level alone does not prove physical resolution (Fact #13).
|
||||
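To illustrate why zoom level is only a nominal figure, the standard Web Mercator relation for 256-pixel tiles gives the tile pixel size on the ground as a function of latitude and zoom. The sketch below assumes that tiling scheme and uses 48° N as an illustrative latitude for the operating region; it says nothing about the native resolution of the imagery a provider resampled into those tiles.

```python
import math

# Equatorial circumference (WGS84 semi-major axis) divided by a 256 px tile at zoom 0.
EQUATOR_M_PER_PX_Z0 = 2 * math.pi * 6378137.0 / 256


def nominal_m_per_px(lat_deg: float, zoom: int) -> float:
    """Nominal ground size of one tile pixel in Web Mercator, metres."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


for z in (17, 18, 19):
    print(f"zoom {z}: {nominal_m_per_px(48.0, z):.3f} m/px nominal at 48 N")
```

A zoom-18 tile at this latitude is nominally about 0.4 m/px, but a tile upsampled from coarser source imagery carries the same zoom label. The manifest therefore needs an explicit, measured pixel size rather than a zoom number.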
|
||||
### Reference Comparison
|
||||
The 10 GB persistent cache budget may be plausible with lossy compressed 0.3-0.5 m/px imagery and careful indexing, but it is not proven until representative Suite Satellite Service imagery is packaged with overviews, manifests, descriptors, and generated-tile sidecars.
|
||||
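A rough pixel-count calculation shows why the budget is sensitive to compression. In the sketch below, the 3 bytes/px figure is uncompressed 8-bit RGB, the 0.5 bytes/px figure is a purely hypothetical lossy-compression outcome, and the 4/3 factor is the geometric-series cost of a full factor-2 overview pyramid. None of these replace the cache-packing benchmark.

```python
def raster_gb(area_km2: float, gsd_m: float, bytes_per_px: float,
              with_overviews: bool = True) -> float:
    """Estimated raster storage in decimal GB for a given area and pixel size."""
    pixels = area_km2 * 1e6 / (gsd_m ** 2)
    if with_overviews:
        pixels *= 4.0 / 3.0  # 1 + 1/4 + 1/16 + ... for factor-2 overviews
    return pixels * bytes_per_px / 1e9


for gsd in (0.5, 0.3):
    raw = raster_gb(400, gsd, 3.0)    # uncompressed RGB
    lossy = raster_gb(400, gsd, 0.5)  # hypothetical compressed bytes/px
    print(f"{gsd} m/px: raw {raw:.1f} GB, assumed lossy {lossy:.1f} GB")
```

Uncompressed imagery alone would already be close to or above 10 GB, so the budget only holds if the achieved bytes-per-pixel is low enough to leave room for descriptors, indexes, manifests, and generated-tile sidecars.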
|
||||
### Conclusion
|
||||
Treat cache size as a hard measurement gate. The architecture should preserve the 10 GB budget but require a cache-packing benchmark before task decomposition commits to descriptor formats or chunk overlap settings.
|
||||
|
||||
### Confidence
|
||||
Medium-high.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 4: Relative Motion and cuVSLAM
|
||||
|
||||
### Fact Confirmation
|
||||
NVIDIA describes cuVSLAM as stereo-visual-inertial SLAM/odometry, with IMU-only degraded tracking suitable only for short intervals around one second (Fact #20). The project has one fixed downward navigation camera for v1.
|
||||
|
||||
### Reference Comparison
|
||||
cuVSLAM is a strong Jetson stack, but the selected v1 camera geometry does not match its documented primary input assumptions. A custom planar VO/IMU module can exploit nadir imagery, flat terrain, camera intrinsics, altitude, and FC attitude directly.
|
||||
|
||||
### Conclusion
|
||||
Keep custom planar VO/IMU as the lead. Keep cuVSLAM rejected for v1 product use, but preserve it as a benchmark/reference if the hardware changes to stereo or if NVIDIA documents an exact monocular deployment path matching the project.
|
||||
|
||||
### Confidence
|
||||
High.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 5: Validation Data
|
||||
|
||||
### Fact Confirmation
|
||||
AerialVL and UAV-VisLoc provide useful public aerial localization data, but they only partially match the fixed-wing, ArduPilot, high-rate IMU, camera-timing, and Ukraine steppe deployment context (Facts #22, #23).
|
||||
|
||||
### Reference Comparison
|
||||
Public datasets can validate VPR/cross-view ideas and regression-test retrieval. They cannot prove ESKF covariance, MAVLink timing, companion reboot, or false-position budgets without representative IMU and FC traces.
|
||||
|
||||
### Conclusion
|
||||
Use public datasets for early VPR/local-matcher benchmarking, then require ArduPilot SITL-generated IMU traces and at least one real FC/camera timing capture before final acceptance.
|
||||
|
||||
### Confidence
|
||||
High.
|
||||
|
||||
---
|
||||
|
||||
## Dimension 6: ArduPilot Output
|
||||
|
||||
### Fact Confirmation
|
||||
ArduPilot documents MAVLink GPS input with `GPS1_TYPE=14` (Fact #1), MAVLink defines `GPS_INPUT` as raw GPS sensor input rather than the global position estimate (Fact #2), and external-nav/GPS source-fusion issues are version-specific (Fact #3).
|
||||
|
||||
### Reference Comparison
|
||||
`ODOMETRY` is semantically richer but increases EKF source-interaction risk. v1 `GPS_INPUT` only is narrower and forces honest accuracy fields, but it matches the "GPS substitute" framing and avoids dual-source overlap.
|
||||
|
||||
### Conclusion
|
||||
Keep v1 `GPS_INPUT` only. Add a v1.1 research/testing backlog item for `ODOMETRY`, gated by exact ArduPilot release, params, and SITL proof.
|
||||
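As a sketch of what "honest accuracy fields" could look like, the helper below assembles values keyed by the MAVLink `GPS_INPUT` field names. The function name, the mapping from estimator sigmas to accuracy fields, the choice to flag HDOP/VDOP as ignored, and the zero placeholders for GPS week time and satellite count are all assumptions for illustration; the actual field policy has to be validated against the pinned ArduPilot release in SITL.

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]

# GPS_INPUT_IGNORE_FLAGS bits from the MAVLink common message set.
IGNORE_HDOP = 2
IGNORE_VDOP = 4


def build_gps_input_fields(
    time_usec: int,
    lat_deg: float,
    lon_deg: float,
    alt_msl_m: float,
    vel_ned_mps: Tuple[float, float, float],
    sigma_h_m: float,
    sigma_v_m: float,
    sigma_speed_mps: float,
    estimator_valid: bool,
) -> Dict[str, Number]:
    """Map an estimator state to GPS_INPUT field values (hypothetical policy)."""
    vn, ve, vd = vel_ned_mps
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        "ignore_flags": IGNORE_HDOP | IGNORE_VDOP,  # assumption: no DOP is reported
        "time_week_ms": 0,                          # placeholder, not a real GPS time
        "time_week": 0,                             # placeholder, not a real GPS time
        "fix_type": 3 if estimator_valid else 0,    # fail closed to "no fix"
        "lat": int(round(lat_deg * 1e7)),           # degE7
        "lon": int(round(lon_deg * 1e7)),           # degE7
        "alt": float(alt_msl_m),                    # metres MSL
        "hdop": 0.0,
        "vdop": 0.0,
        "vn": float(vn),
        "ve": float(ve),
        "vd": float(vd),
        "speed_accuracy": float(sigma_speed_mps),
        "horiz_accuracy": float(sigma_h_m),         # taken from ESKF covariance
        "vert_accuracy": float(sigma_v_m),
        "satellites_visible": 0,                    # placeholder; no satellites are tracked
    }
```

Keeping field construction separate from the pymavlink send call makes the fail-closed rule unit-testable without a MAVLink link, and keeps a single place where the accuracy fields are derived from the estimator.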
|
||||
### Confidence
|
||||
High.
|
||||
@@ -1,58 +0,0 @@
|
||||
# Validation Log
|
||||
|
||||
## Validation Scenario
|
||||
|
||||
Mode B validates the revised architecture against five weak-point scenarios:
|
||||
|
||||
1. Package a representative 400 km² cache slice at 0.3-0.5 m/px with COG exchange files, onboard SQLite/MBTiles-like tiles, manifests, overviews, VPR descriptors, and sidecars.
|
||||
2. Build a VPR index for 600-800 m chunks with 40-50% overlap, using raw AnyLoc/DINOv2-VLAD descriptors and at least one compressed descriptor variant.
|
||||
3. Benchmark local verification on the provided 60-frame sequence plus AerialVL/UAV-VisLoc-style public data using ALIKED + LightGlue, OpenCV SIFT/AKAZE, and DeDoDe by default.
|
||||
4. Replay a GPS-denied flight in ArduPilot SITL with synthetic IMU and camera timestamps, then repeat with real FC/camera logs once available.
|
||||
5. Stress the scheduler with synthetic heavy VPR/local-matcher events and verify stale frames are dropped rather than queued.
|
||||
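For scenario 2, the single-scale chunk count follows from chunk size and overlap. The sketch below assumes, for illustration only, that the 400 km² area is a 20 km × 20 km square tiled on a regular grid; real operating areas and multi-scale variants will change the totals.

```python
def chunks_per_axis(side_m: int, chunk_m: int, overlap_pct: int) -> int:
    """Number of chunk positions along one axis of a regular overlapping grid."""
    stride_m = chunk_m * (100 - overlap_pct) // 100
    # Ceiling division on integers so the last chunk still reaches the far edge.
    return -(-(side_m - chunk_m) // stride_m) + 1


def chunk_count(side_m: int, chunk_m: int, overlap_pct: int) -> int:
    """Total chunks for a square area (assumed shape)."""
    return chunks_per_axis(side_m, chunk_m, overlap_pct) ** 2


SIDE_M = 20_000  # assumed square footprint for 400 km^2
for chunk_m in (600, 800):
    for overlap_pct in (40, 50):
        n = chunk_count(SIDE_M, chunk_m, overlap_pct)
        print(f"{chunk_m} m chunks, {overlap_pct}% overlap: {n} chunks")
```

Under these assumptions a single scale yields between roughly 1,700 and 4,400 chunks, so adding further scales or variants is what pushes the index toward the larger counts used in the footprint estimate.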
|
||||
## Expected Based on Conclusions
|
||||
|
||||
- Cache package stays under 10 GB only if representative compression, overviews, descriptor compression, and metadata are measured together.
|
||||
- VPR remains event-triggered and can load/query within the 8 GB memory envelope after descriptor compression.
|
||||
- Official Magic Leap SuperPoint weights are absent from product builds unless commercial licensing is obtained.
|
||||
- ALIKED + LightGlue, OpenCV SIFT/AKAZE, and DeDoDe are benchmarked as distinct local-matcher candidates instead of hiding behind a generic "license-cleared extractor" label.
|
||||
- The camera queue remains bounded; frame drops are counted; emitted `GPS_INPUT` messages reference fresh estimator timestamps.
|
||||
- Local matcher acceptance is based on measured geodetic error, inlier quality, covariance consistency, and false-positive rejection, not only pixel MRE.
|
||||
- Public datasets are used for early retrieval/matching proof; final fusion and MAVLink acceptance uses SITL or real FC IMU/camera timing traces.
|
||||
- v1 emits `GPS_INPUT` only; no `ODOMETRY` message appears on the wire.
|
||||
|
||||
## Actual Validation Results
|
||||
|
||||
Not executed in research phase. This log defines the validation plan for implementation and test decomposition.
|
||||
|
||||
## Counterexamples
|
||||
|
||||
- A technically accurate SuperPoint benchmark is not product-usable if it depends on noncommercial weights.
|
||||
- A VPR demo can fit latency while still violating memory/cache budgets if uncompressed 49,152-dimensional descriptors are used at all chunk scales.
|
||||
- A 400 km² imagery package can fit compressed raster storage but fail once descriptors, indexes, generated tiles, and metadata are counted.
|
||||
- Public datasets can pass retrieval metrics but still miss FC timing, IMU covariance, thermal, and MAVLink source behavior.
|
||||
- A FIFO image queue can meet throughput on average while still violating the p95 freshness requirement under bursty VPR/matcher load.
|
||||
|
||||
## Review Checklist
|
||||
|
||||
- [x] Draft conclusions consistent with fact cards.
|
||||
- [x] No important Mode B weak-point dimensions missed for v1 architecture.
|
||||
- [x] No over-extrapolation from urban-only cross-view datasets.
|
||||
- [x] Selected components checked against the Project Constraint Matrix.
|
||||
- [x] Mismatches recorded as disqualifiers or gates.
|
||||
- [x] Noncommercial SuperPoint weights rejected for product v1 unless licensed.
|
||||
- [x] Concrete local-matcher candidates named for planning.
|
||||
- [x] Real-time scheduler/drop policy added as an architectural component.
|
||||
- [ ] Cache-packing benchmark still required with Suite Satellite Service sample imagery.
|
||||
- [ ] VPR descriptor compression benchmark still required on Jetson Orin Nano Super.
|
||||
- [ ] Real IMU/FC logs still required for final validation.
|
||||
|
||||
## Conclusions Requiring Revision
|
||||
|
||||
- Replace direct SuperPoint dependency language with named license-cleared matcher candidates; official Magic Leap weights are rejected for product v1 unless separately licensed.
|
||||
- Replace generic "license-cleared extractor" with named candidates: ALIKED + LightGlue, OpenCV SIFT/AKAZE, and DeDoDe fallback.
|
||||
- Add a bounded latest-frame scheduler and scheduler stress tests to the plan.
|
||||
- Add an explicit VPR descriptor compression/index-size gate before implementation freeze.
|
||||
- Add an explicit cache-packing benchmark before accepting the 10 GB persistent-cache budget.
|
||||
- If CPU FAISS/HNSW retrieval exceeds latency, benchmark source-built GPU FAISS or a smaller descriptor/index design.
|
||||
- If the selected lens cannot achieve 10-20 cm/px GSD at <=1 km AGL, recalibrate footprint, VPR chunk size, and matcher scale assumptions.
|
||||
@@ -1,23 +0,0 @@
|
||||
# Component Fit Matrix
|
||||
|
||||
| Candidate | Intended Role | Project Constraints Checked | Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|
||||
|-----------|---------------|-----------------------------|----------|----------------------------|--------|--------------------|
|
||||
| Bounded latest-frame scheduler | Runtime flow control and deadline ownership | 3 Hz input, <400 ms p95 latency, <=10% frame drop allowance, no batching | Fact #27 | Must account for dropped frames and preserve timestamp correctness | Selected | Prevents stale FIFO backlog and makes AC-4.1/AC-4.4 implementable |
|
||||
| Custom planar VO/IMU module | Relative motion between satellite anchors | Fixed nadir monocular camera, flat-terrain assumption, FC attitude/altitude, 400 ms p95, no raw frame storage | Facts #4, #9, #20 | Requires calibration, time sync, covariance model, and Jetson benchmark | Selected | Best exact fit for v1 sensor geometry without stereo/GPL dependency |
|
||||
| NVIDIA cuVSLAM | Relative VO/VIO | Jetson support, IMU fallback, visual odometry | Facts #7, #20 | Official docs describe stereo-visual-inertial assumptions; IMU-only degraded tracking is short-duration | Rejected | Good Jetson stack but wrong v1 camera/input assumptions |
|
||||
| ORB-SLAM3 | Mono-inertial SLAM baseline | Monocular+IMU support | Fact #8 | GPLv3 product risk; generic SLAM assumptions not tailored to nadir fixed-wing | Experimental only | Useful offline benchmark, not selected product dependency |
|
||||
| VINS-Fusion | Mono-inertial estimator baseline | Monocular+IMU support | Fact #8 | GPL-3.0 product risk; ROS/research integration burden | Experimental only | Useful reference, not selected product dependency |
|
||||
| AnyLoc/DINOv2-VLAD style descriptors | Coarse VPR over precomputed chunks | Offline descriptor generation, event-triggered online query, sparse terrain, cache cap | Facts #5, #6, #10, #15, #19 | Raw 49,152-dimensional descriptors can violate memory/cache budgets unless compressed | Selected with compression gate | Strong fit as conditional top-K retrieval layer only after descriptor/index-size proof |
|
||||
| FAISS CPU/HNSW/Flat | VPR vector retrieval | Jetson ARM64, local cache, bounded chunk index | Fact #15 | GPU packages not assumed; source-built GPU optional only | Selected | Conservative v1 baseline |
|
||||
| FAISS GPU/cuVS | Accelerated VPR retrieval | CUDA Jetson, lower latency | Fact #15 | ARM64 GPU support requires source build and validation | Experimental only | Optimization path, not v1 assumption |
|
||||
| Official Magic Leap SuperPoint pretrained weights | Local feature extraction | Cross-view candidate refinement | Fact #17 | Noncommercial research-only license blocks product use without separate commercial permission | Rejected | Do not select as v1 product dependency by default |
|
||||
| ALIKED + LightGlue + RANSAC | Learned local satellite/UAV geometric match | Cross-view candidate refinement, inlier/covariance gate, license separation | Facts #18, #24 | Jetson speed and sparse-steppe accuracy still need benchmark | Selected candidate | Concrete license-cleared learned-feature path for v1 planning |
|
||||
| SIFT/AKAZE + classical matching fallback | License-safe local matching fallback | No special model weights, CPU/GPU fallback | Facts #10, #26 | May be weaker on cross-domain imagery and sparse fields | Selected as fallback | Keeps implementation unblocked and provides a legal baseline |
|
||||
| DeDoDe | Learned local feature fallback | MIT-licensed model family and deployment ports | Fact #25 | Model size/runtime and DINOv2-related variants need validation | Experimental only | Useful fallback if ALIKED/SIFT miss accuracy or robustness targets |
|
||||
| ESKF in local NED/ENU | State fusion and covariance owner | IMU propagation, VO, satellite anchors, false-position budget, GPS_INPUT accuracy | Facts #1, #2, #9, #10 | Requires calibration and Monte Carlo validation | Selected | Necessary to satisfy confidence and safety ACs |
|
||||
| pymavlink `GPS_INPUT` emitter | Primary FC output | ArduPilot-only v1, GPS substitute framing, WGS84 output | Facts #1, #2, #3 | Must configure EKF/GPS params and validate in SITL | Selected | Safest v1 output path |
|
||||
| MAVLink `ODOMETRY` auxiliary | Rich covariance/yaw external nav | ArduPilot EKF3 external nav | Fact #3 | Source-fusion/version risk; can double-fuse if misconfigured | Needs user decision for v1.1 | Deferred until SITL confirms exact release behavior |
|
||||
| FastAPI local API | Health/session/object localization API | Python CV stack, OpenAPI docs, local-only service | Fact #14 | Must stay out of hot frame path | Selected | Satisfies API documentation and integration needs |
|
||||
| SQLite/MBTiles-like package | Onboard tile lookup and metadata | Offline cache, random lookup, 10 GB cap, freshness metadata | Facts #13, #16, #21 | Must handle corruption, sidecar schema, and measured descriptor/index budget | Selected with storage gate | Good embedded local cache shape, but the 10 GB budget needs a packing benchmark |
|
||||
| COG/GeoTIFF exchange | Satellite Service import/export | Geospatial processing, provider imagery, tile write-back ingestion | Facts #16, #21 | Not selected as sole hot lookup format; compression ratio must be measured | Selected | Good boundary format with Satellite Service |
|
||||
| AerialVL / UAV-VisLoc adapters | Early VPR and cross-view validation data | Public aerial localization data with satellite references | Facts #22, #23 | Partial match only; not enough for FC IMU/MAVLink acceptance | Selected for early validation | Useful benchmark inputs, not final acceptance evidence |
|
||||
@@ -1,44 +0,0 @@
|
||||
# Security Analysis
|
||||
|
||||
## Threat Model
|
||||
|
||||
| Asset | Threat Actors | Attack Vectors | Impact |
|
||||
|-------|---------------|----------------|--------|
|
||||
| Flight-controller position input | GPS spoofer, compromised companion process, malicious ground operator | False `GPS_INPUT`, EKF source misconfiguration, replayed MAVLink packets | Aircraft navigates to wrong location or leaves route/geofence |
|
||||
| Satellite cache | Compromised cache sync source, stale imagery, physical access attacker | Tile replacement, stale metadata, manifest tampering | False satellite anchors or cache poisoning |
|
||||
| Onboard tile write-back | Bad EKF state, compromised companion, service ingestion bug | Misaligned generated tiles promoted into shared basemap | Cross-flight error propagation |
|
||||
| Local API | Unauthorized network client, operator laptop malware | Object localization abuse, health/session tampering, denial of service | Data leakage or service interruption |
|
||||
| FDR logs | Physical capture, insider misuse | Extraction of route, imagery thumbnails, telemetry | Operational intelligence exposure |
|
||||
| Model/runtime artifacts | Supply-chain attacker, license-incompatible artifact source | Modified TensorRT engines, malicious Python packages, poisoned descriptors, noncommercial model weights in product build | Silent false outputs, code execution, or product license violation |
|
||||
|
||||
## Per-Component Security Requirements and Controls
|
||||
|
||||
| Component | Risk Level | Controls |
|
||||
|-----------|------------|----------|
|
||||
| Frame ingest and calibration | Medium | Store signed calibration profiles; reject unexpected resolution/intrinsics; log camera timestamp drift; never persist raw frames except allowed failure thumbnails. |
|
||||
| Satellite cache | High | Signed manifests, checksums per package, capture-date metadata, source identity, freshness gates, immutable trusted service-source tiles, local cache verification at startup. |
|
||||
| VPR and local matching | High | Treat retrieval as untrusted candidate only; require geometric verification, inlier thresholds, freshness checks, covariance consistency, and ESKF innovation gates. |
|
||||
| ESKF/state estimator | High | Conservative covariance floors, Mahalanobis gates, source-label transitions, false-position event logging, fail-closed to degraded fix_type when uncertainty is high. |
|
||||
| MAVLink output | High | Pin ArduPilot parameters; emit `GPS_INPUT` from one process; validate rate and sequence; no v1 `ODOMETRY`; send `fix_type=0` or degraded accuracy when estimator is invalid. |
|
||||
| Local API | Medium | Bind to localhost by default; require JWT/API key for network exposure; validate pixel bounds and request schema; rate-limit commands. |
|
||||
| FDR | Medium-High | Segment files with checksums, rollover logs, no raw frame archive, encrypt or protect storage when mission secrecy requires it. |
|
||||
| Tile write-back | High | Only write candidate tiles when parent pose covariance passes strict threshold; sidecar stores parent pose, covariance, source ancestry, and quality score; Suite Service requires multi-flight voting before trusted promotion. |
|
||||
| Dependency/runtime | Medium | Pin package versions, build TensorRT engines at install time, verify model checksums, run dependency vulnerability scanning in CI, and block noncommercial model weights such as official Magic Leap SuperPoint unless separately licensed. |
|
||||
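The Mahalanobis gate listed for the ESKF can be illustrated for a 2D horizontal position fix. This is a minimal sketch under stated assumptions: the function names are hypothetical, the innovation covariance `S` is taken as given (in an ESKF it would be H P Hᵀ + R), and the 9.21 threshold is the 99% chi-square quantile for two degrees of freedom rather than a calibrated project value.

```python
from typing import Sequence, Tuple

CHI2_2DOF_99 = 9.21  # 99% chi-square quantile, 2 degrees of freedom


def mahalanobis_sq_2d(innovation: Tuple[float, float],
                      S: Sequence[Sequence[float]]) -> float:
    """Squared Mahalanobis distance of a 2D innovation under covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    x, y = innovation
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (x * (d * x - b * y) + y * (-c * x + a * y)) / det


def accept_fix(innovation: Tuple[float, float],
               S: Sequence[Sequence[float]],
               threshold: float = CHI2_2DOF_99) -> bool:
    """Accept a satellite-anchor fix only if it is statistically consistent."""
    return mahalanobis_sq_2d(innovation, S) <= threshold
```

With a 5 m one-sigma innovation uncertainty on each axis, a 10 m residual passes (distance squared 4) while a 20 m residual is rejected (distance squared 16). A rejected fix would then feed the fail-closed path instead of the state update.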
|
||||
## Security Controls Summary
|
||||
|
||||
1. **Trust boundary**: The onboard system trusts only signed Satellite Service cache packages and live FC telemetry from the configured MAVLink link.
|
||||
2. **No direct provider calls**: Commercial provider credentials never live on the aircraft; the onboard system consumes only prebuilt cache artifacts.
|
||||
3. **Fail closed**: Match failures, stale tiles, bad covariance, or state-estimator inconsistency downgrade the source label and `GPS_INPUT` accuracy/fix state.
|
||||
4. **No dual-source v1 fusion**: `ODOMETRY` is intentionally disabled in v1 to avoid EKF source ambiguity.
|
||||
5. **Cache poisoning defense**: Generated tiles remain candidate/soft trust until covariance gates and Satellite Service voting promote them.
|
||||
6. **Local-first API**: The API is not part of the hot path and is local-only unless explicitly configured with authentication.
|
||||
7. **Forensics without raw-frame hoarding**: FDR captures enough to replay decisions while respecting the no-raw-photo restriction.
|
||||
|
||||
## Open Security Work
|
||||
|
||||
- Define the cache manifest schema and signing mechanism.
|
||||
- Pin ArduPilot version and parameter set in deployment docs.
|
||||
- Decide whether onboard FDR encryption is mandatory for the operating environment.
|
||||
- Select and scan final model weights and TensorRT engine build pipeline; confirm ALIKED/SIFT/DeDoDe artifact licenses before product packaging.
|
||||
- Add CI checks for dependency vulnerabilities and generated OpenAPI schema drift.
|
||||
@@ -1,198 +0,0 @@
|
||||
# Solution Draft
|
||||
|
||||
## Assessment Findings
|
||||
|
||||
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|
||||
|------------------------|----------------------------------------------|--------------|
|
||||
| "License-cleared extractor" in the local matcher | Too abstract for planning; tasks need concrete candidates and legal baselines. | Use ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback. |
|
||||
| Image pipeline without an explicit scheduler | A FIFO frame queue can violate <400 ms p95 latency even when individual stages are fast, because frames arrive every ~333 ms at 3 Hz. | Add a bounded latest-frame scheduler: camera queue size 1, explicit frame-drop accounting, deadline-aware VPR/matching, and timestamp-correct `GPS_INPUT`. |
|
||||
| SuperPoint + LightGlue-style local matching | Official Magic Leap SuperPoint pretrained weights are noncommercial research-only. | Reject official SuperPoint weights for product v1 unless a commercial license is obtained. |
|
||||
| AnyLoc/DINOv2-VLAD VPR chunks | Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. | Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze. |
|
||||
| 10 GB persistent satellite cache | 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. | Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery. |
|
||||
| Public datasets for validation | AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. | Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance. |
|
||||
| Visual blackout during GPS spoofing | Clouds/whiteout can remove all visual signal exactly when real GPS cannot be trusted. | Add a Visual Blackout / IMU-Only Degraded Mode that rejects spoofed GPS, propagates solely from trusted prior state + FC IMU, grows covariance, and fails closed when uncertainty/duration exceeds the safety budget. |
|
||||
| `GPS_INPUT` + `ODOMETRY` hybrid | Richer `ODOMETRY` semantics are attractive, but source-fusion behavior is version-sensitive. | v1 emits `GPS_INPUT` only. `ODOMETRY` remains a v1.1 item gated by exact ArduPilot release and SITL proof. |
|
||||
|
||||
## Product Solution Description
|
||||
|
||||
Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible `GPS_INPUT` messages with calibrated confidence.
|
||||
|
||||
```text
|
||||
Nav camera + FC IMU/attitude/altitude
|
||||
-> bounded latest-frame scheduler + timestamp sync
|
||||
-> calibration + frame normalization
|
||||
-> visual-health / blackout classifier
|
||||
-> planar VO/IMU relative motion
|
||||
-> conditional compressed-descriptor VPR over preloaded satellite chunks
|
||||
-> ALIKED/LightGlue or SIFT/AKAZE local geometric verification
|
||||
-> ESKF state + covariance + source label + IMU-only degraded mode
|
||||
-> pymavlink GPS_INPUT + local API + FDR
|
||||
```
|
||||
|
||||
The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization. The scheduler owns frame freshness: if processing pressure rises, it drops stale frames instead of letting a FIFO backlog delay flight-controller output. If the camera is fully occluded while GPS is spoofed or denied, the estimator switches to `{dead_reckoned}` and propagates solely from the last trusted state plus flight-controller IMU/attitude/airspeed/altitude until a trusted visual/satellite anchor recovers or the fail threshold is reached.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Component: Real-Time Frame Scheduler
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Bounded latest-frame scheduler | Python/C++ worker loop, monotonic timestamps, metrics counters | Makes latency/drop behavior explicit and prevents stale FIFO backlog | Requires careful timestamp ownership across camera, IMU, VO, and MAVLink output | Camera queue size 1, drop accounting, deadline-aware VPR/matching, IMU propagation between image fixes | Logs every drop and stale-frame rejection to FDR | Supports <400 ms p95 and <=10% frame drops without batching | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: 3 Hz camera input, AC-4.1 latency/drop budget, AC-4.4 no batching, GPS_INPUT timestamp correctness.
|
||||
- Evidence: Fact #27.
|
||||
- Disqualifiers: unbounded FIFO image queues are rejected.
|
||||
|
||||
### Component: Frame Ingest, Calibration, and Time Sync
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment | OpenCV, GStreamer, NumPy, calibration YAML | Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned | Driver and hardware timestamp behavior are module-specific | Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream | Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence | 3 Hz full-res ingest; hot path may downsample/ROI | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
|
||||
- Evidence: Facts #4, #9, #20.
|
||||
- Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.
|
||||
|
||||
### Component: Relative Motion Estimation
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Custom planar VO/IMU module | OpenCV, Eigen/SciPy, optional C++ hot path | Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude | Needs calibration, rolling-shutter assessment, and covariance model | Camera intrinsics, altitude, FC attitude/IMU, frame timestamps | Reject low-inlier/high-innovation updates | Must stay within steady-state deadline after downsampling/ROI | Selected |
|
||||
| NVIDIA cuVSLAM | Isaac ROS/cuVSLAM | Strong Jetson ecosystem and IMU fallback | Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration | Stereo or documented exact monocular path | ROS 2 surface area | Good Jetson acceleration, wrong v1 input fit | Rejected for v1 |
|
||||
| ORB-SLAM3 / VINS-Fusion | Research SLAM/VIO stacks | Mono-IMU capability | GPL-family licensing and product integration risk | Legal approval, ROS/C++ integration | Larger dependency surface | Benchmark/offline only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
|
||||
- Evidence: Facts #7, #8, #20.
|
||||
- Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.
|
||||
|
||||
### Component: Visual Blackout and IMU-Only Degraded Mode
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Visual-health classifier + ESKF IMU-only propagation mode | OpenCV image statistics, exposure/texture checks, FC IMU/attitude/airspeed/altitude, ESKF covariance floors | Handles clouds/whiteout/no-texture frames without trusting spoofed GPS or stale visual fixes | IMU-only position error grows quickly and cannot meet normal accuracy ACs for long gaps | Last trusted visual/satellite anchor, synchronized IMU, GPS health/spoofing signal, covariance thresholds | Spoofed GPS is rejected as an estimator input; every degraded estimate is logged and labeled `{dead_reckoned}` | Transition to blackout mode ≤400 ms; continue IMU-only up to 30 s or until covariance fail threshold | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-1.4, AC-3.5, AC-5.2, AC-NEW-2, AC-NEW-4, AC-NEW-8.
|
||||
- Evidence: visual blackout/spoofing degraded-mode requirement in `acceptance_criteria.md`.
|
||||
- Disqualifiers: IMU-only mode cannot be treated as normal navigation; it is degraded/failsafe behavior with honest covariance growth.
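
The degraded-mode rule in the table (IMU-only for up to 30 s or until a covariance fail threshold) can be sketched as a small status function. This is only an illustration under stated assumptions: the `SIGMA_FAIL_M` value, the `NORMAL` label, the `BlackoutState` type, and the linear growth model are hypothetical placeholders, not values from the acceptance criteria, and a real ESKF would propagate the full covariance rather than a single sigma.

```python
from dataclasses import dataclass

MAX_IMU_ONLY_S = 30.0   # from the table above: IMU-only for up to 30 s
SIGMA_FAIL_M = 150.0    # hypothetical covariance fail threshold (horizontal 1-sigma, metres)


@dataclass
class BlackoutState:
    seconds_since_anchor: float  # time since the last trusted visual/satellite anchor
    sigma_h_m: float             # current horizontal 1-sigma reported by the estimator


def blackout_status(visual_healthy: bool, state: BlackoutState) -> str:
    """Return the status label for the current estimate."""
    if visual_healthy:
        return "NORMAL"  # hypothetical label for the non-degraded path
    if state.seconds_since_anchor > MAX_IMU_ONLY_S or state.sigma_h_m > SIGMA_FAIL_M:
        return "VISUAL_BLACKOUT_FAILSAFE"
    return "VISUAL_BLACKOUT_IMU_ONLY"


def propagate_sigma(sigma_h_m: float, dt_s: float, growth_m_per_s: float = 3.0) -> float:
    """Placeholder growth model: uncertainty only increases while no fix arrives."""
    return sigma_h_m + growth_m_per_s * dt_s
```

Keeping the duration and covariance checks in one pure function makes the failsafe transition easy to replay deterministically in tests.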
|
||||
|
||||
### Component: Satellite Cache and Preprocessing
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars | GDAL/Rasterio, SQLite, local manifest schema | Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup | The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together | 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile | Signed manifests, checksums, immutable service-source tiles, stale-tile rejection | Cache-packing benchmark must include descriptors and generated-tile sidecars | Selected with storage gate |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
|
||||
- Evidence: Facts #13, #16, #21.
|
||||
- Disqualifiers: zoom level alone cannot define physical resolution or storage cost.
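
To illustrate why zoom level alone does not fix physical resolution, the sketch below uses the standard Web Mercator relation, in which ground resolution scales with the cosine of latitude. The function names and the 48° N example latitude are assumptions chosen for illustration; the raw pixel count ignores compression, overviews, descriptors, and sidecars, which is exactly why a packing benchmark is still required.

```python
import math

# Web Mercator ground resolution at the equator for zoom 0 with 256 px tiles.
EQUATOR_M_PER_PX_Z0 = 156543.03392


def ground_resolution_m_per_px(lat_deg: float, zoom: int, tile_px: int = 256) -> float:
    """Metres per pixel of a Web Mercator tile at the given latitude and zoom."""
    scale = 256 / tile_px  # larger tiles cover the same ground with more pixels
    return EQUATOR_M_PER_PX_Z0 * scale * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def raw_pixel_count(area_km2: float, m_per_px: float) -> float:
    """Uncompressed pixel count for an area at a given ground resolution."""
    return area_km2 * 1e6 / (m_per_px ** 2)


if __name__ == "__main__":
    for zoom in (17, 18, 19):
        print(zoom, round(ground_resolution_m_per_px(48.0, zoom), 3), "m/px at 48 deg N")
    print(f"{raw_pixel_count(400, 0.5):.2e} px for 400 km^2 at 0.5 m/px")
    print(f"{raw_pixel_count(400, 0.3):.2e} px for 400 km^2 at 0.3 m/px")
```

At the same zoom the pixel size differs by roughly a third between the equator and 48° N, so the manifest has to carry the explicit pixel size rather than a zoom number.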
|
||||
|
||||
### Component: Visual Place Recognition
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks | PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization | Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost | Raw 49,152-dimensional descriptors can violate memory/cache budgets | Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window | Retrieval is candidate generation only; never trusted without local verification | VPR invoked only on relocalization triggers; descriptor compression/index-size gate required | Selected with compression gate |
| FAISS GPU/cuVS | FAISS source build or cuVS | Potential lower query latency | ARM64 GPU deployment must be proven; not assumed | Jetson source build and benchmark | Same candidate-only trust model | Optimization path only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
|
||||
- Evidence: Facts #5, #6, #10, #15, #19.
|
||||
- Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.
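
The compression gate can be motivated with simple arithmetic on descriptor storage. The sketch below is an assumption-laden estimate: the chunk count of 20,000, the 512-dimensional compressed size, and the float16 storage are hypothetical illustration values, not figures from the research; only the 49,152 raw dimension comes from the table above.

```python
RAW_DIM = 49_152  # raw descriptor dimension cited in the table above


def descriptor_bytes(n_chunks: int, dim: int, bytes_per_value: int) -> int:
    """Storage for a flat descriptor matrix, excluding index and metadata overhead."""
    return n_chunks * dim * bytes_per_value


def gib(n_bytes: int) -> float:
    """Convert bytes to GiB."""
    return n_bytes / 2 ** 30


if __name__ == "__main__":
    n_chunks = 20_000  # hypothetical gallery size (multi-scale, overlapping chunks)
    raw = descriptor_bytes(n_chunks, RAW_DIM, 4)    # float32, uncompressed
    compressed = descriptor_bytes(n_chunks, 512, 2)  # hypothetical PCA to 512-D, float16
    print(f"raw: {gib(raw):.2f} GiB, compressed: {gib(compressed) * 1024:.1f} MiB")
```

Under these assumed numbers the raw gallery alone would take a large share of an 8 GB memory budget, while the compressed form is small enough that index and metadata overhead dominate instead.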
|
||||
|
||||
### Component: Local Satellite/UAV Geometric Verification
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| ALIKED + LightGlue + RANSAC | LightGlue, ALIKED, OpenCV, ONNX/TensorRT path | Concrete learned-feature candidate without the official SuperPoint license blocker | Jetson speed and sparse-steppe accuracy must be measured | Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata | Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates | Inline matcher target <=200 ms/pair | Selected candidate |
| OpenCV SIFT/AKAZE + classical matching | OpenCV | Commercial-safe legal baseline and regression target | May be weaker on cross-domain imagery and sparse fields | Same geometric verification inputs | Same verification gates | CPU/GPU baseline before learned extractor optimization | Selected baseline |
| DeDoDe | DeDoDe, ONNX/TensorRT ports | MIT-licensed learned-feature fallback | Model size, DINOv2-related variants, and Jetson runtime need validation | Model artifact approval and benchmark | Same verification gates | Fallback if ALIKED/SIFT miss robustness targets | Experimental only |
| Official Magic Leap SuperPoint pretrained weights | SuperPoint | Technically strong local features | Noncommercial research license blocks product use by default | Separate commercial license | License noncompliance risk | Not product path | Rejected for v1 |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
- Evidence: Facts #10, #17, #18, #24, #25, #26.
- Disqualifiers: official SuperPoint weights are not selected unless licensing changes.
|
||||
|
||||
### Component: State Estimator and Confidence
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| Error-state Kalman filter in local NED/ENU | NumPy/SciPy prototype, C++ Eigen if profiling requires | Owns covariance, source labels, anchor gating, and output smoothing | Requires calibration, Monte Carlo validation, and conservative covariance floors | IMU propagation, VO deltas, satellite-anchor measurements, innovation gates | Reject overconfident anchors; log every gate decision | Bounded CPU path; hot path may move to C++ only if measured | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
|
||||
- Evidence: Facts #1, #2, #9, #10.
|
||||
- Disqualifiers: direct matcher-to-GPS output is rejected.
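
An innovation gate of the kind listed in the requirements can be sketched as a Mahalanobis test on the 2D position innovation. This is a minimal sketch, assuming a 2D north/east measurement and a 99% chi-square threshold with two degrees of freedom; the function names and the choice of 99% are illustrative assumptions, not the project's tuned gate.

```python
CHI2_99_2DOF = 9.2103  # chi-square 99% quantile for 2 degrees of freedom


def mahalanobis_sq_2d(innovation, s_cov):
    """Squared Mahalanobis distance d^T S^-1 d for a 2-vector and a 2x2 covariance."""
    dx, dy = innovation
    (a, b), (c, d) = s_cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be non-degenerate")
    # Closed-form 2x2 inverse: S^-1 = 1/det * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def accept_anchor(innovation, s_cov, threshold=CHI2_99_2DOF):
    """Accept a satellite-anchor measurement only if it is statistically plausible."""
    return mahalanobis_sq_2d(innovation, s_cov) <= threshold


if __name__ == "__main__":
    s_cov = ((100.0, 0.0), (0.0, 100.0))       # 10 m 1-sigma per axis
    print(accept_anchor((10.0, 10.0), s_cov))  # small innovation
    print(accept_anchor((350.0, 0.0), s_cov))  # 350 m outlier
```

Because the matcher output only reaches the filter through this gate, a single confident-looking but wrong match cannot move the emitted position on its own.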
|
||||
|
||||
### Component: Flight Controller and Ground Station Interface
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| v1 `GPS_INPUT` emitter | pymavlink | Matches GPS-replacement framing and ArduPilot MAVLink GPS input path | Less expressive than full external-nav `ODOMETRY`; fields must be honest raw-GPS-sensor values | ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields | Validate rate, sequence, fix_type, and fail-closed behavior | 5-10 Hz output; freshest estimator timestamp only | Selected |
| `ODOMETRY` auxiliary | pymavlink | Better covariance/yaw semantics | EKF source-fusion and source-switching risk by ArduPilot version | Version-pinned SITL and source-switch tests | Avoid double-fusion | v1.1 only | Deferred |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
|
||||
- Evidence: Facts #1, #2, #3.
|
||||
- Disqualifiers: v1 emits no `ODOMETRY`.
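
A minimal sketch of how estimator output might be mapped onto `GPS_INPUT` fields is shown below. It assumes the MAVLink field conventions (`lat`/`lon` in degE7, `alt` and accuracies in metres) and reports `horiz_accuracy` as the 95% error-ellipse semi-major axis; the helper names, the dictionary shape, and the simple vertical scaling are hypothetical, and the actual send call through pymavlink is deliberately left out.

```python
import math

CHI2_95_2DOF = 5.9915  # chi-square 95% quantile for 2 degrees of freedom


def semi_major_95_m(p_nn: float, p_ne: float, p_ee: float) -> float:
    """95% error-ellipse semi-major axis from a 2x2 north/east covariance (m^2)."""
    mean = 0.5 * (p_nn + p_ee)
    diff = 0.5 * (p_nn - p_ee)
    lam_max = mean + math.sqrt(diff * diff + p_ne * p_ne)  # largest eigenvalue
    return math.sqrt(CHI2_95_2DOF * lam_max)


def gps_input_fields(lat_deg, lon_deg, alt_m, p_nn, p_ne, p_ee, sigma_v_m, fix_type=3):
    """Build a dict of GPS_INPUT field values (hypothetical helper, not a pymavlink API)."""
    return {
        "lat": int(round(lat_deg * 1e7)),  # degE7
        "lon": int(round(lon_deg * 1e7)),  # degE7
        "alt": float(alt_m),               # metres
        "fix_type": fix_type,
        "horiz_accuracy": semi_major_95_m(p_nn, p_ne, p_ee),
        "vert_accuracy": 1.96 * sigma_v_m,  # assumed 95% scaling of a 1-sigma value
    }
```

Deriving the accuracy fields from the covariance in one place keeps the emitter from reporting a tighter accuracy than the filter actually claims.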
|
||||
|
||||
### Component: Local API, Object Localization, and FDR
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| FastAPI local service + segmented FDR writer | FastAPI, Pydantic, SQLite/Parquet/JSONL segments | OpenAPI docs, health/session/object endpoints, replayable FDR | Must stay outside hot frame path | Local-first API, object pixel validation, rollover schema, no raw frame retention | Bind localhost by default; JWT/API key for network exposure; segment checksums | 1-2 Hz GCS summary; high-rate data local only | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
|
||||
- Evidence: Fact #14 and Context7 FastAPI docs.
|
||||
- Disqualifiers: API cannot block GPS_INPUT emission.
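
The segment-checksum requirement could look roughly like the sketch below, which seals a JSONL segment with SHA-256 and verifies it on replay. The record fields and function names are hypothetical; the real FDR schema, rollover policy, and any signing on top of the plain hash are not specified here.

```python
import hashlib
import hmac
import json


def seal_segment(records):
    """Serialize records as JSONL and return (payload bytes, SHA-256 hex digest)."""
    lines = (json.dumps(r, sort_keys=True, separators=(",", ":")) + "\n" for r in records)
    payload = "".join(lines).encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()


def verify_segment(payload: bytes, expected_hex: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual, expected_hex)


if __name__ == "__main__":
    # Hypothetical record layout, for illustration only.
    payload, digest = seal_segment([{"t_us": 1, "source": "dead_reckoned", "sigma_h_m": 12.5}])
    print(digest, verify_segment(payload, digest))
```

Sorting keys and fixing separators makes the serialized bytes deterministic, so the same records always produce the same digest.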
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
|
||||
- Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
|
||||
- Simulate heavy VPR/local-matching frames and assert stale frames are dropped, not queued, with <=10% drops under the defined sustained-load scenario.
|
||||
- Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
|
||||
- Simulate full visual blackout during GPS spoofing for 5 s, 15 s, and 35 s; assert switch to `{dead_reckoned}` ≤400 ms, spoofed GPS ignored, covariance grows monotonically, QGC receives `VISUAL_BLACKOUT_IMU_ONLY`, and `VISUAL_BLACKOUT_FAILSAFE` triggers at the configured duration/covariance threshold.
|
||||
- Run ArduPilot SITL with v1 parameters and assert accepted `GPS_INPUT` messages at 5-10 Hz, no `ODOMETRY` emission, correct fix_type transitions, and QGC downsampled telemetry.
|
||||
- Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
|
||||
- Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.
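
The aggregate accuracy assertion in the first test can be sketched as follows, assuming per-frame errors are computed as great-circle distances between estimated and ground-truth frame centers. The 80% under 50 m and 50% under 20 m defaults follow AC-1.1 / AC-1.2 as stated in the AC assessment; the function names are illustrative, and the 20 m fraction is a parameter because the assessment notes an unresolved 50% versus 60% discrepancy.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in metres between two WGS84 points (spherical approximation)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def fraction_within(errors_m, limit_m):
    """Fraction of frames whose error is strictly below the limit."""
    return sum(e < limit_m for e in errors_m) / len(errors_m)


def meets_accuracy_acs(errors_m, frac_50m=0.80, frac_20m=0.50):
    """True if both aggregate thresholds hold for the given per-frame errors."""
    return (fraction_within(errors_m, 50.0) >= frac_50m
            and fraction_within(errors_m, 20.0) >= frac_20m)
```

Running the same error list with `frac_20m=0.60` shows directly whether the stricter expected-results threshold would change the pass/fail outcome.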
|
||||
|
||||
### Non-Functional Tests
|
||||
|
||||
- Jetson benchmark: capture-to-`GPS_INPUT` p95 <400 ms with scheduler, VO, VPR triggers, local matcher, ESKF, API, and FDR active.
|
||||
- IMU-only propagation benchmark: verify blackout-mode CPU cost is negligible, covariance thresholds are deterministic, and `GPS_INPUT.horiz_accuracy` never under-reports the 95% covariance semi-major axis.
|
||||
- Local matcher bake-off: ALIKED + LightGlue, SIFT/AKAZE, and DeDoDe on sample/public datasets, measuring accuracy, false positives, latency, memory, and license status.
|
||||
- VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
|
||||
- Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
|
||||
- Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
|
||||
- Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
|
||||
- License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.
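
For the Jetson latency benchmark, the p95 and drop-rate checks can be expressed as one gate. The sketch below assumes a nearest-rank percentile and the limits stated in this draft (p95 under 400 ms, at most 10% dropped frames); the function names and the choice of percentile method are assumptions, and a different percentile definition would shift borderline results.

```python
def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def latency_gate(latencies_ms, dropped: int, total: int,
                 p95_limit_ms: float = 400.0, max_drop_ratio: float = 0.10) -> bool:
    """True if p95 capture-to-GPS_INPUT latency and frame-drop ratio are within limits."""
    p95 = percentile_nearest_rank(latencies_ms, 95)
    return p95 < p95_limit_ms and dropped / total <= max_drop_ratio
```

Integer arithmetic for the rank avoids floating-point rounding at exact percentile boundaries.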
|
||||
|
||||
## References
|
||||
|
||||
- ArduPilot MAVProxy GPSInput documentation: `https://ardupilot.org/mavproxy/docs/modules/GPSInput.html`
- MAVLink `GPS_INPUT` message spec: `https://mavlink.io/en/messages/common.html#GPS_INPUT`
- NVIDIA Isaac ROS cuVSLAM docs: `https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html`
- Magic Leap SuperPoint pretrained network license: `https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE`
- LightGlue repository: `https://github.com/cvg/LightGlue`
- DeDoDe repository: `https://github.com/Parskatt/DeDoDe`
- OpenCV SIFT source: `https://github.com/opencv/opencv/blob/4.x/modules/features2d/src/sift.dispatch.cpp`
- AnyLoc/DINO repository: `https://github.com/AnyLoc/DINO`
- GDAL COG driver and OGC COG standard: `https://gdal.org/en/stable/drivers/raster/cog.html`, `http://www.opengis.net/doc/is/COG/1.0`
- AerialVL dataset: `https://github.com/hmf21/AerialVL`
- UAV-VisLoc paper: `https://arxiv.org/html/2405.11936v1`
|
||||
|
||||
## Related Artifacts
|
||||
|
||||
- AC assessment: `_docs/00_research/00_ac_assessment.md`
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
- Source registry: `_docs/00_research/01_source_registry.md`
- Fact cards: `_docs/00_research/02_fact_cards.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
- Security analysis: `_docs/01_solution/security_analysis.md`
|
||||
@@ -1,153 +0,0 @@
|
||||
# Solution Draft
|
||||
|
||||
## Product Solution Description
|
||||
|
||||
Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the UAV navigation-camera frame center in WGS84, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible `GPS_INPUT` messages with calibrated confidence.
|
||||
|
||||
High-level flow:
|
||||
|
||||
```text
Nav camera + FC IMU/attitude/altitude
-> frame ingest + timestamp sync + calibration
-> planar VO/IMU relative motion
-> conditional VPR over preloaded satellite chunks
-> local satellite/UAV geometric verification
-> ESKF state + covariance + source label
-> pymavlink GPS_INPUT + local API + FDR
```
|
||||
|
||||
The architecture deliberately separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors; VPR runs only on cold start, sharp turns, disconnected segments, VO failure, or covariance growth.
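
The trigger conditions listed above can be sketched as a single decision function that returns the reason for invoking VPR, or `None` for a normal steady-state frame. All thresholds, field names, and reason strings below are hypothetical placeholders; the real values would come from calibration and the acceptance criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameContext:
    has_anchor: bool       # False until the first trusted satellite anchor (cold start)
    yaw_rate_dps: float    # FC yaw rate, degrees per second
    frame_overlap: float   # estimated overlap with the previous frame, 0..1
    vo_inliers: int        # inlier count from the last VO update
    sigma_h_m: float       # horizontal 1-sigma from the estimator, metres


def vpr_trigger(ctx: FrameContext,
                max_yaw_rate_dps: float = 20.0,
                min_overlap: float = 0.3,
                min_vo_inliers: int = 30,
                max_sigma_h_m: float = 50.0) -> Optional[str]:
    """Return the relocalization reason, or None if VO/IMU tracking should continue."""
    if not ctx.has_anchor:
        return "cold_start"
    if abs(ctx.yaw_rate_dps) > max_yaw_rate_dps:
        return "sharp_turn"
    if ctx.frame_overlap < min_overlap:
        return "disconnected_segment"
    if ctx.vo_inliers < min_vo_inliers:
        return "vo_failure"
    if ctx.sigma_h_m > max_sigma_h_m:
        return "covariance_growth"
    return None
```

Returning a reason string rather than a boolean lets the FDR record why each relocalization was attempted.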
|
||||
|
||||
## Existing/Competitor Solutions Analysis
|
||||
|
||||
| Solution Class | What It Provides | Why It Is Not Enough Alone | Role In This Draft |
|----------------|------------------|----------------------------|--------------------|
| Pure visual odometry / SLAM | Relative motion from camera frames | Drifts over long fixed-wing flights and fails at disconnected segments or low-overlap turns | Used only as relative motion input |
| Stereo VIO stacks such as cuVSLAM | Strong Jetson-optimized stereo visual-inertial odometry | v1 has one fixed downward navigation camera, not a stereo rig | Rejected for v1, reconsider if hardware changes |
| ORB-SLAM3 / VINS-Fusion | Mono-inertial research baselines | GPL-family license and generic SLAM assumptions make direct product use risky | Experimental/offline benchmark only |
| Direct UAV-to-satellite retrieval | Absolute place recognition | Top-1 retrieval is vulnerable to similar fields, stale tiles, and appearance changes | Used as top-K candidate generation only |
| Cross-view local matching | Geometric proof against satellite reference | Too expensive and false-positive-prone if run everywhere blindly | Run after VPR/prior narrowing with strict gates |
|
||||
|
||||
## Architecture
|
||||
|
||||
### Component: Frame Ingest, Calibration, and Time Sync
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| Python/C++ ingest with OpenCV/GStreamer, camera calibration files, MAVLink timestamp alignment | OpenCV, GStreamer, NumPy, calibration YAML | Simple, debuggable, works with USB/MIPI/GigE once driver selected | Camera driver and hardware timestamp details are module-specific | Locked nav camera/lens, checkerboard calibration, FC time sync | Validate input dimensions and timestamps; no raw frame persistence | Low/medium | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: fixed camera, no raw photo storage, 3 Hz frame rate, high-res frames, FC IMU.
|
||||
- Evidence: `_docs/00_research/06_component_fit_matrix.md`.
|
||||
- Disqualifiers: none, but final camera driver must be chosen during implementation planning.
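
One part of timestamp alignment is sampling the FC stream at the camera frame time. The sketch below linearly interpolates a scalar such as altitude between the two nearest FC samples and refuses to extrapolate outside the buffered range. It assumes both clocks are already expressed in the same time base; the function name is hypothetical, and angles that wrap (yaw) would need wrap-aware interpolation that is not shown.

```python
from bisect import bisect_left


def interpolate_at(t_query, times, values):
    """Linearly interpolate values (sampled at sorted times) at t_query.

    Returns None when t_query lies outside the buffered samples, so the caller
    can drop the frame instead of extrapolating FC state.
    """
    if not times or t_query < times[0] or t_query > times[-1]:
        return None
    i = bisect_left(times, t_query)
    if times[i] == t_query:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    w = (t_query - t0) / (t1 - t0)
    return values[i - 1] + w * (values[i] - values[i - 1])
```

At a 3 Hz frame rate and a much higher FC sample rate, the interpolation interval is short, so a linear model is a reasonable first approximation for slowly varying quantities.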
|
||||
|
||||
### Component: Relative Motion Estimation
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| Custom planar VO/IMU module | OpenCV, Eigen/SciPy, optional TensorRT local features | Matches nadir fixed camera and flat-terrain assumption; avoids stereo/GPL dependency | Requires careful calibration, attitude compensation, and covariance model | Camera intrinsics, altitude, FC attitude/IMU, frame timestamps | Reject low-inlier or high-innovation updates | Medium | Selected |
| cuVSLAM | Isaac ROS/cuVSLAM | Strong Jetson acceleration and IMU fallback | Stereo-visual-inertial design mismatches single nav camera | Stereo camera rig | ROS 2 surface area | Medium | Rejected for v1 |
| ORB-SLAM3 / VINS-Fusion | Research SLAM/VIO stacks | Mono-IMU capability | GPL-family licensing and product integration risk | Legal approval, ROS/C++ integration | Larger attack/dependency surface | Medium/high | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: single downward nav camera, Jetson runtime, flat-terrain assumption, VO drift AC.
- Evidence: Facts #7-#9.
- Disqualifiers: cuVSLAM requires stereo; ORB-SLAM3/VINS-Fusion licensing.
|
||||
|
||||
### Component: Satellite Cache and Preprocessing
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars | GDAL/Rasterio, SQLite, local manifest schema | Clear service boundary, offline lookup, explicit pixel-size/freshness metadata | Storage estimate depends on final provider compression | 0.5 m/px min, 0.3 m/px ideal, capture date, source, CRS, tile matrix | Reject stale/unsigned manifests; immutable trusted service-source tiles | Medium | Selected |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: offline-only, 10 GB cache cap, freshness gates, mid-flight tile write-back.
- Evidence: Facts #11-#13, #16.
- Disqualifiers: zoom level alone cannot define physical resolution.
|
||||
|
||||
### Component: Visual Place Recognition
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m VPR chunks | PyTorch/TensorRT, FAISS CPU/HNSW-flat baseline | Good cross-domain retrieval candidate; offline gallery descriptors; conditional online cost | Must benchmark on steppe/agricultural imagery; CPU index may be enough, GPU FAISS not assumed | Precomputed descriptors, top-K dynamic sizing, covariance-aware search window | Never trust retrieval without local verification | Medium | Selected |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency.
- Evidence: Facts #5, #6, #10, #15.
- Disqualifiers: per-frame VPR is rejected.
|
||||
|
||||
### Component: Local Satellite/UAV Geometric Verification
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| SuperPoint/LightGlue-style local matching + RANSAC homography + geodesic projection | TensorRT/OpenCV, fallback SIFT/AKAZE | Produces inlier count, reprojection error, and covariance evidence for `satellite_anchored` fixes | SuperPoint weights need license review; Jetson speed must be measured | Candidate tile/chunk, camera intrinsics, attitude, altitude, freshness metadata | Strict inlier, Mahalanobis, freshness, and covariance gates | Medium/high | Selected with gates |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: cross-view false-match risk, sparse terrain, <400 ms p95.
- Evidence: Fact #10, component fit matrix.
- Disqualifiers: no single match can bypass ESKF gates.
|
||||
|
||||
### Component: State Estimator and Confidence
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| Error-state Kalman filter in local NED/ENU | NumPy/SciPy or C++ Eigen core | Owns covariance, source labels, anchor gating, and output smoothing | Requires calibration and Monte Carlo validation | IMU propagation, VO deltas, satellite-anchor measurements, innovation gates | Reject overconfident anchors; log every gate decision | Medium | Selected |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
- Evidence: Facts #1, #2, #9, #10.
- Disqualifiers: direct matcher-to-GPS output is rejected.
|
||||
|
||||
### Component: Flight Controller and Ground Station Interface
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| v1 `GPS_INPUT` emitter | pymavlink | Matches GPS-replacement framing and ArduPilot `GPS1_TYPE=14` | Less expressive than full external-nav ODOMETRY | ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields | Validate outbound rates and fail closed on bad state | Low | Selected |
| ODOMETRY auxiliary | pymavlink | Better covariance/yaw semantics | EKF source-fusion risk by ArduPilot version | Version-pinned SITL and source-switch tests | Avoid double-fusion | Medium | Deferred |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
- Evidence: Facts #1-#3.
- Disqualifiers: ODOMETRY disabled for v1.
|
||||
|
||||
### Component: Local API, Object Localization, and FDR
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|-------------|--------------|----------|------|-----|
| FastAPI local service + FDR writer | FastAPI, Pydantic, SQLite/Parquet/log segments | OpenAPI docs, local health/session/object endpoints, replayable FDR | Must stay outside hot frame path | Localhost or authenticated LAN, rollover, schema versioning | JWT/API key for non-local access; no raw frame retention | Low/medium | Selected |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation.
- Evidence: Fact #14.
- Disqualifiers: API cannot block GPS_INPUT emission.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
- Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds.
- Verify no frame exceeds the maximum allowed error in `position_accuracy.csv`.
- Simulate frames 32-43 as a sharp-turn/disconnected-segment scenario and assert relocalization.
- Inject stale tiles and assert no stale match emits `satellite_anchored`.
- Run ArduPilot SITL with `GPS1_TYPE=14` and assert `GPS_INPUT` messages are accepted at configured rate.
- Reboot the companion process mid-replay and assert first valid output within AC-NEW-1 budget.
- Call object-localization API with level-flight inputs and invalid pixel coordinates.
|
||||
|
||||
### Non-Functional Tests
|
||||
- Jetson benchmark: p95 capture-to-`GPS_INPUT` latency <400 ms with VPR triggers and <=10% frame drops.
- Memory profile: peak below 8 GB with descriptors, TensorRT engines, cache index, API, and FDR active.
- Thermal soak: 25 W workload for 8 hours at upper environmental envelope without throttling.
- Monte Carlo false-position: verify AC-NEW-4 and AC-NEW-7 probability budgets over synthetic and real replay sets.
- Cache storage: validate final provider format stays within 10 GB persistent cache and 64 GB FDR cap.
- Security: verify manifest signing/checksums, stale-tile rejection, and local API authentication.
|
||||
|
||||
## References
|
||||
|
||||
See `_docs/00_research/01_source_registry.md` and `_docs/00_research/02_fact_cards.md`.
|
||||
|
||||
## Related Artifacts
|
||||
|
||||
- AC assessment: `_docs/00_research/00_ac_assessment.md`
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
- Tech stack evaluation: `_docs/01_solution/tech_stack.md` (generated after this draft)
- Security analysis: `_docs/01_solution/security_analysis.md` (generated after this draft)
|
||||
@@ -1,164 +0,0 @@
|
||||
# Solution Draft
|
||||
|
||||
## Assessment Findings
|
||||
|
||||
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|------------------------|----------------------------------------------|--------------|
| SuperPoint + LightGlue-style local matching | Official Magic Leap SuperPoint pretrained weights are noncommercial research-only, so they are not a selectable product dependency by default. | Use a local-verification abstraction: LightGlue only with a license-cleared extractor, plus SIFT/AKAZE/classical matching as the legal v1 baseline. |
| AnyLoc/DINOv2-VLAD VPR chunks | Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. | Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze. |
| 10 GB persistent satellite cache | 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. | Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery. |
| Public datasets for validation | AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. | Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance. |
| cuVSLAM rejection rationale | The first draft treated the stereo mismatch as enough; official docs also show IMU-only degraded tracking is short-duration. | Keep cuVSLAM rejected for v1 product use, but retain it as a benchmark/reference if future hardware adds stereo. |
| `GPS_INPUT` + `ODOMETRY` hybrid | Richer `ODOMETRY` semantics are attractive, but source-fusion behavior is version-sensitive. | v1 emits `GPS_INPUT` only. `ODOMETRY` remains a v1.1 item gated by exact ArduPilot release and SITL proof. |
|
||||
|
||||
## Product Solution Description
|
||||
|
||||
Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible `GPS_INPUT` messages with calibrated confidence.
|
||||
|
||||
```text
Nav camera + FC IMU/attitude/altitude
-> frame ingest + timestamp sync + calibration
-> planar VO/IMU relative motion
-> conditional compressed-descriptor VPR over preloaded satellite chunks
-> license-cleared local geometric verification
-> ESKF state + covariance + source label
-> pymavlink GPS_INPUT + local API + FDR
```
|
||||
|
||||
The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Component: Frame Ingest, Calibration, and Time Sync
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment | OpenCV, GStreamer, NumPy, calibration YAML | Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned | Driver and hardware timestamp behavior are module-specific | Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream | Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence | 3 Hz full-res ingest; processing may downsample/ROI before hot path | Selected |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
- Evidence: Facts #4, #9, #20.
- Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.
|
||||
|
||||
### Component: Relative Motion Estimation
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| Custom planar VO/IMU module | OpenCV, Eigen/SciPy, optional C++ hot path | Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude | Needs careful calibration, rolling-shutter assessment, and covariance model | Camera intrinsics, altitude, FC attitude/IMU, frame timestamps | Reject low-inlier/high-innovation updates | Must stay within steady-state budget after downsampling/ROI | Selected |
| NVIDIA cuVSLAM | Isaac ROS/cuVSLAM | Strong Jetson ecosystem and IMU fallback | Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration | Stereo or documented exact monocular path | ROS 2 surface area | Good Jetson acceleration, wrong v1 input fit | Rejected for v1 |
| ORB-SLAM3 / VINS-Fusion | Research SLAM/VIO stacks | Mono-IMU capability | GPL-family licensing and product integration risk | Legal approval, ROS/C++ integration | Larger dependency surface | Benchmark/offline only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
- Evidence: Facts #7, #8, #20.
- Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.
|
||||
|
||||
### Component: Satellite Cache and Preprocessing
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars | GDAL/Rasterio, SQLite, local manifest schema | Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup | The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together | 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile | Signed manifests, checksums, immutable service-source tiles, stale-tile rejection | Cache-packing benchmark must include descriptors and generated-tile sidecars | Selected with storage gate |
|
||||
|
||||
**Exact-fit evidence**:
- Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
- Evidence: Facts #13, #16, #21.
- Disqualifiers: zoom level alone cannot define physical resolution or storage cost.
|
||||
|
||||
### Component: Visual Place Recognition
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
| AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks | PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization | Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost | Raw 49,152-dimensional descriptors can violate memory/cache budgets | Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window | Retrieval is candidate generation only; never trusted without local verification | VPR invoked only on relocalization triggers; descriptor compression/index-size gate required | Selected with compression gate |
| FAISS GPU/cuVS | FAISS source build or cuVS | Potential lower query latency | ARM64 GPU deployment must be proven; not assumed | Jetson source build and benchmark | Same candidate-only trust model | Optimization path only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
|
||||
- Evidence: Facts #5, #6, #10, #15, #19.
|
||||
- Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.
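
A rough sizing calculation shows why the compression gate matters. The sketch below assumes a square chunk grid over 400 km²; the stride, the number of scales, and the 512-dimensional float16 compressed size are hypothetical parameters chosen only to show the arithmetic.

```python
import math


def gallery_size(area_km2, stride_m, dims, bytes_per_value, scales=1):
    """Return (descriptor_count, total_bytes) for a square-grid chunk gallery."""
    chunks_per_scale = math.ceil(area_km2 * 1e6 / stride_m ** 2)
    count = chunks_per_scale * scales
    return count, count * dims * bytes_per_value


# Raw float32 descriptors, 700 m stride, no overlap, one scale.
raw = gallery_size(400, 700, 49_152, 4)
# Hypothetical denser gallery: 350 m stride (50% overlap of 700 m chunks), 3 scales.
dense = gallery_size(400, 350, 49_152, 4, scales=3)
# Hypothetical compressed gallery: 512-d float16 after PCA/quantization.
compressed = gallery_size(400, 350, 512, 2, scales=3)

if __name__ == "__main__":
    for name, (count, size) in (("raw", raw), ("dense", dense), ("compressed", compressed)):
        print(f"{name}: {count} descriptors, {size / 1e6:.1f} MB")
```

With these assumed parameters the dense raw gallery is roughly 1.9 GB, while the compressed one is about 10 MB. Index overhead and runtime buffers are not included, so the real gate still needs a measured benchmark.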
|
||||
|
||||
### Component: Local Satellite/UAV Geometric Verification
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| License-cleared feature extractor + LightGlue/classical matching + RANSAC homography/geodesic projection | LightGlue where licensed, SIFT/AKAZE fallback, OpenCV, TensorRT if applicable | Keeps geometric proof stage without depending on noncommercial weights | Exact extractor accuracy and Jetson speed must be measured on steppe/agricultural imagery | Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata | Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates | Inline matcher target <=200 ms/pair; fallback relocalization can use longer budget | Selected |
|
||||
| Official Magic Leap SuperPoint pretrained weights | SuperPoint | Technically strong local features | Noncommercial research license blocks product use by default | Separate commercial license | License noncompliance risk | Not product path | Rejected for v1 |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
|
||||
- Evidence: Facts #10, #17, #18.
|
||||
- Disqualifiers: official SuperPoint weights are not selected unless licensing changes.
|
||||
|
||||
### Component: State Estimator and Confidence
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Error-state Kalman filter in local NED/ENU | NumPy/SciPy prototype, C++ Eigen if profiling requires | Owns covariance, source labels, anchor gating, and output smoothing | Requires calibration, Monte Carlo validation, and conservative covariance floors | IMU propagation, VO deltas, satellite-anchor measurements, innovation gates | Reject overconfident anchors; log every gate decision | Bounded CPU path; hot path may move to C++ only if measured | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
|
||||
- Evidence: Facts #1, #2, #9, #10.
|
||||
- Disqualifiers: direct matcher-to-GPS output is rejected.
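
The innovation gate that sits between the matcher and the estimator output can be sketched as a Mahalanobis test on a 2D north/east anchor. This is an illustrative sketch, not the project's estimator: the function names and the example covariances are assumptions, and the 9.21 threshold is the 99th percentile of a chi-square distribution with two degrees of freedom.

```python
CHI2_2DOF_99 = 9.21  # chi-square 99th percentile, 2 degrees of freedom


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance of a 2-vector under a 2x2 covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        # Cheap sanity check only; a full filter would verify positive definiteness.
        raise ValueError("innovation covariance must have a positive determinant")
    dx, dy = innovation
    # Uses the closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def accept_anchor(anchor_ne, predicted_ne, P, R, gate=CHI2_2DOF_99):
    """Gate a satellite anchor against the predicted position.

    P is the predicted position covariance and R the anchor measurement
    covariance, both 2x2 in metres squared (north, east).
    """
    innovation = (anchor_ne[0] - predicted_ne[0], anchor_ne[1] - predicted_ne[1])
    S = [[P[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    d2 = mahalanobis_sq_2d(innovation, S)
    return d2 <= gate, d2
```

With an assumed 10 m prediction sigma and 15 m anchor sigma, a 350 m outlier gives a squared distance far above the gate and is rejected, so the estimator keeps propagating instead of jumping to the matcher output.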
|
||||
|
||||
### Component: Flight Controller and Ground Station Interface
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| v1 `GPS_INPUT` emitter | pymavlink | Matches GPS-replacement framing and ArduPilot MAVLink GPS input path | Less expressive than full external-nav `ODOMETRY`; fields must be honest raw-GPS-sensor values | ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields | Validate rate, sequence, fix_type, and fail-closed behavior | 5-10 Hz output; never batch frame-center outputs | Selected |
|
||||
| `ODOMETRY` auxiliary | pymavlink | Better covariance/yaw semantics | EKF source-fusion and source-switching risk by ArduPilot version | Version-pinned SITL and source-switch tests | Avoid double-fusion | v1.1 only | Deferred |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
|
||||
- Evidence: Facts #1, #2, #3.
|
||||
- Disqualifiers: v1 emits no `ODOMETRY`.
|
||||
|
||||
### Component: Local API, Object Localization, and FDR
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| FastAPI local service + segmented FDR writer | FastAPI, Pydantic, SQLite/Parquet/JSONL segments | OpenAPI docs, health/session/object endpoints, replayable FDR | Must stay outside hot frame path | Local-first API, object pixel validation, rollover schema, no raw frame retention | Bind localhost by default; JWT/API key for network exposure; segment checksums | 1-2 Hz GCS summary; high-rate data local only | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
|
||||
- Evidence: Fact #14 and Context7 FastAPI docs.
|
||||
- Disqualifiers: API cannot block GPS_INPUT emission.
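
Segment checksums for the FDR writer could look like the following. This is a sketch using only the standard library; the `.sha256` sidecar naming and the function names are hypothetical, and the real writer would run outside the hot frame path.

```python
import hashlib
from pathlib import Path


def segment_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a segment file through SHA-256 without loading it whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def seal_segment(path: Path) -> Path:
    """Write a sidecar file holding the checksum of a closed segment."""
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(f"{segment_sha256(path)}  {path.name}\n")
    return sidecar


def verify_segment(path: Path) -> bool:
    """Recompute the checksum and compare it with the sidecar value."""
    sidecar = path.with_name(path.name + ".sha256")
    recorded = sidecar.read_text().split()[0]
    return recorded == segment_sha256(path)
```

Sealing happens once at rollover, so the checksum cost is paid per segment rather than per record.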
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
|
||||
- Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
|
||||
- Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
|
||||
- Run ArduPilot SITL with v1 parameters and assert accepted `GPS_INPUT` messages at 5-10 Hz, no `ODOMETRY` emission, correct fix_type transitions, and QGC downsampled telemetry.
|
||||
- Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
|
||||
- Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.
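
The level-flight object-localization case reduces to pinhole geometry over flat terrain. The sketch below is an assumption-laden illustration: it assumes a nadir camera with the image top pointing toward the aircraft nose, zero roll and pitch, and a hypothetical `Camera` container for intrinsics.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    width: int
    height: int
    fx: float  # focal length in pixels
    fy: float
    cx: float  # principal point in pixels
    cy: float


def pixel_to_north_east(cam: Camera, u: float, v: float, agl_m: float, heading_deg: float):
    """Ground offset (north, east) in metres of a pixel, relative to the nadir point."""
    if not (0 <= u < cam.width and 0 <= v < cam.height):
        raise ValueError(f"pixel ({u}, {v}) is outside the {cam.width}x{cam.height} frame")
    if agl_m <= 0.0:
        raise ValueError("altitude above ground must be positive")
    # Similar triangles: ground offset = altitude * pixel offset / focal length.
    right_m = agl_m * (u - cam.cx) / cam.fx
    forward_m = -agl_m * (v - cam.cy) / cam.fy  # image v grows toward the tail
    psi = math.radians(heading_deg)
    north = forward_m * math.cos(psi) - right_m * math.sin(psi)
    east = forward_m * math.sin(psi) + right_m * math.cos(psi)
    return north, east
```

Out-of-frame pixels raise an error instead of being clamped, which matches the test case that expects explicit input errors.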
|
||||
|
||||
### Non-Functional Tests
|
||||
|
||||
- Jetson benchmark: capture-to-`GPS_INPUT` p95 <400 ms with VO, VPR triggers, local matcher, ESKF, API, and FDR active.
|
||||
- VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
|
||||
- Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
|
||||
- Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
|
||||
- Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
|
||||
- License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.
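
Before running the cache-packing benchmark, a back-of-envelope pixel count shows why the 10 GB budget cannot be taken for granted. This sketch assumes uncompressed 8-bit RGB and a factor-2 overview pyramid (which adds about one third); the function names are illustrative.

```python
def raw_imagery_bytes(area_km2: float, gsd_m: float, bands: int = 3, bytes_per_band: int = 1) -> float:
    """Uncompressed size of imagery covering an area at a given pixel size."""
    pixels = area_km2 * 1e6 / gsd_m ** 2
    return pixels * bands * bytes_per_band


def with_overviews(base_bytes: float) -> float:
    """Add a factor-2 overview pyramid: 1/4 + 1/16 + ... converges to 1/3."""
    return base_bytes * 4.0 / 3.0


if __name__ == "__main__":
    for gsd in (0.5, 0.3):
        base = raw_imagery_bytes(400, gsd)
        print(f"{gsd} m/px: {base / 1e9:.1f} GB raw, {with_overviews(base) / 1e9:.1f} GB with overviews")
```

At 0.3 m/px the uncompressed base layer alone is about 13.3 GB, so the package only fits if compression leaves room for descriptors, indexes, and generated tiles. The benchmark is what measures the achievable ratio.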
|
||||
|
||||
## References
|
||||
|
||||
- ArduPilot MAVProxy GPSInput documentation: `https://ardupilot.org/mavproxy/docs/modules/GPSInput.html`
|
||||
- MAVLink `GPS_INPUT` message spec: `https://mavlink.io/en/messages/common.html#GPS_INPUT`
|
||||
- NVIDIA Isaac ROS cuVSLAM docs: `https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html`
|
||||
- Magic Leap SuperPoint pretrained network license: `https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE`
|
||||
- LightGlue license and extractor-license issue: `https://github.com/cvg/LightGlue/blob/main/LICENSE`, `https://github.com/cvg/LightGlue/issues/38`
|
||||
- AnyLoc/DINO repository: `https://github.com/AnyLoc/DINO`
|
||||
- GDAL COG driver and OGC COG standard: `https://gdal.org/en/stable/drivers/raster/cog.html`, `http://www.opengis.net/doc/is/COG/1.0`
|
||||
- AerialVL dataset: `https://github.com/hmf21/AerialVL`
|
||||
- UAV-VisLoc paper: `https://arxiv.org/html/2405.11936v1`
|
||||
|
||||
## Related Artifacts
|
||||
|
||||
- AC assessment: `_docs/00_research/00_ac_assessment.md`
|
||||
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
|
||||
- Source registry: `_docs/00_research/01_source_registry.md`
|
||||
- Fact cards: `_docs/00_research/02_fact_cards.md`
|
||||
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
|
||||
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
|
||||
- Security analysis: `_docs/01_solution/security_analysis.md`
|
||||
@@ -1,183 +0,0 @@
|
||||
# Solution Draft
|
||||
|
||||
## Assessment Findings
|
||||
|
||||
| Old Component Solution | Weak Point (functional/security/performance) | New Solution |
|
||||
|------------------------|----------------------------------------------|--------------|
|
||||
| "License-cleared extractor" in the local matcher | Too abstract for planning; tasks need concrete candidates and legal baselines. | Use ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback. |
|
||||
| Image pipeline without an explicit scheduler | A FIFO frame queue can violate the <400 ms p95 latency budget even when each stage is fast: at 3 Hz frames arrive every ~333 ms, so a single queued frame is already ~333 ms old when its processing starts. | Add a bounded latest-frame scheduler: camera queue size 1, explicit frame-drop accounting, deadline-aware VPR/matching, and timestamp-correct `GPS_INPUT`. |
|
||||
| SuperPoint + LightGlue-style local matching | Official Magic Leap SuperPoint pretrained weights are noncommercial research-only. | Reject official SuperPoint weights for product v1 unless a commercial license is obtained. |
|
||||
| AnyLoc/DINOv2-VLAD VPR chunks | Raw 49,152-dimensional descriptors can consume too much RAM/cache once multi-scale chunks, overlap, indexes, and metadata are included. | Keep event-triggered VPR, but add a mandatory descriptor compression/index-size gate before implementation freeze. |
|
||||
| 10 GB persistent satellite cache | 400 km² at 0.3-0.5 m/px plus overviews, manifests, VPR descriptors, and generated tiles is not proven by zoom-level math. | Keep the 10 GB target, but require a representative cache-packing benchmark using Suite Satellite Service sample imagery. |
|
||||
| Public datasets for validation | AerialVL/UAV-VisLoc are useful but do not prove FC IMU timing, covariance calibration, thermal behavior, or MAVLink source behavior. | Use public datasets for early VPR/matcher tests, then require ArduPilot SITL IMU traces and real FC/camera timing captures before final acceptance. |
|
||||
| `GPS_INPUT` + `ODOMETRY` hybrid | Richer `ODOMETRY` semantics are attractive, but source-fusion behavior is version-sensitive. | v1 emits `GPS_INPUT` only. `ODOMETRY` remains a v1.1 item gated by exact ArduPilot release and SITL proof. |
|
||||
|
||||
## Product Solution Description
|
||||
|
||||
Build an onboard GPS-denied localization service for fixed-wing UAVs. The service estimates the WGS84 coordinate of each navigation-camera frame center, localizes AI-camera detections on flat terrain, and emits ArduPilot-compatible `GPS_INPUT` messages with calibrated confidence.
|
||||
|
||||
```text
Nav camera + FC IMU/attitude/altitude
-> bounded latest-frame scheduler + timestamp sync
-> calibration + frame normalization
-> planar VO/IMU relative motion
-> conditional compressed-descriptor VPR over preloaded satellite chunks
-> ALIKED/LightGlue or SIFT/AKAZE local geometric verification
-> ESKF state + covariance + source label
-> pymavlink GPS_INPUT + local API + FDR
```
|
||||
|
||||
The architecture separates fast steady-state tracking from heavier relocalization. Normal frames use VO/IMU prediction and local map priors. VPR runs only on cold start, sharp turns, disconnected segments, VO failure, covariance growth, or operator-assisted relocalization. The scheduler owns frame freshness: if processing pressure rises, it drops stale frames instead of letting a FIFO backlog delay flight-controller output.
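
The relocalization triggers listed above can be expressed as a small predicate. This is only a sketch: the `TrackState` fields and both thresholds (yaw rate and position sigma) are hypothetical placeholders that would have to come from tuning and the acceptance criteria.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    COLD_START = "cold_start"
    SHARP_TURN = "sharp_turn"
    DISCONNECTED_SEGMENT = "disconnected_segment"
    VO_FAILURE = "vo_failure"
    COVARIANCE_GROWTH = "covariance_growth"
    OPERATOR_REQUEST = "operator_request"


@dataclass
class TrackState:
    has_anchor: bool          # at least one accepted satellite anchor so far
    yaw_rate_dps: float       # absolute yaw rate from the FC, degrees per second
    vo_ok: bool               # last VO update passed its own gates
    segment_connected: bool   # current frame overlaps the previous one
    pos_sigma_m: float        # horizontal position sigma from the estimator
    operator_request: bool = False


def vpr_triggers(s: TrackState, max_yaw_rate_dps: float = 20.0, max_sigma_m: float = 50.0):
    """Return the list of reasons to run VPR; an empty list means stay on VO/IMU."""
    reasons = []
    if not s.has_anchor:
        reasons.append(Trigger.COLD_START)
    if abs(s.yaw_rate_dps) > max_yaw_rate_dps:  # hypothetical threshold
        reasons.append(Trigger.SHARP_TURN)
    if not s.segment_connected:
        reasons.append(Trigger.DISCONNECTED_SEGMENT)
    if not s.vo_ok:
        reasons.append(Trigger.VO_FAILURE)
    if s.pos_sigma_m > max_sigma_m:  # hypothetical threshold
        reasons.append(Trigger.COVARIANCE_GROWTH)
    if s.operator_request:
        reasons.append(Trigger.OPERATOR_REQUEST)
    return reasons
```

Returning the reasons rather than a bare boolean keeps the trigger cause available for FDR logging.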
|
||||
|
||||
## Architecture
|
||||
|
||||
### Component: Real-Time Frame Scheduler
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Bounded latest-frame scheduler | Python/C++ worker loop, monotonic timestamps, metrics counters | Makes latency/drop behavior explicit and prevents stale FIFO backlog | Requires careful timestamp ownership across camera, IMU, VO, and MAVLink output | Camera queue size 1, drop accounting, deadline-aware VPR/matching, IMU propagation between image fixes | Logs every drop and stale-frame rejection to FDR | Supports <400 ms p95 and <=10% frame drops without batching | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: 3 Hz camera input, AC-4.1 latency/drop budget, AC-4.4 no batching, GPS_INPUT timestamp correctness.
|
||||
- Evidence: Fact #27.
|
||||
- Disqualifiers: unbounded FIFO image queues are rejected.
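
A queue-size-1 scheduler with drop accounting can be sketched as a single-slot mailbox. The class and field names below are illustrative, and the 0.4 s maximum age used in the usage note simply mirrors the 400 ms budget; a real implementation would also write each drop to the FDR.

```python
import threading
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Frame:
    seq: int
    t_capture: float  # monotonic capture time in seconds
    image: Any


class LatestFrameSlot:
    """Single-slot mailbox: a new frame replaces any frame not yet taken."""

    def __init__(self, max_age_s: float):
        self._lock = threading.Lock()
        self._frame: Optional[Frame] = None
        self.max_age_s = max_age_s
        self.received = 0
        self.dropped_overwritten = 0
        self.dropped_stale = 0

    def put(self, frame: Frame) -> None:
        """Called by the camera thread; never blocks on the consumer."""
        with self._lock:
            self.received += 1
            if self._frame is not None:
                self.dropped_overwritten += 1  # consumer was too slow
            self._frame = frame

    def take(self, now: float) -> Optional[Frame]:
        """Called by the processing loop; returns None if empty or stale."""
        with self._lock:
            frame, self._frame = self._frame, None
            if frame is None:
                return None
            if now - frame.t_capture > self.max_age_s:
                self.dropped_stale += 1
                return None
            return frame

    def drop_ratio(self) -> float:
        with self._lock:
            dropped = self.dropped_overwritten + self.dropped_stale
            return dropped / self.received if self.received else 0.0
```

In use, the camera callback calls `put()` and the worker calls `take(time.monotonic())`, for example with `LatestFrameSlot(max_age_s=0.4)`. Because the slot holds at most one frame, backlog can never accumulate; overload shows up as a rising `drop_ratio()` instead of rising latency.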
|
||||
|
||||
### Component: Frame Ingest, Calibration, and Time Sync
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Python/C++ ingest with OpenCV/GStreamer, camera calibration files, and MAVLink timestamp alignment | OpenCV, GStreamer, NumPy, calibration YAML | Simple, debuggable, works with USB/MIPI/GigE once the camera module is pinned | Driver and hardware timestamp behavior are module-specific | Locked nav camera/lens, checkerboard calibration, FC clock sync, altitude/attitude stream | Reject unexpected dimensions/intrinsics; signed calibration profiles; no raw-frame persistence | 3 Hz full-res ingest; hot path may downsample/ROI | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: fixed downward nav camera, no raw photo storage, high-res frames, FC IMU/attitude, 400 ms p95.
|
||||
- Evidence: Facts #4, #9, #20.
|
||||
- Disqualifiers: final v1 camera/lens and hardware timestamp behavior must be pinned before calibration tasks.
|
||||
|
||||
### Component: Relative Motion Estimation
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Custom planar VO/IMU module | OpenCV, Eigen/SciPy, optional C++ hot path | Matches nadir fixed camera, flat-terrain assumption, and FC attitude/altitude | Needs calibration, rolling-shutter assessment, and covariance model | Camera intrinsics, altitude, FC attitude/IMU, frame timestamps | Reject low-inlier/high-innovation updates | Must stay within steady-state deadline after downsampling/ROI | Selected |
|
||||
| NVIDIA cuVSLAM | Isaac ROS/cuVSLAM | Strong Jetson ecosystem and IMU fallback | Official docs emphasize stereo-visual-inertial assumptions; IMU-only fallback is short-duration | Stereo or documented exact monocular path | ROS 2 surface area | Good Jetson acceleration, wrong v1 input fit | Rejected for v1 |
|
||||
| ORB-SLAM3 / VINS-Fusion | Research SLAM/VIO stacks | Mono-IMU capability | GPL-family licensing and product integration risk | Legal approval, ROS/C++ integration | Larger dependency surface | Benchmark/offline only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: single downward nav camera, Jetson runtime, flat terrain, VO drift AC.
|
||||
- Evidence: Facts #7, #8, #20.
|
||||
- Disqualifiers: stereo-required or GPL-family stacks are not product dependencies for v1.
|
||||
|
||||
### Component: Satellite Cache and Preprocessing
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Suite Satellite Service exchange via COG/GeoTIFF, onboard SQLite/MBTiles-like package with manifests and descriptor sidecars | GDAL/Rasterio, SQLite, local manifest schema | Clear offline service boundary, explicit pixel-size/freshness metadata, fast local lookup | The 10 GB budget is unproven until representative imagery, overviews, descriptors, and sidecars are packed together | 0.5 m/px minimum, 0.3 m/px ideal, capture date, source, CRS, tile matrix, compression profile | Signed manifests, checksums, immutable service-source tiles, stale-tile rejection | Cache-packing benchmark must include descriptors and generated-tile sidecars | Selected with storage gate |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: offline-only cache, 10 GB cap, freshness gates, mid-flight tile write-back, no direct provider calls.
|
||||
- Evidence: Facts #13, #16, #21.
|
||||
- Disqualifiers: zoom level alone cannot define physical resolution or storage cost.
|
||||
|
||||
### Component: Visual Place Recognition
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| AnyLoc/DINOv2-VLAD-style descriptors over 600-800 m chunks | PyTorch/TensorRT export path, FAISS CPU/HNSW-flat baseline, PCA/quantization | Strong cross-domain retrieval family; offline gallery descriptors; event-triggered online cost | Raw 49,152-dimensional descriptors can violate memory/cache budgets | Precomputed compressed descriptors, top-K dynamic sizing, covariance-aware search window | Retrieval is candidate generation only; never trusted without local verification | VPR invoked only on relocalization triggers; descriptor compression/index-size gate required | Selected with compression gate |
|
||||
| FAISS GPU/cuVS | FAISS source build or cuVS | Potential lower query latency | ARM64 GPU deployment must be proven; not assumed | Jetson source build and benchmark | Same candidate-only trust model | Optimization path only | Experimental only |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: event-triggered VPR, active-conflict change robustness, Jetson memory/latency, 10 GB cache cap.
|
||||
- Evidence: Facts #5, #6, #10, #15, #19.
|
||||
- Disqualifiers: uncompressed descriptors and per-frame VPR are rejected.
|
||||
|
||||
### Component: Local Satellite/UAV Geometric Verification
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| ALIKED + LightGlue + RANSAC | LightGlue, ALIKED, OpenCV, ONNX/TensorRT path | Concrete learned-feature candidate without the official SuperPoint license blocker | Jetson speed and sparse-steppe accuracy must be measured | Candidate chunk/tile, camera intrinsics, attitude, altitude, freshness metadata | Strict inlier, reprojection, freshness, Mahalanobis, and covariance gates | Inline matcher target <=200 ms/pair | Selected candidate |
|
||||
| OpenCV SIFT/AKAZE + classical matching | OpenCV | Commercial-safe legal baseline and regression target | May be weaker on cross-domain imagery and sparse fields | Same geometric verification inputs | Same verification gates | CPU/GPU baseline before learned extractor optimization | Selected baseline |
|
||||
| DeDoDe | DeDoDe, ONNX/TensorRT ports | MIT-licensed learned-feature fallback | Model size, DINOv2-related variants, and Jetson runtime need validation | Model artifact approval and benchmark | Same verification gates | Fallback if ALIKED/SIFT miss robustness targets | Experimental only |
|
||||
| Official Magic Leap SuperPoint pretrained weights | SuperPoint | Technically strong local features | Noncommercial research license blocks product use by default | Separate commercial license | License noncompliance risk | Not product path | Rejected for v1 |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: product licensing, cross-view false-match risk, sparse terrain, <400 ms p95.
|
||||
- Evidence: Facts #10, #17, #18, #24, #25, #26.
|
||||
- Disqualifiers: official SuperPoint weights are not selected unless licensing changes.
|
||||
|
||||
### Component: State Estimator and Confidence
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| Error-state Kalman filter in local NED/ENU | NumPy/SciPy prototype, C++ Eigen if profiling requires | Owns covariance, source labels, anchor gating, and output smoothing | Requires calibration, Monte Carlo validation, and conservative covariance floors | IMU propagation, VO deltas, satellite-anchor measurements, innovation gates | Reject overconfident anchors; log every gate decision | Bounded CPU path; hot path may move to C++ only if measured | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-1.4, AC-NEW-4, AC-NEW-7, GPS_INPUT accuracy fields.
|
||||
- Evidence: Facts #1, #2, #9, #10.
|
||||
- Disqualifiers: direct matcher-to-GPS output is rejected.
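
Because the estimator works in a local NED/ENU frame while the output is WGS84, a conversion step is needed at the boundary. The sketch below uses a local tangent-plane approximation with the WGS84 radii of curvature; it is an assumption that this accuracy is adequate, and it only holds for offsets of a few kilometres from the origin.

```python
import math

WGS84_A = 6378137.0             # semi-major axis, metres
WGS84_E2 = 6.69437999014e-3     # first eccentricity squared


def ne_offset_to_wgs84(lat0_deg: float, lon0_deg: float, north_m: float, east_m: float):
    """Convert a small north/east offset from a local origin to latitude/longitude."""
    phi = math.radians(lat0_deg)
    s2 = math.sin(phi) ** 2
    # Meridional (north-south) and prime-vertical (east-west) radii of curvature.
    meridional_r = WGS84_A * (1.0 - WGS84_E2) / (1.0 - WGS84_E2 * s2) ** 1.5
    prime_vertical_r = WGS84_A / math.sqrt(1.0 - WGS84_E2 * s2)
    lat = lat0_deg + math.degrees(north_m / meridional_r)
    lon = lon0_deg + math.degrees(east_m / (prime_vertical_r * math.cos(phi)))
    return lat, lon
```

For longer flights the local origin would need to be re-anchored, or a full geodetic library used instead of this approximation.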
|
||||
|
||||
### Component: Flight Controller and Ground Station Interface
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| v1 `GPS_INPUT` emitter | pymavlink | Matches GPS-replacement framing and ArduPilot MAVLink GPS input path | Less expressive than full external-nav `ODOMETRY`; fields must be honest raw-GPS-sensor values | ArduPilot params, SITL tests, WGS84 conversion, h_acc/v_acc fields | Validate rate, sequence, fix_type, and fail-closed behavior | 5-10 Hz output; freshest estimator timestamp only | Selected |
|
||||
| `ODOMETRY` auxiliary | pymavlink | Better covariance/yaw semantics | EKF source-fusion and source-switching risk by ArduPilot version | Version-pinned SITL and source-switch tests | Avoid double-fusion | v1.1 only | Deferred |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: ArduPilot-only, QGC, v1 GPS_INPUT-only scope.
|
||||
- Evidence: Facts #1, #2, #3.
|
||||
- Disqualifiers: v1 emits no `ODOMETRY`.
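
The `GPS_INPUT` field scaling and a fail-closed fix type can be sketched without a MAVLink connection. The field names follow the MAVLink `GPS_INPUT` message definition; the helper name and the `max_h_acc_m` limit are hypothetical. It is assumed, and should be checked against the pinned pymavlink version, that the resulting dictionary can be passed as keyword arguments to the generated `gps_input_send` call.

```python
GPS_FIX_TYPE_NO_FIX = 1
GPS_FIX_TYPE_3D_FIX = 3


def gps_input_fields(*, time_usec, time_week, time_week_ms, lat_deg, lon_deg, alt_msl_m,
                     vel_ned_mps, h_acc_m, v_acc_m, speed_acc_mps, hdop, vdop,
                     satellites_visible, max_h_acc_m=100.0):
    """Build GPS_INPUT fields from an estimator output, failing closed on low confidence."""
    if not (-90.0 <= lat_deg <= 90.0 and -180.0 <= lon_deg <= 180.0):
        raise ValueError("latitude/longitude out of range")
    vn, ve, vd = vel_ned_mps
    # Fail closed: report "no fix" rather than a confident position when the
    # horizontal accuracy exceeds the (hypothetical) limit.
    fix_type = GPS_FIX_TYPE_3D_FIX if h_acc_m <= max_h_acc_m else GPS_FIX_TYPE_NO_FIX
    return {
        "time_usec": int(time_usec),
        "gps_id": 0,
        "ignore_flags": 0,
        "time_week_ms": int(time_week_ms),
        "time_week": int(time_week),
        "fix_type": fix_type,
        "lat": int(round(lat_deg * 1e7)),   # degrees * 1e7
        "lon": int(round(lon_deg * 1e7)),   # degrees * 1e7
        "alt": float(alt_msl_m),            # metres above mean sea level
        "hdop": float(hdop),
        "vdop": float(vdop),
        "vn": float(vn),
        "ve": float(ve),
        "vd": float(vd),
        "speed_accuracy": float(speed_acc_mps),
        "horiz_accuracy": float(h_acc_m),
        "vert_accuracy": float(v_acc_m),
        "satellites_visible": int(satellites_visible),
    }
```

Keeping this as a pure function lets the SITL tests assert on field values and fix_type transitions before any message is sent.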
|
||||
|
||||
### Component: Local API, Object Localization, and FDR
|
||||
|
||||
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|
||||
|----------|-------|------------|-------------|--------------|----------|-------------|-----|
|
||||
| FastAPI local service + segmented FDR writer | FastAPI, Pydantic, SQLite/Parquet/JSONL segments | OpenAPI docs, health/session/object endpoints, replayable FDR | Must stay outside hot frame path | Local-first API, object pixel validation, rollover schema, no raw frame retention | Bind localhost by default; JWT/API key for network exposure; segment checksums | 1-2 Hz GCS summary; high-rate data local only | Selected |
|
||||
|
||||
**Exact-fit evidence**:
|
||||
- Project constraints checked: AC-6, AC-7, AC-NEW-3, OpenAPI documentation, no raw frame storage.
|
||||
- Evidence: Fact #14 and Context7 FastAPI docs.
|
||||
- Disqualifiers: API cannot block GPS_INPUT emission.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Integration / Functional Tests
|
||||
|
||||
- Process the 60-frame sample sequence and assert AC-1.1 / AC-1.2 aggregate thresholds and no frame over the allowed maximum error.
|
||||
- Simulate heavy VPR/local-matching frames and assert stale frames are dropped, not queued, with <=10% drops under the defined sustained-load scenario.
|
||||
- Simulate sharp turns, disconnected segments, and 350 m outliers; assert VPR/local verification recovers or the estimator downgrades confidence without false anchors.
|
||||
- Run ArduPilot SITL with v1 parameters and assert accepted `GPS_INPUT` messages at 5-10 Hz, no `ODOMETRY` emission, correct fix_type transitions, and QGC downsampled telemetry.
|
||||
- Inject stale tiles, mismatched manifests, corrupted descriptors, and cache-poisoning candidates; assert rejection or confidence downgrade.
|
||||
- Validate object localization with level-flight AI-camera geometry and out-of-frame input errors.
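
The aggregate accuracy assertion for the 60-frame sequence can be sketched as follows. The thresholds are parameters; the defaults use the 80% under 50 m and 50% under 20 m values from the AC assessment. The haversine distance on a mean-radius sphere is an assumption that is adequate for errors of tens of metres.

```python
import math

MEAN_EARTH_RADIUS_M = 6371008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points (spherical model)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * MEAN_EARTH_RADIUS_M * math.asin(math.sqrt(h))


def accuracy_report(estimates, truth):
    """Summarize per-frame errors between estimated and ground-truth frame centers."""
    if len(estimates) != len(truth) or not truth:
        raise ValueError("estimates and truth must be non-empty and the same length")
    errors = [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimates, truth)]
    n = len(errors)
    return {
        "frac_under_50m": sum(err < 50.0 for err in errors) / n,
        "frac_under_20m": sum(err < 20.0 for err in errors) / n,
        "max_error_m": max(errors),
    }


def passes(report, min_frac_50=0.80, min_frac_20=0.50):
    return report["frac_under_50m"] >= min_frac_50 and report["frac_under_20m"] >= min_frac_20
```

Keeping the 20 m fraction as a parameter makes it a one-line change once the threshold is settled.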
|
||||
|
||||
### Non-Functional Tests
|
||||
|
||||
- Jetson benchmark: capture-to-`GPS_INPUT` p95 <400 ms with scheduler, VO, VPR triggers, local matcher, ESKF, API, and FDR active.
|
||||
- Local matcher bake-off: ALIKED + LightGlue, SIFT/AKAZE, and DeDoDe on sample/public datasets, measuring accuracy, false positives, latency, memory, and license status.
|
||||
- VPR memory/index benchmark: prove compressed descriptors, FAISS index, TensorRT engines, and runtime buffers stay below 8 GB.
|
||||
- Cache-packing benchmark: package representative 400 km² imagery with overviews, manifests, descriptors, indexes, and generated-tile sidecars under the 10 GB persistent-cache budget.
|
||||
- Thermal soak: 25 W workload for 8 hours at the upper environmental envelope without throttling.
|
||||
- Monte Carlo false-position and cache-poisoning validation over public datasets plus SITL/real FC traces.
|
||||
- License and dependency scan: fail CI if noncommercial SuperPoint weights or unapproved model artifacts enter product builds.
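
The Jetson latency benchmark needs an unambiguous p95 definition. The sketch below uses the nearest-rank method with integer arithmetic to avoid floating-point rank errors; the function names are illustrative, and the 400 ms and 10% limits come from the scheduler requirements above.

```python
def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct * n / 100) in integer math
    return ordered[rank - 1]


def meets_latency_budget(latencies_ms, frames_received, frames_dropped,
                         p95_limit_ms=400.0, max_drop_ratio=0.10):
    """Check capture-to-GPS_INPUT p95 latency and frame-drop ratio together."""
    p95 = percentile_nearest_rank(latencies_ms, 95)
    drop_ratio = frames_dropped / frames_received if frames_received else 0.0
    return p95 < p95_limit_ms and drop_ratio <= max_drop_ratio
```

Checking both conditions together matters because a scheduler can always hit the latency target by dropping more frames.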
|
||||
|
||||
## References
|
||||
|
||||
- ArduPilot MAVProxy GPSInput documentation: `https://ardupilot.org/mavproxy/docs/modules/GPSInput.html`
|
||||
- MAVLink `GPS_INPUT` message spec: `https://mavlink.io/en/messages/common.html#GPS_INPUT`
|
||||
- NVIDIA Isaac ROS cuVSLAM docs: `https://nvidia-isaac-ros.github.io/concepts/visual_slam/cuvslam/index.html`
|
||||
- Magic Leap SuperPoint pretrained network license: `https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE`
|
||||
- LightGlue repository: `https://github.com/cvg/LightGlue`
|
||||
- DeDoDe repository: `https://github.com/Parskatt/DeDoDe`
|
||||
- OpenCV SIFT source: `https://github.com/opencv/opencv/blob/4.x/modules/features2d/src/sift.dispatch.cpp`
|
||||
- AnyLoc/DINO repository: `https://github.com/AnyLoc/DINO`
|
||||
- GDAL COG driver and OGC COG standard: `https://gdal.org/en/stable/drivers/raster/cog.html`, `http://www.opengis.net/doc/is/COG/1.0`
|
||||
- AerialVL dataset: `https://github.com/hmf21/AerialVL`
|
||||
- UAV-VisLoc paper: `https://arxiv.org/html/2405.11936v1`
|
||||
|
||||
## Related Artifacts
|
||||
|
||||
- AC assessment: `_docs/00_research/00_ac_assessment.md`
|
||||
- Question decomposition: `_docs/00_research/00_question_decomposition.md`
|
||||
- Source registry: `_docs/00_research/01_source_registry.md`
|
||||
- Fact cards: `_docs/00_research/02_fact_cards.md`
|
||||
- Component fit matrix: `_docs/00_research/06_component_fit_matrix.md`
|
||||
- Tech stack evaluation: `_docs/01_solution/tech_stack.md`
|
||||
- Security analysis: `_docs/01_solution/security_analysis.md`
|
||||
@@ -1,98 +0,0 @@
|
||||
# Tech Stack Evaluation
|
||||
|
||||
## Requirements Analysis
|
||||
|
||||
| Requirement | Implication |
|
||||
|-------------|-------------|
|
||||
| Jetson Orin Nano Super, 8 GB, 25 W | Prefer Python orchestration with C++/TensorRT hot paths; benchmark memory early. |
|
||||
| Fixed nadir monocular nav camera | Avoid stereo-only VO dependencies; use custom planar VO/IMU. |
|
||||
| ArduPilot v1 GPS substitute | Use `pymavlink` `GPS_INPUT`; defer `ODOMETRY`. |
|
||||
| Offline satellite cache | Use local SQLite/MBTiles-like packages with explicit manifests and precomputed descriptors. |
|
||||
| OpenAPI docs | Use FastAPI for local control/health/object API. |
|
||||
| No raw frame persistence | FDR stores metadata, estimates, MAVLink, IMU, health, generated tiles, and failure thumbnails only. |
|
||||
|
||||
## Technology Evaluation
|
||||
|
||||
### Language and Runtime
|
||||
|
||||
| Option | Fitness | Maturity | Security | Team/Cost | Scalability | Score | Decision |
|
||||
|--------|---------|----------|----------|-----------|-------------|-------|----------|
|
||||
| Python 3.11+ orchestration | High | High | Medium | High | Medium | 4/5 | Selected |
|
||||
| C++ hot-path modules | High | High | Medium | Medium | High | 4/5 | Selected for VO/matching kernels as needed |
|
||||
| .NET backend | Low | High | High | Medium | High | 2/5 | Rejected for onboard CV hot path |
|
||||
| Rust hot-path modules | Medium | Medium | High | Medium | High | 3/5 | Deferred unless memory safety becomes critical |
|
||||
|
||||
### Vision and Inference
|
||||
|
||||
| Option | Fitness | Maturity | Security | Team/Cost | Scalability | Score | Decision |
|
||||
|--------|---------|----------|----------|-----------|-------------|-------|----------|
|
||||
| OpenCV | High | High | Medium | High | Medium | 5/5 | Selected |
|
||||
| TensorRT | High | High | Medium | Medium | High | 5/5 | Selected for deployed models |
|
||||
| PyTorch runtime | Medium | High | Medium | High | Medium | 3/5 | Dev/training only, not hot v1 runtime |
|
||||
| cuVSLAM | Low for v1 | High | Medium | Medium | High | 2/5 | Rejected for v1 sensor mismatch |
|
||||
| ORB-SLAM3 / VINS-Fusion | Medium | Medium | Low licensing fit | Medium | Medium | 2/5 | Experimental only |
|
||||
|
||||
### Matching and VPR
|
||||
|
||||
| Option | Fitness | Maturity | Security | Team/Cost | Scalability | Score | Decision |
|
||||
|--------|---------|----------|----------|-----------|-------------|-------|----------|
|
||||
| AnyLoc/DINOv2-VLAD style descriptors | High | Medium | Medium | Medium | High | 4/5 | Selected with benchmark |
|
||||
| FAISS CPU/HNSW-flat | High | High | Medium | High | Medium | 4/5 | Selected v1 baseline |
|
||||
| FAISS GPU/cuVS | Medium | Medium | Medium | Low on Jetson | High | 3/5 | Optimization only |
|
||||
| ALIKED + LightGlue | High | Medium | Medium | Medium | High | 4/5 | Selected learned-feature candidate |
|
||||
| OpenCV SIFT/AKAZE | Medium | High | High | High | Medium | 3/5 | Selected legal baseline |
|
||||
| DeDoDe | Medium | Medium | Medium | Medium | Medium | 3/5 | Experimental fallback |
|
||||
| Official Magic Leap SuperPoint weights | High | Medium | Low for product | Medium | High | 1/5 | Rejected unless commercial license is obtained |
|
||||
|
||||
### State, API, and Storage
|
||||
|
||||
| Option | Fitness | Maturity | Security | Team/Cost | Scalability | Score | Decision |
|
||||
|--------|---------|----------|----------|-----------|-------------|-------|----------|
|
||||
| ESKF with NumPy/SciPy prototype, C++ Eigen if needed | High | High | Medium | Medium | High | 5/5 | Selected |
|
||||
| FastAPI + Pydantic | High | High | Medium | High | Medium | 5/5 | Selected |
|
||||
| SQLite / MBTiles-like local cache | High | High | Medium | High | Medium | 4/5 | Selected |
|
||||
| COG/GeoTIFF + GDAL/Rasterio exchange | High | High | Medium | Medium | High | 4/5 | Selected |
|
||||
| PostgreSQL/PostGIS onboard | Low | High | Medium | Low | High | 2/5 | Rejected for embedded v1; too heavy |
|
||||
| Parquet/JSONL FDR segments | High | High | Medium | High | Medium | 4/5 | Selected |
|
||||
|
||||
### MAVLink and Ground Station

| Option | Fitness | Maturity | Security | Team/Cost | Scalability | Score | Decision |
|--------|---------|----------|----------|-----------|-------------|-------|----------|
| pymavlink `GPS_INPUT` | High | High | Medium | High | Medium | 5/5 | Selected |
| MAVSDK telemetry plumbing | Medium | High | Medium | High | Medium | 4/5 | Selected for non-GPS telemetry where useful |
| Mission Planner support | Low | High | Medium | Medium | Medium | 2/5 | Out of scope |
| QGroundControl | High | High | Medium | High | Medium | 5/5 | Selected |

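To make the `GPS_INPUT` choice concrete, the sketch below packs a position estimate into the MAVLink `GPS_INPUT` field layout (latitude and longitude in degrees × 1e7, GPS week and milliseconds into the week). The function name, the default `fix_type` and `satellites_visible` values, the placeholder DOP values, and the hard-coded 18 s GPS-UTC leap offset are assumptions for illustration; nothing here is taken from the project's code.

```python
GPS_EPOCH_UNIX_S = 315964800  # 1980-01-06T00:00:00Z expressed as Unix seconds
GPS_UTC_LEAP_S = 18           # GPS minus UTC; assumed constant for this sketch
SECONDS_PER_WEEK = 604800


def gps_input_fields(unix_time_s, lat_deg, lon_deg, alt_m,
                     vn, ve, vd,
                     horiz_acc_m, vert_acc_m, speed_acc_mps,
                     fix_type=3, satellites_visible=10):
    """Build a dict keyed by GPS_INPUT field names, in message order."""
    gps_s = unix_time_s - GPS_EPOCH_UNIX_S + GPS_UTC_LEAP_S
    week = int(gps_s // SECONDS_PER_WEEK)
    week_ms = int((gps_s - week * SECONDS_PER_WEEK) * 1000)
    return {
        "time_usec": int(unix_time_s * 1e6),
        "gps_id": 0,
        "ignore_flags": 0,               # 0 = every field is meant to be used
        "time_week_ms": week_ms,
        "time_week": week,
        "fix_type": fix_type,            # 3 = 3D fix
        "lat": int(round(lat_deg * 1e7)),  # degE7
        "lon": int(round(lon_deg * 1e7)),  # degE7
        "alt": float(alt_m),             # metres, MSL
        "hdop": 1.0,                     # placeholder value (assumption)
        "vdop": 1.0,                     # placeholder value (assumption)
        "vn": float(vn),                 # NED velocity, m/s
        "ve": float(ve),
        "vd": float(vd),
        "speed_accuracy": float(speed_acc_mps),
        "horiz_accuracy": float(horiz_acc_m),
        "vert_accuracy": float(vert_acc_m),
        "satellites_visible": satellites_visible,
    }
```

With pymavlink, these values would typically be passed in the same order to the connection's `mav.gps_input_send(...)` call; the exact signature should be checked against the pinned pymavlink version. Feeding `horiz_accuracy` from the ESKF covariance rather than a constant is what lets the autopilot weight the fix sensibly.
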
## Tech Stack Summary

- **Primary language**: Python 3.11+ for orchestration, API, data pipelines, tests, and integration.
- **Hot path**: C++/CUDA/TensorRT modules only where profiling proves Python/OpenCV is insufficient.
- **Computer vision**: OpenCV, TensorRT, DINOv2/AnyLoc-style descriptors, ALIKED + LightGlue candidate, SIFT/AKAZE legal baseline, DeDoDe experimental fallback.
- **State estimation**: ESKF in local NED/ENU, prototyped in Python and migrated to C++ Eigen only if latency requires.
- **Autopilot**: pymavlink `GPS_INPUT` v1; MAVSDK only for general telemetry if it does not interfere with GPS emission.
- **API**: FastAPI with generated OpenAPI at `/openapi.json`, local-first deployment.
- **Storage**: SQLite/MBTiles-like onboard cache, manifest sidecars, GDAL/Rasterio for COG/GeoTIFF exchange, segmented FDR logs.
- **Testing**: pytest, SITL ArduPilot, replay harness, Jetson benchmark harness, thermal/memory profiling.

## Risk Assessment

| Risk | Impact | Mitigation |
|------|--------|------------|
| Noncommercial SuperPoint weights enter product build | High | Reject official Magic Leap weights by default; CI/license scan model artifacts; use ALIKED + LightGlue or SIFT/AKAZE unless a commercial grant exists. |
| VPR model too slow or memory-heavy | High | Conditional VPR only; benchmark DINOv2-S/tiny variants; CPU FAISS baseline; precompute gallery descriptors. |
| Cross-view false positives in sparse fields | High | Top-K retrieval, local geometry, freshness gates, ESKF innovation gating, covariance calibration. |
| ArduPilot EKF source behavior changes by version | High | Pin ArduPilot version and SITL tests; v1 GPS_INPUT only. |
| Cache format exceeds 10 GB | Medium | Validate provider compression and descriptor index size before implementation freeze. |
| Python hot path misses 400 ms p95 | High | Profile first; move only measured bottlenecks to C++/TensorRT. |

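The "ESKF innovation gating" mitigation listed against cross-view false positives can be illustrated with a chi-square gate on the squared Mahalanobis distance of a 2D position innovation. This is a simplified sketch, not the project's filter: it assumes the satellite anchor observes horizontal position directly (identity measurement model), uses a 99% gate for 2 degrees of freedom, and all function names are hypothetical.

```python
CHI2_99_2DOF = 9.21  # 99% quantile of the chi-square distribution, 2 DOF


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance nu^T S^-1 nu for a 2x2 covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    dx, dy = innovation
    # S^-1 = (1/det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def accept_anchor(measured_en, predicted_en, P_pos, R_meas, gate=CHI2_99_2DOF):
    """Return True if a satellite anchor passes the innovation gate.

    measured_en / predicted_en: (east, north) positions in metres.
    P_pos: 2x2 predicted position covariance; R_meas: 2x2 measurement noise.
    Assumes H = I, so the innovation covariance is S = P_pos + R_meas.
    """
    innovation = (measured_en[0] - predicted_en[0],
                  measured_en[1] - predicted_en[1])
    S = [[P_pos[i][j] + R_meas[i][j] for j in range(2)] for i in range(2)]
    return mahalanobis_sq_2d(innovation, S) <= gate
```

For example, with 10 m predicted and 20 m measurement standard deviations per axis, a 50 m innovation gives a distance of 5.0 and is accepted, while a 100 m innovation gives 20.0 and is rejected. The gate is only as good as the covariances behind it, which is why the same row pairs it with covariance calibration.
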
## Learning / Validation Requirements

| Area | Required Proof |
|------|----------------|
| Jetson runtime | Benchmark VO, VPR trigger, local matching, ESKF, API, and FDR together under sustained load. |
| Scheduler latency | Prove bounded latest-frame queue, frame-drop accounting, and timestamp-correct GPS_INPUT under heavy VPR/local-matcher load. |
| ArduPilot integration | SITL proves `GPS_INPUT` acceptance, failsafe behavior, and no ODOMETRY emission in v1. |
| Dataset realism | Add real FC IMU logs or approved SITL IMU generated from the trajectory. |
| Cache ingestion | Validate COG/GeoTIFF import, local package generation, manifests, and stale-tile rejection. |
| Safety | Monte Carlo validates false-position and cache-poisoning budgets before production release. |
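The scheduler-latency row asks for a bounded latest-frame queue with frame-drop accounting. One possible shape, using only the standard library, is sketched below; the class name, the counters, and the drop policy (evict the oldest frame when full, discard everything but the newest on read) are assumptions for illustration.

```python
import threading
from collections import deque


class LatestFrameQueue:
    """Bounded queue that favours the newest frame and counts every drop."""

    def __init__(self, maxlen=1):
        self._frames = deque()
        self._maxlen = maxlen
        self._lock = threading.Lock()
        self.received = 0   # frames offered by the camera thread
        self.delivered = 0  # frames handed to the consumer
        self.dropped = 0    # frames evicted or skipped

    def put(self, frame):
        """Add a frame; evict the oldest one if the queue is full."""
        with self._lock:
            self.received += 1
            if len(self._frames) >= self._maxlen:
                self._frames.popleft()
                self.dropped += 1
            self._frames.append(frame)

    def get_latest(self):
        """Return the newest frame (or None) and count older ones as dropped."""
        with self._lock:
            if not self._frames:
                return None
            frame = self._frames.pop()
            self.dropped += len(self._frames)
            self._frames.clear()
            self.delivered += 1
            return frame
```

After any sequence of calls, `received == delivered + dropped + (frames still queued)`, which gives the replay harness a simple invariant to assert under heavy VPR or local-matcher load.
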
@@ -1,13 +0,0 @@
# Autodev State

## Current Step
flow: greenfield
step: 2
name: Research
status: in_progress
sub_step:
  phase: 1
  name: mode-b-investigation
  detail: "Paused planning; researching relative motion estimation and cuVSLAM fit"
retry_count: 0
cycle: 1