Revise acceptance criteria and restrictions documentation to clarify recent updates and specifications. Key changes include enhanced definitions for position accuracy, image processing quality, and operational parameters, as well as updates to camera specifications and validation requirements. This revision aims to improve clarity and ensure alignment with project goals.

This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-01 16:24:46 +03:00
parent 3f173c1bb7
commit 7e15868d39
62 changed files with 6878 additions and 13 deletions
+181
View File
@@ -0,0 +1,181 @@
# Reasoning Chain
## Dimension 1: Core Architecture
### Fact Confirmation
Fixed-wing monocular VO accumulates drift because scale and terrain assumptions are imperfect (Fact #1). Aerial VPR can provide absolute anchors but has weather, scale, and repetition failure modes (Fact #2).
### Reference Comparison
Pure VO/IMU cannot satisfy the long 8-hour mission. Pure satellite matching cannot run every frame inside the latency budget and will fail in turns, stale imagery, and repetitive fields.
### Conclusion
Select a hybrid architecture: steady-state VO/IMU propagation, conditional satellite/VPR anchoring, and ESKF covariance gates.
### Confidence
High. Supported by Sources #1 and #2.
## Dimension 2: VO / VIO Dependency
### Fact Confirmation
OpenVINS and ORB-SLAM3 both support visual-inertial estimation patterns, but both are GPL-family production dependency risks (Facts #4 and #5).
### Reference Comparison
The project needs a production system with custom mode labels, covariance propagation, MAVLink-specific output behavior, and no hidden GPL obligation. A full SLAM stack also adds initialization and map-management complexity that the ACs do not require.
### Conclusion
Use a custom VO/IMU propagation and ESKF mode machine for production. Use OpenVINS/ORB-SLAM3 only as benchmarks/reference algorithms.
### Confidence
High for licensing/scope decision; medium for final estimator performance until prototype profiling.
## Dimension 3: Satellite Retrieval And Local Matching
### Fact Confirmation
Aerial VPR benefits from scale/overlap-aware chunks and retrieval + local alignment (Fact #2). LightGlue provides keypoint correspondences/scores from local descriptors (Fact #7). FAISS provides top-K retrieval over descriptors (Fact #12).
### Reference Comparison
Global DINOv2/VLAD retrieval alone gives candidate chunks but not enough precision for AC-2.2. Local matching alone over the whole map is too expensive. The two-stage retrieval+alignment structure matches the operational need.
### Conclusion
Use DINOv2-VLAD descriptors with FAISS top-K for conditional candidate retrieval, followed by DISK/ALIKED+LightGlue and OpenCV RANSAC homography for local alignment.
### Confidence
Medium-high. API fit is strong; embedded runtime must be profiled.
## Dimension 4: Latency And Memory
### Fact Confirmation
Some aerial VPR re-ranking methods are too slow for the steady-state path (Fact #3). Jetson Orin Nano Super has 8 GB shared memory and 25 W power mode (Fact #18), with thermal throttling risk (Fact #19).
### Reference Comparison
At 3 fps and <400 ms p95, the system can process every frame through lightweight VO/IMU and use heavier VPR only on triggers. Running DINOv2/LightGlue across many candidates per frame would violate the AC unless proven otherwise.
### Conclusion
Make VPR conditional, cap top-K by covariance/sector, run descriptor extraction on downsampled/orthorectified crops, precompute cache descriptors offline, and add performance regression tests.
### Confidence
High for architecture direction; exact model sizes need profiling.
## Dimension 5: Cache Format
### Fact Confirmation
COG supports tiled compressed rasters and geospatial tooling (Fact #21). PMTiles is read-efficient but read-only (Fact #20).
### Reference Comparison
The onboard system must both read service tiles and write new generated tiles in-flight. A read-only archive is a poor primary mutable store.
### Conclusion
Use COG files plus STAC-like manifests/sidecars for imagery and metadata; use FAISS sidecar indexes for descriptors. PMTiles may be an export/snapshot format, not the live mutable cache.
### Confidence
High.
## Dimension 6: Autopilot Integration
### Fact Confirmation
ArduPilot MAVLink GPS input requires `GPS1_TYPE=14` (Fact #15). `GPS_INPUT` carries the fields needed for WGS84 position and accuracy (Fact #16). MAVSDK covers telemetry but raw `GPS_INPUT` emission still needs pymavlink/raw MAVLink (Fact #14).
### Reference Comparison
MAVSDK-only output would not satisfy AC-4.3. Raw pymavlink-only telemetry is possible but gives up MAVSDK's high-level subscriptions.
### Conclusion
Use MAVSDK for telemetry subscriptions and pymavlink for `GPS_INPUT` emission. Verify all failsafe/spoofing behavior in ArduPilot Plane SITL.
### Confidence
High for interface; medium for exact Plane failsafe timing until SITL.
## Dimension 7: Validation
### Fact Confirmation
AerialVL provides aerial localization/reference-map data (Fact #22). EuRoC provides camera+IMU+ground truth but not fixed-wing nadir imagery (Fact #23). The current sample set lacks IMU/ground truth.
### Reference Comparison
No single public dataset proves every AC. Public datasets can de-risk components, while final acceptance requires representative synchronized flight/replay data.
### Conclusion
Use layered validation: EuRoC for VIO sanity, AerialVL/VPAir-style data for VPR/anchor tests, ArduPilot Plane SITL for MAVLink/failsafe/spoofing, and a final representative flight/replay rig for AC proof.
### Confidence
High.
## Dimension 8: OpenVINS vs Project-Owned Estimator
### Fact Confirmation
OpenVINS is a strong monocular+IMU EKF/MSCKF VIO reference with covariance-aware estimation (Facts #4 and #30). OpenCV supplies calibration, undistortion, homography, RANSAC/USAC, and reprojection-error primitives, but it is not a full estimator by itself (Facts #6 and #31).
### Reference Comparison
If the alternative is a naive OpenCV-only VIO stack, OpenVINS is the better technical starting point. The project's actual production choice is different: it needs a project-owned ESKF/mode machine that fuses VO/IMU, accepts/rejects satellite anchors, emits `GPS_INPUT`, labels every estimate, handles spoofing/blackout, and gates cache write-back.
### Conclusion
Keep OpenVINS as a mandatory benchmark/reference implementation, not as the default production dependency. The production estimator remains project-owned, with OpenCV as the geometry utility layer.
### Confidence
High. The technical comparison favors OpenVINS over naive custom VIO, while the product fit favors the project-owned estimator.
## Dimension 9: Satellite Retrieval And Anchor Verification
### Fact Confirmation
DINOv2-VLAD/AnyLoc-style retrieval has strong VPR evidence but descriptor/model size and TensorRT fidelity must be validated (Facts #33 and #36). Aerial VPR survey evidence emphasizes tile scale, overlap, season/weather shifts, repetitive patterns, and re-ranking cost (Fact #34). LightGlue supports ALIKED/DISK/SIFT feature matching and returns correspondences/scores suitable for RANSAC verification, but Jetson latency must be profiled (Facts #7, #8, #35).
### Reference Comparison
Classical SIFT/ORB is simpler and cheap but weaker for cross-domain UAV-to-satellite matching. SuperPoint+LightGlue is technically strong but remains license-gated. Pure global retrieval without local verification is unsafe because repetitive farmland and stale imagery can produce plausible but wrong candidates.
### Conclusion
Use DINOv2-VLAD + CPU-first FAISS as the triggered global retriever, then verify bounded top-K candidates with ALIKED/LightGlue + OpenCV RANSAC. Keep SIFT/ORB as a regression baseline and SuperPoint only after legal approval.
### Confidence
High for architecture and interfaces; medium for final runtime until Jetson profiling.
## Dimension 10: BASALT vs Kimera-VIO vs OpenVINS
### Fact Confirmation
BASALT has strong published EuRoC evidence and production-friendly licensing (Fact #37). OpenVINS has the clearest EKF covariance API and consistency tooling, but GPLv3 remains a production constraint (Fact #38). Kimera-VIO is BSD-friendly but heavier and has documented mono-inertial caveats (Fact #39). All three require calibrated camera-to-IMU extrinsics; none has a special fixed-wing nadir mode (Fact #40).
### Reference Comparison
For a single fixed downward camera, the selection criterion is not just benchmark ATE. The project needs a VIO core that can run on Jetson, tolerate calibrated nadir geometry, and be wrapped by project-specific satellite-anchor, confidence, MAVLink, and safety logic. OpenVINS is attractive for confidence/covariance but problematic as a shipped component. Kimera is acceptable as a BSD fallback, but mono-inertial risk makes it weaker as the first pick. BASALT provides the best production trade-off if its uncertainty can be calibrated and wrapped.
### Conclusion
Select BASALT as the production VIO candidate. Keep OpenVINS as a reference/covariance baseline and Kimera-VIO as a backup candidate. The project-owned safety/anchor wrapper remains mandatory around BASALT because BASALT alone does not satisfy GPS-denied source labels, satellite anchors, false-position budgets, cache-write gates, or `GPS_INPUT` behavior.
### Confidence
Medium-high. Library ranking is well supported; final acceptance still depends on representative fixed-wing nadir replay data.