fresh start v2

Oleksandr Bezdieniezhnykh
2026-04-29 17:07:28 +03:00
parent af5eb13ecb
commit 3ef26c515e
15 changed files with 0 additions and 1864 deletions
@@ -1,111 +0,0 @@
# Reasoning Chain
## Dimension 1: Local Matcher Product Fit
### Fact Confirmation
SuperPoint-style features remain technically attractive for local geometric verification, but the official Magic Leap pretrained weights are noncommercial research-only (Fact #17). LightGlue itself is Apache-2.0, but it does not license upstream extractors (Fact #18). LightGlue supports ALIKED, DISK, SIFT, and other extractors (Fact #24), DeDoDe is MIT-licensed with deployment ports (Fact #25), and OpenCV SIFT is now a commercial-safe classical baseline (Fact #26).
### Reference Comparison
`solution_draft02.md` fixed the SuperPoint licensing issue but left "license-cleared extractor" too abstract for planning. The architecture can keep the local-verification stage, but planning needs named candidates so benchmark and licensing tasks can be decomposed.
### Conclusion
Reject official SuperPoint pretrained weights for product v1 unless a commercial license is obtained. Select ALIKED + LightGlue as the first learned-feature candidate, OpenCV SIFT/AKAZE as the legal baseline, and DeDoDe as an experimental fallback pending Jetson/model-size validation.
### Confidence
High for licensing; Medium for final extractor accuracy until benchmarked.
---
## Dimension 1.5: Real-Time Scheduling
### Fact Confirmation
The camera produces frames at 3 Hz, AC-4.1 allows <400 ms p95 end-to-end latency with up to ~10% dropped frames, and AC-4.4 forbids batching or delaying output (Fact #27).
### Reference Comparison
A FIFO queue can accumulate stale frames whenever a heavy VPR or local-matching event exceeds the 333 ms camera interval. That would make the system accurate on old images while violating the flight-controller output latency budget.
### Conclusion
Add a bounded latest-frame scheduler: camera queue size 1, explicit drop accounting, IMU propagation continues between image fixes, VPR/local matching run under deadlines, and every emitted `GPS_INPUT` references the freshest state timestamp.
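The queue-size-1 behaviour with drop accounting can be sketched as follows. This is a minimal illustration, not the project's scheduler: the `Frame` and `LatestFrameSlot` names are hypothetical, and IMU propagation and deadline handling are left out.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    seq: int            # camera sequence number (hypothetical field)
    timestamp_s: float  # capture time in seconds (hypothetical field)


class LatestFrameSlot:
    """Size-1 camera queue: a new frame replaces any unprocessed one."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._frame: Optional[Frame] = None
        self.received = 0
        self.dropped = 0

    def put(self, frame: Frame) -> None:
        """Store the newest frame; count the overwritten one as dropped."""
        with self._lock:
            self.received += 1
            if self._frame is not None:
                self.dropped += 1  # stale frame is discarded, never queued
            self._frame = frame

    def take(self) -> Optional[Frame]:
        """Return the freshest unprocessed frame, or None if there is none."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame

    def drop_ratio(self) -> float:
        """Fraction of received frames that were overwritten before use."""
        return self.dropped / self.received if self.received else 0.0


slot = LatestFrameSlot()
# Simulate a slow matching step: three frames arrive before the consumer wakes.
for seq in range(3):
    slot.put(Frame(seq=seq, timestamp_s=seq / 3.0))
latest = slot.take()
```

Here the consumer only ever sees frame 2, and the two overwritten frames are counted, so the drop ratio can be compared against the ~10% allowance in AC-4.1.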
### Confidence
High.
---
## Dimension 2: VPR Descriptor and Cache Footprint
### Fact Confirmation
AnyLoc DINOv2 VLAD examples produce 49,152-dimensional descriptors (Fact #19). The operational area can be up to 400 km² with multi-scale, overlapping chunks.
### Reference Comparison
Event-triggered VPR is still the right architecture, but uncompressed VLAD descriptors can quietly consume a large fraction of RAM/cache. For example, 4,000-10,000 chunks at 49,152 float32 values each is roughly 0.8-2.0 GB before multi-scale variants, indexes, metadata, and model/runtime memory.
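The footprint arithmetic can be reproduced with a few lines. The function name is illustrative, gigabytes are decimal (1e9 bytes), and the float16 row is only there to show how one of the candidate compression options scales; it is not a measured result.

```python
DESCRIPTOR_DIM = 49_152  # AnyLoc DINOv2 VLAD dimension (Fact #19)


def descriptor_bytes(num_chunks: int,
                     dim: int = DESCRIPTOR_DIM,
                     bytes_per_value: int = 4) -> int:
    """Raw descriptor storage, excluding indexes, metadata, and scale variants."""
    return num_chunks * dim * bytes_per_value


for n in (4_000, 10_000):
    f32_gb = descriptor_bytes(n) / 1e9
    f16_gb = descriptor_bytes(n, bytes_per_value=2) / 1e9
    print(f"{n} chunks: float32 {f32_gb:.2f} GB, float16 {f16_gb:.2f} GB")
```

For 4,000 and 10,000 chunks the float32 totals come out at about 0.79 GB and 1.97 GB, matching the 0.8-2.0 GB range above.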
### Conclusion
Keep AnyLoc/DINOv2-style VPR as the lead retrieval family, but only behind a mandatory descriptor-compression gate: PCA, float16, product quantization, or a smaller descriptor must be chosen before implementation freeze. CPU FAISS/HNSW remains the v1 baseline until Jetson GPU indexing is proven.
### Confidence
High for the footprint risk; Medium for the best compression/index choice.
---
## Dimension 3: Satellite Cache Storage
### Fact Confirmation
COG supports tiled imagery, overviews, and multiple compression profiles, but its documentation does not provide a universal bytes-per-pixel budget for the target imagery (Fact #21). Zoom level alone does not prove physical resolution (Fact #13).
### Reference Comparison
The 10 GB persistent cache budget may be plausible with lossy compressed 0.3-0.5 m/px imagery and careful indexing, but it is not proven until representative Suite Satellite Service imagery is packaged with overviews, manifests, descriptors, and generated-tile sidecars.
### Conclusion
Treat cache size as a hard measurement gate. The architecture should preserve the 10 GB budget but require a cache-packing benchmark before task decomposition commits to descriptor formats or chunk overlap settings.
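Before the benchmark exists, a rough upper bound on the allowed bytes per pixel can be derived from the numbers already stated. This sketch assumes the full 400 km² area at a single resolution level, no chunk overlap, no overviews, and a decimal 10 GB budget; `imagery_share` is a hypothetical knob for reserving part of the budget for descriptors and sidecars.

```python
AREA_KM2 = 400.0           # upper bound on the operational area
CACHE_BUDGET_BYTES = 10e9  # 10 GB persistent cache budget (decimal)


def pixel_count(area_km2: float, gsd_m: float) -> float:
    """Number of ground pixels covering the area at one resolution level."""
    return area_km2 * 1e6 / (gsd_m ** 2)


def bytes_per_pixel_budget(area_km2: float,
                           gsd_m: float,
                           budget_bytes: float = CACHE_BUDGET_BYTES,
                           imagery_share: float = 1.0) -> float:
    """Average bytes per pixel the imagery may use within the budget."""
    return budget_bytes * imagery_share / pixel_count(area_km2, gsd_m)


for gsd in (0.5, 0.3):
    budget = bytes_per_pixel_budget(AREA_KM2, gsd)
    print(f"{gsd} m/px: {pixel_count(AREA_KM2, gsd):.2e} px, "
          f"<= {budget:.2f} bytes/px")
```

At 0.5 m/px the bound is 6.25 bytes per pixel; at 0.3 m/px it falls to 2.25, which is already below the 3 bytes per pixel of uncompressed 8-bit RGB. Overviews, overlap, and descriptors only tighten this, which is why the measurement gate matters.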
### Confidence
Medium-high.
---
## Dimension 4: Relative Motion and cuVSLAM
### Fact Confirmation
NVIDIA describes cuVSLAM as stereo-visual-inertial SLAM/odometry, with IMU-only degraded tracking suitable only for short intervals around one second (Fact #20). The project has one fixed downward navigation camera for v1.
### Reference Comparison
cuVSLAM is a strong Jetson stack, but the selected v1 camera geometry does not match its documented primary input assumptions. A custom planar VO/IMU module can exploit nadir imagery, flat terrain, camera intrinsics, altitude, and FC attitude directly.
### Conclusion
Keep custom planar VO/IMU as the lead. Keep cuVSLAM rejected for v1 product use, but preserve it as a benchmark/reference if the hardware changes to stereo or if NVIDIA documents an exact monocular deployment path matching the project.
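To illustrate why nadir imagery over flat terrain simplifies the problem, the sketch below converts an image-space feature shift into a north/east ground displacement using only altitude, focal length, and yaw. It is a simplified pinhole model, not the project's VO module, and the conventions are assumptions: the camera points straight down, image top is aligned with the aircraft nose, image x grows to the right, image y grows downward, and yaw is measured clockwise from north.

```python
import math
from typing import Tuple


def ground_displacement(dx_px: float,
                        dy_px: float,
                        altitude_agl_m: float,
                        focal_px: float,
                        yaw_rad: float) -> Tuple[float, float]:
    """Return (north_m, east_m) aircraft motion from mean feature shift.

    dx_px, dy_px: how ground features moved in the image between frames.
    """
    gsd_m = altitude_agl_m / focal_px  # metres of ground per pixel
    # Features move opposite to the aircraft: forward flight pushes them
    # toward the image bottom (+y), rightward flight pushes them left (-x).
    forward_m = dy_px * gsd_m
    right_m = -dx_px * gsd_m
    # Rotate the body-frame displacement into north/east using yaw.
    north_m = forward_m * math.cos(yaw_rad) - right_m * math.sin(yaw_rad)
    east_m = forward_m * math.sin(yaw_rad) + right_m * math.cos(yaw_rad)
    return north_m, east_m


# 10 px of downward feature flow at 500 m AGL with a 1000 px focal length.
north, east = ground_displacement(0.0, 10.0, 500.0, 1000.0, yaw_rad=0.0)
```

A real module would also have to handle roll and pitch from the FC attitude, lens distortion, and terrain height error, none of which is modelled here.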
### Confidence
High.
---
## Dimension 5: Validation Data
### Fact Confirmation
AerialVL and UAV-VisLoc provide useful public aerial localization data, but they only partially match the fixed-wing, ArduPilot, high-rate IMU, camera-timing, and Ukraine steppe deployment context (Facts #22, #23).
### Reference Comparison
Public datasets can validate VPR/cross-view ideas and regression-test retrieval. They cannot prove ESKF covariance, MAVLink timing, companion reboot, or false-position budgets without representative IMU and FC traces.
### Conclusion
Use public datasets for early VPR/local-matcher benchmarking, then require ArduPilot SITL-generated IMU traces and at least one real FC/camera timing capture before final acceptance.
### Confidence
High.
---
## Dimension 6: ArduPilot Output
### Fact Confirmation
ArduPilot documents MAVLink GPS input with `GPS1_TYPE=14` (Fact #1), MAVLink defines `GPS_INPUT` as raw GPS sensor input rather than the global position estimate (Fact #2), and external-nav/GPS source-fusion issues are version-specific (Fact #3).
### Reference Comparison
`ODOMETRY` is semantically richer but increases EKF source-interaction risk. A `GPS_INPUT`-only v1 is narrower and forces the accuracy fields to be honest; in return it matches the "GPS substitute" framing and avoids dual-source overlap.
### Conclusion
Keep v1 `GPS_INPUT` only. Add a v1.1 research/testing backlog item for `ODOMETRY`, gated by exact ArduPilot release, params, and SITL proof.
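As a sketch of what "honest accuracy fields" could mean in practice, the helper below maps an estimator fix to a subset of `GPS_INPUT` fields. Field names, the degE7 scaling, and the ignore-flag bit values follow the MAVLink common message set as I understand it and should be checked against the dialect in use. The function name, the choice to ignore velocity, and passing 1-sigma values straight through as accuracies are assumptions for illustration; the remaining fields (GPS week time, DOP values, satellite count, yaw) are omitted.

```python
from typing import Dict, Union

# Bits from the MAVLink GPS_INPUT_IGNORE_FLAGS enum (verify against your dialect).
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32


def build_gps_input(time_usec: int,
                    lat_deg: float,
                    lon_deg: float,
                    alt_msl_m: float,
                    horiz_sigma_m: float,
                    vert_sigma_m: float) -> Dict[str, Union[int, float]]:
    """Map a position fix to GPS_INPUT fields (hypothetical helper)."""
    return {
        "time_usec": time_usec,
        "gps_id": 0,
        # Assumption: this sketch reports position only, so velocity is ignored.
        "ignore_flags": IGNORE_VEL_HORIZ | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY,
        "fix_type": 3,                     # 3D fix
        "lat": int(round(lat_deg * 1e7)),  # degrees * 1e7
        "lon": int(round(lon_deg * 1e7)),  # degrees * 1e7
        "alt": alt_msl_m,                  # metres above mean sea level
        # Accuracy comes from the estimator's uncertainty, not a constant.
        "horiz_accuracy": horiz_sigma_m,
        "vert_accuracy": vert_sigma_m,
    }


msg = build_gps_input(1_000_000, 48.5, 35.0, 320.0, 12.0, 25.0)
```

Keeping the mapping in one pure function makes it easy to unit-test the scaling and the ignore flags in SITL before any message is sent.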
### Confidence
High.