⚠️ ARCHIVED: see solution.md for the current specification
Warning: This document contains outdated data (3fps instead of 0.7fps, LiteSAM 480px instead of 1280px).
Current specification: _docs/01_solution/solution.md
Tech Stack Evaluation (ARCHIVED)
Requirements Summary
Functional
- GPS-denied visual navigation for fixed-wing UAV
- Frame-center GPS estimation via VO + satellite matching + IMU fusion
- Object-center GPS via geometric projection
- Real-time streaming via REST API + SSE
- Disconnected route segment handling
- User-input fallback for unresolvable frames
Non-Functional
- <400ms per-frame processing (camera @ ~3fps)
- <50m accuracy for 80% of frames, <20m for 60%
- <8GB total memory (CPU+GPU shared pool)
- Up to 3000 frames per flight session
- Image Registration Rate >95% (normal segments)
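The two-tier accuracy target can be checked mechanically against per-frame position errors. The sketch below is only an illustration: the function names are hypothetical, and it assumes errors are given in metres as horizontal distance to ground truth.

```python
from typing import Sequence


def fraction_within(errors_m: Sequence[float], threshold_m: float) -> float:
    """Return the fraction of frames whose error is strictly below threshold_m."""
    if not errors_m:
        raise ValueError("errors_m must not be empty")
    return sum(1 for e in errors_m if e < threshold_m) / len(errors_m)


def meets_accuracy_targets(errors_m: Sequence[float]) -> bool:
    """Check the targets listed above: <50 m for 80% of frames, <20 m for 60%."""
    return (
        fraction_within(errors_m, 50.0) >= 0.80
        and fraction_within(errors_m, 20.0) >= 0.60
    )
```

A flight log would pass only if both thresholds hold at once, so a run with many very precise frames but a long tail of failures can still fail the 80% criterion.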
Hardware Constraints
- Jetson Orin Nano Super (8GB LPDDR5, 1024 CUDA cores, 67 TOPS INT8)
- JetPack 6.2.2: CUDA 12.6.10, TensorRT 10.3.0, cuDNN 9.3
- ARM64 (aarch64) architecture
- No internet connectivity during flight
Technology Evaluation
Platform & OS
| Option | Version | Score (1-5) | Notes |
|---|---|---|---|
| JetPack 6.2.2 (L4T) | Ubuntu 22.04 based | 5 | Only supported OS for Orin Nano Super. Includes CUDA 12.6, TensorRT 10.3, cuDNN 9.3 |
Selected: JetPack 6.2.2 — no alternative.
Primary Language
| Option | Fitness | Maturity | Perf on Jetson | Ecosystem | Score |
|---|---|---|---|---|---|
| Python 3.10+ | 5 | 5 | 4 | 5 | 4.8 |
| C++ | 5 | 5 | 5 | 3 | 4.5 |
| Rust | 3 | 3 | 5 | 2 | 3.3 |
Selected: Python 3.10+ as primary language.
- cuVSLAM provides Python bindings (PyCuVSLAM v15.0.0)
- TensorRT has Python API
- FastAPI is Python-native
- OpenCV has full Python+CUDA bindings
- Performance-critical paths offloaded to CUDA via cuVSLAM/TensorRT — Python is glue code only
- C++ for custom ESKF if NumPy proves too slow (unlikely for 16-state EKF at 100Hz)
Visual Odometry
| Option | Version | FPS on Orin Nano | Memory | License | Score |
|---|---|---|---|---|---|
| cuVSLAM (PyCuVSLAM) | v15.0.0 (Mar 2026) | 116fps @ 720p | ~200-300MB | Free (NVIDIA, closed-source) | 5 |
| XFeat frame-to-frame | TensorRT engine | ~30-50ms/frame | ~50MB | MIT | 3.5 |
| ORB-SLAM3 | v1.0 | ~30fps | ~300MB | GPLv3 | 2.5 |
Selected: PyCuVSLAM v15.0.0
- 116fps on Orin Nano 8G at 720p (verified via Intermodalics benchmark)
- Mono + IMU mode natively supported
- Auto IMU fallback on tracking loss
- Pre-built aarch64 wheel: `pip install -e bin/aarch64`
- Loop closure built-in
Risk: Closed-source; nadir-only camera not explicitly tested. Fallback: XFeat frame-to-frame matching.
Satellite Image Matching (Benchmark-Driven Selection)
Day-one benchmark decides between two candidates:
| Option | Params | Accuracy (UAV-VisLoc) | Est. Time on Orin Nano | License | Score |
|---|---|---|---|---|---|
| LiteSAM (opt) | 6.31M | RMSE@30 = 17.86m | ~300-500ms @ 480px (estimated) | Open-source | 4 (if fast enough) |
| XFeat semi-dense | ~5M | Not benchmarked on UAV-VisLoc | ~50-100ms | MIT | 4 (if LiteSAM too slow) |
Decision rule:
- Export LiteSAM (opt) to TensorRT FP16 on Orin Nano Super
- Benchmark at 480px, 640px, 800px
- If ≤400ms at 480px → LiteSAM
- If >400ms → abandon LiteSAM, XFeat is primary
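The go/no-go rule above reduces to a single threshold comparison on the measured 480px latency. A minimal sketch, assuming the benchmark results are collected into a mapping from input resolution to median latency (the function and variable names are hypothetical):

```python
from typing import Mapping

LATENCY_BUDGET_MS = 400.0  # threshold from the decision rule above


def select_matcher(
    litesam_latency_ms: Mapping[int, float],
    resolution_px: int = 480,
) -> str:
    """Pick the satellite matcher from measured LiteSAM latencies.

    litesam_latency_ms maps input resolution (px) to measured latency (ms).
    Only the entry for resolution_px drives the decision; the other
    resolutions are benchmarked for information.
    """
    latency = litesam_latency_ms[resolution_px]
    return "LiteSAM" if latency <= LATENCY_BUDGET_MS else "XFeat"
```

For example, `select_matcher({480: 350.0, 640: 520.0, 800: 900.0})` selects LiteSAM because only the 480px figure is compared against the budget.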
| Requirement | LiteSAM (opt) | XFeat semi-dense |
|---|---|---|
| PyTorch → ONNX → TensorRT export | Required | Required |
| TensorRT FP16 engine | 6.31M params, ~25MB engine | ~5M params, ~20MB engine |
| Input preprocessing | Resize to 480px, normalize | Resize to 640px, normalize |
| Matching pipeline | End-to-end (detect + match + refine) | Detect → KNN match → geometric verify |
| Cross-view robustness | Designed for satellite-aerial gap | General-purpose, less robust |
Sensor Fusion
| Option | Complexity | Accuracy | Compute @ 100Hz | Score |
|---|---|---|---|---|
| ESKF (custom) | Low | Good | <1ms/step | 5 |
| Hybrid ESKF/UKF | Medium | 49% better | ~2-3ms/step | 3.5 |
| GTSAM Factor Graph | High | Best | ~10-50ms/step | 2 |
Selected: Custom ESKF in Python (NumPy/SciPy)
- 16-state vector, well within NumPy capability
- FilterPy (v1.4.5, MIT) as reference/fallback, but custom implementation preferred for tighter control
- If 100Hz IMU prediction step proves slow in Python: rewrite as Cython or C extension (~1 day effort)
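To give a feel for the per-step work at 100Hz, here is a hedged sketch of the nominal-state part of an ESKF prediction step. It assumes the 16 states are position (3), velocity (3), attitude quaternion (4), accelerometer bias (3) and gyro bias (3), a z-up world frame, and a `(w, x, y, z)` quaternion convention; none of these are confirmed by the text above. It uses only the standard library so it runs anywhere (the same arithmetic maps directly onto NumPy arrays), and it omits covariance propagation entirely.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Quat = tuple[float, float, float, float]  # (w, x, y, z), unit norm


@dataclass(frozen=True)
class NominalState:
    p: Vec3  # position in world frame, metres
    v: Vec3  # velocity in world frame, m/s
    q: Quat  # body-to-world rotation
    b_a: Vec3 = (0.0, 0.0, 0.0)  # accelerometer bias
    b_g: Vec3 = (0.0, 0.0, 0.0)  # gyroscope bias


def _quat_mul(a: Quat, b: Quat) -> Quat:
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def _rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v from body to world frame: q * (0, v) * conj(q)."""
    conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = _quat_mul(_quat_mul(q, (0.0, *v)), conj)
    return (x, y, z)


def _delta_quat(omega: Vec3, dt: float) -> Quat:
    """Quaternion for a constant body rate omega (rad/s) applied over dt."""
    rate = math.sqrt(sum(c * c for c in omega))
    if rate < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    half = 0.5 * rate * dt
    s = math.sin(half) / rate
    return (math.cos(half), omega[0] * s, omega[1] * s, omega[2] * s)


def predict(
    state: NominalState,
    accel_meas: Vec3,
    gyro_meas: Vec3,
    dt: float,
    gravity: Vec3 = (0.0, 0.0, -9.81),  # assumes a z-up world frame
) -> NominalState:
    """Propagate the nominal state by one IMU sample (no covariance update)."""
    accel_body = tuple(m - b for m, b in zip(accel_meas, state.b_a))
    omega = tuple(m - b for m, b in zip(gyro_meas, state.b_g))
    accel_world = tuple(
        a + g for a, g in zip(_rotate(state.q, accel_body), gravity)
    )
    p = tuple(
        pi + vi * dt + 0.5 * ai * dt * dt
        for pi, vi, ai in zip(state.p, state.v, accel_world)
    )
    v = tuple(vi + ai * dt for vi, ai in zip(state.v, accel_world))
    q = _quat_mul(state.q, _delta_quat(omega, dt))
    norm = math.sqrt(sum(c * c for c in q))
    q = tuple(c / norm for c in q)  # renormalise to limit numerical drift
    return NominalState(p, v, q, state.b_a, state.b_g)
```

A full filter would add the error-state covariance propagation and the VO and satellite measurement updates; those are the parts most likely to dominate the tuning effort.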
Image Preprocessing
| Option | Tool | Time on Orin Nano | Notes | Score |
|---|---|---|---|---|
| OpenCV CUDA resize | cv2.cuda.resize | ~2-3ms (pre-allocated) | Must build OpenCV with CUDA from source. Pre-allocate GPU mats to avoid allocation overhead | 4 |
| NVIDIA VPI resize | VPI 3.2 | ~1-2ms | Part of JetPack, potentially faster | 4 |
| CPU resize (OpenCV) | cv2.resize | ~5-10ms | No GPU needed, simpler | 3 |
Selected: OpenCV CUDA (pre-allocated GPU memory) or VPI 3.2, whichever is faster in the benchmark. Both run on JetPack 6.2, but only VPI is pre-installed.
- Must build OpenCV from source with `CUDA_ARCH_BIN=8.7` for the Orin Nano's Ampere architecture
- Alternative: VPI 3.2 is pre-installed in JetPack 6.2, no build step needed
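The "whichever is faster in benchmark" choice can be made with a small timing harness. The sketch below is library-agnostic and uses hypothetical names: each candidate is a zero-argument callable that wraps one resize call. For GPU paths the callable is assumed to block until the result is ready (for example by synchronising the stream), since otherwise only the kernel launch is timed.

```python
import statistics
import time
from typing import Callable, Dict


def median_latency_ms(
    fn: Callable[[], object], repeats: int = 50, warmup: int = 5
) -> float:
    """Median wall-clock latency of fn in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up absorbs one-off costs such as allocation or JIT
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def pick_fastest(
    candidates: Dict[str, Callable[[], object]],
    repeats: int = 50,
    warmup: int = 5,
) -> str:
    """Return the name of the candidate with the lowest median latency."""
    timings = {
        name: median_latency_ms(fn, repeats=repeats, warmup=warmup)
        for name, fn in candidates.items()
    }
    return min(timings, key=timings.get)
```

The median is used rather than the mean so that occasional scheduler hiccups on the Jetson do not decide the outcome.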
API & Streaming Framework
| Option | Version | Async Support | SSE Support | Score |
|---|---|---|---|---|
| FastAPI + sse-starlette | FastAPI 0.115+, sse-starlette 3.3.2 | Native async/await | EventSourceResponse with auto-disconnect | 5 |
| Flask + flask-sse | Flask 3.x | Limited | Redis dependency | 2 |
| Raw aiohttp | aiohttp 3.x | Full | Manual SSE implementation | 3 |
Selected: FastAPI + sse-starlette v3.3.2
- sse-starlette: 108M downloads/month, BSD-3 license, production-stable
- Auto-generated OpenAPI docs
- Native async for non-blocking VO + satellite pipeline
- Uvicorn as ASGI server
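A minimal sketch of how position estimates might be streamed with this stack. The `/stream` route, the `position` event name, the payload fields, and the single shared queue are illustrative assumptions, not part of the specification. The web imports are deferred into `create_app` so the event generator can be unit-tested without FastAPI installed.

```python
import asyncio
import json
from typing import AsyncIterator, Dict


async def position_events(queue: asyncio.Queue) -> AsyncIterator[Dict[str, str]]:
    """Yield SSE event dicts until a None sentinel arrives on the queue."""
    while True:
        estimate = await queue.get()
        if estimate is None:  # sentinel: end of flight session
            break
        yield {"event": "position", "data": json.dumps(estimate)}


def create_app(queue: asyncio.Queue):
    """Build the ASGI app; requires fastapi and sse-starlette at call time."""
    from fastapi import FastAPI
    from sse_starlette.sse import EventSourceResponse

    app = FastAPI()

    @app.get("/stream")  # hypothetical route name
    async def stream():
        # EventSourceResponse consumes the async generator and handles
        # SSE framing and client disconnects.
        return EventSourceResponse(position_events(queue))

    return app
```

With one shared queue, each estimate reaches only one connected client; a deployment with several subscribers would need a per-client queue or a broadcast mechanism.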
Satellite Tile Storage & Indexing
| Option | Complexity | Lookup Speed | Score |
|---|---|---|---|
| GeoHash-indexed directory | Low | O(1) hash lookup | 5 |
| SQLite + spatial index | Medium | O(log n) | 4 |
| PostGIS | High | O(log n) | 2 (overkill) |
Selected: GeoHash-indexed directory structure
- Pre-flight: download tiles, store as `{geohash}/{zoom}_{x}_{y}.jpg` + `{geohash}/{zoom}_{x}_{y}_resized.jpg`
- Runtime: compute geohash from ESKF position → direct directory lookup
- Metadata in JSON sidecar files
- No database dependency on the Jetson during flight
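The runtime lookup described above can be sketched as follows. The geohash encoder is the standard algorithm written out by hand so the example has no dependencies (the dependency list names `pygeohash` for this role). The precision of 6 characters, the use of standard Web Mercator tile indices for `{x}` and `{y}`, and the function names are assumptions; the pre-flight download script would need to apply the same convention when storing tiles.

```python
import math
from pathlib import Path

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude/latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value, use_lon = 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits form one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def tile_xy(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Web Mercator (slippy map) tile indices containing the point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tile_path(root: str, lat: float, lon: float, zoom: int, precision: int = 6) -> Path:
    """Build {root}/{geohash}/{zoom}_{x}_{y}.jpg for an estimated position."""
    x, y = tile_xy(lat, lon, zoom)
    return Path(root) / geohash_encode(lat, lon, precision) / f"{zoom}_{x}_{y}.jpg"
```

Because the path is computed directly from the ESKF position, the lookup involves no directory scan; neighbouring cells would still have to be checked when the position sits near a geohash boundary.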
Satellite Tile Provider
| Provider | Max Zoom | GSD | Pricing | Eastern Ukraine Coverage | Score |
|---|---|---|---|---|---|
| Google Maps Tile API | 18-19 | ~0.3-0.5 m/px | 100K tiles free/month, then $0.48/1K | Partial (conflict zone gaps) | 4 |
| Bing Maps | 18-19 | ~0.3-0.5 m/px | 125K free/year (basic) | Similar | 3.5 |
| Mapbox Satellite | 18-19 | ~0.5 m/px | 200K free/month | Similar | 3.5 |
Selected: Google Maps Tile API (per restrictions.md). 100K free tiles/month covers ~25km² at zoom 19. For larger operational areas, costs are manageable at $0.48/1K tiles.
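Tile budgets are easier to reason about with the Web Mercator ground-resolution formula written out. The sketch below assumes 256px tiles, a single zoom level, no overlap between tiles, and the pricing quoted above; the function names are hypothetical. The result depends strongly on latitude, tile size, and how many zoom levels are cached, so the coverage figure quoted above should be read as one particular set of assumptions.

```python
import math

# Equatorial circumference of the WGS84 ellipsoid, as used by Web Mercator.
EARTH_CIRCUMFERENCE_M = 40_075_016.686


def ground_resolution_m_per_px(
    lat_deg: float, zoom: int, tile_size_px: int = 256
) -> float:
    """Metres per pixel at a given latitude and zoom level."""
    return (
        EARTH_CIRCUMFERENCE_M
        * math.cos(math.radians(lat_deg))
        / (tile_size_px * 2 ** zoom)
    )


def coverage_km2(
    n_tiles: int, lat_deg: float, zoom: int, tile_size_px: int = 256
) -> float:
    """Ground area covered by n_tiles non-overlapping tiles, in km^2."""
    side_m = ground_resolution_m_per_px(lat_deg, zoom, tile_size_px) * tile_size_px
    return n_tiles * side_m ** 2 / 1e6


def overage_cost_usd(
    n_tiles: int, free_tiles: int = 100_000, usd_per_1k: float = 0.48
) -> float:
    """Cost of tiles beyond the monthly free tier."""
    return max(0, n_tiles - free_tiles) / 1000.0 * usd_per_1k
```

Calling `coverage_km2(100_000, lat_deg=48.0, zoom=19)` gives the single-zoom footprint of the free tier at a latitude typical of eastern Ukraine, and `overage_cost_usd` prices any additional tiles.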
Output Format
| Format | Standard | Tooling | Score |
|---|---|---|---|
| GeoJSON | RFC 7946 | Universal GIS support | 5 |
| CSV (lat, lon, confidence) | De facto | Simple, lightweight | 4 |
Selected: GeoJSON as primary, CSV as export option. Per AC: WGS84 coordinates.
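As an illustration of the output shape, the sketch below builds a GeoJSON Feature per frame with the standard library. The property names `frame_id` and `confidence` are assumptions. Note that RFC 7946 orders coordinates as longitude first, then latitude, which is the opposite of the CSV column order.

```python
import json
from typing import Iterable


def frame_feature(frame_id: int, lat: float, lon: float, confidence: float) -> dict:
    """One frame-centre estimate as a GeoJSON Point Feature (WGS84)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},  # lon, lat order
        "properties": {"frame_id": frame_id, "confidence": confidence},
    }


def feature_collection(features: Iterable[dict]) -> dict:
    """Wrap features in a FeatureCollection ready for json.dumps."""
    return {"type": "FeatureCollection", "features": list(features)}


def to_csv_line(feature: dict) -> str:
    """Export a feature as a 'lat,lon,confidence' CSV row."""
    lon, lat = feature["geometry"]["coordinates"]
    return f"{lat},{lon},{feature['properties']['confidence']}"
```

Serialising is then a matter of `json.dumps(feature_collection(features))`.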
Tech Stack Summary
Dependency List
Python Packages (pip)
| Package | Version | Purpose |
|---|---|---|
| pycuvslam | v15.0.0 (aarch64 wheel) | Visual odometry |
| fastapi | >=0.115 | REST API framework |
| sse-starlette | >=3.3.2 | SSE streaming |
| uvicorn | >=0.30 | ASGI server |
| numpy | >=1.26 | ESKF math, array ops |
| scipy | >=1.12 | Rotation matrices, spatial transforms |
| opencv-python (CUDA build) | >=4.8 | Image preprocessing (must build from source with CUDA) |
| torch (aarch64) | >=2.3 (JetPack-compatible) | LiteSAM model loading (if selected) |
| tensorrt | 10.3.0 (JetPack bundled) | Inference engine |
| pycuda | >=2024.1 | CUDA stream management |
| geojson | >=3.1 | GeoJSON output formatting |
| pygeohash | >=1.2 | GeoHash tile indexing |
System Dependencies (JetPack 6.2.2)
| Component | Version | Notes |
|---|---|---|
| CUDA Toolkit | 12.6.10 | Pre-installed |
| TensorRT | 10.3.0 | Pre-installed |
| cuDNN | 9.3 | Pre-installed |
| VPI | 3.2 | Pre-installed, alternative to OpenCV CUDA for resize |
| cuVSLAM runtime | Bundled with PyCuVSLAM wheel | |
Offline Preprocessing Tools (developer machine, not Jetson)
| Tool | Purpose |
|---|---|
| Python 3.10+ | Tile download script |
| Google Maps Tile API key | Satellite tile access |
| torch + LiteSAM weights | Feature pre-extraction (if LiteSAM selected) |
| trtexec (TensorRT) | Model export to TensorRT engine |
Risk Assessment
| Technology | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| cuVSLAM | Closed-source, nadir camera untested | Medium | High | XFeat frame-to-frame as open-source fallback |
| LiteSAM | May exceed 400ms on Orin Nano Super | High | High | Abandon for XFeat; day-one benchmark is go/no-go |
| OpenCV CUDA build | Build complexity on Jetson, CUDA arch compatibility | Medium | Low | VPI 3.2 as drop-in alternative (pre-installed) |
| Google Maps Tile API | Conflict zone coverage gaps, EEA restrictions | Medium | Medium | Test tile availability for operational area pre-flight; alternative providers (Bing, Mapbox) |
| Custom ESKF | Implementation bugs, tuning effort | Low | Medium | FilterPy v1.4.5 as reference; well-understood algorithm |
| Python GIL | Concurrent VO + satellite matching contention | Low | Low | CUDA operations release GIL; use asyncio + threading for I/O |
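The GIL mitigation in the last row can be illustrated with `asyncio.to_thread`, which moves a blocking call onto a worker thread so the event loop keeps serving other tasks. In this sketch `blocking_match` is a hypothetical stand-in that simply sleeps; like a GIL-releasing inference call, `time.sleep` lets other threads run while it waits.

```python
import asyncio
import time


def blocking_match(frame_id: int) -> dict:
    """Stand-in for a blocking satellite-matching call (hypothetical)."""
    time.sleep(0.05)  # releases the GIL while waiting
    return {"frame_id": frame_id, "matched": True}


async def process_frame(frame_id: int) -> dict:
    """Run the blocking matcher in a worker thread without stalling the loop."""
    return await asyncio.to_thread(blocking_match, frame_id)


async def heartbeat(ticks: list, stop: asyncio.Event) -> None:
    """Count event-loop iterations to show the loop stays responsive."""
    while not stop.is_set():
        ticks[0] += 1
        await asyncio.sleep(0.005)


async def demo() -> tuple:
    """Match one frame while a heartbeat task runs concurrently."""
    ticks = [0]
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    result = await process_frame(3)
    stop.set()
    await hb
    return result, ticks[0]
```

Running `asyncio.run(demo())` returns the match result together with a heartbeat count greater than one, showing that the loop kept running during the blocking call. Pure-Python CPU work inside the worker thread would not get this benefit, because it holds the GIL.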
Learning Requirements
| Technology | Team Expertise Needed | Ramp-up Time |
|---|---|---|
| PyCuVSLAM | SLAM concepts, Python API, camera calibration | 2-3 days |
| TensorRT model export | ONNX export, trtexec, FP16 optimization | 2-3 days |
| LiteSAM architecture | Transformer-based matching (if selected) | 1-2 days |
| XFeat | Feature detection/matching concepts | 1 day |
| ESKF | Kalman filtering, quaternion math, multi-rate fusion | 3-5 days |
| FastAPI + SSE | Async Python, ASGI, SSE protocol | 1 day |
| GeoHash spatial indexing | Geospatial concepts | 0.5 days |
| Jetson deployment | JetPack, power modes, thermal management | 2-3 days |
Development Environment
| Environment | Purpose | Setup |
|---|---|---|
| Developer machine (x86_64, GPU) | Development, unit testing, model export | Docker with CUDA + TensorRT |
| Jetson Orin Nano Super | Integration testing, benchmarking, deployment | JetPack 6.2.2 flashed, SSH access |
Code should be developed and unit-tested on x86_64, then deployed to Jetson for integration/performance testing. cuVSLAM and TensorRT engines are aarch64-only — mock these in x86_64 tests.
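One way to mock a Jetson-only dependency in x86_64 unit tests is to place a fake module in `sys.modules` for the duration of a test. In the sketch below the module name `cuvslam` is an assumption, and the `Tracker` and `track` attributes are hypothetical stand-ins rather than the real PyCuVSLAM signatures.

```python
import sys
from unittest import mock


def make_fake_cuvslam() -> mock.MagicMock:
    """Fake module whose Tracker().track(...) returns a canned pose."""
    fake = mock.MagicMock(name="cuvslam")
    fake.Tracker.return_value.track.return_value = "fake-pose"
    return fake


def estimate_pose(frame: object) -> object:
    """Code under test: imports the VO library lazily, at call time."""
    import cuvslam  # assumed module name; resolved via sys.modules

    tracker = cuvslam.Tracker()  # hypothetical constructor
    return tracker.track(frame)  # hypothetical method


def run_with_fake(frame: object) -> object:
    """Call estimate_pose with the fake module installed, then restore."""
    with mock.patch.dict(sys.modules, {"cuvslam": make_fake_cuvslam()}):
        return estimate_pose(frame)
```

`mock.patch.dict` restores `sys.modules` on exit, so the fake does not leak into other tests. The same pattern applies to `tensorrt` when engine loading is wrapped behind a thin adapter.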