diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
new file mode 100644
index 0000000..f081c50
--- /dev/null
+++ b/.planning/PROJECT.md
@@ -0,0 +1,98 @@
+# GPS-Denied Onboard Navigation System
+
+## What This Is
+
+Real-time GPS-independent position estimation system for a fixed-wing UAV operating in GPS-denied/spoofed environments (flat terrain, Ukraine). Runs onboard a Jetson Orin Nano Super (8GB shared, 67 TOPS). Fuses visual odometry (cuVSLAM), satellite image matching (TensorRT FP16), and IMU via an ESKF to output MAVLink GPS_INPUT to an ArduPilot flight controller at 5-10Hz, while also streaming position and confidence over SSE to a ground station.
+
+## Core Value
+
+The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.
+
+## Requirements
+
+### Validated
+
+- ✓ FastAPI service scaffold with SSE streaming — existing
+- ✓ FlightProcessor orchestrator with NORMAL/LOST/RECOVERY state machine — existing
+- ✓ CoordinateTransformer (GPS↔ENU, pixel→camera→body→NED→WGS84) — existing
+- ✓ SatelliteDataManager (tile fetch, diskcache, GeoHash lookup) — existing
+- ✓ ImageInputPipeline (frame queue, batch validation, storage) — existing
+- ✓ SQLAlchemy async DB layer (flights, waypoints, frames, results) — existing
+- ✓ Pydantic schema contracts for all inter-component data — existing
+- ✓ ABC interfaces for all core components (VO, GPR, metric, graph) — existing
+
+### Active
+
+- [ ] ESKF implementation (15-state error-state Kalman filter: IMU prediction + VO update + satellite update + covariance propagation)
+- [ ] MAVLink GPS_INPUT output (pymavlink, UART/UDP, 5-10Hz loop, ESKF state→GPS_INPUT field mapping)
+- [ ] Real VO implementation (cuVSLAM on Jetson / OpenCV ORB stub on dev for CI)
+- [ ] Real TensorRT inference (SuperPoint+LightGlue for VO, XFeat for satellite matching — FP16 on Jetson)
+- [ ] Satellite feature matching pipeline (tile selection by ESKF uncertainty, RANSAC homography, WGS84 extraction)
+- [ ] GlobalPlaceRecognition implementation (AnyLoc/DINOv2 candidate retrieval, FAISS index, tile scoring)
+- [ ] FactorGraph implementation (pose graph with VO edges + satellite anchor nodes, optimization loop)
+- [ ] FailureRecoveryCoordinator (tracking loss detection, re-init protocol, operator re-localization hint)
+- [ ] End-to-end pipeline wiring (processor.process_frame → VO → ESKF → satellite → GPS_INPUT)
+- [ ] Docker SITL test harness (ArduPilot SITL, camera replay, tile server mock, CI integration)
+- [ ] Confidence scoring and GPS_INPUT fix_type mapping (HIGH/MEDIUM/LOW → fix_type 3/2/0)
+- [ ] Object GPS localization endpoint (POST /objects/locate with gimbal angle projection)
+
+### Out of Scope
+
+- TensorRT engine building tooling — engines are pre-built offline, system only loads them
+- Google Maps tile download tooling — tiles pre-cached before flight, not streamed live
+- Full ArduPilot integration testing on hardware — Jetson hardware validation is post-v1
+- Mobile/web ground station UI — SSE stream is consumed by external systems
+- Multi-UAV coordination — single UAV instance only
+
+## Context
+
+**Hardware target:** Jetson Orin Nano Super (8GB LPDDR5 shared, JetPack 6.2.2, CUDA 12.6, TRT 10.3.0). All development happens on x86 Linux; cuVSLAM and TRT are Jetson-only — dev machine uses OpenCV ORB stub and MockInferenceEngine.
+
+**Camera:** ADTI 20L V1 (5456×3632, APS-C, 16mm lens, nadir fixed, 0.7fps). AI detection camera: Viewpro A40 Pro (separate).
+
+**Flight controller:** ArduPilot via MAVLink UART. System sends GPS_INPUT; receives IMU (200Hz) and GLOBAL_POSITION_INT (1Hz) from FC.
+
+**Key latency budget:** <400ms end-to-end per frame (camera @ 0.7fps = 1430ms window).
+
+**Existing scaffold:** ~2800 lines of Python code exist as a well-structured scaffold. All modules are present with ABC interfaces and schemas, but critical algorithmic kernels (ESKF, real VO, TRT inference, MAVLink) are missing or mocked.
+
+**Test data:** 60 UAV frames (AD000001-AD000060.jpg), coordinates.csv with ground-truth GPS, expected_results/position_accuracy.csv. 43 documented test scenarios across 7 categories.
+
+## Constraints
+
+- **Performance**: <400ms/frame end-to-end, <8GB RAM+VRAM — non-negotiable for real-time flight
+- **Hardware**: cuVSLAM v15.0.0 (aarch64-only wheel) — stub interface required for CI
+- **Platform**: JetPack 6.2.2, Python 3.10+, TensorRT 10.3.0, CUDA 12.6
+- **Navigation accuracy**: 80% frames ≤50m, 60% frames ≤20m, max drift 100m between satellite corrections
+- **Resilience**: Handle sharp turns (disconnected VO segments), 3+ consecutive satellite match failures
+
+## Key Decisions
+
+| Decision | Rationale | Outcome |
+|----------|-----------|---------|
+| ESKF over EKF/UKF | 15-state error-state formulation avoids quaternion singularities, standard for INS | — Pending |
+| XFeat over LiteSAM for satellite matching | LiteSAM may exceed 400ms budget on Jetson; XFeat is faster | — Pending (benchmark required) |
+| OpenCV ORB stub for dev/CI | cuVSLAM is aarch64-only; CI must run on x86 | — Pending |
+| AnyLoc/DINOv2 for GPR | Validated on UAV-VisLoc benchmark (17.86m RMSE) | — Pending |
+| diskcache + GeoHash for tiles | O(1) tile lookup, no DB overhead, LRU eviction | ✓ Good |
+| AsyncSQLAlchemy + aiosqlite | Non-blocking DB for async FastAPI service | ✓ Good |
+
+## Evolution
+
+This document evolves at phase transitions and milestone boundaries.
+
+**After each phase transition** (via `/gsd:transition`):
+1. Requirements invalidated? → Move to Out of Scope with reason
+2. Requirements validated? → Move to Validated with phase reference
+3. New requirements emerged? → Add to Active
+4. Decisions to log? → Add to Key Decisions
+5. "What This Is" still accurate? → Update if drifted
+
+**After each milestone** (via `/gsd:complete-milestone`):
+1. Full review of all sections
+2. Core Value check — still the right priority?
+3. Audit Out of Scope — reasons still valid?
+4. Update Context with current state
+
+---
+*Last updated: 2026-04-01 after initialization*
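
The confidence scoring item in the Active list (HIGH/MEDIUM/LOW → fix_type 3/2/0) can be sketched as below. This is a minimal illustration only, not the project's actual API: `Confidence`, `GpsInputFields`, and `map_to_gps_input` are hypothetical names that do not exist in the scaffold; only the degE7 lat/lon scaling and the fix_type values follow the MAVLink GPS_INPUT message convention.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    """Hypothetical confidence levels from the ESKF/satellite pipeline."""
    HIGH = "high"      # e.g. recent satellite correction, tight covariance
    MEDIUM = "medium"  # e.g. VO-only propagation, bounded drift
    LOW = "low"        # e.g. tracking lost, no usable estimate


# Mapping stated in the requirements: HIGH/MEDIUM/LOW -> fix_type 3/2/0
# (MAVLink GPS fix types: 0 = no GPS, 2 = 2D fix, 3 = 3D fix).
FIX_TYPE = {Confidence.HIGH: 3, Confidence.MEDIUM: 2, Confidence.LOW: 0}


@dataclass
class GpsInputFields:
    lat: int        # degrees * 1e7 (GPS_INPUT int32 convention)
    lon: int        # degrees * 1e7
    alt_m: float    # altitude in meters (MSL)
    fix_type: int   # 0 = no GPS, 2 = 2D, 3 = 3D


def map_to_gps_input(lat_deg: float, lon_deg: float, alt_m: float,
                     conf: Confidence) -> GpsInputFields:
    """Scale a WGS84 estimate into GPS_INPUT integer fields."""
    return GpsInputFields(
        lat=int(round(lat_deg * 1e7)),
        lon=int(round(lon_deg * 1e7)),
        alt_m=alt_m,
        fix_type=FIX_TYPE[conf],
    )


fields = map_to_gps_input(50.4501, 30.5234, 120.0, Confidence.HIGH)
print(fields.lat, fields.lon, fields.fix_type)  # 504501000 305234000 3
```

In a real sender, these fields would be packed into a pymavlink GPS_INPUT message inside the 5-10Hz output loop, alongside the accuracy and ignore-flag fields derived from the ESKF covariance.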