From 659050f20b0ac035c8e8f76ba3e0c0649d501bcf Mon Sep 17 00:00:00 2001
From: Yuzviak
Date: Wed, 1 Apr 2026 20:52:42 +0300
Subject: [PATCH] docs: add requirements and roadmap
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

37 v1 requirements across 7 categories (ESKF, VO, SAT, GPR, MAV, PIPE, TEST).
7-phase roadmap ordered by dependency: ESKF → VO → Satellite → MAVLink → Pipeline → SITL → Validation.

Co-Authored-By: Claude Sonnet 4.6
---
 .planning/REQUIREMENTS.md | 157 ++++++++++++++++++++++++++++++++++++++
 .planning/ROADMAP.md      | 110 ++++++++++++++++++++++++++
 2 files changed, 267 insertions(+)
 create mode 100644 .planning/REQUIREMENTS.md
 create mode 100644 .planning/ROADMAP.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
new file mode 100644
index 0000000..933d230
--- /dev/null
+++ b/.planning/REQUIREMENTS.md
@@ -0,0 +1,157 @@
+# Requirements: GPS-Denied Onboard Navigation System
+
+**Defined:** 2026-04-01
+**Core Value:** The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.
+
+## v1 Requirements
+
+Requirements for this milestone. The scaffold (~2800 lines) exists; all algorithmic kernels are missing or mocked. Every requirement below maps to one phase of implementation work.
+
+### ESKF — Error-State Kalman Filter
+
+- [ ] **ESKF-01**: 15-state ESKF implemented (δp, δv, δθ, δb_a, δb_g) with IMU prediction step (F, Q matrices, bias propagation)
+- [ ] **ESKF-02**: VO measurement update implemented (relative pose ΔR/Δt from cuVSLAM, H_vo, R_vo covariance, Kalman gain)
+- [ ] **ESKF-03**: Satellite measurement update implemented (absolute WGS84 position from matching, H_sat, R_sat from RANSAC inlier ratio)
+- [ ] **ESKF-04**: ESKF state initializes from GLOBAL_POSITION_INT at startup and on mid-flight reboot with high-uncertainty covariance
+- [ ] **ESKF-05**: Confidence tier computation outputs HIGH/MEDIUM/LOW based on covariance magnitude and last satellite correction age
+- [ ] **ESKF-06**: Coordinate transform chain implemented: pixel→camera ray (K matrix), camera→body (T_cam_body), body→NED (ESKF quaternion), NED→WGS84 — replacing all FAKE Math stubs
+
+### VO — Visual Odometry
+
+- [ ] **VO-01**: cuVSLAM wrapper implemented for Jetson target (Inertial mode, camera + IMU inputs, relative pose output with metric scale)
+- [ ] **VO-02**: OpenCV ORB stub conforms to the same `ISequentialVisualOdometry` interface as cuVSLAM wrapper, used on dev/CI (x86)
+- [ ] **VO-03**: TensorRT FP16 inference engine loader implemented for SuperPoint and LightGlue on Jetson; MockInferenceEngine used on dev/CI
+- [ ] **VO-04**: Scale ambiguity resolved — `scale_ambiguous` is False when ESKF provides metric scale reference; VO relative pose is metric in NED
+- [ ] **VO-05**: ImageInputPipeline batch validation minimum lowered to 1 image (not 10); `get_image_by_sequence` uses exact filename matching
+
+### SAT — Satellite Matching
+
+- [ ] **SAT-01**: XFeat TRT FP16 inference engine implemented for satellite feature matching on Jetson; MockInferenceEngine used on dev/CI
+- [ ] **SAT-02**: Satellite tile selection uses ESKF position ± 3σ_horizontal to define search area; tiles assembled into mosaic at matcher resolution
+- [ ] **SAT-03**: GSD normalization implemented — camera frame downsampled to match satellite GSD (0.3–0.6 m/px) before matching
+- [ ] **SAT-04**: RANSAC homography estimation produces WGS84 absolute position with confidence score from inlier ratio
+- [ ] **SAT-05**: SatelliteDataManager reads from pre-loaded GeoHash-indexed local directory (read-only, no live HTTP fetches during flight)
+
+### GPR — Global Place Recognition
+
+- [ ] **GPR-01**: Real Faiss index loaded at runtime from file path (not synthetic random vectors); index built from DINOv2 descriptors of actual satellite tiles during offline pre-processing
+- [ ] **GPR-02**: DINOv2/AnyLoc TRT FP16 inference engine implemented on Jetson; MockInferenceEngine used on dev/CI
+- [ ] **GPR-03**: GPR candidate retrieval returns real tile matches ranked by descriptor similarity, used for re-localization after tracking loss
+
+### MAV — MAVLink Output
+
+- [ ] **MAV-01**: pymavlink added to dependencies; MAVLink output component implemented sending GPS_INPUT over UART at 5-10Hz
+- [ ] **MAV-02**: ESKF state and covariance mapped to GPS_INPUT fields (lat/lon/alt from position, velocity from v-state, accuracy from covariance diagonal, fix_type from confidence tier, synthesized hdop/vdop, GPS time from system clock)
+- [ ] **MAV-03**: IMU input path implemented — MAVLink listener receives ATTITUDE/RAW_IMU from flight controller at 5-10Hz and feeds ESKF prediction step
+- [ ] **MAV-04**: Consecutive-failure counter detects 3 frames without any position estimate; sends MAVLink NAMED_VALUE_FLOAT re-localization request to ground station operator
+- [ ] **MAV-05**: Telemetry output at 1Hz sends confidence score and drift estimate to ground station via MAVLink NAMED_VALUE_FLOAT
+
+### PIPE — Pipeline Wiring
+
+- [ ] **PIPE-01**: FlightProcessor.process_frame wired end-to-end: image in → cuVSLAM VO → ESKF VO update → (keyframe) satellite match → ESKF satellite update → GPS_INPUT output
+- [ ] **PIPE-02**: SatelliteDataManager and CoordinateTransformer instantiated and wired into processor pipeline (currently standalone, not connected)
+- [ ] **PIPE-03**: FactorGraph replaced or backed by real GTSAM ISAM2 incremental smoothing with BetweenFactorPose3 (VO) and GPSFactor (satellite anchors)
+- [ ] **PIPE-04**: FailureRecoveryCoordinator connected to ESKF — on tracking loss, ESKF continues IMU-only prediction with growing uncertainty; on recovery success, ESKF is reset with satellite position
+- [ ] **PIPE-05**: ImageRotationManager integrated into process_frame — heading sweep on first frame; `calculate_precise_angle` implemented with real VO-based refinement
+- [ ] **PIPE-06**: Object GPS localization endpoint (POST /objects/locate) uses full pixel→ray→ground→WGS84 chain with ESKF attitude; hardcoded stub removed
+- [ ] **PIPE-07**: Confidence scoring and fix_type mapping wired end-to-end: ESKF confidence tier → GPS_INPUT fix_type (3/2/0), accuracy fields
+- [ ] **PIPE-08**: ImageRotationManager constructor signature fixed (accepts optional ModelManager); startup TypeError resolved
+
+### TEST — Test Harness and Validation
+
+- [ ] **TEST-01**: Docker SITL test harness implemented: ArduPilot SITL container, camera-replay service, satellite tile server mock, MAVLink capture
+- [ ] **TEST-02**: CI pipeline runs on x86 using OpenCV ORB stub and MockInferenceEngine; all unit tests pass
+- [ ] **TEST-03**: Accuracy validation test runs against 60-frame dataset (AD000001–AD000060.jpg) with coordinates.csv ground truth; reports 80%/50m and 60%/20m hit rates
+- [ ] **TEST-04**: Performance benchmark test validates <400ms end-to-end per frame on Jetson (or reports estimated latency breakdown on dev)
+- [ ] **TEST-05**: All 21 blackbox test scenarios (FT-P-01 to FT-P-14, FT-N-01 to FT-N-07) implemented as runnable pytest tests using SITL harness
+
+## v2 Requirements
+
+Deferred to future release. Tracked but not in current roadmap.
+
+### Security
+
+- **SEC-01**: JWT bearer token authentication on all API endpoints
+- **SEC-02**: TLS 1.3 on all HTTPS connections
+- **SEC-03**: Satellite tile manifest SHA-256 integrity verification
+- **SEC-04**: Mahalanobis distance outlier rejection in ESKF measurement updates
+- **SEC-05**: CORS origins locked down (remove wildcard default)
+
+### Operational
+
+- **OPS-01**: Uvicorn `reload` flag defaults to False in production config
+- **OPS-02**: Structured logging with configurable log levels per module
+- **OPS-03**: Pre-flight health check validates TRT engines loaded, tiles present, IMU receiving
+- **OPS-04**: ResultManager.publish_waypoint_update implemented for waypoint SSE emission
+
+### Performance
+
+- **PERF-01**: Dual CUDA stream execution (Stream A: VO, Stream B: satellite matching) for pipeline parallelism
+- **PERF-02**: Satellite tile RAM preload (±2km corridor) at startup for sub-millisecond tile access
+
+## Out of Scope
+
+Explicitly excluded. Documented to prevent scope creep.
+
+| Feature | Reason |
+|---------|--------|
+| TRT engine building tooling | Engines are pre-built offline via trtexec; system only loads them |
+| Google Maps tile download tooling | Tiles pre-cached before flight; no live internet during flight |
+| Full ArduPilot hardware validation on Jetson | Post-v1; Jetson hardware testing is not in scope for this milestone |
+| Mobile/web ground station UI | SSE stream consumed by external systems; UI is out of scope |
+| Multi-UAV coordination | Single UAV instance only |
+| GTSAM ARM64 source build tooling | GTSAM on Jetson requires source compilation; CI uses mock; Jetson build is ops concern |
+| tech_stack.md synchronization | Documented inconsistency (3fps vs 0.7fps, etc.); separate documentation task |
+
+## Traceability
+
+Which phases cover which requirements. Populated from ROADMAP.md phase assignments.
+
+| Requirement | Phase | Status |
+|-------------|-------|--------|
+| ESKF-01 | Phase 1 | Pending |
+| ESKF-02 | Phase 1 | Pending |
+| ESKF-03 | Phase 1 | Pending |
+| ESKF-04 | Phase 1 | Pending |
+| ESKF-05 | Phase 1 | Pending |
+| ESKF-06 | Phase 1 | Pending |
+| VO-01 | Phase 2 | Pending |
+| VO-02 | Phase 2 | Pending |
+| VO-03 | Phase 2 | Pending |
+| VO-04 | Phase 2 | Pending |
+| VO-05 | Phase 2 | Pending |
+| SAT-01 | Phase 3 | Pending |
+| SAT-02 | Phase 3 | Pending |
+| SAT-03 | Phase 3 | Pending |
+| SAT-04 | Phase 3 | Pending |
+| SAT-05 | Phase 3 | Pending |
+| GPR-01 | Phase 3 | Pending |
+| GPR-02 | Phase 3 | Pending |
+| GPR-03 | Phase 3 | Pending |
+| MAV-01 | Phase 4 | Pending |
+| MAV-02 | Phase 4 | Pending |
+| MAV-03 | Phase 4 | Pending |
+| MAV-04 | Phase 4 | Pending |
+| MAV-05 | Phase 4 | Pending |
+| PIPE-01 | Phase 5 | Pending |
+| PIPE-02 | Phase 5 | Pending |
+| PIPE-03 | Phase 5 | Pending |
+| PIPE-04 | Phase 5 | Pending |
+| PIPE-05 | Phase 5 | Pending |
+| PIPE-06 | Phase 5 | Pending |
+| PIPE-07 | Phase 5 | Pending |
+| PIPE-08 | Phase 5 | Pending |
+| TEST-01 | Phase 6 | Pending |
+| TEST-02 | Phase 6 | Pending |
+| TEST-03 | Phase 7 | Pending |
+| TEST-04 | Phase 7 | Pending |
+| TEST-05 | Phase 7 | Pending |
+
+**Coverage:**
+- v1 requirements: 37 total
+- Mapped to phases: 37
+- Unmapped: 0
+
+---
+*Requirements defined: 2026-04-01*
+*Last updated: 2026-04-01 after initial definition*
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
new file mode 100644
index 0000000..5a5a473
--- /dev/null
+++ b/.planning/ROADMAP.md
@@ -0,0 +1,110 @@
+# Roadmap: GPS-Denied Onboard Navigation System
+
+## Overview
+
+The scaffold exists (~2800 lines): FastAPI service, all component ABCs, Pydantic schemas, database layer, and SSE streaming are in place. What is missing is every algorithmic kernel. This roadmap implements them in dependency order: the ESKF math core first (everything else feeds into it), then the two sensor inputs (VO and satellite/GPR), then the MAVLink output that closes the loop to the flight controller, then end-to-end pipeline wiring, then a Docker SITL test harness, and finally accuracy validation against real flight data.
+
+## Phases
+
+- [ ] **Phase 1: ESKF Core** - 15-state error-state Kalman filter, coordinate transforms, confidence scoring
+- [ ] **Phase 2: Visual Odometry** - cuVSLAM wrapper (Jetson) + OpenCV ORB stub (dev/CI) + TRT SuperPoint/LightGlue
+- [ ] **Phase 3: Satellite Matching + GPR** - XFeat TRT matching, offline tile pipeline, real Faiss GPR index
+- [ ] **Phase 4: MAVLink I/O** - pymavlink GPS_INPUT output loop, IMU input listener, telemetry, re-localization request
+- [ ] **Phase 5: End-to-End Pipeline Wiring** - processor integration, GTSAM factor graph, recovery coordinator, object localization
+- [ ] **Phase 6: Docker SITL Harness + CI** - ArduPilot SITL, camera replay, tile server mock, CI integration
+- [ ] **Phase 7: Accuracy Validation** - 60-frame dataset validation, latency profiling, blackbox test suite
+
+## Phase Details
+
+### Phase 1: ESKF Core
+**Goal**: A correct, standalone ESKF implementation exists that fuses IMU, VO, and satellite measurements and outputs confidence-tiered position estimates in WGS84
+**Depends on**: Nothing (first phase; every later phase builds on the ESKF)
+**Requirements**: ESKF-01, ESKF-02, ESKF-03, ESKF-04, ESKF-05, ESKF-06
+**Success Criteria** (what must be TRUE):
+  1. ESKF propagates nominal state (position, velocity, quaternion, biases) from synthetic IMU inputs and covariance grows correctly between measurement updates
+  2. VO measurement update reduces position uncertainty and innovation is within expected bounds for a simulated relative pose input
+  3. Satellite measurement update corrects absolute position and covariance tightens to satellite noise level
+  4. Confidence tier outputs HIGH when last satellite correction is recent and covariance is small, MEDIUM on VO-only, LOW on IMU-only — verified by unit tests
+  5. Full coordinate chain (pixel → camera ray → body → NED → WGS84) produces correct GPS coordinates for a known geometry test case; all FAKE Math stubs replaced
+**Plans**: TBD
+
+### Phase 2: Visual Odometry
+**Goal**: VO produces metric relative poses via cuVSLAM on Jetson and via OpenCV ORB on dev/CI, both satisfying the same interface — no more scale-ambiguous unit vectors
+**Depends on**: Phase 1 (ESKF provides metric scale reference and coordinate transforms for VO measurement update)
+**Requirements**: VO-01, VO-02, VO-03, VO-04, VO-05
+**Success Criteria** (what must be TRUE):
+  1. cuVSLAM wrapper initializes in Inertial mode with camera intrinsics and IMU parameters, and returns RelativePose with `scale_ambiguous=False` and metric translation in NED
+  2. OpenCV ORB stub satisfies the same ISequentialVisualOdometry interface and passes the same interface contract tests as the cuVSLAM wrapper
+  3. TRT SuperPoint/LightGlue engines load and run inference on Jetson; MockInferenceEngine is selected automatically on dev/x86
+  4. ImageInputPipeline accepts single-image batches without error; sequence lookup returns the correct frame with no substring collision
+**Plans**: TBD
+
+### Phase 3: Satellite Matching + GPR
+**Goal**: The system can correct absolute position from pre-loaded satellite tiles and re-localize after tracking loss using a real Faiss descriptor index
+**Depends on**: Phase 1 (ESKF position uncertainty drives tile selection radius and measurement update), Phase 2 (VO provides keyframe selection timing)
+**Requirements**: SAT-01, SAT-02, SAT-03, SAT-04, SAT-05, GPR-01, GPR-02, GPR-03
+**Success Criteria** (what must be TRUE):
+  1. Satellite tile selection queries the local GeoHash-indexed directory using ESKF position ± 3σ and returns correct tiles without any HTTP requests
+  2. Camera frame is GSD-normalized to satellite resolution before matching; XFeat TRT inference runs on Jetson and MockInferenceEngine on dev/CI
+  3. RANSAC homography produces a WGS84 position estimate with a confidence score derived from inlier ratio, accepted by ESKF satellite measurement update
+  4. GPR loads a real Faiss index from disk and returns tile candidates ranked by DINOv2 descriptor similarity (not random vectors)
+  5. After simulated tracking loss, GPR candidate + MetricRefinement produces an ESKF re-localization within expected accuracy bounds
+**Plans**: TBD
+
+### Phase 4: MAVLink I/O
+**Goal**: The flight controller receives GPS_INPUT at 5-10Hz and the system receives IMU data from the flight controller — the primary acceptance criterion is met end-to-end for the communication layer
+**Depends on**: Phase 1 (ESKF state is the source for GPS_INPUT field population; IMU data drives ESKF prediction)
+**Requirements**: MAV-01, MAV-02, MAV-03, MAV-04, MAV-05
+**Success Criteria** (what must be TRUE):
+  1. pymavlink sends GPS_INPUT messages to a MAVLink endpoint at 5-10Hz; all required fields populated (lat, lon, alt, velocity, accuracy, fix_type, hdop, vdop, GPS time)
+  2. fix_type maps correctly from ESKF confidence tier: HIGH → 3 (3D fix), MEDIUM → 2 (2D fix), LOW → 0 (no fix)
+  3. IMU listener receives ATTITUDE/RAW_IMU from flight controller at 5-10Hz and ESKF prediction step runs at that rate between camera frames
+  4. After 3 consecutive frames with no position estimate, a MAVLink NAMED_VALUE_FLOAT message with last known position is sent (verifiable in SITL logs)
+  5. Telemetry at 1Hz emits confidence score and drift estimate to ground station via NAMED_VALUE_FLOAT
+**Plans**: TBD
+
+### Phase 5: End-to-End Pipeline Wiring
+**Goal**: A single uploaded camera frame travels through the full pipeline — VO, ESKF update, satellite correction (on keyframes), GPS_INPUT output — with no hardcoded stubs in the path
+**Depends on**: Phase 1, Phase 2, Phase 3, Phase 4 (all algorithmic components must exist to be wired)
+**Requirements**: PIPE-01, PIPE-02, PIPE-03, PIPE-04, PIPE-05, PIPE-06, PIPE-07, PIPE-08
+**Success Criteria** (what must be TRUE):
+  1. process_frame executes the full chain without error: VO relative pose → ESKF VO update → (every 5-10 frames) satellite match → ESKF satellite update → GPS_INPUT sent to flight controller
+  2. SatelliteDataManager and CoordinateTransformer are instantiated in app.py lifespan and injected into the processor; no component is standalone
+  3. FactorGraphOptimizer calls real GTSAM ISAM2 update when GTSAM is available; mock path remains for CI
+  4. Object GPS localization (POST /objects/locate) returns a WGS84 position using the real pixel→ray→ground chain; hardcoded (48.0, 37.0) stub is gone
+  5. Application starts without TypeError; ImageRotationManager constructor accepts the model manager argument
+**Plans**: TBD
+
+### Phase 6: Docker SITL Harness + CI
+**Goal**: The full pipeline can be tested in a reproducible Docker environment with ArduPilot SITL, camera replay, and a tile server mock — and CI runs this on every commit
+**Depends on**: Phase 5 (all components must be wired before integration testing is meaningful)
+**Requirements**: TEST-01, TEST-02
+**Success Criteria** (what must be TRUE):
+  1. `docker compose up` starts ArduPilot SITL, the GPS-denied service, a camera-replay container, and a satellite tile server mock — all communicate over MAVLink and HTTP
+  2. CI pipeline runs on x86 using OpenCV ORB stub and MockInferenceEngine; all 85+ unit tests pass with no manual steps
+  3. MAVLink GPS_INPUT messages are captured in SITL logs and show 5-10Hz output rate during camera replay
+  4. Tracking loss scenario (simulated by replaying frames with no overlap) triggers RECOVERY state and sends re-localization request
+**Plans**: TBD
+
+### Phase 7: Accuracy Validation
+**Goal**: The system demonstrably meets the navigation accuracy acceptance criteria on the 60-frame test dataset, and all 21 blackbox test scenarios are implemented as runnable tests
+**Depends on**: Phase 6 (SITL harness is required for the blackbox test scenarios)
+**Requirements**: TEST-03, TEST-04, TEST-05
+**Success Criteria** (what must be TRUE):
+  1. Running against AD000001–AD000060.jpg with coordinates.csv ground truth: 80% of frames within 50m error and 60% of frames within 20m error
+  2. Maximum cumulative VO drift between satellite corrections is less than 100m across any segment in the test dataset
+  3. End-to-end latency per frame (camera capture to GPS_INPUT) is under 400ms on Jetson, with a breakdown report per pipeline stage
+  4. All 21 blackbox test scenarios (FT-P-01 to FT-P-14, FT-N-01 to FT-N-07) run as pytest tests against the SITL harness and produce a pass/fail report
+**Plans**: TBD
+
+## Progress
+
+| Phase | Plans Complete | Status | Completed |
+|-------|----------------|--------|-----------|
+| 1. ESKF Core | 0/TBD | Not started | - |
+| 2. Visual Odometry | 0/TBD | Not started | - |
+| 3. Satellite Matching + GPR | 0/TBD | Not started | - |
+| 4. MAVLink I/O | 0/TBD | Not started | - |
+| 5. End-to-End Pipeline Wiring | 0/TBD | Not started | - |
+| 6. Docker SITL Harness + CI | 0/TBD | Not started | - |
+| 7. Accuracy Validation | 0/TBD | Not started | - |
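
Note on the Phase 7 accuracy criteria (80% of frames within 50m, 60% within 20m): the check could be sketched roughly as below. This is a minimal illustration and not part of the patch. The function names are hypothetical, and using haversine distance on a spherical Earth as the horizontal error metric is an assumption, since the roadmap does not say how error is measured. Parsing of coordinates.csv is left out because its column layout is not given here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def hit_rate(errors_m, threshold_m):
    """Fraction of frames whose horizontal error is at most threshold_m."""
    if not errors_m:
        raise ValueError("no frames to evaluate")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy_criteria(estimates, truth):
    """True if >= 80% of frames are within 50 m and >= 60% are within 20 m.

    estimates and truth are equal-length sequences of (lat, lon) tuples in degrees.
    """
    if len(estimates) != len(truth):
        raise ValueError("estimates and truth must have the same length")
    errors = [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimates, truth)]
    return hit_rate(errors, 50.0) >= 0.80 and hit_rate(errors, 20.0) >= 0.60
```

Keeping `hit_rate` separate from the pass/fail decision lets a test report both percentages even when the thresholds are not met, which is what TEST-03 asks for.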