docs: add requirements and roadmap

37 v1 requirements across 7 categories (ESKF, VO, SAT, GPR, MAV, PIPE, TEST).
7-phase roadmap ordered by dependency: ESKF → VO → Satellite → MAVLink → Pipeline → SITL → Validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Yuzviak
2026-04-01 20:52:42 +03:00
parent 06f9ccd28f
commit 659050f20b
2 changed files with 267 additions and 0 deletions
+157
@@ -0,0 +1,157 @@
# Requirements: GPS-Denied Onboard Navigation System
**Defined:** 2026-04-01
**Core Value:** The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.
## v1 Requirements
Requirements for this milestone. The scaffold (~2800 lines) exists; all algorithmic kernels are missing or mocked. Every requirement below maps to one phase of implementation work.
### ESKF — Error-State Kalman Filter
- [ ] **ESKF-01**: 15-state ESKF implemented (δp, δv, δθ, δb_a, δb_g) with IMU prediction step (F, Q matrices, bias propagation)
- [ ] **ESKF-02**: VO measurement update implemented (relative pose ΔR/Δt from cuVSLAM, H_vo, R_vo covariance, Kalman gain)
- [ ] **ESKF-03**: Satellite measurement update implemented (absolute WGS84 position from matching, H_sat, R_sat from RANSAC inlier ratio)
- [ ] **ESKF-04**: ESKF state initializes from GLOBAL_POSITION_INT at startup and on mid-flight reboot with high-uncertainty covariance
- [ ] **ESKF-05**: Confidence tier computation outputs HIGH/MEDIUM/LOW based on covariance magnitude and last satellite correction age
- [ ] **ESKF-06**: Coordinate transform chain implemented: pixel→camera ray (K matrix), camera→body (T_cam_body), body→NED (ESKF quaternion), NED→WGS84 — replacing all FAKE Math stubs
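ESKF-05 could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the threshold constants, the function names, and the choice of horizontal 1-sigma as the "covariance magnitude" are all assumptions made for the example.

```python
import math
from enum import Enum


class ConfidenceTier(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


# Hypothetical thresholds for illustration only; real values need tuning.
HIGH_MAX_SIGMA_M = 15.0      # largest horizontal 1-sigma still rated HIGH
MEDIUM_MAX_SIGMA_M = 50.0    # largest horizontal 1-sigma still rated MEDIUM
HIGH_MAX_SAT_AGE_S = 30.0    # oldest satellite correction still rated HIGH


def horizontal_sigma_m(var_north_m2: float, var_east_m2: float) -> float:
    """Horizontal 1-sigma (m) from the north/east position variances (m^2)."""
    return math.sqrt(var_north_m2 + var_east_m2)


def confidence_tier(var_north_m2: float, var_east_m2: float,
                    sat_age_s: float) -> ConfidenceTier:
    """Combine covariance magnitude and satellite-correction age into a tier."""
    sigma = horizontal_sigma_m(var_north_m2, var_east_m2)
    if sigma <= HIGH_MAX_SIGMA_M and sat_age_s <= HIGH_MAX_SAT_AGE_S:
        return ConfidenceTier.HIGH
    if sigma <= MEDIUM_MAX_SIGMA_M:
        return ConfidenceTier.MEDIUM
    return ConfidenceTier.LOW
```

With these assumed thresholds, a tight covariance with a stale satellite correction falls back to MEDIUM, and a large covariance is LOW regardless of correction age.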
### VO — Visual Odometry
- [ ] **VO-01**: cuVSLAM wrapper implemented for Jetson target (Inertial mode, camera + IMU inputs, relative pose output with metric scale)
- [ ] **VO-02**: OpenCV ORB stub conforms to the same `ISequentialVisualOdometry` interface as cuVSLAM wrapper, used on dev/CI (x86)
- [ ] **VO-03**: TensorRT FP16 inference engine loader implemented for SuperPoint and LightGlue on Jetson; MockInferenceEngine used on dev/CI
- [ ] **VO-04**: Scale ambiguity resolved — `scale_ambiguous` is False when ESKF provides metric scale reference; VO relative pose is metric in NED
- [ ] **VO-05**: ImageInputPipeline batch validation minimum lowered to 1 image (not 10); `get_image_by_sequence` uses exact filename matching
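The exact-matching requirement in VO-05 can be illustrated with a small sketch. It assumes filenames consist of an alphabetic prefix, a zero-padded sequence number, and an extension (as in `AD000001.jpg`); the helper names `sequence_of` and `find_by_sequence` are hypothetical and are not the signature of `get_image_by_sequence`.

```python
import re
from pathlib import PurePath
from typing import Iterable, Optional

# Assumed naming scheme: letters, then digits, then a file extension.
_SEQ_RE = re.compile(r"^[A-Za-z]*(\d+)\.[A-Za-z0-9]+$")


def sequence_of(filename: str) -> Optional[int]:
    """Parse the numeric sequence from a filename, or None if it does not fit."""
    match = _SEQ_RE.match(PurePath(filename).name)
    return int(match.group(1)) if match else None


def find_by_sequence(filenames: Iterable[str], sequence: int) -> Optional[str]:
    """Return the file whose parsed sequence equals `sequence` exactly."""
    for name in filenames:
        if sequence_of(name) == sequence:
            return name
    return None
```

Comparing parsed integers avoids the substring collision where a lookup for frame 1 would also match `AD000010.jpg` or `AD000100.jpg`.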
### SAT — Satellite Matching
- [ ] **SAT-01**: XFeat TRT FP16 inference engine implemented for satellite feature matching on Jetson; MockInferenceEngine used on dev/CI
- [ ] **SAT-02**: Satellite tile selection uses ESKF position ± 3σ_horizontal to define search area; tiles assembled into mosaic at matcher resolution
- [ ] **SAT-03**: GSD normalization implemented: camera frame downsampled to match satellite GSD (0.3–0.6 m/px) before matching
- [ ] **SAT-04**: RANSAC homography estimation produces WGS84 absolute position with confidence score from inlier ratio
- [ ] **SAT-05**: SatelliteDataManager reads from pre-loaded GeoHash-indexed local directory (read-only, no live HTTP fetches during flight)
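The GSD normalization in SAT-03 reduces to a single resize factor. The sketch below assumes a nadir-pointing camera over flat terrain and a focal length expressed in pixels; the function names and the example numbers are illustrative assumptions, not values from the project.

```python
from typing import Tuple


def camera_gsd_m_per_px(altitude_agl_m: float, focal_length_px: float) -> float:
    """Ground sample distance of a nadir camera (pinhole model, flat ground)."""
    return altitude_agl_m / focal_length_px


def resize_factor(camera_gsd: float, satellite_gsd: float) -> float:
    """Factor to apply to the camera frame so its GSD equals the satellite GSD.

    A value below 1.0 means the camera frame is finer than the tile
    and must be downsampled.
    """
    return camera_gsd / satellite_gsd


def normalized_size(width_px: int, height_px: int,
                    factor: float) -> Tuple[int, int]:
    """Target frame size after resizing, never smaller than 1x1."""
    return max(1, round(width_px * factor)), max(1, round(height_px * factor))
```

For example, at 400 m above ground with a 4000 px focal length the camera GSD is 0.1 m/px, so matching a 0.5 m/px tile means resizing a 4000x3000 frame by 0.2 to 800x600.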
### GPR — Global Place Recognition
- [ ] **GPR-01**: Real Faiss index loaded at runtime from file path (not synthetic random vectors); index built from DINOv2 descriptors of actual satellite tiles during offline pre-processing
- [ ] **GPR-02**: DINOv2/AnyLoc TRT FP16 inference engine implemented on Jetson; MockInferenceEngine used on dev/CI
- [ ] **GPR-03**: GPR candidate retrieval returns real tile matches ranked by descriptor similarity, used for re-localization after tracking loss
### MAV — MAVLink Output
- [ ] **MAV-01**: pymavlink added to dependencies; MAVLink output component implemented sending GPS_INPUT over UART at 5-10Hz
- [ ] **MAV-02**: ESKF state and covariance mapped to GPS_INPUT fields (lat/lon/alt from position, velocity from v-state, accuracy from covariance diagonal, fix_type from confidence tier, synthesized hdop/vdop, GPS time from system clock)
- [ ] **MAV-03**: IMU input path implemented — MAVLink listener receives ATTITUDE/RAW_IMU from flight controller at 5-10Hz and feeds ESKF prediction step
- [ ] **MAV-04**: Consecutive-failure counter detects 3 consecutive frames without any position estimate; sends MAVLink NAMED_VALUE_FLOAT re-localization request to ground station operator
- [ ] **MAV-05**: Telemetry output at 1Hz sends confidence score and drift estimate to ground station via MAVLink NAMED_VALUE_FLOAT
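The counter behind MAV-04 might look like the sketch below. The class name is hypothetical, and the choice to fire the request once per outage (rather than on every frame past the threshold) is an assumption; the requirement itself only fixes the threshold of 3 frames.

```python
from typing import Optional, Tuple

Position = Tuple[float, float]  # (lat_deg, lon_deg), illustrative only


class ConsecutiveFailureCounter:
    """Tracks frames without a position estimate."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.misses = 0
        self.request_sent = False

    def update(self, position: Optional[Position]) -> bool:
        """Record one frame; return True when a request should be sent.

        Returns True exactly once per outage, on the frame where the
        number of consecutive misses first reaches the threshold.
        """
        if position is not None:
            # Any valid estimate ends the outage and re-arms the trigger.
            self.misses = 0
            self.request_sent = False
            return False
        self.misses += 1
        if self.misses >= self.threshold and not self.request_sent:
            self.request_sent = True
            return True
        return False
```

The caller would send the NAMED_VALUE_FLOAT re-localization request whenever `update` returns True.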
### PIPE — Pipeline Wiring
- [ ] **PIPE-01**: FlightProcessor.process_frame wired end-to-end: image in → cuVSLAM VO → ESKF VO update → (keyframe) satellite match → ESKF satellite update → GPS_INPUT output
- [ ] **PIPE-02**: SatelliteDataManager and CoordinateTransformer instantiated and wired into processor pipeline (currently standalone, not connected)
- [ ] **PIPE-03**: FactorGraph replaced or backed by real GTSAM ISAM2 incremental smoothing with BetweenFactorPose3 (VO) and GPSFactor (satellite anchors)
- [ ] **PIPE-04**: FailureRecoveryCoordinator connected to ESKF — on tracking loss, ESKF continues IMU-only prediction with growing uncertainty; on recovery success, ESKF is reset with satellite position
- [ ] **PIPE-05**: ImageRotationManager integrated into process_frame — heading sweep on first frame; `calculate_precise_angle` implemented with real VO-based refinement
- [ ] **PIPE-06**: Object GPS localization endpoint (POST /objects/locate) uses full pixel→ray→ground→WGS84 chain with ESKF attitude; hardcoded stub removed
- [ ] **PIPE-07**: Confidence scoring and fix_type mapping wired end-to-end: ESKF confidence tier → GPS_INPUT fix_type (3/2/0), accuracy fields
- [ ] **PIPE-08**: ImageRotationManager constructor signature fixed (accepts optional ModelManager); startup TypeError resolved
### TEST — Test Harness and Validation
- [ ] **TEST-01**: Docker SITL test harness implemented: ArduPilot SITL container, camera-replay service, satellite tile server mock, MAVLink capture
- [ ] **TEST-02**: CI pipeline runs on x86 using OpenCV ORB stub and MockInferenceEngine; all unit tests pass
- [ ] **TEST-03**: Accuracy validation test runs against 60-frame dataset (AD000001–AD000060.jpg) with coordinates.csv ground truth; reports 80%/50m and 60%/20m hit rates
- [ ] **TEST-04**: Performance benchmark test validates <400ms end-to-end per frame on Jetson (or reports estimated latency breakdown on dev)
- [ ] **TEST-05**: All 21 blackbox test scenarios (FT-P-01 to FT-P-14, FT-N-01 to FT-N-07) implemented as runnable pytest tests using SITL harness
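The hit rates reported by TEST-03 can be computed along these lines. This is a sketch under assumptions: the haversine formula on a spherical Earth stands in for whatever distance metric the validation test actually uses, and the function names are hypothetical.

```python
import math
from typing import Sequence

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def hit_rate(errors_m: Sequence[float], threshold_m: float) -> float:
    """Fraction of frames whose position error is within the threshold."""
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_acceptance(errors_m: Sequence[float]) -> bool:
    """80% of frames within 50 m and 60% of frames within 20 m."""
    return hit_rate(errors_m, 50.0) >= 0.80 and hit_rate(errors_m, 20.0) >= 0.60
```

Each per-frame error would come from `haversine_m(estimate, ground_truth)` for the matching row of coordinates.csv.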
## v2 Requirements
Deferred to future release. Tracked but not in current roadmap.
### Security
- **SEC-01**: JWT bearer token authentication on all API endpoints
- **SEC-02**: TLS 1.3 on all HTTPS connections
- **SEC-03**: Satellite tile manifest SHA-256 integrity verification
- **SEC-04**: Mahalanobis distance outlier rejection in ESKF measurement updates
- **SEC-05**: CORS origins locked down (remove wildcard default)
### Operational
- **OPS-01**: Uvicorn `reload` flag defaults to False in production config
- **OPS-02**: Structured logging with configurable log levels per module
- **OPS-03**: Pre-flight health check validates TRT engines loaded, tiles present, IMU receiving
- **OPS-04**: ResultManager.publish_waypoint_update implemented for waypoint SSE emission
### Performance
- **PERF-01**: Dual CUDA stream execution (Stream A: VO, Stream B: satellite matching) for pipeline parallelism
- **PERF-02**: Satellite tile RAM preload (±2km corridor) at startup for sub-millisecond tile access
## Out of Scope
Explicitly excluded. Documented to prevent scope creep.
| Feature | Reason |
|---------|--------|
| TRT engine building tooling | Engines are pre-built offline via trtexec; system only loads them |
| Google Maps tile download tooling | Tiles pre-cached before flight; no live internet during flight |
| Full ArduPilot hardware validation on Jetson | Post-v1; Jetson hardware testing is not in scope for this milestone |
| Mobile/web ground station UI | SSE stream consumed by external systems; UI is out of scope |
| Multi-UAV coordination | Single UAV instance only |
| GTSAM ARM64 source build tooling | GTSAM on Jetson requires source compilation; CI uses mock; Jetson build is ops concern |
| tech_stack.md synchronization | Documented inconsistency (3fps vs 0.7fps, etc.); separate documentation task |
## Traceability
Which phases cover which requirements. Populated from ROADMAP.md phase assignments.
| Requirement | Phase | Status |
|-------------|-------|--------|
| ESKF-01 | Phase 1 | Pending |
| ESKF-02 | Phase 1 | Pending |
| ESKF-03 | Phase 1 | Pending |
| ESKF-04 | Phase 1 | Pending |
| ESKF-05 | Phase 1 | Pending |
| ESKF-06 | Phase 1 | Pending |
| VO-01 | Phase 2 | Pending |
| VO-02 | Phase 2 | Pending |
| VO-03 | Phase 2 | Pending |
| VO-04 | Phase 2 | Pending |
| VO-05 | Phase 2 | Pending |
| SAT-01 | Phase 3 | Pending |
| SAT-02 | Phase 3 | Pending |
| SAT-03 | Phase 3 | Pending |
| SAT-04 | Phase 3 | Pending |
| SAT-05 | Phase 3 | Pending |
| GPR-01 | Phase 3 | Pending |
| GPR-02 | Phase 3 | Pending |
| GPR-03 | Phase 3 | Pending |
| MAV-01 | Phase 4 | Pending |
| MAV-02 | Phase 4 | Pending |
| MAV-03 | Phase 4 | Pending |
| MAV-04 | Phase 4 | Pending |
| MAV-05 | Phase 4 | Pending |
| PIPE-01 | Phase 5 | Pending |
| PIPE-02 | Phase 5 | Pending |
| PIPE-03 | Phase 5 | Pending |
| PIPE-04 | Phase 5 | Pending |
| PIPE-05 | Phase 5 | Pending |
| PIPE-06 | Phase 5 | Pending |
| PIPE-07 | Phase 5 | Pending |
| PIPE-08 | Phase 5 | Pending |
| TEST-01 | Phase 6 | Pending |
| TEST-02 | Phase 6 | Pending |
| TEST-03 | Phase 7 | Pending |
| TEST-04 | Phase 7 | Pending |
| TEST-05 | Phase 7 | Pending |
**Coverage:**
- v1 requirements: 37 total
- Mapped to phases: 37
- Unmapped: 0
---
*Requirements defined: 2026-04-01*
*Last updated: 2026-04-01 after initial definition*
+110
@@ -0,0 +1,110 @@
# Roadmap: GPS-Denied Onboard Navigation System
## Overview
The scaffold exists (~2800 lines): FastAPI service, all component ABCs, Pydantic schemas, database layer, and SSE streaming are in place. What is missing is every algorithmic kernel. This roadmap implements them in dependency order: the ESKF math core first (everything else feeds into it), then the two sensor inputs (VO and satellite/GPR), then the MAVLink output that closes the loop to the flight controller, then end-to-end pipeline wiring, then a Docker SITL test harness, and finally accuracy validation against real flight data.
## Phases
- [ ] **Phase 1: ESKF Core** - 15-state error-state Kalman filter, coordinate transforms, confidence scoring
- [ ] **Phase 2: Visual Odometry** - cuVSLAM wrapper (Jetson) + OpenCV ORB stub (dev/CI) + TRT SuperPoint/LightGlue
- [ ] **Phase 3: Satellite Matching + GPR** - XFeat TRT matching, offline tile pipeline, real Faiss GPR index
- [ ] **Phase 4: MAVLink I/O** - pymavlink GPS_INPUT output loop, IMU input listener, telemetry, re-localization request
- [ ] **Phase 5: End-to-End Pipeline Wiring** - processor integration, GTSAM factor graph, recovery coordinator, object localization
- [ ] **Phase 6: Docker SITL Harness + CI** - ArduPilot SITL, camera replay, tile server mock, CI integration
- [ ] **Phase 7: Accuracy Validation** - 60-frame dataset validation, latency profiling, blackbox test suite
## Phase Details
### Phase 1: ESKF Core
**Goal**: A correct, standalone ESKF implementation exists that fuses IMU, VO, and satellite measurements and outputs confidence-tiered position estimates in WGS84
**Depends on**: Nothing (first phase; every other algorithmic component feeds into or reads from the ESKF)
**Requirements**: ESKF-01, ESKF-02, ESKF-03, ESKF-04, ESKF-05, ESKF-06
**Success Criteria** (what must be TRUE):
1. ESKF propagates nominal state (position, velocity, quaternion, biases) from synthetic IMU inputs and covariance grows correctly between measurement updates
2. VO measurement update reduces position uncertainty and innovation is within expected bounds for a simulated relative pose input
3. Satellite measurement update corrects absolute position and covariance tightens to satellite noise level
4. Confidence tier outputs HIGH when last satellite correction is recent and covariance is small, MEDIUM on VO-only, LOW on IMU-only — verified by unit tests
5. Full coordinate chain (pixel → camera ray → body → NED → WGS84) produces correct GPS coordinates for a known geometry test case; all FAKE Math stubs replaced
**Plans**: TBD
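The last link of the coordinate chain in criterion 5 (NED → WGS84) can be sketched for a known-geometry test case as below. This is a simplified small-offset approximation on a spherical Earth, assumed here for brevity; a production transform would use the WGS84 ellipsoid, and the function name is hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # spherical approximation, not the WGS84 ellipsoid


def ned_to_geodetic(origin_lat_deg: float, origin_lon_deg: float,
                    origin_alt_m: float, north_m: float, east_m: float,
                    down_m: float) -> Tuple[float, float, float]:
    """Convert a local NED offset to lat/lon/alt around a reference origin.

    Valid only for offsets that are small relative to the Earth radius.
    """
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    # Meridians converge towards the poles, so scale east by cos(latitude).
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(origin_lat_deg)))
    )
    # NED "down" is positive towards the ground, so altitude decreases with it.
    return origin_lat_deg + dlat, origin_lon_deg + dlon, origin_alt_m - down_m
```

A unit test for the full chain could compare this output against a surveyed point, with a tolerance that accounts for the spherical simplification.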
### Phase 2: Visual Odometry
**Goal**: VO produces metric relative poses via cuVSLAM on Jetson and via OpenCV ORB on dev/CI, both satisfying the same interface — no more scale-ambiguous unit vectors
**Depends on**: Phase 1 (ESKF provides metric scale reference and coordinate transforms for VO measurement update)
**Requirements**: VO-01, VO-02, VO-03, VO-04, VO-05
**Success Criteria** (what must be TRUE):
1. cuVSLAM wrapper initializes in Inertial mode with camera intrinsics and IMU parameters, and returns RelativePose with `scale_ambiguous=False` and metric translation in NED
2. OpenCV ORB stub satisfies the same ISequentialVisualOdometry interface and passes the same interface contract tests as the cuVSLAM wrapper
3. TRT SuperPoint/LightGlue engines load and run inference on Jetson; MockInferenceEngine is selected automatically on dev/x86
4. ImageInputPipeline accepts single-image batches without error; sequence lookup returns the correct frame with no substring collision
**Plans**: TBD
### Phase 3: Satellite Matching + GPR
**Goal**: The system can correct absolute position from pre-loaded satellite tiles and re-localize after tracking loss using a real Faiss descriptor index
**Depends on**: Phase 1 (ESKF position uncertainty drives tile selection radius and measurement update), Phase 2 (VO provides keyframe selection timing)
**Requirements**: SAT-01, SAT-02, SAT-03, SAT-04, SAT-05, GPR-01, GPR-02, GPR-03
**Success Criteria** (what must be TRUE):
1. Satellite tile selection queries the local GeoHash-indexed directory using ESKF position ± 3σ and returns correct tiles without any HTTP requests
2. Camera frame is GSD-normalized to satellite resolution before matching; XFeat TRT inference runs on Jetson and MockInferenceEngine on dev/CI
3. RANSAC homography produces a WGS84 position estimate with a confidence score derived from inlier ratio, accepted by ESKF satellite measurement update
4. GPR loads a real Faiss index from disk and returns tile candidates ranked by DINOv2 descriptor similarity (not random vectors)
5. After simulated tracking loss, GPR candidate + MetricRefinement produces an ESKF re-localization within expected accuracy bounds
**Plans**: TBD
### Phase 4: MAVLink I/O
**Goal**: The flight controller receives GPS_INPUT at 5-10Hz and the system receives IMU data from the flight controller — the primary acceptance criterion is met end-to-end for the communication layer
**Depends on**: Phase 1 (ESKF state is the source for GPS_INPUT field population; IMU data drives ESKF prediction)
**Requirements**: MAV-01, MAV-02, MAV-03, MAV-04, MAV-05
**Success Criteria** (what must be TRUE):
1. pymavlink sends GPS_INPUT messages to a MAVLink endpoint at 5-10Hz; all required fields populated (lat, lon, alt, velocity, accuracy, fix_type, hdop, vdop, GPS time)
2. fix_type maps correctly from ESKF confidence tier: HIGH → 3 (3D fix), MEDIUM → 2 (2D fix), LOW → 0 (no fix)
3. IMU listener receives ATTITUDE/RAW_IMU from flight controller at 5-10Hz and ESKF prediction step runs at that rate between camera frames
4. After 3 consecutive frames with no position estimate, a MAVLink NAMED_VALUE_FLOAT message with last known position is sent (verifiable in SITL logs)
5. Telemetry at 1Hz emits confidence score and drift estimate to ground station via NAMED_VALUE_FLOAT
**Plans**: TBD
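Criteria 1 and 2 above amount to a mapping from ESKF output to GPS_INPUT fields, sketched here as a plain dictionary so it runs without pymavlink. The field names and the degE7 scaling of `lat`/`lon` follow the MAVLink GPS_INPUT message; the function name, the argument layout, and deriving accuracy as the square root of the covariance diagonal are assumptions. hdop/vdop, GPS time, and the remaining fields are omitted.

```python
import math
from typing import Dict, Tuple, Union

# Mapping stated in the roadmap: HIGH -> 3D fix, MEDIUM -> 2D fix, LOW -> no fix.
FIX_TYPE_BY_TIER = {"HIGH": 3, "MEDIUM": 2, "LOW": 0}


def gps_input_fields(
    tier: str,
    lat_deg: float,
    lon_deg: float,
    alt_m: float,
    vel_ned_mps: Tuple[float, float, float],
    pos_var_ned_m2: Tuple[float, float, float],
) -> Dict[str, Union[int, float]]:
    """Build a subset of GPS_INPUT fields from an ESKF state (illustrative)."""
    var_n, var_e, var_d = pos_var_ned_m2
    vn, ve, vd = vel_ned_mps
    return {
        "fix_type": FIX_TYPE_BY_TIER[tier],
        "lat": round(lat_deg * 1e7),   # GPS_INPUT expects degrees * 1e7
        "lon": round(lon_deg * 1e7),
        "alt": alt_m,
        "vn": vn,
        "ve": ve,
        "vd": vd,
        # Assumed: accuracy taken as 1-sigma from the covariance diagonal.
        "horiz_accuracy": math.sqrt(var_n + var_e),
        "vert_accuracy": math.sqrt(var_d),
    }
```

The resulting values would then be passed to the MAVLink sender together with the timing and dilution-of-precision fields.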
### Phase 5: End-to-End Pipeline Wiring
**Goal**: A single uploaded camera frame travels through the full pipeline — VO, ESKF update, satellite correction (on keyframes), GPS_INPUT output — with no hardcoded stubs in the path
**Depends on**: Phase 1, Phase 2, Phase 3, Phase 4 (all algorithmic components must exist to be wired)
**Requirements**: PIPE-01, PIPE-02, PIPE-03, PIPE-04, PIPE-05, PIPE-06, PIPE-07, PIPE-08
**Success Criteria** (what must be TRUE):
1. process_frame executes the full chain without error: VO relative pose → ESKF VO update → (every 5-10 frames) satellite match → ESKF satellite update → GPS_INPUT sent to flight controller
2. SatelliteDataManager and CoordinateTransformer are instantiated in app.py lifespan and injected into the processor; no component is standalone
3. FactorGraphOptimizer calls real GTSAM ISAM2 update when GTSAM is available; mock path remains for CI
4. Object GPS localization (POST /objects/locate) returns a WGS84 position using the real pixel→ray→ground chain; hardcoded (48.0, 37.0) stub is gone
5. Application starts without TypeError; ImageRotationManager constructor accepts the model manager argument
**Plans**: TBD
### Phase 6: Docker SITL Harness + CI
**Goal**: The full pipeline can be tested in a reproducible Docker environment with ArduPilot SITL, camera replay, and a tile server mock — and CI runs this on every commit
**Depends on**: Phase 5 (all components must be wired before integration testing is meaningful)
**Requirements**: TEST-01, TEST-02
**Success Criteria** (what must be TRUE):
1. `docker compose up` starts ArduPilot SITL, the GPS-denied service, a camera-replay container, and a satellite tile server mock — all communicate over MAVLink and HTTP
2. CI pipeline runs on x86 using OpenCV ORB stub and MockInferenceEngine; all 85+ unit tests pass with no manual steps
3. MAVLink GPS_INPUT messages are captured in SITL logs and show 5-10Hz output rate during camera replay
4. Tracking loss scenario (simulated by replaying frames with no overlap) triggers RECOVERY state and sends re-localization request
**Plans**: TBD
### Phase 7: Accuracy Validation
**Goal**: The system demonstrably meets the navigation accuracy acceptance criteria on the 60-frame test dataset, and all 21 blackbox test scenarios are implemented as runnable tests
**Depends on**: Phase 6 (SITL harness is required for the blackbox test scenarios)
**Requirements**: TEST-03, TEST-04, TEST-05
**Success Criteria** (what must be TRUE):
1. Running against AD000001–AD000060.jpg with coordinates.csv ground truth: 80% of frames within 50m error and 60% of frames within 20m error
2. Maximum cumulative VO drift between satellite corrections is less than 100m across any segment in the test dataset
3. End-to-end latency per frame (camera capture to GPS_INPUT) is under 400ms on Jetson, with a breakdown report per pipeline stage
4. All 21 blackbox test scenarios (FT-P-01 to FT-P-14, FT-N-01 to FT-N-07) run as pytest tests against the SITL harness and produce a pass/fail report
**Plans**: TBD
## Progress
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. ESKF Core | 0/TBD | Not started | - |
| 2. Visual Odometry | 0/TBD | Not started | - |
| 3. Satellite Matching + GPR | 0/TBD | Not started | - |
| 4. MAVLink I/O | 0/TBD | Not started | - |
| 5. End-to-End Pipeline Wiring | 0/TBD | Not started | - |
| 6. Docker SITL Harness + CI | 0/TBD | Not started | - |
| 7. Accuracy Validation | 0/TBD | Not started | - |