GPS-Denied Onboard Navigation System — Stage 2
What This Is
Real-time GPS-independent position estimation system for a fixed-wing UAV operating in GPS-denied/spoofed environments (flat terrain, Ukraine). Runs onboard a Jetson Orin Nano Super (8GB shared, 67 TOPS). Fuses visual odometry (cuVSLAM), satellite image matching (TensorRT FP16), and IMU via an ESKF to output MAVLink GPS_INPUT to an ArduPilot flight controller at 5-10Hz, while also streaming position and confidence over SSE to a ground station.
Stage 2 Iteration
Stage 2 is a self-contained iteration of the project. It is NOT a continuation of Stage 1's phase numbering — it has its own roadmap (Phases 1–6), its own requirements list, and its own success criteria. Each stage is conceptually a new pass at the system: same problem, same end goal, fresh decisions about HOW.
Stage 2 starting capital:
- From stage1 (own work): The full v1 pipeline as MVP — ESKF (15-state), cuVSLAM/ORB VO, satellite matching + GPR, MAVLink GPS_INPUT, pipeline orchestration, SITL harness, accuracy benchmarks, 195 passing tests. Treated as MVP, not production — refactoring is allowed and expected.
- From try02 (parallel team): Concept-level ideas only — Safety Anchor State Machine, Geometry-Gated Anchor Verifier, Flight Data Recorder, Conditional Multi-Scale VPR, dual-channel MAVLink design, formal Acceptance Criteria document with numeric thresholds, structured test taxonomy.
- From real-flight data: Azaion 10.05.2026 dataset (tlog + 6min video + 9.5Hz GPS ground truth) as integration fixture.
Stage 2 is free to:
- Reorganize the codebase (hexagonal layout) — no production lock-in
- Replace, swap, or rebuild components — only AC-driven test outcomes are sacred
- Change the architecture wholesale if a better path emerges mid-stage
- Diverge from try02's choices where the evidence supports it (e.g., reject BASALT in favor of cuVSLAM, reject Pydantic on hot path)
Stage 2 archive: _planning/archive/v1.0/ preserves stage1's PROJECT.md, REQUIREMENTS.md, ROADMAP.md, and Phase 1 artifacts as historical record.
Core Value
The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.
Stage 2 Goal
Refactor the inherited stage1 MVP into a hexagonal/ports-and-adapters architecture with explicit DI composition root, integrate selected concept-level ideas from try02, formalize acceptance criteria with testable numerics, and add a real-flight integration fixture (Azaion 10.05.2026).
Stage 2 Target Features
Architecture:
- Hexagonal layout: src/gps_denied/components/{vio, satellite_matcher, gpr, anchor_verifier, safety_state, flight_recorder, mavlink_io, coordinate_transforms}/ with protocol.py + concrete impls per component
- Hot-path types as @dataclass(slots=True, frozen=True) for FrameState, IMUSample, PositionEstimate; Pydantic kept only at REST/config/DB boundaries
- Composition root pipeline/composition.py with explicit DI for env-specific wiring (jetson/x86_dev/ci/sitl)
- Per-environment config: config/{jetson,x86_dev,ci,sitl}.yaml driven by pydantic-settings
- core/ retained for concentrated math (ESKF, factor graph, RANSAC) as single-file pure functions
try02 concept integration:
- Acceptance Criteria document — formal AC-1.x…AC-NEW-x with numeric thresholds, validation methods, test linkage
- Safety Anchor State Machine: separate layer over ESKF owning source_label (satellite_anchored/vo_extrapolated/dead_reckoned), monotonic covariance growth, anchor age, tile write eligibility
- Geometry-gated Anchor Verifier: formal accept/reject gates (inliers, MRE, reprojection error) before anchor enters ESKF
- Flight Data Recorder (FDR) — append-only event log with bounded segment storage and health states
- Conditional VPR invocation — DINOv2 forward only on re-loc triggers; steady-state geometric prior
- Multi-scale VPR chunks — 600-800m ground-footprint chunks at 40-50% overlap, decoupled from storage tiles, fine (z=20) + coarse (z=17) scales
- Source label + anchor_age_ms emitted in every GPS_INPUT estimate
- Visual blackout handling: switch to dead_reckoned within 400ms, monotonic covariance growth, VISUAL_BLACKOUT_IMU_ONLY STATUSTEXT @ 1-2Hz
- Spoofing-promotion latency monitor: promote own estimate to FC primary <3s after a detected real-GPS health drop
- Test taxonomy: tests/{unit,integration,blackbox,sitl,e2e}/
- Dual-channel MAVLink design: GPS_INPUT primary (v1 only), ODOMETRY auxiliary scaffolded behind feature flag for v1.1
- Structured JSON logging with a per-frame correlation_id (frame_id)
- CLI tool gps_denied replay --tlog ... --video ...
- Real-flight integration fixture: Azaion 10.05.2026 as tests/integration/azaion_flight/
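The Safety Anchor State Machine and blackout items above can be pictured with a minimal sketch. The three labels and the 400ms figure come from the list; the class name, method names, transition rules, and the initial sigma are assumptions made for illustration, not the project's implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceLabel(str, Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VO_EXTRAPOLATED = "vo_extrapolated"
    DEAD_RECKONED = "dead_reckoned"


@dataclass
class SafetyAnchorState:
    """Hypothetical bookkeeping layer sitting above the filter."""

    blackout_timeout_ms: int = 400          # upper bound from the target list
    label: SourceLabel = SourceLabel.DEAD_RECKONED
    sigma_m: float = 100.0                  # assumed initial reported 1-sigma
    last_anchor_ms: Optional[int] = None
    last_visual_ms: Optional[int] = None

    def on_anchor(self, now_ms: int, anchor_sigma_m: float) -> None:
        # Assumption: a verified anchor is the only event that may shrink sigma.
        self.last_anchor_ms = now_ms
        self.last_visual_ms = now_ms
        self.sigma_m = anchor_sigma_m
        self.label = SourceLabel.SATELLITE_ANCHORED

    def on_vo(self, now_ms: int, filter_sigma_m: float) -> None:
        # Visual odometry keeps the estimate alive but cannot reduce sigma.
        self.last_visual_ms = now_ms
        self._grow(filter_sigma_m)
        self.label = SourceLabel.VO_EXTRAPOLATED

    def on_imu_only(self, now_ms: int, filter_sigma_m: float) -> None:
        # No visual input this tick; demote once the gap reaches the timeout.
        self._grow(filter_sigma_m)
        gap_exceeded = (
            self.last_visual_ms is None
            or now_ms - self.last_visual_ms >= self.blackout_timeout_ms
        )
        if gap_exceeded:
            self.label = SourceLabel.DEAD_RECKONED

    def anchor_age_ms(self, now_ms: int) -> Optional[int]:
        if self.last_anchor_ms is None:
            return None
        return now_ms - self.last_anchor_ms

    def _grow(self, filter_sigma_m: float) -> None:
        # Monotonic growth between anchors: never report a smaller sigma.
        self.sigma_m = max(self.sigma_m, filter_sigma_m)
```

In this sketch the reported sigma is decoupled from the filter's own covariance, so a filter that becomes over-confident between anchors cannot lower what is emitted downstream. A real implementation would likely need a detection threshold tighter than 400ms to guarantee the switch completes inside that budget.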
Stage 2 Explicit Non-Goals
- BASALT VIO backend — cuVSLAM remains primary (aarch64) with ORB-SLAM3 as CI baseline
- Pydantic on the per-frame hot path — dataclasses replace it
- Mandatory PostgreSQL — SQLite remains the embedded default
- Microservice processes / IPC — single-process architecture preserved
- Folder-per-component split for core/ math files: ESKF/factor graph stay concentrated
- Mid-flight tile generation + write-back to Suite (AC-8.4): deferred to Stage 3
- Production hardware validation on Jetson — deferred to Stage 3
Future Stages (parking lot)
- Stage 3 candidates: Jetson hardware validation, mid-flight tile generation + Suite write-back, ODOMETRY channel enabled, AC-NEW-1 cold-boot benchmark, BASALT evaluation if cuVSLAM blockers emerge
Out of Scope (across all stages, unless re-opened)
- TensorRT engine building tooling — engines are pre-built offline
- Google Maps tile download tooling — tiles pre-cached before flight
- Mobile/web ground station UI — SSE consumed by external systems
- Multi-UAV coordination — single UAV instance only
Context
Hardware target: Jetson Orin Nano Super (8GB LPDDR5 shared, JetPack 6.2.2, CUDA 12.6, TRT 10.3.0). Development on x86 Linux; cuVSLAM and TRT are Jetson-only — dev/CI uses OpenCV ORB stub and MockInferenceEngine.
Camera (target): ADTI 20L V1 (5456×3632, APS-C, 16mm lens, nadir fixed, 0.7fps). AI detection camera: Viewpro A40 Pro (separate).
Camera (Azaion fixture): Multirotor gimbal EO+IR split-screen with HUD overlay, 1280×720 @ 30fps. Used for integration testing only — does not represent target deployment camera.
Flight controller: ArduPilot via MAVLink UART. System sends GPS_INPUT; receives IMU (200Hz target / 9.7Hz in Azaion fixture) and GLOBAL_POSITION_INT (1Hz) from FC.
Key latency budget: <400ms end-to-end per frame.
Stage 1 inheritance: ~7,800 lines of working Python code with 195 passing tests. All algorithmic kernels (ESKF, VO, GPR, MAVLink, factor graph) implemented. Stage 2 starts from this codebase on branch stage2 (HEAD = stage1).
Reference branch: try02 is checked out as a worktree at ../gps-denied-onboard-try02/ for concept harvesting. We do NOT merge from try02 — we read it for ideas and re-implement what fits.
Constraints
- Performance: <400ms/frame end-to-end p95, <8GB RAM+VRAM — non-negotiable
- Hardware: cuVSLAM v15.0.0 (aarch64-only wheel) — Protocol with stub on x86
- Platform: JetPack 6.2.2, Python 3.10+, TensorRT 10.3.0, CUDA 12.6
- Navigation accuracy: 80% frames ≤50m, 60% frames ≤20m, max drift 100m between satellite corrections
- Resilience: Handle sharp turns (disconnected VO segments), 3+ consecutive satellite match failures, visual blackout, GPS spoofing promotion <3s
- Regression floor: All 195 stage1 passing tests must continue to pass after refactor
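As a rough illustration of how the navigation-accuracy constraint could be checked against ground truth, the sketch below computes per-frame horizontal error and the two percentage gates (80% ≤50m, 60% ≤20m). The function names and the choice of a haversine distance on a spherical Earth are assumptions; the project's benchmarks may compute error differently.

```python
import math
from typing import Sequence, Tuple

LatLon = Tuple[float, float]  # (latitude_deg, longitude_deg)

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def fraction_within(errors_m: Sequence[float], threshold_m: float) -> float:
    """Share of frames whose error is at or below the threshold."""
    if not errors_m:
        raise ValueError("no frames to evaluate")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy_targets(estimates: Sequence[LatLon], truth: Sequence[LatLon]) -> bool:
    """Apply both gates from the constraint: 80% <= 50 m and 60% <= 20 m."""
    if len(estimates) != len(truth):
        raise ValueError("estimates and truth must be frame-aligned")
    errors = [haversine_m(e, t) for e, t in zip(estimates, truth)]
    return fraction_within(errors, 50.0) >= 0.80 and fraction_within(errors, 20.0) >= 0.60
```

The sketch assumes estimates and ground truth are already aligned frame by frame; with the Azaion fixture's 9.5Hz GPS track that alignment (interpolation to frame timestamps) would be a separate step.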
Stage 2 Key Decisions
| Decision | Rationale | Outcome |
|---|---|---|
| Hexagonal layout with components/ folders | Clear ownership per swappable backend, native bridges colocate with adapter | ✓ Phase 1 |
| @dataclass(slots=True, frozen=True) on hot path, Pydantic at boundaries only | Avoid try02's per-frame Pydantic latency cost; validate where it catches bugs (REST input, config) | ✓ Phase 1 (hot_types/ scaffolded; full migration Phase 2) |
| Explicit DI composition root | One file wires environment-specific implementations; tests pass mock dependencies | ✓ Phase 1 (pipeline/composition.py:build_pipeline) |
| Adopt try02 concept ideas, reject try02 layout details | Take Safety Anchor / Anchor Verifier / FDR / Conditional VPR; reject Pydantic-on-hot-path, BASALT | ✓ Adopted — Phases 3–5 |
| Take try02 acceptance criteria with numeric thresholds | Their AC-1.x…AC-NEW-x is more rigorous than stage1's drafts; bind every AC to ≥1 test | ✓ Adopted — Phase 2 |
| Test taxonomy unit/integration/blackbox/sitl/e2e | Clarifies CI-on-push vs PR vs nightly vs hardware-only test runs | ✓ Phase 2 |
| Stage as iteration, not phase continuation | Each stage = own roadmap, own phase numbering, own success criteria | ✓ Adopted |
Phase 1 Outcome (2026-05-11, completed)
ARCH-01..07 all satisfied. 216 tests pass (195 baseline + 21 new), 0 failures; accuracy benchmarks unchanged.
What was built
Components scaffold (src/gps_denied/components/):
- vio/: protocol.py + orbslam_backend.py + cuvslam_backend.py + factory.py; core/vo.py is a shim
- gpr/: protocol.py + faiss_gpr.py (inline numpy fallback preserved); core/gpr.py is a shim
- satellite_matcher/: protocol.py + local_tile_loader.py + metric_refinement.py; core/satellite.py, core/metric.py are shims
- mavlink_io/: protocol.py + pymavlink_bridge.py + mock_mavlink.py; core/mavlink.py is a shim (re-exports private helpers _confidence_to_fix_type, _eskf_to_gps_input, _unix_to_gps_time)
- anchor_verifier/, safety_state/, flight_recorder/, coordinate_transforms/: Protocol stubs only (Phases 3–5)
Hot-path types (src/gps_denied/hot_types/): FrameState, IMUSample, PositionEstimate, VOEstimate, SatelliteAnchor as @dataclass(slots=True, frozen=True). Schemas shimmed to re-export. Pose stays Pydantic (mutation sites in factor_graph.py lines 182–297); GPSPoint stays Pydantic. Full hot-path migration deferred to Phase 2.
Pipeline package (src/gps_denied/pipeline/):
- orchestrator.py: FlightProcessor (moved from core/processor.py)
- image_input.py, result_manager.py, sse_streamer.py (moved from core/)
- composition.py: build_pipeline(env: Literal["jetson","x86_dev","ci","sitl"]) -> FlightProcessor
Composition root: wires 10 components; lazy imports inside function body to avoid circular imports; Jetson env → prefer_cuvslam=True, prefer_mono_depth=True; other envs → mocks.
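A composition root of this shape might look like the sketch below. Only the build_pipeline signature, the four environment names, and the "Jetson gets the real backend, everything else gets a stand-in" rule come from the text; the backend classes, the factory table, and the single-dependency FlightProcessor are simplified stand-ins invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Literal, Protocol, get_args

Env = Literal["jetson", "x86_dev", "ci", "sitl"]


class VIOBackend(Protocol):
    name: str


class CuVSLAMBackend:
    name = "cuvslam"  # stand-in for the aarch64-only backend


class OrbStubBackend:
    name = "orb_stub"  # stand-in for the dev/CI backend


@dataclass
class FlightProcessor:
    # The real processor takes many more dependencies; one is enough here.
    vio: VIOBackend
    env: str


def _make_jetson_vio() -> VIOBackend:
    # A real composition root would import the hardware-specific module here,
    # inside the function body, so other environments never trigger the import.
    return CuVSLAMBackend()


def _make_stub_vio() -> VIOBackend:
    return OrbStubBackend()


_VIO_FACTORIES: Dict[str, Callable[[], VIOBackend]] = {
    "jetson": _make_jetson_vio,
    "x86_dev": _make_stub_vio,
    "ci": _make_stub_vio,
    "sitl": _make_stub_vio,
}


def build_pipeline(env: Env) -> FlightProcessor:
    """Wire environment-specific implementations in one place."""
    if env not in get_args(Env):
        raise ValueError(f"unknown env: {env!r}")
    return FlightProcessor(vio=_VIO_FACTORIES[env](), env=env)
```

Because only the selected factory is called, nothing hardware-specific is constructed in dev or CI, and tests can bypass build_pipeline entirely by constructing FlightProcessor with mock dependencies.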
Config: AppSettings.env Literal field + RuntimeConfig = AppSettings alias. pydantic-settings YamlConfigSettingsSource loads config/{env}.yaml. pyyaml>=6.0 declared.
ABC→Protocol sweep: 6 interfaces converted to typing.Protocol with @runtime_checkable: IFactorGraphOptimizer, IRouteChunkManager, IFailureRecoveryCoordinator, IModelManager, IImageMatcher, plus all 8 component Protocols from components/*/protocol.py.
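The practical effect of moving from ABCs to typing.Protocol is that implementations no longer need to inherit from the interface. The sketch below uses a hypothetical MatcherPort with an assumed method signature; it does not reproduce any of the project's actual interfaces.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MatcherPort(Protocol):
    """Hypothetical port; the method name and signature are assumptions."""

    def match(self, query: bytes, reference: bytes) -> int:
        ...


class FakeMatcher:
    # No inheritance from MatcherPort: conformance is structural.
    def match(self, query: bytes, reference: bytes) -> int:
        return min(len(query), len(reference))


class NotAMatcher:
    def compare(self, query: bytes, reference: bytes) -> int:
        return 0


conforms = isinstance(FakeMatcher(), MatcherPort)
rejects = isinstance(NotAMatcher(), MatcherPort)
```

One caveat: isinstance against a @runtime_checkable Protocol only checks that the named members exist, not their signatures or return types, so a static type checker is still needed for full conformance.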
core/ retained for concentrated math: eskf.py, factor_graph.py, coordinates.py, chunk_manager.py, recovery.py, rotation.py, models.py.
Shim policy: every moved file leaves a re-export shim at its old path. Tests import from old paths — shims keep them green. Shim removal is Phase 2 work.
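The shim mechanism can be demonstrated in isolation. The sketch below writes two throwaway modules to a temporary directory: one plays the new home of a class, the other is a one-line re-export shim at the "old" path. The module names demo_new_home and demo_old_path and the class OrbBackend are invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of the module at its new location (hypothetical).
NEW_MODULE = '''
class OrbBackend:
    name = "orb"
'''

# Contents of the shim left behind at the old location: a plain re-export.
SHIM_MODULE = '''
from demo_new_home import OrbBackend  # noqa: F401

__all__ = ["OrbBackend"]
'''

tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_new_home.py").write_text(NEW_MODULE)
Path(tmp_dir, "demo_old_path.py").write_text(SHIM_MODULE)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

old = importlib.import_module("demo_old_path")
new = importlib.import_module("demo_new_home")

# Both import paths resolve to the very same class object, so isinstance
# checks and mocks behave identically whichever path a test uses.
same_object = old.OrbBackend is new.OrbBackend
```

Since the shim re-exports the same object rather than a copy, removing shims later should only require updating import statements in the tests.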
Deferred to Phase 2
- Full hot-path type migration (Pose, GPSPoint, remaining Pydantic models on frame path)
- Test reorganization to tests/{unit,integration,blackbox,sitl,e2e}/
- Shim removal from core/
- YAML config enrichment with env-specific overrides (MAVLink connection strings, tile dirs)
Stage 1 Decisions Inherited (validated, kept)
| Decision | Outcome |
|---|---|
| ESKF over EKF/UKF | ✓ Stage 1 |
| XFeat over LiteSAM for satellite matching | ✓ Stage 1 |
| OpenCV ORB stub for dev/CI; cuVSLAM on Jetson | ✓ Stage 1 |
| AnyLoc/DINOv2 for GPR | ✓ Stage 1 |
| diskcache + GeoHash for tiles | ✓ Stage 1 |
| AsyncSQLAlchemy + aiosqlite | ✓ Stage 1 |
Evolution
Each stage is its own iteration with its own PROJECT.md, REQUIREMENTS.md, ROADMAP.md. At stage completion:
- Snapshot current PROJECT.md / REQUIREMENTS.md / ROADMAP.md / phases/ → .planning/archive/v[X.Y]/
- Open new stage with fresh roadmap (Phase 1 of the new stage)
- Carry forward only validated decisions and unresolved Future-stages items
Stage 2 opened: 2026-05-10