Yuzviak a11ed15187 docs: add Phase 1 ADRs and update PROJECT.md with completed decisions
ADR 0002: hexagonal/ports-and-adapters architecture — components/ layout,
  protocol.py per component, composition root, core/ for concentrated math.
ADR 0003: @dataclass(slots=True, frozen=True) on hot path; Pydantic retained
  only at REST/config/DB boundaries. Pose/GPSPoint migration deferred to Phase 2.
ADR 0004: Stage 2 as independent iteration — own phases 1-6, own requirements,
  stage1 code treated as MVP starting capital.

PROJECT.md: Stage 2 Key Decisions updated from Pending → Accepted with Phase 1
  implementation notes, deferred work list, and final architecture summary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 09:23:09 +03:00


GPS-Denied Onboard Navigation System — Stage 2

What This Is

Real-time GPS-independent position estimation system for a fixed-wing UAV operating in GPS-denied/spoofed environments (flat terrain, Ukraine). Runs onboard a Jetson Orin Nano Super (8GB shared, 67 TOPS). Fuses visual odometry (cuVSLAM), satellite image matching (TensorRT FP16), and IMU via an ESKF to output MAVLink GPS_INPUT to an ArduPilot flight controller at 5-10Hz, while also streaming position and confidence over SSE to a ground station.

Stage 2 Iteration

Stage 2 is a self-contained iteration of the project. It is NOT a continuation of Stage 1's phase numbering: it has its own roadmap (Phases 1-6), its own requirements list, and its own success criteria. Each stage is conceptually a new pass at the system: same problem, same end goal, fresh decisions about HOW.

Stage 2 starting capital:

  • From stage1 (own work): The full v1 pipeline as MVP — ESKF (15-state), cuVSLAM/ORB VO, satellite matching + GPR, MAVLink GPS_INPUT, pipeline orchestration, SITL harness, accuracy benchmarks, 195 passing tests. Treated as MVP, not production — refactoring is allowed and expected.
  • From try02 (parallel team): Concept-level ideas only — Safety Anchor State Machine, Geometry-Gated Anchor Verifier, Flight Data Recorder, Conditional Multi-Scale VPR, dual-channel MAVLink design, formal Acceptance Criteria document with numeric thresholds, structured test taxonomy.
  • From real-flight data: Azaion 10.05.2026 dataset (tlog + 6min video + 9.5Hz GPS ground truth) as integration fixture.

Stage 2 is free to:

  • Reorganize the codebase (hexagonal layout) — no production lock-in
  • Replace, swap, or rebuild components — only AC-driven test outcomes are sacred
  • Change the architecture wholesale if a better path emerges mid-stage
  • Diverge from try02's choices where the evidence supports it (e.g., reject BASALT in favor of cuVSLAM, reject Pydantic on hot path)

Stage 2 archive: _planning/archive/v1.0/ preserves stage1's PROJECT.md, REQUIREMENTS.md, ROADMAP.md, and Phase 1 artifacts as historical record.

Core Value

The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.

Stage 2 Goal

Refactor the inherited stage1 MVP into a hexagonal/ports-and-adapters architecture with explicit DI composition root, integrate selected concept-level ideas from try02, formalize acceptance criteria with testable numerics, and add a real-flight integration fixture (Azaion 10.05.2026).

Stage 2 Target Features

Architecture:

  • Hexagonal layout — src/gps_denied/components/{vio, satellite_matcher, gpr, anchor_verifier, safety_state, flight_recorder, mavlink_io, coordinate_transforms}/ with protocol.py + concrete impls per component
  • Hot-path types as @dataclass(slots=True, frozen=True) for FrameState, IMUSample, PositionEstimate; Pydantic kept only at REST/config/DB boundaries
  • Composition root pipeline/composition.py with explicit DI for env-specific wiring (jetson/x86_dev/ci/sitl)
  • Per-environment config — config/{jetson,x86_dev,ci,sitl}.yaml driven by pydantic-settings
  • core/ retained for concentrated math (ESKF, factor graph, RANSAC) — single-file pure functions

try02 concept integration:

  • Acceptance Criteria document — formal AC-1.x…AC-NEW-x with numeric thresholds, validation methods, test linkage
  • Safety Anchor State Machine — separate layer over ESKF owning source_label (satellite_anchored/vo_extrapolated/dead_reckoned), monotonic covariance growth, anchor age, tile write eligibility
  • Geometry-gated Anchor Verifier — formal accept/reject gates (inliers, MRE, reprojection error) before anchor enters ESKF
  • Flight Data Recorder (FDR) — append-only event log with bounded segment storage and health states
  • Conditional VPR invocation — DINOv2 forward only on re-loc triggers; steady-state geometric prior
  • Multi-scale VPR chunks — 600-800m ground-footprint chunks at 40-50% overlap, decoupled from storage tiles, fine (z=20) + coarse (z=17) scales
  • Source label + anchor_age_ms emitted in every GPS_INPUT estimate
  • Visual blackout handling — switch to dead_reckoned ≤400ms, monotonic covariance growth, VISUAL_BLACKOUT_IMU_ONLY STATUSTEXT @ 1-2Hz
  • Spoofing-promotion latency monitor — promote own estimate to FC primary within <3s of detected real-GPS health drop
  • Test taxonomy — tests/{unit,integration,blackbox,sitl,e2e}/
  • Dual-channel MAVLink design — GPS_INPUT primary (v1 only), ODOMETRY auxiliary scaffolded behind feature flag for v1.1
  • Structured JSON logging with correlation_id (frame_id) per-frame
  • CLI tool gps_denied replay --tlog ... --video ...
  • Real-flight integration fixture — Azaion 10.05.2026 as tests/integration/azaion_flight/

Stage 2 Explicit Non-Goals

  • BASALT VIO backend — cuVSLAM remains primary (aarch64) with ORB-SLAM3 as CI baseline
  • Pydantic on the per-frame hot path — dataclasses replace it
  • Mandatory PostgreSQL — SQLite remains the embedded default
  • Microservice processes / IPC — single-process architecture preserved
  • Folder-per-component split for core/ math files — ESKF/factor graph stay concentrated
  • Mid-flight tile generation + write-back to Suite (AC-8.4) — deferred to Stage 3
  • Production hardware validation on Jetson — deferred to Stage 3

Future Stages (parking lot)

  • Stage 3 candidates: Jetson hardware validation, mid-flight tile generation + Suite write-back, ODOMETRY channel enabled, AC-NEW-1 cold-boot benchmark, BASALT evaluation if cuVSLAM blockers emerge

Out of Scope (across all stages, unless re-opened)

  • TensorRT engine building tooling — engines are pre-built offline
  • Google Maps tile download tooling — tiles pre-cached before flight
  • Mobile/web ground station UI — SSE consumed by external systems
  • Multi-UAV coordination — single UAV instance only

Context

Hardware target: Jetson Orin Nano Super (8GB LPDDR5 shared, JetPack 6.2.2, CUDA 12.6, TRT 10.3.0). Development on x86 Linux; cuVSLAM and TRT are Jetson-only — dev/CI uses OpenCV ORB stub and MockInferenceEngine.

Camera (target): ADTI 20L V1 (5456×3632, APS-C, 16mm lens, nadir fixed, 0.7fps). AI detection camera: Viewpro A40 Pro (separate).

Camera (Azaion fixture): Multirotor gimbal EO+IR split-screen with HUD overlay, 1280×720 @ 30fps. Used for integration testing only — does not represent target deployment camera.

Flight controller: ArduPilot via MAVLink UART. System sends GPS_INPUT; receives IMU (200Hz target / 9.7Hz in Azaion fixture) and GLOBAL_POSITION_INT (1Hz) from FC.

Key latency budget: <400ms end-to-end per frame.

Stage 1 inheritance: ~7,800 lines of working Python code with 195 passing tests. All algorithmic kernels (ESKF, VO, GPR, MAVLink, factor graph) implemented. Stage 2 starts from this codebase on branch stage2 (HEAD = stage1).

Reference branch: try02 is checked out as a worktree at ../gps-denied-onboard-try02/ for concept harvesting. We do NOT merge from try02 — we read it for ideas and re-implement what fits.

Constraints

  • Performance: <400ms/frame end-to-end p95, <8GB RAM+VRAM — non-negotiable
  • Hardware: cuVSLAM v15.0.0 (aarch64-only wheel) — Protocol with stub on x86
  • Platform: JetPack 6.2.2, Python 3.10+, TensorRT 10.3.0, CUDA 12.6
  • Navigation accuracy: 80% frames ≤50m, 60% frames ≤20m, max drift 100m between satellite corrections
  • Resilience: Handle sharp turns (disconnected VO segments), 3+ consecutive satellite match failures, visual blackout, GPS spoofing promotion <3s
  • Regression floor: All 195 stage1 passing tests must continue to pass after refactor
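
The navigation accuracy constraint above is a statement about the distribution of per-frame errors, so it can be checked mechanically against ground truth. The sketch below is one way to do that, assuming horizontal error is measured as great-circle distance; the function names are hypothetical and not taken from the project's benchmark code.

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres on a spherical Earth (R = 6371 km)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def fraction_within(errors_m: list[float], threshold_m: float) -> float:
    """Share of frames whose error is at or below the threshold."""
    if not errors_m:
        raise ValueError("no frames to evaluate")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy(errors_m: list[float]) -> bool:
    """80% of frames within 50 m and 60% of frames within 20 m."""
    return (fraction_within(errors_m, 50.0) >= 0.80
            and fraction_within(errors_m, 20.0) >= 0.60)


# Ten frames: six at 5 m, two at 30 m, two at 80 m.
passing = [5.0] * 6 + [30.0] * 2 + [80.0] * 2
# Ten frames: seven at 5 m, three at 80 m (only 70% within 50 m).
failing = [5.0] * 7 + [80.0] * 3
```

The maximum-drift constraint (100 m between satellite corrections) is not covered by this sketch; it needs the correction timestamps as well as the errors.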

Stage 2 Key Decisions

| Decision | Rationale | Outcome |
|---|---|---|
| Hexagonal layout with components/ folders | Clear ownership per swappable backend, native bridges colocate with adapter | ✓ Phase 1 |
| @dataclass(slots=True, frozen=True) on hot path, Pydantic at boundaries only | Avoid try02's per-frame Pydantic latency cost; validate where it catches bugs (REST input, config) | ✓ Phase 1 (hot_types/ scaffolded; full migration Phase 2) |
| Explicit DI composition root | One file wires environment-specific implementations; tests pass mock dependencies | ✓ Phase 1 (pipeline/composition.py:build_pipeline) |
| Adopt try02 concept ideas, reject try02 layout details | Take Safety Anchor / Anchor Verifier / FDR / Conditional VPR; reject Pydantic-on-hot-path, BASALT | ✓ Adopted (Phases 3-5) |
| Take try02 acceptance criteria with numeric thresholds | Their AC-1.x…AC-NEW-x is more rigorous than stage1's drafts; bind every AC to ≥1 test | ✓ Adopted (Phase 2) |
| Test taxonomy unit/integration/blackbox/sitl/e2e | Clarifies CI-on-push vs PR vs nightly vs hardware-only test runs | ✓ Phase 2 |
| Stage as iteration, not phase continuation | Each stage = own roadmap, own phase numbering, own success criteria | ✓ Adopted |

Phase 1 Outcome (2026-05-11, completed)

ARCH-01..07 all satisfied. 216 tests pass (baseline 195+21 new = 216), 0 failures, accuracy benchmarks unchanged.

What was built

Components scaffold (src/gps_denied/components/):

  • vio/protocol.py + orbslam_backend.py + cuvslam_backend.py + factory.py; core/vo.py is a shim
  • gpr/protocol.py + faiss_gpr.py (inline numpy fallback preserved); core/gpr.py is a shim
  • satellite_matcher/protocol.py + local_tile_loader.py + metric_refinement.py; core/satellite.py, core/metric.py are shims
  • mavlink_io/protocol.py + pymavlink_bridge.py + mock_mavlink.py; core/mavlink.py is a shim (re-exports private helpers _confidence_to_fix_type, _eskf_to_gps_input, _unix_to_gps_time)
  • anchor_verifier/, safety_state/, flight_recorder/, coordinate_transforms/: Protocol stubs only (Phases 3-5)
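
The mavlink_io bullet above mentions re-exported helpers such as `_unix_to_gps_time` without showing them. As a sketch of the kind of conversion involved (not the project's actual implementation), MAVLink GPS_INPUT carries time as a GPS week number plus milliseconds into the week, and latitude/longitude as integers in units of 1e-7 degrees. The function names and the fixed 18 s GPS-UTC offset below are assumptions.

```python
GPS_EPOCH_UNIX_S = 315_964_800   # 1980-01-06T00:00:00Z as Unix seconds
GPS_UTC_LEAP_S = 18              # GPS-UTC offset since 2017; assumed constant
MS_PER_WEEK = 604_800 * 1000


def unix_to_gps_week(unix_s: float) -> tuple[int, int]:
    """Return (time_week, time_week_ms) for a Unix timestamp in seconds."""
    gps_ms = round((unix_s - GPS_EPOCH_UNIX_S + GPS_UTC_LEAP_S) * 1000)
    week, week_ms = divmod(gps_ms, MS_PER_WEEK)
    return week, week_ms


def deg_to_e7(deg: float) -> int:
    """Scale degrees to the degE7 integer representation."""
    return round(deg * 1e7)


# Exactly one GPS week after the GPS epoch, expressed as a Unix timestamp.
one_week = unix_to_gps_week(GPS_EPOCH_UNIX_S + 604_800 - GPS_UTC_LEAP_S)
lat_e7 = deg_to_e7(48.5)
```

A hard-coded leap-second offset is only valid until the next leap second is announced; a production helper would need a way to update it.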

Hot-path types (src/gps_denied/hot_types/): FrameState, IMUSample, PositionEstimate, VOEstimate, SatelliteAnchor as @dataclass(slots=True, frozen=True). Schemas shimmed to re-export. Pose stays Pydantic (mutation sites in factor_graph.py lines 182-297); GPSPoint stays Pydantic. Full hot-path migration deferred to Phase 2.

Pipeline package (src/gps_denied/pipeline/):

  • orchestrator.py: FlightProcessor (moved from core/processor.py)
  • image_input.py, result_manager.py, sse_streamer.py (moved from core/)
  • composition.py: build_pipeline(env: Literal["jetson","x86_dev","ci","sitl"]) -> FlightProcessor

Composition root: wires 10 components; lazy imports inside function body to avoid circular imports; Jetson env → prefer_cuvslam=True, prefer_mono_depth=True; other envs → mocks.
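
The shape of such a composition root can be sketched as follows. This is a reduced illustration with one component instead of ten; `Pipeline`, `VIOBackend`, `OrbStubBackend`, `CuvslamPlaceholder`, and `_make_vio` are hypothetical stand-ins, and only the `build_pipeline(env)` signature and the `prefer_cuvslam` flag come from the text above.

```python
from dataclasses import dataclass
from typing import Literal, Protocol, get_args

Env = Literal["jetson", "x86_dev", "ci", "sitl"]


class VIOBackend(Protocol):
    name: str

    def track(self, frame_id: int) -> bool: ...


class OrbStubBackend:
    """Stand-in for the x86/CI backend."""
    name = "orb_stub"

    def track(self, frame_id: int) -> bool:
        return True


class CuvslamPlaceholder:
    """Stand-in for the Jetson-only backend."""
    name = "cuvslam"

    def track(self, frame_id: int) -> bool:
        return True


@dataclass
class Pipeline:
    env: str
    vio: VIOBackend


def _make_vio(prefer_cuvslam: bool) -> VIOBackend:
    if prefer_cuvslam:
        # In a real tree the aarch64-only module would be imported here,
        # inside the function body, so x86 hosts never try to load it.
        return CuvslamPlaceholder()
    return OrbStubBackend()


def build_pipeline(env: Env) -> Pipeline:
    """Single place where an environment name is turned into concrete objects."""
    if env not in get_args(Env):
        raise ValueError(f"unknown env: {env!r}")
    return Pipeline(env=env, vio=_make_vio(prefer_cuvslam=(env == "jetson")))


ci_pipeline = build_pipeline("ci")
jetson_pipeline = build_pipeline("jetson")
```

Because the pipeline only sees the Protocol, a test can construct `Pipeline` directly with any object that has the right methods, without going through `build_pipeline` at all.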

Config: AppSettings.env Literal field + RuntimeConfig = AppSettings alias. pydantic-settings YamlConfigSettingsSource loads config/{env}.yaml. pyyaml>=6.0 declared.

ABC→Protocol sweep: 6 interfaces converted to typing.Protocol with @runtime_checkable: IFactorGraphOptimizer, IRouteChunkManager, IFailureRecoveryCoordinator, IModelManager, IImageMatcher, + all 8 component Protocols from components/*/protocol.py.
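
A small sketch of what the ABC to Protocol change means in practice: with `typing.Protocol` an implementation no longer has to inherit from the interface, and `@runtime_checkable` allows `isinstance` checks based on method presence. The `match` signature below is invented for illustration; the document does not show the real `IImageMatcher` methods.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class IImageMatcher(Protocol):
    # Hypothetical method; the real interface is not shown in this document.
    def match(self, query: bytes, reference: bytes) -> int: ...


class CountingMatcher:
    """Satisfies the Protocol structurally, with no inheritance."""

    def match(self, query: bytes, reference: bytes) -> int:
        # Count positions where the two byte strings agree.
        return sum(a == b for a, b in zip(query, reference))


class NotAMatcher:
    def compare(self, query: bytes, reference: bytes) -> int:
        return 0


is_matcher = isinstance(CountingMatcher(), IImageMatcher)
is_not_matcher = isinstance(NotAMatcher(), IImageMatcher)
score = CountingMatcher().match(b"abcd", b"abxd")
```

Note that the runtime check only verifies that a `match` attribute exists; it does not compare signatures or return types, so a static type checker is still needed to catch mismatches.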

core/ retained for concentrated math: eskf.py, factor_graph.py, coordinates.py, chunk_manager.py, recovery.py, rotation.py, models.py.

Shim policy: every moved file leaves a re-export shim at its old path. Tests import from old paths — shims keep them green. Shim removal is Phase 2 work.

Deferred to Phase 2

  • Full hot-path type migration (Pose, GPSPoint, remaining Pydantic models on frame path)
  • Test reorganization to tests/{unit,integration,blackbox,sitl,e2e}/
  • Shim removal from core/
  • YAML config enrichment with env-specific overrides (MAVLink connection strings, tile dirs)

Stage 1 Decisions Inherited (validated, kept)

| Decision | Outcome |
|---|---|
| ESKF over EKF/UKF | ✓ Stage 1 |
| XFeat over LiteSAM for satellite matching | ✓ Stage 1 |
| OpenCV ORB stub for dev/CI; cuVSLAM on Jetson | ✓ Stage 1 |
| AnyLoc/DINOv2 for GPR | ✓ Stage 1 |
| diskcache + GeoHash for tiles | ✓ Stage 1 |
| AsyncSQLAlchemy + aiosqlite | ✓ Stage 1 |

Evolution

Each stage is its own iteration with its own PROJECT.md, REQUIREMENTS.md, ROADMAP.md. At stage completion:

  1. Snapshot current PROJECT.md / REQUIREMENTS.md / ROADMAP.md / phases/ → .planning/archive/v[X.Y]/
  2. Open new stage with fresh roadmap (Phase 1 of the new stage)
  3. Carry forward only validated decisions and unresolved Future-stages items

Stage 2 opened: 2026-05-10