# GPS-Denied Onboard Navigation System — Stage 2

## What This Is

Real-time GPS-independent position estimation system for a fixed-wing UAV operating in GPS-denied/spoofed environments (flat terrain, Ukraine). Runs onboard a Jetson Orin Nano Super (8GB shared, 67 TOPS). Fuses visual odometry (cuVSLAM), satellite image matching (TensorRT FP16), and IMU via an ESKF to output MAVLink GPS_INPUT to an ArduPilot flight controller at 5-10Hz, while also streaming position and confidence over SSE to a ground station.

## Stage 2 Iteration

**Stage 2 is a self-contained iteration of the project.** It is NOT a continuation of Stage 1's phase numbering — it has its own roadmap (Phases 1–6), its own requirements list, and its own success criteria. Each stage is conceptually a new pass at the system: same problem, same end goal, fresh decisions about HOW.

**Stage 2 starting capital:**

- **From stage1 (own work):** The full v1 pipeline as MVP — ESKF (15-state), cuVSLAM/ORB VO, satellite matching + GPR, MAVLink GPS_INPUT, pipeline orchestration, SITL harness, accuracy benchmarks, 195 passing tests. Treated as **MVP, not production** — refactoring is allowed and expected.
- **From try02 (parallel team):** Concept-level ideas only — Safety Anchor State Machine, Geometry-Gated Anchor Verifier, Flight Data Recorder, Conditional Multi-Scale VPR, dual-channel MAVLink design, formal Acceptance Criteria document with numeric thresholds, structured test taxonomy.
- **From real-flight data:** Azaion 10.05.2026 dataset (tlog + 6min video + 9.5Hz GPS ground truth) as integration fixture.

**Stage 2 is free to:**

- Reorganize the codebase (hexagonal layout) — no production lock-in
- Replace, swap, or rebuild components — only AC-driven test outcomes are sacred
- Change the architecture wholesale if a better path emerges mid-stage
- Diverge from try02's choices where the evidence supports it (e.g., reject BASALT in favor of cuVSLAM, reject Pydantic on hot path)

**Stage 2 archive:** `_planning/archive/v1.0/` preserves stage1's PROJECT.md, REQUIREMENTS.md, ROADMAP.md, and Phase 1 artifacts as historical record.

## Core Value

The flight controller must receive valid MAVLink GPS_INPUT at 5-10Hz with position accuracy ≤50m for 80% of frames — without this, the UAV cannot navigate in GPS-denied airspace.

## Stage 2 Goal

Refactor the inherited stage1 MVP into a hexagonal/ports-and-adapters architecture with explicit DI composition root, integrate selected concept-level ideas from `try02`, formalize acceptance criteria with testable numerics, and add a real-flight integration fixture (Azaion 10.05.2026).

## Stage 2 Target Features

**Architecture:**

- Hexagonal layout — `src/gps_denied/components/{vio, satellite_matcher, gpr, anchor_verifier, safety_state, flight_recorder, mavlink_io, coordinate_transforms}/` with `protocol.py` + concrete impls per component
- Hot-path types as `@dataclass(slots=True, frozen=True)` for `FrameState`, `IMUSample`, `PositionEstimate`; Pydantic kept only at REST/config/DB boundaries
- Composition root `pipeline/composition.py` with explicit DI for env-specific wiring (jetson/x86_dev/ci/sitl)
- Per-environment config — `config/{jetson,x86_dev,ci,sitl}.yaml` driven by pydantic-settings
- `core/` retained for concentrated math (ESKF, factor graph, RANSAC) — single-file pure functions

**try02 concept integration:**

- Acceptance Criteria document — formal AC-1.x…AC-NEW-x with numeric thresholds, validation methods, test linkage
- Safety Anchor State Machine — separate layer over ESKF owning `source_label` (`satellite_anchored`/`vo_extrapolated`/`dead_reckoned`), monotonic covariance growth, anchor age, tile write eligibility
- Geometry-gated Anchor Verifier — formal accept/reject gates (inliers, MRE, reprojection error) before anchor enters ESKF
- Flight Data Recorder (FDR) — append-only event log with bounded segment storage and health states
- Conditional VPR invocation — DINOv2 forward pass only on re-loc triggers; geometric prior in steady state
- Multi-scale VPR chunks — 600-800m ground-footprint chunks at 40-50% overlap, decoupled from storage tiles, fine (z=20) + coarse (z=17) scales
- `source_label` + `anchor_age_ms` emitted with every GPS_INPUT estimate
- Visual blackout handling — switch to `dead_reckoned` within ≤400ms, monotonic covariance growth, `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT @ 1-2Hz
- Spoofing-promotion latency monitor — promote own estimate to FC primary in <3s after a detected real-GPS health drop
- Test taxonomy — `tests/{unit,integration,blackbox,sitl,e2e}/`
- Dual-channel MAVLink design — `GPS_INPUT` primary (v1 only), `ODOMETRY` auxiliary scaffolded behind feature flag for v1.1
- Structured JSON logging with `correlation_id` (frame_id) per-frame
- CLI tool `gps_denied replay --tlog ... --video ...`
- Real-flight integration fixture — Azaion 10.05.2026 as `tests/integration/azaion_flight/`

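The source-label ownership and monotonic covariance growth listed above can be sketched as a small state holder. This is an illustration under stated assumptions, not the project's implementation: the class `SafetyAnchorState`, its method names, the 2000 ms anchor-freshness window, the growth rate, and the label-selection rule are all hypothetical. Only the three label strings come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class SourceLabel(str, Enum):
    SATELLITE_ANCHORED = "satellite_anchored"
    VO_EXTRAPOLATED = "vo_extrapolated"
    DEAD_RECKONED = "dead_reckoned"


@dataclass
class SafetyAnchorState:
    """Hypothetical layer above the ESKF that owns the reported source label."""

    anchor_fresh_ms: int = 2000      # assumed freshness window for an anchor
    growth_m2_per_s: float = 4.0     # assumed variance growth between anchors
    reported_var_m2: float = 25.0    # assumed initial position variance
    last_anchor_ms: Optional[int] = None
    last_step_ms: Optional[int] = None

    def on_anchor(self, t_ms: int, var_m2: float) -> None:
        # Only an accepted satellite anchor is allowed to shrink the variance.
        self.last_anchor_ms = t_ms
        self.last_step_ms = t_ms
        self.reported_var_m2 = var_m2

    def step(self, t_ms: int, visual_ok: bool) -> Tuple[SourceLabel, Optional[int], float]:
        # Between anchors the reported variance never decreases.
        if self.last_step_ms is not None:
            dt_s = max(0, t_ms - self.last_step_ms) / 1000.0
            self.reported_var_m2 += self.growth_m2_per_s * dt_s
        self.last_step_ms = t_ms

        age = None if self.last_anchor_ms is None else t_ms - self.last_anchor_ms
        if not visual_ok:
            label = SourceLabel.DEAD_RECKONED
        elif age is not None and age <= self.anchor_fresh_ms:
            label = SourceLabel.SATELLITE_ANCHORED
        else:
            label = SourceLabel.VO_EXTRAPOLATED
        return label, age, self.reported_var_m2
```

In this sketch the caller decides whether vision is usable (`visual_ok`), which keeps blackout detection separate from label selection. The returned anchor age is what would be emitted as `anchor_age_ms`.
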
## Stage 2 Explicit Non-Goals

- BASALT VIO backend — cuVSLAM remains primary (aarch64) with ORB-SLAM3 as CI baseline
- Pydantic on the per-frame hot path — dataclasses replace it
- Mandatory PostgreSQL — SQLite remains the embedded default
- Microservice processes / IPC — single-process architecture preserved
- Folder-per-component split for `core/` math files — ESKF/factor graph stay concentrated
- Mid-flight tile generation + write-back to Suite (AC-8.4) — deferred to Stage 3
- Production hardware validation on Jetson — deferred to Stage 3

## Future Stages (parking lot)

- **Stage 3 candidates:** Jetson hardware validation, mid-flight tile generation + Suite write-back, ODOMETRY channel enabled, AC-NEW-1 cold-boot benchmark, BASALT evaluation if cuVSLAM blockers emerge

## Out of Scope (across all stages, unless re-opened)

- TensorRT engine building tooling — engines are pre-built offline
- Google Maps tile download tooling — tiles pre-cached before flight
- Mobile/web ground station UI — SSE consumed by external systems
- Multi-UAV coordination — single UAV instance only

## Context

**Hardware target:** Jetson Orin Nano Super (8GB LPDDR5 shared, JetPack 6.2.2, CUDA 12.6, TRT 10.3.0). Development on x86 Linux; cuVSLAM and TRT are Jetson-only — dev/CI uses OpenCV ORB stub and MockInferenceEngine.

**Camera (target):** ADTI 20L V1 (5456×3632, APS-C, 16mm lens, nadir fixed, 0.7fps). AI detection camera: Viewpro A40 Pro (separate).

**Camera (Azaion fixture):** Multirotor gimbal EO+IR split-screen with HUD overlay, 1280×720 @ 30fps. Used for integration testing only — does not represent target deployment camera.

**Flight controller:** ArduPilot via MAVLink UART. System sends GPS_INPUT; receives IMU (200Hz target / 9.7Hz in Azaion fixture) and GLOBAL_POSITION_INT (1Hz) from FC.

**Key latency budget:** <400ms end-to-end per frame.

**Stage 1 inheritance:** ~7,800 lines of working Python code with 195 passing tests. All algorithmic kernels (ESKF, VO, GPR, MAVLink, factor graph) implemented. Stage 2 starts from this codebase on branch `stage2` (HEAD = stage1).

**Reference branch:** `try02` is checked out as a worktree at `../gps-denied-onboard-try02/` for concept harvesting. We do NOT merge from try02 — we read it for ideas and re-implement what fits.

## Constraints

- **Performance:** <400ms/frame end-to-end p95, <8GB RAM+VRAM — non-negotiable
- **Hardware:** cuVSLAM v15.0.0 (aarch64-only wheel) — Protocol with stub on x86
- **Platform:** JetPack 6.2.2, Python 3.10+, TensorRT 10.3.0, CUDA 12.6
- **Navigation accuracy:** 80% frames ≤50m, 60% frames ≤20m, max drift 100m between satellite corrections
- **Resilience:** Handle sharp turns (disconnected VO segments), 3+ consecutive satellite match failures, visual blackout, GPS spoofing promotion <3s
- **Regression floor:** All 195 stage1 passing tests must continue to pass after refactor

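The navigation-accuracy constraint above is a pair of percentile-style thresholds on per-frame horizontal error. A minimal sketch of how a benchmark could check it, assuming errors are great-circle distances between estimate and ground truth; the function names are hypothetical and this is not the project's benchmark code:

```python
import math
from typing import Sequence

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def fraction_within(errors_m: Sequence[float], threshold_m: float) -> float:
    """Share of frames whose error is at or below the threshold."""
    if not errors_m:
        raise ValueError("no frames to evaluate")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)


def meets_accuracy_constraints(errors_m: Sequence[float]) -> bool:
    """80% of frames within 50 m and 60% of frames within 20 m."""
    return fraction_within(errors_m, 50.0) >= 0.80 and fraction_within(errors_m, 20.0) >= 0.60
```

The drift limit between satellite corrections is a separate check, because it depends on anchor timestamps rather than on the error distribution alone.
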
## Stage 2 Key Decisions

| Decision | Rationale | Outcome |
|----------|-----------|---------|
| Hexagonal layout with `components/` folders | Clear ownership per swappable backend, native bridges colocate with adapter | ✓ Phase 1 |
| `@dataclass(slots=True, frozen=True)` on hot path, Pydantic at boundaries only | Avoid try02's per-frame Pydantic latency cost; validate where it catches bugs (REST input, config) | ✓ Phase 1 (hot_types/ scaffolded; full migration Phase 2) |
| Explicit DI composition root | One file wires environment-specific implementations; tests pass mock dependencies | ✓ Phase 1 (`pipeline/composition.py:build_pipeline`) |
| Adopt try02 concept ideas, reject try02 layout details | Take Safety Anchor / Anchor Verifier / FDR / Conditional VPR; reject Pydantic-on-hot-path, BASALT | ✓ Adopted — Phases 3–5 |
| Take try02 acceptance criteria with numeric thresholds | Their AC-1.x…AC-NEW-x is more rigorous than stage1's drafts; bind every AC to ≥1 test | ✓ Adopted — Phase 2 |
| Test taxonomy `unit/integration/blackbox/sitl/e2e` | Clarifies CI-on-push vs PR vs nightly vs hardware-only test runs | ✓ Phase 2 |
| Stage as iteration, not phase continuation | Each stage = own roadmap, own phase numbering, own success criteria | ✓ Adopted |

## Phase 1 Outcome (2026-05-11, completed)

**ARCH-01..07 all satisfied.** 216 tests pass (195 baseline + 21 new), 0 failures, accuracy benchmarks unchanged.

### What was built

**Components scaffold** (`src/gps_denied/components/`):

- `vio/` — `protocol.py` + `orbslam_backend.py` + `cuvslam_backend.py` + `factory.py`; `core/vo.py` is a shim
- `gpr/` — `protocol.py` + `faiss_gpr.py` (inline numpy fallback preserved); `core/gpr.py` is a shim
- `satellite_matcher/` — `protocol.py` + `local_tile_loader.py` + `metric_refinement.py`; `core/satellite.py`, `core/metric.py` are shims
- `mavlink_io/` — `protocol.py` + `pymavlink_bridge.py` + `mock_mavlink.py`; `core/mavlink.py` is a shim (re-exports private helpers `_confidence_to_fix_type`, `_eskf_to_gps_input`, `_unix_to_gps_time`)
- `anchor_verifier/`, `safety_state/`, `flight_recorder/`, `coordinate_transforms/` — Protocol stubs only (Phases 3–5)

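The body of `_unix_to_gps_time` is not shown in this document. As a rough sketch of what such a helper has to do for the `time_week` and `time_week_ms` fields of MAVLink `GPS_INPUT`, the function below converts a Unix timestamp into a GPS week number and milliseconds into that week. The public name `unix_to_gps_time`, its signature, and the fixed 18-second GPS-UTC offset (the value in effect since 2017) are assumptions made for illustration.

```python
from typing import Tuple

GPS_EPOCH_UNIX_S = 315_964_800   # 1980-01-06T00:00:00Z as a Unix timestamp
GPS_UTC_LEAP_S = 18              # assumed fixed GPS-UTC offset
MS_PER_WEEK = 604_800 * 1000


def unix_to_gps_time(unix_s: float) -> Tuple[int, int]:
    """Return (GPS week number, milliseconds since the start of that week)."""
    gps_ms = round((unix_s - GPS_EPOCH_UNIX_S + GPS_UTC_LEAP_S) * 1000)
    if gps_ms < 0:
        raise ValueError("timestamp predates the GPS epoch")
    week, week_ms = divmod(gps_ms, MS_PER_WEEK)
    return week, week_ms
```

A hard-coded leap-second offset is the simplest choice for a sketch; a deployed system would need a way to update it if a new leap second is announced.
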
**Hot-path types** (`src/gps_denied/hot_types/`): `FrameState`, `IMUSample`, `PositionEstimate`, `VOEstimate`, `SatelliteAnchor` as `@dataclass(slots=True, frozen=True)`. Schemas shimmed to re-export. `Pose` stays Pydantic (mutation sites in `factor_graph.py` lines 182–297); `GPSPoint` stays Pydantic. Full hot-path migration deferred to Phase 2.

**Pipeline package** (`src/gps_denied/pipeline/`):

- `orchestrator.py` — `FlightProcessor` (moved from `core/processor.py`)
- `image_input.py`, `result_manager.py`, `sse_streamer.py` (moved from `core/`)
- `composition.py` — `build_pipeline(env: Literal["jetson","x86_dev","ci","sitl"]) -> FlightProcessor`

**Composition root**: wires 10 components; lazy imports inside function body to avoid circular imports; Jetson env → `prefer_cuvslam=True`, `prefer_mono_depth=True`; other envs → mocks.

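The shape of such a composition root can be sketched in a few lines. Everything except the `build_pipeline` signature and the `FlightProcessor` name is a stand-in: the adapter classes below are hypothetical placeholders defined inline so the sketch runs on its own, and the stand-in wires two dependencies rather than the ten mentioned above.

```python
from dataclasses import dataclass
from typing import Literal, get_args

Env = Literal["jetson", "x86_dev", "ci", "sitl"]


# Hypothetical placeholder adapters; the real ones live under components/.
class CuvslamVio: ...
class OrbVio: ...
class PymavlinkBridge: ...
class MockMavlink: ...


@dataclass
class FlightProcessor:
    """Stand-in for pipeline.orchestrator.FlightProcessor."""
    vio: object
    mavlink: object


def build_pipeline(env: Env) -> FlightProcessor:
    if env not in get_args(Env):
        raise ValueError(f"unknown environment: {env!r}")
    # In the real module, adapter imports would sit here inside the function
    # body, so importing composition.py does not pull in every backend.
    if env == "jetson":
        return FlightProcessor(vio=CuvslamVio(), mavlink=PymavlinkBridge())
    return FlightProcessor(vio=OrbVio(), mavlink=MockMavlink())
```

Because all environment branching happens in this one function, the rest of the pipeline only sees the injected objects, and tests can construct `FlightProcessor` directly with mocks.
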
**Config**: `AppSettings.env` Literal field + `RuntimeConfig = AppSettings` alias. `pydantic-settings` `YamlConfigSettingsSource` loads `config/{env}.yaml`. `pyyaml>=6.0` declared.

**ABC→Protocol sweep**: 6 interfaces converted to `typing.Protocol` with `@runtime_checkable`:
`IFactorGraphOptimizer`, `IRouteChunkManager`, `IFailureRecoveryCoordinator`, `IModelManager`, `IImageMatcher`, + all 8 component Protocols from `components/*/protocol.py`.

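As a reminder of what the ABC→Protocol change buys, the sketch below defines a runtime-checkable Protocol and an implementation that satisfies it without inheriting from it. The interface `VioBackend` and its `track` method are hypothetical and do not reproduce any of the project's actual Protocols.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VioBackend(Protocol):
    """Hypothetical port: anything with a matching track() method qualifies."""

    def track(self, frame_id: int) -> bool: ...


class OrbBackend:
    # No inheritance from VioBackend: structural typing is enough.
    def track(self, frame_id: int) -> bool:
        return frame_id >= 0


class NotABackend:
    pass


backend_ok = isinstance(OrbBackend(), VioBackend)
other_ok = isinstance(NotABackend(), VioBackend)
```

Note that `isinstance` against a `@runtime_checkable` Protocol only checks that the named methods exist. It does not verify signatures or return types, so a static type checker is still needed for that.
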
**`core/` retained** for concentrated math: `eskf.py`, `factor_graph.py`, `coordinates.py`, `chunk_manager.py`, `recovery.py`, `rotation.py`, `models.py`.

**Shim policy**: every moved file leaves a re-export shim at its old path. Tests import from old paths — shims keep them green. Shim removal is Phase 2 work.

### Deferred to Phase 2

- Full hot-path type migration (`Pose`, `GPSPoint`, remaining Pydantic models on frame path)
- Test reorganization to `tests/{unit,integration,blackbox,sitl,e2e}/`
- Shim removal from `core/`
- YAML config enrichment with env-specific overrides (MAVLink connection strings, tile dirs)

## Stage 1 Decisions Inherited (validated, kept)

| Decision | Outcome |
|----------|---------|
| ESKF over EKF/UKF | ✓ Stage 1 |
| XFeat over LiteSAM for satellite matching | ✓ Stage 1 |
| OpenCV ORB stub for dev/CI; cuVSLAM on Jetson | ✓ Stage 1 |
| AnyLoc/DINOv2 for GPR | ✓ Stage 1 |
| diskcache + GeoHash for tiles | ✓ Stage 1 |
| AsyncSQLAlchemy + aiosqlite | ✓ Stage 1 |

## Evolution

Each stage is its own iteration with its own PROJECT.md, REQUIREMENTS.md, ROADMAP.md. At stage completion:

1. Snapshot current PROJECT.md / REQUIREMENTS.md / ROADMAP.md / phases/ → `.planning/archive/v[X.Y]/`
2. Open new stage with fresh roadmap (Phase 1 of the new stage)
3. Carry forward only validated decisions and unresolved Future-stages items

---

*Stage 2 opened: 2026-05-10*